I am working on a C# project where I need to get data from a secured website that does not have an API or web services. My plan is to log in, navigate to the page I need, and parse the HTML to extract the data I need to log to a database. Right now I'm testing with a console app, but eventually this will be converted to an Azure Service Bus application.
To get to anything, you have to log in at their login.cfm page, which means I need to fill in the username and password input controls on the page and click the submit button, then navigate to the page I need to parse.
Since I don't have a 'browser' to parse for controls, I am trying to use various C# .NET classes to get to the page, set the username and password, and click submit, but nothing seems to work.
Any examples I can look at, or .NET classes I should be reviewing that were designed for this sort of project?
Thanks!
Use the WebClient class in System.Net.
To persist the session cookie across requests you'll have to make a custom WebClient class.
#region webclient with cookies
public class WebClientX : WebClient
{
public CookieContainer cookies = new CookieContainer();
protected override WebRequest GetWebRequest(Uri location)
{
WebRequest req = base.GetWebRequest(location);
if (req is HttpWebRequest)
(req as HttpWebRequest).CookieContainer = cookies;
return req;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse res = base.GetWebResponse(request);
if (res is HttpWebResponse)
cookies.Add((res as HttpWebResponse).Cookies);
return res;
}
}
#endregion
Use a browser add-on like Firebug or the developer tools built into Chrome to capture the HTTP POST data that is sent when you submit a form. Send those POSTs using the WebClientX class and parse the response HTML.
When you already know the format of the page, a simple Regex.Match is usually the quickest way to pull values out of the HTML. So go through the actions in your browser with the developer tools open to record your POSTs, URLs and HTML content, then perform the same requests using WebClientX.
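As a rough illustration of the Regex.Match approach, here is a minimal sketch. The markup (a table cell with id "status") and the names StatusPageParser and ExtractStatus are made up for the example; you would adapt the pattern to whatever the real page contains.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StatusPageParser
{
    // Hypothetical markup: <td id="status">Shipped</td>
    // The pattern is an assumption about the page, not something taken from the real site.
    private static readonly Regex StatusCell = new Regex(
        @"<td\s+id=""status""[^>]*>(?<value>.*?)</td>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the trimmed cell text, or null when the cell is not found.
    public static string ExtractStatus(string html)
    {
        Match m = StatusCell.Match(html);
        return m.Success ? m.Groups["value"].Value.Trim() : null;
    }
}
```

This only holds up while the page layout stays the same; if the markup changes often, an HTML parser is probably the safer choice.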
OK, so here is the complete code to log in on one page, then read from a second page after the login.
class Program
{
static void Main(string[] args)
{
string uriString = "http://www.remotesite.com/login.cfm";
// Create a new WebClient instance.
WebClientX myWebClient = new WebClientX();
// Create a new NameValueCollection instance to hold some custom parameters to be posted to the URL.
NameValueCollection myNameValueCollection = new NameValueCollection();
// Add necessary parameter/value pairs to the name/value container.
myNameValueCollection.Add("userid", "myname");
myNameValueCollection.Add("mypassword", "mypassword");
Console.WriteLine("\nUploading to {0} ...", uriString);
// UploadValues(String, NameValueCollection) implicitly sets HTTP POST as the request method.
byte[] responseArray = myWebClient.UploadValues(uriString, myNameValueCollection);
// Decode and display the response.
Console.WriteLine("\nResponse received was :\n{0}", Encoding.ASCII.GetString(responseArray));
Console.WriteLine("\n\n\n pausing...");
Console.ReadKey();
// Go to 2nd page on the site to get additional data
Stream myStream = myWebClient.OpenRead("https://www.remotesite.com/status_results.cfm?t=8&prog=d");
Console.WriteLine("\nDisplaying Data :\n");
StringBuilder sb = new StringBuilder();
using (StreamReader reader = new StreamReader(myStream, System.Text.Encoding.UTF8))
{
string line;
while ((line = reader.ReadLine()) != null)
{
sb.Append(line + "\r\n");
}
}
using (StreamWriter outfile = new StreamWriter(@"Logfile1.txt"))
{
outfile.Write(sb.ToString());
}
Console.WriteLine(sb.ToString());
Console.WriteLine("\n\n\n pausing...");
Console.ReadKey();
}
}
public class WebClientX : WebClient
{
public CookieContainer cookies = new CookieContainer();
protected override WebRequest GetWebRequest(Uri location)
// public override WebRequest GetWebRequest(Uri location)
{
WebRequest req = base.GetWebRequest(location);
if (req is HttpWebRequest)
(req as HttpWebRequest).CookieContainer = cookies;
return req;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse res = base.GetWebResponse(request);
if (res is HttpWebResponse)
cookies.Add((res as HttpWebResponse).Cookies);
return res;
}
}
I am trying to pass login creds from a WebView into an HttpWebRequest but not having any luck getting an authenticated response. I am able to successfully make the request, but the response is acting like I haven't logged in. My app has 5 WebViews contained within Fragments and I'm logged in on all of them. I've tried using the CookieSyncManager but it's deprecated and .Sync() didn't work. I've tried a lot of different ways of passing the cookies into the HttpRequest with no success and many hours spent.
One would think this is a simple request; user has logged in within the app; they should be authenticated for all requests. Here's the closest that I've gotten, but the response string is still not the same as through my authenticated WebView:
This attempt parses each Cookie into a string and adds it
public string _cookieString { get; set; }
private class ExtWebViewClient : WebViewClient
{
TheFragment5 _fm5 = new TheFragment5();
public override void OnPageFinished(WebView view, string url)
{
var cookieHeader = Android.Webkit.CookieManager.Instance.GetCookie(url);
var cookiePairs = cookieHeader.Split('&');
_fm5._cookieString = "";
foreach (var cookiePair in cookiePairs)
{
var cookiePieces = cookiePair.Split('=');
if (cookiePieces[0].Contains(":"))
cookiePieces[0] = cookiePieces[0].Substring(0, cookiePieces[0].IndexOf(":"));
cookies.Add(new Cookie
{
Name = cookiePieces[0],
Value = cookiePieces[1]
});
}
foreach (Cookie c in cookies)
{
if (_fm5._cookieString == "")
{
_fm5._cookieString = c.ToString();
}
else
{
_fm5._cookieString += c.ToString();
}
}
}
}
I've also tried just doing:
_fm5._cookieString = cookieHeader.ToString();
but neither of those attempts is working when I add the cookie string into my HttpRequest:
public async void GetNotificationText(string url)
{
//var _cmhc = _cookieMan.HasCookies;
await Task.Run(() =>
{
_notificationHttpRequestInProgress = true;
try
{
var _ctxxx = Android.App.Application.Context;
//URL _url2 = new URL("https://bitchute.com/notifications/");
//HttpURLConnection conn = (HttpURLConnection)_url2.OpenConnection();
//conn.ReadTimeout = 10000 /* milliseconds */;
//conn.ConnectTimeout = 15000 /* milliseconds */;
////conn.SetRequestProperty("Cookie", cookies);
//conn.Connect();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
Uri uri = new Uri(url);
var _req = request;
var _uriii = uri;
var _cookiesss = _fm5._cookieString;
_cookieCon.SetCookies(uri, _cookiesss);
request.CookieContainer = _cookieCon;
//request.CookieContainer.SetCookies(uri, _cookiesss);
request.AutomaticDecompression = DecompressionMethods.GZip;
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(stream))
{
_notificationRawText = reader.ReadToEnd();
Console.WriteLine(_notificationRawText);
_rawNoteText = _notificationRawText;
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
_notificationHttpRequestInProgress = false;
});
}
This returns, but not the authenticated webtext request; I get the same response any user would get on a browser having never logged in. If I were to browse out to this same url on any WebView in my app, I'd get a completely different response.
You will also notice some commented out code that was another failed attempt at adding the cookies into a connection. I had also tried using HttpURLConnection.SetRequestProperty("Cookie", cookies);
where cookies was a CookieCollection and that didn't work either. The code is mostly commented out and layered because I've been trying this for days.
Does anyone know how I can pass WebView cookies into an HttpRequest using Xamarin.Android?
I am putting this code below in Fragment5 of my app; you can see and compile the full context here:
https://github.com/hexag0d/BitChute_Mobile_Android_BottomNav/blob/NotificationAdder/Fragments/TheFragment5.cs
I'm not sure exactly why the above example didn't work; maybe if you're better at .NET than I am, you could figure it out. However, I was able to successfully pass WebView creds into an HttpClient by following these steps, which are returning an authenticated response. This may not be the most elegant way of doing it, but you can always refine my answer, or post a better one.
What I had to do was set the HttpClient.DefaultRequestHeaders using the .Add() method like this: _client.DefaultRequestHeaders.Add("Cookie", TheFragment5._cookieHeader);
I got the CookieHeader (which is just a string btw) like this:
//instantiate a string that will house our cookie header
public static string _cookieHeader;
//you might want to make it private to prevent abuse
//but this example is just for demonstration
//the thing is we need a string to house our headers in scope of both the WebView and the HttpClient
//extend the WebViewClient
private class ExtWebViewClient : WebViewClient
{
public override void OnPageFinished(WebView view, string url)
{
//I get the cookies when the page finishes loading because
//then we know the cookie has our login cred header
//also, most of the previous examples got the cookies OnPageFinished
TheFragment5._cookieHeader = Android.Webkit.CookieManager.Instance.GetCookie(url);
}
}
Then we need another method for the HttpClient and HttpClientHandler ... mine scans a webpage for notification text.
public async void GetNotificationText(string url)
{
await Task.Run(() =>
{
/* this line is pretty important,
we need to instantiate an HttpClientHandler
then set its UseCookies property to false
so that it doesn't override our cookies
*/
HttpClientHandler handler = new HttpClientHandler() { UseCookies = false };
try
{
Uri _notificationURI = new Uri("https://bitchute.com/notifications/");
//instantiate HttpClient using the handler
using (HttpClient _client = new HttpClient(handler))
{
//this line is where the magic happens;
//we set the DefaultRequestHeaders with the cookieheader we got from WebViewClient.OnPageFinished
_client.DefaultRequestHeaders.Add("Cookie", TheFragment5._cookieHeader);
//do a GetAsync request with our cookied up client
var getRequest = _client.GetAsync("https://bitchute.com/notifications/").Result;
//resultContent is the authenticated html string response from the server, ready to be parsed =]
var resultContent = getRequest.Content.ReadAsStringAsync().Result;
/*
I was writing to console to check the
response.. for me, I am now getting
the authenticated notification html
page
*/
Console.WriteLine(resultContent);
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
});
}
Hope this helps you, posting for future reference, especially for people using Xamarin.Android.
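If you would rather keep using a CookieContainer (for example with HttpWebRequest), one option is to split the header string yourself. CookieManager.GetCookie returns the cookies in Cookie-header form ("name=value; name2=value2"), so the sketch below splits on ';' rather than '&'. The class and method names are hypothetical, and I have not run this against the site in the question.

```csharp
using System;
using System.Net;

public static class CookieHeaderHelper
{
    // Converts a Cookie-header style string ("a=1; b=2") into a CookieContainer
    // scoped to the given URI. Name and shape of this helper are illustrative only.
    public static CookieContainer ToContainer(string cookieHeader, Uri uri)
    {
        var container = new CookieContainer();
        if (string.IsNullOrEmpty(cookieHeader)) return container;

        foreach (string pair in cookieHeader.Split(';'))
        {
            // Split on the first '=' only, since values may contain '=' themselves.
            int eq = pair.IndexOf('=');
            if (eq <= 0) continue;

            string name = pair.Substring(0, eq).Trim();
            string value = pair.Substring(eq + 1).Trim();

            // Add can throw CookieException for values System.Net considers invalid
            // (for example an unquoted comma), so real code may want a try/catch here.
            container.Add(uri, new Cookie(name, value));
        }
        return container;
    }
}
```

Usage would be something like request.CookieContainer = CookieHeaderHelper.ToContainer(TheFragment5._cookieHeader, new Uri(url)); before calling GetResponse().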
My Code:
class MyWebClient : WebClient
{
private CookieContainer _cookieContainer = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = _cookieContainer;
}
return request;
}
}
using (var client = new MyWebClient())
{
var data = new NameValueCollection
{
{ "username", "myUser" },
{ "password", "myPw" }
};
client.UploadValues("http://www..tv/takelogin.php", data);
}
MNM3.4:
Response:
My app uses 3 sites. With 2 of them everything works fine, but not with this one.
Passing a CookieContainer usually does the trick but you're already sending it. Can you confirm the field names?
Also, for some websites, you'll need to post back the hidden fields. I usually perform a GET to the login page and, using an HTML parser (like HtmlAgilityPack), I locate the appropriate form and POST the login request with all INPUT/SELECT fields I find.
I think the best advice here is to use a debugging proxy like Fiddler and try to perform the login from the browser and inspect the generated traffic.
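To make the hidden-field idea concrete, here is a minimal sketch that pulls the hidden inputs out of a login page you have already downloaded. It uses plain Regex instead of HtmlAgilityPack only so that it has no dependencies; that is a simplification that can miss unusual markup. The names LoginFormHelper and CollectHiddenFields are made up for the example.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text.RegularExpressions;

public static class LoginFormHelper
{
    private static readonly Regex InputTag =
        new Regex(@"<input\b[^>]*>", RegexOptions.IgnoreCase);

    // Reads one quoted attribute from a single tag; returns null if it is absent.
    private static string Attr(string tag, string name)
    {
        Match m = Regex.Match(
            tag,
            @"(?<![\w-])" + name + @"\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value) : null;
    }

    // Collects name/value pairs of every <input type="hidden"> in the HTML.
    public static NameValueCollection CollectHiddenFields(string html)
    {
        var fields = new NameValueCollection();
        foreach (Match tag in InputTag.Matches(html))
        {
            string type = Attr(tag.Value, "type");
            if (!string.Equals(type, "hidden", StringComparison.OrdinalIgnoreCase))
                continue;

            string name = Attr(tag.Value, "name");
            if (name != null)
                fields[name] = Attr(tag.Value, "value") ?? "";
        }
        return fields;
    }
}
```

You would then add the username and password entries to the returned collection and pass it to UploadValues on the same cookie-aware client that performed the GET.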
I found the problem...
client.UploadValues("http://www..tv/takelogin.php", data);
changed to:
client.UploadValues("http://.tv/takelogin.php", data);
That means:
http://www.MY_SITE.tv
doesn't work, but
http://MY_SITE.tv
works fine.
I'm trying to build a C# application that notifies me when there is an "update" on a site.
The site's login form, login.aspx, contains 3 textboxes.
My question is: how can I "send" the 3 details to the site and connect (authenticate) from the application I want to build in C#, if that is possible at all?
I looked for a guide or something to read about this but haven't found anything.
You need to use the WebClient class. More info on this class can be found at http://msdn.microsoft.com/en-us/library/system.net.webclient(v=vs.80).aspx
And a nice example at http://msdn.microsoft.com/en-us/library/system.net.webclient(v=vs.80).aspx?cs-save-lang=1&cs-lang=csharp#code-snippet-4
First you need to make the request from C#:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create (args[0]);
// Set some reasonable limits on resources used by this request
request.MaximumAutomaticRedirections = 4;
request.MaximumResponseHeadersLength = 4;
// Set credentials to use for this request.
request.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse response = (HttpWebResponse)request.GetResponse ();
Console.WriteLine ("Content length is {0}", response.ContentLength);
Console.WriteLine ("Content type is {0}", response.ContentType);
// Get the stream associated with the response.
Stream receiveStream = response.GetResponseStream ();
// Pipes the stream to a higher level stream reader with the required encoding format.
StreamReader readStream = new StreamReader (receiveStream, Encoding.UTF8);
Console.WriteLine ("Response stream received.");
Console.WriteLine (readStream.ReadToEnd ());
response.Close ();
readStream.Close ();
Then save the cookie: the ASP.NET session cookie (ASP.NET_SessionId by default) has to be stored on the client for future requests.
private class CookieAwareWebClient : WebClient
{
public CookieAwareWebClient()
: this(new CookieContainer())
{ }
public CookieAwareWebClient(CookieContainer c)
{
this.CookieContainer = c;
}
public CookieContainer CookieContainer { get; set; }
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = this.CookieContainer;
}
return request;
}
}
Make sure you store the session cookie and send it back on each request.
And bingo!!
I recommend you to read this.
I almost don't dare to ask, but how can I get the response data of a URL?
I just can't remember anymore.
My scenario: I'm using the Twitter API to get the profile picture of a user. That API URL returns the JPEG location.
So if I actually write this HTML in my views:
<img src="https://api.twitter.com/1/users/profile_image?screen_name=twitterapi&size=bigger"/>
The browser automatically uses the response JPEG for the src attribute. Like this:
Now my question is very simple: how can I get that .jpg location in C# to put in my database?
I'm not exactly sure what you are asking.
I think you can use WebClient.DownloadData in c# to call that url. Once you download the file, you can then place it in the database.
byte[] response = new System.Net.WebClient().DownloadData(url);
Download a file over HTTP into a byte array in C#?
EDIT: THIS IS WORKING FOR ME
WebRequest request = WebRequest.Create("https://api.twitter.com/1/users/profile_image?screen_name=twitterapi&size=bigger");
WebResponse response = request.GetResponse();
Console.WriteLine(response.ResponseUri);
Console.Read( );
from A way to figure out redirection URL
EDIT: THIS IS ANOTHER METHOD I THINK...using show.json from Read the absolute redirected SRC attribute URL for an image
http://api.twitter.com/1/users/show.json?screen_name=twitterapi
You can also do it using HttpClient:
public class UriFetcher
{
public Uri Get(string apiUri)
{
using (var httpClient = new HttpClient())
{
var httpResponseMessage = httpClient.GetAsync(apiUri).Result;
return httpResponseMessage.RequestMessage.RequestUri;
}
}
}
[TestFixture]
public class UriFetcherTester
{
[Test]
public void Get()
{
var uriFetcher = new UriFetcher();
var fetchedUri = uriFetcher.Get("https://api.twitter.com/1/users/profile_image?screen_name=twitterapi&size=bigger");
Console.WriteLine(fetchedUri);
}
}
You can use the HttpWebRequest and HttpWebResponse classes (via using System.Net) to achieve this:
HttpWebRequest webRequest =
WebRequest.Create("https://api.twitter.com/1/users/profile_image?screen_name=twitterapi&size=bigger") as HttpWebRequest;
webRequest.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse response = webRequest.GetResponse() as HttpWebResponse;
string url = response.ResponseUri.OriginalString;
url now contains the string "https://si0.twimg.com/profile_images/1438634086/avatar_bigger.png"
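If you only want the redirect target and not the image itself, another option is to stop at the first response and read its Location header. The following is a sketch under some assumptions: the names RedirectResolver and GetRedirectTargetAsync are hypothetical, and the caller is expected to pass a handler that does not follow redirects, such as new HttpClientHandler { AllowAutoRedirect = false }.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RedirectResolver
{
    // Returns the absolute URI from the Location header of a 3xx response,
    // or null when the server did not redirect. The client disposes the handler.
    public static async Task<Uri> GetRedirectTargetAsync(Uri address, HttpMessageHandler handler)
    {
        using (var client = new HttpClient(handler))
        using (HttpResponseMessage response =
            await client.GetAsync(address, HttpCompletionOption.ResponseHeadersRead))
        {
            int status = (int)response.StatusCode;
            if (status < 300 || status >= 400) return null;

            Uri location = response.Headers.Location;
            if (location == null) return null;

            // Servers may send a relative Location; resolve it against the request URI.
            return location.IsAbsoluteUri ? location : new Uri(address, location);
        }
    }
}
```

Taking the handler as a parameter is a design choice for the sketch: it keeps the redirect policy in the caller's hands and lets you substitute a fake handler when testing without network access.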
I am trying to write code that will authenticate to the website wallbase.cc. I've looked at what it does using Firebug/Chrome Developer tools and it seems fairly easy:
Post "usrname=$USER&pass=$PASS&nopass_email=Type+in+your+e-mail+and+press+enter&nopass=0" to the webpage "http://wallbase.cc/user/login", store the returned cookies and use them on all future requests.
Here is my code:
private CookieContainer _cookies = new CookieContainer();
//......
HttpPost("http://wallbase.cc/user/login", string.Format("usrname={0}&pass={1}&nopass_email=Type+in+your+e-mail+and+press+enter&nopass=0", Username, Password));
//......
private string HttpPost(string url, string parameters)
{
try
{
System.Net.WebRequest req = System.Net.WebRequest.Create(url);
//Add these, as we're doing a POST
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
((HttpWebRequest)req).Referer = "http://wallbase.cc/home/";
((HttpWebRequest)req).CookieContainer = _cookies;
//We need to count how many bytes we're sending. Post'ed Faked Forms should be name=value&
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(parameters);
req.ContentLength = bytes.Length;
System.IO.Stream os = req.GetRequestStream();
os.Write(bytes, 0, bytes.Length); //Push it out there
os.Close();
//get response
using (System.Net.WebResponse resp = req.GetResponse())
{
if (resp == null) return null;
using (Stream st = resp.GetResponseStream())
{
System.IO.StreamReader sr = new System.IO.StreamReader(st);
return sr.ReadToEnd().Trim();
}
}
}
catch (Exception)
{
return null;
}
}
After calling HttpPost with my login parameters I would expect all future calls using this same method to be authenticated (assuming a valid username/password). I do get a session cookie in my cookie collection but for some reason I'm not authenticated. I get a session cookie in my cookie collection regardless of which page I visit so I tried loading the home page first to get the initial session cookie and then logging in but there was no change.
To my knowledge this Python version works: https://github.com/sevensins/Wallbase-Downloader/blob/master/wallbase.sh (line 336)
Any ideas on how to get authentication working?
Update #1
When using a correct user/password pair the response automatically redirects to the referrer but when an incorrect user/pass pair is received it does not redirect and returns a bad user/pass pair. Based on this it seems as though authentication is happening, but maybe not all the key pieces of information are being saved??
Update #2
I am using .NET 3.5. When I tried the above code in .NET 4, with the added line of System.Net.ServicePointManager.Expect100Continue = false (which was in my code, just not shown here) it works, no changes necessary. The problem seems to stem directly from some pre-.NET 4 issue.
This is based on code from one of my projects, as well as code found from various answers here on stackoverflow.
First we need to set up a cookie-aware WebClient that is going to use HTTP 1.0.
public class CookieAwareWebClient : WebClient
{
private CookieContainer cookie = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
HttpWebRequest httpRequest = request as HttpWebRequest;
if (httpRequest != null)
{
// Force HTTP 1.0 and share one cookie container across requests.
httpRequest.ProtocolVersion = HttpVersion.Version10;
httpRequest.CookieContainer = cookie;
}
return request;
}
}
Next we set up the code that handles the Authentication and then finally loads the response.
var client = new CookieAwareWebClient();
client.UseDefaultCredentials = true;
client.BaseAddress = @"http://wallbase.cc";
var loginData = new NameValueCollection();
loginData.Add("usrname", "test");
loginData.Add("pass", "123");
loginData.Add("nopass_email", "Type in your e-mail and press enter");
loginData.Add("nopass", "0");
var result = client.UploadValues(@"http://wallbase.cc/user/login", "POST", loginData);
string response = System.Text.Encoding.UTF8.GetString(result);
We can try this out using the HTML Visualizer inbuilt into Visual Studio while staying in debug mode and use that to confirm that we were able to authenticate and load the Home page while staying authenticated.
The key here is to set up a CookieContainer and use HTTP 1.0, instead of 1.1. I am not entirely sure why forcing it to use 1.0 allows you to authenticate and load the page successfully, but part of the solution is based on this answer.
https://stackoverflow.com/a/10916014/408182
I used Fiddler to make sure that the request sent by the C# client was the same as the one sent by my web browser, Chrome. It also allows me to confirm whether the C# client is being redirected correctly. In this case we can see that with HTTP 1.0 we get HTTP/1.0 302 Found and are then redirected to the home page as intended. If we switch back to HTTP 1.1 we get an HTTP/1.1 417 Expectation Failed message instead.
There is some information on this error message available in this stackoverflow thread.
HTTP POST Returns Error: 417 "Expectation Failed."
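For comparison, the question's Update #2 mentions setting ServicePointManager.Expect100Continue = false, which is the other common way to avoid the 417. A minimal sketch of that variant follows. LoginRequestFactory and CreateFormPost are hypothetical names, and I am assuming the classic .NET Framework HttpWebRequest stack that this thread is about.

```csharp
using System;
using System.Net;

public static class LoginRequestFactory
{
    // Builds a form POST request that should not send "Expect: 100-continue".
    // Note that the switch below is process-wide, not per request.
    public static HttpWebRequest CreateFormPost(string url, CookieContainer cookies)
    {
        // Set before creating the request so the new connection settings pick it up.
        ServicePointManager.Expect100Continue = false;

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;
        return request;
    }
}
```

Whether this alone is enough on .NET 3.5 is unclear from the thread; the asker reports that it worked on .NET 4.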
Edit: Hack/Fix for .NET 3.5
I have spent a lot of time trying to figure out the difference between 3.5 and 4.0, but I seriously have no clue. It looks like 3.5 is creating a new cookie after the authentication and the only way I found around this was to authenticate the user twice.
I also had to make some changes on the WebClient based on information from this post.
http://dot-net-expertise.blogspot.fr/2009/10/cookiecontainer-domain-handling-bug-fix.html
public class CookieAwareWebClient : WebClient
{
public CookieContainer cookies = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
var httpRequest = request as HttpWebRequest;
if (httpRequest != null)
{
httpRequest.ProtocolVersion = HttpVersion.Version10;
httpRequest.CookieContainer = cookies;
var table = (Hashtable)cookies.GetType().InvokeMember("m_domainTable", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.GetField | System.Reflection.BindingFlags.Instance, null, cookies, new object[] { });
var keys = new ArrayList(table.Keys);
foreach (var key in keys)
{
var newKey = (key as string).Substring(1);
table[newKey] = table[key];
}
}
return request;
}
}
var client = new CookieAwareWebClient();
var loginData = new NameValueCollection();
loginData.Add("usrname", "test");
loginData.Add("pass", "123");
loginData.Add("nopass_email", "Type in your e-mail and press enter");
loginData.Add("nopass", "0");
// Hack: Authenticate the user twice!
client.UploadValues(@"http://wallbase.cc/user/login", "POST", loginData);
var result = client.UploadValues(@"http://wallbase.cc/user/login", "POST", loginData);
string response = System.Text.Encoding.UTF8.GetString(result);
You may need to add the following:
//get response
using (System.Net.WebResponse resp = req.GetResponse())
{
foreach (Cookie c in ((HttpWebResponse)resp).Cookies)
_cookies.Add(c);
// Do other stuff with response....
}
Another thing to be aware of: if the server responds with a 302 (redirect), the .NET web request will automatically follow it, and in the process you might lose the cookie you're after. You can turn off this behavior with the following code:
((HttpWebRequest)req).AllowAutoRedirect = false;
The Python you reference uses a different referrer (http://wallbase.cc/start/). It is also followed by another post to (http://wallbase.cc/user/adult_confirm/1). Try the other referrer and followup with this POST.
I think you are authenticating correctly, but that the site needs more info/assertions from you before proceeding.
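One more thing worth ruling out: the POST body in the question is built with string.Format and no URL encoding, so a username or password containing characters like & or = would be sent incorrectly. A small sketch of encoding the fields first is below; FormBody and Encode are names I made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FormBody
{
    // Joins name/value pairs into an application/x-www-form-urlencoded body,
    // percent-encoding both names and values. Spaces become %20 rather than '+'.
    public static string Encode(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields
            .Select(f => Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value ?? ""))
            .ToArray());
    }
}
```

The result can be passed as the parameters argument of the HttpPost method from the question. If you switch to WebClient.UploadValues, the encoding is done for you.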