How to interact with a website without a browser? [closed] - c#

Say I am building a C# application.
The purpose of the application is to:
get a username and password from the user,
and show some information present on the website.
In the background, after taking the username and password, it should:
log in to a website with those credentials,
click on the anchor link that appears after logging in,
find the span that holds the info,
and get the info.
That was an example. I am actually building an app to show bandwidth usage information.
The server does not expose any API for that.
Is there any tutorial, info, or article available for a similar purpose? I just don't know what to search for.

Basic Introduction To HttpWebRequests
Firstly, you're going to need the right tools for the job. Download the Live HTTP Headers plugin for Firefox. It lets you view HTTP headers in real time, so you can see the POST data that is sent when you interact with the website. Once you know what data is sent to the website, you can emulate the process by creating your own HTTP web requests programmatically.
Load Live HTTP Headers by navigating to Tools > Live HTTP Headers. Once the GUI has loaded, navigate to the website you wish to log in to; I will use Facebook for demonstration purposes. Type in your credentials, but before you log in, clear the GUI text window and ensure that the check box labeled Capture is checked. Once you hit login, the text window will fill with information about the requests, including the POST data you need.
I find it best to click Save All... and then search for your username in the saved text document so that you can identify the POST data easily. For my request the POST data looked like this:
lsd=AVp-UAbD&display=&legacy_return=1&return_session=0&trynum=1&charset_test=%E2%82%AC%2C%C2%B4%2C%E2%82%AC%2C%C2%B4%2C%E6%B0%B4%2C%D0%94%2C%D0%84&timezone=0&lgnrnd=214119_mDgc&lgnjs=1356154880&email=myfacebookemail%40outlook.com&pass=myfacebookpassword&default_persistent=0
Which can then be defined in C# like so:
StringBuilder postData = new StringBuilder();
postData.Append("lsd=AVqRGVie&display=");
postData.Append("&legacy_return=1");
postData.Append("&return_session=0");
postData.Append("&trynum=1");
postData.Append("&charset_test=%E2%82%AC%2C%C2%B4%2C%E2%82%AC%2C%C2%B4%2C%E6%B0%B4%2C%D0%94%2C%D0%84");
postData.Append("&timezone=0");
postData.Append("&lgnrnd=153743_eO6D");
postData.Append("&lgnjs=1355614667");
postData.Append(String.Format("&email={0}", "CUSTOM_EMAIL"));
postData.Append(String.Format("&pass={0}", "CUSTOM_PASSWORD"));
postData.Append("&default_persistent=0");
I'm aiming to show you the relation between the POST data that the web browser sends when we log in 'manually' and how we can use that data to emulate the request in C#. Understand that sending POST data is far from deterministic: different websites work in different ways and can throw all kinds of things your way. Below is a function I put together to validate that Facebook credentials are correct. I can't and shouldn't go into extraordinary depth here, as the classes and their members are well documented. You can find better information than I can offer about the methods used on MSDN, for example under the WebRequest.Method property.
// Requires: using System; using System.IO; using System.Net; using System.Text;
private bool ValidateFacebookCredentials(string email, string password)
{
    CookieContainer cookies = new CookieContainer();
    HttpWebRequest request = null;
    HttpWebResponse response = null;
    string returnData = string.Empty;

    //Need to retrieve cookies first
    request = (HttpWebRequest)WebRequest.Create(new Uri("https://www.facebook.com/login.php?login_attempt=1"));
    request.Method = "GET";
    request.CookieContainer = cookies;
    response = (HttpWebResponse)request.GetResponse();
    response.Close(); //Release the connection; the cookies are already in the container

    //Set up the request
    request = (HttpWebRequest)WebRequest.Create(new Uri("https://www.facebook.com/login.php?login_attempt=1"));
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded";
    request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13";
    request.Referer = "https://www.facebook.com/login.php?login_attempt=1";
    request.AllowAutoRedirect = true;
    request.KeepAlive = true;
    request.CookieContainer = cookies;

    //Format the POST data
    StringBuilder postData = new StringBuilder();
    postData.Append("lsd=AVqRGVie&display=");
    postData.Append("&legacy_return=1");
    postData.Append("&return_session=0");
    postData.Append("&trynum=1");
    postData.Append("&charset_test=%E2%82%AC%2C%C2%B4%2C%E2%82%AC%2C%C2%B4%2C%E6%B0%B4%2C%D0%94%2C%D0%84");
    postData.Append("&timezone=0");
    postData.Append("&lgnrnd=153743_eO6D");
    postData.Append("&lgnjs=1355614667");
    //Percent-encode the user-supplied values so characters such as '@' and '&' survive
    postData.Append(String.Format("&email={0}", Uri.EscapeDataString(email)));
    postData.Append(String.Format("&pass={0}", Uri.EscapeDataString(password)));
    postData.Append("&default_persistent=0");

    //write the POST data to the stream
    using (StreamWriter writer = new StreamWriter(request.GetRequestStream()))
        writer.Write(postData.ToString());
    response = (HttpWebResponse)request.GetResponse();

    //Read the web page (HTML) that we retrieve after sending the request
    using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        returnData = reader.ReadToEnd();
    return !returnData.Contains("Please re-enter your password");
}
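The POST body above is assembled by hand, which only works if every value is already percent-encoded; an email address or password containing characters such as @ or & would otherwise corrupt the body. Below is a minimal sketch of a helper that does the encoding, using only the base class library. FormEncoder and Encode are hypothetical names chosen for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormEncoder
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs,
    // percent-encoding each name and value with Uri.EscapeDataString.
    public static string Encode(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var body = new StringBuilder();
        foreach (var field in fields)
        {
            if (body.Length > 0)
                body.Append('&');
            body.Append(Uri.EscapeDataString(field.Key));
            body.Append('=');
            body.Append(Uri.EscapeDataString(field.Value ?? string.Empty));
        }
        return body.ToString();
    }
}
```

Passing a list or array of pairs keeps the fields in the order you captured them. The result can be written to the request stream in place of postData.ToString().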

Sample Code on Grabbing Contents (Screen Scraping)
Uri uri = new Uri("http://www.microsoft.com/default.aspx");
if (uri.Scheme == Uri.UriSchemeHttp)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = WebRequestMethods.Http.Get;
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    StreamReader reader = new StreamReader(response.GetResponseStream());
    string tmp = reader.ReadToEnd();
    response.Close();
    Response.Write(tmp); // ASP.NET page context; use Console.WriteLine(tmp) in a desktop app
}
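Once the page HTML is in a string, the remaining step from the question is pulling the value out of a span. Below is a minimal sketch, assuming the span can be identified by an id attribute and contains no nested span elements. SpanReader and GetSpanText are hypothetical names, and the id is whatever the target page actually uses.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class SpanReader
{
    // Returns the decoded, trimmed inner text of the first <span> whose id matches,
    // or null when no such span is found. Not a full HTML parser: nested spans
    // or unquoted attributes are not handled.
    public static string GetSpanText(string html, string spanId)
    {
        Match match = Regex.Match(
            html,
            "<span\\b[^>]*\\bid\\s*=\\s*[\"']" + Regex.Escape(spanId) + "[\"'][^>]*>(.*?)</span>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return match.Success
            ? WebUtility.HtmlDecode(match.Groups[1].Value).Trim()
            : null;
    }
}
```

For anything beyond a simple, stable page, a real HTML parser (HtmlAgilityPack is a commonly used one for .NET) is likely to be more robust than a regular expression.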
Sample Code on how to Post Data to remote Web Page using HttpWebRequest
Uri uri = new Uri("http://www.amazon.com/exec/obidos/search-handle-form/102-5194535-6807312");
// Percent-encode the value so the space in "ASP.NET 2.0" is sent correctly
string data = "field-keywords=" + Uri.EscapeDataString("ASP.NET 2.0");
if (uri.Scheme == Uri.UriSchemeHttp)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = WebRequestMethods.Http.Post;
    request.ContentLength = data.Length; // equals the byte count because the encoded data is pure ASCII
    request.ContentType = "application/x-www-form-urlencoded";
    StreamWriter writer = new StreamWriter(request.GetRequestStream());
    writer.Write(data);
    writer.Close();
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    StreamReader reader = new StreamReader(response.GetResponseStream());
    string tmp = reader.ReadToEnd();
    response.Close();
    Response.Write(tmp); // ASP.NET page context; use Console.WriteLine(tmp) in a desktop app
}
Source

Any HTTP client implementation will do; there are plenty of open-source libraries for that. Look at curl, for example; someone has written a .NET wrapper for it.

You can continue using WebClient to POST (instead of GET, which is the HTTP verb you're currently using with DownloadString), but I think you'll find it easier to work with the (slightly) lower-level classes WebRequest and WebResponse.
There are two parts to this - the first is to post the login form, the second is recovering the "Set-cookie" header and sending that back to the server as "Cookie" along with your GET request. The server will use this cookie to identify you from now on (assuming it's using cookie-based authentication which I'm fairly confident it is as that page returns a Set-cookie header which includes "PHPSESSID").
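To make the second part concrete, here is a minimal sketch of turning one Set-Cookie response header value into the value to send back in a Cookie request header. It assumes a single cookie per header value; CookieHelper and ToCookieHeader are hypothetical names. Assigning a CookieContainer to HttpWebRequest.CookieContainer does this bookkeeping for you, so the manual version is mainly useful for understanding what goes over the wire.

```csharp
using System;

public static class CookieHelper
{
    // Takes one Set-Cookie header value, e.g. "PHPSESSID=abc123; path=/; HttpOnly",
    // and returns only the "name=value" pair that belongs in a Cookie request header.
    // Attributes such as path, expires and HttpOnly are dropped.
    public static string ToCookieHeader(string setCookieValue)
    {
        if (string.IsNullOrEmpty(setCookieValue))
            return string.Empty;
        int end = setCookieValue.IndexOf(';');
        string pair = end >= 0 ? setCookieValue.Substring(0, end) : setCookieValue;
        return pair.Trim();
    }
}
```

The returned string would then be set as the Cookie header on the follow-up GET request.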

Related

Do I need to post every request header when simulating a webpage log in through C#?

I am working on getting information that is behind a log in page, and using this as my starting point.
Looking at the Network tab, I inspected the form data and saw that there were three values in addition to client/password (csrf, time, hash).
I attempted to log into the site as follows.
string formUrl = "mysite_loginaction";
string formParams = string.Format("client_id={0}&password={1}", "client", "password");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
    os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
When I print resp to my console, it shows me the login page, when I was expecting the next page after login (a Google 2FA page).
Do I need to post the csrf, time, and hash values as well to get a successful login?
As mentioned in your link, there is the concept of a session ID token. If you want to stay logged in, you need to pass that token with every subsequent HTTP request.
Also, the CSRF token will be different each time you make the request, but you do need to pass it along with your next request for it to succeed.
To learn more about CSRF, I'll refer you to this link.
You're going to have to mess around with it. Most of the time you don't need all the headers, but I would assume that hash is required.
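Because the CSRF token changes per request, the usual pattern is to GET the login page first, read the token out of the form, and include it in the POST. Below is a minimal sketch of the extraction step, assuming the token is delivered as a quoted value on an input element. HiddenFieldReader and GetHiddenValue are hypothetical names, and the field name "csrf" is taken from the question.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HiddenFieldReader
{
    // Returns the value attribute of the first <input> whose name matches,
    // or null when the field (or its value attribute) is not present.
    public static string GetHiddenValue(string html, string fieldName)
    {
        // Step 1: locate the whole <input ...> tag carrying the requested name.
        Match tag = Regex.Match(
            html,
            "<input\\b[^>]*\\bname\\s*=\\s*[\"']" + Regex.Escape(fieldName) + "[\"'][^>]*>",
            RegexOptions.IgnoreCase);
        if (!tag.Success)
            return null;

        // Step 2: read the value attribute from that tag, wherever it appears in it.
        Match value = Regex.Match(
            tag.Value,
            "\\bvalue\\s*=\\s*[\"']([^\"']*)[\"']",
            RegexOptions.IgnoreCase);
        return value.Success ? WebUtility.HtmlDecode(value.Groups[1].Value) : null;
    }
}
```

The same call can be repeated for the time and hash fields. The GET and the POST need to share one CookieContainer so that the server can match the token to the session.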

Can you make a server side httprequest that mimicks a html form post?

When you perform an HTML POST to a page, it does this:
validates based on your form fields
redirects the user to a different URL and responds with an authentication cookie and another cookie.
If I create a simple HTML page with a form on it, it works fine and I get redirected.
Would it be possible to do this on the server side only? If the endpoint reads the form fields and then redirects with cookies, then I guess not, right? The server side won't have any notion of the cookies.
This should work for you to post server side:
WebRequest request = WebRequest.Create(your_url);
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
// Write the form fields (placeholder data; use the fields your form actually posts)
using (StreamWriter sw = new StreamWriter(request.GetRequestStream()))
    sw.Write("field1=value1");
// Read the response
WebResponse wr = request.GetResponse();
StreamReader sr = new StreamReader(wr.GetResponseStream());
var ReturnValue = sr.ReadToEnd().Trim();
You can set cookies on the post like this (do it before calling GetRequestStream):
request.Headers[HttpRequestHeader.Cookie] = "MyCookie=value";

Retrieve DOM data from site

Is there any chance to retrieve DOM results when I click older posts from the site:
http://www.facebook.com/FamilyGuy
using C# or Java? I heard that it is possible to execute a script with onclick and get results. How I can execute this script:
onclick="(JSCC.get('j4eb9ad57ab8a19f468880561') && JSCC.get('j4eb9ad57ab8a19f468880561').getHandler())(); return false;"
I think the "older posts" link sends an Ajax request and appends the response to the page. (I'm not sure; you should check the page source.)
You can emulate this behavior in C#, Java, and JavaScript (you already have the code for JavaScript).
Edit:
It seems that Facebook uses some sort of undocumented internal API (JSCC) to load the content.
I don't know about the Facebook Developers APIs (you may want to check those first), but if you want to emulate exactly what happens in your browser, you can use TamperData to intercept the GET requests made when you click the "more posts" link and find the request URL and its parameters.
Once you have this information, your application has to log in to your account and get the authentication cookie.
C# sample code as you requested:
private CookieContainer GetCookieContainer(string loginURL, string userName, string password)
{
    var webRequest = WebRequest.Create(loginURL) as HttpWebRequest;
    var responseReader = new StreamReader(webRequest.GetResponse().GetResponseStream());
    string responseData = responseReader.ReadToEnd();
    responseReader.Close();

    // Now you may need to extract some values from the login form and build the POST data with your username and password.
    // I don't know what exactly you need to POST but again a TamperData observation will help you to find out.
    string postData = String.Format("UserName={0}&Password={1}", userName, password); // I emphasize that this is just an example.

    // cookie container
    var cookies = new CookieContainer();

    // post the login form
    webRequest = WebRequest.Create(loginURL) as HttpWebRequest;
    webRequest.Method = "POST";
    webRequest.ContentType = "application/x-www-form-urlencoded";
    webRequest.CookieContainer = cookies;

    // write the form values into the request message
    var requestWriter = new StreamWriter(webRequest.GetRequestStream());
    requestWriter.Write(postData);
    requestWriter.Close();

    webRequest.GetResponse().Close();
    return cookies;
}
Then you can perform GET requests with the cookie you have, on the URL you've got from analyzing that JSCC.get().getHandler() requests using TamperData, and eventually you'll get what you want as a response stream:
var webRequest = WebRequest.Create(url) as HttpWebRequest;
// Log in against the login page URL (loginURL), not the data URL being fetched
webRequest.CookieContainer = GetCookieContainer(loginURL, userName, password);
var responseStream = webRequest.GetResponse().GetResponseStream();
You can also use Selenium for browser automation. It also has C# and Java APIs (I have no experience using Selenium).
Facebook loads its content dynamically with AJAX. You can use a tool like Firebug to examine what kind of request is made, and then replicate it.
Or you can use a browser render engine like webkit to process the JavaScript for you and expose the resulting HTML:
http://webscraping.com/blog/Scraping-JavaScript-webpages-with-webkit/

C# REST client sending data using POST

I'm trying to send a simple POST request to a REST web service and print the response (code is below, mostly taken from Yahoo! developer documentation and the MSDN code snippets provided with some of the documentation). I would expect the client to send:
Request Method: POST (i.e. I expect $_SERVER['REQUEST_METHOD'] == 'POST' in PHP)
Data: foo=bar (i.e. $_POST['foo'] == 'bar' in PHP)
However, it seems to be sending:
Request Method: FOO=BARPOST
Data: (blank)
I know the API works as I've tested it with clients written in Python and PHP, so I'm pretty sure it must be a problem with my C#. I'm not a .NET programmer by trade so would appreciate any comments/pointers on how to figure out what the problem is - I'm sure it's something trivial but I can't spot it myself.
uri, user and password variables are set earlier in the code - they work fine with GET requests.
request = (HttpWebRequest) WebRequest.Create(uri);
request.Credentials = new NetworkCredential(user, password);
request.Method = WebRequestMethods.Http.Post;
request.ContentType = "application/x-www-form-urlencoded";
string postData = "foo=bar";
request.ContentLength = postData.Length;
StreamWriter postStream = new StreamWriter(request.GetRequestStream(), System.Text.Encoding.ASCII);
postStream.Write(postData);
postStream.Close();
response = (HttpWebResponse) request.GetResponse();
The REST API is written in PHP, and the $_POST array is empty on the server when using the C# client.
Eventually found the HttpWebRequest.PreAuthenticate property which seems to solve the problem if the code is edited like so:
request = (HttpWebRequest) WebRequest.Create(uri);
request.PreAuthenticate = true;
request.Credentials = new NetworkCredential(user, password);
request.Method = WebRequestMethods.Http.Post;
From the documentation I presume this forces authentication before the actual POST request is sent. I'm not sure why the class doesn't do this automatically (libraries for other languages make this process transparent, unless you explicitly turn it off), but it has solved the problem for me and may save someone else another 2 days of searching and hair-pulling.
For what it's worth, PreAuthenticate doesn't need to be set for GET requests, only POST, although if you do set it for a GET request everything will still work, but take slightly longer.

Writing cookies from CookieContainer to the IE cookie store

I want to navigate to a page in a web app from a desktop app. "No problem", I hear you say, "just fire up the default browser with the correct URL". However, the web app uses ASP.NET Forms Authentication, and the users don't want to see the login page because they have already authenticated with the same credentials in the desktop app.
That sounds simple enough: all I have to do is emit an HTTP POST from the desktop app that fakes the postback from the web app's login page. The web app will then set its authentication ticket and session state cookies, return them to me, and I will store them in the IE cookie store. I can then navigate to the desired page and the web app will think that it's already authenticated.
I have some working code that constructs the HTTP POST, sends it off, and gets a valid response containing the right cookies. However, I can't see how to write them into the IE cookie store. Can anyone point me in the right direction?
Sample code:
var requestUrl = Properties.Settings.Default.WebsiteLoginPageUrl;
var requestEncoding = Encoding.GetEncoding(1252);

// Simulated login postdata
var requestText = string.Format(
    "__VIEWSTATE={2}&__EVENTTARGET={3}&__EVENTARGUMENT={4}&__EVENTVALIDATION={5}&userNameText={0}&passwordText={1}&submitButton=Log+In",
    HttpUtility.UrlEncode(Properties.Settings.Default.UserName),
    HttpUtility.UrlEncode(Properties.Settings.Default.Password),
    Properties.Settings.Default.FakeViewState,
    Properties.Settings.Default.FakeEventTarget,
    Properties.Settings.Default.FakeEventArgument,
    Properties.Settings.Default.FakeEventValidation);

var request = (HttpWebRequest) WebRequest.Create(requestUrl);
request.Method = "POST";
request.Accept = "*/*";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = requestEncoding.GetByteCount(requestText);
request.Headers.Add(HttpRequestHeader.CacheControl, "no-cache");
request.AllowAutoRedirect = false;
request.KeepAlive = false;
request.CookieContainer = new CookieContainer();

using (var writer = new StreamWriter(request.GetRequestStream(), requestEncoding)) {
    writer.Write(requestText);
}

var response = (HttpWebResponse) request.GetResponse();

// TODO: Grab the response cookies and save them to the interactive desktop user's cookie store.
Process.Start(new ProcessStartInfo {
    FileName = Properties.Settings.Default.WebsiteTargetPageUrl,
    UseShellExecute = true,
});
You need to call the unmanaged InternetSetCookie() function. And look! Someone wrote the interop for you already. You should verify its correctness though.
