eventvalidation error for screen scraping - c#

I'm trying to scrape the login page of a site I work on and submit a username/password via code to log into the site. I want to do this in a checking service for site health reasons. I'm running into a few issues, the first of which deals with getting this message:
Exception information:
Exception type: ArgumentException
Exception message: Invalid postback or callback argument. Event validation is enabled using <pages enableEventValidation="true"/> in configuration or <%# Page EnableEventValidation="true" %> in a page. For security purposes, this feature verifies that arguments to postback or callback events originate from the server control that originally rendered them. If the data is valid and expected, use the ClientScriptManager.RegisterForEventValidation method in order to register the postback or callback data for validation.
I've found sites that say I'd have to turn eventvalidation off, but I don't want to do that for security reasons. Is there a way to get around this?
Here is the code. I basically took it right off K. Scott Allen's article here: http://odetocode.com/Articles/162.aspx
StringBuilder sb = new StringBuilder();
var encryptedConnectionString = GlobalDataObject.EncryptSecure("conn string here", GlobalDataObject.Seed);
sb.AppendFormat("client_id={0};", "client");
sb.AppendFormat("client_directory={0};", "client");
sb.AppendFormat("user_id={0};", "12");
sb.AppendFormat("conn_string={0};", encryptedConnectionString);
StringBuilder cookiesString = sb;
HttpWebRequest webRequest = WebRequest.Create("http://localhost/site/login.aspx?c=client") as HttpWebRequest;
webRequest.Headers.Add("Cookie", cookiesString.ToString());
StreamReader responseReader = new StreamReader(
webRequest.GetResponse().GetResponseStream()
);
string responseData = responseReader.ReadToEnd();
responseReader.Close();
// extract the viewstate value and build out POST data
string viewState = ExtractViewState(responseData);
string postData = string.Format("__VIEWSTATE={0}&Login1$Password={1}&Login1$UserName={2}&Login1$LoginButton={3}",
viewState,
HttpUtility.UrlEncode(username),
HttpUtility.UrlEncode(password),
"Log In");
// have a cookie container ready to receive the forms auth cookie
CookieContainer cookies = new CookieContainer();
// now post to the login form
webRequest = WebRequest.Create("http://localhost/site/login.aspx") as HttpWebRequest;
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";
webRequest.CookieContainer = cookies;
// write the form values into the request message
StreamWriter requestWriter = new StreamWriter(webRequest.GetRequestStream());
requestWriter.Write(postData);
requestWriter.Close();
webRequest.AuthenticationLevel = AuthenticationLevel.None;
// we don't need the contents of the response, just the cookie it issues
webRequest.GetResponse().Close(); ///ERROR HAPPENS HERE
// now we can send out cookie along with a request for the protected page
webRequest = WebRequest.Create("http://localhost/site/user/home.aspx") as HttpWebRequest;
webRequest.CookieContainer = cookies;
responseReader = new StreamReader(webRequest.GetResponse().GetResponseStream());
// and read the response
responseData = responseReader.ReadToEnd();
responseReader.Close();
return responseData;
Thanks.

The idea of this error is that it's looking out for requests that are malformed in way that might compromise the app. Being as how it's a login page, I bet you're not trying to pass in unencoded HTML or something.
Edit: Capture the built-up event validation data and send that back with the login request, same as the viewstate.

Related

Problems authenticating to website from code

I am trying to write code that will authenticate to the website wallbase.cc. I've looked at what it does using Firfebug/Chrome Developer tools and it seems fairly easy:
Post "usrname=$USER&pass=$PASS&nopass_email=Type+in+your+e-mail+and+press+enter&nopass=0" to the webpage "http://wallbase.cc/user/login", store the returned cookies and use them on all future requests.
Here is my code:
private CookieContainer _cookies = new CookieContainer();
//......
HttpPost("http://wallbase.cc/user/login", string.Format("usrname={0}&pass={1}&nopass_email=Type+in+your+e-mail+and+press+enter&nopass=0", Username, assword));
//......
private string HttpPost(string url, string parameters)
{
try
{
System.Net.WebRequest req = System.Net.WebRequest.Create(url);
//Add these, as we're doing a POST
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
((HttpWebRequest)req).Referer = "http://wallbase.cc/home/";
((HttpWebRequest)req).CookieContainer = _cookies;
//We need to count how many bytes we're sending. Post'ed Faked Forms should be name=value&
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(parameters);
req.ContentLength = bytes.Length;
System.IO.Stream os = req.GetRequestStream();
os.Write(bytes, 0, bytes.Length); //Push it out there
os.Close();
//get response
using (System.Net.WebResponse resp = req.GetResponse())
{
if (resp == null) return null;
using (Stream st = resp.GetResponseStream())
{
System.IO.StreamReader sr = new System.IO.StreamReader(st);
return sr.ReadToEnd().Trim();
}
}
}
catch (Exception)
{
return null;
}
}
After calling HttpPost with my login parameters I would expect all future calls using this same method to be authenticated (assuming a valid username/password). I do get a session cookie in my cookie collection but for some reason I'm not authenticated. I get a session cookie in my cookie collection regardless of which page I visit so I tried loading the home page first to get the initial session cookie and then logging in but there was no change.
To my knowledge this Python version works: https://github.com/sevensins/Wallbase-Downloader/blob/master/wallbase.sh (line 336)
Any ideas on how to get authentication working?
Update #1
When using a correct user/password pair the response automatically redirects to the referrer but when an incorrect user/pass pair is received it does not redirect and returns a bad user/pass pair. Based on this it seems as though authentication is happening, but maybe not all the key pieces of information are being saved??
Update #2
I am using .NET 3.5. When I tried the above code in .NET 4, with the added line of System.Net.ServicePointManager.Expect100Continue = false (which was in my code, just not shown here) it works, no changes necessary. The problem seems to stem directly from some pre-.Net 4 issue.
This is based on code from one of my projects, as well as code found from various answers here on stackoverflow.
First we need to set up a Cookie aware WebClient that is going to use HTML 1.0.
public class CookieAwareWebClient : WebClient
{
private CookieContainer cookie = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = (HttpWebRequest)base.GetWebRequest(address);
request.ProtocolVersion = HttpVersion.Version10;
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = cookie;
}
return request;
}
}
Next we set up the code that handles the Authentication and then finally loads the response.
var client = new CookieAwareWebClient();
client.UseDefaultCredentials = true;
client.BaseAddress = #"http://wallbase.cc";
var loginData = new NameValueCollection();
loginData.Add("usrname", "test");
loginData.Add("pass", "123");
loginData.Add("nopass_email", "Type in your e-mail and press enter");
loginData.Add("nopass", "0");
var result = client.UploadValues(#"http://wallbase.cc/user/login", "POST", loginData);
string response = System.Text.Encoding.UTF8.GetString(result);
We can try this out using the HTML Visualizer inbuilt into Visual Studio while staying in debug mode and use that to confirm that we were able to authenticate and load the Home page while staying authenticated.
The key here is to set up a CookieContainer and use HTTP 1.0, instead of 1.1. I am not entirely sure why forcing it to use 1.0 allows you to authenticate and load the page successfully, but part of the solution is based on this answer.
https://stackoverflow.com/a/10916014/408182
I used Fiddler to make sure that the response sent by the C# Client was the same as with my web browser Chrome. It also allows me to confirm if the C# client is being redirect correctly. In this case we can see that with HTML 1.0 we are getting the HTTP/1.0 302 Found and then redirects us to the home page as intended. If we switch back to HTML 1.1 we will get an HTTP/1.1 417 Expectation Failed message instead.
There is some information on this error message available in this stackoverflow thread.
HTTP POST Returns Error: 417 "Expectation Failed."
Edit: Hack/Fix for .NET 3.5
I have spent a lot of time trying to figure out the difference between 3.5 and 4.0, but I seriously have no clue. It looks like 3.5 is creating a new cookie after the authentication and the only way I found around this was to authenticate the user twice.
I also had to make some changes on the WebClient based on information from this post.
http://dot-net-expertise.blogspot.fr/2009/10/cookiecontainer-domain-handling-bug-fix.html
public class CookieAwareWebClient : WebClient
{
public CookieContainer cookies = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
var httpRequest = request as HttpWebRequest;
if (httpRequest != null)
{
httpRequest.ProtocolVersion = HttpVersion.Version10;
httpRequest.CookieContainer = cookies;
var table = (Hashtable)cookies.GetType().InvokeMember("m_domainTable", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.GetField | System.Reflection.BindingFlags.Instance, null, cookies, new object[] { });
var keys = new ArrayList(table.Keys);
foreach (var key in keys)
{
var newKey = (key as string).Substring(1);
table[newKey] = table[key];
}
}
return request;
}
}
var client = new CookieAwareWebClient();
var loginData = new NameValueCollection();
loginData.Add("usrname", "test");
loginData.Add("pass", "123");
loginData.Add("nopass_email", "Type in your e-mail and press enter");
loginData.Add("nopass", "0");
// Hack: Authenticate the user twice!
client.UploadValues(#"http://wallbase.cc/user/login", "POST", loginData);
var result = client.UploadValues(#"http://wallbase.cc/user/login", "POST", loginData);
string response = System.Text.Encoding.UTF8.GetString(result);
You may need to add the following:
//get response
using (System.Net.WebResponse resp = req.GetResponse())
{
foreach (Cookie c in resp.Cookies)
_cookies.Add(c);
// Do other stuff with response....
}
Another thing that you might have to do is, if the server responds with a 302 (redirect) the .Net web request will automatically follow it and in the process you might lose the cookie you're after. You can turn off this behavior with the following code:
req.AllowAutoRedirect = false;
The Python you reference uses a different referrer (http://wallbase.cc/start/). It is also followed by another post to (http://wallbase.cc/user/adult_confirm/1). Try the other referrer and followup with this POST.
I think you are authenticating correctly, but that the site needs more info/assertions from you before proceeding.

HttpWebRequest (post) and redirection

I'm trying to log onto the following website using HttpWebRequest: http://mostanmeldung.moessinger.at/login.php
Texts are in German, but they don't really matter. If you look at the source code (which by the way was not written by me, so don't blame me for its bad style :P), you will see a form tag that contains two input tags. The name of the first one is "BN" (username), and the name of the second one is "PW" (password). I am trying to send data containing values for these two inputs to the webserver using the HttpWebRequest class. However, posting the values redirects the request to another page called "einloggen.php". On that site I am told whether my login was successful.
My problem is that I am able to send the data without any problems, however, all I receive is the content of "login.php", the site you have to enter your username and password on.
This is what my code looks like:
string post = String.Format(PostPattern, Username, Password);
byte[] postBytes = Encoding.ASCII.GetBytes(post);
CookieContainer cookies = new CookieContainer();
// "Address": http://mostanmeldung.moessinger.at/login.php
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(Address);
req.CookieContainer = cookies;
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = postBytes.Length;
req.AllowAutoRedirect = true;
MessageBox.Show(post); // shows me "BN=boop;PW=hi"
Stream reqStream = req.GetRequestStream();
reqStream.Write(postBytes, 0, postBytes.Length);
reqStream.Close();
WebResponse res;
if (/*req.HaveResponse &&*/ (res = req.GetResponse()) != null)
{
StreamReader reader = new StreamReader(res.GetResponseStream());
MessageBox.Show(reader.ReadToEnd());
return AuthResult.Success;
}
return AuthResult.NoResponse;
The message box at line 22 (5 lines before the end) shows me the content of "login.php" instead of "einloggen.php" which I am redirected to. Why is that?
The ACTION on that form points to einloggen.php, not login.php, so you need to send your POST data to einloggen.php instead.

Help Rendering ASP.NET MVC View from a Console App

I have created an ASP.NET MVC View. On my MVC WebApp, it works great.
I would like to be able to (from a console app) render the View out as an HTML Email. I'm wondering what the best way to do this is going to be, the part I'm struggling with is Rendering the View.
Is there any way to do this from a console application?
The webapp simply calls a web-service and formats the data nicely so the console application will have access to the same web-service; however, the ActionResult on the controller is protected by [Authorize] attributes, so not just anyone can get at it.
Yes you can. I assume you are using forms authentication. Just authenticate, grab the session header cookie and copy it to your new web request.
I ended up using HttpWebRequest and the info provided here: http://odetocode.com/articles/162.aspx
From the article:
// first, request the login form to get the viewstate value
HttpWebRequest webRequest = WebRequest.Create(LOGIN_URL) as HttpWebRequest;
StreamReader responseReader = new StreamReader(
webRequest.GetResponse().GetResponseStream()
);
string responseData = responseReader.ReadToEnd();
responseReader.Close();
// extract the viewstate value and build out POST data
string viewState = ExtractViewState(responseData);
string postData =
String.Format(
"__VIEWSTATE={0}&UsernameTextBox={1}&PasswordTextBox={2}&LoginButton=Login",
viewState, USERNAME, PASSWORD
);
// have a cookie container ready to receive the forms auth cookie
CookieContainer cookies = new CookieContainer();
// now post to the login form
webRequest = WebRequest.Create(LOGIN_URL) as HttpWebRequest;
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";
webRequest.CookieContainer = cookies;
// write the form values into the request message
StreamWriter requestWriter = new StreamWriter(webRequest.GetRequestStream());
requestWriter.Write(postData);
requestWriter.Close();
// we don't need the contents of the response, just the cookie it issues
webRequest.GetResponse().Close();
// now we can send out cookie along with a request for the protected page
webRequest = WebRequest.Create(SECRET_PAGE_URL) as HttpWebRequest;
webRequest.CookieContainer = cookies;
responseReader = new StreamReader(webRequest.GetResponse().GetResponseStream());
// and read the response
responseData = responseReader.ReadToEnd();
responseReader.Close();
Response.Write(responseData);

Redirect user to authenticated page that uses forms authentication, using HTTP Location Header, HttpWebRequest/Response and Response.Cookies.Add()

I need to autheticate on a site using forms authentication and then redirect the user to that site along with the session cookie. I have not figured out how to successfully do this. Here's my code so far.. I still get redirected to that apps login page. Any help is much appreciated!
protected void Button1_Click(object sender, EventArgs e)
{
string data = "nickname=&login={0}&password={1}&action_login.x=70&action_login.y=14action_login=Login";
string postdata = String.Format(data, "test", "test");
string page = #"http://1.1.1.1/home.asp";
string loginPage = #"http://1.1.1.1/auth.asp";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(loginPage);
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.AllowAutoRedirect = false;
ASCIIEncoding encoding = new ASCIIEncoding(); //encoder
byte[] requestData = encoding.GetBytes(postdata); //encode post data
request.ContentLength = requestData.Length;
//write the post data to the request
Stream requestStream = request.GetRequestStream();
// Send the data.
requestStream.Write(requestData, 0, requestData.Length);
requestStream.Close();
try
{
HttpWebResponse response = (HttpWebResponse) request.GetResponse();
string cookieHeader = response.GetResponseHeader("Set-Cookie");
string cookieValue = cookieHeader.Replace("pp_session_id=", "");
HttpCookie cookie = new HttpCookie("pp_session_id",cookieValue);
cookie.Domain = "1.1.1.1";
cookie.Path = "/";
Response.Clear();
Response.StatusCode = 302;
//Response.AddHeader("Set-Cookie", cookieHeader);
Response.AddHeader("Location",page);
Response.RedirectLocation = page;
Response.Cookies.Add(cookie);
Response.Flush();
}
catch (WebException ex)
{
Response.Write(ex.Message);
}
}
Use Firebug on Mozilla Firefox to see what exactly the browser does when logging into the webapp. Then simulate the same sequence through code.
Or, you can use wireshark to sniff the requests sent by the browser.
One thing I can see from your code, is that you are adding the cookie explicitly. You shouldnt be doing this. You should set a CookieContainer on the request, so that the cookies get sent with all the requests to that site.
hope that helps.
What's wrong with using the FormsAuthentication class? In particular, have you tried the following sequence (or a variation of it):
FormsAuthentication.Authenticate();
FormsAuthentication.SetAuthCookie();
FormsAuthentication.RedirectFromLoginPage();
i believe you have to do a request to an authenticated page on the remote web app.
you'll have to grab the cookie it gives you so you have a valid session. aspnet session id is passed in the cookie. Then you will need to pass the username and password required for that app along with the cookie you received so you will have a valid authenticated session.

How to make my web scraper log in to this website via C#

I have an application that reads parts of the source code on a website. That all works; but the problem is that the page in question requires the user to be logged in to access this source code. What my program needs a way to initially log the user into the website- after that is done, I'll be able to access and read the source code.
The website that needs to be logged into is:
mmoinn.com/index.do?PageModule=UsersLogin
You can continue using WebClient to POST (instead of GET, which is the HTTP verb you're currently using with DownloadString), but I think you'll find it easier to work with the (slightly) lower-level classes WebRequest and WebResponse.
There are two parts to this - the first is to post the login form, the second is recovering the "Set-cookie" header and sending that back to the server as "Cookie" along with your GET request. The server will use this cookie to identify you from now on (assuming it's using cookie-based authentication which I'm fairly confident it is as that page returns a Set-cookie header which includes "PHPSESSID").
POSTing to the login form
Form posts are easy to simulate, it's just a case of formatting your post data as follows:
field1=value1&field2=value2
Using WebRequest and code I adapted from Scott Hanselman, here's how you'd POST form data to your login form:
string formUrl = "http://www.mmoinn.com/index.do?PageModule=UsersAction&Action=UsersLogin"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML's form tag
string formParams = string.Format("email_address={0}&password={1}", "your email", "your password");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
Here's an example of what you should see in the Set-cookie header for your login form:
PHPSESSID=c4812cffcf2c45e0357a5a93c137642e; path=/; domain=.mmoinn.com,wowmine_referer=directenter; path=/; domain=.mmoinn.com,lang=en; path=/;domain=.mmoinn.com,adt_usertype=other,adt_host=-
GETting the page behind the login form
Now you can perform your GET request to a page that you need to be logged in for.
string pageSource;
string getUrl = "the url of the page behind the login";
WebRequest getRequest = WebRequest.Create(getUrl);
getRequest.Headers.Add("Cookie", cookieHeader);
WebResponse getResponse = getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
pageSource = sr.ReadToEnd();
}
EDIT:
If you need to view the results of the first POST, you can recover the HTML it returned with:
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
pageSource = sr.ReadToEnd();
}
Place this directly below cookieHeader = resp.Headers["Set-cookie"]; and then inspect the string held in pageSource.
You can simplify things quite a bit by creating a class that derives from WebClient, overriding its GetWebRequest method and setting a CookieContainer object on it. If you always set the same CookieContainer instance, then cookie management will be handled automatically for you.
But the only way to get at the HttpWebRequest before it is sent is to inherit from WebClient and override that method.
public class CookieAwareWebClient : WebClient
{
private CookieContainer cookie = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = cookie;
}
return request;
}
}
var client = new CookieAwareWebClient();
client.BaseAddress = #"https://www.site.com/any/base/url/";
var loginData = new NameValueCollection();
loginData.Add("login", "YourLogin");
loginData.Add("password", "YourPassword");
client.UploadValues("login.php", "POST", loginData);
//Now you are logged in and can request pages
string htmlSource = client.DownloadString("index.php");
Matthew Brindley, your code worked very good for some website I needed (with login), but I needed to change to HttpWebRequest and HttpWebResponse otherwise I get a 404 Bad Request from the remote server. Also I would like to share my workaround using your code, and is that I tried it to login to a website based on moodle, but it didn't work at your step "GETting the page behind the login form" because when successfully POSTing the login, the Header 'Set-Cookie' didn't return anything despite other websites does.
So I think this where we need to store cookies for next Requests, so I added this.
To the "POSTing to the login form" code block :
var cookies = new CookieContainer();
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(formUrl);
req.CookieContainer = cookies;
And To the "GETting the page behind the login form" :
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.CookieContainer = new CookieContainer();
getRequest.CookieContainer.Add(resp.Cookies);
getRequest.Headers.Add("Cookie", cookieHeader);
Doing this, lets me Log me in and get the source code of the "page behind login" (website based moodle) I know this is a vague use of the CookieContainer and HTTPCookies because we may ask first is there a previously set of cookies saved before sending the request to the server. This works without problem anyway, but here's a good info to read about WebRequest and WebResponse with sample projects and tutorial:
Retrieving HTTP content in .NET
How to use HttpWebRequest and HttpWebResponse in .NET
Sometimes, it may help switching off AllowAutoRedirect and setting both login POST and page GET requests the same user agent.
request.UserAgent = userAgent;
request.AllowAutoRedirect = false;

Categories

Resources