I've been trying to automate logging in to a website I frequent, www.bungie.net. The site is associated with Microsoft and Xbox Live, and as such makes use of the Windows Live ID API when people log in.
I am relatively new to creating web spiders/robots, and I worry that I'm misunderstanding some of the most basic concepts. I've simulated logins to other sites such as Facebook and Gmail, but live.com has given me nothing but trouble.
Anyway, I've been using Wireshark and the Firefox addon Tamper Data to figure out what I need to POST and which cookies I need to include with my requests. As far as I can tell, these are the steps one must follow to log in to this site:
1. Visit https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=11&ct=1268167141&rver=5.5.4177.0&wp=LBI&wreply=http:%2F%2Fwww.bungie.net%2FDefault.aspx&id=42917
2. Receive the cookies MSPRequ and MSPOK.
3. Post the values of the hidden fields "PPSX" and "PPFT", plus your username and password, to a changing URL similar to: https://login.live.com/ppsecure/post.srf?wa=wsignin1.0&rpsnv=11&ct=
(there are a few numbers that change at the end of that URL)
4. Live.com returns a page with more hidden fields to post. The client then posts the values of the fields "ANON", "ANONExp" and "t" to the URL: http://www.bungie.net/Default.aspx?wa=wsignin1.0
5. After posting that data, the user is returned a variety of cookies, the most important of which is "BNGAuth", the login cookie for the site.
Where I am having trouble is the fifth step, though that doesn't necessarily mean I've done all the other steps correctly. I post the data from "ANON", "ANONExp" and "t", but instead of being returned a BNGAuth cookie, I'm returned a cookie named "RSPMaybe" and redirected to the home page.
When I reviewed the Wireshark log, I noticed something that instantly stood out as different between the session where I logged in with Firefox and the one produced by my program. It could be nothing, but I'll include the picture here for you to review. I'm being returned an HTTP packet from the site before I post the data in the fourth step. I'm not sure how this is happening, but it must be a side effect of something I'm doing wrong in the HTTPS steps.
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Text;
using System.Net;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;
using System.Web;

namespace SpiderFromScratch
{
    class Program
    {
        static void Main(string[] args)
        {
            CookieContainer cookies = new CookieContainer();
            Uri url = new Uri("https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=11&ct=1268167141&rver=5.5.4177.0&wp=LBI&wreply=http:%2F%2Fwww.bungie.net%2FDefault.aspx&id=42917");
            HttpWebRequest http = (HttpWebRequest)HttpWebRequest.Create(url);
            http.Timeout = 30000;
            http.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8) Gecko/20100202 Firefox/3.5.8 (.NET CLR 3.5.30729)";
            http.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            http.Headers.Add("Accept-Language", "en-us,en;q=0.5");
            http.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
            http.Headers.Add("Keep-Alive", "300");
            http.Referer = "http://www.bungie.net/";
            http.ContentType = "application/x-www-form-urlencoded";
            http.CookieContainer = new CookieContainer();
            http.Method = WebRequestMethods.Http.Get;

            HttpWebResponse response = (HttpWebResponse)http.GetResponse();
            StreamReader readStream = new StreamReader(response.GetResponseStream());
            string HTML = readStream.ReadToEnd();
            readStream.Close();

            //gets the cookies (looked up by header name rather than by a positional index)
            string[] strCookies = response.Headers.GetValues("Set-Cookie");
            response.Close();

            string name, value;
            Cookie manualCookie;
            for (int i = 0; i < strCookies.Length; i++)
            {
                name = strCookies[i].Substring(0, strCookies[i].IndexOf("="));
                value = strCookies[i].Substring(strCookies[i].IndexOf("=") + 1, strCookies[i].IndexOf(";") - strCookies[i].IndexOf("=") - 1);
                manualCookie = new Cookie(name, "\"" + value + "\"");
                Uri manualURL = new Uri("http://login.live.com");
                http.CookieContainer.Add(manualURL, manualCookie);
            }

            //stores the cookies to be used later
            cookies = http.CookieContainer;

            //Get the PPSX value
            string PPSX = HTML.Remove(0, HTML.IndexOf("PPSX"));
            PPSX = PPSX.Remove(0, PPSX.IndexOf("value") + 7);
            PPSX = PPSX.Substring(0, PPSX.IndexOf("\""));

            //Get this random PPFT value
            string PPFT = HTML.Remove(0, HTML.IndexOf("PPFT"));
            PPFT = PPFT.Remove(0, PPFT.IndexOf("value") + 7);
            PPFT = PPFT.Substring(0, PPFT.IndexOf("\""));

            //Get the random URL you POST to
            string POSTURL = HTML.Remove(0, HTML.IndexOf("https://login.live.com/ppsecure/post.srf?wa=wsignin1.0&rpsnv=11&ct="));
            POSTURL = POSTURL.Substring(0, POSTURL.IndexOf("\""));

            //POST with cookies
            http = (HttpWebRequest)HttpWebRequest.Create(POSTURL);
            http.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8) Gecko/20100202 Firefox/3.5.8 (.NET CLR 3.5.30729)";
            http.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            http.Headers.Add("Accept-Language", "en-us,en;q=0.5");
            http.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
            http.Headers.Add("Keep-Alive", "300");
            http.CookieContainer = cookies;
            http.Referer = "https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=11&ct=1268158321&rver=5.5.4177.0&wp=LBI&wreply=http:%2F%2Fwww.bungie.net%2FDefault.aspx&id=42917";
            http.ContentType = "application/x-www-form-urlencoded";
            http.Method = WebRequestMethods.Http.Post;

            Stream ostream = http.GetRequestStream();
            //used to convert strings into bytes
            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
            //Post information
            byte[] buffer = encoding.GetBytes("PPSX=" + PPSX + "&PwdPad=IfYouAreReadingThisYouHaveTooMuc&login=YOUREMAILGOESHERE&passwd=YOURWORDGOESHERE" +
                "&LoginOptions=2&PPFT=" + PPFT);
            ostream.Write(buffer, 0, buffer.Length);
            ostream.Close();

            HttpWebResponse response2 = (HttpWebResponse)http.GetResponse();
            readStream = new StreamReader(response2.GetResponseStream());
            HTML = readStream.ReadToEnd();

            //list the returned cookies before releasing the response
            foreach (Cookie cookie in response2.Cookies)
            {
                Console.WriteLine(cookie.Name + ": ");
                Console.WriteLine(cookie.Value);
                Console.WriteLine(cookie.Expires);
                Console.WriteLine();
            }
            response2.Close();

            //SET POSTURL value
            string POSTANON = "http://www.bungie.net/Default.aspx?wa=wsignin1.0";

            //Get the ANON value
            string ANON = HTML.Remove(0, HTML.IndexOf("ANON"));
            ANON = ANON.Remove(0, ANON.IndexOf("value") + 7);
            ANON = ANON.Substring(0, ANON.IndexOf("\""));
            ANON = HttpUtility.UrlEncode(ANON);

            //Get the ANONExp value
            string ANONExp = HTML.Remove(0, HTML.IndexOf("ANONExp"));
            ANONExp = ANONExp.Remove(0, ANONExp.IndexOf("value") + 7);
            ANONExp = ANONExp.Substring(0, ANONExp.IndexOf("\""));
            ANONExp = HttpUtility.UrlEncode(ANONExp);

            //Get the t value
            string t = HTML.Remove(0, HTML.IndexOf("id=\"t\""));
            t = t.Remove(0, t.IndexOf("value") + 7);
            t = t.Substring(0, t.IndexOf("\""));
            t = HttpUtility.UrlEncode(t);

            //POST the info and accept the Bungie cookies
            http = (HttpWebRequest)HttpWebRequest.Create(POSTANON);
            http.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8) Gecko/20100202 Firefox/3.5.8 (.NET CLR 3.5.30729)";
            http.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            http.Headers.Add("Accept-Language", "en-us,en;q=0.5");
            http.Headers.Add("Accept-Encoding", "gzip,deflate");
            http.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
            http.Headers.Add("Keep-Alive", "115");
            http.CookieContainer = new CookieContainer();
            http.ContentType = "application/x-www-form-urlencoded";
            http.Method = WebRequestMethods.Http.Post;
            http.Expect = null;

            ostream = http.GetRequestStream();
            //debug: field lengths
            int test = ANON.Length;
            int test1 = ANONExp.Length;
            int test2 = t.Length;
            buffer = encoding.GetBytes("ANON=" + ANON + "&ANONExp=" + ANONExp + "&t=" + t);
            ostream.Write(buffer, 0, buffer.Length);
            ostream.Close();

            //Here lies the problem, I am not returned the correct cookies.
            HttpWebResponse response3 = (HttpWebResponse)http.GetResponse();
            //this request asked for gzip, so decompress before reading
            GZipStream gzip = new GZipStream(response3.GetResponseStream(), CompressionMode.Decompress);
            readStream = new StreamReader(gzip);
            HTML = readStream.ReadToEnd();
            //gets both cookies (again by header name, not position)
            string[] strCookies2 = response3.Headers.GetValues("Set-Cookie");
            response3.Close();
        }
    }
}
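Since the last request above advertises Accept-Encoding: gzip,deflate, the final response has to be unwrapped with GZipStream, which is what the code does. A self-contained roundtrip showing that decompression pattern (the body text is a made-up stand-in for a gzip-encoded HTTP body):

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class GzipRoundTrip
{
    // Gzip-compresses text, standing in for a gzip-encoded HTTP body.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        var packed = new MemoryStream();
        using (var gz = new GZipStream(packed, CompressionMode.Compress))
            gz.Write(raw, 0, raw.Length);
        return packed.ToArray();
    }

    // Decompresses the way the code above wraps response3.GetResponseStream().
    public static string Decompress(byte[] gzipped)
    {
        using (var gz = new GZipStream(new MemoryStream(gzipped), CompressionMode.Decompress))
        using (var reader = new StreamReader(gz))
            return reader.ReadToEnd();
    }

    static void Main()
    {
        byte[] body = Compress("<html>response body</html>");
        Console.WriteLine(Decompress(body)); // <html>response body</html>
    }
}
```

Note that only requests which sent Accept-Encoding need this; the earlier requests in the program did not, which is why they read the stream directly.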
I'm not sure if you're still working on this or not but the Windows Live Development site has a lot of info on it to help with using the Live ID API. I've not had much of a dig into it but their Getting Started page has a load of info plus a link to download sample applications detailing how to use the service in a variety of languages (including C#).
You can download the sample application from there.
It sounds pretty interesting what you're trying to do, so much so that I quite fancy having a play with this myself!
Change your timing and see if you get the same results.
It is so much easier to just use a UI automation framework like WatiN than to use HttpWebRequest, unless that would break your requirements. With WatiN, you think about what is shown in the UI rather than what is in the HTML.
I am developing a C# wpf application that has a functionality of logging into my website and download the file. This said website has an Authorize attribute on its action. I need 2 cookies for me to able to download the file, first cookie is for me to log in, second cookie(which is provided after successful log in) is for me to download the file. So i came up with the flow of keeping my cookies after my httpwebrequest/httpwebresponse. I am looking at my posting flow as maybe it is the problem. Here is my code.
void externalloginanddownload()
{
    string pageSource = string.Empty;
    CookieContainer cookies = new CookieContainer();

    HttpWebRequest getrequest = (HttpWebRequest)WebRequest.Create("login uri");
    getrequest.CookieContainer = cookies;
    getrequest.Method = "GET";
    getrequest.AllowAutoRedirect = false;

    HttpWebResponse getresponse = (HttpWebResponse)getrequest.GetResponse();
    using (StreamReader sr = new StreamReader(getresponse.GetResponseStream()))
    {
        pageSource = sr.ReadToEnd();
    }

    var values = new NameValueCollection
    {
        { "Username", "username" },
        { "Password", "password" },
        { "Remember me?", "False" },
    };

    var parameters = new StringBuilder();
    foreach (string key in values.Keys)
    {
        parameters.AppendFormat("{0}={1}&",
            HttpUtility.UrlEncode(key),
            HttpUtility.UrlEncode(values[key]));
    }
    parameters.Length -= 1;

    HttpWebRequest postrequest = (HttpWebRequest)WebRequest.Create("login uri");
    postrequest.CookieContainer = cookies;
    postrequest.Method = "POST";
    using (var writer = new StreamWriter(postrequest.GetRequestStream()))
    {
        writer.Write(parameters.ToString());
    }

    using (WebResponse response = postrequest.GetResponse()) // the error 500 occurs here
    {
        using (var streamReader = new StreamReader(response.GetResponseStream()))
        {
            string html = streamReader.ReadToEnd();
        }
    }
}
When you get the WebResponse, the cookies returned will be in the response, not in the request (oddly enough, even though you need to set the CookieContainer on the request).
You will need to add the cookies from the response object to your CookieContainer so they get sent on the next request.
One simple way:
foreach (Cookie cookie in getresponse.Cookies)
    cookies.Add(cookie);
Since response.Cookies is already a CookieCollection, you can also add the whole collection in one call (it might help to check for null in case all cookies were already there):
if (response.Cookies != null) cookies.Add(response.Cookies);
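As a minimal, self-contained illustration of the collection-at-once approach (the URL and cookie names here are made up):

```csharp
using System;
using System.Net;

class CookieCopyDemo
{
    static void Main()
    {
        var cookies = new CookieContainer();
        var uri = new Uri("http://example.com/");

        // Stand-in for getresponse.Cookies: what the server sent back.
        var fromResponse = new CookieCollection
        {
            new Cookie("PHPSESSID", "abc123", "/", "example.com"),
            new Cookie("auth", "token", "/", "example.com")
        };

        // A CookieCollection can be added to the container in one call;
        // the container then attaches the cookies to any matching request.
        cookies.Add(fromResponse);

        Console.WriteLine(cookies.GetCookies(uri).Count); // 2
    }
}
```

Any request whose CookieContainer is set to this container will now send both cookies automatically.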
You may also have trouble with your POST, as you need to set the ContentType and the content length:
myWebRequest.ContentLength = parameters.Length;
myWebRequest.AllowWriteStreamBuffering = true;
If you have any multibyte characters to think about, you may have to address that as well by setting the encoding to UTF-8 on the request and the stringbuilder, and converting string to bytes and using that length.
Another tip: some web server code chokes if there is no user agent. Try:
myWebRequest.UserAgent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
And just in case you have any multibyte characters, it is better to do this:
var databytes = System.Text.Encoding.UTF8.GetBytes(parameters.ToString());
myWebRequest.ContentLength = databytes.Length;
myWebRequest.ContentType = "application/x-www-form-urlencoded; charset=utf-8";
using (var stream = myWebRequest.GetRequestStream())
{
stream.Write(databytes, 0, databytes.Length);
}
In the C# application (server-side Web API), enable C++ Exceptions and Common Language Runtime Exceptions in the Exception Settings window (Ctrl+Alt+E) to see which exception the server side throws.
First check that the data binds properly; after that you can see the exact exception. An Internal Server Error is mostly thrown when the data is not in the correct format or an exception is not properly handled.
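To see what a 500 actually said from the client side: when GetResponse() throws, the failed response still rides along on the WebException, and its body can be read. A self-contained sketch using a local HttpListener as a stand-in server (the port and message are arbitrary):

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Threading.Tasks;

class Error500Demo
{
    // Reads the response body even when the server answers with an error
    // status: the response is attached to the thrown WebException.
    public static string GetBodyEvenOnError(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        try
        {
            using (var response = request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
                return reader.ReadToEnd();
        }
        catch (WebException ex) when (ex.Response != null)
        {
            using (var reader = new StreamReader(ex.Response.GetResponseStream()))
                return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Stand-in server on an arbitrary local port; always answers 500.
        var listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:18777/");
        listener.Start();
        Task.Run(() =>
        {
            var ctx = listener.GetContext();
            ctx.Response.StatusCode = 500;
            byte[] body = Encoding.UTF8.GetBytes("server-side detail");
            ctx.Response.OutputStream.Write(body, 0, body.Length);
            ctx.Response.Close();
        });

        Console.WriteLine(GetBodyEvenOnError("http://localhost:18777/"));
        listener.Stop();
    }
}
```

The error page an ASP.NET server returns on a 500 often names the missing piece (here, the ContentType), so reading it beats guessing.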
Somehow I am not able to download the HTML content of https://{YourName}.visualstudio.com/Defaultcollection/ via HttpWebRequest/WebRequest or WebClient.
It always results in an HTML page with the following error message:
Microsoft Internet Explorer's Enhanced Security Configuration is currently enabled on your environment. This enhanced level of security prevents our web integration experiences from displaying or performing correctly. To continue with your operation please disable this configuration or contact your administrator.
I have tried a lot of ways to get to my needed result. I tried using OAuth2 and also set up alternate authentication credentials. I even disabled Internet Explorer's Enhanced Security Configuration.
Here are two of my methods, which both give the same result (see the error message above):
private static void Test()
{
    WebClient client = new WebClient();
    client.UseDefaultCredentials = true;
    client.Credentials = new NetworkCredential(UserName, Password);
    //Pretend to be a browser
    client.Headers.Add("user-agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 4.0.20506)");
    var HTML = client.DownloadString("https://<YourName>.visualstudio.com/Defaultcollection/");
    Console.WriteLine(HTML);
}

private static void Test2()
{
    CookieContainer cookies = new CookieContainer();
    HttpWebRequest authRequest = (HttpWebRequest)HttpWebRequest.Create("https://<YourName>.visualstudio.com/Defaultcollection/");
    //Set headers
    authRequest.UserAgent = "Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8";
    authRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    authRequest.Headers.Add("Accept-Encoding", "gzip, deflate");
    authRequest.Headers.Add("Accept-Language", "de,en;q=0.5");
    authRequest.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
    //authRequest.Headers.Add("Keep-Alive", "30000");
    authRequest.Headers.Add(HttpRequestHeader.Authorization, SetAuthHeaderValue());
    //Something
    authRequest.ContentLength = 0;
    authRequest.ContentType = "application/soap+xml; charset=utf-8";
    authRequest.Host = "<YourName>.visualstudio.com";
    //Set cookies
    authRequest.CookieContainer = cookies;

    HttpWebResponse response = (HttpWebResponse)authRequest.GetResponse();
    StreamReader readStream = new StreamReader(response.GetResponseStream());
    string HTML = readStream.ReadToEnd();
    Console.WriteLine(HTML);
    readStream.Close();
}

private static string SetAuthHeaderValue()
{
    //string _auth = string.Format("{0}:{1}", UserName, Password);
    //string _enc = Convert.ToBase64String(Encoding.ASCII.GetBytes(_auth));
    String encoded = System.Convert.ToBase64String(System.Text.Encoding.GetEncoding("ISO-8859-1").GetBytes(UserName + ":" + Password));
    string _cred = string.Format("{0} {1}", "Basic", encoded);
    return _cred;
}
I picked the header values you see here by tracing the connection with Fiddler.
Is somebody able to authenticate, connect, and download the HTML content from https://{YourName}.visualstudio.com/Defaultcollection/?
Would be awesome, thanks :)!
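For reference, the value of a Basic Authorization header is the word "Basic", a space, then the base64 encoding of "user:password"; the original SetAuthHeaderValue formats it with "{1}" only, which drops the scheme. A stand-alone check with throwaway credentials (whether the server accepts Basic at all depends on alternate credentials being enabled, as mentioned above):

```csharp
using System;
using System.Text;

class BasicAuthDemo
{
    // Builds the value for an "Authorization: Basic ..." header.
    public static string BasicAuthValue(string user, string password)
    {
        string raw = user + ":" + password;
        // The scheme name and a space must precede the base64 credentials.
        return "Basic " + Convert.ToBase64String(Encoding.UTF8.GetBytes(raw));
    }

    static void Main()
    {
        Console.WriteLine(BasicAuthValue("user", "pass")); // Basic dXNlcjpwYXNz
    }
}
```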
I am new to C#, but it happens that I need to programmatically log in to a particular web site for screen scraping in C#. I have done online research (this site has been particularly helpful), and I have learnt that I need to use one of the following objects/classes in order to log in: WebRequest/WebResponse, HttpWebRequest/HttpWebResponse, or WebClient, and also that I need to pass the cookies that I receive from the web site to subsequent (screen-scraping) requests. I, however, have not been able to log in successfully, and at this point I have run out of ideas. I want to log in on the home page ------ and then screen-scrape a number of pages like this one: -------. The web site works like this: it allows one to access pages like the one I have referenced, but unless a user is logged in, it returns asterisks in some of the fields. I presume that means the content is dynamically generated, which I suspect may be the underlying cause of my login troubles. I am including the code that I am using to log in to the web site:
class Program
{
    private static string link_main_page = "-----------";
    private static string link_target_page = "------------";
    private static string authorization_param = "----------";

    private static void LoginUsingTheHttpWebRequestClass()
    {
        HttpWebRequest MyLoginRequest = (HttpWebRequest)WebRequest.Create(link_main_page);
        MyLoginRequest.Method = "POST";
        MyLoginRequest.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1; .NET4.0C; .NET CLR 2.0.50727; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)";
        MyLoginRequest.ContentType = "application/x-www-form-urlencoded";
        MyLoginRequest.CookieContainer = new CookieContainer();

        byte[] sentData = Encoding.UTF8.GetBytes(authorization_param);
        MyLoginRequest.ContentLength = sentData.Length;
        Stream sendStream = MyLoginRequest.GetRequestStream();
        sendStream.Write(sentData, 0, sentData.Length);

        HttpWebResponse MyLoginResponse = (HttpWebResponse)MyLoginRequest.GetResponse();
        CookieCollection MyCookieCollection = new CookieCollection();
        MyCookieCollection.Add(MyLoginResponse.Cookies);
        foreach (Cookie MyCookie in MyCookieCollection)
        {
            Console.WriteLine("Cookie:");
            Console.WriteLine("{0} = {1}", MyCookie.Name, MyCookie.Value);
        }

        HttpWebRequest MyGetRequest = (HttpWebRequest)WebRequest.Create(link_target_page);
        MyGetRequest.ContentType = "application/x-www-form-urlencoded";
        MyGetRequest.Method = "GET";
        MyGetRequest.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1; .NET4.0C; .NET CLR 2.0.50727; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)";
        MyGetRequest.CookieContainer = new CookieContainer();
        MyGetRequest.CookieContainer.Add(MyCookieCollection);

        HttpWebResponse MyGetResponse = (HttpWebResponse)MyGetRequest.GetResponse();
        Stream stream = MyGetResponse.GetResponseStream();
        string s;
        using (StreamReader sr = new StreamReader(stream))
        {
            s = sr.ReadToEnd();
            using (StreamWriter sw = File.CreateText("TheFile.htm"))
            {
                sw.Write(s);
                sw.Close();
            }
            sr.Close();
        }
    }

    private static void LoginUsingTheWebRequestClass()
    {
        WebRequest MyLoginRequest = WebRequest.Create(link_main_page);
        MyLoginRequest.Method = "POST";
        MyLoginRequest.ContentType = "application/x-www-form-urlencoded";

        byte[] sentData = Encoding.UTF8.GetBytes(authorization_param);
        MyLoginRequest.ContentLength = sentData.Length;
        Stream sendStream = MyLoginRequest.GetRequestStream();
        sendStream.Write(sentData, 0, sentData.Length);

        WebResponse MyLoginResponse = MyLoginRequest.GetResponse();
        string CookieHeader = MyLoginResponse.Headers["Set-cookie"];
        Console.WriteLine("Cookie:");
        Console.WriteLine(CookieHeader);

        WebRequest MyGetRequest = WebRequest.Create(link_target_page);
        MyGetRequest.ContentType = "application/x-www-form-urlencoded";
        MyGetRequest.Method = "GET";
        MyGetRequest.Headers.Add("Cookie", CookieHeader);

        WebResponse MyGetResponse = MyGetRequest.GetResponse();
        Stream stream = MyGetResponse.GetResponseStream();
        string s;
        using (StreamReader sr = new StreamReader(stream))
        {
            s = sr.ReadToEnd();
            using (StreamWriter sw = File.CreateText("TheFile2.htm"))
            {
                sw.Write(s);
                sw.Close();
            }
            sr.Close();
        }
    }

    static void Main(string[] args)
    {
        Console.WriteLine("Login Using the HttpWebRequest Class");
        LoginUsingTheHttpWebRequestClass();
        Console.WriteLine("Login Using the WebRequest Class");
        LoginUsingTheWebRequestClass();
        Console.WriteLine("Done! Press any key to continue");
        Console.ReadKey();
    }
}
Neither the attempt to login using HttpWebRequest/HttpWebResponse nor the attempt to login using WebRequest/WebResponse works. The first one returns a cookie that looks like this:
PHPSESSID=hncrr0...
The second one returns a cookie that looks like that:
PHPSESSID=88dn1n9...; path=/
These cookies look suspicious to me. For one thing, they look different from the cookies in IE. But I do not know exactly what I should expect.
(I also tried passing cookies that I received via (Http)WebRequest/(Http)WebResponse to a WebClient, but again to no avail; I am not including that here to save space.)
I would very much appreciate any input. If someone wants to run the code, I can provide the actual login/password information (registration on that web site is free anyway).
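For what it's worth, those values are what normal PHP session cookies look like; the "; path=/" tail on the second one is a cookie attribute, not part of the value, so it must be stripped before reuse. A small sketch of splitting one Set-Cookie header (the session ids are shortened stand-ins):

```csharp
using System;
using System.Net;

class SetCookieDemo
{
    // Splits one Set-Cookie header into a Cookie, discarding trailing
    // attributes such as path, domain and expires.
    public static Cookie Parse(string header)
    {
        string pair = header.Split(';')[0];     // "PHPSESSID=88dn1n9"
        int eq = pair.IndexOf('=');
        return new Cookie(pair.Substring(0, eq).Trim(), pair.Substring(eq + 1).Trim());
    }

    static void Main()
    {
        Cookie c = Parse("PHPSESSID=88dn1n9; path=/");
        Console.WriteLine(c.Name + "=" + c.Value); // PHPSESSID=88dn1n9
    }
}
```

A different PHPSESSID per client is expected; the real question is whether the same session id is sent back on the scraping request that follows the login.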
Basically I am making a chat app for my university's students only, and to make sure they are genuine I need to check their details on the UMS (university management system) and get their basic details so they chat under real identities. I am nearly done with my chat app; only the login is left.
So I want to log in to my UMS page via my website from a generic handler,
and then navigate to another page in it to access their basic info, keeping the session alive.
I researched HttpWebRequest but failed to log in with my credentials.
https://ums.lpu.in/lpuums
(made in ASP.NET)
I did try code from other posts for the login.
I am a novice at this part, so bear with me; any help will be appreciated.
Without an actual handshake with UMS via a defined API, you would end up scraping the UMS HTML, which is bad for various reasons.
I would suggest you read up on Single Sign On (SSO).
A few articles on SSO and ASP.NET -
1. Codeproject
2. MSDN
3. asp.net forum
Edit 1
Although I think this is a bad idea, since you say you are out of options, here is a link that shows how the Html Agility Pack can help in scraping web pages.
Beware of the drawbacks of screen scraping: changes to UMS will not be communicated to you, and you will see your application stop working all of a sudden.
public string Scrap(string Username, string Password)
{
    string responseData = string.Empty;
    string Url1 = "https://www.example.com"; //first url
    string Url2 = "https://www.example.com/login.aspx"; //secret url to post request to

    //first request
    CookieContainer jar = new CookieContainer();
    HttpWebRequest request1 = (HttpWebRequest)WebRequest.Create(Url1);
    request1.CookieContainer = jar;

    //Get the response from the server and save the cookies from the first request
    HttpWebResponse response1 = (HttpWebResponse)request1.GetResponse();

    //second request
    string postData = "***viewstate here***"; //VIEWSTATE
    HttpWebRequest request2 = (HttpWebRequest)WebRequest.Create(Url2);
    request2.CookieContainer = jar;
    request2.KeepAlive = true;
    request2.Referer = Url2;
    request2.Method = WebRequestMethods.Http.Post;
    request2.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    request2.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2";
    request2.ContentType = "application/x-www-form-urlencoded";
    request2.AllowWriteStreamBuffering = true;
    request2.ProtocolVersion = HttpVersion.Version11;
    request2.AllowAutoRedirect = true;

    byte[] byteArray = Encoding.ASCII.GetBytes(postData);
    request2.ContentLength = byteArray.Length;
    Stream newStream = request2.GetRequestStream(); //open connection
    newStream.Write(byteArray, 0, byteArray.Length); //send the data
    newStream.Close();

    HttpWebResponse response2 = (HttpWebResponse)request2.GetResponse();
    using (StreamReader sr = new StreamReader(response2.GetResponseStream()))
    {
        responseData = sr.ReadToEnd();
    }
    return responseData;
}
This is the code which works for me; anyone can add their own links and viewstate to scrape ASP.NET websites, and you need to take care of the cookies too.
Other (non-ASP.NET) websites don't require a viewstate.
Use Fiddler to find what needs to go in the headers, the viewstate, and the cookies.
Hope this helps if someone is having the same problem. :)
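One note on the "***viewstate here***" placeholder above: the viewstate cannot usefully be hard-coded, since ASP.NET regenerates it, so in practice it has to be scraped from the first GET response. A hedged sketch of extracting it with a regex (the sample HTML and field value are made up; a real page's viewstate is far longer):

```csharp
using System;
using System.Text.RegularExpressions;

class ViewStateDemo
{
    // Pulls a hidden ASP.NET field (e.g. __VIEWSTATE) out of page HTML.
    // Assumes the id attribute appears before the value attribute, which
    // is how ASP.NET renders these hidden inputs.
    public static string GetField(string html, string field)
    {
        var m = Regex.Match(html,
            "id=\"" + Regex.Escape(field) + "\"[^>]*value=\"([^\"]*)\"");
        return m.Success ? m.Groups[1].Value : "";
    }

    static void Main()
    {
        string html = "<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"dDwtMTA=\" />";
        // The value must be URL-encoded before it goes into postData.
        string postData = "__VIEWSTATE=" + Uri.EscapeDataString(GetField(html, "__VIEWSTATE"));
        Console.WriteLine(postData); // __VIEWSTATE=dDwtMTA%3D
    }
}
```

The same helper works for __EVENTVALIDATION and __VIEWSTATEGENERATOR, which many ASP.NET forms also require in the POST body.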
I'm a bit confused about how to go about this, as I'm not really conversant with web technologies. I'm using a C# console application to try to retrieve a value from a page linked inside a password-protected homepage. I'm using the following details.
Here's the code I'm trying:
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("");
req.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705;)";
req.Method = "POST";
req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
req.Headers.Add("Accept-Language: en-us,en;q=0.5");
req.Headers.Add("Accept-Encoding: gzip,deflate");
req.Headers.Add("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7");
req.KeepAlive = true;
req.Headers.Add("Keep-Alive: 300");
req.Referer = "copy from url";
req.ContentType = "application/x-www-form-urlencoded";
String Username = copy from url;
String PassWord = copy from url;
StreamWriter sw = new StreamWriter(req.GetRequestStream());
sw.Write(string.Format("&loginname={0}&password={1}&btnSubmit=Log In&institutioncode=H4V9KLUT45AV&version=2", Username, PassWord));
sw.Close();
HttpWebResponse response = (HttpWebResponse)req.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string tmp = reader.ReadToEnd();
However, when I inspect the data retrieved from the web page, it shows something like this:
'...Your Session has timed out due to inactivity.Please logout and
relogin.return to login page>'
I'm guessing this is due to some VIEWSTATE stuff in ASP.NET.
I'm also guessing I might have a problem retrieving the data from the link I'll extract from the homepage, because it seems the link simply loads data into a frame rather than reloading the webpage.
Anyone please?
Your form data is incorrect. After removing the & at the beginning it worked for me:
sw.Write(string.Format("loginname={0}&password={1}&btnSubmit=Log In&institutioncode=H4V9KLUT45AV&version=2", Username, PassWord));
Additionally, as already mentioned in the other answer, you need to add the returned ASPSESSIONIDSSRRDRST cookie in further requests to the site.
OK, the website is using cookies, so after you have logged in you need to retrieve the cookies first to make another WebRequest:
CookieCollection cookiesResponse = new CookieCollection();
if (response != null)
{
    // "path" and "domain" below are the cookie path and domain for the target site
    foreach (string cookie in response.Headers["Set-Cookie"].Split(';'))
    {
        if (!cookie.Contains("=")) continue; //skip valueless flags such as "secure"
        string name = cookie.Split('=')[0].Trim();
        string value = cookie.Substring(cookie.IndexOf('=') + 1).Trim();
        //skip attributes such as "path=/" that ride along in the same header
        if (name.Equals("path", StringComparison.OrdinalIgnoreCase) ||
            name.Equals("domain", StringComparison.OrdinalIgnoreCase) ||
            name.Equals("expires", StringComparison.OrdinalIgnoreCase)) continue;
        cookiesResponse.Add(new Cookie(name, value, path, domain));
    }
}
In your example the cookie contains: ASPSESSIONIDSSRRDRST=FEKODBMDBEIPCLLENCFLFBEA
You must use that CookieCollection for any further request to the site; wrap it in a CookieContainer and set it on the request:
CookieContainer container = new CookieContainer();
container.Add(cookiesResponse);
request.CookieContainer = container;
And finally, you can parse the response. You can use an HTML parser, or parse the plain text.
I hope this is helpful.