Hey, I'm trying to figure out how to use HttpWebRequest to send a POST request to a login page (Yahoo Mail, say) and examine the returned page source.
But with my POST method I still get back the login page.
Here is my method:
public static string GetResponse(string sURL, ref CookieContainer cookies, string sParameters)
{
    HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create(sURL);
    httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36";
    httpRequest.CookieContainer = cookies;
    httpRequest.Method = "POST";
    httpRequest.ContentType = "application/x-www-form-urlencoded";
    // encode once and use the byte length, not the string length
    byte[] postBytes = Encoding.UTF8.GetBytes(sParameters);
    httpRequest.ContentLength = postBytes.Length;
    httpRequest.AllowAutoRedirect = true;
    using (Stream stream = httpRequest.GetRequestStream())
    {
        stream.Write(postBytes, 0, postBytes.Length);
    }
    HttpWebResponse httpWebResponse = (HttpWebResponse)httpRequest.GetResponse();
    string sResponse;
    using (Stream stream = httpWebResponse.GetResponseStream())
    {
        // Yahoo's pages are UTF-8; code page 936 (GB2312) would garble them
        StreamReader reader = new StreamReader(stream, Encoding.UTF8);
        sResponse = reader.ReadToEnd();
    }
    return sResponse;
}
The code to call the method is:
string sParameter = ".tries=1&.src=ym&.md5=&.hash=&.js=&.last=&promo=&.intl=us&.lang=en-US&.bypass=&.partner=&.u=eip09319532h1&.v=0&.challenge=3QjvX9eEFtJRrABhZp9kgS9IT.VO&.yplus=&.emailCode=&pkg=&stepid=&.ev=&hasMsgr=0&.chkP=Y&.done=http%3A%2F%2Fmail.yahoo.com&.pd=ym_ver%3D0%26c%3D%26ivt%3D%26sg%3D&.ws=1&.cp=0&nr=0&pad=3&aad=3&login=username%40yahoo.com&passwd=xxxxx&.persistent=&.save=&passwd_raw=";
System.Net.CookieContainer cookies = new System.Net.CookieContainer();
string sResponse;
sResponse = GetResponse(sUrl, ref cookies, sParameter);
The string sParameter was obtained by examining the data posted to the server in Firefox's Firebug plugin. But in the parameters I posted above, I masked my user id and password.
I wanted to re-use the session so I passed a CookieContainer object as reference to the method.
It compiles and runs, but the page that comes back is not in a logged-in state.
I have read several similar questions on stackoverflow, but still can't make my method work. Your help is appreciated.
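To make the intended calling pattern explicit (a sketch; the login-page URL below is only a placeholder): create the container once, do an initial GET so the server can store its cookies in it, then call GetResponse with the same container so those cookies travel with the POST.
CookieContainer sessionCookies = new CookieContainer();
// initial GET so the login page can drop its cookies into the shared container
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create("https://login.yahoo.com/"); // placeholder URL
getRequest.CookieContainer = sessionCookies;
using (getRequest.GetResponse()) { }
// POST the form with the same container so the session cookies are sent along
sResponse = GetResponse(sUrl, ref sessionCookies, sParameter);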
Related
So I am currently trying to log into my account on a website using WebRequest.
I have been reading about it, to the point where I wanted to work from an example and learn by trial and error.
This is the example I am using
Login to website, via C#
So when I try to execute my code it throws an unhandled exception, and it's this one:
System.Net.WebException: 'The remote server returned an error: (404)
Not Found.'
I tried stepping through the code and I THINK it might be that it's trying to POST somewhere where it can't.
I wanted to fix this before moving onto getting a confirmation that it successfully logged in.
I changed the username and password to dummy text for the sake of this question.
What did I do wrong here, and what's the most logical way of fixing this issue?
Thanks in advance.
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
string formUrl = "https://secure.runescape.com/m=weblogin/login.ws"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML form tag)
string formParams = string.Format("login-username={0}&login-password={1}", "myUsername", "password");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
When you scrape a website, you have to make sure you mimic everything that happens, including any client-side state (cookies) that was sent earlier, before the form is POSTed. Since most sites don't like being scraped or steered by bots, they are often rather picky about the payload. The same is true for the site you're trying to control.
Three important things you have missed:
You didn't start with an initial GET, so you don't have the required cookies in a CookieContainer.
On the POST you missed a header (Referer) and three hidden fields from the form.
The form fields are named username and password (as can be seen in the name attribute of the input tags). You used the IDs instead.
Fixing those omissions will result in the following code:
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
string useragent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36";
// capture cookies, this is important!
var cookies = new CookieContainer();
// do a GET first, so you have the initial cookies needed
string loginUrl = "https://secure.runescape.com/m=weblogin/loginform.ws?mod=www&ssl=0&dest=community";
// HttpWebRequest
var reqLogin = (HttpWebRequest) WebRequest.Create(loginUrl);
// minimal needed settings
reqLogin.UserAgent = useragent;
reqLogin.CookieContainer = cookies;
reqLogin.Method = "GET";
var loginResp = reqLogin.GetResponse();
//loginResp.Dump(); // LinqPad testing
string formUrl = "https://secure.runescape.com/m=weblogin/login.ws"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML form tag)
// in the HTML the form has 3 more hidden fields; those are needed as well
string formParams = string.Format("username={0}&password={1}&mod=www&ssl=0&dest=community", "myUsername", "password");
string cookieHeader;
// notice the cast to HttpWebRequest
var req = (HttpWebRequest) WebRequest.Create(formUrl);
// put the earlier cookies back on the request
req.CookieContainer = cookies;
// the Referer header is mandatory; without it a timeout is raised
req.Referer = "https://secure.runescape.com/m=weblogin/loginform.ws?mod=www&ssl=0&dest=community";
req.UserAgent = useragent;
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
This returns success for me. It is up to you to parse the resulting HTML and plan your next steps.
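As a rough sanity check (a sketch using HtmlAgilityPack, which is only assumed here; any HTML parser will do), you can read the body of resp and look for something that only exists when you are logged in, for example a logout link:
string html;
using (var reader = new StreamReader(resp.GetResponseStream()))
{
    html = reader.ReadToEnd();
}
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
// the XPath below is only a guess; inspect the real page to pick a reliable marker
var logoutLink = doc.DocumentNode.SelectSingleNode("//a[contains(@href, 'logout')]");
Console.WriteLine(logoutLink != null ? "looks logged in" : "still the login form");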
I'm trying to fetch the HTML of a page through code:
WebRequest r = WebRequest.Create(szPageURL);
WebClient client = new WebClient();
try
{
    WebResponse resp = r.GetResponse();
    StreamReader sr = new StreamReader(resp.GetResponseStream());
    szHTML = sr.ReadToEnd();
}
catch (WebException e)
{
    // the 403 described below lands here
    Console.WriteLine(e.Message);
}
This code works when I use URLs like www.microsoft.com, www.google.com, or www.nasa.gov. However, when I put in www.epa.gov (using either 'http' or 'https' in the URL parameter), I get a 403 exception when executing r.GetResponse(). Yet I can easily fetch the page manually in a browser. The exception I'm getting is 403 (Forbidden) and the exception status member says "ProtocolError". What does that mean? Why am I getting this on a page that actually is available? Anyone have any ideas? Thanks!
BTW - I also tried this way:
string downloadString = client.DownloadString(szPageURL);
Got exact same exception.
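For reference, the ProtocolError status means the server did return an HTTP response, just with an error code; the status code and the error page body can be pulled from the WebException, roughly like this:
try
{
    WebResponse resp = r.GetResponse();
}
catch (WebException ex) when (ex.Status == WebExceptionStatus.ProtocolError)
{
    // the server answered, just with an error status
    var errorResponse = (HttpWebResponse)ex.Response;
    Console.WriteLine((int)errorResponse.StatusCode); // 403
    using (var reader = new StreamReader(errorResponse.GetResponseStream()))
    {
        Console.WriteLine(reader.ReadToEnd()); // the server's error page, which often says why
    }
}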
try this code, it works:
string Url = "https://www.epa.gov/";
CookieContainer cookieJar = new CookieContainer();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
request.CookieContainer = cookieJar;
request.Accept = @"text/html, application/xhtml+xml, */*";
request.Referer = @"https://www.epa.gov/";
request.Headers.Add("Accept-Language", "en-GB");
request.UserAgent = @"Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Trident/6.0)";
request.Host = @"www.epa.gov";
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
String htmlString;
using (var reader = new StreamReader(response.GetResponseStream()))
{
htmlString = reader.ReadToEnd();
}
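If you would rather keep the WebClient.DownloadString approach from the question, the same browser-like headers can be set there as well; an untested sketch (WebClient has no cookie support, so if the site insists on cookies, stick with the HttpWebRequest version above):
using (var client = new WebClient())
{
    client.Headers[HttpRequestHeader.Accept] = "text/html, application/xhtml+xml, */*";
    client.Headers[HttpRequestHeader.Referer] = "https://www.epa.gov/";
    client.Headers[HttpRequestHeader.AcceptLanguage] = "en-GB";
    client.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Trident/6.0)";
    string downloadString = client.DownloadString("https://www.epa.gov/");
}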
I need to send an HTTP request using the POST method to an ASP.NET form. The form is a login form that includes 3 controls:
TextBox for username (with name="x" and id="IX").
TextBox for Password (With name="P" and id="IP").
Button Submit (with name="S" and id="IS").
I tried the following code:
string getUrl = "http://url/login.aspx";
string postData = String.Format("x={0}&P={1}", "usernamevalue", "passwordvalue");
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2";
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version11;
getRequest.AllowAutoRedirect = true;
getRequest.ContentType = "application/x-www-form-urlencoded";
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream(); //open connection
newStream.Write(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
string sourceCode = sr.ReadToEnd();
}
When I read sourceCode, it contains the HTML of the login page, even though it should contain the home page that the site redirects to after a successful login.
I think the code is not submitting the button. I need to fix this by passing the data to the controls, clicking the submit button, and then getting the response of the home page (after a successful login), not the login page.
Thanks in advance.
Use a CookieContainer to capture the response cookies, then send the request to the home page with the same cookie container.
If your login request is successful, the server will add a cookie to the response.
So you can use the same cookie on every page:
var cookieContainer = new CookieContainer();
getRequest.CookieContainer = cookieContainer;
...
//send request to login.aspx page.
HttpWebRequest homeRequest = (HttpWebRequest)WebRequest.Create(homeUrl);
homeRequest.CookieContainer = cookieContainer;
//send request to homepage
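Putting the pieces together with the field names from the question, the flow looks roughly like this (a sketch; the home page URL is a placeholder, and a WebForms login page often also expects hidden fields such as __VIEWSTATE, which you would have to read from the login page first and append to the body):
var cookieContainer = new CookieContainer();

// 1) POST the credentials to login.aspx with the shared cookie container
var loginRequest = (HttpWebRequest)WebRequest.Create("http://url/login.aspx");
loginRequest.Method = "POST";
loginRequest.ContentType = "application/x-www-form-urlencoded";
loginRequest.CookieContainer = cookieContainer;
byte[] body = Encoding.ASCII.GetBytes("x=usernamevalue&P=passwordvalue");
loginRequest.ContentLength = body.Length;
using (Stream requestStream = loginRequest.GetRequestStream())
{
    requestStream.Write(body, 0, body.Length);
}
using (loginRequest.GetResponse()) { } // the auth cookie lands in cookieContainer

// 2) GET the home page with the same container; it should now come back logged in
var homeRequest = (HttpWebRequest)WebRequest.Create("http://url/home.aspx"); // placeholder URL
homeRequest.CookieContainer = cookieContainer;
using (var homeResponse = (HttpWebResponse)homeRequest.GetResponse())
using (var reader = new StreamReader(homeResponse.GetResponseStream()))
{
    string homeHtml = reader.ReadToEnd();
}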
I have a website with the web service active (PrestaShop).
This site requires authentication.
I use this code:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36";
request.Method = "GET";
request.Credentials = new NetworkCredential("key", "");
request.PreAuthenticate = true;
//request.Connection
request.Host = "localhost";
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
String R = reader.ReadToEnd();
The code is OK, but my problem is that there is a login form for the web service.
In fact, the HttpWebRequest object sends two requests:
the first answer is not authorized, while the second has an OK status.
I used the Fiddler web debugger.
I apologize for my English.
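As an aside, the first unauthorized response is the normal HTTP Basic authentication challenge; even with PreAuthenticate = true, the very first request waits for that 401 before sending credentials. If the extra round trip matters, the Authorization header can be set up front; a sketch, assuming the PrestaShop web-service key is the user name with an empty password (as in the NetworkCredential above):
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
// send the Basic credentials preemptively instead of waiting for the 401 challenge
string key = "key"; // the web-service key from the question
request.Headers["Authorization"] = "Basic " + Convert.ToBase64String(Encoding.ASCII.GetBytes(key + ":"));
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    string R = reader.ReadToEnd();
}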
If the form is submitted using the GET method, you must pass the form parameters in the URL query string, for instance http://url?username={0}&pass={1}. If it uses the POST method, you must pass the form data in the HTTP request body. There are a lot of examples of this on Stack Overflow. You must also handle the cookies, which is done with the CookieContainer. In the first request, initialize the container:
request.CookieContainer = new CookieContainer();
When the request comes back with an OK status, the cookies will be in response.Cookies, which is a CookieCollection instance. Later, for further requests, you must pass these cookies in order to retrieve the correct data:
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(userCookies);
Hope it helps!
Using a C# WebRequest, I am attempting to screen scrape a website utilizing ASP.NET Forms Authentication.
First, the application performs a GET to the login page and extracts the __VIEWSTATE and __EVENTVALIDATION keys from hidden input fields, and the .NET SessionId from its cookie. Next, the application performs a POST with the username, password, other required form fields, and the three aforementioned .NET variables to the form action.
From a Fiddler session using Chrome to authenticate into the website, I am expecting a 302 with a token stored in a cookie to allow navigation of the secure area of the site. I cannot understand why I keep getting 302s without a token, redirecting me to the website's non-authenticated home page. In Fiddler, my application's request looks exactly the same as the request made from within Chrome or Firefox.
// Create a request using a URL that can receive a post.
var request = (HttpWebRequest)WebRequest.Create(LoginUrl);
// The first request is a plain GET (the default method); create the shared cookie
// container and attach it so the response cookies are captured.
_container = new CookieContainer();
request.CookieContainer = _container;
request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36";
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
request.Headers["Accept-Language"] = "en-US,en;q=0.8";
var response = (HttpWebResponse)request.GetResponse();
_container.Add(response.Cookies);
string responseFromServer;
using (var decompress = new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))
{
using (var reader = new StreamReader(decompress))
{
// Read the content.
responseFromServer = reader.ReadToEnd();
}
}
var doc = new HtmlDocument();
doc.LoadHtml(responseFromServer);
var hiddenFields = doc.DocumentNode.SelectNodes("//input[@type='hidden']").ToDictionary(input => input.GetAttributeValue("name", ""), input => input.GetAttributeValue("value", ""));
request = (HttpWebRequest)WebRequest.Create(LoginUrl);
request.Method = "POST";
request.CookieContainer = _container;
// Create POST data and convert it to a byte array. Modify this line accordingly
var postData = String.Format("ddlsubsciribers={0}&memberfname={1}&memberpwd={2}&chkRemberMe=true&Imgbtn=LOGIN&__EVENTTARGET&__EVENTARGUMENT&__LASTFOCUS", Agency, Username, Password);
postData = hiddenFields.Aggregate(postData, (current, field) => current + ("&" + field.Key + "=" + field.Value));
ServicePointManager.ServerCertificateValidationCallback = AcceptAllCertifications;
var byteArray = Encoding.UTF8.GetBytes(postData);
//request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36";
// Set the ContentType property of the WebRequest.
request.ContentType = "application/x-www-form-urlencoded";
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
request.Headers["Accept-Language"] = "en-US,en;q=0.8";
// Set the ContentLength property of the WebRequest.
request.ContentLength = byteArray.Length;
// Get the request stream.
var dataStream = request.GetRequestStream();
// Write the data to the request stream.
dataStream.Write(byteArray, 0, byteArray.Length);
// Close the Stream object.
dataStream.Close();
// Get the response.
response = (HttpWebResponse)request.GetResponse();
_container.Add(response.Cookies);
// Clean up the streams.
dataStream.Close();
response.Close();
As it would turn out, some funky characters in the __EVENTVALIDATION variable were being encoded into a line break, and ASP.NET then threw out the session assuming it had become corrupt. The solution was to escape the ASP.NET variables using Uri.EscapeDataString.
postData = hiddenFields.Aggregate(postData, (current, field) => current + ("&" + field.Key + "=" + Uri.EscapeDataString(field.Value)));
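The same escaping arguably belongs on the user-supplied values too, since a password containing & or = would corrupt the body in the same way; a sketch that builds the whole body with every value escaped (same field names as above):
// build the form body with every value percent-encoded, not just the hidden fields
var formFields = new Dictionary<string, string>(hiddenFields)
{
    ["ddlsubsciribers"] = Agency,
    ["memberfname"] = Username,
    ["memberpwd"] = Password,
    ["chkRemberMe"] = "true",
    ["Imgbtn"] = "LOGIN",
    ["__EVENTTARGET"] = "",
    ["__EVENTARGUMENT"] = "",
    ["__LASTFOCUS"] = ""
};
var postData = string.Join("&", formFields.Select(f => f.Key + "=" + Uri.EscapeDataString(f.Value)));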