HttpWebRequest (post) and redirection - c#

I'm trying to log onto the following website using HttpWebRequest: http://mostanmeldung.moessinger.at/login.php
Texts are in German, but they don't really matter. If you look at the source code (which by the way was not written by me, so don't blame me for its bad style :P), you will see a form tag that contains two input tags. The name of the first one is "BN" (username), and the name of the second one is "PW" (password). I am trying to send data containing values for these two inputs to the webserver using the HttpWebRequest class. However, posting the values redirects the request to another page called "einloggen.php". On that site I am told whether my login was successful.
My problem is that I am able to send the data without any problems; however, all I receive is the content of "login.php", the page where you enter your username and password.
This is what my code looks like:
string post = String.Format(PostPattern, Username, Password);
byte[] postBytes = Encoding.ASCII.GetBytes(post);
CookieContainer cookies = new CookieContainer();
// "Address": http://mostanmeldung.moessinger.at/login.php
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(Address);
req.CookieContainer = cookies;
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = postBytes.Length;
req.AllowAutoRedirect = true;
MessageBox.Show(post); // shows me "BN=boop;PW=hi"
Stream reqStream = req.GetRequestStream();
reqStream.Write(postBytes, 0, postBytes.Length);
reqStream.Close();
WebResponse res;
if (/*req.HaveResponse &&*/ (res = req.GetResponse()) != null)
{
StreamReader reader = new StreamReader(res.GetResponseStream());
MessageBox.Show(reader.ReadToEnd());
return AuthResult.Success;
}
return AuthResult.NoResponse;
The second message box (the one near the end that prints the response) shows me the content of "login.php" instead of "einloggen.php", the page I am redirected to. Why is that?

The ACTION on that form points to einloggen.php, not login.php, so you need to send your POST data to einloggen.php instead.
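For illustration, here is a minimal sketch of that change. The URL and the BN/PW field names come from the question above; also note that application/x-www-form-urlencoded expects & as the separator, not the ; shown in the question's message box.
// Sketch: POST the credentials to the form's action (einloggen.php), not to login.php
string post = String.Format("BN={0}&PW={1}",
    Uri.EscapeDataString(Username), Uri.EscapeDataString(Password));
byte[] postBytes = Encoding.ASCII.GetBytes(post);
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://mostanmeldung.moessinger.at/einloggen.php");
req.CookieContainer = new CookieContainer(); // keep the PHP session cookie
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = postBytes.Length;
using (Stream reqStream = req.GetRequestStream())
{
    reqStream.Write(postBytes, 0, postBytes.Length);
}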

Related

Remote server error while logging into a website using WebRequest

So I am currently trying to log into my account on a website using WebRequest.
I have been reading about it, to the point where I felt I wanted to use an example and learn by trial and error.
This is the example I am using
Login to website, via C#
So when I try to execute my code it returns an unhandled exception, and it's this one:
System.Net.WebException: 'The remote server returned an error: (404)
Not Found.'
I tried stepping through the code and I THINK it might be that it's trying to POST somewhere where it can't.
I wanted to fix this before moving onto getting a confirmation that it successfully logged in.
I changed the username and password to dummy text for the sake of this question.
What did I do wrong here, and what's the most logical way of fixing this issue?
Thanks in advance.
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
string formUrl = "https://secure.runescape.com/m=weblogin/login.ws"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML's form tag)
string formParams = string.Format("login-username={0}&login-password={1}", "myUsername", "password");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
When you scrape a website, you have to make sure you mimic everything that happens. That includes any client-side state (cookies) that was set earlier, before the form is POSTed. As most sites don't like to be scraped or steered by bots, they are often rather picky about the payload. The same is true for the site you're trying to control.
Three important things you have missed:
You didn't start with an initial GET, so you don't have the required cookies in a CookieContainer.
On the POST you missed a header (Referer) and three hidden fields from the form.
The form fields are named username and password (as can be seen in the name attributes of the input tags); you used the ids instead.
Fixing those omissions will result in the following code:
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
string useragent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36";
// capture cookies, this is important!
var cookies = new CookieContainer();
// do a GET first, so you have the initial cookies needed
string loginUrl = "https://secure.runescape.com/m=weblogin/loginform.ws?mod=www&ssl=0&dest=community";
// HttpWebRequest
var reqLogin = (HttpWebRequest) WebRequest.Create(loginUrl);
// minimal needed settings
reqLogin.UserAgent = useragent;
reqLogin.CookieContainer = cookies;
reqLogin.Method = "GET";
var loginResp = reqLogin.GetResponse();
//loginResp.Dump(); // LinqPad testing
string formUrl = "https://secure.runescape.com/m=weblogin/login.ws"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML's form tag)
// in the HTML the form has 3 more hidden fields; those are needed as well
string formParams = string.Format("username={0}&password={1}&mod=www&ssl=0&dest=community", "myUsername", "password");
string cookieHeader;
// notice the cast to HttpWebRequest
var req = (HttpWebRequest) WebRequest.Create(formUrl);
// put the earlier cookies back on the request
req.CookieContainer = cookies;
// the Referer header is mandatory; without it the request times out
req.Referer = "https://secure.runescape.com/m=weblogin/loginform.ws?mod=www&ssl=0&dest=community";
req.UserAgent = useragent;
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
This returns success for me. It is up to you to parse the resulting HTML and plan your next steps.
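To get at that HTML, a minimal sketch that simply reads the response stream (the check for a "logout" marker is only an example of a success test; the real marker depends on the page):
// read the HTML returned by the login POST
using (var reader = new StreamReader(resp.GetResponseStream()))
{
    string html = reader.ReadToEnd();
    bool looksLoggedIn = html.IndexOf("logout", StringComparison.OrdinalIgnoreCase) >= 0;
    Console.WriteLine(looksLoggedIn);
}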

Vimeo uploading. Cannot get complete_uri field in response

I've been fiddling quite a bit with my uploading to Vimeo.
I've made a ticket request.
I've uploaded the file.
I've checked whether the file is uploaded.
I need to call the DELETE method with the complete_uri I should get in the ticket response.
However, I'm not receiving any complete_uri in the ticket response.
Here is my code:
public static dynamic GenerateTicket()
{
const string apiUrl = "https://api.vimeo.com/me/videos?type=streaming";
var req = (HttpWebRequest)WebRequest.Create(apiUrl);
req.Accept = "application/vnd.vimeo.*+json;version=3.0";
req.Headers.Add(HttpRequestHeader.Authorization, "bearer " + AccessToken);
req.Method = "POST";
var res = (HttpWebResponse)req.GetResponse();
var dataStream = res.GetResponseStream();
var reader = new StreamReader(dataStream);
var result = Json.Decode(reader.ReadToEnd());
return result;
}
This response gives me:
form
ticket_id
upload_link
upload_link_secure
uri
user
In order to finish my upload I need to run step 4 in this guide: https://developer.vimeo.com/api/upload
Sending the parameter type=streaming as the body:
ASCIIEncoding encoding = new ASCIIEncoding();
string stringData = "type=streaming"; //place body here
byte[] data = encoding.GetBytes(stringData);
req.Method = "PUT";
req.ContentLength = data.Length;
Stream newStream = req.GetRequestStream();
newStream.Write(data, 0, data.Length);
newStream.Close();
At the moment, type=streaming must be sent in the body of the request, not as a URL parameter.
This will probably change to allow either option.
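Following that, a sketch of the ticket request from the question with the parameter moved out of the query string and into the POST body (same endpoint and headers as in GenerateTicket above):
const string apiUrl = "https://api.vimeo.com/me/videos"; // no ?type=streaming in the URL
var req = (HttpWebRequest)WebRequest.Create(apiUrl);
req.Accept = "application/vnd.vimeo.*+json;version=3.0";
req.Headers.Add(HttpRequestHeader.Authorization, "bearer " + AccessToken);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
byte[] body = Encoding.ASCII.GetBytes("type=streaming"); // parameter goes in the body
req.ContentLength = body.Length;
using (var reqStream = req.GetRequestStream())
{
    reqStream.Write(body, 0, body.Length);
}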
The important point is:
"The first thing you need to do is request upload access for your application. You can do so from your My Apps page."
If you get all the values but no complete_uri, it means you don't have an upload access token. So go to your apps and make an upload access request.

Login and download image with C#

I need some images from a portal, and they are only accessible if I log in to the portal.
I need to do it with a C# program. I don't know what the username and password fields are, because they use the POST method. After logging in I want to request some URLs that contain the images I want.
What should I do?
For logging in I'm using:
HttpWebRequest httpWReq = (HttpWebRequest)WebRequest.Create(@"http://mysite.com");
ASCIIEncoding encoding = new ASCIIEncoding();
string postData = "UsernameFieldName=Something";
postData += "&PasswordFieldName=SomethingElse";
byte[] data = encoding.GetBytes(postData);
httpWReq.Method = "POST";
httpWReq.ContentType = "application/x-www-form-urlencoded";
httpWReq.ContentLength = data.Length;
using (Stream stream = httpWReq.GetRequestStream())
{
stream.Write(data, 0, data.Length);
}
HttpWebResponse response = (HttpWebResponse)httpWReq.GetResponse();
string responseString = new StreamReader(response.GetResponseStream()).ReadToEnd();
Then, for downloading an image which is on another page, I use:
using (var client = new WebClient())
{
string FileName = @"image.jpg";
client.DownloadFile("http://mysite.com/Image?imgCode=12345", FileName);
}
I don't have a complete solution, but here are some details to get you started.
Figure out what the POST fields are simply by looking at the page source.
Once you send the login request you'll also need to accept the authentication cookie and then send it with all subsequent requests, because their application most probably uses cookies (see the sketch after the snippet below).
After you are logged in, you can download images like this:
string imageFile = @"c:\image.jpg";
using (System.Net.WebClient client = new System.Net.WebClient())
{
client.DownloadFile("http://www.somewebsite.com/someimage.jpg", imageFile);
}
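To tie the two steps together, a rough sketch that shares one CookieContainer between the login POST and the image download (the URLs and field names are the placeholders from the question, and it assumes the portal uses cookie-based sessions):
var cookies = new CookieContainer();
// 1. log in; any session cookie the server sets ends up in "cookies"
var loginReq = (HttpWebRequest)WebRequest.Create("http://mysite.com");
loginReq.CookieContainer = cookies;
loginReq.Method = "POST";
loginReq.ContentType = "application/x-www-form-urlencoded";
byte[] body = Encoding.ASCII.GetBytes("UsernameFieldName=Something&PasswordFieldName=SomethingElse");
loginReq.ContentLength = body.Length;
using (Stream s = loginReq.GetRequestStream())
{
    s.Write(body, 0, body.Length);
}
loginReq.GetResponse().Close();
// 2. request the image with the same cookies and save it to disk
var imgReq = (HttpWebRequest)WebRequest.Create("http://mysite.com/Image?imgCode=12345");
imgReq.CookieContainer = cookies;
using (WebResponse imgResp = imgReq.GetResponse())
using (Stream imgStream = imgResp.GetResponseStream())
using (FileStream file = File.Create("image.jpg"))
{
    imgStream.CopyTo(file);
}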
Here are a couple of examples to get you started with HTTP POSTs in C#:
HTTP request with post

HTTP post form using c# - post name of form as well

I'm using the following code to POST data to a URL on button click. I need to be able to send a form name along with this data. Any suggestions?
string url = "http://www.someurl.com";
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
string proxy = null;
string data = String.Format("{0}={1}&{2}={3}&{4}={5}&{6}={7}&{8}={9}",
txtName.ClientID, txtName.Text,
txtEmail.ClientID, txtEmail.Text,
txtLanguages.ClientID, txtLanguages.Text,
txtPhone.ClientID, txtPhone.Text,
txtAdditional.ClientID, txtAdditional.Text);
byte[] buffer = Encoding.UTF8.GetBytes(data);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = buffer.Length;
req.Proxy = new WebProxy(proxy, true); // ignore for local addresses
req.CookieContainer = new CookieContainer(); // enable cookies
Stream reqst = req.GetRequestStream(); // add form data to request stream
reqst.Write(buffer, 0, buffer.Length);
reqst.Flush();
reqst.Close();
If you mean the form's action, just append it to the URL.
You can append any key/value pair you want to the POST data; it doesn't matter whether it's a control name, form name, form action, or something that appears nowhere on the page at all.
By the way, if you're going to construct POST data manually you should URL-encode the values, e.g.
string data = String.Format("{0}={1}&{2}={3}&{4}={5}&{6}={7}&{8}={9}",
txtName.ClientID, HttpUtility.UrlEncode(txtName.Text),
...
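For instance, with one extra key/value pair appended for the form name (the formName key and ContactForm value are made up here; the receiving script has to actually look for whatever name you choose):
string data = String.Format("formName={0}&{1}={2}&{3}={4}",
    HttpUtility.UrlEncode("ContactForm"), // hypothetical form-name value
    txtName.ClientID, HttpUtility.UrlEncode(txtName.Text),
    txtEmail.ClientID, HttpUtility.UrlEncode(txtEmail.Text));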
If you are managing different activities with different forms, you could also send a hidden variable along with your form indicating the kind of activity you are doing, so you can evaluate the hidden value and act according to it.
Or give the submit button a name and a value; it will come to your script as a POST variable as well.

How to make my web scraper log in to this website via C#

I have an application that reads parts of the source code on a website. That all works; but the problem is that the page in question requires the user to be logged in to access this source code. What my program needs is a way to log the user into the website first; after that is done, I'll be able to access and read the source code.
The website that needs to be logged into is:
mmoinn.com/index.do?PageModule=UsersLogin
You can continue using WebClient to POST (instead of GET, which is the HTTP verb you're currently using with DownloadString), but I think you'll find it easier to work with the (slightly) lower-level classes WebRequest and WebResponse.
There are two parts to this: the first is posting the login form, the second is recovering the "Set-cookie" header and sending it back to the server as "Cookie" along with your GET request. The server will use this cookie to identify you from now on (assuming it's using cookie-based authentication, which I'm fairly confident it is, as that page returns a Set-cookie header which includes "PHPSESSID").
POSTing to the login form
Form posts are easy to simulate, it's just a case of formatting your post data as follows:
field1=value1&field2=value2
Using WebRequest and code I adapted from Scott Hanselman, here's how you'd POST form data to your login form:
string formUrl = "http://www.mmoinn.com/index.do?PageModule=UsersAction&Action=UsersLogin"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML's form tag)
string formParams = string.Format("email_address={0}&password={1}", "your email", "your password");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
Here's an example of what you should see in the Set-cookie header for your login form:
PHPSESSID=c4812cffcf2c45e0357a5a93c137642e; path=/; domain=.mmoinn.com,wowmine_referer=directenter; path=/; domain=.mmoinn.com,lang=en; path=/;domain=.mmoinn.com,adt_usertype=other,adt_host=-
GETting the page behind the login form
Now you can perform your GET request to a page that you need to be logged in for.
string pageSource;
string getUrl = "the url of the page behind the login";
WebRequest getRequest = WebRequest.Create(getUrl);
getRequest.Headers.Add("Cookie", cookieHeader);
WebResponse getResponse = getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
pageSource = sr.ReadToEnd();
}
EDIT:
If you need to view the results of the first POST, you can recover the HTML it returned with:
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
pageSource = sr.ReadToEnd();
}
Place this directly below cookieHeader = resp.Headers["Set-cookie"]; and then inspect the string held in pageSource.
You can simplify things quite a bit by creating a class that derives from WebClient, overriding its GetWebRequest method and setting a CookieContainer object on it. If you always set the same CookieContainer instance, then cookie management will be handled automatically for you.
But the only way to get at the HttpWebRequest before it is sent is to inherit from WebClient and override that method.
public class CookieAwareWebClient : WebClient
{
private CookieContainer cookie = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = cookie;
}
return request;
}
}
var client = new CookieAwareWebClient();
client.BaseAddress = @"https://www.site.com/any/base/url/";
var loginData = new NameValueCollection();
loginData.Add("login", "YourLogin");
loginData.Add("password", "YourPassword");
client.UploadValues("login.php", "POST", loginData);
//Now you are logged in and can request pages
string htmlSource = client.DownloadString("index.php");
Matthew Brindley, your code worked very well for a website I needed it for (with a login), but I had to change to HttpWebRequest and HttpWebResponse, otherwise I got a 404 error from the remote server. I would also like to share a workaround based on your code: I tried it to log in to a website based on Moodle, but it didn't work at your step "GETting the page behind the login form", because even when the login POST succeeded, the 'Set-Cookie' header didn't return anything, unlike on other websites.
So I think this is where we need to store the cookies for the next requests, so I added this.
To the "POSTing to the login form" code block :
var cookies = new CookieContainer();
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(formUrl);
req.CookieContainer = cookies;
And to the "GETting the page behind the login form":
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.CookieContainer = new CookieContainer();
getRequest.CookieContainer.Add(resp.Cookies);
getRequest.Headers.Add("Cookie", cookieHeader);
Doing this lets me log in and get the source code of the "page behind the login" (a Moodle-based website). I know this is a loose use of CookieContainer and HTTP cookies, because we could first ask whether a previously saved set of cookies exists before sending the request to the server. It works without problems anyway, but here is some good material to read about WebRequest and WebResponse, with sample projects and tutorials:
Retrieving HTTP content in .NET
How to use HttpWebRequest and HttpWebResponse in .NET
Sometimes it may help to switch off AllowAutoRedirect and to set the same user agent on both the login POST and the page GET requests.
request.UserAgent = userAgent;
request.AllowAutoRedirect = false;
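With AllowAutoRedirect switched off you then have to follow the redirect yourself. A rough sketch, reusing the request and userAgent variables from the lines above and assuming the login answers with a 302:
using (var response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusCode == HttpStatusCode.Redirect) // 302
    {
        string location = response.Headers["Location"];
        var next = (HttpWebRequest)WebRequest.Create(new Uri(request.RequestUri, location));
        next.UserAgent = userAgent; // same user agent as the POST
        next.CookieContainer = request.CookieContainer; // carry the session cookies over
        using (var finalResponse = next.GetResponse())
        using (var reader = new StreamReader(finalResponse.GetResponseStream()))
        {
            string html = reader.ReadToEnd();
        }
    }
}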
