Scope:
I am developing a C# application to simulate queries against this site. I am quite familiar with simulating web requests to reproduce the same steps a human performs, just in code.
If you want to try it yourself, just type the number 08775724000119 into the CNPJ box, fill in the captcha, and click Confirmar.
I've dealt with the captcha already, so it's not a problem anymore.
Problem:
As soon as I execute the POST request for a CNPJ, an exception is thrown:
The remote server returned an error: (403) Forbidden.
Fiddler Debugger Output:
This is the request generated by my browser, not by my code:
POST https://www.sefaz.rr.gov.br/sintegra/servlet/hwsintco HTTP/1.1
Host: www.sefaz.rr.gov.br
Connection: keep-alive
Content-Length: 208
Cache-Control: max-age=0
Origin: https://www.sefaz.rr.gov.br
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Referer: https://www.sefaz.rr.gov.br/sintegra/servlet/hwsintco
Accept-Encoding: gzip,deflate,sdch
Accept-Language: pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: GX_SESSION_ID=gGUYxyut5XRAijm0Fx9ou7WnXbVGuUYoYTIKtnDydVM%3D; JSESSIONID=OVuuMFCgQv9k2b3fGyHjSZ9a.undefined
// PostData :
_EventName=E%27CONFIRMAR%27.&_EventGridId=&_EventRowId=&_MSG=&_CONINSEST=&_CONINSESTG=08775724000119&cfield=rice&_VALIDATIONRESULT=1&BUTTON1=Confirmar&sCallerURL=http%3A%2F%2Fwww.sintegra.gov.br%2Fnew_bv.html
Code samples and References used:
I'm using a self-developed library to handle/wrap the POST and GET requests.
The request object has the same parameters (Host, Origin, Referer, Cookies, etc.) as the one issued by the browser (logged by Fiddler above).
I've also set the certificate validation callback to accept all certificates:
ServicePointManager.ServerCertificateValidationCallback =
    new RemoteCertificateValidationCallback(delegate { return true; });
After all that configuration, I'm still getting the Forbidden exception.
Here is how I simulate the request; this is where the exception is thrown:
try
{
    this.Referer = Consts.REFERER;
    // PARAMETERS: URL, POST DATA, ThrownException (bool)
    response = Post(Consts.QUERYURL, postData, true);
}
catch (Exception ex)
{
    string s = ex.Message;
}
Thanks in advance for any help or solution to my problem.
Update 1:
I was missing the initial request for the homepage, which generates the cookies (thanks @W0lf for pointing that out).
Now there's another weird thing: Fiddler is not showing the cookies on my request, even though they are set.
I made a successful request using the browser and recorded it in Fiddler.
The only things that differ from your request are:
- my browser sent no value for the sCallerURL parameter (I have sCallerURL= instead of sCallerURL=http%3A%2F%2Fwww....)
- the session IDs are different (obviously)
- I have other Accept-Language: values (I'm pretty sure this is not important)
- the Content-Length is different (obviously)
Update
OK, I thought the Fiddler trace was from your application. In case you are not setting cookies on your request, do this:
- Before posting data, do a GET request to https://www.sefaz.rr.gov.br/sintegra/servlet/hwsintco. If you examine the response, you'll notice the website sends two session cookies.
- When you do the POST request, make sure to attach the cookies you got in the previous step.
If you don't know how to store the cookies and use them in the other request, take a look here.
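For example, a minimal sketch of that flow with a single shared CookieContainer (error handling omitted):

// Sketch: share one CookieContainer between the initial GET and the POST so the
// session cookies issued by the first response are sent with the second request.
var cookies = new CookieContainer();

var get = (HttpWebRequest)WebRequest.Create("https://www.sefaz.rr.gov.br/sintegra/servlet/hwsintco");
get.CookieContainer = cookies;
using (get.GetResponse()) { } // the response stores GX_SESSION_ID and JSESSIONID in the container

var post = (HttpWebRequest)WebRequest.Create("https://www.sefaz.rr.gov.br/sintegra/servlet/hwsintco");
post.CookieContainer = cookies; // same container: the cookies ride along automatically
post.Method = "POST";
post.ContentType = "application/x-www-form-urlencoded";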
Update 2
The problems
OK, I managed to reproduce the 403, figured out what caused it, and found a fix.
What happens in the POST request is that:
- the server responds with status 302 (temporary redirect) and the redirect location
- the browser redirects (basically does a GET request) to that location, also posting the two cookies
.NET's HttpWebRequest attempts to do this redirect seamlessly, but in this case there are two issues (that I would consider bugs in the .NET implementation):
- the GET request after the POST (the redirect) has the same Content-Type as the POST request (application/x-www-form-urlencoded); for GET requests this shouldn't be specified
- a cookie handling issue (the most important one): the website sends two cookies, GX_SESSION_ID and JSESSIONID. The second has a path specified (/sintegra), while the first does not.
Here's the difference: the browser assigns a default path of / (root) to the first cookie, while .NET assigns it the request URL's path (/sintegra/servlet/hwsintco).
Due to this, the last GET request (after redirect) to /sintegra/servlet/hwsintpe... does not get the first cookie passed in, as its path does not correspond.
The fixes
For the redirect problem (GET with Content-Type), the fix is to do the redirect manually instead of relying on .NET for it.
To do this, tell the request not to follow redirects:
postRequest.AllowAutoRedirect = false;
then read the redirect location from the POST response and manually do a GET request on it.
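In outline (the complete working code appears further below), that might look like:

// Sketch of the manual redirect: disable auto-follow, read the Location header
// off the 302 response, then issue the follow-up GET yourself.
postRequest.AllowAutoRedirect = false;
var postResponse = postRequest.GetResponse();
var location = postResponse.Headers[HttpResponseHeader.Location];
var redirectGet = (HttpWebRequest)WebRequest.Create(location);
redirectGet.CookieContainer = cookieContainer; // reuse the same cookie container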
The cookie problem (that has happened to others as well)
For this, the fix I found was to take the misplaced cookie from the CookieContainer, set its path correctly, and add it back to the container in the correct location.
This is the code to do it:
private void FixMisplacedCookie(CookieContainer cookieContainer)
{
    var misplacedCookie = cookieContainer.GetCookies(new Uri(Url))[0];
    misplacedCookie.Path = "/"; // instead of "/sintegra/servlet/hwsintco"

    // place the cookie in the right place...
    cookieContainer.SetCookies(
        new Uri("https://www.sefaz.rr.gov.br/"),
        misplacedCookie.ToString());
}
Here's all the code to make it work:
using System;
using System.IO;
using System.Net;
using System.Text;

namespace XYZ
{
    public class Crawler
    {
        const string Url = "https://www.sefaz.rr.gov.br/sintegra/servlet/hwsintco";

        public void Crawl()
        {
            var cookieContainer = new CookieContainer();

            /* initial GET request */
            var getRequest = (HttpWebRequest)WebRequest.Create(Url);
            getRequest.CookieContainer = cookieContainer;
            ReadResponse(getRequest); // nothing to do with this, because captcha is f##%ing dumb :)

            /* POST request */
            var postRequest = (HttpWebRequest)WebRequest.Create(Url);
            postRequest.AllowAutoRedirect = false; // we'll do the redirect manually; .NET does it badly
            postRequest.CookieContainer = cookieContainer;
            postRequest.Method = "POST";
            postRequest.ContentType = "application/x-www-form-urlencoded";

            var postParameters =
                "_EventName=E%27CONFIRMAR%27.&_EventGridId=&_EventRowId=&_MSG=&_CONINSEST=&" +
                "_CONINSESTG=08775724000119&cfield=much&_VALIDATIONRESULT=1&BUTTON1=Confirmar&" +
                "sCallerURL=";

            var bytes = Encoding.UTF8.GetBytes(postParameters);
            postRequest.ContentLength = bytes.Length;

            using (var requestStream = postRequest.GetRequestStream())
                requestStream.Write(bytes, 0, bytes.Length);

            var webResponse = postRequest.GetResponse();
            ReadResponse(postRequest); // not interested in this either

            var redirectLocation = webResponse.Headers[HttpResponseHeader.Location];
            var finalGetRequest = (HttpWebRequest)WebRequest.Create(redirectLocation);

            /* apply fix for the cookie */
            FixMisplacedCookie(cookieContainer);

            /* do the final request using the correct cookies */
            finalGetRequest.CookieContainer = cookieContainer;
            var responseText = ReadResponse(finalGetRequest);
            Console.WriteLine(responseText); // Hooray!
        }

        private static string ReadResponse(HttpWebRequest getRequest)
        {
            using (var responseStream = getRequest.GetResponse().GetResponseStream())
            using (var sr = new StreamReader(responseStream, Encoding.UTF8))
            {
                return sr.ReadToEnd();
            }
        }

        private void FixMisplacedCookie(CookieContainer cookieContainer)
        {
            var misplacedCookie = cookieContainer.GetCookies(new Uri(Url))[0];
            misplacedCookie.Path = "/"; // instead of "/sintegra/servlet/hwsintco"

            // place the cookie in the right place...
            cookieContainer.SetCookies(
                new Uri("https://www.sefaz.rr.gov.br/"),
                misplacedCookie.ToString());
        }
    }
}
Sometimes HttpWebRequest needs proxy initialization:
request.Proxy = new WebProxy(); // in my case it doesn't need parameters, but you can set it to your proxy address
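If you do need to go through an actual proxy, a sketch (host and port are placeholders):

// Sketch: point the request at an explicit proxy; the address is a placeholder.
request.Proxy = new WebProxy("http://myproxy.example:8080");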
Related
Here I do a POST request to an address I know (I am not the owner) and that is not malicious; I just want to POST the request and get the desired response.
Web request code:
HttpWebRequest oHTTP = (HttpWebRequest)WebRequest.Create("https://some-random-website.com/");
string data = Uri.EscapeDataString(parameters);
oHTTP.Method = "POST";
oHTTP.ContentType = "application/x-www-form-urlencoded";
oHTTP.UserAgent = "Mozilla/5.0 (Windows NT 9; WOW64; rv:38.0) Firefox:40.1";
byte[] bytes = Encoding.ASCII.GetBytes(parameters);
oHTTP.ContentLength = bytes.Length;
using (Stream stream = oHTTP.GetRequestStream())
    stream.Write(bytes, 0, bytes.Length);
HttpWebResponse response = (HttpWebResponse)oHTTP.GetResponse();
string oReceived = new StreamReader(response.GetResponseStream() ?? throw new InvalidOperationException()).ReadToEnd();
Response title:
Warning: Suspected Phishing Site Ahead!
Then there is a button that says:
Dismiss this warning and enter site
So my question is: how can I ignore these warnings and post my request successfully? Should I change my UserAgent?
Note 1: I use Fiddler to inspect both the request and response headers and content.
Note 2: I have done the same thing in AutoIt, but it uses WinHttp and there is no issue on this website.
I am trying to access this link through HttpClient, but every time IsSuccessStatusCode is false. In the past I was able to get the content, but now it won't work; it gives me a 302 response code.
The code that I am trying is:
var handler = new HttpClientHandler()
{
    AllowAutoRedirect = false,
    UseCookies = true,
    PreAuthenticate = true,
    UseDefaultCredentials = true
};
var client = new HttpClient(handler);
//client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
var data = await client.GetAsync(Url);
if (!data.IsSuccessStatusCode)
{
    ;
}
var doc = new HtmlDocument();
var content = await data.Content.ReadAsStringAsync();
Can someone tell me what I am doing wrong here and how I can make it work so I can get the content? Thanks.
P.S. I have the permission of the website owners to use the website.
Quick Solution
AllowAutoRedirect = false // change this to true
What's going on with the IsSuccessStatusCode?
Let's start by looking at the HttpResponseMessage Class for the implementation of the IsSuccessStatusCode property.
public bool IsSuccessStatusCode
{
    get { return ((int)statusCode >= 200) && ((int)statusCode <= 299); }
}
As you can see, a 302 status code will return false.
302 Status Code
The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.
The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).
If the 302 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.
Source: https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
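If you want to keep AllowAutoRedirect = false (for example, to inspect the 302 or capture cookies), a minimal sketch of following the redirect by hand might look like this (the URL is a placeholder):

// Sketch: detect a 3xx response and follow its Location header manually.
// Cookies collected by the handler are reused on the second request.
var handler = new HttpClientHandler { AllowAutoRedirect = false, UseCookies = true };
var client = new HttpClient(handler);

var response = await client.GetAsync("https://example.com/page"); // placeholder URL
if ((int)response.StatusCode >= 300 && (int)response.StatusCode <= 399)
{
    var location = response.Headers.Location;
    response = await client.GetAsync(location);
}
Console.WriteLine(response.IsSuccessStatusCode);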
I have tried just about all related solutions found on the Web, but they all refused to work for some reason. This does not work either: C# - HttpWebRequest POST (Login to Facebook), since we are using different methods.
I am not using the POST method but the GET method for this request. The site I am using does not need any login credentials to get the image. (Most of the other root domains the site has do not require a cookie.)
The below code is a part of what I figured out to make the program get the image like the web-based versions do, but with a few problems.
Before, I was trying to use a normal WebClient to download the image since it refused to show up in any way that the PictureBox control would accept. But then I switched to HttpWebRequest.
The particular root domain of the site where I am trying to get the image from requires a cookie, though.
Below is a code snippet which basically tries to get an image from a site. The only trouble is, it is almost impossible to get the image from the site unless you pass a few things in the HttpWebRequest, along with a cookie.
For now, I am using a static cookie as a temporary workaround.
HttpWebRequest _request = (HttpWebRequest)HttpWebRequest.Create(_URL);
_request.Method = WebRequestMethods.Http.Get;
_request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
_request.Headers.Set(HttpRequestHeader.AcceptEncoding, "gzip,deflate,sdch");
_request.Headers.Set(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.8");
_request.Headers.Set(HttpRequestHeader.CacheControl, "max-age=0");
_request.Host = "www.habbo" + _Country;
_request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36";

using (WebResponse _response = _request.GetResponse())
using (Stream _stream = _response.GetResponseStream())
{
    Image _image = Image.FromStream(_stream);
    _bitmap = new Bitmap(_image);
    string contentType = _response.ContentType;
    _PictureBox.Image = _bitmap;
}
Let the following variables be:
_URL = "http://www.habbo.com/habbo-imaging/avatarimage?hb=img&user=aa&direction=2&head_direction=2&size=m&img_format=gif";
_Country = ".com";
Most of the things I am passing into the HttpWebRequest is obtained from looking at the Network tab of Google Chrome's Developer Tools.
The web-based versions of the Habbo Imager seem to just direct people to the page where they can find the image, and their browsers seem to somehow add the cookie. What I am doing is different: all they do is display the site where the image is located, whereas I want to locate the image's true location and read from it into an Image.
Apparently the site seems to need the user to "visit" it, according to what I read from this thread: Click here
What I would like to know is, is there a better way to get a valid cookie that the server will happily accept every time?
Or do I need to somehow trick the site into thinking the user has visited and seen the page, so that it returns the cookie we might need, even though the user never actually sees the page?
Not too sure if this would mean that I need to somehow dynamically generate the cookies though.
I also do not understand how to truly create or get the cookies (and set stored cookies) using C#, so if it is possible, please use some examples.
I would prefer not to use any third-party libraries or to change the code I am using too much. Nor should the program send two GET requests just to get what it could get with one. Thus, this wouldn't work: Passing cookie with HttpWebRequest in winforms?
I am using .NET 4.0.
It is a little more complicated than expected at first sight. The browser actually makes two calls. The first one returns an HTML page with a small piece of JavaScript that, when executed, sets a cookie and reloads the page. In your C# code you have to mimic that.
In your form class, add an instance variable to hold all the cookies across multiple HttpWebRequest calls:
readonly CookieContainer cookiecontainer = new CookieContainer();
I have created a Builder method that creates the HttpWebRequest and returns an HttpWebResponse. It takes a NameValueCollection to add any cookies to the CookieContainer.
private HttpWebResponse Builder(string url, string host, NameValueCollection cookies)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = WebRequestMethods.Http.Get;
    request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
    // request.Headers.Set(HttpRequestHeader.AcceptEncoding, "gzip,deflate,sdch");
    request.Headers.Set(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.8");
    request.Headers.Set(HttpRequestHeader.CacheControl, "max-age=0");
    request.Host = host;
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36";
    request.CookieContainer = cookiecontainer;
    if (cookies != null)
    {
        foreach (var cookiekey in cookies.AllKeys)
        {
            request.CookieContainer.Add(
                new Cookie(
                    cookiekey,
                    cookies[cookiekey],
                    "/",
                    host));
        }
    }
    return (HttpWebResponse)request.GetResponse();
}
If the incoming stream turns out to have a text/html content type, we need to parse its content and return the cookie name and value. The Parse method does just that:
// find in the html and return the three parameters in a string array
// setCookie('YPF8827340282Jdskjhfiw_928937459182JAX666', '127.0.0.1', 10);
private static string[] Parse(Stream _stream, string encoding)
{
    const string setCookieCall = "setCookie('";

    // copy html as string
    var ms = new MemoryStream();
    _stream.CopyTo(ms);
    var html = Encoding.GetEncoding(encoding).GetString(ms.ToArray());

    // find the setCookie call
    var findFirst = html.IndexOf(
        setCookieCall,
        StringComparison.InvariantCultureIgnoreCase) + setCookieCall.Length;
    var last = html.IndexOf(");", findFirst, StringComparison.InvariantCulture);
    var setCookieStatementCall = html.Substring(findFirst, last - findFirst);

    // take the parameters
    var parameters = setCookieStatementCall.Split(new[] { ',' });
    for (int x = 0; x < parameters.Length; x++)
    {
        // cleanup
        parameters[x] = parameters[x].Replace("'", "").Trim();
    }
    return parameters;
}
Now that our building blocks are complete, we can start calling our methods from the Click handler. We use a loop to call Builder twice to obtain a result from the given URL. Based on the received content type, we either Parse the HTML or create the Image from the stream.
private void button1_Click(object sender, EventArgs e)
{
    var cookies = new NameValueCollection();
    for (int tries = 0; tries < 2; tries++)
    {
        using (var response = Builder(_URL, "www.habbo" + _Country, cookies))
        {
            using (var stream = response.GetResponseStream())
            {
                string contentType = response.ContentType.ToLowerInvariant();
                if (contentType.StartsWith("text/html"))
                {
                    var parameters = Parse(stream, response.CharacterSet);
                    cookies.Add(parameters[0], parameters[1]);
                }
                if (contentType.StartsWith("image"))
                {
                    pictureBox1.Image = Image.FromStream(stream);
                    break; // we're done, get out
                }
            }
        }
    }
}
Words of caution
This code works for the URL in your question. I didn't take any measures to handle other patterns and/or exceptions; it is up to you to add that. Also, when doing this kind of scraping, make sure the owner of the website allows it.
I am trying to write code that will authenticate to the website wallbase.cc. I've looked at what it does using Firebug/Chrome Developer Tools and it seems fairly easy:
Post "usrname=$USER&pass=$PASS&nopass_email=Type+in+your+e-mail+and+press+enter&nopass=0" to the webpage "http://wallbase.cc/user/login", store the returned cookies and use them on all future requests.
Here is my code:
private CookieContainer _cookies = new CookieContainer();
//......
HttpPost("http://wallbase.cc/user/login", string.Format("usrname={0}&pass={1}&nopass_email=Type+in+your+e-mail+and+press+enter&nopass=0", Username, Password));
//......
private string HttpPost(string url, string parameters)
{
    try
    {
        System.Net.WebRequest req = System.Net.WebRequest.Create(url);
        // Add these, as we're doing a POST
        req.ContentType = "application/x-www-form-urlencoded";
        req.Method = "POST";
        ((HttpWebRequest)req).Referer = "http://wallbase.cc/home/";
        ((HttpWebRequest)req).CookieContainer = _cookies;
        // We need to count how many bytes we're sending. POSTed forms should be name=value&
        byte[] bytes = System.Text.Encoding.ASCII.GetBytes(parameters);
        req.ContentLength = bytes.Length;
        System.IO.Stream os = req.GetRequestStream();
        os.Write(bytes, 0, bytes.Length); // push it out there
        os.Close();
        // get response
        using (System.Net.WebResponse resp = req.GetResponse())
        {
            if (resp == null) return null;
            using (Stream st = resp.GetResponseStream())
            {
                System.IO.StreamReader sr = new System.IO.StreamReader(st);
                return sr.ReadToEnd().Trim();
            }
        }
    }
    catch (Exception)
    {
        return null;
    }
}
After calling HttpPost with my login parameters, I would expect all future calls using this same method to be authenticated (assuming a valid username/password). I do get a session cookie in my cookie collection, but for some reason I'm not authenticated. I get a session cookie regardless of which page I visit, so I tried loading the home page first to get the initial session cookie and then logging in, but there was no change.
To my knowledge this Python version works: https://github.com/sevensins/Wallbase-Downloader/blob/master/wallbase.sh (line 336)
Any ideas on how to get authentication working?
Update #1
When using a correct user/password pair, the response automatically redirects to the referrer, but when an incorrect pair is sent it does not redirect and returns a bad user/pass message. Based on this, it seems as though authentication is happening, but maybe not all the key pieces of information are being saved?
Update #2
I am using .NET 3.5. When I tried the above code in .NET 4, with the added line System.Net.ServicePointManager.Expect100Continue = false (which was in my code, just not shown here), it works with no further changes. The problem seems to stem directly from some pre-.NET 4 issue.
This is based on code from one of my projects, as well as code found from various answers here on stackoverflow.
First we need to set up a cookie-aware WebClient that is going to use HTTP 1.0.
public class CookieAwareWebClient : WebClient
{
    private readonly CookieContainer cookie = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.ProtocolVersion = HttpVersion.Version10;
            httpRequest.CookieContainer = cookie;
        }
        return request;
    }
}
Next we set up the code that handles the Authentication and then finally loads the response.
var client = new CookieAwareWebClient();
client.UseDefaultCredentials = true;
client.BaseAddress = @"http://wallbase.cc";

var loginData = new NameValueCollection();
loginData.Add("usrname", "test");
loginData.Add("pass", "123");
loginData.Add("nopass_email", "Type in your e-mail and press enter");
loginData.Add("nopass", "0");

var result = client.UploadValues(@"http://wallbase.cc/user/login", "POST", loginData);
string response = System.Text.Encoding.UTF8.GetString(result);
We can try this out using the HTML Visualizer built into Visual Studio while in debug mode, and use it to confirm that we were able to authenticate and load the home page while staying authenticated.
The key here is to set up a CookieContainer and use HTTP 1.0 instead of 1.1. I am not entirely sure why forcing 1.0 allows you to authenticate and load the page successfully, but part of the solution is based on this answer:
https://stackoverflow.com/a/10916014/408182
I used Fiddler to make sure that the request sent by the C# client was the same as the one from my web browser (Chrome). It also lets me confirm whether the C# client is being redirected correctly. In this case we can see that with HTTP 1.0 we get HTTP/1.0 302 Found and are then redirected to the home page as intended. If we switch back to HTTP 1.1, we get an HTTP/1.1 417 Expectation Failed message instead.
There is some information on this error message available in this stackoverflow thread.
HTTP POST Returns Error: 417 "Expectation Failed."
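A related knob, mentioned in the .NET 3.5/4 update above, is disabling the Expect: 100-continue header that HttpWebRequest adds to POSTs by default; servers that reject that header are exactly the ones answering 417:

// Disable the "Expect: 100-continue" request header globally;
// this is the usual fix for 417 Expectation Failed.
System.Net.ServicePointManager.Expect100Continue = false;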
Edit: Hack/Fix for .NET 3.5
I have spent a lot of time trying to figure out the difference between 3.5 and 4.0, but I seriously have no clue. It looks like 3.5 is creating a new cookie after the authentication and the only way I found around this was to authenticate the user twice.
I also had to make some changes on the WebClient based on information from this post.
http://dot-net-expertise.blogspot.fr/2009/10/cookiecontainer-domain-handling-bug-fix.html
public class CookieAwareWebClient : WebClient
{
    public CookieContainer cookies = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.ProtocolVersion = HttpVersion.Version10;
            httpRequest.CookieContainer = cookies;

            var table = (Hashtable)cookies.GetType().InvokeMember(
                "m_domainTable",
                System.Reflection.BindingFlags.NonPublic |
                System.Reflection.BindingFlags.GetField |
                System.Reflection.BindingFlags.Instance,
                null, cookies, new object[] { });
            var keys = new ArrayList(table.Keys);
            foreach (var key in keys)
            {
                var newKey = (key as string).Substring(1);
                table[newKey] = table[key];
            }
        }
        return request;
    }
}
var client = new CookieAwareWebClient();
var loginData = new NameValueCollection();
loginData.Add("usrname", "test");
loginData.Add("pass", "123");
loginData.Add("nopass_email", "Type in your e-mail and press enter");
loginData.Add("nopass", "0");

// Hack: authenticate the user twice!
client.UploadValues(@"http://wallbase.cc/user/login", "POST", loginData);
var result = client.UploadValues(@"http://wallbase.cc/user/login", "POST", loginData);
string response = System.Text.Encoding.UTF8.GetString(result);
You may need to add the following:
// get response
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    foreach (Cookie c in resp.Cookies)
        _cookies.Add(c);
    // do other stuff with the response...
}
Another thing you might have to handle: if the server responds with a 302 (redirect), the .NET web request will automatically follow it, and in the process you might lose the cookie you're after. You can turn off this behavior with the following code:
req.AllowAutoRedirect = false;
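Putting the two together, a minimal sketch (the URL is a placeholder; _cookies is the container from the question):

// Sketch: disable auto-redirect so the Set-Cookie headers on the 302 are not lost,
// capture the cookies, then follow the redirect manually.
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://example.com/login"); // placeholder
req.Method = "POST";
req.CookieContainer = _cookies;
req.AllowAutoRedirect = false;
// ... write the POST body as in HttpPost above ...
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    foreach (Cookie c in resp.Cookies)
        _cookies.Add(c);
    if (resp.StatusCode == HttpStatusCode.Found) // 302
    {
        string location = resp.Headers[HttpResponseHeader.Location];
        var follow = (HttpWebRequest)WebRequest.Create(new Uri(new Uri("http://example.com/"), location));
        follow.CookieContainer = _cookies; // the captured session cookies travel along
        using (follow.GetResponse()) { /* read as usual */ }
    }
}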
The Python version you reference uses a different referrer (http://wallbase.cc/start/). The login is also followed by another POST, to http://wallbase.cc/user/adult_confirm/1. Try the other referrer and follow up with this POST.
I think you are authenticating correctly, but the site needs more info/assertions from you before proceeding.
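A sketch of that suggestion, reusing the HttpPost helper from the question (the referrer and follow-up URL come from the script; loginParameters is hypothetical, and the empty adult_confirm body is an assumption):

// Inside HttpPost, use the referrer the script uses:
((HttpWebRequest)req).Referer = "http://wallbase.cc/start/";

// Then, after logging in, mimic the script's follow-up POST.
// loginParameters is a hypothetical variable; the empty body is an assumption.
HttpPost("http://wallbase.cc/user/login", loginParameters);
HttpPost("http://wallbase.cc/user/adult_confirm/1", "");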
For several days I've been trying to write a program that remote-uploads an image to an image host (imgur.com). I used Wireshark to sniff the HTTP requests sent by the browser, then created an HttpWebRequest with similar headers and parameters. But the server always sends me back something weird. Please look at the code (simplified):
static void Main(string[] args)
{
    ServicePointManager.Expect100Continue = false;
    CookieContainer cc = new CookieContainer();
    List<string> formData = new List<string>();

    // The first request - login
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://imgur.com/signin");
    configRequest(request, cc);

    // add POST params
    add(formData, "username", "abcdefgh"); // this is a working account,
    add(formData, "password", "abcdefgh"); // feel free to use it if you
    add(formData, "remember", "remember"); // want to test
    add(formData, "submit", "");
    writeToRequestStream(request, formData);

    // send request
    request.GetResponse();

    // The second request - remote upload image
    request = (HttpWebRequest)WebRequest.Create("http://imgur.com/upload?sid_hash=9efff36179fef47dc5e078a4575fd96a");
    configRequest(request, cc);

    // add POST params
    formData = new List<string>();
    add(formData, "url", "http://img34.imageshack.us/img34/8425/89948070152259768406.jpg");
    add(formData, "create_album", "0");
    add(formData, "album_title", "Optional Album Title");
    add(formData, "album_layout", "b");
    add(formData, "edit_url", "0");
    writeToRequestStream(request, formData);

    // send request
    Stream s = request.GetResponse().GetResponseStream();
    StreamReader sr = new StreamReader(s);
    string html = sr.ReadToEnd();
    sr.Close();
    s.Close();
    Console.WriteLine(html + "\n\n");
}

static void add(List<string> formData, string key, string value)
{
    formData.Add(HttpUtility.UrlEncode(key) + "=" + HttpUtility.UrlEncode(value));
}

static void configRequest(HttpWebRequest request, CookieContainer cc)
{
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded; charset=UTF-8";
    request.CookieContainer = cc;
    request.Credentials = CredentialCache.DefaultCredentials;
    request.Accept = "*/*";
    request.KeepAlive = true;
    request.Referer = "http://imgur.com/";
    request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15";
    request.Headers.Add("Accept-Language", "en-us,en;q=0.5");
    request.Headers.Add("Accept-Encoding", "gzip,deflate");
    request.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
    request.Headers.Add("Keep-Alive", "115");
    request.Headers.Add("X-Requested-With", "XMLHttpRequest");
    request.Headers.Add("Pragma", "no-cache");
    request.Headers.Add("Cache-Control", "no-cache");
}

static void writeToRequestStream(HttpWebRequest request, List<string> formData)
{
    // build request stream
    string queryString = String.Join("&", formData.ToArray());
    byte[] byteArray = Encoding.UTF8.GetBytes(queryString);

    // write to stream
    request.ContentLength = byteArray.Length;
    Stream rs = request.GetRequestStream();
    rs.Write(byteArray, 0, byteArray.Length);
    rs.Close();
}
Now I sniff my upload request (the 2nd request) and compare it to the browser's request; there are only two differences:
- the browser's Connection header is 'keep-alive' but mine has none (I don't know why, although request.KeepAlive is set to true)
- some of the browser's cookies don't appear in mine
The response should be a JSON, something like this:
{"hashes":"[\"QcvII\"]","hash":"QcvII","album":false,"edit":false}
But the server responds to my request with a pile of special characters... I can't figure out which of the two differences above makes my code fail. I would really appreciate any help making this code work. I'm a newbie, so please don't blame me if my code or my wording is silly.
Can anybody help make this code work?
P.S.: I'm using .NET Framework 4.
My guess is that the sid_hash url parameter in your attempt to upload the image is a session id that needs to change when you log in.
OK, fortunately I've now found the solution. Forget everything in my configRequest() function (except the first 3 lines); those headers just make things go wrong. The solution is: after sending the login request, send another request to the homepage (no parameters needed, but remember to include the cookies received from the first request). The sid_hash can be found in the returned HTML. Use that sid_hash to make the remote-upload request.
Thank you all, guys.
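For reference, a minimal sketch of that flow, assuming the sid_hash can be pulled out of the homepage HTML with a regex (the pattern is an assumption; check the actual page source):

// Sketch: fetch the homepage with the login cookies, extract sid_hash, build the upload URL.
// Requires: using System.Text.RegularExpressions;
HttpWebRequest home = (HttpWebRequest)WebRequest.Create("http://imgur.com/");
home.CookieContainer = cc; // cookies from the login request
string homeHtml;
using (var reader = new StreamReader(home.GetResponse().GetResponseStream()))
    homeHtml = reader.ReadToEnd();

Match m = Regex.Match(homeHtml, @"sid_hash=([0-9a-f]{32})"); // pattern is an assumption
if (m.Success)
{
    string uploadUrl = "http://imgur.com/upload?sid_hash=" + m.Groups[1].Value;
    // ... build the upload POST against uploadUrl as in Main() above ...
}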
Not sure about your code, but ClipUpload is an open source project that seems to already do about what you want:
Quickly upload anything that's on your clipboard to the internet. It supports FTP, Imgur.com, Pastebin.com and SendSpace.com. Usage? Step 1: Copy. Step 2: Click the system tray icon. Step 3: Paste the public link. The easiest way to share your clipboard!
Most likely, the second request needs the session ID cookies. Without those cookies, the server will not be able to recognise you, hence the upload will not work.
You can set the keep-alive header yourself, but my suggestion is to post a snippet of the response headers from the first request so we can help.
UPDATE
According to your updates, you need to include this cookie:
IMGURSESSION=iliutpm33rhl2rugn5vcr8jq60
Obviously the value will change with each login.
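For illustration, a hedged sketch of attaching such a session cookie by hand (the value shown is the captured one above and will differ for every session):

// Sketch: add the session cookie to the container manually.
CookieContainer cc = new CookieContainer();
cc.Add(new Cookie("IMGURSESSION", "iliutpm33rhl2rugn5vcr8jq60", "/", "imgur.com"));

HttpWebRequest upload = (HttpWebRequest)WebRequest.Create("http://imgur.com/upload");
upload.CookieContainer = cc; // the session cookie is sent with the request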