I am trying to download a file over HTTPS and I just keep running into a brick wall with correctly setting Cookies and Headers.
Does anyone have or know of any code I can review that does this correctly, i.e. downloads a file over HTTPS and sets cookies/headers?
Thanks!
I did this the other day. In summary, you need to create an HttpWebRequest and an HttpWebResponse to submit and receive data. Since you need to maintain cookies across multiple requests, you need to create a cookie container to hold your cookies. You can also set header properties on the request/response if needed.
Basic Concept:
using System.Net;
// Create Cookie Container (Place to store cookies during multiple requests)
CookieContainer cookies = new CookieContainer();
// Request Page
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("https://www.amazon.com");
req.CookieContainer = cookies;
// Response Output (Could be page, PDF, csv, etc...)
HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
// Add Response Cookies to Cookie Container
// I only had to do this for the first "login" request
cookies.Add(resp.Cookies);
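To round out the basic concept, here is a minimal sketch of setting headers and saving the response body to disk. The class name, the User-Agent value, and the X-Example-Header header are made up for illustration; use whatever your target site actually expects. Note that some headers (User-Agent, Accept, Referer and a few others) have to be set through the HttpWebRequest properties rather than through Headers.Add.

```csharp
using System;
using System.IO;
using System.Net;

public static class HttpsDownloadSketch
{
    // Builds a GET request that shares the given cookie container and
    // carries a few explicit headers. No network traffic happens here.
    public static HttpWebRequest BuildRequest(string url, CookieContainer cookies)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
        req.CookieContainer = cookies;  // cookies for this URL are sent, and new ones stored
        req.UserAgent = "Mozilla/5.0 (compatible; DownloadSketch/1.0)"; // hypothetical value
        req.Accept = "*/*";
        req.Headers.Add("X-Example-Header", "value"); // hypothetical custom header
        return req;
    }

    // Sends the request and copies the response body to a local file.
    public static void DownloadTo(HttpWebRequest req, string path)
    {
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        using (Stream body = resp.GetResponseStream())
        using (FileStream file = File.Create(path))
        {
            body.CopyTo(file);
        }
    }
}
```

Reusing the same CookieContainer instance for the login request and the later download request is what carries the session across.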
The key to figuring this out is capturing the traffic for a real request. I did this using Fiddler, and over the course of a few captures (almost 10) I figured out what I needed to do to reproduce the login to a site where I needed to run some reports based on different selection criteria (date range, parts, etc.) and download the results into CSV files. It's working perfectly, but Fiddler was the key to figuring it out.
http://www.fiddler2.com/fiddler2/
Good Luck.
Zach
This fellow wrote an application to download files using HTTP:
http://www.codeproject.com/KB/IP/DownloadDemo.aspx
Not quite sure what you mean by setting cookies and headers. Is that required by the site you are downloading from? If it is, what cookies and headers need to be set?
I've had good luck with the WebClient class. It's a wrapper for HttpWebRequest that can save a few lines of code: http://msdn.microsoft.com/en-us/library/system.net.webclient.aspx
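A rough sketch of the WebClient route, assuming the site only needs a User-Agent and a cookie string you already know (for example, one copied from a Fiddler capture). The class name, method names, and header values are placeholders, not anything the site requires.

```csharp
using System.Net;

public static class WebClientSketch
{
    // Creates a WebClient with a browser-like User-Agent and, optionally,
    // a raw Cookie header such as "name1=value1; name2=value2".
    public static WebClient CreateClient(string cookieHeader)
    {
        var client = new WebClient();
        client.Headers.Add(HttpRequestHeader.UserAgent,
            "Mozilla/5.0 (compatible; DownloadSketch/1.0)"); // hypothetical value
        if (!string.IsNullOrEmpty(cookieHeader))
        {
            client.Headers.Add(HttpRequestHeader.Cookie, cookieHeader);
        }
        return client;
    }

    // Downloads the resource at url straight to a file on disk.
    public static void Download(string url, string path, string cookieHeader)
    {
        using (WebClient client = CreateClient(cookieHeader))
        {
            client.DownloadFile(url, path);
        }
    }
}
```

This is fine for a single request with known cookies. If the cookies have to be picked up from a login response first, a CookieContainer on HttpWebRequest is probably the easier tool.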
Related
I am adding multiple custom headers in my HttpResponse and on the next request from the browser for the page, I want to read those custom headers and determine if I need to send a 304 response to the browser.
I added the custom headers using Response.AddHeader but on the next request from the browser, the custom headers were not sent.
The browser did receive the custom headers in the response stream but did not send them on the subsequent request.
I'm expecting the headers since I need to read them on the first request and not on post requests.
NOTE: I don't want to use cookies since I don't want to increase payload. I don't want to use sessions since I don't want to burden the server. My aim is to decrease processing on the server as much as possible. As I've mentioned in my comment, I read about ETags and I'm hoping the technique that's used in ETags could be used for custom headers.
There are other ways of passing information between requests. See this discussion.
You can also use session variables.
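On the ETag idea from the question: browsers do not echo arbitrary response headers back, but they do send a cached response's ETag value back in the If-None-Match request header when revalidating. Below is a minimal sketch of the comparison the server would make. The class and method names are made up, and the comma splitting is a simplification that assumes the tags themselves contain no commas.

```csharp
using System;

public static class ConditionalGetSketch
{
    // Removes surrounding whitespace and an optional weak-validator prefix (W/).
    private static string StripWeakPrefix(string tag)
    {
        tag = tag.Trim();
        return tag.StartsWith("W/", StringComparison.Ordinal) ? tag.Substring(2) : tag;
    }

    // Returns true when the If-None-Match value sent by the browser contains
    // the current ETag (or "*"), meaning a 304 Not Modified can be returned.
    public static bool IsNotModified(string ifNoneMatch, string currentETag)
    {
        if (string.IsNullOrEmpty(ifNoneMatch) || string.IsNullOrEmpty(currentETag))
        {
            return false;
        }
        string current = StripWeakPrefix(currentETag);
        foreach (string candidate in ifNoneMatch.Split(','))
        {
            string tag = StripWeakPrefix(candidate);
            if (tag == "*" || tag == current)
            {
                return true;
            }
        }
        return false;
    }
}
```

In an ASP.NET page this could be wired up by reading Request.Headers["If-None-Match"], setting Response.StatusCode to 304 and skipping the body on a match, and otherwise sending the current value with Response.AddHeader("ETag", ...). The ETag value includes its double quotes, e.g. "\"v42\"".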
I'm experiencing a strange issue with WebClient.DownloadString that I can't seem to solve. My code:
Dim client As New WebClient()
Dim html = client.DownloadString("http://www.btctrade.com/")
The content doesn't seem to be fully AJAX, so it can't be that. Is it due to the web page being in Chinese? I'm guessing HTML is just served as HTML, so it can't really be that either. The URL is fine when I go to it and there seem to be no redirects to https either.
Anyone know why this is happening?
You must set the cookies and the user agent in the WebClient headers. This works:
client.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20100101 Firefox/14.0.1")
client.Headers.Add(HttpRequestHeader.Cookie, "USER_PW=9b1283bfe37ac47b243a1e0c9c1c9e52; PHPSESSID=f692406a0c84dba2605a7065d55a3b53")
If you want the request to do all this work for you, you have to use HttpWebRequest, then save the response's headers (the cookies in particular) and use them in a new request.
WebClient is not buggy, so probably the server is returning data you did not expect. Use Fiddler to watch what happens when you go to the site in a web browser.
When I executed your code the web site returned no data. When I visited the site in a web browser it returned data. Probably, the site is detecting that you are a bot and denying you access. Fake being a browser by mimicking what you see in Fiddler.
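One common way to combine both suggestions, sketched here in C#, is a small WebClient subclass that attaches a shared CookieContainer and a browser-like User-Agent to every request. The class name, the PreviewRequest helper, and the User-Agent string are made up for illustration; copy the real header values from what Fiddler shows your browser sending.

```csharp
using System;
using System.Net;

public class CookieAwareWebClient : WebClient
{
    // Cookies set by earlier responses are kept here and sent on later requests.
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
        {
            http.CookieContainer = Cookies;
            // Hypothetical value; mimic the browser you captured.
            http.UserAgent = "Mozilla/5.0 (Windows NT 10.0; rv:100.0) Gecko/20100101 Firefox/100.0";
        }
        return request;
    }

    // Exposed only so the request setup can be inspected without sending it.
    public WebRequest PreviewRequest(Uri address)
    {
        return GetWebRequest(address);
    }
}
```

With this in place, calling DownloadString twice on the same instance lets the second request send back whatever cookies the first response set.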
I'm calling a URL in my app, and that URL sends me data via cookies. How should I get that data from the cookies? I'm looking for something like NSHttpCookie and NSHttpSharedCookie in iOS.
1. If you're using the Windows Phone WebBrowser control, you can try to use the WebBrowserExtensions.GetCookies method:
You could use the GetCookies method to retrieve cookies associated
with a website if you use the WebBrowser control in your application.
Once you have retrieved a CookieCollection, you could use the cookies
to make subsequent HTTP requests to the website.
It should return a CookieCollection which contains Cookie instances from which you'll get all the information you need.
2. If you're using HttpWebRequest, you'll find a good tutorial on MSDN here.
Basically, to send cookies you create and populate a CookieContainer instance and assign it to your HttpWebRequest. In the other direction, you read the received cookies from the Cookies property of the HttpWebResponse (which also returns a CookieCollection).
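A small sketch of both directions, kept free of network calls so it only shows the cookie handling. The class name, cookie name, and cookie value are invented for illustration; in a real app the CookieCollection would come from the response's Cookies property or from the container after a request completes.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class CookieRoundTripSketch
{
    // Creates a container holding one cookie for the given site; assigning it
    // to request.CookieContainer makes the request send that cookie.
    public static CookieContainer BuildJar(Uri site)
    {
        var jar = new CookieContainer();
        jar.Add(site, new Cookie("session", "abc123")); // hypothetical cookie
        return jar;
    }

    // Reads name/value pairs out of a CookieCollection, e.g. response.Cookies
    // or jar.GetCookies(site).
    public static string Describe(CookieCollection cookies)
    {
        var parts = new List<string>();
        foreach (Cookie cookie in cookies)
        {
            parts.Add(cookie.Name + "=" + cookie.Value);
        }
        return string.Join("; ", parts);
    }
}
```

Calling jar.GetCookies(site) on the container returns the cookies that would be sent to that URL, which is also a convenient way to read what a server stored there.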
I'm trying to get HTML code from a specific webpage, but when I do it using
HttpWebRequest request;
HttpWebResponse response;
StreamReader streamReader;
request = (HttpWebRequest)WebRequest.Create(pageURL);
response = (HttpWebResponse)request.GetResponse();
streamReader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("windows-1251"));
htmlCode = streamReader.ReadToEnd();
streamReader.Close();
or using WebClient, I get redirected to a login page and I get its code.
Is there any other way to get HTML code?
I read some information here: How to get HTML from a current request, in a postback, but didn't understand what I should do, or how and where to specify the URL.
P.S.:
I'm logged in in a browser. Notepad++ perfectly gets what I need via "right click - view source code".
Thanks.
If you get redirected to a login page, then presumably you must be logged in before you can get the content.
So you need to make a request, with suitable credentials, to the login page. Get whatever tokens are sent (usually in the form of cookies) to maintain the login. Then request the page you want (sending the cookies with the request).
Alternatively (and this is the preferred approach), most major sites that expect automated systems to interact with them provide an API (often using OAuth for authentication). Consult their documentation to see how their API works.
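The login-then-fetch steps above could look roughly like this, assuming the site uses a plain HTML form login. The class name, method names, and form field names are placeholders. Capture a real login to see which fields and URL the site actually expects, since many sites also require hidden form tokens.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

public static class LoginThenFetchSketch
{
    // Encodes form fields as application/x-www-form-urlencoded text.
    public static string BuildFormBody(IDictionary<string, string> fields)
    {
        var pairs = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
        {
            pairs.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        }
        return string.Join("&", pairs);
    }

    // POSTs the credentials; any login cookies end up in the shared container.
    public static void Login(string loginUrl, IDictionary<string, string> fields, CookieContainer jar)
    {
        byte[] body = Encoding.UTF8.GetBytes(BuildFormBody(fields));
        var req = (HttpWebRequest)WebRequest.Create(loginUrl);
        req.Method = "POST";
        req.ContentType = "application/x-www-form-urlencoded";
        req.ContentLength = body.Length;
        req.CookieContainer = jar;
        using (Stream stream = req.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        using (req.GetResponse()) { } // response body is not needed, only the cookies
    }

    // GETs a protected page, sending the cookies collected during login.
    public static string Fetch(string pageUrl, CookieContainer jar, Encoding encoding)
    {
        var req = (HttpWebRequest)WebRequest.Create(pageUrl);
        req.CookieContainer = jar;
        using (var resp = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream(), encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would be one Login call followed by any number of Fetch calls, all passing the same CookieContainer and, for the page in the question, Encoding.GetEncoding("windows-1251").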
If the page you want to get to is behind a login screen, you're going to need to perform the login through code, and attach a CookieContainer to your request to hold the login cookie that the website will try to set.
Alternatively, if you have a user who can help the program along, you could try listing the cookies for the site once they've logged in through their browser. Copy that cookie across and add it to the CookieContainer.
Cheers
Simon
If you want to scrape an HTML page that requires authentication, I suggest you use WatiN
to fill in the proper fields and navigate to the pages you want to download.
It may seem like overkill at first glance, but it will save a lot of trouble later.
This should be an easy question, but I've been unable to solve it. I'm trying to change the Referer header prior to redirecting the page of an HttpResponse object. I know this can be done in an HttpWebResponse, but I can't get it to work for a standard Page.Response.
I'm trying to just set the referer header to look like it originated from a temp page on my site (this is for analytics tracking for an external system).
Is this possible?
I've tried to use the code below (as well as variations such as Response.AppendHeader and Response.AddHeader), however the Referer always shows as the page that the Request initiated from.
Response.Headers.Add("Referer", "http://test.local/fromA");
Response.Redirect(HttpContext.Current.Request.Url.AbsoluteUri);
If not via .net can this be accomplished via js?
Thanks!
Referer is controlled (and sent) by the client. You can't affect it server-side. There may be some JavaScript that you could emit that'd get the client to do it - but it's probably considered a security flaw, so I wouldn't count on it.
The referrer is set by the client, not the server. It is useful to include in a request and not a response as it points to the URL where the request came from.
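For contrast, when your own code is the client (an HttpWebRequest rather than a browser following a redirect), the Referer can be set freely, because the client is the side that sends it. A minimal sketch with made-up class and method names; it does not help with the Response.Redirect case above, where the browser decides what to send.

```csharp
using System.Net;

public static class RefererSketch
{
    // Builds a request whose Referer header is chosen by the caller.
    // Referer is a restricted header, so it is set through the property
    // rather than through Headers.Add.
    public static HttpWebRequest WithReferer(string url, string referer)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.Referer = referer;
        return req;
    }
}
```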