I'm trying to get the same type of results that Fiddler gets when I launch a webpage from my app.
Below is the code I'm using and the results I'm getting. I've used google.com only as an example.
What do I need to modify in my code to get the results I want or do I need an entirely different approach?
Thanks for your help.
My code:
// create the HttpWebRequest object
HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create("http://www.google.com");
// get the response object which has the header info, using the GetResponse method
var objResults = objRequest.GetResponse();
// get the header count
int intCount = objResults.Headers.Count;
// loop through the results object
for (int i = 0; i < intCount; i++)
{
string strKey = objResults.Headers.GetKey(i);
string strValue = objResults.Headers.Get(i);
lblResults.Text += strKey + "<br />" + strValue + "<br /><br />";
}
My results:
Cache-Control
private, max-age=0
Content-Type
text/html; charset=ISO-8859-1
Date
Tue, 05 Jun 2012 17:40:38 GMT
Expires
-1
Set-Cookie
PREF=ID=526197b0260fd361:FF=0:TM=1338918038:LM=1338918038:S=gefqgwkuzuPJlO3G; expires=Thu, 05-Jun-2014 17:40:38 GMT; path=/; domain=.google.com,NID=60=CJbpzMe6uTKf58ty7rysqUFTW6GnsQHZ-Uat_cFf1AuayffFtJoFQSIwT5oSQKqQp5PSIYoYtBf_8oSGh_Xsk1YtE7Z834Qwn0A4Sw3ruVCA9v3f_UDYH4b4fAloFJbW; expires=Wed, 05-Dec-2012 17:40:38 GMT; path=/; domain=.google.com; HttpOnly
P3P
CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server
gws
X-XSS-Protection
1; mode=block
X-Frame-Options
SAMEORIGIN
Transfer-Encoding
chunked
=========================
Fiddler results:
# | Result | Protocol | Host | URL | Body | Caching | Content-Type | Process | Comments | Custom
1 | 304 | HTTP | www.rolandgarros.com | /images/misc/weather/P8.gif | 0 | max-age=700 Expires: Tue, 05 Jun 2012 17:53:40 GMT | image/gif | firefox:5456 | |
2 | 200 | HTTP | www.google.com | / | 23,697 | private, max-age=0 Expires: -1 | text/html; charset=UTF-8 | chrome:2324 | |
3 | 304 | HTTP | www.rolandgarros.com | /images/misc/weather/P9.gif | 0 | max-age=700 Expires: Tue, 05 Jun 2012 17:53:57 GMT | image/gif | firefox:5456 | |
4 | 200 | HTTP | Tunnel to | translate.googleapis.com:443 | 0 | | | chrome:2324 | |
5 | 200 | HTTP | www.google.com
The difference is Fiddler is actually recording an entire session, not just a single HTTP request.
If a user loads Google.com, the response is typically an HTML document which contains images, script files, CSS files, etc. Your browser will then initiate a new HTTP request for each one of those resources. With Fiddler running, it tracks each of those HTTP requests and spits out the result code and other information about the session.
With your C# code above, you're only initiating a single HTTP request, thus you only have information about a single result.
You'd probably be better off writing a browser plugin. Otherwise, you'd have to parse the HTML response and load other resources from that document as well.
If you do need to do this with C# code, you could probably parse the document with the HTML Agility Pack and then look for other resources within the HTML to simulate a browser. There are also embedded browsers, such as Awesomium, that might be helpful.
You are not asking for the same information that Fiddler is displaying. Fiddler shows the HTTP Status code, the host and URI and (it appears, from your example) the Content Length, Content Type and Cache status.
For many of these you will have to look into the response headers.
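As a rough illustration of that last point, here is a minimal sketch that builds one Fiddler-like row from a single response. The names ResponseSummary, Format and Describe and the " | " separator are made up for this example, and it only covers the one request you issue yourself, not the follow-up requests a browser would make.

```csharp
using System;
using System.Net;

public static class ResponseSummary
{
    // Builds one row from values that a single response exposes.
    // Caching is taken from the Cache-Control and Expires headers.
    public static string Format(int statusCode, Uri uri, long contentLength, WebHeaderCollection headers)
    {
        string caching = headers["Cache-Control"] ?? "";
        string expires = headers["Expires"];
        if (expires != null)
        {
            caching += " Expires: " + expires;
        }
        string contentType = headers["Content-Type"] ?? "";

        return string.Join(" | ",
            statusCode,
            uri.Scheme.ToUpperInvariant(),
            uri.Host,
            uri.PathAndQuery,
            contentLength,
            caching.Trim(),
            contentType);
    }

    // Issues one request and summarizes it.
    // ContentLength is -1 when the server does not send a Content-Length header
    // (for example with "Transfer-Encoding: chunked").
    public static string Describe(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return Format((int)response.StatusCode, response.ResponseUri,
                          response.ContentLength, response.Headers);
        }
    }
}
```

Calling ResponseSummary.Describe("http://www.google.com") would give you one row for the document itself; rows for images, scripts and stylesheets would each need their own request.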
Request:
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
String responseString = new StreamReader(response.GetResponseStream()).ReadToEnd();
Console.WriteLine(responseString);
Response:
{"code":"SUCCESS","details":
{"created_time":"","id":"xxxx"},
"message":"uploaded",
"status":"success"}
HTTP/1.1 200 OK
Date: Wed, 18 Dec 2019 11:42:26 IST
Last-Modified: Wed, 18 Dec 2019 11:42:25 IST
Content-Type: application/json
Connection: Keep-Alive
Server: AWServer
Pragma: no-cache
Cache-Control: no-cache
Expires: 1
Whenever the C# request above is executed, the response string occasionally contains the headers (HTTP/1.1 200 OK...), even though I'm only trying to get the body ({"code"...}) via response.GetResponseStream(). Is this the intended behavior?
Take a look at the basic article on HTTP headers:
HTTP headers let the client and the server pass additional information with an HTTP request or response. An HTTP header consists of its case-insensitive name followed by a colon (:), then by its value. Whitespace before the value is ignored.
Headers are additional information. Since you left out the URL and the code that creates the request, my guess is that some responses carry these headers and some do not. That depends on what additional non-body information the API or web server wants to respond with.
It's in the control of the responder and not the receiver.
Don't ignore them: sometimes interesting metadata comes in headers. It should not be the data itself but information about it, such as encoding, CORS info, etc.
I'm attempting to write a curl-like tool that demonstrates the effect of various HTTP caching headers on .NET's HttpClient class.
In my initial attempt I'm pointing the tool at one of my internal web services that does not specify any caching information in the response and examining the header of the response.
I expect to see that the request is re-sent each time and executed on the server, returning a new but identical set of content each time (for the purpose of this test, the content is static on the server). But, instead, each request after the first returns much more quickly than the first and includes a new header Age that was not present in the very first response. This indicates to me that the HttpClient in my command-line tool is returning the response from cache, not placing a new request.
Here is the first request with the response headers:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.3235905):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Content-Length = 150867
Content-Type = application/json; charset=utf-8
and here is the request from the same session of my curl tool, a little while later:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.0188433):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Age = 312
Content-Length = 150867
Content-Type = application/json; charset=utf-8
and finally, after I stop and start my program, here's another request from the new instance:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.0517271):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Age = 528
Content-Length = 150867
Content-Type = application/json; charset=utf-8
The last one I find even more difficult to understand as I was under the impression (from reading this: https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/) that caching is maintained per instance of HttpClient.
This seems to continue forever with Age increasing each request. The only way to get back to the original response is to use Internet Explorer and delete temporary internet files.
[Additional Info] After leaving my command line application open for a couple of hours I repeated the request and received a response identical to the original, without the Age header. So it appears that, if HttpClient was caching the response, that cache expired after a couple of hours.
Can anyone tell me if I'm correct that HttpClient is performing internal caching in this case, and if so, why it's doing so in the absence of any caching-related response headers and what policy it's using?
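One way to narrow this down (a sketch for testing, not an explanation of the behaviour) is to send the request with a Cache-Control: no-cache request header and check whether the Age header still comes back. CacheProbe and its method names are hypothetical; the calls used are the standard System.Net.Http header properties.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class CacheProbe
{
    // Builds a GET request that asks caches along the way not to answer from stored content.
    public static HttpRequestMessage BuildNoCacheRequest(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.CacheControl = new CacheControlHeaderValue { NoCache = true };
        return request;
    }

    // Reports the Age response header, which a cache adds when it serves a stored response.
    public static string DescribeAge(HttpResponseMessage response)
    {
        TimeSpan? age = response.Headers.Age;
        return age.HasValue
            ? "Age = " + (int)age.Value.TotalSeconds
            : "no Age header";
    }
}
```

With an existing HttpClient this would be used as client.SendAsync(CacheProbe.BuildNoCacheRequest("http://myserver:8058/path1/path2")), followed by CacheProbe.DescribeAge on the result. If Age disappears with the no-cache request, some cache between the program and the server was answering the earlier requests.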
I'm trying to gather a list of recent posts that contain a certain hashtag. The API Documentation states that I should be using the following GET call:
https://api.instagram.com/v1/tags/{tag-name}/media/recent?access_token=ACCESS-TOKEN
When I load the page where I want this information displayed, I perform the following:
using(HttpClient Client = new HttpClient())
{
var uri = "https://api.instagram.com/v1/tags/" + tagToLookFor + "/media/recent?access_token=" + Session["instagramaccesstoken"].ToString();
var results = Client.GetAsync(uri).Result;
// Result handling below here.
}
For reference, tagToLookFor is a constant string defined at the top of the class (e.g. foo), and I store the access token returned from the OAuth process in the Session object with a key of 'instagramaccesstoken'.
While debugging this, I checked to make sure the URI was being formed correctly, and it does contain both the tag name and the just-created access_token. Using Apigee with the same URI (save for a different access_token) returns the valid results I would expect. However, attempting to GET using the URI on my website returns:
{
StatusCode: 400,
ReasonPhrase: 'BAD REQUEST',
Version: 1.1,
Content: System.Net.Http.StreamContent,
Headers:{
X-Ratelimit-Remaining: 499
Vary: Cookie
Vary: Accept-Language
X-Ratelimit-Limit: 500
Pragma: no-cache
Connection: keep-alive
Cache-Control: no-store, must-revalidate, no-cache, private
Date: Fri, 27 Nov 2015 21:39:56 GMT
Set-Cookie: csrftoken=97cc443e4aaf11dbc44b6c1fb9113378; expires=Fri, 25-Nov-2016 21:39:56 GMT; Max-Age=31449600; Path=/
Content-Length: 283
Content-Language: en
Content-Type: application/json; charset=utf-8
Expires: Sat, 01 Jan 2000 00:00:00 GMT
}
}
I'm trying to determine what the difference between the two could be; the only thing that I can think of is that access_token is somehow being invalidated when I switch between pages. The last thing I do on the Login/Auth page is store the access_token using Session.Add, then call Server.Transfer to move to the page that I'm calling this on.
Any ideas on what the issue could be? Thanks.
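When debugging a 400 like this, it may help to read the response body too: the dump above only shows Content: System.Net.Http.StreamContent, and the 283-byte JSON body presumably carries the actual error message. A minimal sketch, with ApiErrors and DescribeAsync as made-up names:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ApiErrors
{
    // Returns "<status code> <reason phrase>: <body>" so the server's own
    // error message is visible instead of just the content type name.
    public static async Task<string> DescribeAsync(HttpResponseMessage response)
    {
        string body = response.Content == null
            ? ""
            : await response.Content.ReadAsStringAsync();
        return (int)response.StatusCode + " " + response.ReasonPhrase + ": " + body;
    }
}
```

In the code from the question this could be called on results whenever results.IsSuccessStatusCode is false.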
Attach the token to the header when making the request.
Client.DefaultRequestHeaders.Add("access_token", "Bearer " + token);
The problem ended up being one regarding Sandbox Mode. I had registered an app after the switch, and I was the only user in my sandbox. As a result, it had no problem finding my posts/info, but Sandbox Mode acts as if the Sandbox users are the only users on Instagram, so naturally it would not find anything else.
It turns out there was an existing registered application in my organization (made before the switch date) that does not have any such limitations, so I have been testing using that AppID/secret.
tl;dr: If you're the only user in your app's sandbox, work on getting users into your sandbox. See their article about it for more info.
Let's say we make a request to a URL and get back the raw response, like this:
HTTP/1.1 200 OK
Date: Wed, 28 Apr 2010 14:39:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=e2bca72563dfffcc:TM=1272465553:LM=1272465553:S=ZN2zv8oxlFPT1BJG; expires=Fri, 27-Apr-2012 14:39:13 GMT; path=/; domain=.google.co.uk
Server: gws
X-XSS-Protection: 1; mode=block
Connection: close
<!doctype html><html><head>...</head><body>...</body></html>
What would be the best way to remove the HTTP headers from the response in C#? With regexes? Parsing it into some kind of HTTPResponse object and using only the body?
EDIT:
I'm using SOCKS to make the request; that's why I get the raw response.
Headers and body are separated by an empty line. It is really easier to do this without regexes: just search for the first empty line.
If you use the HttpWebRequest class, you get an HttpWebResponse object back, which in turn contains a collection of headers. You can then remove them, parse them, or do whatever you wish with them.
Note that using the substring method will leave you with a leading carriage return. I used this:
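// Headers end at the first blank line (CRLF CRLF); the body starts right after it.
string HTTPHeaderDelimiter = "\r\n\r\n";
int delimiterIndex = RawHTTPResponse.IndexOf(HTTPHeaderDelimiter, StringComparison.Ordinal);
if (RawHTTPResponse.IndexOf("HTTP/1.1 200 OK", StringComparison.Ordinal) > -1 && delimiterIndex > -1)
{
    // Skip the whole delimiter so that no leading CR/LF remains in the payload.
    HTTPPayload = RawHTTPResponse.Substring(delimiterIndex + HTTPHeaderDelimiter.Length);
}
else
{
    // Not a 200 response, or no blank line found: nothing to extract.
    return;
}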
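If you would rather parse the raw text into some kind of response object, as the question suggests, the same blank-line split can be extended along these lines. This is only a sketch: RawHttpResponse is a made-up type, it assumes CRLF line endings, and it does not decode chunked or compressed bodies.

```csharp
using System;
using System.Collections.Generic;

public sealed class RawHttpResponse
{
    public string StatusLine { get; private set; }
    public Dictionary<string, List<string>> Headers { get; private set; }
    public string Body { get; private set; }

    public static RawHttpResponse Parse(string raw)
    {
        const string delimiter = "\r\n\r\n";
        int split = raw.IndexOf(delimiter, StringComparison.Ordinal);
        if (split < 0)
        {
            throw new FormatException("No blank line separating headers from body.");
        }

        // Everything before the blank line: status line first, then one header per line.
        string[] headLines = raw.Substring(0, split)
                                .Split(new[] { "\r\n" }, StringSplitOptions.None);

        // Header names are case-insensitive; a name may repeat (e.g. Set-Cookie).
        var headers = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        for (int i = 1; i < headLines.Length; i++)
        {
            int colon = headLines[i].IndexOf(':');
            if (colon <= 0) continue; // skip lines that are not "Name: value"

            string name = headLines[i].Substring(0, colon).Trim();
            string value = headLines[i].Substring(colon + 1).Trim();

            List<string> values;
            if (!headers.TryGetValue(name, out values))
            {
                headers[name] = values = new List<string>();
            }
            values.Add(value);
        }

        return new RawHttpResponse
        {
            StatusLine = headLines[0],
            Headers = headers,
            Body = raw.Substring(split + delimiter.Length)
        };
    }
}
```

RawHttpResponse.Parse(rawText).Body then gives the payload, and Headers["Content-Type"] gives the list of values sent under that name.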
I've recently run into some problems with the CookieContainer. Either I'm doing something seriously wrong or there is some kind of bug w/ the CookieContainer object. It doesn't seem to update the cookie collection with certain Set-Cookie headers.
This might be a lengthy post and I apologize, but I want to be as thorough as possible, so I'm going to list my HTTP sniffing logs as well as my actual implementation code.
public bool SendRequest(HttpWebRequest request, IDictionary<string, string> data, int retries)
{
// copy request in case request instance already failed
HttpWebRequest newRequest = (HttpWebRequest)HttpWebRequest.Create(request.RequestUri);
newRequest.Method = request.Method;
// if POST data was provided, write it to the stream
if (data != null && data.Count != 0)
{
StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
writer.Write(createPostString(data));
writer.Close();
}
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
try
{
using (HttpWebResponse resp = (HttpWebResponse)newRequest.GetResponse())
{
//CookieCollection newCooks = getCookies(resp.Headers);
//updateCookies(newCooks);
this.cookieJar = newRequest.CookieContainer;
this.Html = getResponseString(resp);
/* remainder snipped */
So there is the code; here are two requests and the response to the first one, as sniffed in Fiddler:
Request 1
POST /login/ HTTP/1.1
Host: www.site.com
Content-Length: 47
Expect: 100-continue
Connection: Keep-Alive
Response 1
HTTP/1.1 200 OK
Date: Wed, 02 Dec 2009 17:03:35 GMT
Server: Apache
Set-Cookie: tcc=one; path=/
Set-Cookie: cust_id=2702585226; domain=.site.com; path=/; expires=Mon, 01-Jan-2011 00:00:00 GMT
Set-Cookie: cust_session=12%2F2%2F2009%20%2012%3A3%3A35; domain=.site.com; path=/; expires=Wed 2-Dec-2009 17:33:35
Set-Cookie: refer_id_persistent=0000; domain=.site.com; path=/; expires=Fri 2-Dec-2011 17:3:35
Set-Cookie: refer_id=0000; domain=.site.com; path=/
Set-Cookie: private_browsing_mode=off; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Set-Cookie: member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Request 2
GET /test?search=jjkjf HTTP/1.1
Host: www.site.com
Cookie: tcc=one; cust_id=2702585226; private_browsing_mode=off; member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D
So as you can see, the CookieContainer (this.cookieJar) which is used for every request is not picking up the Set-Cookie header for refer_id, cust_session, refer_id_persistent. However it does pick up cust_id, private_browsing_mode, tcc, and member_session... Any ideas why this might be?
Just wanted to update this post in case someone else comes across this. The issue is that .NET complies with the RFC specification for cookies, but not all sites do. So, ultimately, the issue is not Microsoft or .NET for that matter (although IE manages these cookies fine, so it would be better if the .NET cookie parsing were rewritten to use the same approach). The issue is the sites that do not follow the RFC specification.
Nonetheless, an issue I've often encountered is that sites use commas in the expiration dates of their cookies. .NET interprets these commas as separators between different cookies and strips the comma and everything thereafter off the cookie.
RFC spec: "Cookie:, followed by a comma-separated list of one or more cookies." An easy solution to this problem would be for the web server to enclose values with commas in quotation marks, per the RFC document. However, there is no RFC police, so we can only hope that people follow the rules.
MSDN SetCookies:
SetCookies pulls all the HTTP cookies out of the HTTP cookie header, builds a Cookie for each one, and then adds each Cookie to the internal CookieCollection that is associated with the URI. The HTTP cookies in the cookieHeader string must be delimited by commas.
MSDN GetCookieHeader
GetCookieHeader returns a string that holds the HTTP cookie header for the Cookie instances specified by uri. The HTTP header is built by adding a string representation of each Cookie associated with uri. Note that the exact format of the string depends on the RFC that the Cookie conforms to. The strings for all the Cookie instances that are associated with uri are combined and delimited by semicolons.
This string is not in the correct format for use as the second parameter of the SetCookies method.
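To make the delimiter mismatch concrete, here is a small sketch that copies the cookies for one URI from one container to another by rewriting the semicolon-delimited output of GetCookieHeader into the comma-delimited form SetCookies expects. CookieCopy is a made-up helper, and the plain string replace assumes that no cookie value itself contains "; " or a comma.

```csharp
using System;
using System.Net;

public static class CookieCopy
{
    // Copies the name=value pairs visible for 'uri' from source to target.
    // Attributes such as expiry are not part of GetCookieHeader's output,
    // so the copies end up as session cookies.
    public static void Copy(CookieContainer source, CookieContainer target, Uri uri)
    {
        string header = source.GetCookieHeader(uri); // e.g. "a=1; b=2"
        if (header.Length == 0)
        {
            return; // nothing stored for this URI
        }

        // SetCookies wants "a=1,b=2", not "a=1; b=2".
        target.SetCookies(uri, header.Replace("; ", ","));
    }
}
```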
This is only a quick scan through your code, but it seems that you are sending the POST data before you attach the cookies to the request.
if (data != null && data.Count != 0)
{
StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
writer.Write(createPostString(data));
writer.Close();
}
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
What might be happening is that when you write your POST data to the stream, the request is sent to the remote server. However, for cookies to be set they must be sent to the server before any POST data. The simple solution is to swap this around like so:
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
if (data != null && data.Count != 0)
{
StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
writer.Write(createPostString(data));
writer.Close();
}
CookieContainer has 2 major issues that I have come across, whether it is by design or bug I don't know.
1) Cookies set on a 302 post are not picked up.
Example
Post to site
302 redirect response
Load New page which sets cookie
Solution
Set AllowAutoRedirect to false, manually follow the redirects, and set the cookies yourself.
2) .NET is VERY fussy about incorrectly formed cookie strings that have a comma in the string. This is actually correct, but occasionally cookies have a date set that includes a comma, which stops all cookies being set.
Solution
Manually parse the cookie strings and add them yourself. A horrible task. I have a sprawling mess of a hack function that loops and ifs, but the end result is that it works for all cases I have thrown at it so far. It isn't pretty, but it gets the job done.
Not sure the above is your issue, but it may be. If not, some food for thought anyway.
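For what manual parsing might look like, here is a minimal sketch (not the hack function mentioned above). It assumes the Set-Cookie values arrive joined by commas in one string, as resp.Headers["Set-Cookie"] returns them, and it splits only on commas that are followed by something shaped like name=, so the comma inside an expires date is left alone. SetCookieParser is a made-up name, the split is a heuristic, and expiry dates are deliberately ignored, so every cookie is treated as a session cookie.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class SetCookieParser
{
    // A comma starts a new cookie only if a "name=" token follows it.
    // "expires=Fri, 01-Jan-2010 ..." does not match, because "01-Jan-2010"
    // is followed by a space rather than '='.
    private static readonly Regex CookieBoundary = new Regex(@",(?=\s*[^;,\s=]+=)");

    public static List<Cookie> Parse(string setCookieHeader, string defaultDomain)
    {
        var cookies = new List<Cookie>();
        foreach (string raw in CookieBoundary.Split(setCookieHeader))
        {
            string[] parts = raw.Split(';');
            int eq = parts[0].IndexOf('=');
            if (eq <= 0) continue; // not a name=value pair

            var cookie = new Cookie(
                parts[0].Substring(0, eq).Trim(),
                parts[0].Substring(eq + 1).Trim(),
                "/",
                defaultDomain);

            // Remaining parts are attributes such as path, domain, HttpOnly.
            for (int i = 1; i < parts.Length; i++)
            {
                string[] kv = parts[i].Split(new[] { '=' }, 2);
                string key = kv[0].Trim().ToLowerInvariant();
                string val = kv.Length > 1 ? kv[1].Trim() : "";

                if (key == "path" && val.Length > 0) cookie.Path = val;
                else if (key == "domain" && val.Length > 0) cookie.Domain = val;
                else if (key == "httponly") cookie.HttpOnly = true;
                else if (key == "secure") cookie.Secure = true;
                // "expires" is skipped on purpose: malformed dates are the
                // reason for parsing by hand in the first place.
            }
            cookies.Add(cookie);
        }
        return cookies;
    }
}
```

Each returned Cookie can then be added to the shared container with cookieJar.Add(cookie).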
My solution: replace " UTC" with " GMT".
Try using CookieContainer.GetCookieHeader and CookieContainer.SetCookies
YourCookieContainer.GetCookieHeader(new Uri("your url"));
YourCookieContainer.SetCookies(new Uri("your url"), "string from GetCookieHeader");