I'm using HttpClient 0.6.0 from NuGet.
I have the following C# code:
var client = new HttpClient(new WebRequestHandler()
{
    CachePolicy =
        new HttpRequestCachePolicy(HttpRequestCacheLevel.CacheIfAvailable)
});
client.GetAsync("http://myservice/asdf");
The service (CouchDB in this case) returns an ETag value and status code 200 OK, along with a Cache-Control header whose value is must-revalidate.
Update: here are the response headers from CouchDB (taken from the Visual Studio debugger):
Server: CouchDB/1.1.1 (Erlang OTP/R14B04)
Etag: "1-27964df653cea4316d0acbab10fd9c04"
Date: Fri, 09 Dec 2011 11:56:07 GMT
Cache-Control: must-revalidate
The next time I do the exact same request, HttpClient issues a conditional request and gets back 304 Not Modified, which is correct.
However, if I use the low-level HttpWebRequest class with the same CachePolicy, the request isn't even made the second time. This is how I would want HttpClient to behave as well.
Is it the must-revalidate header value that causes this, or why else is HttpClient behaving differently? I would like to make only one request and then serve the rest from the cache, without the conditional requests.
(Also, as a side note: when debugging, the response status code is shown as 200 OK even though the service returns 304 Not Modified.)
Both clients behave correctly.
must-revalidate only applies to stale responses.
When the must-revalidate directive is present in a response received by a cache, that cache MUST NOT use the entry after it becomes stale to respond to a subsequent request without first revalidating it with the origin server. (I.e., the cache MUST do an end-to-end revalidation every time, if, based solely on the origin server's Expires or max-age value, the cached response is stale.)
Since you do not provide an explicit expiration, caches are allowed to use heuristics to determine freshness.
Since you do not provide Last-Modified, caches do not need to warn the client that heuristics were used.
If none of Expires, Cache-Control: max-age, or Cache-Control: s-maxage (see section 14.9.3) appears in the response, and the response does not include other restrictions on caching, the cache MAY compute a freshness lifetime using a heuristic. The cache MUST attach Warning 113 to any response whose age is more than 24 hours if such warning has not already been added.
The response age is calculated based on the Date header, since Age is not present.
If the response is still fresh according to heuristic expiration, caches may use the stored response.
One explanation is that HttpWebRequest uses heuristics and that there was a stored response with status code 200 that was still fresh.
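If you want to take heuristic freshness out of the picture on the client side, one option is to pin the policy down explicitly. A sketch (untested against CouchDB specifically):

using System.Net.Cache;
using System.Net.Http;

// Revalidate forces a conditional request (If-None-Match / If-Modified-Since)
// on every call, regardless of any heuristic freshness lifetime.
var client = new HttpClient(new WebRequestHandler
{
    CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.Revalidate)
});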
Answering my own question...
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4, I would say that a Cache-Control: must-revalidate header without an expiration time states that the resource should be validated on every request.
In this case that means a conditional GET should be done every time the resource is requested. So System.Net.Http.HttpClient is behaving correctly, and the legacy (Http)WebRequest behavior is invalid.
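That said, if you want the behavior you originally asked for (one request, then everything from cache with no conditional requests), you can impose a freshness window on the client side. A sketch, assuming five minutes of staleness is acceptable:

using System;
using System.Net.Cache;
using System.Net.Http;

// Treat a cached copy as fresh for 5 minutes; within that window no request
// at all (not even a conditional one) is sent. The cleaner fix would be for
// the server to send an explicit max-age.
var client = new HttpClient(new WebRequestHandler
{
    CachePolicy = new HttpRequestCachePolicy(
        HttpCacheAgeControl.MaxAge, TimeSpan.FromMinutes(5))
});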
Related
I'm attempting to write a curl-like tool that demonstrates the effect of various HTTP caching headers on .NET's HttpClient class.
In my initial attempt I'm pointing the tool at one of my internal web services that does not specify any caching information in the response and examining the header of the response.
I expect to see the request re-sent each time and executed on the server, returning a new but identical set of content each time (for the purposes of this test, the content on the server is static). Instead, each request after the first returns much more quickly than the first and includes a new header, Age, that was not present in the very first response. This indicates to me that the HttpClient in my command-line tool is returning the response from cache rather than placing a new request.
Here is the first request with the response headers:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.3235905):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Content-Length = 150867
Content-Type = application/json; charset=utf-8
and here is the request from the same session of my curl tool, a little while later:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.0188433):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Age = 312
Content-Length = 150867
Content-Type = application/json; charset=utf-8
and finally, after I stop and start my program, here's another request from the new instance:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.0517271):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Age = 528
Content-Length = 150867
Content-Type = application/json; charset=utf-8
The last one I find even more difficult to understand as I was under the impression (from reading this: https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/) that caching is maintained per instance of HttpClient.
This seems to continue forever with Age increasing each request. The only way to get back to the original response is to use Internet Explorer and delete temporary internet files.
[Additional Info] After leaving my command line application open for a couple of hours I repeated the request and received a response identical to the original, without the Age header. So it appears that, if HttpClient was caching the response, that cache expired after a couple of hours.
Can anyone tell me if I'm correct that HttpClient is performing internal caching in this case, and if so, why it's doing so in the absence of any caching-related response headers and what policy it's using?
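For completeness: one way to rule the machine-wide WinINET cache in or out is to bypass caching explicitly on the handler. A sketch (this assumes the .NET Framework WebRequestHandler is available to the tool):

using System.Net.Cache;
using System.Net.Http;

// BypassCache always goes to the server; if the Age header disappears with
// this policy, the cached responses were coming from a cache below
// HttpClient (e.g. WinINET) rather than from HttpClient itself.
var client = new HttpClient(new WebRequestHandler
{
    CachePolicy = new RequestCachePolicy(RequestCacheLevel.BypassCache)
});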
Is it possible to set Max-Age for cookies in a Web Forms application? I know that it's fine to set Expires, but is there a way to set Max-Age?
ASP.NET doesn't specifically provide this property on HttpCookie, probably because they are very Microsoft-centric and IE doesn't support max-age (at least as of IE11).
However, you can still do it. Here's some code demonstrating the working and the broken ways to set this cookie with max-age:
// doesn't work: HttpCookie has no Max-Age property, and adding "max-age"
// as a sub-value just makes it part of the cookie's value
var mytestcookie = new HttpCookie("regular_httpcookie", "01");
mytestcookie.Values.Add("max-age", "300");
Response.Cookies.Add(mytestcookie);

// *does* work: write the Set-Cookie header by hand
Response.Headers.Add("set-cookie", "testingmaxage=01;max-age=300; path=/");
And it renders like this in the HTTP response:
Set-Cookie testingmaxage=01;max-age=300; path=/
X-AspNet-Version 4.0.30319
Set-Cookie regular_httpcookie=01&max-age=300; expires=Fri, 10-Jun-2016 15:02:15 GMT; path=/
As you can see above, if you are also setting cookies using HttpCookie, this will create a second Set-Cookie header on the response, but the browser won't mind; it will just add it to the list of cookies.
I tested on IE11 and Chrome and this is a non-issue: the cookies all go in, as long as they have differing names. If a cookie name conflicts with one already set via HttpCookie, the last one in wins. Check the text of your HTTP response to see which one goes in last. (Best to simply make sure they don't conflict, though.)
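If you need to do this in more than one place, you could wrap the raw-header trick in a small helper; a sketch (the class and method names are my own invention):

using System.Web;

public static class CookieHelper
{
    // HttpCookie exposes no Max-Age property, so we write the header by hand,
    // just like the working example above.
    public static void AddCookieWithMaxAge(HttpResponse response,
        string name, string value, int maxAgeSeconds)
    {
        response.Headers.Add("Set-Cookie",
            string.Format("{0}={1}; max-age={2}; path=/",
                name, HttpUtility.UrlEncode(value), maxAgeSeconds));
    }
}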
As I mentioned at the beginning, when testing on IE11, I noted that it's ignoring the max-age property of the cookie. Here's a link to a way to settle that issue:
Set-Cookie: Expire property, clock skew and Internet Explorer issue
Sorry for my English.
In Delphi I have an idHTTP component with the hoWaitForUnexpectedData option activated.
When I send a POST request to a URL, the client is redirected to a second URL, where the same POST request and headers are repeated. Also, the server response contains "Connection: keep-alive" in its header.
However, when I try to do the same request in C# with an HttpWebRequest component, it redirects to the second URL using the GET method.
I need the C# HttpWebRequest component to work like the Delphi idHTTP one does. I don't understand why it uses a GET instead of a POST when following the redirection.
Here's my code in Delphi, using hoWaitForUnexpectedData:
// The server is supposed to send a 'Content-Length' header without sending
// the actual data. 1xx, 204, and 304 replies are not supposed to contain
// entity bodies, either...
if TextIsSame(ARequest.Method, Id_HTTPMethodHead) or
   TextIsSame(ARequest.MethodOverride, Id_HTTPMethodHead) or
   ((AResponse.ResponseCode div 100) = 1) or
   (AResponse.ResponseCode = 204) or
   (AResponse.ResponseCode = 304) then
begin
  // Have noticed one case where a non-conforming server did send an
  // entity body in response to a HEAD request. If requested, ignore
  // anything the server may send by accident
  if not (hoWaitForUnexpectedData in FOptions) then begin
    Exit;
  end;
  Result := CheckForPendingData(100);
end
else if (AResponse.ResponseCode div 100) = 3 then
begin
  // This is a workaround for buggy HTTP 1.1 servers which
  // does not return any body with 302 response code
  Result := CheckForPendingData(5000);
end else begin
  Result := True;
end;
An HTTP redirect, by definition from the standard, should be handled using a GET. Therefore, if you send a POST and get a redirect as an answer, the expected behavior is to perform a GET to the redirect address. I suspect the old Delphi component is following old practices and replays the call, including the POST verb.
I would try to disable AllowAutoRedirect in the HttpWebRequest object and handle this manually, as your case seems to differ from the standard.
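For example, something along these lines (a rough sketch; the URL is a placeholder and error handling is omitted):

var request = (HttpWebRequest)WebRequest.Create("http://example.com/login");
request.Method = "POST";
request.AllowAutoRedirect = false;
// ... write the POST body to request.GetRequestStream() here ...

using (var response = (HttpWebResponse)request.GetResponse())
{
    int status = (int)response.StatusCode;
    if (status >= 300 && status < 400)
    {
        // follow the redirect ourselves, keeping the POST verb like idHTTP does
        var location = new Uri(request.RequestUri, response.Headers["Location"]);
        var redirect = (HttpWebRequest)WebRequest.Create(location);
        redirect.Method = "POST";
        // ... write the same POST body and copy over any headers you need ...
        using (var second = (HttpWebResponse)redirect.GetResponse())
        {
            // handle the final response here
        }
    }
}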
The hoWaitForUnexpectedData option has no effect on how TIdHTTP handles redirects, and neither does the section of code you quoted.
However, the hoTreat302Like303 option does affect redirect handling. If TIdHTTP receives a 303 redirect, or receives a 302 redirect with hoTreat302Like303 enabled, TIdHTTP sends the new request as a GET. Otherwise, it sends the new request using the same verb as the redirected request. This is by design, and there is a series of comments in the implementation of the TIdHTTPProtocol.ProcessResponse() method explaining the rationale behind this behavior:
// GDG 21/11/2003. If it's a 303, we should do a get this time
// RLebeau 7/15/2004 - do a GET on 302 as well, as mentioned in RFC 2616
// RLebeau 1/11/2008 - turns out both situations are WRONG! RFCs 2068 and
// 2616 specifically state that changing the method to GET in response
// to 302 and 303 is errorneous. Indy 9 did it right by reusing the
// original method and source again and only changing the URL, so lets
// revert back to that same behavior!
// RLebeau 12/28/2012 - one more time. RFCs 2068 and 2616 actually say that
// changing the method in response to 302 is erroneous, but changing the
// method to GET in response to 303 is intentional and why 303 was introduced
// in the first place. Erroneous clients treat 302 as 303, though. Now
// encountering servers that actually expect this 303 behavior, so we have
// to enable it again! Adding an optional HTTPOption flag so clients can
// enable the erroneous 302 behavior if they really need it.
The gist of it is that the HTTP spec says to send a GET for a 303 redirect, whereas it is ambiguous about whether to send a GET for a 302. Some browsers do, some do not. That is why the hoTreat302Like303 option was added, though it is disabled by default for backwards compatibility with earlier Indy versions.
So, the behavior you describe means you must be encountering a 302 redirect with hoTreat302Like303 disabled (which it is by default). If you enable that option, TIdHTTP will behave more like HttpWebRequest, not the other way around.
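Enabling the option is a one-line change, along the lines of IdHTTP1.HTTPOptions := IdHTTP1.HTTPOptions + [hoTreat302Like303]; (assuming your TIdHTTP instance is named IdHTTP1).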
I'm doing some close-to-the-metal HTTP work with OWIN.
I have an OWIN middleware that outputs JavaScript. It looks like this (relevant parts):
public override Task Invoke(IOwinContext context)
{
    var response = context.Response;
    response.ContentType = "application/javascript";
    response.StatusCode = 200;

    if (ClientCached(context.Request, scriptBuildDate))
    {
        response.StatusCode = 304;
        response.Headers["Content-Length"] = "0";
        response.Body.Close();
        response.Body = Stream.Null;
        return Task.FromResult<Object>(null);
    }

    response.Headers["Last-Modified"] = scriptBuildDate.ToUniversalTime().ToString("r");
    return response.WriteAsync(js);
}

private bool ClientCached(IOwinRequest request, DateTime contentModified)
{
    string header = request.Headers["If-Modified-Since"];
    if (header != null)
    {
        DateTime isModifiedSince;
        if (DateTime.TryParse(header, out isModifiedSince))
        {
            return isModifiedSince >= contentModified;
        }
    }
    return false;
}
It will output 200 if the content is not client-cached and add a Last-Modified date to the header; if it is client-cached, it will output 304 Not Modified.
The problem is that the client will not call the URL again unless the user does a hard F5 in the browser. My understanding of Last-Modified caching is that the client should call each time to check whether the content has been modified?
Update:
Cache-Control: must-revalidate
Chrome
F5 and Ctrl+F5 will call the server; opening the site in a new tab or restarting the browser will call the server; typing the address in the same tab will not call the server. If-Modified-Since is only cleared when doing Ctrl+F5, which means it can be used to return 304 correctly when the content is not modified.
IE10
F5 and Ctrl+F5 will call the server; opening the site in a new tab will not call the server; typing the address in the same tab will not call the server. If-Modified-Since is cleared when doing Ctrl+F5 OR when restarting the browser.
Cache-Control: no-cache and Pragma: no-cache
Chrome
Will call the server for every action. If-Modified-Since is only cleared when doing Ctrl+F5.
IE10
Will call the server for every action. If-Modified-Since is cleared both when restarting the browser and when doing Ctrl+F5.
Conclusion
It looks like no-cache might be better if you want to be sure the client calls to check for a 304 each time.
From the HTTP/1.1 spec (RFC2616, my emphasis):
13.2.2 Heuristic Expiration
Since origin servers do not always provide explicit expiration times,
HTTP caches typically assign heuristic expiration times, employing
algorithms that use other header values (such as the Last-Modified
time) to estimate a plausible expiration time. The HTTP/1.1
specification does not provide specific algorithms, but does impose
worst-case constraints on their results. Since heuristic expiration
times might compromise semantic transparency, they ought to be used
cautiously, and we encourage origin servers to provide explicit
expiration times as much as possible.
Providing a Last-Modified header is not equivalent to asking user agents to check for updates every time they need a resource from your server.
Ideally, you should add an Expires header whenever possible. However, adding the header Cache-Control: must-revalidate should help.
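Concretely, in the Invoke method from the question that could look like this (a sketch; the max-age=0 is my addition, to rule out the heuristic freshness discussed above):

// cacheable, but the client must check back with the server every time
response.Headers["Cache-Control"] = "must-revalidate, max-age=0";
response.Headers["Last-Modified"] = scriptBuildDate.ToUniversalTime().ToString("r");

// or, if the scripts may safely be reused for an hour without any check:
// response.Headers["Expires"] = DateTime.UtcNow.AddHours(1).ToString("r");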
I've recently run into some problems with the CookieContainer. Either I'm doing something seriously wrong or there is some kind of bug w/ the CookieContainer object. It doesn't seem to update the cookie collection with certain Set-Cookie headers.
This might be a lengthy post and I apologize, but I want to be as thorough as possible, so I'm going to list my HTTP sniffing logs as well as my actual implementation code.
public bool SendRequest(HttpWebRequest request, IDictionary<string, string> data, int retries)
{
    // copy request in case request instance already failed
    HttpWebRequest newRequest = (HttpWebRequest)HttpWebRequest.Create(request.RequestUri);
    newRequest.Method = request.Method;

    // if POST data was provided, write it to the stream
    if (data != null && data.Count != 0)
    {
        StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
        writer.Write(createPostString(data));
        writer.Close();
    }

    // set request with global cookie container
    newRequest.CookieContainer = this.cookieJar;
    try
    {
        using (HttpWebResponse resp = (HttpWebResponse)newRequest.GetResponse())
        {
            //CookieCollection newCooks = getCookies(resp.Headers);
            //updateCookies(newCooks);
            this.cookieJar = newRequest.CookieContainer;
            this.Html = getResponseString(resp);
            /* remainder snipped */
So there is the code; here are two request/response pairs I sniffed in Fiddler:
Request 1
POST /login/ HTTP/1.1
Host: www.site.com
Content-Length: 47
Expect: 100-continue
Connection: Keep-Alive
Response 1
HTTP/1.1 200 OK
Date: Wed, 02 Dec 2009 17:03:35 GMT
Server: Apache
Set-Cookie: tcc=one; path=/
Set-Cookie: cust_id=2702585226; domain=.site.com; path=/; expires=Mon, 01-Jan-2011 00:00:00 GMT
Set-Cookie: cust_session=12%2F2%2F2009%20%2012%3A3%3A35; domain=.site.com; path=/; expires=Wed 2-Dec-2009 17:33:35
Set-Cookie: refer_id_persistent=0000; domain=.site.com; path=/; expires=Fri 2-Dec-2011 17:3:35
Set-Cookie: refer_id=0000; domain=.site.com; path=/
Set-Cookie: private_browsing_mode=off; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Set-Cookie: member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Request 2
GET /test?search=jjkjf HTTP/1.1
Host: www.site.com
Cookie: tcc=one; cust_id=2702585226; private_browsing_mode=off; member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D
So as you can see, the CookieContainer (this.cookieJar), which is used for every request, is not picking up the Set-Cookie headers for refer_id, cust_session, and refer_id_persistent. However, it does pick up cust_id, private_browsing_mode, tcc, and member_session... Any ideas why this might be?
Just wanted to update this post in case someone else comes across this. The issue is that .NET complies with the RFC specification for cookies, but not all sites do. So, ultimately, the issue is not with Microsoft or .NET. (Although IE manages these cookies fine, so it would be better if the .NET cookie parsing methods were rewritten to use the same parsing.) The issue is the sites that do not follow the RFC specifications.
In particular, an issue I've often encountered is that sites will use commas in the expiration dates of their cookies. .NET interprets these as separators between different cookies and strips the date and everything after it off the cookie.
The RFC spec describes the header as "Cookie:", followed by a comma-separated list of one or more cookies. An easy solution to this problem would be for the web server to enclose values containing commas in quotation marks, per the RFC document. However, there is no RFC police, so we can only hope that people follow the rules.
MSDN SetCookies:
SetCookies pulls all the HTTP cookies out of the HTTP cookie header, builds a Cookie for each one, and then adds each Cookie to the internal CookieCollection that is associated with the URI. The HTTP cookies in the cookieHeader string must be delimited by commas.
MSDN GetCookieHeader
GetCookieHeader returns a string that holds the HTTP cookie header for the Cookie instances specified by uri. The HTTP header is built by adding a string representation of each Cookie associated with uri. Note that the exact format of the string depends on the RFC that the Cookie conforms to. The strings for all the Cookie instances that are associated with uri are combined and delimited by semicolons.
This string is not in the correct format for use as the second parameter of the SetCookies method.
This is only a quick scan through your code, but it seems that you are sending the POST data before the cookies are set on the request.
if (data != null && data.Count != 0)
{
    StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
    writer.Write(createPostString(data));
    writer.Close();
}

// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
What might be happening is that when you write your POST data to the stream, the request is sent to the remote server. However, for cookies to be sent, they must be set on the request before any POST data. The simple solution is to swap this around, like so:
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;

if (data != null && data.Count != 0)
{
    StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
    writer.Write(createPostString(data));
    writer.Close();
}
CookieContainer has two major issues that I have come across; whether by design or bug, I don't know.
1) Cookies set on a 302 POST are not picked up.
Example
POST to site
302 redirect response
Load new page, which sets a cookie
Solution
Set AllowAutoRedirect to false, manually follow the redirects, and set the cookies yourself.
2) .NET is VERY fussy about incorrectly formed cookie strings that contain a comma. This is actually correct behavior, but occasionally cookies have a date set that includes a comma, which stops all the cookies from being set.
Solution
Manually parse the cookie strings and add them yourself. A horrible task. I have a sprawling mess of a hack function that loops and ifs, but the end result is that it works for all the cases I have thrown at it so far. It isn't pretty, but it gets the job done. (A rough sketch of the core idea is below.)
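For what it's worth, the splitting part can be fairly compact. A sketch (the class name and the regex are mine, untested beyond the captures in this thread): split the merged Set-Cookie header only on commas that begin a new name=value pair, then feed each cookie to the container individually.

using System;
using System.Net;
using System.Text.RegularExpressions;

public static class RawCookieParser
{
    public static void AddRawSetCookieHeader(CookieContainer jar, Uri uri, string setCookieHeader)
    {
        if (string.IsNullOrEmpty(setCookieHeader))
            return;

        // A comma that separates two cookies is followed by "name=" before any ';'.
        // A comma inside an expires date ("Fri, 01-Jan-2010 ...") runs into a ';'
        // before the next '=', so the lookahead fails and no split happens there.
        foreach (string cookie in Regex.Split(setCookieHeader, @",(?=[^;]+?=)"))
        {
            jar.SetCookies(uri, cookie.Trim());
        }
    }
}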
Not sure the above is your issue, but maybe. If not, some food for thought anyway.
My solution: replace " UTC" with " GMT".
Try using CookieContainer.GetCookieHeader and CookieContainer.SetCookies
YourCookieContainer.GetCookieHeader(new Uri("your url"));
YourCookieContainer.SetCookies(new Uri("your url"), "string from GetCookieHeader");