WebClient problem with URL which ends with a period - c#

I'm running the following code:
using (WebClient wc = new WebClient())
{
    string page = wc.DownloadString(URL);
    ...
}
To access the URL of a share price website, http://www.shareprice.co.uk
If you append a company's symbol name onto the end of the URL, then a page is returned which I parse to get the latest price info etc.
e.g.
http://www.shareprice.co.uk/VOD
http://www.shareprice.co.uk/TW.
Now, my problem is that some symbols end in periods, as in the second example there. For some unknown reason, the code above has a problem retrieving these sorts of URLs.
There is no run-time error, but a page is returned back which reports "Symbol could not be found" from the website itself, indicating that something is happening to the period on the end of the URL in between the call to DownloadString and the actual HTTP request.
Does anyone have any idea what might be causing this, and how to fix it?
Thanks

It seems you found a bug in WebClient/WebRequest, though perhaps Microsoft put that in intentionally, who knows. Nonetheless, when you pass in TW., the URI class is translating that to TW without the period. Since WebClient/WebRequest parse strings into URI, your . is disappearing in that world.
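A quick way to check whether the Uri class on your runtime drops the trailing period is to parse the URL and compare the original string with the normalized one. This is only a diagnostic sketch: the result depends on the .NET version in use, so no expected output is shown.

```csharp
using System;

static class Program
{
    static void Main()
    {
        // The URL from the question; Uri normalizes it during parsing.
        var uri = new Uri("http://www.shareprice.co.uk/TW.");

        Console.WriteLine("Original: {0}", uri.OriginalString);
        Console.WriteLine("Parsed:   {0}", uri.AbsoluteUri);

        // If this prints False, the period was stripped before any request is sent.
        Console.WriteLine("Trailing period kept: {0}",
            uri.AbsoluteUri.EndsWith(".", StringComparison.Ordinal));
    }
}
```

If the parsed form has lost the period, WebClient never sends it, which would match the "Symbol could not be found" page.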
You may have to use TcpClient to get around this and roll your own web client. Any variation of this:
// Requires: using System.IO; using System.Net.Sockets; using System.Text;
TcpClient oClient = new TcpClient("www.shareprice.co.uk", 80);
NetworkStream ns = oClient.GetStream();
StreamWriter sw = new StreamWriter(ns);
// "Connection: close" asks the server to close the socket after the response,
// so the read loop below terminates.
sw.Write(
    string.Format(
        "GET /{0} HTTP/1.1\r\nUser-Agent: {1}\r\nHost: www.shareprice.co.uk\r\nConnection: close\r\n\r\n",
        "TW.",
        "MyTCPClient"));
sw.Flush();
StringBuilder sb = new StringBuilder();
while (true)
{
    int i = ns.ReadByte(); // Inefficient but more reliable
    if (i == -1) break;    // Other side has closed the socket
    sb.Append((char)i);    // Append the byte (as a char) to the page data
}
oClient.Close();
This will give you a 302 redirect, so just parse out the 'Location:' and execute the above again with the new location.
HTTP/1.1 302 Found
Date: Wed, 11 Nov 2009 19:29:27 GMT
Server: lighttpd
X-Powered-By: PHP/5.2.4-2ubuntu5.7
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: /TW./TAYLOR-WIMPEY-PLC
Content-type: text/html; charset=UTF-8
Content-Length: 0
Set-Cookie: SSID=668d5d0023e9885e1ef3762ef5e44033; path=/
Vary: Accept-Encoding
Connection: close
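Pulling the Location value out of a raw response like the one above could look roughly like this. It is a minimal sketch: RawHttp and GetLocation are made-up names, and it assumes the whole response has already been read into a string (for example the StringBuilder from the TcpClient loop).

```csharp
using System;
using System.IO;

static class RawHttp
{
    // Returns the value of the first Location header, or null if there is none.
    public static string GetLocation(string rawResponse)
    {
        using (var reader = new StringReader(rawResponse))
        {
            string line;
            // The header section ends at the first empty line.
            while ((line = reader.ReadLine()) != null && line.Length > 0)
            {
                int colon = line.IndexOf(':');
                if (colon > 0 &&
                    line.Substring(0, colon).Trim()
                        .Equals("Location", StringComparison.OrdinalIgnoreCase))
                {
                    return line.Substring(colon + 1).Trim();
                }
            }
        }
        return null;
    }
}

static class Program
{
    static void Main()
    {
        string raw = "HTTP/1.1 302 Found\r\n" +
                     "Server: lighttpd\r\n" +
                     "Location: /TW./TAYLOR-WIMPEY-PLC\r\n" +
                     "Content-Length: 0\r\n\r\n";
        Console.WriteLine(RawHttp.GetLocation(raw)); // prints /TW./TAYLOR-WIMPEY-PLC
    }
}
```

The returned path can then be substituted into the GET line of a second request.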

Try adding a slash to the end, after the period. Your normal web browser will do that for you, and the WebClient class isn't that smart.
http://www.shareprice.co.uk/TW./
This worked for me as well when I typed it into the browser.
Edit - added
The following all also worked in the browser
http://www.shareprice.co.uk/TW
and
http://www.shareprice.co.uk/TW/
so it looks like you should be able to just check to see if the last character is a period, and remove it.

Use URL encoding; it will turn the "." into %2E.

To address a single period (.) at the end of a URL use the following:
<system.web>
<httpRuntime relaxedUrlToFileSystemMapping="true" />
</system.web>
To address two periods (..) or other denied sequences, see the following article:
http://www.iis.net/ConfigReference/system.webServer/security/requestFiltering/denyUrlSequences

Just add a space after the period. When the URL is parsed, the space will be removed but the period will stay there.

Related

Is Dot Net HttpClient Unexpectedly Caching Responses?

I'm attempting to write a curl-like tool that demonstrates the effect of various HTTP caching headers on dot net's HttpClient class.
In my initial attempt I'm pointing the tool at one of my internal web services that does not specify any caching information in the response and examining the header of the response.
I expect to see that the request is re-sent each time and executed on the server, returning a new but identical set of content each time (for the purpose of this test, the content is static on the server). But, instead, each request after the first returns much more quickly than the first and includes a new header Age that was not present in the very first response. This indicates to me that the HttpClient in my command-line tool is returning the response from cache, not placing a new request.
Here is the first request with the response headers:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.3235905):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Content-Length = 150867
Content-Type = application/json; charset=utf-8
and here is the request from the same session of my curl tool, a little while later:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.0188433):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Age = 312
Content-Length = 150867
Content-Type = application/json; charset=utf-8
and finally, after I stop and start my program, here's another request from the new instance:
HTTP:>GET http://myserver:8058/path1/path2
Status 200 OK (OK in 00:00:00.0517271):
Date = Sat, 08 Jul 2017 15:55:22 GMT
Server = Microsoft-HTTPAPI/2.0
Age = 528
Content-Length = 150867
Content-Type = application/json; charset=utf-8
The last one I find even more difficult to understand as I was under the impression (from reading this: https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/) that caching is maintained per instance of HttpClient.
This seems to continue forever with Age increasing each request. The only way to get back to the original response is to use Internet Explorer and delete temporary internet files.
[Additional Info] After leaving my command line application open for a couple of hours I repeated the request and received a response identical to the original, without the Age header. So it appears that, if HttpClient was caching the response, that cache expired after a couple of hours.
Can anyone tell me if I'm correct that HttpClient is performing internal caching in this case, and if so, why it's doing so in the absence of any caching-related response headers and what policy it's using?

(Instagram API) Trouble Reaching Tag Endpoints

I'm trying to gather a list of recent posts that contain a certain hashtag. The API Documentation states that I should be using the following GET call:
https://api.instagram.com/v1/tags/{tag-name}/media/recent?access_token=ACCESS-TOKEN
When I load the page where I want this information displayed, I perform the following:
using (HttpClient Client = new HttpClient())
{
    var uri = "https://api.instagram.com/v1/tags/" + tagToLookFor + "/media/recent?access_token=" + Session["instagramaccesstoken"].ToString();
    var results = Client.GetAsync(uri).Result;
    // Result handling below here.
}
For reference, tagToLookFor is a constant string defined at the top of the class (eg. foo), and I store the Access Token returned from the OAuth process in the Session object with a key of 'instagramaccesstoken'.
While debugging this, I checked to make sure the URI was being formed correctly, and it does contain both the tag name and the just-created access_token. Using Apigee with the same URI (save for a different access_token) returns the valid results I would expect. However, attempting to GET using the URI on my website returns:
{
StatusCode: 400,
ReasonPhrase: 'BAD REQUEST',
Version: 1.1,
Content: System.Net.Http.StreamContent,
Headers:{
X-Ratelimit-Remaining: 499
Vary: Cookie
Vary: Accept-Language
X-Ratelimit-Limit: 500
Pragma: no-cache
Connection: keep-alive
Cache-Control: no-store, must-revalidate, no-cache, private
Date: Fri, 27 Nov 2015 21:39:56 GMT
Set-Cookie: csrftoken=97cc443e4aaf11dbc44b6c1fb9113378; expires=Fri, 25-Nov-2016 21:39:56 GMT; Max-Age=31449600; Path=/
Content-Length: 283
Content-Language: en
Content-Type: application/json; charset=utf-8
Expires: Sat, 01 Jan 2000 00:00:00 GMT
}
}
I'm trying to determine what the difference between the two could be; the only thing that I can think of is that access_token is somehow being invalidated when I switch between pages. The last thing I do on the Login/Auth page is store the access_token using Session.Add, then call Server.Transfer to move to the page that I'm calling this on.
Any ideas on what the issue could be? Thanks.
Attach the token to the header when making the request.
Client.DefaultRequestHeaders.Add("access_token", "Bearer " + token);
The problem ended up being one regarding Sandbox Mode. I had registered an app after the switch, and I was the only user in my sandbox. As a result, it had no problem finding my posts/info, but Sandbox Mode acts as if the Sandbox users are the only users on Instagram, so naturally it would not find anything else.
It turns out there was an existing registered application in my organization (made before the switch date) that does not have any such limitations, so I have been testing using that AppID/secret.
tl;dr: If you're the only user in your app's sandbox, work on getting users into your sandbox. See their article about it for more info.

Would like to get http response results like Fiddler

I'm trying to get the same type of results that Fiddler gets when I launch a webpage from my app.
Below is the code I'm using and the results I'm getting. I've used google.com only as an example.
What do I need to modify in my code to get the results I want or do I need an entirely different approach?
Thanks for your help.
My code:
// create the HttpWebRequest object
HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create("http://www.google.com");
// get the response object which has the header info, using the GetResponse method
var objResults = objRequest.GetResponse();
// get the header count
int intCount = objResults.Headers.Count;
// loop through the results object
for (int i = 0; i < intCount; i++)
{
    string strKey = objResults.Headers.GetKey(i);
    string strValue = objResults.Headers.Get(i);
    // "</br />" is not a valid tag; use two line breaks after the value
    lblResults.Text += strKey + "<br />" + strValue + "<br /><br />";
}
My results:
Cache-Control
private, max-age=0
Content-Type
text/html; charset=ISO-8859-1
Date
Tue, 05 Jun 2012 17:40:38 GMT
Expires
-1
Set-Cookie
PREF=ID=526197b0260fd361:FF=0:TM=1338918038:LM=1338918038:S=gefqgwkuzuPJlO3G; expires=Thu, 05-Jun-2014 17:40:38 GMT; path=/; domain=.google.com,NID=60=CJbpzMe6uTKf58ty7rysqUFTW6GnsQHZ-Uat_cFf1AuayffFtJoFQSIwT5oSQKqQp5PSIYoYtBf_8oSGh_Xsk1YtE7Z834Qwn0A4Sw3ruVCA9v3f_UDYH4b4fAloFJbW; expires=Wed, 05-Dec-2012 17:40:38 GMT; path=/; domain=.google.com; HttpOnly
P3P
CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server
gws
X-XSS-Protection
1; mode=block
X-Frame-Options
SAMEORIGIN
Transfer-Encoding
chunked
=========================
Fiddler results:
# | Result | Protocol | Host | URL | Body | Caching | Content-Type | Process | Comments | Custom
1 | 304 | HTTP | www.rolandgarros.com | /images/misc/weather/P8.gif | 0 | max-age=700 Expires: Tue, 05 Jun 2012 17:53:40 GMT | image/gif | firefox:5456
2 | 200 | HTTP | www.google.com | / | 23,697 | private, max-age=0 Expires: -1 | text/html; charset=UTF-8 | chrome:2324
3 | 304 | HTTP | www.rolandgarros.com | /images/misc/weather/P9.gif | 0 | max-age=700 Expires: Tue, 05 Jun 2012 17:53:57 GMT | image/gif | firefox:5456
4 | 200 | HTTP | Tunnel to | translate.googleapis.com:443 | 0 | | | chrome:2324
5 | 200 | HTTP | www.google.com
The difference is Fiddler is actually recording an entire session, not just a single HTTP request.
If a user loads Google.com, the response is typically an HTML document which contains images, script files, CSS files, etc. Your browser will then initiate a new HTTP request for each one of those resources. With Fiddler running, it tracks each of those HTTP requests and spits out the result code and other information about the session.
With your C# code above, you're only initiating a single HTTP request, thus you only have information about a single result.
You'd probably be better off writing a browser plugin. Otherwise, you'd have to parse the HTML response and load other resources from that document as well.
If you do need to do this with C# code, you could probably parse the document with the HTML Agility Pack and then look for other resources within the HTML to simulate a browser. There are also embedded browsers, such as Awesomium, that might be helpful.
You are not asking for the same information that Fiddler is displaying. Fiddler shows the HTTP Status code, the host and URI and (it appears, from your example) the Content Length, Content Type and Cache status.
For many of these you will have to peek into the response headers.
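As a rough illustration, the Fiddler-style columns for a single request can be read from HttpWebResponse and its headers. This sketch uses the same example URL as the question and covers one request only; it does not follow the images, scripts and stylesheets a browser would fetch.

```csharp
using System;
using System.Net;

static class Program
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://www.google.com");

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine("Result:       {0}", (int)response.StatusCode);
            Console.WriteLine("Host:         {0}", response.ResponseUri.Host);
            Console.WriteLine("URL:          {0}", response.ResponseUri.PathAndQuery);
            // ContentLength is -1 when the server sends no Content-Length header
            // (for example with chunked transfer encoding).
            Console.WriteLine("Body:         {0}", response.ContentLength);
            Console.WriteLine("Caching:      {0}", response.Headers["Cache-Control"]);
            Console.WriteLine("Content-Type: {0}", response.ContentType);
        }
    }
}
```

The Process column has no equivalent here, since Fiddler derives it from the local machine rather than from the response.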

HttpWebRequest with caching enabled throws exceptions

I'm working on a small C#/WPF application that interfaces with a web service implemented in Ruby on Rails, using handcrafted HttpWebRequest calls and JSON serialization. Without caching, everything works as it's supposed to, and I've got HTTP authentication and compression working as well.
Once I enable caching, by setting request.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.CacheIfAvailable);, things go awry - in the production environment. When connecting to a simple WEBrick instance, things work fine, I get HTTP/1.1 304 Not Modified as expected and HttpWebRequest delivers the cached content.
When I try the same against the production server, running nginx/0.8.53 + Phusion Passenger 3.0.0, the application breaks. First request (uncached) is served properly, but on the second request which results in the 304 response, I get a WebException stating that "The request was aborted: The request was canceled." as soon as I invoke request.GetResponse().
I've run the connections through Fiddler, which hasn't helped a whole lot; both WEBrick and nginx return an empty entity body, albeit with different response headers. Intercepting the request and changing the response headers for nginx to match those of WEBrick didn't change anything, leading me to think that it could be a keep-alive issue. Setting request.KeepAlive = false; changes nothing, though: it doesn't break stuff when connecting to WEBrick, and it doesn't fix stuff when connecting to nginx.
For what it's worth, the WebException.InnerException is a NullReferenceException with the following StackTrace:
at System.Net.HttpWebRequest.CheckCacheUpdateOnResponse()
at System.Net.HttpWebRequest.CheckResubmitForCache(Exception& e)
at System.Net.HttpWebRequest.DoSubmitRequestProcessing(Exception& exception)
at System.Net.HttpWebRequest.ProcessResponse()
at System.Net.HttpWebRequest.SetResponse(CoreResponseData coreResponseData)
Headers for the (working) WEBrick connection:
########## request
GET /users/current.json HTTP/1.1
Authorization: Basic *REDACTED*
Content-Type: application/json
Accept: application/json
Accept-Charset: utf-8
Host: testbox.local:3030
If-None-Match: "84a49062768e4ca619b1c081736da20f"
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
########## response
HTTP/1.1 304 Not Modified
X-Ua-Compatible: IE=Edge
Etag: "84a49062768e4ca619b1c081736da20f"
Date: Wed, 01 Dec 2010 18:18:59 GMT
Server: WEBrick/1.3.1 (Ruby/1.8.7/2010-08-16)
X-Runtime: 0.177545
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: *REDACTED*
Headers for the (exception-throwing) nginx connection:
########## request
GET /users/current.json HTTP/1.1
Authorization: Basic *REDACTED*
Content-Type: application/json
Accept: application/json
Accept-Charset: utf-8
Host: testsystem.local:8080
If-None-Match: "a64560553465e0270cc0a23cc4c33f9f"
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
########## response
HTTP/1.1 304 Not Modified
Connection: keep-alive
Status: 304
X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 3.0.0
ETag: "a64560553465e0270cc0a23cc4c33f9f"
X-UA-Compatible: IE=Edge,chrome=1
X-Runtime: 0.240160
Set-Cookie: *REDACTED*
Cache-Control: max-age=0, private, must-revalidate
Server: nginx/0.8.53 + Phusion Passenger 3.0.0 (mod_rails/mod_rack)
UPDATE:
I tried doing a quick-and-dirty manual ETag cache, but it turns out that's a no-go: I get a WebException when invoking request.GetResponse(), telling me that "The remote server returned an error: (304) Not Modified." - yeah, .NET, I kinda knew that, and I'd like to (attempt to) handle it myself, grr.
UPDATE 2:
Getting closer to the root of the problem. The showstopper seems to be a difference in the response headers for the initial request. WEBrick includes a Date: Wed, 01 Dec 2010 21:30:01 GMT header, which isn't present in the nginx reply. There are other differences as well, but after intercepting the initial nginx reply with Fiddler and adding a Date header, the subsequent HttpWebRequests are able to process the (unmodified) nginx 304 replies.
Going to try to look for a workaround, as well as getting nginx to add the Date header.
UPDATE 3:
It seems that the serverside issue is with Phusion Passenger, they have an open issue about lack of the Date header. I'd still say that HttpWebRequest's behavior is... suboptimal.
UPDATE 4:
Added a Microsoft Connect ticket for the bug.
I think the designers find it reasonable to throw an exception when the "expected behavior"---i.e., getting a response body---cannot be completed. You can handle this somewhat intelligently as follows:
catch (WebException ex)
{
    if (ex.Status == WebExceptionStatus.ProtocolError)
    {
        var statusCode = ((HttpWebResponse)ex.Response).StatusCode;
        // Test against HttpStatusCode enumeration.
    }
    else
    {
        // Do something else, e.g. throw;
    }
}
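Building on that catch block, a hand-rolled ETag cache might look something like the sketch below. EtagCache and its fields are hypothetical names, it keeps only one entry, and it assumes the server returns an ETag header and that the 304 surfaces as a WebException, as described in the question.

```csharp
using System;
using System.IO;
using System.Net;

class EtagCache
{
    private string cachedEtag; // ETag of the last full response
    private string cachedBody; // body of the last full response

    public string Get(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        if (cachedEtag != null)
            request.Headers[HttpRequestHeader.IfNoneMatch] = cachedEtag;

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                cachedBody = reader.ReadToEnd();
                cachedEtag = response.Headers[HttpResponseHeader.ETag];
                return cachedBody;
            }
        }
        catch (WebException ex)
        {
            var http = ex.Response as HttpWebResponse;
            if (ex.Status == WebExceptionStatus.ProtocolError &&
                http != null && http.StatusCode == HttpStatusCode.NotModified)
            {
                http.Close();      // release the connection
                return cachedBody; // 304: serve the stored copy
            }
            throw;                 // anything else is a real error
        }
    }

    static void Main(string[] args)
    {
        var cache = new EtagCache();
        string url = args[0]; // URL supplied on the command line
        Console.WriteLine(cache.Get(url).Length); // full response
        Console.WriteLine(cache.Get(url).Length); // cached copy if the server answers 304
    }
}
```

This bypasses HttpRequestCachePolicy entirely, so it would not exercise the code path that throws the NullReferenceException.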
So, it turns out to be Phusion Passenger (or nginx, depending on how you look at it - and Thin as well) that doesn't add a Date HTTP response header, combined with what I see as a bug in .NET HttpWebRequest (in my situation there's no If-Modified-Since, thus Date shouldn't be necessary) leading to the problem.
The workaround for this particular case was to edit our Rails ApplicationController:
class ApplicationController < ActionController::Base
  # ...other stuff here
  before_filter :add_date_header
  # bugfix for .NET HttpWebRequest 304-handling bug and various
  # webservers' laziness in not adding the Date: response header.
  def add_date_header
    # httpdate emits the RFC 1123 format that HTTP expects for Date
    response.headers['Date'] = Time.now.httpdate
  end
end
UPDATE:
Turns out it's a bit more complex than "just" setting HttpRequestCachePolicy - to repro, I also need to have manually constructed HTTP Basic Auth. So the involved components are the following:
An HTTP server that doesn't include an HTTP "Date:" response header.
Manual construction of the HTTP Authorization request header.
Use of HttpRequestCachePolicy.
Smallest repro I've been able to come up with:
namespace Repro
{
    using System;
    using System.IO;
    using System.Net;
    using System.Net.Cache;
    using System.Text;

    class ReproProg
    {
        const string requestUrl = "http://drivelog.miracle.local:3030/users/current.json";

        // Manual construction of HTTP basic auth so we don't get an unnecessary server
        // roundtrip telling us to auth, which is what we get if we simply use
        // HttpWebRequest.Credentials.
        private static void SetAuthorization(HttpWebRequest request, string _username, string _password)
        {
            string userAndPass = string.Format("{0}:{1}", _username, _password);
            byte[] authBytes = Encoding.UTF8.GetBytes(userAndPass.ToCharArray());
            request.Headers["Authorization"] = "Basic " + Convert.ToBase64String(authBytes);
        }

        static public void DoRequest()
        {
            var request = (HttpWebRequest) WebRequest.Create(requestUrl);
            request.Method = "GET";
            request.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.CacheIfAvailable);
            SetAuthorization(request, "user#domain.com", "12345678");
            using(var response = request.GetResponse())
            using(var stream = response.GetResponseStream())
            using(var reader = new StreamReader(stream))
            {
                string reply = reader.ReadToEnd();
                Console.WriteLine("########## Server reply: {0}", reply);
            }
        }

        static public void Main(string[] args)
        {
            DoRequest(); // works
            DoRequest(); // explodes
        }
    }
}

Help With .NET CookieContainer

I've recently run into some problems with the CookieContainer. Either I'm doing something seriously wrong or there is some kind of bug w/ the CookieContainer object. It doesn't seem to update the cookie collection with certain Set-Cookie headers.
This might be a lengthy post and I apologize, but I want to be as thorough as possible, so I'm going to list my HTTP sniffing logs as well as my actual implementation code.
public bool SendRequest(HttpWebRequest request, IDictionary<string, string> data, int retries)
{
    // copy request in case request instance already failed
    HttpWebRequest newRequest = (HttpWebRequest)HttpWebRequest.Create(request.RequestUri);
    newRequest.Method = request.Method;
    // if POST data was provided, write it to the stream
    if (data != null && data.Count != 0)
    {
        StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
        writer.Write(createPostString(data));
        writer.Close();
    }
    // set request with global cookie container
    newRequest.CookieContainer = this.cookieJar;
    try
    {
        using (HttpWebResponse resp = (HttpWebResponse)newRequest.GetResponse())
        {
            //CookieCollection newCooks = getCookies(resp.Headers);
            //updateCookies(newCooks);
            this.cookieJar = newRequest.CookieContainer;
            this.Html = getResponseString(resp);
            /* remainder snipped */
So there is the code; here are two requests and a response I sniffed in Fiddler:
Request 1
POST /login/ HTTP/1.1
Host: www.site.com
Content-Length: 47
Expect: 100-continue
Connection: Keep-Alive
Response 1
HTTP/1.1 200 OK
Date: Wed, 02 Dec 2009 17:03:35 GMT
Server: Apache
Set-Cookie: tcc=one; path=/
Set-Cookie: cust_id=2702585226; domain=.site.com; path=/; expires=Mon, 01-Jan-2011 00:00:00 GMT
Set-Cookie: cust_session=12%2F2%2F2009%20%2012%3A3%3A35; domain=.site.com; path=/; expires=Wed 2-Dec-2009 17:33:35
Set-Cookie: refer_id_persistent=0000; domain=.site.com; path=/; expires=Fri 2-Dec-2011 17:3:35
Set-Cookie: refer_id=0000; domain=.site.com; path=/
Set-Cookie: private_browsing_mode=off; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Set-Cookie: member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Request 2
GET /test?search=jjkjf HTTP/1.1
Host: www.site.com
Cookie: tcc=one; cust_id=2702585226; private_browsing_mode=off; member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D
So as you can see, the CookieContainer (this.cookieJar) which is used for every request is not picking up the Set-Cookie header for refer_id, cust_session, refer_id_persistent. However it does pick up cust_id, private_browsing_mode, tcc, and member_session... Any ideas why this might be?
Just wanted to update this post in case someone else comes across this. The issue is that .NET complies with the RFC specification for cookies, but not all sites do. So, ultimately, the issue is not Microsoft, or .NET for that matter (although IE manages the cookies fine, so it would be better if .NET's cookie parsing used the same methods). The issue is the sites that do not follow the RFC specification.
Nonetheless, an issue I've often encountered is that sites use commas in the expiration dates of their cookies. .NET interprets these as separators between different cookie fields and strips the comma and everything after it off the cookie.
RFC spec: "Cookie:, followed by a comma-separated list of one or more cookies." An easy solution to this problem would be for the web server to enclose values with commas in quotation marks, per the RFC document. However, there is no RFC police, so we can only hope that people follow the rules.
MSDN SetCookies:
SetCookies pulls all the HTTP cookies out of the HTTP cookie header, builds a Cookie for each one, and then adds each Cookie to the internal CookieCollection that is associated with the URI. The HTTP cookies in the cookieHeader string must be delimited by commas.
MSDN GetCookieHeader
GetCookieHeader returns a string that holds the HTTP cookie header for the Cookie instances specified by uri. The HTTP header is built by adding a string representation of each Cookie associated with uri. Note that the exact format of the string depends on the RFC that the Cookie conforms to. The strings for all the Cookie instances that are associated with uri are combined and delimited by semicolons.
This string is not in the correct format for use as the second parameter of the SetCookies method.
This is only a quick scan through your code, but it seems that you are sending the post data before you attach the cookies to the request.
if (data != null && data.Count != 0)
{
    StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
    writer.Write(createPostString(data));
    writer.Close();
}
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
What might be happening is that when you write your post data to the stream, it is sent to the remote server. However, for cookies to be set they must be sent to the server before any post data. The simple solution is to swap this around like so:
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
if (data != null && data.Count != 0)
{
    StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
    writer.Write(createPostString(data));
    writer.Close();
}
CookieContainer has 2 major issues that I have come across; whether they are by design or bugs I don't know.
1) Cookies set on a 302 post are not picked up.
Example
Post to site
302 redirect response
Load new page which sets cookie
Solution
Set AllowAutoRedirect to false, manually follow the redirects and set the cookies yourself.
2) .NET is VERY fussy about incorrectly formed cookie strings that have a comma in the string. This is actually correct, but occasionally cookies have a date set that includes a comma, which stops all cookies being set.
Solution
Manually parse the cookie strings and add them yourself. A horrible task. I have a sprawling mess of a hack function that loops and ifs, but the end result is that it works for all cases I have thrown at it so far. It isn't pretty but it gets the job done.
Not sure the above is your issue, but maybe. If not, some food for thought anyway.
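For the comma problem, one possible approach (not the function described above, which wasn't posted) is to split a combined Set-Cookie string only at commas that are followed by a new "name=" pair, so the comma inside an expires date is left alone. SetCookieSplitter is a hypothetical helper, and the heuristic will misfire if a cookie value itself contains ",name=".

```csharp
using System;
using System.Text.RegularExpressions;

static class SetCookieSplitter
{
    // A comma starts a new cookie only when it is followed by "name=".
    // In "expires=Mon, 01-Jan-2011 ..." the text after the comma is a date, so no split.
    private static readonly Regex CookieBoundary = new Regex(@",(?=\s*[^=;,\s]+=)");

    public static string[] Split(string combinedHeader)
    {
        string[] parts = CookieBoundary.Split(combinedHeader);
        for (int i = 0; i < parts.Length; i++)
            parts[i] = parts[i].Trim();
        return parts;
    }
}

static class Program
{
    static void Main()
    {
        string header =
            "cust_id=2702585226; domain=.site.com; path=/; expires=Mon, 01-Jan-2011 00:00:00 GMT," +
            "refer_id=0000; domain=.site.com; path=/";

        // Prints two lines: the cust_id cookie (date intact) and the refer_id cookie.
        foreach (string cookie in SetCookieSplitter.Split(header))
            Console.WriteLine(cookie);
    }
}
```

Each piece could then be parsed into a Cookie and added to the CookieContainer individually.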
My solution: replace " UTC" with " GMT".
Try using CookieContainer.GetCookieHeader and CookieContainer.SetCookies
YourCookieContainer.GetCookieHeader(new Uri("your url"));
YourCookieContainer.SetCookies(new Uri("your url"), "string from GetCookieHeader");
