I'm writing a Windows UWP program that uses the Windows.Web.Http namespace. I stumbled upon an MSDN posting mentioning that the code below forces chunked encoding, and it does indeed do so. I've made some attempts at changing the chunk size of the HTTP request, but the size is stuck at 0x10000.
Is there a way to change the chunk size?
What is the rationale for making GetInputStreamAt(0) trigger chunking? It seems like a weird side effect that isn't documented anywhere. Is there some logic to it?
Here's the MSDN posting:
HttpStreamContent won't save stream
Windows.Storage.Streams.InMemoryRandomAccessStream contentStream = new Windows.Storage.Streams.InMemoryRandomAccessStream();
Windows.Storage.Streams.DataWriter dw = new Windows.Storage.Streams.DataWriter(contentStream);
dw.WriteBytes(System.Text.Encoding.UTF8.GetBytes(body));
await dw.StoreAsync(); // flush the writer; the original fire-and-forget AsTask() call could race the request
var content = new Windows.Web.Http.HttpStreamContent(contentStream.GetInputStreamAt(0));
content.Headers.Add("Content-Type", "application/json; charset=utf-8");
MSFT Guy Says:
... the difference between the contentStream.Seek(0) and contentStream.GetInputStreamAt(0) is: using the first approach actually sends the Content-Length HTTP header followed by the entity body, whereas the second approach uses the Transfer-Encoding: chunked HTTP header, followed by the HTTP chunks. This is helpful in scenarios where the target server only accepts a "specific" format of the HTTP request.
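The rationale is presumably type-driven: contentStream itself is an IRandomAccessStream whose Size is known, so the client can compute Content-Length, while GetInputStreamAt(0) returns a plain IInputStream that exposes no length, leaving chunked transfer as the only option. A minimal sketch contrasting the two variants, reusing the stream from the snippet above (this is my reading of the quote, not documented behavior):
// Variant A: pass the random-access stream itself; its Size is known,
// so the request is sent with a Content-Length header.
contentStream.Seek(0);
var fixedLengthContent = new Windows.Web.Http.HttpStreamContent(contentStream);
// Variant B: pass a plain input-stream view of the same data; an
// IInputStream reports no length, so the client falls back to
// Transfer-Encoding: chunked.
var chunkedContent = new Windows.Web.Http.HttpStreamContent(contentStream.GetInputStreamAt(0));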
The following has been puzzling me for a while now.
First of all, I have been scraping sites for a couple of months, among them Hebrew sites, and I have had no problem whatsoever receiving Hebrew characters from the HTTP server.
For some reason I am very curious to sort out, the following site is an exception: I can't get the characters properly decoded. I tried emulating the working requests I make via Fiddler, but to no avail. My C# request headers look exactly the same, yet the characters are still not readable.
What I do not understand is why I have always been able to retrieve Hebrew characters from other sites, while from this one specifically I cannot. What setting is causing this?
Try the following sample out.
HttpClient httpClient = new HttpClient();
httpClient.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0");
//httpClient.DefaultRequestHeaders.TryAddWithoutValidation("Accept", "text/html;q=0.9");
//httpClient.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.5");
//httpClient.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate");
var getTask = httpClient.GetStringAsync("http://winedepot.co.il/Default.asp?Page=Sale");
//doing it like this for the sake of the example
var contents = getTask.Result;
//add a breakpoint at the following line to check the contents of "contents"
Console.WriteLine();
As mentioned, this code works for any other Israeli site I try, such as the Ynet news site.
Update: While debugging with Fiddler, I noticed that the response from the Ynet site (one that works) includes the header
Content-Type: text/html; charset=UTF-8
while this header is absent in the response from winedepot.co.il.
I tried adding it myself, but it still made no difference:
var getTask = httpClient.GetAsync("http://www.winedepot.co.il");
var response = getTask.Result;
var contentObj = response.Content;
contentObj.Headers.Remove("Content-Type");
contentObj.Headers.Add("Content-Type", "text/html; charset=UTF-8");
var readTask = response.Content.ReadAsStringAsync();
var contents = readTask.Result;
Console.WriteLine();
The problem you're encountering is that the webserver is lying about its content-type, or rather, not being specific enough.
The first site responds with this header:
Content-Type: text/html; charset=UTF-8
The second one with this header:
Content-Type: text/html
This means that in the second case, your client will have to make assumptions about what encoding the text is actually in. To learn more about text encodings, please read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
And the built-in HTTP clients for .NET don't really do a great job at this, which is understandable, because it is a Hard Problem. Read the linked article for the trouble a web browser will have to go through in order to guess the encoding, and then try to understand why you don't want this logic in a programmable web client.
Now the sites do provide you with a <meta http-equiv="Content-Type" content="actual encoding here" /> tag, which is a nasty workaround for not having to properly configure a web server. When a browser encounters such a tag, it will have to restart parsing the document with the specified content-type, and then hope it is correct.
The steps roughly are, assuming an HTML payload:
Perform web request, keep the response document in a binary buffer.
Inspect the Content-Type header, if present; if it is missing or doesn't provide a charset, make an assumption about the encoding.
Read the response by decoding the buffer, and parsing the resulting HTML.
When encountering a <meta http-equiv="Content-Type" /> tag, discard all decoded text and start again, interpreting the binary buffer as text in the specified encoding.
The C# HTTP clients stop at step 2, and rightfully so. They are HTTP clients, not HTML-displaying browsers. They don't care whether your payload is HTML, JSON, XML, or any other textual format.
When no charset is given in the Content-Type response header, the .NET HTTP clients default to the ISO-8859-1 encoding, which cannot represent the characters of Windows-1255 (Hebrew), the encoding the page is actually in (or rather, it has different characters at the same code points).
Some C# implementations that try to detect the encoding from the meta HTML element are provided in Encoding trouble with HttpWebResponse. I cannot vouch for their correctness, so use them at your own risk. I do know that the currently highest-voted answer actually re-issues the request when it encounters the meta tag, which is quite silly: there is no guarantee that the second response will be the same as the first, and it is just a waste of bandwidth.
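To make the buffered approach from the steps above concrete, here is a minimal sketch that downloads the bytes once and re-decodes the same buffer when a meta charset is found. The regex and the ISO-8859-1 fallback are simplifying assumptions, not a hardened implementation:
using System.Net.Http;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
static async Task<string> ReadWithMetaCharsetAsync(HttpClient client, string url)
{
    // Step 1: keep the response in a binary buffer.
    byte[] raw = await client.GetByteArrayAsync(url);
    // Steps 2-3: decode with a permissive single-byte fallback.
    string firstPass = Encoding.GetEncoding("ISO-8859-1").GetString(raw);
    // Step 4: if a meta charset is declared, re-decode the same buffer
    // rather than re-issuing the request.
    var match = Regex.Match(firstPass, @"charset\s*=\s*[""']?([\w-]+)", RegexOptions.IgnoreCase);
    if (match.Success)
    {
        try { return Encoding.GetEncoding(match.Groups[1].Value).GetString(raw); }
        catch (ArgumentException) { /* unknown charset name, keep the fallback */ }
    }
    return firstPass;
}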
You can also assume that you know the encoding used by a certain site or page, and force that encoding:
using (Stream resStream = response.GetResponseStream())
using (var reader = new StreamReader(resStream, YourFixedEncoding))
{
    string content = reader.ReadToEnd();
}
Or, for HttpClient:
using (var client = new HttpClient())
{
    var response = await client.GetAsync(url);
    var responseStream = await response.Content.ReadAsStreamAsync();
    using (var fixedEncodingReader = new StreamReader(responseStream, Encoding.GetEncoding(1255)))
    {
        string responseString = fixedEncodingReader.ReadToEnd();
    }
}
But assuming an encoding for a particular response, URL, or site is unsafe altogether: there is no guarantee that the assumption will be correct every time.
I'm using RestSharp to make a call to a REST service. My call looks something like this:
var request = new RestRequest("/foo", Method.POST);
request.JsonSerializer.ContentType = "application/json; charset=utf-8";
request.AddJsonBody(new string[] { "param1", "param2" });
var response = this._client.Execute<Foo>(request);
For most other calls this works fine. I'm running into issues when the response is compressed. The headers in the response look (mostly) like this:
HTTP/1.1 200 OK
Uncompressed-Size: 35000
Content-Length: 3019
Content-Encoding: deflate
Content-Type: application/json
The issue is when I call this method with RestSharp I keep getting the error:
Error: Block length does not match with its complement.
I've tried setting the Accept-Encoding header in the request but it still produces the error. I also tried using a custom deserializer but the error is occurring before deserialization. From what I can tell, RestSharp should automatically handle deflation if the Content-Encoding header says deflate (which it does).
How can I get RestSharp to handle the deflation properly?
UPDATE
In the end I was able to have the service changed to look for an Accept-Encoding header with a value of identity in the request; if found, the service now returns the data uncompressed.
This is unfortunately not really a solution to the original issue but it does resolve the problem for me. If a better solution is posted I will try it.
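For reference, the request side of that workaround looks something like this; the added header is the only change to the original request, and it only helps because the service was modified to honour it:
var request = new RestRequest("/foo", Method.POST);
request.AddHeader("Accept-Encoding", "identity"); // ask the modified service for uncompressed data
request.JsonSerializer.ContentType = "application/json; charset=utf-8";
request.AddJsonBody(new string[] { "param1", "param2" });
var response = this._client.Execute<Foo>(request);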
According to this post, you should be able to handle it if you don't pass charset=utf-8 in the content type.
Please refer to this:
RestSharp compress request while making rest call to server
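If I read that post correctly, the only change to the request in the question would be dropping the charset parameter; a sketch, untested against your service:
var request = new RestRequest("/foo", Method.POST);
request.JsonSerializer.ContentType = "application/json"; // no "; charset=utf-8"
request.AddJsonBody(new string[] { "param1", "param2" });
var response = this._client.Execute<Foo>(request);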
In C# on .NET 4.0, I'm trying to get the Date header from an Internet web site's HTTP response.
My goal is to validate a local system's time (to within seconds) against Internet time by using HTTP rather than SNTP. I'm an SNTP fan, but it won't do in this scenario. I found this concept of using HTTP headers for time, called "HTP", and want to replicate it in C#.
I tried the HttpWebRequest.Headers collection using the MSDN example on its documentation page, which doesn't return the Date (or much else).
If HttpWebRequest.Headers is a good way to go about getting this value, why can't I see Date in this result? Is there a better way?
var myHttpWebRequest=(HttpWebRequest)WebRequest.Create("http://www.microsoft.com");
myHttpWebRequest.GetResponse();
Console.WriteLine("\nThe HttpHeaders are \n\n\tName\t\tValue\n{0}",
myHttpWebRequest.Headers);
You seem to be reading the request headers, instead of reading the response headers:
var myHttpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.microsoft.com");
var response = myHttpWebRequest.GetResponse();
Console.WriteLine("\nThe HttpHeaders are \n\n\tName\t\tValue\n{0}", response.Headers);
I'm looking at this method in this HTTPCombiner:
private bool CanGZip(HttpRequest request)
{
string acceptEncoding = request.Headers["Accept-Encoding"];
if (!string.IsNullOrEmpty(acceptEncoding) &&
(acceptEncoding.Contains("gzip") || acceptEncoding.Contains("deflate")))
return true;
return false;
}
If this returns true then the response is compressed using a GZipStream. Is this right?
Those are two different algorithms:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.5
Some code here:
http://www.singular.co.nz/blog/archive/2008/07/06/finding-preferred-accept-encoding-header-in-csharp.aspx
So, according to the protocol, it is not right: if the browser says "give me the content using deflate", you shouldn't send it back gzipped.
GZip (which is based on Deflate) and Deflate are two different algorithms, so a request for "deflate" should definitely not return gzipped content.
However, this should be easy to fix, by simply using a GZipStream if the accept header contains "gzip" and a DeflateStream for "deflate".
Both are included in System.IO.Compression, so it's not like you'd have to code your own deflate algorithm or use a third party implementation.
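A sketch of that fix, assuming the same ASP.NET HttpRequest/HttpResponse types the HTTPCombiner works with; only the stream selection logic changes:
using System.IO;
using System.IO.Compression;
using System.Web;
private static Stream GetCompressionStream(HttpRequest request, HttpResponse response)
{
    string acceptEncoding = request.Headers["Accept-Encoding"] ?? string.Empty;
    if (acceptEncoding.Contains("gzip"))
    {
        response.AppendHeader("Content-Encoding", "gzip");
        return new GZipStream(response.OutputStream, CompressionMode.Compress);
    }
    if (acceptEncoding.Contains("deflate"))
    {
        response.AppendHeader("Content-Encoding", "deflate");
        return new DeflateStream(response.OutputStream, CompressionMode.Compress);
    }
    return response.OutputStream; // no compression requested
}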
Typically most browsers understand both GZip and Deflate; they tell the server by specifying Accept-Encoding: gzip, deflate in the request header. HTTPCombiner gives preference to GZip: if both types are present, GZip is used. HTTPCombiner will send Deflate-compressed content only if the browser requests Deflate alone.
I am using C# for my project. Can anyone tell me what the standard structure of an HTTP POST request is, and how to attach POST data, such as a file, to the request from code?
Simply put, I want to create a POST request from my code itself, with different items to be posted available.
I have checked the IETF's RFC for HTTP POST, but it's too long...
Specs for simple reference
I have always appreciated HTTP Made Really Easy as a starting point. It's small, concise and friendly.
Often you can get enough implementation detail (or at least enough understanding) from this guide's simple style to meet your need. It has worked for me many times. There is a section on POST, and the guide builds cumulatively.
Additionally it links to proper specifications and fuller resources should you need to reference them and get into more detail.
.NET Supporting Classes
Fortunately the .NET Framework Class Library contains higher-level classes that can simplify your life. Look into the MSDN documentation and examples for System.Net.WebClient (it doesn't lend itself as well to POST, favouring GET in its quick-usage methods). For more flexibility, consider the System.Net.HttpWebRequest and System.Net.HttpWebResponse counterpart classes.
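That said, WebClient can still handle a simple form POST via UploadValues; a minimal sketch, where the URL and field names are placeholders:
using System;
using System.Collections.Specialized;
using System.Net;
using (var client = new WebClient())
{
    var form = new NameValueCollection
    {
        { "field1", "value1" },
        { "field2", "value2" }
    };
    // UploadValues sends application/x-www-form-urlencoded data via POST
    byte[] responseBytes = client.UploadValues("http://example.com/foo", "POST", form);
    Console.WriteLine(System.Text.Encoding.UTF8.GetString(responseBytes));
}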
Example using C#
This code sample shows the concept of posting binary data from a stream to a URL.
This method is called like:
PostMyData(Stream_instance, "http://url_to_post_to");
Namespaces involved are:
using System.IO;
using System.Net;
The custom method would look something like the following.
Note: Concept taken from MSDN sample code here.
Although I use the MIME type application/octet-stream for generic binary data, you can use any well-known type from this list of MIME types to describe the kind of binary data you are sending.
public int PostMyData(Stream binaryData, string postToUrl) {
    // create the http request
    var request = (HttpWebRequest)WebRequest.Create(postToUrl);
    request.Method = "POST";
    request.ContentType = "application/octet-stream"; // generic binary data

    // copy the data (bytes) to be posted into the body of the request;
    // dispose the request stream before calling GetResponse
    using (var streamOut = request.GetRequestStream()) {
        binaryData.CopyTo(streamOut);
    }

    // post and return the response status code
    using (var response = (HttpWebResponse)request.GetResponse()) {
        return (int)response.StatusCode;
    }
}
Use HttpWebRequest; it's always the best way. But for a simpler approach to HTTP POST, read:
http://programaticallyspeaking.site40.net/blog/2010/11/how-to-implement-an-http-server-part-1/
Hey, I found a way to view a sample POST request: use Fiddler to track HTTP transfers and click on RAW to view the raw data being transferred.