Unicode data in web response header - c#

I have developed a Web API that accepts files via POST, processes them, and returns them in the HTTP response. The API also returns additional data in HTTP headers, such as the output file name. The problem is that when I post a file and read the response with HttpWebResponse, the file name in the response header is scrambled and the Unicode characters are lost.
For example, if I submit наталья.docx, I get наÑалÑÑ.pdf.
The full response headers:
Pragma: no-cache
Transfer-Encoding: chunked
Access-Control-Allow-Origin: *
Result: True
StoreFile: false
Timeout: 300
OutputFileName: наÑалÑÑ.pdf
Content-Disposition: attachment; filename=наÑалÑÑ.pdf
Cache-Control: no-cache, no-store
Content-Type: application/pdf
Date: Wed, 12 Sep 2012 07:21:37 GMT
Expires: -1
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4
I am reading header values like this
HttpWebResponse webResponse = FormUpload.MultipartFormDataPost(postdatatoserver);
using (Stream clientResponse = webResponse.GetResponseStream())
if (webResponse.StatusCode == HttpStatusCode.OK)
{
Helpers.CopyStream(clientResponse, outStream);
webHeaderCollection = webResponse.Headers;
}
I am not sure whether I should decode the scrambled characters to Unicode when I read them from the response header, or whether I need to specify an encoding in the response header when I send data from the Web API server.

See http://msdn.microsoft.com/en-us/library/system.net.webresponse.getresponsestream.aspx:
Stream ReceiveStream = myWebResponse.GetResponseStream();
Encoding enc = System.Text.Encoding.UTF8;
// Pipe the stream to a higher level stream reader with the required encoding format.
StreamReader readStream = new StreamReader(ReceiveStream, enc);
You might also try
System.Text.Encoding.Default
or
System.Text.Encoding.UTF7
or
System.Text.Encoding.Unicode
or
System.Text.Encoding.GetEncoding(1251)
or
System.Text.Encoding.GetEncoding(1252)
or
System.Text.Encoding.GetEncoding(20866)
See here for a longer list:
http://www.pcreview.co.uk/forums/system-text-encoding-getencoding-whatvalidstrings-t1406242.html
Edit:
Current [RFC 2045] grammar restricts parameter values (and hence
Content-Disposition filenames) to US-ASCII.
So HTTP header values are restricted to ASCII. The StreamReader encoding only affects how the response body is decoded, not the headers.
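On the client side, the scrambled name in the question (each Cyrillic letter turning into two Latin characters) is consistent with UTF-8 bytes being decoded one byte per character, as ISO-8859-1. Assuming that is what happened, and assuming no bytes were dropped along the way, the original text can be recovered by reversing that step. This is a minimal sketch; the class and method names are made up for illustration.

```csharp
using System.Text;

public static class HeaderRepairSketch
{
    // Assumption: the header value was built by mapping each raw byte
    // to one char (ISO-8859-1), while the server actually sent UTF-8.
    public static string Latin1ToUtf8(string headerValue)
    {
        // Turn the chars back into the raw bytes that were on the wire.
        byte[] raw = Encoding.GetEncoding("ISO-8859-1").GetBytes(headerValue);
        // Decode those bytes as the UTF-8 text the server intended.
        return Encoding.UTF8.GetString(raw);
    }
}
```

It could be applied to a value such as webResponse.Headers["OutputFileName"]. This only treats the symptom; sending a non-ASCII header value is still outside what the header grammar allows.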
IE doesn't conform to the standard, so there is a workaround: URL-encode the file name.
You need to do this when you write the file back:
// IE needs url encoding, FF doesn't support it, Google Chrome doesn't care
if (Request.Browser.IsBrowser ("IE"))
{
fileName = Server.UrlEncode(fileName);
}
Response.Clear ();
Response.AddHeader ("content-disposition", String.Format ("attachment;filename=\"{0}\"", fileName));
Response.AddHeader ("Content-Length", data.Length.ToString (CultureInfo.InvariantCulture));
Response.ContentType = mimeType;
Response.BinaryWrite(data);
As per the question "Unicode in Content-Disposition header", you can add an asterisk to the parameter name (filename*) and prefix the value with its encoding.
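A hedged sketch of what such a header value could look like, following the filename* form from RFC 5987: the value is prefixed with the charset and an empty language tag (UTF-8''), and every byte outside a small safe set is percent-encoded. The class and method names here are hypothetical, and older browsers may ignore filename* entirely.

```csharp
using System.Text;

public static class ContentDispositionSketch
{
    // Punctuation that RFC 5987 allows unescaped (besides letters and digits).
    private const string SafePunctuation = "!#$&+-.^_`|~";

    // Produces e.g. UTF-8''%D0%BD... for a non-ASCII name.
    public static string EncodeExtValue(string value)
    {
        var sb = new StringBuilder("UTF-8''");
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            char c = (char)b;
            bool safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                     || (c >= '0' && c <= '9') || SafePunctuation.IndexOf(c) >= 0;
            if (safe)
                sb.Append(c);
            else
                sb.Append('%').Append(b.ToString("X2")); // percent-encode the raw byte
        }
        return sb.ToString();
    }

    public static string BuildAttachmentHeader(string fileName)
    {
        return "attachment; filename*=" + EncodeExtValue(fileName);
    }
}
```

On the server this might be used as Response.AddHeader("Content-Disposition", ContentDispositionSketch.BuildAttachmentHeader(fileName)). A client that reads the header itself, like the HttpWebResponse code in the question, would then need to percent-decode the value as UTF-8.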

Related

Setting HttpResponse.ContentEncoding to GZIP

I have a small IHttpModule that's reading a POST request from another server and relaying it on. The response from the remote server has the header
Content-Encoding: gzip
How do I specify this in the HttpResponse I'm returning to the caller? The HttpResponse.ContentEncoding property is a text encoding, so it expects something such as UTF8.
context.Response.ContentEncoding = ???;
Should I ignore this property and set the header manually?
If you are modifying the response, you should decode and read the content, gzip the modified value, and add the header to the response.
// Gzip the content and add the header
context.Response.Filter = new System.IO.Compression.GZipStream(
context.Response.Filter,
System.IO.Compression.CompressionMode.Compress);
context.Response.AppendHeader("Content-Encoding", "gzip");
If you are relaying the response without any change, there is no need to do anything.
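The decode and re-gzip steps can also be done on byte arrays outside the response filter. The following is a minimal sketch using GZipStream; the helper names are invented for illustration and nothing here is specific to IHttpModule.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipSketch
{
    // Compresses a buffer into gzip format.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the gzip trailer is not written.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Expands a gzip buffer, e.g. the body received from the remote server.
    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

A module that edits the body would call Decompress on the remote response, change the content, call Compress, and then add the Content-Encoding: gzip header as shown above.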

HTTP Upload Corrupts Zip File?

I have a zip file that I can read with DotNetZipLib from the file system. However, when I POST it via a form to my MVC application it can't be read as a stream. My best guess at the moment is that the HTTP upload is somehow corrupting the zip file. There's no shortage of questions with the same problem, and I thought I'd accounted for the stream properly but perhaps I'm not using the .NET object(s) here as intended.
Here's my WebAPI POST handler:
public void Post(HttpRequestMessage request)
{
using(var fileData = request.Content.ReadAsStreamAsync().Result)
if (fileData.Length > 0)
{
var zip = ZipFile.Read(fileData); // exception
}
}
The exception, of course, is from the DotNetZipLib ZipFile just saying that the stream can't be read as a zip. If I replace fileData with just a path to the file (this is all being tested on the same machine) then it reads it, so it has to be the HTTP upload.
In FireBug, the headers for the POST are:
Response Headers:
Cache-Control no-cache
Content-Length 1100
Content-Type application/xml; charset=utf-8
Date Sat, 01 Feb 2014 23:18:32 GMT
Expires -1
Pragma no-cache
Server Microsoft-IIS/8.0
X-AspNet-Version 4.0.30319
X-Powered-By ASP.NET
X-SourceFiles =?UTF-8?B?QzpcRGF0YVxDb2RlXE9yZ1BvcnRhbFxPcmdQb3J0YWxTZXJ2ZXJcYXBpXGFwcHg=?=
Request Headers
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding gzip, deflate
Accept-Language en-US,en;q=0.5
Connection keep-alive
Cookie uvts=ENGUn8FXEnEQFeS
Host localhost:48257
Referer http://localhost:48257/Home/Test
User-Agent Mozilla/5.0 (Windows NT 6.3; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0
Request Headers From Upload Stream
Content-Length 31817
Content-Type multipart/form-data; boundary=---------------------------265001916915724
And the form is simple enough:
<form action="/api/appx" method="post" enctype="multipart/form-data">
<input name="postedFile" type="file" />
<input type="submit" />
</form>
Am I doing something wrong with the stream? Pulling data from the HttpRequestMessage incorrectly? Or perhaps I should be receiving the upload in an entirely different way?
When you post a file using an HTML form, the media type is multipart/form-data, which has its own formatting syntax, as you can see from your Firebug details. You can't just read it as a stream and expect it to match the file that was sent. There is a set of ReadAsMultipartAsync extension methods for handling this media type.
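A hedged sketch of how ReadAsMultipartAsync might be used to get at the uploaded bytes. It assumes the Web API client libraries (the System.Net.Http.Formatting assembly) are referenced, since that is where the extension methods and MultipartMemoryStreamProvider live; the class and method names are made up, and it buffers the whole upload in memory.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class MultipartSketch
{
    // Returns the raw bytes of the first part in a multipart body.
    // With the form from the question, that part is the "postedFile" input.
    public static async Task<byte[]> ReadFirstPartAsync(HttpContent content)
    {
        if (!content.IsMimeMultipartContent())
            throw new InvalidDataException("Expected a multipart request body.");

        // Parses the boundaries and exposes each part as its own HttpContent.
        MultipartMemoryStreamProvider provider =
            await content.ReadAsMultipartAsync(new MultipartMemoryStreamProvider());

        if (provider.Contents.Count == 0)
            throw new InvalidDataException("The multipart body contained no parts.");

        return await provider.Contents[0].ReadAsByteArrayAsync();
    }
}
```

In the Post handler this would replace the direct ReadAsStreamAsync call: read the bytes with ReadFirstPartAsync(request.Content), wrap them in a MemoryStream, and pass that stream to ZipFile.Read.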
The code below worked fine for both zip and text files; you may want to try it:
public HttpStatusCode Post(string fileName)
{
// Note: this copies the raw request body, so it assumes the client sends
// the file itself rather than a multipart/form-data form.
Stream requestStream = this.Request.Content.ReadAsStreamAsync().Result;
try
{
// The using blocks close both streams even if the copy throws.
using (requestStream)
using (Stream fileStream = File.Create(HttpContext.Current.Server.MapPath("~/" + fileName)))
{
requestStream.CopyTo(fileStream);
}
}
catch (IOException)
{
throw new HttpResponseException(HttpStatusCode.InternalServerError);
}
return HttpStatusCode.Created;
}

Decoding a response c#

I'm developing an API for testing, and I have a problem when I make a web request, specifically when I retrieve the web response.
I use this code:
string request = HttpPost("http://iunlocker.net/check_imei.php", "ime_i=013270000134001");
public static string HttpPost(string URI, string Parameters)
{
try
{
System.Net.WebRequest req = System.Net.WebRequest.Create(URI);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(Parameters);
req.ContentLength = bytes.Length;
System.IO.Stream os = req.GetRequestStream();
os.Write(bytes, 0, bytes.Length);
os.Close();
System.Net.WebResponse resp= req.GetResponse();
if (resp == null) return null;
System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
return sr.ReadToEnd().Trim();
}
catch (Exception ex) { }
return null;
}
The website in the call is just an example; with this and other websites I can't retrieve the result correctly. I receive an exception: "Error 403".
Can anyone tell me what I may be doing wrong?
I thought the problem was encoding/decoding. In fact, Fiddler asks me whether I want to decode before seeing the text. But with another website, used as an example, I get the same message from Fiddler and can still retrieve the response without a problem.
Thanks in advance.
HTTP 403 error means "access forbidden". The destination website is refusing to fulfill your request, for reasons of its own.
Given this particular website http://iunlocker.net/, I'm going to hazard a guess that it may be checking the HTTP_REFERER. In other words it's refusing to fulfill your request because it knows it didn't come from a browser that was viewing the form.
[EDIT] After viewing the response from
curl --form ime_i=013270000134001 -i http://iunlocker.net/check_imei.php
I can see that the immediate response is setting a cookie and a redirect.
HTTP/1.1 307 Temporary Redirect
Server: nginx
Date: Wed, 03 Jul 2013 04:00:27 GMT
Content-Type: text/html
Content-Length: 180
Connection: keep-alive
Set-Cookie: PMBC=35e9e4cd3a7f9d50e7f3bb39d43750d1; path=/
Location: http://iunlocker.net/check_imei.php?pmtry=1
<html>
<head><title>307 Temporary Redirect</title></head>
<body bgcolor="white">
<center><h1>307 Temporary Redirect</h1></center>
<hr><center>nginx</center>
</body>
</html>
This site does not want you scraping it; if you wish to defeat this you will have to make use of its cookies.
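As a rough sketch of what "making use of its cookies" could mean with HttpWebRequest: attach a CookieContainer so that a cookie set on the redirect response is sent back on the follow-up request, and optionally set a Referer. The helper names are hypothetical, and this is untested against the site in question; it may well perform other checks, and whether it permits automated access at all is a separate matter.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class CookiePostSketch
{
    // Builds a form POST that shares cookies through the given container.
    public static HttpWebRequest CreatePost(string uri, CookieContainer cookies, string referer)
    {
        var req = (HttpWebRequest)WebRequest.Create(uri);
        req.Method = "POST";
        req.ContentType = "application/x-www-form-urlencoded";
        req.CookieContainer = cookies;   // cookies received are stored here and resent
        req.AllowAutoRedirect = true;    // follow redirects automatically
        if (referer != null)
            req.Referer = referer;       // only matters if the site checks it
        return req;
    }

    // Sends the body and returns the response text.
    public static string Send(HttpWebRequest req, string parameters)
    {
        byte[] body = Encoding.ASCII.GetBytes(parameters);
        req.ContentLength = body.Length;
        using (Stream os = req.GetRequestStream())
        {
            os.Write(body, 0, body.Length);
        }
        using (var resp = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream()))
        {
            return reader.ReadToEnd().Trim();
        }
    }
}
```

Reusing one CookieContainer instance across several calls keeps the session cookie between requests.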
http://en.wikipedia.org/wiki/HTTP_403 - The web server is denying you access to that URL.
Perhaps the IP address you are using is not allowed to access that resource. Check the web server.

How to get content type of a web address?

I want to get the content type of a web address. For example, this is an HTML page and its type is text/html, but the type of this is text/xml. This page's type seems to be image/png, but it's text/html.
How can I detect the content type of a web address like this?
It should be something like this:
var request = HttpWebRequest.Create("http://www.google.com") as HttpWebRequest;
if (request != null)
{
var response = request.GetResponse() as HttpWebResponse;
string contentType = "";
if (response != null)
contentType = response.ContentType;
}
HTTP Response header: content-type
For a more detailed response, please provide a more detailed question.
using (MyClient client = new MyClient())
{
client.HeadOnly = true;
string uri = "http://www.google.com";
byte[] body = client.DownloadData(uri); // note should be 0-length
string type = client.ResponseHeaders["content-type"];
client.HeadOnly = false;
// check 'tis not binary... we'll use text/, but could
// check for text/html
if (type.StartsWith(@"text/"))
{
string text = client.DownloadString(uri);
Console.WriteLine(text);
}
}
This will get you the MIME type from the headers without downloading the page. Just look for the content-type in the response headers.
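MyClient is not defined in the snippet above. One plausible shape for it, offered as an assumption rather than the answerer's actual class, is a WebClient subclass that switches the request method to HEAD when HeadOnly is set:

```csharp
using System;
using System.Net;

public class MyClient : WebClient
{
    // When true, requests are sent as HEAD so only headers come back.
    public bool HeadOnly { get; set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest req = base.GetWebRequest(address);
        // Only rewrite plain GETs; uploads and other verbs stay untouched.
        if (HeadOnly && req.Method == "GET")
        {
            req.Method = "HEAD";
        }
        return req;
    }
}
```

Some servers answer HEAD differently from GET, or reject it, so the content type seen this way is not guaranteed to match a full download.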
You can detect the Content-Type from the HTTP headers of the response. For http://bayanbox.ir/user/ahmadalli/images/div.png, the headers are:
Connection:keep-alive
Content-Encoding:gzip
Content-Type:text/html; charset=utf-8
Date:Tue, 14 Aug 2012 03:01:41 GMT
Server:bws
Transfer-Encoding:chunked
Vary:Accept-Encoding
Read up on HTTP headers.
HTTP headers will tell you the content type. For example:
content-type: application/xml.
There are two ways to determine the content type:
the file extension in the URL
the HTTP content-type header
The first one was somewhat promoted by Microsoft in the old days and is no longer good practice.
If the client has display constraints accepting only certain content-type, it would request the server with the headers like
accept: application/json
accept: text/html
accept: application/xml
And then if the server could supply one of those and chooses XML it would return the content with the header
content-type: application/xml.
However, some services include further information like
content-type: application/xml; charset=utf-8
rather than using a header of its own for the character encoding.
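When the header carries both pieces, as in the last example, the media type and the charset can be separated with System.Net.Mime.ContentType instead of splitting the string by hand. A small sketch, with hypothetical helper names:

```csharp
using System.Net.Mime;

public static class ContentTypeSketch
{
    // "application/xml; charset=utf-8" -> "application/xml"
    public static string MediaTypeOf(string headerValue)
    {
        return new ContentType(headerValue).MediaType;
    }

    // "application/xml; charset=utf-8" -> "utf-8"; null when no charset is given
    public static string CharSetOf(string headerValue)
    {
        return new ContentType(headerValue).CharSet;
    }
}
```

The input would typically be response.ContentType from an HttpWebResponse. The constructor throws a FormatException on a malformed value, so real code may want to guard against that.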

Would like to get http response results like Fiddler

I'm trying to get the same type of results that Fiddler gets when I launch a webpage from my app.
Below is the code I'm using and the results I'm getting. I've used google.com only as an example.
What do I need to modify in my code to get the results I want or do I need an entirely different approach?
Thanks for your help.
My code:
// create the HttpWebRequest object
HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create("http://www.google.com");
// get the response object which has the header info, using the GetResponse method
var objResults = objRequest.GetResponse();
// get the header count
int intCount = objResults.Headers.Count;
// loop through the results object
for (int i = 0; i < intCount; i++)
{
string strKey = objResults.Headers.GetKey(i);
string strValue = objResults.Headers.Get(i);
lblResults.Text += strKey + "<br />" + strValue + "<br /><br />";
}
My results:
Cache-Control
private, max-age=0
Content-Type
text/html; charset=ISO-8859-1
Date
Tue, 05 Jun 2012 17:40:38 GMT
Expires
-1
Set-Cookie
PREF=ID=526197b0260fd361:FF=0:TM=1338918038:LM=1338918038:S=gefqgwkuzuPJlO3G; expires=Thu, 05-Jun-2014 17:40:38 GMT; path=/; domain=.google.com,NID=60=CJbpzMe6uTKf58ty7rysqUFTW6GnsQHZ-Uat_cFf1AuayffFtJoFQSIwT5oSQKqQp5PSIYoYtBf_8oSGh_Xsk1YtE7Z834Qwn0A4Sw3ruVCA9v3f_UDYH4b4fAloFJbW; expires=Wed, 05-Dec-2012 17:40:38 GMT; path=/; domain=.google.com; HttpOnly
P3P
CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server
gws
X-XSS-Protection
1; mode=block
X-Frame-Options
SAMEORIGIN
Transfer-Encoding
chunked
=========================
Fiddler results:
Result Protocol Host URL Body Caching Content-Type Process Comments Custom
1 304 HTTP www.rolandgarros.com /images/misc/weather/P8.gif 0 max-age=700 Expires: Tue, 05 Jun 2012 17:53:40 GMT image/gif firefox:5456
2 200 HTTP www.google.com / 23,697 private, max-age=0 Expires: -1 text/html; charset=UTF-8 chrome:2324
3 304 HTTP www.rolandgarros.com /images/misc/weather/P9.gif 0 max-age=700 Expires: Tue, 05 Jun 2012 17:53:57 GMT image/gif firefox:5456
4 200 HTTP Tunnel to translate.googleapis.com:443 0 chrome:2324
5 200 HTTP www.google.com
The difference is Fiddler is actually recording an entire session, not just a single HTTP request.
If a user loads Google.com, the response is typically an HTML document which contains images, script files, CSS files, etc. Your browser will then initiate a new HTTP request for each one of those resources. With Fiddler running, it tracks each of those HTTP requests and spits out the result code and other information about the session.
With your C# code above, you're only initiating a single HTTP request, thus you only have information about a single result.
You'd probably be better off writing a browser plugin. Otherwise, you'd have to parse the HTML response and load other resources from that document as well.
If you do need to do this with C# code, you could parse the document with the HTML Agility Pack and then look for other resources within the HTML to simulate a browser. There are also embedded browsers, such as Awesomium, that might be helpful.
You are not asking for the same information that Fiddler is displaying. Fiddler shows the HTTP Status code, the host and URI and (it appears, from your example) the Content Length, Content Type and Cache status.
For many of these you will have to look into the response headers.
