Cyrillic symbols in HttpClient POST request for upload filename - c#

In one of my .NET applications I've got a method which uploads a file to a site via HttpClient. Here is the implementation:
using (var clientHandler = new HttpClientHandler
{
    CookieContainer = cookieContainer,
    UseDefaultCredentials = true
})
{
    using (var client = new HttpClient(clientHandler))
    {
        client.BaseAddress = requestAddress;
        client.DefaultRequestHeaders.Accept.Clear();
        using (var content = new MultipartFormDataContent())
        {
            var streamContent = new StreamContent(new MemoryStream(fileData));
            streamContent.Headers.ContentDisposition = ContentDispositionHeaderValue.Parse("form-data");
            streamContent.Headers.ContentDisposition.Parameters.Add(new NameValueHeaderValue("name", "contentFile"));
            streamContent.Headers.ContentDisposition.Parameters.Add(new NameValueHeaderValue("filename", "\"" + fileName + "\""));
            streamContent.Headers.ContentType = new MediaTypeHeaderValue(contentType);
            content.Add(streamContent);

            HttpResponseMessage response = await client.PostAsync("/Files/UploadFile", content);
            if (response.IsSuccessStatusCode)
            {
                return true;
            }
            return false;
        }
    }
}
The method works fine, but when I pass Cyrillic characters in fileName, the filename in the generated POST request is corrupted, e.g. ????1.docx, where each ? replaces a Cyrillic character. Is there any way to send Cyrillic characters without corruption?

I believe the plain filename parameter is very limited in what it can carry in terms of code page. I guess it only supports ASCII (not 100% sure). There is a better parameter you can use called filename*, which is a bit hard to google for, since Google just strips the * and you get results for the ordinary filename parameter instead.
Long story short, you need to use this:
$"filename*=UTF-8''{fileName}"
You might also need to encode the filename with regard to spaces, etc. You can google a bit more on that and check this SO post.
P.S. Some older browsers might not like it; check your requirements.
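As a concrete illustration of the filename* approach applied to the code above, here is a minimal sketch. It assumes the standard System.Net.Http.Headers types; the FileNameStar property emits the RFC 5987 encoded filename* parameter, though whether it is honored still depends on the receiving server.
var streamContent = new StreamContent(new MemoryStream(fileData));
streamContent.Headers.ContentDisposition = new ContentDispositionHeaderValue("form-data")
{
    Name = "contentFile",
    // FileNameStar is sent as an RFC 5987 encoded parameter,
    // e.g. filename*=utf-8''%D0%BE%D1%82..., so Cyrillic characters
    // survive instead of being replaced by '?'.
    FileNameStar = fileName
};
streamContent.Headers.ContentType = new MediaTypeHeaderValue(contentType);
content.Add(streamContent);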

Related

.NET 6 Function App - how to return a HTML page from a file

I'm currently in the process of migrating several Azure Function Apps to .NET 6. One of these involves returning various content files via an HTTP request.
Previously (on .NET Core 3.1) this worked fine for both JSON/text files and HTML:
var callbackFileLocation = Path.Combine(Helper.GetFunctionPath(), "Files", filename);
var stream = new FileStream(callbackFileLocation, FileMode.Open, FileAccess.Read)
{
    Position = 0
};
var okObjectResult = new OkObjectResult(stream);
okObjectResult.ContentTypes.Clear();
if (filename.Contains(".html"))
{
    okObjectResult.ContentTypes.Add(new Microsoft.Net.Http.Headers.MediaTypeHeaderValue("text/html"));
}
else
{
    okObjectResult.ContentTypes.Add(new Microsoft.Net.Http.Headers.MediaTypeHeaderValue("application/json"));
}
return okObjectResult;
This doesn't return the same results on .NET 6 - you tend to just get the object's type name back, e.g. Microsoft.AspNetCore.Mvc.OkObjectResult or System.IO.FileStream. It's easy enough to fix for the JSON files, as I can just convert them into text and make sure the function app returns that as the payload.
HTML seems trickier - I've tried reading the stream to the end, and various methods mentioned here and on other sites, e.g.:
public static HttpResponseMessage Run(string filename)
{
    var callbackFileLocation = Path.Combine(Helper.GetFunctionPath(), "Files", filename);
    var response = new HttpResponseMessage(HttpStatusCode.OK);
    var stream = new FileStream(callbackFileLocation, FileMode.Open);
    response.Content = new StreamContent(stream);
    response.Content.Headers.ContentType = new MediaTypeHeaderValue("text/html");
    return response;
}
Or returning the HTML text within FileContentResult ("application/octet-stream") or ContentResult, e.g:
new ContentResult { Content = content, ContentType = "text/html", StatusCode = 200 };
The closest I've got is the HTML as raw text, but I want the HTML rendered in the browser.
Any suggestions? Documentation around this on .NET 6 seems thin...thanks!
Not the best answer (and thanks for the help, @Jonas Weinhardt), but I couldn't find a way to do this with the dotnet-isolated process model.
It worked fine when I moved back to non-isolated (in-process). (I guess it's something to do with the new gRPC functions or something like that?)
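For reference, here is a minimal sketch of one way this can be returned in the in-process model that the answer fell back to. FileStreamResult writes the stream body with the given content type instead of serializing the result object; Helper.GetFunctionPath() and filename are taken from the question.
var callbackFileLocation = Path.Combine(Helper.GetFunctionPath(), "Files", filename);
var stream = new FileStream(callbackFileLocation, FileMode.Open, FileAccess.Read);
// Pick the content type from the file name, then let MVC stream the body with it.
var contentType = filename.EndsWith(".html", StringComparison.OrdinalIgnoreCase)
    ? "text/html"
    : "application/json";
return new FileStreamResult(stream, contentType);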

Upload raw byte data with dotnet core HttpClient

I am updating some old C# code to use HttpClient instead of WebClient. As part of this, I need to upload a byte array of a file to an api.
With WebClient, this worked perfectly fine
byte[] data = GetMyData();
using (var client = new WebClient())
{
    //set url, headers etc
    var r = client.UploadData(url, "PUT", data);
}
With HttpClient, I've tried various methods, such as
byte[] data = GetMyData();
using (var client = new HttpClient())
{
    //set url, headers etc
    var r = await client.PutAsync(url, new ByteArrayContent(data));
}
I've also tried different ways of using multipart data that I found googling around, but the server does not accept anything I've tried. I don't have a lot of documentation on the server API; I only know that the WebClient way has worked well for many years. Is there a way to recreate the WebClient.UploadData behavior with HttpClient?
Thanks to the commenters for putting me on the right track. The Content-Type header was not being set correctly in the HttpClient version; it needs to be set on the actual content. Code below.
byte[] data = GetMyData();
using (var client = new HttpClient())
{
    //set url, headers etc
    var content = new ByteArrayContent(data);
    content.Headers.ContentType = new MediaTypeWithQualityHeaderValue(contentType);
    var r = await client.PutAsync(url, content);
}

Send JSON list via GET Request

I am attempting to speed up the process of my local software sync. Right now we send a GET request for each individual record that we need, and the API sends back a JSON string containing that record's data, which is then inserted into the local database. This all works, but it can be tediously slow. I am trying to speed this up, and was hoping a good way to do so would be to send a JSON-serialized List<Dictionary<string, string>>. This would let me request much more data in one shot: the API side can gather the records, add them to the list, and pass it back as JSON to the local machine.
Right now on the local side I have:
Encoding enc = System.Text.Encoding.GetEncoding(1252);
using (HttpClient client = new HttpClient())
{
    string basicAuth = Convert.ToBase64String(ASCIIEncoding.ASCII.GetBytes(string.Format("{0}:{1}", usr, pwd)));
    client.BaseAddress = new Uri(baseUrl);
    client.DefaultRequestHeaders.Clear();
    client.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Basic", basicAuth);
    client.DefaultRequestHeaders.Accept.Add(new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("application/json"));

    string requested = JsonConvert.SerializeObject(tableList);
    HttpResponseMessage response = client.GetAsync(syncUrl + hash + "/" + requested).Result;
    if (!response.IsSuccessStatusCode)
    {
        // get the error
        StreamReader errorStream = new StreamReader(response.Content.ReadAsStreamAsync().Result, enc);
        throw new Exception(errorStream.ReadToEnd());
    }
}
My Controller call looks like this:
[System.Web.Http.AcceptVerbs("GET")]
[Route("getRecords/{hash}/{requested}")]
public HttpResponseMessage getRecords(string hash, string requested)
Whenever I make this call it gives me an error that it cannot find the URI and I don't even hit my breakpoint on my API. How do I get this to work, or is there a better way to accomplish what I'm doing?
You need to URL-encode the data. If it contains any special URL characters (like an ampersand or a slash), they will make the request unusable, so you must URL-encode the JSON so it is correctly formatted.
Use something like...
string requested = Uri.EscapeDataString(JsonConvert.SerializeObject(tableList));
This will encode the special characters so they can be transferred safely in the URL.
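For completeness, a rough sketch of how the two sides might fit together once the JSON is escaped. The controller body here is hypothetical; only the signature comes from the question.
// Local side: escape the serialized list before appending it to the URL.
string requested = Uri.EscapeDataString(JsonConvert.SerializeObject(tableList));
HttpResponseMessage response = client.GetAsync(syncUrl + hash + "/" + requested).Result;

// API side: route values arrive URL-decoded, so the parameter is plain JSON again.
[System.Web.Http.AcceptVerbs("GET")]
[Route("getRecords/{hash}/{requested}")]
public HttpResponseMessage getRecords(string hash, string requested)
{
    // Hypothetical body: turn the JSON back into the list the client sent.
    var tables = JsonConvert.DeserializeObject<List<Dictionary<string, string>>>(requested);
    // ... gather the matching records and return them as JSON ...
    return Request.CreateResponse(HttpStatusCode.OK, tables);
}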

Upload to S3 from HTTPWebResponse.GetResponseStream() in c#

I am trying to upload from an HTTP stream directly to S3, without storing it in memory or as a file first. I am already doing this with Rackspace Cloud Files as an HTTP-to-HTTP copy, but the AWS authentication is beyond me, so I am trying to use the SDK.
The problem is the upload stream is failing with this exception:
"This stream does not support seek operations."
I've tried with PutObject and TransferUtility.Upload, both fail with the same thing.
Is there any way to stream into S3 as the stream comes in, rather than buffering the whole thing to a MemoryStream or FileStream?
Or are there any good examples of doing the authentication for an S3 request using HttpWebRequest, so I can duplicate what I do with Cloud Files?
Edit: or is there a helper function in the AWSSDK for generating the authorization header?
CODE:
This is the failing S3 part (both methods included for completeness):
string uri = RSConnection.StorageUrl + "/" + container + "/" + file.SelectSingleNode("name").InnerText;
var req = (HttpWebRequest)WebRequest.Create(uri);
req.Headers.Add("X-Auth-Token", RSConnection.AuthToken);
req.Method = "GET";
using (var resp = req.GetResponse() as HttpWebResponse)
{
    using (Stream stream = resp.GetResponseStream())
    {
        Amazon.S3.Transfer.TransferUtility trans = new Amazon.S3.Transfer.TransferUtility(S3Client);
        trans.Upload(stream, config.Element("root").Element("S3BackupBucket").Value, container + file.SelectSingleNode("name").InnerText);

        //Use EITHER the above OR the below
        PutObjectRequest putReq = new PutObjectRequest();
        putReq.WithBucketName(config.Element("root").Element("S3BackupBucket").Value);
        putReq.WithKey(container + file.SelectSingleNode("name").InnerText);
        putReq.WithInputStream(Amazon.S3.Util.AmazonS3Util.MakeStreamSeekable(stream));
        putReq.WithMetaData("content-length", file.SelectSingleNode("bytes").InnerText);
        using (S3Response putResp = S3Client.PutObject(putReq))
        {
        }
    }
}
And this is how I do it successfully from S3 to Cloud Files:
using (GetObjectResponse getResponse = S3Client.GetObject(new GetObjectRequest().WithBucketName(bucket.BucketName).WithKey(file.Key)))
{
    using (Stream s = getResponse.ResponseStream)
    {
        //We can stream right from S3 to CF, no need to store in memory or filesystem.
        var req = (HttpWebRequest)WebRequest.Create(uri);
        req.Headers.Add("X-Auth-Token", RSConnection.AuthToken);
        req.Method = "PUT";
        req.AllowWriteStreamBuffering = false;
        if (req.ContentLength == -1L)
            req.SendChunked = true;

        using (Stream stream = req.GetRequestStream())
        {
            byte[] data = new byte[32768];
            int bytesRead = 0;
            while ((bytesRead = s.Read(data, 0, data.Length)) > 0)
            {
                stream.Write(data, 0, bytesRead);
            }
            stream.Flush();
            stream.Close();
        }
        req.GetResponse().Close();
    }
}
As no-one answering seems to have done it, I spent the time working it out based on guidance from Steve's answer:
In answer to this question "is there any good examples of doing the authentication into S3 request using HTTPWebRequest, so I can duplicate what I do with Cloud Files?", here is how to generate the auth header manually:
string today = String.Format("{0:ddd,' 'dd' 'MMM' 'yyyy' 'HH':'mm':'ss' 'zz00}", DateTime.Now);
string stringToSign = "PUT\n" +
"\n" +
file.SelectSingleNode("content_type").InnerText + "\n" +
"\n" +
"x-amz-date:" + today + "\n" +
"/" + strBucketName + "/" + strKey;
Encoding ae = new UTF8Encoding();
HMACSHA1 signature = new HMACSHA1(ae.GetBytes(AWSSecret));
string encodedCanonical = Convert.ToBase64String(signature.ComputeHash(ae.GetBytes(stringToSign)));
string authHeader = "AWS " + AWSKey + ":" + encodedCanonical;
string uriS3 = "https://" + strBucketName + ".s3.amazonaws.com/" + strKey;
var reqS3 = (HttpWebRequest)WebRequest.Create(uriS3);
reqS3.Headers.Add("Authorization", authHeader);
reqS3.Headers.Add("x-amz-date", today);
reqS3.ContentType = file.SelectSingleNode("content_type").InnerText;
reqS3.ContentLength = Convert.ToInt32(file.SelectSingleNode("bytes").InnerText);
reqS3.Method = "PUT";
Note the added x-amz-date header, as HttpWebRequest sends the Date header in a different format from what AWS expects.
From there it was just a case of repeating what I was already doing.
Take a look at Amazon S3 Authentication Tool for Curl. From that web page:
Curl is a popular command-line tool for interacting with HTTP services. This Perl script calculates the proper signature, then calls Curl with the appropriate arguments.
You could probably adapt it or its output for your use.
I think the problem is that, according to the AWS documentation, Content-Length is required, and you don't know what the length is until the stream has finished.
(I would guess the Amazon.S3.Util.AmazonS3Util.MakeStreamSeekable routine is reading the whole stream into memory to get around this problem which makes it unsuitable for your scenario.)
What you can do is read the file in chunks and upload them using MultiPart upload.
P.S. I assume you know the C# source for the AWS SDK for .NET is on GitHub.
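To make the chunked MultiPart suggestion concrete, here is a rough sketch. Note it is written against the property-style, async API of the current AWS SDK for .NET rather than the older With* setters used in the question; bucketName, key and sourceStream are placeholders for the values in the original code.
// S3 requires every part except the last to be at least 5 MB.
const int partSize = 5 * 1024 * 1024;

var init = await s3Client.InitiateMultipartUploadAsync(
    new InitiateMultipartUploadRequest { BucketName = bucketName, Key = key });

var partETags = new List<PartETag>();
var buffer = new byte[partSize];
int partNumber = 1;

while (true)
{
    // Fill the buffer from the non-seekable HTTP response stream.
    int filled = 0, read;
    while (filled < partSize && (read = sourceStream.Read(buffer, filled, partSize - filled)) > 0)
    {
        filled += read;
    }
    if (filled == 0) break;

    // Each part is buffered in memory, so the source stream never needs to seek.
    var partResponse = await s3Client.UploadPartAsync(new UploadPartRequest
    {
        BucketName = bucketName,
        Key = key,
        UploadId = init.UploadId,
        PartNumber = partNumber,
        InputStream = new MemoryStream(buffer, 0, filled)
    });
    partETags.Add(new PartETag(partNumber, partResponse.ETag));
    partNumber++;

    if (filled < partSize) break;   // a short read means the source is exhausted
}

await s3Client.CompleteMultipartUploadAsync(new CompleteMultipartUploadRequest
{
    BucketName = bucketName,
    Key = key,
    UploadId = init.UploadId,
    PartETags = partETags
});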
This is a true hack (which would probably break with a new implementation of the AWSSDK), and it requires knowing the length of the file being requested, but if you wrap the response stream with this class (a gist) as shown below:
long length = fileLength;
You can get the file length in several ways. I am uploading from a Dropbox link, so they give me the length along with the URL. Alternatively, you can perform a HEAD request and get the Content-Length.
string uri = RSConnection.StorageUrl + "/" + container + "/" + file.SelectSingleNode("name").InnerText;
var req = (HttpWebRequest)WebRequest.Create(uri);
req.Headers.Add("X-Auth-Token", RSConnection.AuthToken);
req.Method = "GET";
using (var resp = req.GetResponse() as HttpWebResponse)
{
    using (Stream stream = resp.GetResponseStream())
    {
        //I haven't tested this path
        Amazon.S3.Transfer.TransferUtility trans = new Amazon.S3.Transfer.TransferUtility(S3Client);
        trans.Upload(new HttpResponseStream(stream, length), config.Element("root").Element("S3BackupBucket").Value, container + file.SelectSingleNode("name").InnerText);

        //Use EITHER the above OR the below
        //I have tested this with dropbox data
        PutObjectRequest putReq = new PutObjectRequest();
        putReq.WithBucketName(config.Element("root").Element("S3BackupBucket").Value);
        putReq.WithKey(container + file.SelectSingleNode("name").InnerText);
        putReq.WithInputStream(new HttpResponseStream(stream, length));
        //These are necessary for really large files to work
        putReq.WithTimeout(System.Threading.Timeout.Infinite);
        putReq.WithReadWriteTimeout(System.Threading.Timeout.Infinite);
        using (S3Response putResp = S3Client.PutObject(putReq))
        {
        }
    }
}
The hack is overriding the Position and Length properties: returning 0 from the Position getter, making the Position setter a no-op, and returning the known length from Length.
I recognize that this might not work if you don't have the length or if the server providing the source does not support HEAD requests and Content-Length headers. I also realize it might not work if the reported Content-Length or the supplied length doesn't match the actual length of the file.
In my test, I also supply the Content-Type to the PutObjectRequest, but I don't think that is necessary.
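The gist itself isn't reproduced here, but a wrapper along these lines would implement the hack described above (this is my own sketch of what such an HttpResponseStream class might look like, not the original gist):
class HttpResponseStream : Stream
{
    private readonly Stream _inner;   // the non-seekable HTTP response stream
    private readonly long _length;    // the externally known content length

    public HttpResponseStream(Stream inner, long length)
    {
        _inner = inner;
        _length = length;
    }

    public override bool CanRead => true;
    public override bool CanSeek => true;     // claimed so the SDK is happy, but not honored
    public override bool CanWrite => false;

    public override long Length => _length;
    public override long Position { get { return 0; } set { /* no-op */ } }

    public override int Read(byte[] buffer, int offset, int count)
        => _inner.Read(buffer, offset, count);

    public override long Seek(long offset, SeekOrigin origin) => 0;   // no-op
    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}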
As sgmoore said, the problem is that you cannot determine the content length from the HTTP response stream, since it is not seekable. However, HttpWebResponse does have a ContentLength property available, so you can actually form your HTTP PUT request to S3 yourself instead of using the Amazon library.
Here's another Stack Overflow question that managed to do that, with what looks like full code to me.

C# encoding Shift-JIS vs. utf8 html agility pack

I have a problem. My goal is to save some text from a (Japanese, Shift-JIS encoded) HTML page into a UTF-8 encoded text file.
But I don't really know how to encode the text. The HtmlNode object is encoded in Shift-JIS, but after I use the ToString() method, the content is corrupted.
My method so far looks like this:
public String getPage(String url)
{
    String content = "";
    HtmlDocument page = new HtmlWeb() { AutoDetectEncoding = true }.Load(url);
    HtmlNode anchor = page.DocumentNode.SelectSingleNode("//div[contains(@class, 'article-def')]");
    if (anchor != null)
    {
        content = anchor.InnerHtml.ToString();
    }
    return content;
}
I tried
Console.WriteLine(page.Encoding.EncodingName.ToString());
and got: Japanese Shift-JIS
But converting the HTML into a string produces the corrupted output. I thought there should be a way, but since documentation for Html Agility Pack is sparse and I couldn't really find a solution via Google, I'm here to get some hints.
Well, AutoDetectEncoding doesn't really work like you'd expect it to. From what I found by looking at the source code of the Agility Pack, the property is only used when loading a local file from disk, not from a URL.
So there are three options. One would be to just set the encoding:
OverrideEncoding = Encoding.GetEncoding("shift-jis")
If you know the encoding will always be the same, that's the easiest fix.
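A minimal sketch of that first option, assuming Html Agility Pack's HtmlWeb, combined with writing the extracted content out as UTF-8 (which is what the question ultimately wants; "output.txt" is just a placeholder):
var htmlWeb = new HtmlWeb { OverrideEncoding = Encoding.GetEncoding("shift-jis") };
HtmlDocument page = htmlWeb.Load(url);
HtmlNode anchor = page.DocumentNode.SelectSingleNode("//div[contains(@class, 'article-def')]");
if (anchor != null)
{
    // The node's text is now a correct .NET string, so writing it with a
    // UTF-8 encoding produces the desired UTF-8 text file.
    File.WriteAllText("output.txt", anchor.InnerText, Encoding.UTF8);
}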
Or you could download the file locally and load it the same way you do now, but instead of the URL you'd pass the file path:
using (var client = new WebClient())
{
    client.DownloadFile(url, "20130519-OYT1T00606.htm");
}
var htmlWeb = new HtmlWeb() { AutoDetectEncoding = true };
var file = new FileInfo("20130519-OYT1T00606.htm");
HtmlDocument page = htmlWeb.Load(file.FullName);
Or you can detect the encoding from your content like this:
byte[] pageBytes;
using (var client = new WebClient())
{
    pageBytes = client.DownloadData(url);
}

HtmlDocument page = new HtmlDocument();
using (var ms = new MemoryStream(pageBytes))
{
    page.Load(ms);
    var metaContentType = page.DocumentNode.SelectSingleNode("//meta[@http-equiv='Content-Type']").GetAttributeValue("content", "");
    var contentType = new System.Net.Mime.ContentType(metaContentType);
    ms.Position = 0;
    page.Load(ms, Encoding.GetEncoding(contentType.CharSet));
}
And finally, if the page you are querying returns the Content-Type in the response, you can look here for how to get the encoding.
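For illustration, one way to read the charset from the response headers with WebClient might look like this (a sketch, not the code from the linked answer):
byte[] pageBytes;
string charset;
using (var client = new WebClient())
{
    pageBytes = client.DownloadData(url);
    // ResponseHeaders is populated once the request has completed.
    var contentType = new System.Net.Mime.ContentType(client.ResponseHeaders["Content-Type"]);
    charset = contentType.CharSet;   // may be null if the server omits it
}

var page = new HtmlDocument();
page.Load(new MemoryStream(pageBytes), Encoding.GetEncoding(charset ?? "utf-8"));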
Your code would of course need a few more null checks than mine does. ;)
