C# Stream Response from 3rd party, minimal buffering - c#

Our ASP.NET MVC endpoint acts as a proxy to a third-party HTTP endpoint, which returns a dynamically generated XML document of about 400 MB.
Is there a way for ASP.NET MVC to "stream" that third-party response straight to the caller of our endpoint with "minimal" buffering?
At the moment, it looks like System.Web.Mvc.Controller.File() loads the whole file into memory before sending the response.
I'm not sure how to confirm this, other than by the jump in memory usage:
the IIS AppPool memory usage increases by 400 MB, which is later reclaimed by garbage collection.
It would be nice to avoid System.Web.Mvc.Controller.File() loading the whole 400 MB string into memory, by streaming it "almost directly" from the incoming response.
Is that possible?
The mock C# LINQPad code is roughly like this:
public class MyResponseItem {
public Stream myStream;
public string metadata;
}
void Main()
{
Stream stream = MyEndPoint();
//now let user download this XML as System.Web.Mvc.FileResult
System.Web.Mvc.ActionResult fileResult = System.Web.Mvc.Controller.File(stream, "text/xml");
fileResult.Dump();
}
Stream MyEndPoint() {
MyResponseItem myResponse = GetStreamFromThirdParty("https://www.google.com");
return myResponse.myStream;
}
MyResponseItem GetStreamFromThirdParty(string fullUrl)
{
MyResponseItem myResponse = new MyResponseItem();
System.Net.WebResponse webResponse = System.Net.WebRequest.Create(fullUrl).GetResponse();
myResponse.myStream = webResponse.GetResponseStream();
return myResponse;
}

You can reduce the memory footprint by turning off response buffering and copying the source stream directly to the output stream. A quick-and-dirty example:
public async Task<ActionResult> Download()
{
    using (var httpClient = new System.Net.Http.HttpClient())
    {
        using (var stream = await httpClient.GetStreamAsync(
            "https://ckannet-storage.commondatastorage.googleapis.com/2012-10-22T184507/aft4.tsv.gz"))
        {
            Response.ContentType = "application/octet-stream";
            Response.Buffer = false;
            Response.BufferOutput = false;
            await stream.CopyToAsync(Response.OutputStream);
        }
        return new HttpStatusCodeResult(200);
    }
}
If you want to reduce the footprint even more, you can pass a smaller buffer size to the CopyToAsync(Stream, Int32) overload; the default is 81920 bytes.
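As a rough illustration of that overload, here is a minimal sketch. The class name StreamRelay, the method name RelayAsync and the 16 KB default are assumptions made up for this example; only Stream.CopyToAsync(Stream, Int32) comes from the framework.

```csharp
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not part of the answer above.
public static class StreamRelay
{
    // Copies source to destination using the given intermediate buffer size.
    // A smaller buffer lowers per-request memory at the cost of more read/write calls.
    public static Task RelayAsync(Stream source, Stream destination, int bufferSize = 16 * 1024)
    {
        return source.CopyToAsync(destination, bufferSize);
    }
}
```

In the controller above, the equivalent change would be passing a second argument to the existing call, for example stream.CopyToAsync(Response.OutputStream, 16 * 1024).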

My proxy download also needs to forward the source Content-Type (or any other header you need). For example, if I proxy-download a video from http://techslides.com/sample-webm-ogg-and-mp4-video-files-for-html5, the user should see the same in-browser video player as when opening the link directly, rather than a file download caused by a hard-coded Content-Type.
Based on the answer by @devlead and another post, https://stackoverflow.com/a/30164356/4684232, I adjusted the answer a little to fit my need. Here's my adjusted code in case anyone has the same need.
public async Task<ActionResult> Download(string url)
{
    using (var httpClient = new System.Net.Http.HttpClient())
    {
        // ResponseHeadersRead returns as soon as the headers arrive, so the body can be streamed.
        using (var response = await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (var stream = await response.Content.ReadAsStreamAsync())
            {
                // Forward the source Content-Type when the source sent one.
                var contentType = response.Content.Headers.ContentType;
                if (contentType != null)
                {
                    Response.ContentType = contentType.ToString();
                }
                Response.Buffer = false;
                Response.BufferOutput = false;
                await stream.CopyToAsync(Response.OutputStream);
            }
        }
        return new HttpStatusCodeResult(200);
    }
}
P.S. HttpCompletionOption.ResponseHeadersRead is the key to performance here. Without it, GetAsync does not complete until the whole source response has been downloaded and buffered, which is much slower.
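If more than the Content-Type has to be forwarded, one option is to collect the wanted headers first and apply them in the controller afterwards. The sketch below is only an illustration: the class ProxyHeaders, the method Collect and the choice of headers (Content-Type and Content-Disposition) are assumptions, not something taken from the answers above.

```csharp
using System.Collections.Generic;
using System.Net.Http;

// Hypothetical helper for illustration only.
public static class ProxyHeaders
{
    // Collects the content headers a proxy might want to relay.
    // Headers the source did not send are skipped instead of throwing.
    public static IDictionary<string, string> Collect(HttpResponseMessage upstream)
    {
        var result = new Dictionary<string, string>();
        if (upstream.Content == null)
        {
            return result;
        }

        var headers = upstream.Content.Headers;
        if (headers.ContentType != null)
        {
            result["Content-Type"] = headers.ContentType.ToString();
        }
        if (headers.ContentDisposition != null)
        {
            result["Content-Disposition"] = headers.ContentDisposition.ToString();
        }
        return result;
    }
}
```

In the MVC action, the "Content-Type" entry (when present) would go to Response.ContentType and the remaining entries could be added with Response.AddHeader before the body is copied.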

Related

Send a 'Stream' over a PutAsync request

I'm trying my hand at .NET Core but I'm stuck trying to convert multipart/form-data to an application/octet-stream to send via a PUT request. Anybody have any expertise I could borrow?
[HttpPost("fooBar"), ActionName("FooBar")]
public async Task<IActionResult> PostFooBar() {
HttpResponseMessage putResponse = await _httpClient.PutAsync(url, HttpContext.Request.Body);
}
Update: I think I might have two issues here:
My input format is multipart/form-data so I need to split out the file from the form data.
My output format must be application/octet-stream, but PutAsync expects HttpContent.
I had been trying to do something similar and having issues. I needed to PUT large files (>1.5GB) to a bucket on Amazon S3 using a pre-signed URL. The implementation on Amazon for .NET would fail for large files.
Here was my solution:
// One shared client; the long timeout allows for very large uploads.
static readonly HttpClient client = new HttpClient { Timeout = TimeSpan.FromMinutes(60) };

static async Task<bool> UploadLargeObjectAsync(string presignedUrl, string file)
{
    Console.WriteLine("Uploading " + file + " to bucket...");
    try
    {
        // StreamContent reads the file while sending; disposing it also closes the file stream.
        using (var strm = new StreamContent(new FileStream(file, FileMode.Open, FileAccess.Read)))
        {
            strm.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/octet-stream");
            HttpResponseMessage putRespMsg = await client.PutAsync(presignedUrl, strm);
            // Report failure when the server rejects the upload.
            return putRespMsg.IsSuccessStatusCode;
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
        return false;
    }
}
Turns out Request has a Form property whose Files collection holds the uploaded files, and each file has an OpenReadStream() method that exposes it as a stream. How exactly I was supposed to know that, I'm not sure.
Either way, here's the solution:
StreamContent stream = new StreamContent(HttpContext.Request.Form.Files[0].OpenReadStream());
HttpResponseMessage putResponse = await _httpClient.PutAsync(url, stream);

Streaming HttpResponse through Owin

I have an HttpResponseMessage object as the result of an HttpClient.SendAsync() call. The response uses chunked transfer encoding and carries about 1.5 GB of data.
I want to pass this data through OWIN pipeline. To do this I need to convert it to a stream. Simplified code to do this is:
public async Task Invoke(IDictionary<string, object> environment)
{
var httpContent = GetHttpContent();
var responseStream = (Stream)environment["owin.ResponseBody"];
await httpContent.CopyToAsync(responseStream);
}
However, the last line copies the entire stream into memory. When I use wget to download the data directly from the backend server, it downloads successfully and shows a progress bar (although it doesn't know the overall size, since the response is chunked). But when I use wget to download the data from my OWIN-hosted application, it hangs after sending the request.
How should I stream this data through an OWIN pipeline to prevent copying it to memory?
EDIT
This is how I get the HttpResponse:
var client = new HttpClient(new HttpClientHandler());
// …and then:
using (var request = new HttpRequestMessage { RequestUri = uri, Method = HttpMethod.Get })
{
return client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).Result;
}
I assume this is in IIS? System.Web also buffers responses: https://msdn.microsoft.com/en-us/library/system.web.httpresponse.bufferoutput(v=vs.110).aspx
See server.DisableResponseBuffering in https://katanaproject.codeplex.com/wikipage?title=OWIN%20Keys&referringTitle=Documentation
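A minimal sketch of how that key might be used, assuming the host puts an Action under "server.DisableResponseBuffering" in the OWIN environment (hosts that do not support it simply will not have the key). The class OwinStreaming and the method RelayAsync are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class OwinStreaming
{
    // Asks the host to stop buffering the response (when the hook is available),
    // then copies the upstream content into the OWIN response body.
    public static async Task RelayAsync(IDictionary<string, object> environment, HttpContent upstream)
    {
        object value;
        if (environment.TryGetValue("server.DisableResponseBuffering", out value))
        {
            // Assumption: the value is an Action, as listed in the OWIN keys page linked above.
            var disableBuffering = value as Action;
            if (disableBuffering != null)
            {
                disableBuffering();
            }
        }

        var responseBody = (Stream)environment["owin.ResponseBody"];
        await upstream.CopyToAsync(responseBody).ConfigureAwait(false);
    }
}
```

Inside the Invoke method from the question, this would replace the direct httpContent.CopyToAsync(responseStream) call.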

Best way to implement async Http request that returns string content

In my app I need to make a lot of parallel HTTP requests, and I have read that the proper way to do this is with async/await. For each request I need the response content as a string (often the HTML of some site). My question is: what is the best way to do this?
My current implementation:
public static async Task<string> GetStringContentAsync(HttpWebRequest webRequest)
{
try
{
using (var response = (HttpWebResponse) await webRequest.GetResponseAsync()
.ConfigureAwait(false))
{
var content = await GetStringContentFromResponseAsync(response)
.ConfigureAwait(false);
return content;
}
}
catch (Exception exception)
{
return null;
}
}
private static async Task<string> GetStringContentFromResponseAsync(HttpWebResponse response)
{
using (var responseStream = GetResponseStream(response))
{
if (responseStream == null)
return null;
using (var streamReader = new StreamReader(responseStream))
{
var content = await streamReader.ReadToEndAsync()
.ConfigureAwait(false);
return content;
}
}
}
private static Stream GetResponseStream(HttpWebResponse webResponse)
{
var responseStream = webResponse.GetResponseStream();
if (responseStream == null)
return null;
Stream stream;
switch (webResponse.ContentEncoding.ToUpperInvariant())
{
case "GZIP":
stream = new GZipStream(responseStream, CompressionMode.Decompress);
break;
case "DEFLATE":
stream = new DeflateStream(responseStream, CompressionMode.Decompress);
break;
default:
stream = responseStream;
break;
}
return stream;
}
And an example of usage:
var httpWebRequest = (HttpWebRequest) WebRequest.Create("http://stackoverflow.com/");
var content = await HttpHelper.GetStringContentAsync(httpWebRequest)
.ConfigureAwait(false);
Is this a correct implementation, or can something be improved? Maybe I'm adding overhead by using async/await when reading the stream?
The reason for my question is that when I use my code like this:
for (var i = 0; i < 1000; i++)
{
    Task.Run(async () =>
    {
        var httpWebRequest = (HttpWebRequest) WebRequest.Create("http://google.com/");
        var content = await HttpHelper.GetStringContentAsync(httpWebRequest)
            .ConfigureAwait(false);
    });
}
these tasks take too long to execute, even though a single request to Google is very fast. I expected the async requests in this example to finish at almost the same time, and that time to be close to the time of one Google request.
EDIT:
I forgot to say that I know about ServicePointManager.DefaultConnectionLimit and set it to 5000 in my app, so that is not the problem.
I can't use HttpClient because my final goal is to make 100-300 requests at a time through different proxies. If I understand correctly, HttpClient can work with only one proxy at a time and can't be configured separately for each request.
That's a tricky one. It's good that you already know about DefaultConnectionLimit, but there is one more interesting and rather surprising thing:
httpRequest.ServicePoint.ConnectionLeaseTimeout
httpRequest.ServicePoint.MaxIdleTime
Information is here; your latencies might be caused by the default behavior of these settings, with connections being held by the ServicePoint while the next request is being made.
Here's the answer to your issue: https://msdn.microsoft.com/en-us/library/86wf6409(v=vs.90).aspx
Using synchronous calls in asynchronous callback methods can result in severe performance penalties. Internet requests made with WebRequest and its descendants must use Stream.BeginRead to read the stream returned by the WebResponse.GetResponseStream method.
That means absolutely no synchronous code when reading the response stream. But even that isn't enough, as DNS lookups and establishing the TCP connection are still blocking. If you can use .NET 4.5 (or .NET 4.0 with the Microsoft.Net.Http NuGet package), there's the much easier to use System.Net.Http.HttpClient class. Otherwise, you can use System.Threading.ThreadPool, which is the workaround I ended up using on 3.5:
ThreadPool.QueueUserWorkItem((o) => {
// make a synchronous request via HttpWebRequest
});

Is HttpClient flawed when sending large files/content?

After reading/googling about HttpClient, I have the impression that this component is not suitable for uploading large files or contents to REST services.
It seems that if the upload takes longer than the configured timeout, the transmission will fail. Does that make sense? What does this timeout mean?
Getting progress information seems hard or requires add-ons.
So my questions are: is it possible to solve these two issues without too much hassle? Otherwise, what's the best approach when working with large contents and REST services?
Yes, if the upload takes longer than the Timeout, the upload will fail. This is a limitation of HttpClient. The most robust solution to this problem is the one Thomas Levesque wrote an article about and linked in his comments on your question: use HttpWebRequest instead of HttpClient.
If you want to get progress messages, open the file as a FileStream and manually iterate through it, copying bytes in increments onto the (upload) request stream. As you go, you can calculate your progress relative to the file size.
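The manual copy loop described above could look roughly like the sketch below. The class ProgressCopy, the method Copy and the callback shape (a fraction between 0 and 1) are assumptions chosen for this illustration; in a real upload the destination would be the request stream and totalLength the file length.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class ProgressCopy
{
    // Copies source to destination in bufferSize chunks and reports the
    // fraction copied so far after each chunk. Returns the bytes copied.
    public static long Copy(Stream source, Stream destination, long totalLength,
                            Action<double> onProgress, int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        long copied = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            copied += read;
            // Skip reporting when the total size is unknown.
            if (totalLength > 0 && onProgress != null)
            {
                onProgress((double)copied / totalLength);
            }
        }
        return copied;
    }
}
```

With a FileStream as the source, totalLength would simply be fileStream.Length, and the callback could update a progress bar or write to a log.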
Thomas Levesque's code example (be sure to read the article, though):
long UploadFile(string path, string url, string contentType)
{
// Build request
var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = WebRequestMethods.Http.Post;
request.AllowWriteStreamBuffering = false;
request.ContentType = contentType;
string fileName = Path.GetFileName(path);
request.Headers["Content-Disposition"] = string.Format("attachment; filename=\"{0}\"", fileName);
try
{
// Open source file
using (var fileStream = File.OpenRead(path))
{
// Set content length based on source file length
request.ContentLength = fileStream.Length;
// Get the request stream with the default timeout
using (var requestStream = request.GetRequestStreamWithTimeout())
{
// Upload the file with no timeout
fileStream.CopyTo(requestStream);
}
}
// Get response with the default timeout, and parse the response body
using (var response = request.GetResponseWithTimeout())
using (var responseStream = response.GetResponseStream())
using (var reader = new StreamReader(responseStream))
{
string json = reader.ReadToEnd();
var j = JObject.Parse(json);
return j.Value<long>("Id");
}
}
catch (WebException ex)
{
if (ex.Status == WebExceptionStatus.Timeout)
{
LogError(ex, "Timeout while uploading '{0}'", fileName);
}
else
{
LogError(ex, "Error while uploading '{0}'", fileName);
}
throw;
}
}

Asp.net Web Api Streaming

I have been trying to stream a file to my web service. In my Controller(ApiController) I have a Post function as follows:
public void Post(Stream stream)
{
if (stream != null && stream.Length > 0)
{
_websitesContext.Files.Add(new DbFile() { Filename = Guid.NewGuid().ToString(), FileBytes= ToBytes(stream) });
_websitesContext.SaveChanges();
}
}
I have been trying to stream a file with my web client by doing the following:
public void UploadFileStream(HttpPostedFileBase file)
{
WebClient myWebClient = new WebClient();
Stream postStream = myWebClient.OpenWrite(GetFileServiceUrl(), "POST");
var buffer = ToBytes(file.InputStream);
postStream.Write(buffer, 0,buffer.Length);
postStream.Close();
}
Now when I debug my web service, it gets into the Post function, but stream is always null. Does anyone have an idea why this is happening?
Web API doesn't model bind to the Stream type, hence the behavior you are seeing. You could instead capture the incoming request stream by calling Request.Content.ReadAsStreamAsync().
Example:
public async Task<HttpResponseMessage> UploadFile(HttpRequestMessage request)
{
Stream requestStream = await request.Content.ReadAsStreamAsync();
Note: you need not even have HttpRequestMessage as a parameter as you could always access this request message via the "Request" property available via ApiController.
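To show the idea end to end without a running Web API host, here is a small sketch that reads the raw body of an HttpRequestMessage into a byte array, which is roughly what the ToBytes(stream) call in the question needs. The class RequestBody and the method ReadAllAsync are hypothetical names used only for this example.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class RequestBody
{
    // Reads the request body through ReadAsStreamAsync and returns its bytes.
    // A request without content yields an empty array.
    public static async Task<byte[]> ReadAllAsync(HttpRequestMessage request)
    {
        if (request.Content == null)
        {
            return new byte[0];
        }

        using (Stream body = await request.Content.ReadAsStreamAsync().ConfigureAwait(false))
        using (var copy = new MemoryStream())
        {
            await body.CopyToAsync(copy).ConfigureAwait(false);
            return copy.ToArray();
        }
    }
}
```

Inside an ApiController action, the call would be RequestBody.ReadAllAsync(Request), using the controller's Request property mentioned in the note above.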
You can replace it with this code:
var uri = new Uri(GetFileServiceUrl());
Stream postStream = myWebClient.OpenWrite(uri.AbsoluteUri, "POST");
RestSharp makes this sort of stuff quite easy to do. Recommend trying it out.
