C# Speed Up Writing from Response Stream to ViewStream

I have code that asynchronously splits a file into parts and downloads them using HTTP content ranges. It then writes the downloaded data to a ViewStream on a memory-mapped file. I currently read from the response stream into a buffer, then write all the data from the buffer into the ViewStream. Is there a more efficient/faster way to do this? I'm not really concerned about memory use; I'm trying to maximize speed. pieces is a list of value tuples indicating the (Start, End) of each piece of the file, and httpPool is an object pool with a bunch of preconfigured HttpClient instances. Any help is greatly appreciated, thank you!
await Parallel.ForEachAsync(pieces,
    new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount },
    async (piece, cancellationToken) =>
    {
        // Get an http client from the pool and request the content range
        var client = httpPool.Get();
        var request = new HttpRequestMessage { RequestUri = new Uri(url) };
        request.Headers.Range = new RangeHeaderValue(piece.Item1, piece.Item2);
        // Read only the response headers up front so we don't buffer the whole file into memory
        if (client != null)
        {
            var message = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
            if (message.IsSuccessStatusCode)
            {
                // Get the content stream from the response message
                using (var streamToRead = await message.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false))
                {
                    // Create a memory mapped view stream at the piece offset, sized to the response length
                    using (var streams = mmf.CreateViewStream(piece.Item1, message.Content.Headers.ContentLength!.Value, MemoryMappedFileAccess.Write))
                    {
                        // Copy from the content stream to the mmf stream
                        var buffer = new byte[bufferSize];
                        int offset, bytesRead;
                        // Until we've read everything
                        do
                        {
                            offset = 0;
                            // Until the buffer is very nearly full or there's nothing left to read
                            do
                            {
                                bytesRead = await streamToRead.ReadAsync(buffer.AsMemory(offset, bufferSize - offset), cancellationToken);
                                offset += bytesRead;
                            } while (bytesRead != 0 && offset < bufferSize);
                            // Empty the buffer
                            if (offset != 0)
                            {
                                await streams.WriteAsync(buffer.AsMemory(0, offset), cancellationToken);
                            }
                        } while (bytesRead != 0);
                        streams.Flush();
                        streams.Close();
                    }
                    streamToRead.Close();
                }
            }
            message.Content.Dispose();
            message.Dispose();
        }
        request.Dispose();
        httpPool.Return(client);
    });

I don't know how much it will help, but I gave it a try; let me know how well it works.
I also did some refactoring along the way, so here are some notes:
Do not call .Close() or .Dispose() manually if you already have a using block or using declaration; it only adds noise to your code and confuses anyone reading it. In fact, you should almost never call .Close() or .Dispose() manually at all.
Do you realize client would never be returned to the pool if any exception occurred in the method? You need to do these things in a finally block, or by using an IDisposable struct that returns client to the pool in its Dispose() implementation (a sketch of such a wrapper follows these notes). Also, request would not be disposed if any exception occurred, so add a using.
Whenever you can, prefer if statements that return early rather than ones that wrap the entire rest of the method. The latter is hard to read and maintain.
You are not really benefiting from Parallel here, as nearly all of the method is asynchronously waiting on I/O. Just use Task.WhenAll() instead.
I got rid of the custom buffering/copying and just called CopyToAsync() on message.Content, which accepts a Stream. It should help performance; the framework copy is likely better optimized than the simplest possible hand-rolled buffer loop.
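To illustrate the disposable-wrapper idea from the notes above, assuming httpPool is something like Microsoft.Extensions.ObjectPool.ObjectPool<HttpClient> (the question only says it is an object pool), a rough sketch could be:
using Microsoft.Extensions.ObjectPool;

// Sketch only: returns the pooled HttpClient in Dispose(), even if an exception is thrown.
public readonly struct PooledClient : IDisposable
{
    private readonly ObjectPool<HttpClient> _pool;

    public PooledClient(ObjectPool<HttpClient> pool)
    {
        _pool = pool;
        Client = pool.Get();
    }

    public HttpClient Client { get; }

    public void Dispose()
    {
        if (Client != null)
            _pool.Return(Client);
    }
}

// Usage inside the download method:
// using var pooled = new PooledClient(httpPool);
// if (pooled.Client is null) return;
// ... await pooled.Client.SendAsync(...) ...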
Code:
await Task.WhenAll(pieces.Select(p => DownloadToMemoryMappedFile(p)));

// change the piece type from dynamic to what you need
async Task DownloadToMemoryMappedFile(dynamic piece, CancellationToken cancellationToken = default)
{
    // Get an http client from the pool and request the content range
    var client = httpPool.Get();
    try
    {
        using var request = new HttpRequestMessage { RequestUri = new Uri(url) };
        // Read only the response headers up front so we don't buffer the whole file into memory
        request.Headers.Range = new RangeHeaderValue(piece.Item1, piece.Item2);
        if (client is null)
            return;
        using var message = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false);
        if (!message.IsSuccessStatusCode)
            return;
        // Create a memory mapped view stream at the piece offset, sized to the response length
        using var streams = mmf.CreateViewStream(piece.Item1, message.Content.Headers.ContentLength!.Value, MemoryMappedFileAccess.Write);
        await message.Content.CopyToAsync(streams).ConfigureAwait(false);
    }
    finally
    {
        httpPool.Return(client);
    }
}
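Since the question says pieces holds (Start, End) value tuples, the dynamic parameter could presumably be replaced with a concrete tuple type, roughly like this:
// Assuming pieces is List<(long Start, long End)>
await Task.WhenAll(pieces.Select(p => DownloadToMemoryMappedFile(p)));

async Task DownloadToMemoryMappedFile((long Start, long End) piece, CancellationToken cancellationToken = default)
{
    // ...same body as above, using piece.Start / piece.End instead of piece.Item1 / piece.Item2
}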

Related

How to process data from one stream and return it as another async stream in ASP.NET

I'm writing an ASP.NET Web API (.NET 7). I need to create a service that loads specified data via a stream, does something to it on the fly, and then returns it as a stream.
Let's say this is the method definition in a StreamEncoder service class:
public async Task<Stream> EncodeStream(Stream input);
What that method needs to do is:
Immediately return the output stream
Keep loading input stream data in chunks (for example 32 bytes)
Process that chunk by encoding it into Base64
Pass that encoded chunk to the output stream.
The idea is that later this service can be used within an API endpoint, something like this:
[HttpGet("image")]
public async Task<IActionResult> GetImage([FromQuery] string url, [FromQuery] string format)
{
// Perform checks ...
// Load stream
url = HttpUtility.UrlDecode(url);
var imageStream = await _imageLoaderService.LoadImage(url);
if (imageStream is null) return NotFound();
// Start processing the stream
var outputStream = await _streamEncoder(imageStream);
// Return immediately
return File(outputStream, Formats[format]);
}
Processing streams synchronously in batches is pretty simple, but I can't seem to find a solution for processing on the fly, so that the API client starts receiving data before the server has finished loading all of its data.
Very often the input data is over 500 MB and I need to process several of these at the same time, so I can't just load all of the data into RAM, process it, and return the result at the end.
How can I solve this problem? Are there any libraries that would help with it?
There are a couple of things in play here.
First, streaming responses come with a couple of limitations.
An HTTP response consists of headers followed by the data. Since the status code is sent along with the headers, the first limitation of streaming responses is that you cannot change the status code once streaming has started. To stream a response, your API must return 200 and then stream. If you're streaming along and the upstream has an error, there's no way to change that status code to a 502 or 500; all you can do is throw an exception, and ASP.NET will then slam the connection shut, which most clients will interpret as an error (some kind of general "communications error", not a 500).
The other limitation is that your code may not know the length of the response until after it's sent, especially since encoding is being done. This means your response won't have a Content-Length header, which in turn means no nice progress updates for your clients.
But if you're OK with those limitations, then the specifics of how to do a streaming response come into play.
You can start streaming by calling StartAsync and then copying to the response body stream, like this:
[HttpGet("image")]
public async Task GetImage([FromQuery] string url, [FromQuery] string format)
{
// Load stream
url = HttpUtility.UrlDecode(url);
var imageStream = await _imageLoaderService.LoadImage(url);
if (imageStream is null)
{
Response.StatusCode = 404;
return;
}
// Set all the response headers.
Response.StatusCode = 200;
Response.ContentType = Formats[format];
Response.Headers[...] = ...
// Send the headers and start streaming.
await Response.StartAsync();
// Process the stream. This is just a straight copy as an example.
await imageStream.CopyToAsync(Response.Body);
}
Note that you do lose the nice IActionResult helpers with this approach. (In particular, if you're using File and friends to set Content-Disposition, then the IActionResult helpers handle all the tedious header value encoding that is necessary.) If you want to keep the IActionResult helpers, then you can't use StartAsync directly. In that case I recommend you write your own IActionResult type.
I have a FileCallbackResult type on GitHub that passes the output stream to a callback. Using my type would look like this:
[HttpGet("image")]
public async Task<IAsyncResult> GetImage([FromQuery] string url, [FromQuery] string format)
{
// Load stream
url = HttpUtility.UrlDecode(url);
var imageStream = await _imageLoaderService.LoadImage(url);
if (imageStream is null)
return NotFound();
return new FileCallbackResult(Formats[format], async (stream, context) =>
{
// Process the stream. This is just a straight copy as an example.
await imageStream.CopyToAsync(stream);
});
}
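For context, a heavily simplified version of a callback-based file result might look roughly like the sketch below. This is not the FileCallbackResult implementation from GitHub (that one also handles Content-Disposition and the other FileResult headers via the framework's executor helpers); the name SimpleFileCallbackResult and the whole class are illustrative only.
using Microsoft.AspNetCore.Mvc;

// Stripped-down sketch of a callback-based FileResult: it only sets the content type
// and hands the response body stream to the caller-supplied callback.
public sealed class SimpleFileCallbackResult : FileResult
{
    private readonly Func<Stream, ActionContext, Task> _callback;

    public SimpleFileCallbackResult(string contentType, Func<Stream, ActionContext, Task> callback)
        : base(contentType)
    {
        _callback = callback ?? throw new ArgumentNullException(nameof(callback));
    }

    public override async Task ExecuteResultAsync(ActionContext context)
    {
        var response = context.HttpContext.Response;
        response.ContentType = ContentType;
        // Headers are sent on the first write to response.Body; everything after that streams.
        await _callback(response.Body, context);
    }
}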
Technically, it would also be possible to write a producer/consumer stream, but this would be more work. No such type currently exists (except NetworkStream, but you can't control both sides of that one). In the past this would have been considerably difficult, but today I think you could do it using pipelines. Pipelines are a more modern, more efficient kind of stream that also supports producer/consumer semantics. Once you had a producer/consumer stream, you could pass it to the standard File helper method. The only tricky part is error handling: you'd have to make sure your producer delegate was wrapped in a top-level try/catch that captured any exception and re-raised it to the consumer.
Update: Indeed, creating a producer/consumer stream is not difficult thanks to pipelines:
public sealed class ProducerConsumerStream
{
    public static Stream Create(Func<Stream, Task> producer, PipeOptions? options = null)
    {
        var pipe = new Pipe(options ?? PipeOptions.Default);
        var readStream = pipe.Reader.AsStream();
        var writeStream = pipe.Writer.AsStream();
        Run();
        return readStream;

        async void Run()
        {
            try
            {
                await producer(writeStream);
                await writeStream.FlushAsync();
                pipe.Writer.Complete();
            }
            catch (Exception ex)
            {
                pipe.Writer.Complete(ex);
            }
        }
    }
}
Usage (note that real-world usage should specify PipeOptions.PauseWriterThreshold; a sketch of that appears after this example):
public Stream EncodeStream(Stream input)
{
    return ProducerConsumerStream.Create(async output =>
    {
        // Process the stream. This is just a straight copy as an example.
        await input.CopyToAsync(output);
    });
}

[HttpGet("image")]
public async Task<IActionResult> GetImage([FromQuery] string url, [FromQuery] string format)
{
    // Load stream
    url = HttpUtility.UrlDecode(url);
    var imageStream = await _imageLoaderService.LoadImage(url);
    if (imageStream is null)
        return NotFound();

    var outputStream = _streamEncoder.EncodeStream(imageStream);
    return File(outputStream, Formats[format]);
}

UWP Unhandled Exception when writing to serial

I'm having an issue with writing to a serial device in UWP. My task for writing to the port looks like this:
public async Task WriteAsync(byte[] stream)
{
    if (stream.Length > 0 && serialDevice != null)
    {
        await writeSemaphore.WaitAsync();
        try
        {
            DataWriter dataWriter = new DataWriter(serialDevice.OutputStream);
            dataWriter.WriteBytes(stream);
            await dataWriter.StoreAsync();
            dataWriter.DetachStream();
            dataWriter = null;
        }
        finally
        {
            writeSemaphore.Release();
        }
    }
}
The code works fine the first two times I call this function. The third time, I get an unhandled exception in ntdll.dll on the await dataWriter.StoreAsync() line.
The full exception I can see is:
Unhandled exception at 0x00007FFCB3FCB2C0 (ntdll.dll) in xx.exe:
0xC000000D: An invalid parameter was passed to a service or function.
This answer mentions the garbage collector closing an input stream, but I don't see why that would happen in my code. Any help getting to the bottom of this issue would be highly appreciated!
Turns out the solution to my problem was in another piece of code. I had a function reading the bytes like this:
private async Task ReadAsync(CancellationToken cancellationToken)
{
    Task<UInt32> loadAsyncTask;
    uint ReadBufferLength = 1024;

    // If task cancellation was requested, comply
    cancellationToken.ThrowIfCancellationRequested();

    // Set InputStreamOptions to complete the asynchronous read operation when one or more bytes is available
    dataReader.InputStreamOptions = InputStreamOptions.Partial;

    using (var childCancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
    {
        // Create a task object to wait for data on the serialPort.InputStream
        loadAsyncTask = dataReader.LoadAsync(ReadBufferLength).AsTask(childCancellationTokenSource.Token);

        // Launch the task and wait
        UInt32 bytesRead = await loadAsyncTask;
        if (bytesRead > 0)
        {
            byte[] vals = new byte[3]; //TODO: adjust size
            dataReader.ReadBytes(vals);
            //status.Text = "bytes read successfully!";
        }
    }
}
Specifically the problem was in the following two lines:
byte[] vals = new byte[3]; //TODO:adjust size
dataReader.ReadBytes(vals);
As soon as I set the size of the vals array to the bytesRead value, the problem went away.
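In other words, the corrected snippet sizes the array to what LoadAsync actually read:
if (bytesRead > 0)
{
    // Size the array to the number of bytes actually loaded, instead of a hard-coded 3
    byte[] vals = new byte[bytesRead];
    dataReader.ReadBytes(vals);
}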
Normally you would not have to set dataWriter to null, because the GC already knows when an object is no longer used.
You'd better call the dataWriter.Dispose() method, like many UWP samples do.
For example: SocketActivityStreamSocket
Please read the IDisposable.Dispose method documentation for more details.
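Applied to the WriteAsync method from the question, that advice might look roughly like the following sketch (assuming the same serialDevice and writeSemaphore fields):
public async Task WriteAsync(byte[] data)
{
    if (data.Length == 0 || serialDevice == null)
        return;

    await writeSemaphore.WaitAsync();
    try
    {
        using (var dataWriter = new DataWriter(serialDevice.OutputStream))
        {
            dataWriter.WriteBytes(data);
            await dataWriter.StoreAsync();
            // Detach so disposing the writer does not close serialDevice.OutputStream
            dataWriter.DetachStream();
        }
    }
    finally
    {
        writeSemaphore.Release();
    }
}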

Timeout on endless Http stream

I'm writing a C# client for a REST service to access an HTTP streaming API. The idea is to open the stream and receive messages that the server sends in an endless stream; on an external command, reading from the stream should stop. HttpClient is used for the implementation. The streamingActive variable is used to stop reading from the stream, since reader.EndOfStream will never return true.
using (HttpClient httpClient = new HttpClient())
{
    httpClient.Timeout = TimeSpan.FromMilliseconds(Timeout.Infinite);
    var messagesUri = "http://127.0.0.1:8080/api/messages/progress?sender_id=7837492342";
    var stream = httpClient.GetStreamAsync(messagesUri).Result;
    using (var reader = new StreamReader(stream))
    {
        while (streamingActive && !reader.EndOfStream)
        {
            var currentLine = reader.ReadLine();
            try
            {
                var psMessage = JsonConvert.DeserializeObject<PSMessage>(currentLine);
                Console.WriteLine(psMessage);
            }
            catch (JsonReaderException exc)
            {
            }
        }
    }
}
It works fine while messages are coming from the server, but if there are no new messages for ~300 s, the call to reader.EndOfStream throws “The operation has timed out.”:
at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)
Is there any way to keep the stream open (hypothetically) forever?
If you're willing to let go of the StreamReader, why not read from the Stream directly? The following is for a Universal App (using the Windows.Web.Http API):
HttpResponseMessage respMessage;
// note: client and request are defined elsewhere.
respMessage = await client.SendRequestAsync(request, HttpCompletionOption.ResponseHeadersRead);
using (var responseStream = await respMessage.Content.ReadAsInputStreamAsync())
{
    using (var fileWriteStream = await fileToStore.OpenAsync(FileAccessMode.ReadWrite))
    {
        while (streamingActive && (await responseStream.ReadAsync(streamReadBuffer, bufferLength, InputStreamOptions.None)).Length > 0)
        {
            await fileWriteStream.WriteAsync(streamReadBuffer);
        }
    }
}
This should wait indefinitely on the await responseStream.ReadAsync call. If you still need to be able to cancel the operation, you can use a Task with a CancellationToken approach.
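If you stay with System.Net.Http as in the question instead of the Windows.Web API, a rough sketch of a cancelable endless read (not from the original answer; the method and parameter names are made up) could look like this:
// Sketch: cancel an otherwise endless read via a CancellationToken.
async Task ReadStreamUntilCancelledAsync(string messagesUri, CancellationToken token)
{
    using (var httpClient = new HttpClient { Timeout = Timeout.InfiniteTimeSpan })
    using (var response = await httpClient.GetAsync(
        messagesUri, HttpCompletionOption.ResponseHeadersRead, token))
    using (var stream = await response.Content.ReadAsStreamAsync())
    {
        var buffer = new byte[8192];
        int read;
        // On modern .NET the token cancels a pending read (OperationCanceledException);
        // on .NET Framework you may instead need to dispose the client/response to abort.
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length, token)) > 0)
        {
            // process buffer[0..read] here, e.g. accumulate lines and deserialize messages
        }
    }
}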

SerialPort.BaseStream.ReadAsync missing the first byte

I am trying to use the SerialPort class in .net.
I've opted to keep my service async, so I am using the async-methods on SerialPort.BaseStream.
In my async method, I write a byte[] to the serial port, then start reading until I haven't received any more data in n milliseconds, and return that result.
The problem is, however, that I seem to miss the first byte in all replies other than the very first reply after opening the serial port.
If I close the port after every response (Read) and open it again before doing a new request (Write), the first byte is not missing. This, however, often results in an "Access to the port 'COM4' is denied." exception if I try to open the port too soon after closing. It also seems very unnecessary to open/close the port for every write/read.
This is basically what my method looks like:
private async Task<byte[]> SendRequestAsync(byte[] request)
{
    // Write the request
    await _serialPort.BaseStream.WriteAsync(request, 0, request.Length);

    var buffer = new byte[BUFFER_SIZE];
    bool receiveComplete = false;
    var bytesRead = 0;

    // Read from the serial port
    do
    {
        var responseTask = _serialPort.BaseStream.ReadAsync(buffer, bytesRead, BUFFER_SIZE - bytesRead);
        if (await Task.WhenAny(responseTask, Task.Delay(300)) == responseTask)
        {
            bytesRead += responseTask.Result;
        }
        else
            receiveComplete = true;
    } while (!receiveComplete);

    var response = new byte[bytesRead];
    Array.Copy(buffer, 0, response, 0, bytesRead);
    return response;
}
Is there anything obviously wrong in the way I am doing this? Is there a smarter way to achieve the same asynchronously?
Just because you're not observing the last ReadAsync() doesn't mean it gets canceled; it's still running, which apparently manifests as it consuming the first byte of the following message.
What you should do is to cancel the last ReadAsync() by using a CancellationToken. Note that there is a possible race between the timeout and the read, but I'm assuming that if the timeout elapsed, it's not possible for the read to complete without another write.
The code would look like this:
var cts = new CancellationTokenSource();
do
{
    var responseTask = _serialPort.BaseStream.ReadAsync(
        buffer, bytesRead, BUFFER_SIZE - bytesRead, cts.Token);
    if (await Task.WhenAny(responseTask, Task.Delay(300)) == responseTask)
    {
        bytesRead += responseTask.Result;
    }
    else
    {
        cts.Cancel();
        receiveComplete = true;
    }
} while (!receiveComplete);
Note that both the cause and the solution are my guesses, it's certainly possible that I'm wrong about one or both of them.

HttpListener writing data to response output stream

I have a small local web server based on HttpListener. The server provides files to a local client application by unpacking them and writing the bytes to
response.OutputStream;
but sometimes the files (videos) are huge, and I don't think it's a good idea to always copy all of the file's bytes into the output stream (i.e. into memory). I would like to connect the served file's stream directly to the response output stream, something like this:
response.OutputStream = myFileStream;
but response.OutputStream is read-only, so all I can do is write bytes. Is there any way to do some kind of partial writing (streaming)?
Regards.
You will need to create a thread and stream your data to the response.
Use something like this:
in your main thread:
while (Listening)
{
    // wait for next incoming request
    var result = listener.BeginGetContext(ListenerCallback, listener);
    result.AsyncWaitHandle.WaitOne();
}
somewhere in your class:
public static void ListenerCallback(IAsyncResult result)
{
    var listenerClosure = (HttpListener)result.AsyncState;
    var contextClosure = listenerClosure.EndGetContext(result);

    // do not process the request on the dispatcher thread; schedule it on the ThreadPool,
    // otherwise you will prevent other incoming requests from being dispatched
    ThreadPool.QueueUserWorkItem(
        ctx =>
        {
            var response = (HttpListenerResponse)ctx;
            using (var stream = ... )
            {
                stream.CopyTo(response.OutputStream);
            }
            response.Close();
        }, contextClosure.Response);
}
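Note that stream.CopyTo already streams the file through a small reusable buffer rather than loading it all into memory. A slightly expanded sketch of the worker body that also advertises the length (the file path below is a placeholder) might look like this:
ThreadPool.QueueUserWorkItem(
    ctx =>
    {
        var response = (HttpListenerResponse)ctx;
        using (var fileStream = File.OpenRead(@"C:\path\to\video.mp4")) // placeholder path
        {
            // Advertise the length so clients can show progress; the body is still
            // copied through a small buffer (here 80 KB), never fully loaded into memory.
            response.ContentLength64 = fileStream.Length;
            fileStream.CopyTo(response.OutputStream, 81920);
        }
        response.Close();
    }, contextClosure.Response);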
