Sending live audio file to REST server using HttpClient and PushStreamContent

I need to stream audio data from the microphone to a REST server.
I am working with a proprietary ASR engine and need to collect the data, then stream it in real time in a single call to PostAsync.
Looking online, I found articles on PushStreamContent, but either I am not using it correctly or I don't understand what I'm doing (or both).
I have a MemoryStream called stream_memory to which I write data constantly from the main thread and I want to read it, while data is streaming, and post it in real time in a single post. In the example below, I also use an event stream_data_event and an object lock to prevent multiple threads writing to the MemoryStream at the same time. I clear the memory stream every time I read from it as I don't need the data afterwards.
Here is a snip of my code that is running in a thread of its own:
http_client = new HttpClient();
http_client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("*/*"));
http_client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Language", "en-us");
http_client.DefaultRequestHeaders.TransferEncodingChunked = true;
HttpContent content = new System.Net.Http.PushStreamContent(async (stream, httpContent, transportContext) =>
{
while (stream_data_event.WaitOne())
{
lock (main_window.stream_memory_lock)
{
stream_data_event.Reset();
long write_position = main_window.stream_memory.Position;
main_window.stream_memory.Seek(0, SeekOrigin.Begin);
main_window.stream_memory.CopyTo(stream, (int)write_position);
main_window.stream_memory.Position = 0;
}
}
});
content.Headers.TryAddWithoutValidation("Content-Type", "audio/L16;rate=8000");
string request_uri = String.Format("/v1/speech:recognize");
HttpResponseMessage response = await http_client.PostAsync(request_uri, content);
string http_result = await response.Content.ReadAsStringAsync();
The call to PostAsync calls the code for the PushStreamContent as expected.
However, as long as I am in the while loop, nothing is sent to the server (checked in Wireshark).
If I exit the loop manually in the debugger and call Close on the stream, PostAsync exits but nothing is sent to the server.
I need to have a way to continue streaming information while in the PostAsync and have the data go out as the audio arrives.
Any ideas?

It turns out that the reason nothing appeared to be sent to the server has to do with the way Wireshark displays HTTP requests.
I expected that once the request was sent I would see it immediately in Wireshark. However, Wireshark only shows the request once it is complete, which means it is shown long after the request started streaming.
After I realized that, I could see my data being sent over to the server.
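
For reference, the hand-off between the capture thread and the PushStreamContent callback described in the question can also be sketched with a queue instead of a shared MemoryStream, a lock and an event. This is only a minimal sketch under assumptions: PumpChunks and the chunks queue are hypothetical names that do not appear in the original code, and the capture thread is assumed to add a copy of each audio buffer to the queue.

```csharp
using System.Collections.Concurrent;
using System.IO;

// Hypothetical helper (not from the original post): drains queued audio
// chunks into the request stream. It blocks while the queue is empty and
// returns once CompleteAdding() has been called and the queue is drained.
static void PumpChunks(BlockingCollection<byte[]> chunks, Stream output)
{
    foreach (byte[] chunk in chunks.GetConsumingEnumerable())
    {
        output.Write(chunk, 0, chunk.Length);
        output.Flush(); // pass each chunk on without waiting for more data
    }
}
```

Inside the PushStreamContent callback this would be called with the stream argument, followed by closing that stream, since closing it is how PushStreamContent learns that the body is complete. The capture thread would call chunks.Add(...) for each buffer and chunks.CompleteAdding() when recording stops.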

Related

How to hook real-time audio stream endpoint to Direct Line Speech Endpoint?

I am trying to hook up my real-time audio endpoint, which produces a continuous audio stream, to the Direct Line Speech (DLS) endpoint, which eventually interacts with my Azure bot API.
I have a websocket API that continuously receives an audio stream in binary format, and this is what I intend to forward to the DLS endpoint for continuous speech-to-text with my bot.
Based on the feedback and answer here, I have been able to hook up my Direct Line speech endpoint with a real-time stream.
I've tried a sample wav file which correctly gets transcribed by DLS and my bot is correctly able to retrieve the text to operate on it.
I have used the ListenOnce() API and am using a PushAudioInputStream method to push the audio stream to the DLS speech endpoint.
The code below is the internals of the ListenOnce() method:
// Create a push stream
using (var pushStream = AudioInputStream.CreatePushStream())
{
using (var audioInput = AudioConfig.FromStreamInput(pushStream))
{
// Create a new Dialog Service Connector
this.connector = new DialogServiceConnector(dialogServiceConfig, audioInput);
// ... also subscribe to events for this.connector
// Open a connection to Direct Line Speech channel
this.connector.ConnectAsync();
Debug.WriteLine("Connecting to DLS");
pushStream.Write(dataBuffer, dataBuffer.Length);
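try
{
this.connector.ListenOnceAsync();
System.Diagnostics.Debug.WriteLine("Started ListenOnceAsync");
}
catch (Exception ex)
{
// handler assumed here; the original snippet did not include one
System.Diagnostics.Debug.WriteLine(ex);
}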
}
}
dataBuffer in the above code is the 'chunk' of binary data I've received on my websocket.
const int maxMessageSize = 1024 * 4; // 4 KB
var dataBuffer = new byte[maxMessageSize];
while (webSocket.State == WebSocketState.Open)
{
var result = await webSocket.ReceiveAsync(new ArraySegment<byte>(dataBuffer), CancellationToken.None);
if (result.MessageType == WebSocketMessageType.Close)
{
Trace.WriteLine($"Received websocket close message: {result.CloseStatus.Value}, {result.CloseStatusDescription}");
await webSocket.CloseAsync(result.CloseStatus.Value, result.CloseStatusDescription, CancellationToken.None);
}
else if (result.MessageType == WebSocketMessageType.Text)
{
var message = Encoding.UTF8.GetString(dataBuffer, 0, result.Count); // decode only the bytes actually received
Trace.WriteLine($"Received websocket text message: {message}");
}
else // binary
{
Trace.WriteLine("Received websocket binary message");
ListenOnce(dataBuffer); //calls the above
}
}
But the above code doesn't work. I believe I have a couple of issues/questions with this approach:
I believe I am not correctly chunking the data to Direct Line Speech to ensure that it receives full audio for correct S2T conversion.
I know DLS API supports ListenOnceAsync() but not sure if this supports ASR (it knows when the speaker on other side stopped talking)
Can I just get the websocket url for the Direct Line Speech endpoint and assume DLS correctly consumes the direct websocket stream?

I believe I am not correctly chunking the data to Direct Line Speech to ensure that it receives full audio for correct S2T conversion.
DialogServiceConnector.ListenOnceAsync will listen until the stream is closed (or enough silence is detected). You are not closing your stream except for when you dispose of it at the end of your using block. You could await ListenOnceAsync but you'd have to make sure you close the stream first. If you don't await ListenOnceAsync then you can close the stream whenever you want, but you should probably do it as soon as you finish writing to the stream and you have to make sure you don't dispose of the stream (or the config) before ListenOnceAsync has had a chance to complete.
You also want to make sure ListenOnceAsync gets the full utterance. Your buffer is 1024 * 4 = 4096 bytes, and a single chunk of that size is almost certainly not a full utterance. If you want to keep your chunks that small, it may be a good idea to keep ListenOnceAsync running across multiple iterations of that loop rather than calling it again for every chunk you receive.
I know DLS API supports ListenOnceAsync() but not sure if this supports ASR (it knows when the speaker on other side stopped talking)
I think you will have to determine when the speaker stops talking on the client side and then receive a message from your WebSocket indicating that you should close the audio stream for ListenOnceAsync.
It looks like ListenOnceAsync does support ASR.
Can I just get the websocket url for the Direct Line Speech endpoint and assume DLS correctly consumes the direct websocket stream?
You could try it, but I would not assume that myself. Direct Line Speech is still in preview and I don't expect compatibility to come easy.

Make HttpWebRequest ignoring response

I have a custom WebUploadTraceListener : TraceListener that I use to send HTTP (and eventually HTTPS) POST data to a web service that writes it to a database.
I have tested doing this with both WebClient and HttpWebRequest and empirically I'm seeing better performance with the latter.
Because of the one-way nature of the data, I don't care about the server response. But I found that if I don't handle the HttpWebResponse my code locks up on the third write. I think this is because of the DefaultConnectionLimit setting and the system not reusing the resource...
Per Jon Skeet
Note that you do need to dispose of the WebResponse returned by request.GetResponse - otherwise the underlying infrastructure won't know that you're actually done with it, and won't be able to reuse the connection.
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create(ServiceURI);
httpRequest.Method = "POST";
httpRequest.ContentType = "application/x-www-form-urlencoded";
try
{
using (Stream stream = httpRequest.GetRequestStream())
{
stream.Write(postBytes, 0, postBytes.Length);
}
using (HttpWebResponse response = (HttpWebResponse)httpRequest.GetResponse())
{
/// discard response
}
}
catch (Exception)
{
/// ...
}
I want to maximize the speed of sending the POST data and get back to the main program flow as quickly as possible. Because I'm tracing program flow, a synchronous write is preferable, but not mandatory as I can always add a POST field including a Tick count.
Is an HttpWebRequest with stream.Write the quickest method for doing this in .NET 4.0?
Is there a cheaper way of discarding the unwanted response?

Actually, httpRequest.GetResponse only posts the data to the server and doesn't download anything to the client except the information that tells the client whether the server processed the request successfully.
You only get the response data when you call GetResponseStream. If you don't even want to receive the success/error information, you have no way to tell whether the server processed the request successfully.
So the answers are:
Yes, that is the lowest level available to managed code, unless you want to mess with sockets (which you shouldn't).
With HttpWebRequest, no. You receive almost no data that you don't want.

Monitoring network stream for new data

I am writing an application that is interested in the status information from certain network devices. One of the devices provides the status information through HTTP and uses a multipart message. The first time you query the information it sends down the whole status, and from then on, whenever the status of the device changes, a new multipart message is sent down the stream with just the changes.
I am using C# and am interested in using HttpClient or equivalent to open the stream, read all the information currently in the stream and then monitor this stream for when there is new information so that I can update the status information of the device accordingly in the application.
In essence the code I have looks something like this
using (var handler = new HttpClientHandler { Credentials = new NetworkCredential(username, password) })
{
using (var client = new HttpClient(handler))
{
var task = client.GetStreamAsync(uri);
task.Wait();
var stream = task.Result;
while(true)
{
byte[] bytes = ReadBytesFromStream(stream);
DoSomethingWithBytes(bytes);
}
}
}
In real life the code runs in a thread, however, and would need to be terminated correctly when told to.
The issue I am having is that when there is nothing in the stream, the call to stream.ReadByte() blocks. If I put a ReadTimeout on the stream, then when the read fails (i.e. when no new information is ready) the CanRead property is set to false and I have to restart the process. In doing so, however, I receive all the original status information again instead of only the elements that have changed.
Is there something that can be done to keep the stream alive until I tell it to terminate, while being able to unblock the read if no information is available? I need this because the application is multithreaded: I have to terminate this code safely, and the blocked read is stopping the application from closing down.

Instead of using HttpClient I used an HttpWebRequest and set both the KeepAlive and AllowReadStreamBuffering properties to true. This keeps the stream alive and allows you to read bytes as and when they become available.
By keeping a reference to the network stream returned from GetResponseStream, we can call Dispose on it, which interrupts any read currently in progress. Otherwise the read can block for as long as it needs to, i.e. until it receives data. Being able to interrupt it solves the thread lifetime issues.

The right way to deal with the "I/O operation blocks my thread" problem is to use asynchronous I/O. .NET networking components offer a number of options here, but in your case you seem to be reading from a stream and even using (incorrectly) the GetStreamAsync() method, so the code can be cleaned up to handle both correctly and cleanly.
For example:
async Task ExecuteUriAsync(string username, string password, Uri uri)
{
using (var handler = new HttpClientHandler { Credentials = new NetworkCredential(username, password) })
{
using (var client = new HttpClient(handler))
{
Stream stream = await client.GetStreamAsync(uri);
byte[] buffer = new byte[10240];
while(true)
{
int byteCount = await stream.ReadAsync(buffer, 0, buffer.Length);
if (byteCount == 0)
{
// end-of-stream...must be done with the connection
return;
}
DoSomethingWithBytes(buffer, byteCount);
}
}
}
}
Your post is vague on what your previous ReadBytesFromStream() method did, and what DoSomethingWithBytes() does, but presumably you can figure out how to integrate that logic in the above.
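
Since the question also asks for the loop to stop when told to, the same read loop can take a CancellationToken. The following is a minimal sketch rather than a definitive implementation: ReadUntilEndAsync and the onBytes callback are hypothetical names, and how promptly a pending read observes the token depends on the stream implementation and the runtime version.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: reads until end-of-stream or cancellation and
// returns the total number of bytes handed to the callback.
static async Task<long> ReadUntilEndAsync(
    Stream stream, Action<byte[], int> onBytes, CancellationToken token)
{
    byte[] buffer = new byte[10240];
    long total = 0;
    while (true)
    {
        // The token is passed to every read; a stream that observes it
        // ends the await with an OperationCanceledException.
        int byteCount = await stream.ReadAsync(buffer, 0, buffer.Length, token);
        if (byteCount == 0)
        {
            return total; // end-of-stream
        }
        onBytes(buffer, byteCount);
        total += byteCount;
    }
}
```

The thread that shuts the application down would hold a CancellationTokenSource, pass its Token into this method, and call Cancel() when it is time to stop.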

GetResponseStream() or ReadBytes() who is actually responsible for downloading the data and how?

If we create an HttpWebRequest and get the response stream from its response, is the data downloaded completely at once, or is it downloaded from the network only when we call ReadBytes on the stream?
The code sample I am referring to is below:
var webRequest = HttpWebRequest.Create("url of a big file approx 700MB") as HttpWebRequest;
var webResponse = webRequest.GetResponse();
using (BinaryReader ns = new BinaryReader(webResponse.GetResponseStream()))
{
Thread.Sleep(60000); //Sleep for 60seconds, hope 700MB file get downloaded in 60 seconds
//At this point whether the response is totally downloaded or will not get downloaded at all
var buffer = ns.ReadBytes(bufferToRead);
//Or, in the above statement ReadBytes function is responsible for downloading the content from the internet.
}

GetResponseStream opens and returns a Stream object. The stream object is sourced from the underlying Socket. This Socket is sent data by the network adapter asynchronously. The data just arrives and is buffered. GetResponseStream will block execution until the first data arrives.
ReadByte pulls the data up from the socket layer into C#. This method will block execution until there is a byte available.
Closing the stream prematurely will end the asynchronous transfer (closes the Socket, the sender will be notified of this as their connection will fail) and discard (flush) any buffered data that you have not used yet.

var webRequest = HttpWebRequest.Create("url of a big file approx 700MB") as HttpWebRequest;
Okay, we're set up ready to go. It's a bit different if you PUT or POST a stream of your own, but the differences are analogous.
var webResponse = webRequest.GetResponse();
When GetResponse() returns, it will at the very least have read all of the HTTP headers. It may well have read the headers of a redirect, and done another request to the URI it was redirected to. It's also possible that it's actually hitting a cache (either directly or because the webserver sent 304 Not Modified), but by default the details of that are hidden from you.
There will likely be some more bytes in the socket's buffer.
using (BinaryReader ns = new BinaryReader(webResponse.GetResponseStream()))
{
At this point, we've got a stream representing the network stream.
Let's remove the Thread.Sleep(): it does nothing except add a risk of the connection timing out. Even assuming it doesn't time out while waiting, the connection will have "backed off" from sending bytes since you weren't reading them, so the effect will be to slow things even more than you did by adding a deliberate slow-down.
var buffer = ns.ReadBytes(bufferToRead);
At this point, either bufferToRead bytes have been read to create a byte[] or else fewer than bufferToRead because the total size of the stream was less than that, in which case buffer contains the entire stream. This will take as long as it takes.
}
At this point, because a successful HTTP GET was performed, the underlying web-access layer may cache the response (probably not if it's very large - the default assumption is that very large requests don't get repeated a lot and don't benefit from caching).
Error conditions will raise exceptions if they occur, and in that case no caching will ever be done (there is no point caching a buggy response).
There is no need to sleep, or otherwise "wait" on it.
It's worth considering the following variant that works at just a slightly lower level by manipulating the stream directly rather than through a reader:
using(var stm = webResponse.GetResponseStream())
{
We're going to work on the stream directly;
byte[] buffer = new byte[4096];
int read; // declared outside the loop so the while condition can see it
do
{
read = stm.Read(buffer, 0, buffer.Length);
This will return up to 4096 bytes. It may read fewer, because a chunk of bytes is already available and it returns that many immediately. It will only return 0 bytes at the end of the stream, so this gives us a balance between waiting and not waiting: it promises to wait long enough to get at least one byte, but whether it waits for all 4096 bytes or returns fewer is up to the stream, depending on which it judges more efficient.
DoSomething(buffer, 0, read);
We work with the bytes we got.
} while(read != 0);
Read() only gives us zero bytes if it's at the end of the stream.
}
And again, when the stream is disposed, the response may or may not be cached.
As you can see, even at the lowest level .NET gives us access to when using HttpWebResponse, there's no need to add code to wait on anything, as that is always done for us.
You can use asynchronous access to the stream to avoid waiting, but then the asynchronous mechanism still means you get the result when it's available.
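
Put together without the interleaved commentary, the direct-stream loop might look like the sketch below. DrainInChunks and its doSomething callback are hypothetical names standing in for the DoSomething call above, and any Stream can be passed in place of the response stream.

```csharp
using System;
using System.IO;

// Hypothetical helper: drains a stream in chunks, as the walk-through does
// with the response stream, and returns the total number of bytes seen.
static long DrainInChunks(Stream stm, Action<byte[], int, int> doSomething)
{
    byte[] buffer = new byte[4096];
    long total = 0;
    int read; // declared outside the loop so the while condition can use it
    do
    {
        // Waits for at least one byte, but may return fewer than 4096.
        read = stm.Read(buffer, 0, buffer.Length);
        if (read > 0)
        {
            doSomething(buffer, 0, read);
            total += read;
        }
    } while (read != 0); // zero means end of stream
    return total;
}
```

With a real response this would be called as DrainInChunks(webResponse.GetResponseStream(), ...) inside a using block, so that the stream is disposed once the loop ends.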

To answer your question about when streaming starts, GetResponseStream() will start receiving data from the server. However, at some point the network buffers will become full and the server will stop sending data if you don't read off the buffers. For a detailed description of the TCP buffers, etc., see here.
So your sleep of 60000 will not be helping you much as the network buffers along the way will fill up and data will stop arriving until you read it off. It is better to read it off and write it in chunks as you go.
More info on the workings of ResponseStream here.
If you are wondering about what buffer size to use, see here.

Terminate Web Request Early (C#)

As part of a suite of integration tests I am writing, I want to assert that my server behaves correctly when a client HTTP request terminates early, before all the response data has been sent.
Is it possible to create an HTTP request and terminate it after receiving just a few bytes of data in C#?

You don't have to read all the bytes out of the response. Just read as many bytes as you want and then return from your test.
You can do so more or less like this:
Stream myStream = resp.GetResponseStream();
myStream.Read(bufferArray, 0, 1); //read 1 byte into bufferArray
return;
You may find the documentation on WebResponse useful.
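
If the test needs more than one byte, note that a single Read call may return fewer bytes than requested, so reading "a few bytes" takes a small loop. The sketch below is one way to do that under assumptions: ReadPrefixAndClose is a hypothetical helper name, and whether the server actually observes an aborted connection depends on how much of the body it had left to send.

```csharp
using System.IO;

// Hypothetical helper: reads at most `count` bytes, then disposes the
// stream so the rest of the body is never consumed.
static byte[] ReadPrefixAndClose(Stream stream, int count)
{
    using (stream)
    {
        byte[] prefix = new byte[count];
        int filled = 0;
        while (filled < count)
        {
            int read = stream.Read(prefix, filled, count - filled);
            if (read == 0) break; // stream ended before `count` bytes arrived
            filled += read;
        }
        if (filled < count) System.Array.Resize(ref prefix, filled);
        return prefix;
    }
}
```

In the integration test this would be called with resp.GetResponseStream(), after which the server-side assertions can run.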

Just start the call asynchronously using, say a background worker, and then close the thread/channel.

I found a solution which works for me. I just close the response after getting it. This seems to leave me with the response headers, but closes the connection before the server is finished sending.
var response = request.GetResponse();
response.Close();
// Assert that server has dealt with closed response correctly
