I am currently working on a project handling multiple byte arrays. In one scenario it could happen that an array is being reinitialized multiple times in a while loop. The only reason this happens is because the application is receiving read requests, but the information received is incorrect.
I have to have a 'clean' byte array to write back to the device indicating that the information was incorrect. The device being read from has the tendency to jump into a 'loop' where it keeps sending that 'incorrect' data until it is interacted with which is not a problem per say. Just know the device spamming read requests is not a problem at this moment in time and will be dealt with in the future.
My question is this:
Are there any performance/coding issues when reinitializing an array inside a while loop for an extended period of time? 10000+ iterations for example
Code example:
byte[] array = new byte[500]
while(condition)
{
// Some work
array = new byte[500]
Thread.Sleep(100);
// Some work
}
If the array dont change you can simply clear the Array (https://learn.microsoft.com/de-de/dotnet/api/system.array.clear?view=net-5.0):
Array.Clear(array, 0, array.Length);
To save performance you can also run your function asynchronous.
Related
I am researching the possibility of using pipelines for processing binary messages coming from network.
The binary messages i will be processing come with an payload and it is desirable to keep the payload in its binary form.
The idea is to read out the whole message and create a slice of message and its payload, once the message is completely read it will be passed to a channel chain for processing, the processing will not be instant and might take some time or be executed later and the goal is not to have the pipe reader wait until the processing is complete, then once the message processing is complete i would need to release the processed buffer region to the pipe writer.
Now of course i could just create a new byte array and copy the data coming from pipe writer but that would beat the purpose of no-copy? So as i understand i would need some buffer synchronization between the pipeline and the channel?
I observed the available apis (AdvanceTo) of pipe reader where its possible to tell the pipe reader what was consumed and what was examined but cant get around how this could be synced outside of the pipe reading method.
So the question would be whether there are some techniques or examples on how this can be achieved.
The buffer obtained from TryRead/ReadAsync is only valid until you call AdvanceTo, with the expectation that as soon as you've done that: anything you reported as consumed is available to be recycled for use elsewhere (which could be parallel/concurrent readers). Strictly speaking: even the bits you haven't reported as consumed: you still shouldn't treat as valid once you've called AdvanceTo (although in reality, it is likely that they'll still be the same segments - just: that isn't the concern of the caller; to the caller, it is only valid between the read and the advance).
This means that you explicitly can't do:
while (...)
{
var result = await pipe.ReadAsync();
if (TryIdentifyFrameBoundary(out var frame)) {
BeginProcessingInBackground(frame); // <==== THIS IS A PROBLEM!
reader.AdvanceTo(frame.End, frame.End);
}
else if { // take nothing
reader.AdvanceTo(buffer.Start, buffer.End);
if (result.IsCompleted) break; // that's all folks
}
}
because the "in background" bit, when it fires, could now be reading someone else's data (due to it being reused already).
So: either you need to process the frame contents as part of the read loop, or you're going to have to make a copy of the data, most likely by using:
c#
var len = checked ((int)buffer.Length);
var oversized = ArrayPool<byte>.Shared.Rent(len);
buffer.CopyTo(oversized);
and pass oversized to your background processing, remembering to only look at the first len bytes of it. You could pass this as a ReadOnlyMemory<byte>, but you need to consider that you're also going to want to return it to the array-pool afterwards (probably in a finally block), and passing it as a memory makes it a little more awkward (but not impossible, thanks to MemoryMarshal.TryGetArray).
Note: in early versions of the pipelines API, there was an element of reference-counting, which did allow you to preserve buffers, but it had a few problems:
it complicated the API hugely
it led to leaked buffers
it was ambiguous and confusing what "preserved" meant; is the count until it gets reused? or released completely?
so that feature was dropped.
I'm making a program in which one of its functions, in order to correctly create the message to be sent, keeps calling a function I have generated to add each of the parts to the array. The thing is, in C# you can't do this because the byte arrays (and if I'm not wrong, any kind of array) has a finite Length which cannot be changed.
Due to this, I thought of creating 2 byte variables. The first one would get the first to values. The second one would be created after you know the quantity of new bytes you have to add, and after this, you would delete the first variable and create it again, with the Length of the previous variable, but adding the Length of the new values, doing the same you did with the second variable. The code I've generated is:
byte[] message_mod_0 = adr_and_func;
byte[] byte_memory_adr = AddAndTypes.ToByteArray(memory_adr);
byte[] message_mod_1 = new byte[2 + byte_memory_adr.Length];
message_mod_1 = AddAndTypes.AddByteArrayToByteArray(message_mod_0, byte_memory_adr);
AddAndTypes.AddByteArrayToByteArray(message_mod_0, AddAndTypes.IntToByte(value));
byte[] CRC = Aux.CRC(message_mod_0);
AddAndTypes.AddByteArrayToByteArray(message_mod_0, CRC);
In this code, the two variables I've meant are message_mod_0 and message_mod_1. I also think of doing the deleting and redeclaring the byte_memory_adr variable that is required in order to know which is the Length of the byte array you want to add to the ouput message.
The parameters adr_and_func, memory_adr and value are given as input parameters of the function I'm making.
The question can be summed up as: is there any way to delete variables in the same scope they were created? And, in case it can be done, would there be any problem if I created a new variable with the same name after I have deleted the first one? I can't think of any reason why that could happen, but I'm pretty new to this programming language.
Also, I don't know if there is any less messy way of doing this.
This sounds like you are writing your own custom serializer.
I would recommend just using a existing library, like protobuf.net to define your messages if at all possible.
If this is not possible you can use a BinaryWriter to write your values to a Stream. If you want to keep it in memory use a MemoryStream and use .ToArray() when your done to get a array of all bytes.
As for memory, do not worry about it. Unless you have gigabyte sized messages the memory requirements should not be an issue, and the garbage collector will automatically recycle memory when it is no longer needed, and it can do this after the last usage, regardless of scope. If you have huge memory streams you might want to look at something like recyclable memory stream since this can avoid some allocation performance issues and fragmentation issues.
I have a project where audio is recorded at several sources, and the goal is to process X (user-defined) seconds of it through various methods (DSP as well as speech-to-text).
I'm using a MixingSampleProvider to collect all the sources into one "provider." I'm passing this to a NotifyingSampleProvider because it raises an event at each sample, and then passing that sample to my class that does the processing. I'm adding the float the NotifyingSampleProvider produces to the end of my "X second window" array (using Array.Copy to create a temp array with all but that last value, adding that last value, and copying the temp array back to the original) which I use for processing.
The obvious problem here is that it notifies (and that I'm locking and adding to the "X second window" array) for every single sample, or 44100 times a second. This leads to the audio being pretty much constantly locked so it can't be processed. There's got to be a more performant way to deal with this.
The one idea I had was a BufferedWaveProvider that doesn't get read from anywhere else, so it's always full (with DiscardOnOverflow = true of course). However A) this still requires a NotifyingSampleProvider to add to it periodically (you can't pass a provider to the BufferedWaveProvider for it to automatically read from) and I'd like to get away from such frequent (44100 Hz) function calls, and B) I understand that there's a limit to the BufferDuration length, which might be too small of a window (I don't recall what the limit is and I can't find anything online saying what it is).
How might I solve this problem? Using NAudio, how to I keep the last X seconds of audio accessible to a separate class at any time without using the NotifyingSampleProvider?
[SOLVED, the question is based on incorrect assumptions]
While working with TCP I came across an issue with NetworkStream.Read returning value 0 in two separate cases, which I have trouble distinguishing.
A bit of background - I have a working client-server solution communicating via TCP using length-prefixed messages. However, since most of the communication (apart from some initial messages exchange) happens from Client to Server, the Server doesn't have a good way to know whether a Client is still connected or not. One way to find this out is to send something to the Client from time to time, and that is what I decided to do.
I know that I can add a dedicated "ping" message to my protocol, and simply ignore it in the Client, but I was also testing for other possibilities. One thing I tried was to send an empty byte array to the Client like so:
networkStream.Write(new byte[0], 0, 0);
All looks good, it seems to be sending a TCP packet with no data in it... however! My client code does expect data from the Server from time to time, so it has a thread that blocks on networkStream.Read, like so:
int bytesRead = networkStream.Read(buffer, 0, 4);
if (bytesRead == 0)
break;
According to the docs, Socket.Read (or NetworkStream.Read) returns 0 if the other end closes the connection. This is true, but in my case, after sending the empty byte array, Read(...) also returns 0.
So far I was unable to distinguish those two situations. Socket.Connected checked after Read is true in both cases (connection closed and the empty byte array). Is there any other way to handle this?
Again, I do know that sending this empty array is almost the same as adding a new type of message for that purpose. I am not asking for a solution here... just wanted to know if .NET's Socket can distinguish between an empty byte array and connection closing.
EDIT:
I am sorry for bothering everyone with a question, that in the end was based on incorrect assumptions. My tests were not done on my production code, and were too sloppy. This caused me to draw incorrect conclusions.
Basically, what I was testing is if I do a Write(new byte[0]...) on one end, the other end's Read(...) will return 0. It did, but not due to the send. The TcpClient I used to test was falling out of scope, which (I assume) caused it to be Disposed by GC, thus the connection got closed, which caused the Read to return 0. I did repeat the test with TcpClient not being disposed/lost, and the Read does not return anything, no matter how many empty byte arrays I send.
At first, I expected the Nagle's algorithm to mess things up, but it doesn't in this case - 1-byte arrays arrive without delays, as I was testing on localhost. I may do a different test, using Sockets, and explicitly disabling Nagle's algorithm, but I don't think this will change anything.
Now I just need to check whether sending such an array will actually allow me to detect a disconnection, but that is a different story, not in the scope of this question.
EDIT 2:
I did some more tests regarding this, and found out, that despite several suggestions, e.g. here (which seems like a valid source of information), doing an empty Send does not recognize a broken connection. I physically disconnected a network cable, and my Server is doing those empty sends every 5 seconds. It's been going like that for several minutes, and no disconnection was detected. If I decide to send any data (even a single byte), the disconnection gets detected after 20 seconds at most.
From MSDN's NetworkStream.Read Method (Byte[], Int32, Int32):
The Read operation reads as much data as is available, up to the
number of bytes specified by the size parameter. If no data is
available for reading, the Read method returns 0.
While sending data, you're sending an empty byte array and writing zero bytes using:
networkStream.Write(new byte[0], 0, 0);
Whereas, while reading the data, you claim that
"My client code does expect data from the Server from time to time, so
it has a thread that blocks on networkStream."
int bytesRead = networkStream.Read(buffer, 0, 4);
if (bytesRead == 0)
break;
But, then again you're trying to read 4 bytes to a byte array, which would obviously keep on waiting. So, what else do you expect!
"...just wanted to know if .NET's Socket can distinguish between an empty
byte array and connection closing."
Connection closing is totally a different story, which involves several steps before closing of a Socket connection. So, obviously, it is not the same as sending or receiving zero bytes!
-> Lastly, as hinted by #WithMetta in the comments, please check whether the data is available to be read or not using NetworkStream.DataAvailable Property.
while(networkStream.DataAvailable) { // your code logic}
Note: Let me appologize for the length of this question, i had to put a lot of information into it. I hope that doesn't cause too many people to simply skim it and make assumptions. Please read in its entirety. Thanks.
I have a stream of data coming in over a socket. This data is line oriented.
I am using the APM (Async Programming Method) of .NET (BeginRead, etc..). This precludes using stream based I/O because Async I/O is buffer based. It is possible to repackage the data and send it to a stream, such as a Memory stream, but there are issues there as well.
The problem is that my input stream (which i have no control over) doesn't give me any information on how long the stream is. It simply is a stream of newline lines looking like this:
COMMAND\n
...Unpredictable number of lines of data...\n
END COMMAND\n
....repeat....
So, using APM, and since i don't know how long any given data set will be, it is likely that blocks of data will cross buffer boundaries requiring multiple reads, but those multiple reads will also span multiple blocks of data.
Example:
Byte buffer[1024] = ".................blah\nThis is another l"
[another read]
"ine\n.............................More Lines..."
My first thought was to use a StringBuilder and simply append the buffer lines to the SB. This works to some extent, but i found it difficult to extract blocks of data. I tried using a StringReader to read newlined data but there was no way to know whether you were getting a complete line or not, as StringReader returns a partial line at the end of the last block added, followed by returning null aftewards. There isn't a way to know if what was returned was a full newlined line of data.
Example:
// Note: no newline at the end
StringBuilder sb = new StringBuilder("This is a line\nThis is incomp..");
StringReader sr = new StringReader(sb);
string s = sr.ReadLine(); // returns "This is a line"
s = sr.ReadLine(); // returns "This is incomp.."
What's worse, is that if I just keep appending to the data, the buffers get bigger and bigger, and since this could run for weeks or months at a time that's not a good solution.
My next thought was to remove blocks of data from the SB as I read them. This required writing my own ReadLine function, but then I'm stuck locking the data during reads and writes. Also, the larger blocks of data (which can consist of hundreds of reads and megabytes of data) require scanning the entire buffer looking for newlines. It's not efficient and pretty ugly.
I'm looking for something that has the simplicity of a StreamReader/Writer with the convenience of async I/O.
My next thought was to use a MemoryStream, and write the blocks of data to a memory stream then attach a StreamReader to the stream and use ReadLine, but again I have issues with knowing if a the last read in the buffer is a complete line or not, plus it's even harder to remove the "stale" data from the stream.
I also thought about using a thread with synchronous reads. This has the advantage that using a StreamReader, it will always return a full line from a ReadLine(), except in broken connection situations. However this has issues with canceling the connection, and certain kinds of network problems can result in hung blocking sockets for an extended period of time. I'm using async IO because i don't want to tie up a thread for the life of the program blocking on data receive.
The connection is long lasting. And data will continue to flow over time. During the intial connection, there is a large flow of data, and once that flow is done the socket remains open waiting for real-time updates. I don't know precisely when the initial flow has "finished", since the only way to know is that no more data is sent right away. This means i can't wait for the initial data load to finish before processing, I'm pretty much stuck processing "in real time" as it comes in.
So, can anyone suggest a good method to handle this situation in a way that isn't overly complicated? I really want this to be as simple and elegant as possible, but I keep coming up with more and more complicated solutions due to all the edge cases. I guess what I want is some kind of FIFO in which i can easily keep appending more data while at the same time poping data out of it that matches certain criteria (ie, newline terminated strings).
That's quite an interesting question. The solution for me in the past has been to use a separate thread with synchronous operations, as you propose. (I managed to get around most of the problems with blocking sockets using locks and lots of exception handlers.) Still, using the in-built asynchronous operations is typically advisable as it allows for true OS-level async I/O, so I understand your point.
Well I've gone and written a class for accomplishing what I believe you need (in a relatively clean manner I would say). Let me know what you think.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
public class AsyncStreamProcessor : IDisposable
{
protected StringBuilder _buffer; // Buffer for unprocessed data.
private bool _isDisposed = false; // True if object has been disposed
public AsyncStreamProcessor()
{
_buffer = null;
}
public IEnumerable<string> Process(byte[] newData)
{
// Note: replace the following encoding method with whatever you are reading.
// The trick here is to add an extra line break to the new data so that the algorithm recognises
// a single line break at the end of the new data.
using(var newDataReader = new StringReader(Encoding.ASCII.GetString(newData) + Environment.NewLine))
{
// Read all lines from new data, returning all but the last.
// The last line is guaranteed to be incomplete (or possibly complete except for the line break,
// which will be processed with the next packet of data).
string line, prevLine = null;
while ((line = newDataReader.ReadLine()) != null)
{
if (prevLine != null)
{
yield return (_buffer == null ? string.Empty : _buffer.ToString()) + prevLine;
_buffer = null;
}
prevLine = line;
}
// Store last incomplete line in buffer.
if (_buffer == null)
// Note: the (* 2) gives you the prediction of the length of the incomplete line,
// so that the buffer does not have to be expanded in most/all situations.
// Change it to whatever seems appropiate.
_buffer = new StringBuilder(prevLine, prevLine.Length * 2);
else
_buffer.Append(prevLine);
}
}
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
private void Dispose(bool disposing)
{
if (!_isDisposed)
{
if (disposing)
{
// Dispose managed resources.
_buffer = null;
GC.Collect();
}
// Dispose native resources.
// Remember that object has been disposed.
_isDisposed = true;
}
}
}
An instance of this class should be created for each NetworkStream and the Process function should be called whenever new data is received (in the callback method for BeginRead, before you call the next BeginRead I would imagine).
Note: I have only verified this code with test data, not actual data transmitted over the network. However, I wouldn't anticipate any differences...
Also, a warning that the class is of course not thread-safe, but as long as BeginRead isn't executed again until after the current data has been processed (as I presume you are doing), there shouldn't be any problems.
Hope this works for you. Let me know if there are remaining issues and I will try to modify the solution to deal with them. (There could well be some subtlety of the question I missed, despite reading it carefully!)
What you're explaining in you're question, reminds me very much of ASCIZ strings. (link text). That may be a helpfull start.
I had to write something similar to this in college for a project I was working on. Unfortunatly, I had control over the sending socket, so I inserted a length of message field as part of the protocol. However, I think that a similar approach may benefit you.
How I approached my solution was I would send something like 5HELLO, so first I'd see 5, and know I had message length 5, and therefor the message I needed was 5 characters. However, if on my async read, i only got 5HE, i would see that I have message length 5, but I was only able to read 3 bytes off the wire (Let's assume ASCII characters). Because of this, I knew I was missing some bytes, and stored what I had in fragment buffer. I had one fragment buffer per socket, therefor avoiding any synchronization problems. The rough process is.
Read from socket into a byte array, record how many bytes was read
Scan through byte by byte, until you find a newline character (this becomes very complex if you're not receiving ascii characters, but characters that could be multiple bytes, you're on you're own for that)
Turn you're frag buffer into a string, and append you're read buffer up until the new line to it. Drop this string as a completed message onto a queue or it's own delegate to be processed. (you can optimize these buffers by actually having you're read socket writing to the same byte array as you're fragment, but that's harder to explain)
Continue looping through, every time we find a new line, create a string from the byte arrange from a recorded start / end position and drop on queue / delegate for processing.
Once we hit the end of our read buffer, copy anything that's left into the frag buffer.
Call the BeginRead on the socket, which will jump to step 1. when data is available in the socket.
Then you use another Thread to read you're queue of incommign messages, or just let the Threadpool handle it using delegates. And do whatever data processing you have to do. Someone will correct me if I'm wrong, but there is very little thread synchronization issues with this, since you can only be reading or waiting to read from the socket at any one time, so no worry about locks (except if you're populating a queue, I used delegates in my implementation). There are a few details you will need to work out on you're own, like how big of a frag buffer to leave, if you receive 0 newlines when you do a read, the entire message must be appended to the fragment buffer without overwriting anything. I think it ran me about 700 - 800 lines of code in the end, but that included the connection setup stuff, negotiation for encryption, and a few other things.
This setup performed very well for me; I was able to perform up to 80Mbps on 100Mbps ethernet lan using this implementation a 1.8Ghz opteron including encryption processing. And since you're tied to the socket, the server will scale since multiple sockets can be worked on at the same time. If you need items processed in order, you'll need to use a queue, but if order doesn't matter, then delegates will give you very scalable performance out of the threadpool.
Hope this helps, not meant to be a complete solution, but a direction in which to start looking.
*Just a note, my implementation was down purely at the byte level and supported encryption, I used characters for my example to make it easier to visualize.