Changes to Buffer While BeginWrite is Called - c#

I'm curious whether modifying the byte[] before BeginWrite has actually finished writing will affect what the FileStream finally writes.
I have the code below, where currentPage is a byte[] containing the data I want to write.
try
{
FileStream.BeginWrite(currentPage, 0, currentPage.Length, new AsyncCallback(EndWriteCallback),new State(logFile.fs, currentPage, BUFFER_SIZE, manualEvent));
manualEvent.WaitOne();
}
catch (Exception e)
{
//handle exception here
}
I have this within a loop that will replace the data in currentPage. What will happen if I make changes to currentPage (for example, assign a new byte[] filled with zeros)? Does FileStream copy the byte[] into a buffer of its own, or does it just reference the byte[] I passed in?
I tried looking at the MSDN article but all I could find was
Multiple simultaneous asynchronous requests render the request completion order uncertain.
Could someone please explain this to me?

This code should answer your question. First, I create a long byte array in which every cell is 255. Then I start two tasks. The first writes the prepared array to a file. At the same time, the second modifies the array, starting from the last cell, setting every cell to 0.
The exact results depend on the machine, the current CPU load and so on. On my computer, one run produced a file in which about 77% of the bytes were 255 and the rest 0; on the next run it was about 70%. This confirms that BeginWrite does not protect the input array against modification while the write is in progress.
To observe the effect, run the program a few times. You may also need to use a longer array.
var path = @"C:\Temp\temp.txt";
var list = new List<byte>();
for(var i = 0; i < 1000000; ++i)
list.Add(255);
var buffer = list.ToArray();
var t1 = Task.Factory.StartNew(() =>
{
using (var fs = File.OpenWrite(path))
{
var res = fs.BeginWrite(buffer, 0, buffer.Length, null, null);
res.AsyncWaitHandle.WaitOne();
}
});
var t2 = Task.Factory.StartNew(() =>
{
for (var i = buffer.Length - 1; i >= 0; --i)
buffer[i] = 0;
});
Task.WaitAll(t1, t2);
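If the loop has to overwrite currentPage while a write may still be pending, one way to stay safe is to hand the stream a private copy of the data. The following is only a minimal sketch: SnapshotWriter and WriteSnapshotAsync are hypothetical names (not part of FileStream), and it uses the Task-based WriteAsync instead of BeginWrite.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class SnapshotWriter
{
    // Hypothetical helper: copies the caller's buffer before starting the
    // asynchronous write, so later changes to the original array cannot
    // affect the bytes that reach the stream.
    public static Task WriteSnapshotAsync(Stream destination, byte[] source)
    {
        byte[] snapshot = (byte[])source.Clone();
        return destination.WriteAsync(snapshot, 0, snapshot.Length);
    }
}
```

The price is one extra allocation and copy per write. If that matters, an alternative is to alternate between two buffers and wait for each write to complete before reusing its buffer.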

Related

Receiving a complete network stream using int NetworkStream.Read(Span<Bytes>)

As the title says, I am trying to use the new (C# 8.0) Span object in my networking project. In my previous implementation I learned that I must make sure a NetworkStream has received a complete buffer before using its content; otherwise, depending on the connection, the data received on the other end may be incomplete.
while (true)
{
while (!stream.DataAvailable)
Thread.Sleep(10);
int received = 0;
byte[] response = new byte[consumerBufferSize];
//Loop that forces the stream to read all incoming data before using it
while (received < consumerBufferSize)
received += stream.Read(response, received, consumerBufferSize - received);
string[] message = ObjectWrapper.ConvertByteArrayToObject<string>(response);
consumerAction(this, message);
}
However, a different approach to reading network stream data has been introduced (Read(Span<byte>)). Assuming that stackalloc will help with performance, I am attempting to migrate my old implementation to accommodate this method. Here is what it looks like now:
while (true)
{
while (!stream.DataAvailable)
Thread.Sleep(10);
Span<byte> response = stackalloc byte[consumerBufferSize];
stream.Read(response);
string[] message = ObjectWrapper.ConvertByteArrayToObject<string>(response).Split('|');
consumerAction(this, message);
}
But how can I now be sure that the buffer was read completely, since this overload does not provide parameters like the ones I was using?
Edit:
//Former method
int Read (byte[] buffer, int offset, int size);
//The one I am looking for
int Read (Span<byte> buffer, int offset, int size);
I'm not sure I understand what you're asking. All the same features you relied on in the first code example still exist when using Span<byte>.
The Read(Span<byte>) overload still returns the count of bytes read. And since the Span<byte> is not the buffer itself, but rather just a window into the buffer, you can update the Span<byte> value to indicate the new starting point to read additional data. Having the count of bytes read and being able to specify the offset for the next read are all you need to duplicate the functionality in your old example. Of course, you don't currently have any code that saves the original buffer reference; you'll need to add that too.
I would expect something like this to work fine:
while (true)
{
while (!stream.DataAvailable)
Thread.Sleep(10);
int received = 0;
byte* response = stackalloc byte[consumerBufferSize];
while (received < consumerBufferSize)
{
// Span<byte> has a (void*, int) constructor, so offset the pointer itself
Span<byte> span = new Span<byte>(response + received, consumerBufferSize - received);
received += stream.Read(span);
}
// process response here...
}
Note that this requires unsafe code because of the way stackalloc works. You can only avoid that by using Span<T> and allocating new blocks each time. Of course, that will eventually eat up all your stack.
Since in your implementation you apparently are dedicating a thread to this infinite loop, I don't see how stackalloc is helpful. You might as well just allocate a long-lived buffer array in the heap and use that.
In other words, I don't really see how this is better than just using the original Read(byte[], int, int) overload with a regular managed array. But the above is how you'd get the code to work.
Aside: you should learn how the async APIs work. Since you're already using NetworkStream, the async/await patterns are a natural fit. And regardless of what API you use, a loop checking DataAvailable is just plain crap. Don't do that. The Read() method is already a blocking method; you don't need to wait for data to show up in a separate loop, since the Read() method won't return until there is some.
I am just adding a little bit of additional information.
The function that you are talking about has the following description
public override int Read (Span<byte> buffer);
(source : https://learn.microsoft.com/en-us/dotnet/api/system.net.sockets.networkstream.read?view=net-5.0 )
Here the returned int is the number of bytes read from the NetworkStream. Looking at the Span methods, we find Slice, with the following description:
public Span<T> Slice (int start);
(source : https://learn.microsoft.com/en-us/dotnet/api/system.span-1.slice?view=net-5.0#system-span-1-slice(system-int32) )
This returns a portion of our Span, which you can use to pass part of your stackalloc buffer to the NetworkStream without unsafe code.
Reusing your code, you could write something like this:
while (true)
{
while (!stream.DataAvailable)
Thread.Sleep(10);
int received = 0;
Span<byte> response = stackalloc byte[consumerBufferSize];
//Loop that forces the stream to read all incoming data before using it
while (received < consumerBufferSize)
received += stream.Read(response.Slice(received));
string[] message = ObjectWrapper.ConvertByteArrayToObject<string>(response).Split('|');
consumerAction(this, message);
}
In simple words, Slice "creates" a new Span that is a portion of the initial Span pointing to our stackalloc memory; the "start" parameter lets us choose where that portion begins. The portion is then passed to Read, which starts writing into our buffer at the position where the slice begins.
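One thing the loops above do not handle is Read returning 0, which happens once the remote side has closed the connection; in that case received never grows and the loop spins forever. A minimal sketch of a reusable helper that covers this case, assuming a hypothetical name ReadExact and working on any Stream (NetworkStream included):

```csharp
using System;
using System.IO;

public static class StreamReadHelper
{
    // Hypothetical helper: fills the whole span or throws if the stream
    // ends first. Stream.Read may return fewer bytes than requested, and
    // returns 0 once the end of the stream has been reached.
    public static void ReadExact(Stream stream, Span<byte> buffer)
    {
        int received = 0;
        while (received < buffer.Length)
        {
            int n = stream.Read(buffer.Slice(received));
            if (n == 0)
                throw new EndOfStreamException(
                    $"Stream ended after {received} of {buffer.Length} bytes.");
            received += n;
        }
    }
}
```

A call such as StreamReadHelper.ReadExact(stream, response) would then replace the inner while loop. On .NET 7 and later, the built-in Stream.ReadExactly method should cover the same need.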

Report progress without slowing procedure

I'm trying to decrypt a file and report the progress so I can show it in a progress bar. Here is my decryption function:
private static void Decrypt(String inName, String outName, byte[] rijnKey, byte[] rijnIV)
{
FileStream fin = new FileStream(inName, FileMode.Open, FileAccess.Read);
FileStream fout = new FileStream(outName, FileMode.OpenOrCreate, FileAccess.Write);
fout.SetLength(0);
byte[] bin = new byte[1048576];
long rdlen = 0;
long totlen = fin.Length;
int len;
SymmetricAlgorithm rijn = SymmetricAlgorithm.Create();
CryptoStream encStream = new CryptoStream(fout, rijn.CreateDecryptor(rijnKey, rijnIV), CryptoStreamMode.Write);
while (rdlen < totlen)
{
len = fin.Read(bin, 0, bin.Length);
encStream.Write(bin, 0, len);
rdlen = rdlen + len;
//Call here a method to report progress
}
encStream.Close();
fout.Close();
fin.Close();
}
I want to call a method inside the loop to report the progress, but depending on that method's response time this may slow down the decrypter. How can I report the progress without this problem?
Thanks!
A couple of suggestions for you:
In your reporting method, have it check as to how long it was since it last reported. If it is less than, say, 0.3s, have it return without doing anything - no progress bar needs to be updated more than 3 times per second.
and/or
Offload the work of the reporting method onto another thread - that way the method within your loop will return immediately (your loop can continue right away). In your method on the other thread, include a check not to start another thread (i.e. just return without doing anything) if the previous reporting thread has not completed yet.
Or simpler still, which may well work in your situation, include a counter in your loop and then, every n times through the loop, do your progress report and reset the counter to zero. Select a value for n by experiment, so that it updates often enough (a couple of times per second) but you are not doing more progress updates than you have to. For example, if your loop iterates 3000 times per second, doing your update every 1000th time will be fine.
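The first suggestion (skip the report if the last one was too recent) can be sketched as below. ThrottledProgress is a hypothetical class name, and the callback signature (bytes done, bytes total) is an assumption chosen to match the rdlen and totlen variables in the question.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: forwards progress to a callback at most once
// per minimum interval, so calling Report in a tight loop stays cheap.
public sealed class ThrottledProgress
{
    private readonly Action<long, long> _report;
    private readonly TimeSpan _minInterval;
    private readonly Stopwatch _sinceLast = Stopwatch.StartNew();
    private bool _reportedOnce;

    public ThrottledProgress(Action<long, long> report, TimeSpan minInterval)
    {
        _report = report;
        _minInterval = minInterval;
    }

    public void Report(long done, long total)
    {
        // Always let the first and the final update through.
        bool isFinal = done >= total;
        if (_reportedOnce && !isFinal && _sinceLast.Elapsed < _minInterval)
            return;

        _reportedOnce = true;
        _sinceLast.Restart();
        _report(done, total);
    }
}
```

Inside the decryption loop this would be a single call such as progress.Report(rdlen, totlen) after rdlen is updated. If the callback touches a progress bar, it still has to run on the UI thread; Progress<T> is one way to get that marshalling.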

Awaiting data from Serial Port in C#

I have an application that receives data from a wireless radio using RS-232. These radios use an API for communicating with multiple clients. To use the radios, I created a library to communicate with them that other software can use with minimal changes from a normal SerialPort connection. The library reads from a SerialPort object and inserts incoming data into different buffers depending on the radio it receives from. Each packet that is received contains a header indicating its length, source, etc.
I start by reading the header, which is fixed-length, from the port and parsing it. In the header, the length of the data is defined before the data payload itself, so once I know the length of the data, I then wait for that much data to be available, then read in that many bytes.
Example (the other elements from the header are omitted):
// Read header
byte[] header = new byte[RCV_HEADER_LENGTH];
this.Port.Read(header, 0, RCV_HEADER_LENGTH);
// Get length of data in packet
short dataLength = header[1];
byte[] payload = new byte[dataLength];
// Make sure all the payload of this packet is ready to read
while (this.Port.BytesToRead < dataLength) { }
this.Port.Read(payload, 0, dataLength);
Obviously the empty while loop is bad: if for some reason the data never arrives, the thread will hang. I haven't encountered this problem yet, but I'm looking for an elegant way to handle it. My first thought is to add a short timer that starts just before the while loop and, when it elapses, sets an abortRead flag that breaks the loop, like this:
// Make sure all the payload of this packet is ready to read
abortRead = false;
readTimer.Start();
while (this.Port.BytesToRead < dataLength && !abortRead) {}
This code needs to handle a constant stream of incoming data as quickly as it can, so keeping overhead to a minimum is a concern, and I am wondering whether I am doing this properly.
You don't need this while loop. Read blocks until data is available and throws a TimeoutException if nothing arrives within SerialPort.ReadTimeout (which you can adjust to your needs). Note that Read may return fewer bytes than requested, so check its return value and keep reading until the whole payload has arrived.
As a general remark, your while loop burns CPU for nothing: in the few milliseconds it takes the data to arrive, it runs through thousands of iterations. You should at least add a Thread.Sleep inside it.
If you want to truly address this problem, you need to run the code in the background. There are different options for that: you can start a thread, start a Task, or use async/await.
To fully cover all options, the answer would be endless. If you use threads or tasks with the default scheduler and your wait time is expected to be rather short, you can use SpinWait.SpinUntil instead of your while loop. This will perform better than your solution:
SpinWait.SpinUntil(() => this.Port.BytesToRead >= dataLength);
If you are free to use async await, I would recommend this solution, since you need only a few changes to your code. You can use Task.Delay and in the best case you pass a CancellationToken to be able to cancel your operation:
try {
while (this.Port.BytesToRead < dataLength) {
await Task.Delay(100, cancellationToken);
}
}
catch(OperationCanceledException) {
//Cancellation logic
}
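To also cover the case where the data never arrives, the polling loop can be combined with a timeout. This is a sketch under assumptions: Polling and WaitUntilAsync are hypothetical names, and the timeout is implemented with a CancellationTokenSource that cancels itself after the given time.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Polling
{
    // Hypothetical helper: polls a condition until it holds or the timeout
    // elapses. Returns true if the condition was met, false on timeout.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition, TimeSpan pollInterval, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                while (!condition())
                    await Task.Delay(pollInterval, cts.Token);
                return true;
            }
            catch (OperationCanceledException)
            {
                // Task.Delay was cancelled because the timeout elapsed.
                return false;
            }
        }
    }
}
```

With the names from the question, a call could look like await Polling.WaitUntilAsync(() => this.Port.BytesToRead >= dataLength, TimeSpan.FromMilliseconds(10), TimeSpan.FromSeconds(1)); the interval and timeout values here are placeholders to tune.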
I think I would do this asynchronously with the SerialPort DataReceived event.
// Class fields
private const int RCV_HEADER_LENGTH = 8;
private const int MAX_DATA_LENGTH = 255;
private SerialPort Port;
private byte[] PacketBuffer = new byte[RCV_HEADER_LENGTH + MAX_DATA_LENGTH];
private int Readi = 0;
private int DataLength = 0;
// In your constructor
this.Port.DataReceived += new SerialDataReceivedEventHandler(DataReceivedHandler);
private void DataReceivedHandler(object sender, SerialDataReceivedEventArgs e)
{
if (e.EventType != SerialData.Chars)
{
return;
}
// Read all available bytes.
int len = Port.BytesToRead;
byte[] data = new byte[len];
Port.Read(data, 0, len);
// Go through each byte.
for (int i = 0; i < len; i++)
{
// Add the next byte to the packet buffer.
PacketBuffer[Readi++] = data[i];
// Check if we've received the complete header.
if (Readi == RCV_HEADER_LENGTH)
{
DataLength = PacketBuffer[1];
}
// Check if we've received the complete data.
if (Readi == RCV_HEADER_LENGTH + DataLength)
{
// The packet is complete; add it to the appropriate buffer.
Readi = 0;
}
}
}
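The reassembly logic in the handler can also be pulled out into a small class that knows nothing about SerialPort, which makes it easy to test with arbitrary chunk boundaries. A minimal sketch, assuming (as in the question) an 8-byte header whose second byte holds the payload length; PacketAssembler and Feed are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: reassembles packets from arbitrary chunks of bytes.
// Assumes a fixed-length header whose second byte holds the payload length.
public sealed class PacketAssembler
{
    private const int HeaderLength = 8;   // RCV_HEADER_LENGTH in the code above
    private readonly byte[] _packet = new byte[HeaderLength + byte.MaxValue];
    private int _count;

    // Feeds received bytes in; returns every packet completed by this chunk.
    public List<byte[]> Feed(ReadOnlySpan<byte> chunk)
    {
        var completed = new List<byte[]>();
        foreach (byte b in chunk)
        {
            _packet[_count++] = b;
            if (_count < HeaderLength)
                continue;                       // header not complete yet

            int payloadLength = _packet[1];
            if (_count == HeaderLength + payloadLength)
            {
                completed.Add(_packet.AsSpan(0, _count).ToArray());
                _count = 0;                     // start the next packet
            }
        }
        return completed;
    }
}
```

The DataReceived handler would then only read the available bytes and pass them to Feed, dispatching each returned packet to the buffer for its source radio.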

WebException (timeout) When Reading Files From Multiple Threads After Setting ThreadPool.MaxThreads

So, I tracked down the issue, but I don't understand the root cause and I'm curious.
I have multiple threads reading files (sometimes the same file, but usually different files. This doesn't seem to matter) from a local drive. This is the test setup, but in production these files are retrieved from a web server.
Anyway, I noticed that, after calling ThreadPool.SetMaxThreads(), I was receiving timeouts reading these files. Removing that line makes the problem go away. My hunch is that it has to do with setting the number of asynchronous IO threads (completionPortThreads, the second argument), but even when I set that value to a large number (50, 100, ...), the issue remains.
Removing the call to SetMaxThreads "fixes" the issue, though it means I can't increase or decrease the number of threads for testing purposes.
Here is a block of code which reproduces the issue. The file size doesn't matter as my test files range anywhere from 2KB to 3MB.
class Program
{
static void Main(string[] args)
{
_count = 15;
// Comment this line out and everything works
ThreadPool.SetMaxThreads(13, 50);
using (var mre = new ManualResetEvent(false))
{
for (int i = 0; i < _count; ++i)
{
ThreadPool.QueueUserWorkItem(ThreadFunc, mre);
}
mre.WaitOne();
}
}
private static readonly ConcurrentStack<byte[]> _files = new ConcurrentStack<byte[]>();
private static int _count;
private static void ThreadFunc(object o)
{
const string path = @"SomeLocalFile";
var file = ReadFile(path);
_files.Push(file);
if (Interlocked.Decrement(ref _count) == 0)
{
((ManualResetEvent)o).Set();
}
}
private static byte[] ReadFile(string uri)
{
var request = WebRequest.Create(uri);
using (var response = request.GetResponse())
using (var stream = response.GetResponseStream())
{
var ret = new byte[stream.Length];
stream.Read(ret, 0, ret.Length);
return ret;
}
}
}
So, yeah, not sure what's going on here. Even with a large value for IO threads I timeout on each test. I'm certainly missing something.
FileWebRequest, which is the type returned by WebRequest.Create() here, also uses ThreadPool.QueueUserWorkItem. Since you limit the worker threads, the work queued by FileWebRequest never gets executed. You need to set the maximum number of worker threads to at least _count + 1 (plus 1 so that there is at least one thread that can process the work queued by FileWebRequest).
FileWebRequest.GetRequestStream does the following:
ThreadPool.QueueUserWorkItem(read file)
Wait until the file is read or timeout is reached
Better Solution:
Do not enqueue items to the ThreadPool. Use WebRequest.GetResponseAsync instead.

How do I achieve better granularity when performing streaming operations?

Okay so I'm working on my file transfer service, and I can transfer the files fine with WCF streaming. I get good speeds, and I'll eventually be able to have good resume support because I chunk my files into small bits before streaming.
However, I'm running into issues with both the server side transfer and the client side receiving when it comes to measuring a detailed transfer speed as the messages are streamed and written.
Here's the code where the file is chunked, which is called by the service every time it needs to send another chunk to the client.
public byte[] NextChunk()
{
if (MoreChunks) // If there are more chunks, proceed with the next chunking operation, otherwise throw an exception.
{
byte[] buffer;
using (BinaryReader reader = new BinaryReader(File.OpenRead(FilePath)))
{
reader.BaseStream.Position = currentPosition;
buffer = reader.ReadBytes((int)MaximumChunkSize);
}
currentPosition += buffer.LongLength; // Sets the stream position to be used for the next call.
return buffer;
}
else
throw new InvalidOperationException("The last chunk of the file has already been returned.");
}
In the above, I fill the buffer based on the chunk size I am using (in this case 2 MB, which gave me the best transfer speeds compared to larger or smaller chunk sizes). I then do a little work to remember where I left off, and return the buffer.
The following code is the server side work.
public FileMessage ReceiveFile()
{
if (!transferSpeedTimer.Enabled)
transferSpeedTimer.Start();
byte[] buffer = chunkedFile.NextChunk();
FileMessage message = new FileMessage();
message.FileMetaData = new FileMetaData(chunkedFile.MoreChunks, buffer.LongLength);
message.ChunkData = new MemoryStream(buffer);
if (!chunkedFile.MoreChunks)
{
OnTransferComplete(this, EventArgs.Empty);
Timer timer = new Timer(20000f);
timer.Elapsed += (sender, e) =>
{
StopSession();
timer.Stop();
};
timer.Start();
}
//This needs to be more granular. This method is called too infrequently for a fast and accurate enough progress of the file transfer to be determined.
TotalBytesTransferred += buffer.LongLength;
return message;
}
In this method, which is called by the client in a WCF call, I get information for the next chunk, create my message, do a little bit with timers to stop my session once the transfer is complete and update the transfer speeds. Shortly before I return the message I increment my TotalBytesTransferred with the length of the buffer, which is used to help me calculate transfer speed.
The problem is that it takes a while to stream the file to the client, so the speeds I'm getting are false. What I'm aiming for is a more granular update of the TotalBytesTransferred variable, so I have a better picture of how much data is being sent to the client at any given time.
Now, for the client side code, which uses an entirely different way of calculating transfer speed.
if (Role == FileTransferItem.FileTransferRole.Receiver)
{
hostChannel = channelFactory.CreateChannel();
((IContextChannel)hostChannel).OperationTimeout = new TimeSpan(3, 0, 0);
bool moreChunks = true;
long bytesPreviousPosition = 0;
using (BinaryWriter writer = new BinaryWriter(File.OpenWrite(fileWritePath)))
{
writer.BaseStream.SetLength(0);
transferSpeedTimer.Elapsed += ((sender, e) =>
{
transferSpeed = writer.BaseStream.Position - bytesPreviousPosition;
bytesPreviousPosition = writer.BaseStream.Position;
});
transferSpeedTimer.Start();
while (moreChunks)
{
FileMessage message = hostChannel.ReceiveFile();
moreChunks = message.FileMetaData.MoreChunks;
writer.BaseStream.Position = filePosition;
// This is golden, but I need to extrapolate it out and do the stream copy myself so I can do calculations on a per byte basis.
message.ChunkData.CopyTo(writer.BaseStream);
filePosition += message.FileMetaData.ChunkLength;
// TODO This needs to be more granular
TotalBytesTransferred += message.FileMetaData.ChunkLength;
}
OnTransferComplete(this, EventArgs.Empty);
}
}
else
{
transferSpeedTimer.Elapsed += ((sender, e) =>
{
totalElapsedSeconds += (int)transferSpeedTimer.Interval;
transferSpeed = TotalBytesTransferred / totalElapsedSeconds;
});
transferSpeedTimer.Start();
host.Open();
}
Here, my TotalBytesTransferred is also based on the length of the chunk coming in. I know I can get a more granular calculation if I do the stream writing myself instead of using the CopyTo for the stream, but I'm not exactly sure how to best go about this.
Can anybody help me out here? Outside of this class I have another class polling the property of TransferSpeed as it's updated internally.
I apologize if I posted too much code, but I wasn't sure what to post and what not.
EDIT: I realize at least with the Server side implementation, the way I can get a more granular reading on how many bytes have been transferred, is by reading the position of the return message value of the stream. However, I don't know a way to do this to ensure absolute integrity on my count. I thought about maybe using a timer and polling the position as the stream was being transferred, but then the next call might be made and I would quickly become out of sync.
How can I poll data from the returning stream and know immediately when the stream finishes so I can quickly add up the remainder of what was left of the stream into my byte count?
Okay I have found what seems to be ideal for me. I don't know if it's perfect, but it's pretty darn good for my needs.
On the Server side, we have this code that does the work of transferring the file. The chunkedFile class obviously does the chunking, but this is the code that sends the information to the Client.
public FileMessage ReceiveFile()
{
byte[] buffer = chunkedFile.NextChunk();
FileMessage message = new FileMessage();
message.FileMetaData = new FileMetaData(chunkedFile.MoreChunks, buffer.LongLength, chunkedFile.CurrentPosition);
message.ChunkData = new MemoryStream(buffer);
TotalBytesTransferred = chunkedFile.CurrentPosition;
UpdateTotalBytesTransferred(message);
if (!chunkedFile.MoreChunks)
{
OnTransferComplete(this, EventArgs.Empty);
Timer timer = new Timer(20000f);
timer.Elapsed += (sender, e) =>
{
StopSession();
timer.Stop();
};
timer.Start();
}
return message;
}
The client basically calls this code, and the server proceeds to get a new chunk, put it in a stream, and update TotalBytesTransferred based on the position of the chunkedFile (which keeps track of the underlying file that the data is drawn from). I'll show the UpdateTotalBytesTransferred(message) method in a moment, as that is where all the code that gives the server and client more granular polling of TotalBytesTransferred resides.
Next up is the client side work.
hostChannel = channelFactory.CreateChannel();
((IContextChannel)hostChannel).OperationTimeout = new TimeSpan(3, 0, 0);
bool moreChunks = true;
using (BinaryWriter writer = new BinaryWriter(File.OpenWrite(fileWritePath)))
{
writer.BaseStream.SetLength(0);
while (moreChunks)
{
FileMessage message = hostChannel.ReceiveFile();
moreChunks = message.FileMetaData.MoreChunks;
UpdateTotalBytesTransferred(message);
writer.BaseStream.Position = filePosition;
message.ChunkData.CopyTo(writer.BaseStream);
TotalBytesTransferred = message.FileMetaData.FilePosition;
filePosition += message.FileMetaData.ChunkLength;
}
OnTransferComplete(this, EventArgs.Empty);
}
This code is very simple. It calls the host to get the file stream, and also utilizes the UpdateTotalBytesTransferred(message) method. It does a little bit of work to remember the position of the underlying file that is being written, and copies the stream to that file while also updating the TotalBytesTransferred after finishing.
The way I achieved the granularity I was looking for was with the UpdateTotalBytesTransferred method as follows. It works exactly the same for both the Server and the Client.
private void UpdateTotalBytesTransferred(FileMessage message)
{
long previousStreamPosition = 0;
long totalBytesTransferredShouldBe = TotalBytesTransferred + message.FileMetaData.ChunkLength;
Timer timer = new Timer(500f);
timer.Elapsed += (sender, e) =>
{
if (TotalBytesTransferred + (message.ChunkData.Position - previousStreamPosition) < totalBytesTransferredShouldBe)
{
TotalBytesTransferred += message.ChunkData.Position - previousStreamPosition;
previousStreamPosition = message.ChunkData.Position;
}
else
{
timer.Stop();
timer.Dispose();
}
};
timer.Start();
}
What this does is take in the FileMessage which is basically just a stream and some information about the file itself. It has a variable previousStreamPosition to remember the last position it was when it was polling the underlying stream. It also does a simple calculation with totalBytesTransferredShouldBe based on how many bytes are already transferred plus the total length of the stream.
Finally, a timer is created and started, which on every tick checks whether it needs to increment TotalBytesTransferred. If it's not supposed to update it anymore (basically, the end of the stream has been reached), it stops and disposes of the timer.
This all allows me to get very fine-grained readings of how many bytes have been transferred, which lets me calculate the total progress in a more fluid way and measure the achieved file transfer speeds more accurately.
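On the client side, an alternative to polling the stream position with a timer is the approach the question itself hints at: doing the copy by hand and updating the count after every buffer. A minimal sketch, with StreamCopy and CopyWithProgress as hypothetical names and the buffer size as an arbitrary default:

```csharp
using System;
using System.IO;

public static class StreamCopy
{
    // Hypothetical replacement for Stream.CopyTo that reports the running
    // byte count after every buffer, not once per chunk.
    public static long CopyWithProgress(
        Stream source, Stream destination, Action<long> onProgress,
        int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
            onProgress(total);
        }
        return total;
    }
}
```

In the client loop this could replace message.ChunkData.CopyTo(writer.BaseStream), with the callback adding the reported count to the byte total recorded before the chunk started. Because another class polls the value from a different thread, the update may need Interlocked or a similar safeguard.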
