StreamReader ThreadSafe Issue? Possibly?

I'm guessing this doesn't work because StreamReader is not thread-safe (I don't know how to fix that, and Google is no help).
Anyway, I've been trying to figure out exactly what's wrong with this code. It works 80% of the time; other times it fails to parse incoming packets and just drops them.
This is a method for an HTTP-like TCP server I'm writing. A packet looks exactly like an HTTP message, but the "CONTENT-LENGTH" header gives the length of the packet's data (payload). This is where the problem is happening. Can anyone suggest how to improve and fix this? I'm completely lost.
void InternalStart()
{
    bool continueWhile = true;
    while (continueWhile)
    {
        if (SR.EndOfStream)
        {
            continueWhile = false;
            break;
        }
        if (par_ReadStatus != ReadStatusEnum.WaitingForPayload)
        {
            int charCode = SR.Peek();
            if (charCode == -1)
            {
                continueWhile = false;
                break;
            }
            string outputLine = "";
            outputLine = SR.ReadLine();
            ReadLine(outputLine);
        }
        else if (par_ReadStatus == ReadStatusEnum.WaitingForPayload)
        {
            int length = int.Parse(par_ParsingPacket.Attributes["CONTENT-LENGTH"]);
            char[] array = new char[length];
            for (int i = 0; i < length; i++)
            {
                array.SetValue(Convert.ToChar(SR.Read()), i);
            }
            string payload = new string(array);
            ReadLine(payload);
        }
    }
    if (ReadEnd != null)
    {
        ReadEnd();
    }
}

StreamReader being non thread safe, (don't know howto fix that, google is no help)
Simple. Beginner programmer level: do not read from the StreamReader on more than one thread. A design that tries to do so shows a failure to understand what a stream is and how efficient multithreaded programming works.
There is no need to have multiple threads hit a single stream reader at all. You have to isolate the threads beforehand and assign a stream reader exclusively to a specific thread for the time it handles the data. If you want to get professional and fast, you work like IIS: pull the data out on infrastructure threads, which then feed work packets into a worker queue that multiple threads work off.
And depending on performance requirements, you may want to work directly on sockets and use the async socket mechanisms to make sure you are not wasting 1000 threads on 1000 operations in progress, at great cost and without any benefit.
Anyway i've been trying to figure exactly whats wrong with this code,
Ah - nice try. Sadly you neither tell us what problem you really have nor does your code show anything using threads, so at the end your question and the code fail to make any sense in combination.
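As a rough illustration of the "one reader thread, many workers" idea from this answer, here is a minimal sketch. It is not the asker's server: the StringReader stands in for the network StreamReader, and the names SingleReaderDemo and Run are made up for the example. Only the calling thread ever touches the reader; the workers only see complete items taken from a BlockingCollection.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class SingleReaderDemo
{
    // Reads lines on the calling thread only and hands them to worker tasks.
    // Returns how many items the workers handled. (Hypothetical helper.)
    public static int Run(TextReader reader, int workerCount)
    {
        int handled = 0;
        using (var queue = new BlockingCollection<string>(100))
        {
            var workers = new Task[workerCount];
            for (int i = 0; i < workers.Length; i++)
            {
                workers[i] = Task.Run(() =>
                {
                    // Blocks until an item arrives or CompleteAdding is called.
                    foreach (string packet in queue.GetConsumingEnumerable())
                    {
                        // Real packet handling would go here.
                        Interlocked.Increment(ref handled);
                    }
                });
            }

            // The reader is never shared: only this thread calls ReadLine.
            string line;
            while ((line = reader.ReadLine()) != null)
                queue.Add(line);

            queue.CompleteAdding(); // lets the workers' loops finish
            Task.WaitAll(workers);
        }
        return handled;
    }

    public static void Main()
    {
        using (var reader = new StringReader("packet-1\npacket-2\npacket-3\n"))
            Console.WriteLine(Run(reader, 2)); // prints 3
    }
}
```

The bounded capacity (100 here, an arbitrary choice) makes the reader wait when the workers fall behind instead of letting the queue grow without limit.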

Related

Await doesn't give a result

Ok, I have some code to present. Here is extension method for NetworkStream object.
public async static Task<byte[]> ReadDataAsync(this NetworkStream clientStream)
{
byte[] data = {};
var buffer = new byte[1024];
if (clientStream.CanRead)
{
using (var ms = new MemoryStream())
{
try
{
int bytesRead;
while (clientStream.DataAvailable &&
(bytesRead = await clientStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
await ms.WriteAsync(buffer, 0, bytesRead);
}
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
return data;
}
data = ms.ToArray();
}
}
else
{
Console.WriteLine("Closing clientStream.");
clientStream.Close();
}
return data;
}
And the code where I am trying to call this method.
public async static Task Preform(Socket client)
{
var stream = new NetworkStream(client);
var data = await stream.ReadDataAsync();
var message = await MessageFabrique.DeserializeMessage(data);
ServerCollections.Instance.ServerIssueQueue.Add(new ServerIssue
{
Message = message,
ClientStream = stream
});
}
ReadDataAsync always returns an empty array. Then, when I try to deserialize the data, there is an exception because of data[0]. Please help me. Why is this happening, if await guarantees me the result when it is needed?

clientStream.DataAvailable does not mean data might show up in the future. It means data is available right now for reading. Get rid of it and just read; the read will wait until data shows up, or will return 0 when the stream hits its end.
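A minimal sketch of that advice, assuming nothing about the asker's protocol: the loop simply keeps reading until ReadAsync returns 0. The helper name ReadToEndAsync is made up for the example, and a MemoryStream stands in for the NetworkStream so the snippet runs on its own.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ReadLoopDemo
{
    // Reads until the stream reports end-of-stream (ReadAsync returns 0).
    // No DataAvailable check: an empty receive buffer is not "end of data".
    public static async Task<byte[]> ReadToEndAsync(Stream source)
    {
        var buffer = new byte[1024];
        using (var ms = new MemoryStream())
        {
            int bytesRead;
            while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
                ms.Write(buffer, 0, bytesRead);
            return ms.ToArray();
        }
    }

    public static async Task Main()
    {
        // Stand-in for a NetworkStream (assumption for the demo).
        using (var source = new MemoryStream(new byte[3000]))
        {
            byte[] data = await ReadToEndAsync(source);
            Console.WriteLine(data.Length); // prints 3000
        }
    }
}
```

On a real NetworkStream a read returns 0 only once the remote side has closed or shut down its end of the connection, so a protocol that keeps the connection open needs its own framing (a length prefix or an end marker) rather than reading to the end.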

Scott's answer is right, but .NET already takes care of this for you...
You might consider Stream.CopyToAsync
await clientStream.CopyToAsync(ms)
for code with considerably fewer places to go wrong.

In addition to the other answers, you might also want to create a synchronization context. See this article for details.
The summary is that async/await works differently in console applications than it does in a UI application. WPF and WebForms applications have a synchronization context by default but console applications don't. The result (which is actually remarkably "poorly advertised" in the documentation) is that the behavior of async/await is much less predictable in a console application than it is in a UI application, and that this might make it not work "as advertised" under certain circumstances.
For example, in a UI application "async" doesn't necessarily mean that the code runs on a background thread. It's the equivalent of "come back to me later when I'm ready." As an analogy, consider going out to eat with 10 people: when the waiter comes by, the first person he asks to order isn't ready. Two bad solutions here would be to a) bring in a second waiter to either wait for the first guy to become ready or take the other 9 people's orders) or b) wait until the first guy's ready to start taking orders. The optimal thing is to take the other 9 people's orders and then come back to the first guy hoping he'll be ready by that time. At risk of oversimplifying this is basically how async works in a UI (unless you're explicitly putting the code on a background thread with something like Task.Run). However, in a console application when you use async there's no guarantee as to where the code will actually run.
If, however, you add a synchronization context as described in the article I link to, it'll behave in a much more predictable manner.

Is there a way to determine whether there is no remaining data when reading from SerialPort in C#?

I'm trying to read data from SerialPort and what I'm now trying to do is to poll the SerialPort every 100ms and see if it contains any remaining data.
public static async Task<string> ReadRemaining(SerialPort port)
{
int prevBytesToRead = 0;
await Task.Delay(100);
while (prevBytesToRead != port.BytesToRead || port.BytesToRead == 0)
{
prevBytesToRead = port.BytesToRead;
await Task.Delay(100);
}
return port.ReadExisting();
}
I feel this is very inefficient. Is there a better way?

You already are using the BytesToRead property, which is not really the best approach [1], but I guess it is working for you.
You appear to be asking for a prediction of the future: you're asking about data that hasn't yet been sent from the other device. There's no way for the serial port to know that. Perhaps, based on your application protocol, the other device tells you when it is done sending (end of message marker) or perhaps told you how long the message would be.
You have to figure out how to rephrase the question so it doesn't require prophesying the future.
[1] The clean way to use serial ports and C# async is to call port.BaseStream.ReadAsync. Make sure the timeouts are set properly. With Win32 timeouts you can detect gaps in the stream, which often correspond to message boundaries; sadly, .NET SerialPort doesn't allow that. But you can still find a better value for the ReadTimeout property than the default.

CPU-greedy loop when streaming music

To give some context, I'm working on an open-source alternative desktop Spotify client, with accessibility at its core. You'll also see some NAudio in here.
I'm noticing pretty intense CPU usage as soon as playback starts. Even when paused, the CPU is high.
I ran Visual Studio's built-in profiler to try and shed some light on any resource hogs that might be occurring. As I suspected, the problem was in my playback manager's streaming loop.
The code that the profiler flags as one of the most sample-rich is as follows:
const int secondsToBuffer = 3;
private void GetStreaming(object state)
{
this.fullyDownloaded = false;
// secondsToBuffer is an integer to represent how many seconds we should buffer up at once to prevent choppy playback on slow connections
try
{
do
{
if (bufferedWaveProvider == null)
{
this.bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(44100, 2));
this.bufferedWaveProvider.BufferDuration = TimeSpan.FromSeconds(20); // allow us to get well ahead of ourselves
Logger.WriteDebug("Creating buffered wave provider");
this.gatekeeper.MinimumSampleSize = bufferedWaveProvider.WaveFormat.AverageBytesPerSecond * secondsToBuffer;
}
// this bit in particular seems to be the hot point
if (bufferedWaveProvider != null && bufferedWaveProvider.BufferLength - bufferedWaveProvider.BufferedBytes < bufferedWaveProvider.WaveFormat.AverageBytesPerSecond / 4)
{
Logger.WriteDebug("Buffer getting full, taking a break");
Thread.Sleep(500);
}
// do we have at least double the buffered sample's size in free space, just in case
else if (bufferedWaveProvider.BufferLength - bufferedWaveProvider.BufferedBytes > bufferedWaveProvider.WaveFormat.AverageBytesPerSecond * (secondsToBuffer * 2))
{
var sample = gatekeeper.Read();
if (sample != null)
{
bufferedWaveProvider.AddSamples(sample, 0, sample.Length);
}
}
} while (playbackState != StreamingPlaybackState.Stopped);
Logger.WriteDebug("Playback stopped");
}
finally
{
// no post-processing work here, right?
}
}
An NAudio sample was the inspiration for my way of handling streaming in this method. To find the full file's source code, you can view it here: http://blindspot.codeplex.com/SourceControl/latest#Blindspot.Playback/PlaybackManager.cs
I'm a newbie to profiling and I'm not a year on year expert on streaming either (both might be obvious).
Is there any way I can make this loop less resource intensive? Would increasing the sleep amount in the if block where the buffer is full help? Or am I barking up the wrong tree here? It seems like it would, but I'd have thought half a second would be sufficient.
Any help gratefully received.

Basically, you've created an infinite loop until the buffer gets full. The section you've marked with
// this bit in particular seems to be the hot point
probably appears to be the hot point because the calculations in the if statement are just being repeated over and over again; can any of them be moved outside of the loop?
I'd put a Thread.Sleep(50) before the while statement to prevent thrashing and see if that makes a difference (I suspect it will).

C# Threading - Reading and hashing multiple files concurrently, easiest method?

I've been trying to get what I believe to be the simplest possible form of threading to work in my application but I just can't do it.
What I want to do: I have a main form with a status strip and a progress bar on it. I have to read something between 3 and 99 files and add their hashes to a string[] which I want to add to a list of all files with their respective hashes. Afterwards I have to compare the items on that list to a database (which comes in text files).
Once all that is done, I have to update a textbox in the main form and the progressbar to 33%; mostly I just don't want the main form to freeze during processing.
The files I'm working with always sum up to 1.2GB (+/- a few MB), meaning I should be able to read them into byte[]s and process them from there (I have to calculate CRC32, MD5 and SHA1 of each of those files so that should be faster than reading all of them from a HDD 3 times).
Also I should note that some files may be 1MB while another one may be 1GB. I initially wanted to create 99 threads for 99 files but that seems not wise, I suppose it would be best to reuse threads of small files while bigger file threads are still running. But that sounds pretty complicated to me so I'm not sure if that's wise either.
So far I've tried workerThreads and backgroundWorkers but neither seem to work too well for me; at least the backgroundWorkers worked SOME of the time, but I can't even figure out why they won't the other times... either way the main form still froze.
Now I've read about the Task Parallel Library in .NET 4.0 but I thought I should better ask someone who knows what he's doing before wasting more time on this.
What I want to do looks something like this (without threading):
List<string[]> fileSpecifics = new List<string[]>();
int fileMaxNumber = 42; // something between 3 and 99, depending on file set
for (int i = 1; i <= fileMaxNumber; i++)
{
string fileName = "C:\\path\\to\\file" + i.ToString("D2") + ".ext"; // file01.ext - file99.ext
string fileSize = new FileInfo(fileName).Length.ToString();
byte[] file = File.ReadAllBytes(fileName);
// hash calculations (using SHA1CryptoServiceProvider() etc., no problems with that so I'll spare you that, return strings)
file = null; // I didn't yet check if this made any actual difference but I figured it couldn't hurt
fileSpecifics.Add(new string[] { fileName, fileSize, fileCRC, fileMD5, fileSHA1 });
}
// look for files in text database mentioned above, i.e. first check for "file bundles" with the same amount of files I have here; then compare file sizes, then hashes
// again, no problems with that so I'll spare you that; the database text files are pretty small so parsing them doesn't need to be done in an extra thread.
Would anybody be kind enough to point me in the right direction? I'm looking for the easiest way to read and hash those files quickly (I believe the hashing takes some time in which other files could already be read) and save the output to a string[], without the main form freezing, nothing more, nothing less.
I'm thankful for any input.
EDIT to clarify: by "backgroundWorkers working some of the time" I meant that (for the very same set of files), maybe the first and fourth execution of my code produces the correct output and the UI unfreezes within 5 seconds, for the second, third and fifth execution it freezes the form (and after 60 seconds I get an error message saying some thread didn't respond within that time frame) and I have to stop execution via VS.
Thanks for all your suggestions and pointers, as you all have correctly guessed I'm completely new to threading and will have to read up on the great links you guys posted.
Then I'll give those methods a try and flag the answer that helped me the most. Thanks again!

With .NET Framework 4.X
Use Directory.EnumerateFiles Method for efficient/lazy files enumeration
Use Parallel.For() to let the TPL manage the parallelism, or use the TPL to create a single Task per pipeline stage
Use Pipelines pattern to pipeline following stages: calculating hashcodes, compare with pattern, update UI
To avoid UI freeze use appropriate techniques: for WPF use Dispatcher.BeginInvoke(), for WinForms use Invoke(), see this SO answer
Considering that all this stuff has a UI, it might be useful to add a cancellation feature to abandon a long-running operation if needed; take a look at the CancellationTokenSource.CreateLinkedTokenSource method, which allows triggering a CancellationToken from the "external scope"
I can try adding an example, but it's worth doing it yourself so you learn all this stuff rather than simply copy/paste -> got it working -> forgot about it.
PS: Must read - Pipelines paper at MSDN
TPL specific pipeline implementation
Pipeline pattern implementation: three stages: calculate hash, match, update UI
Three tasks, one per stage
Two Blocking Queues
//
// 1) CalculateHashesImpl() should store all calculated hashes here
// 2) CompareMatchesImpl() should read input hashes from this queue
// Tuple.Item1 - hash, Tuple.Item2 - file path
var calculatedHashes = new BlockingCollection<Tuple<string, string>>();
// 1) CompareMatchesImpl() should store all pattern matching results here
// 2) SyncUiImpl() method should read from this collection and update
// UI with available results
var comparedMatches = new BlockingCollection<string>();
var factory = new TaskFactory(TaskCreationOptions.LongRunning,
TaskContinuationOptions.None);
var calculateHashesWorker = factory.StartNew(() => CalculateHashesImpl(...));
var comparedMatchesWorker = factory.StartNew(() => CompareMatchesImpl(...));
var syncUiWorker= factory.StartNew(() => SyncUiImpl(...));
Task.WaitAll(calculateHashesWorker, comparedMatchesWorker, syncUiWorker);
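CalculateHashesImpl():
private void CalculateHashesImpl(string directoryPath)
{
    // Directory.EnumerateFiles yields each file path as a string
    foreach (var file in Directory.EnumerateFiles(directoryPath))
    {
        var hash = CalculateHashTODO(file);
        calculatedHashes.Add(new Tuple<string, string>(hash, file));
    }
    // Signal that no more hashes will arrive, so the consumer's loop can end
    calculatedHashes.CompleteAdding();
}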
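CompareMatchesImpl():
private void CompareMatchesImpl()
{
    // Blocks until a hash is available; ends once CompleteAdding() has been called
    foreach (var hashEntry in calculatedHashes.GetConsumingEnumerable())
    {
        // TODO: obviously return type is up to you
        string matchResult = GetMatchResultTODO(hashEntry.Item1, hashEntry.Item2);
        comparedMatches.Add(matchResult);
    }
    // Signal the UI stage that no more results will arrive
    comparedMatches.CompleteAdding();
}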
SyncUiImpl():
private void SyncUiImpl()
{
    foreach (var matchResult in comparedMatches.GetConsumingEnumerable())
    {
        // TODO: track progress in the UI using UI-framework-specific features
        // so that it does not freeze
    }
}
TODO: Consider using CancellationToken as a parameter for all GetConsumingEnumerable() calls so you easily can stop a pipeline execution when needed.
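To make that TODO a little more concrete, here is a small sketch of a consumer loop that can be stopped from outside. It is an illustration under assumptions, not part of the pipeline above: the class and method names are invented, the 30-second timeout is arbitrary, and the "user" cancellation is triggered in code where a real application would call Cancel() from a button handler.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class CancellablePipelineDemo
{
    // Returns true if the consumer loop was ended by cancellation. (Hypothetical helper.)
    public static bool RunUntilCancelled()
    {
        bool sawCancellation = false;
        using (var userCancel = new CancellationTokenSource())                       // e.g. a Cancel button
        using (var timeout = new CancellationTokenSource(TimeSpan.FromSeconds(30)))  // arbitrary safety net
        using (var linked = CancellationTokenSource.CreateLinkedTokenSource(
                   userCancel.Token, timeout.Token))
        using (var queue = new BlockingCollection<string>())
        {
            var consumer = Task.Run(() =>
            {
                try
                {
                    // Throws OperationCanceledException once linked.Token is cancelled.
                    foreach (string item in queue.GetConsumingEnumerable(linked.Token))
                        Console.WriteLine("processing " + item);
                }
                catch (OperationCanceledException)
                {
                    sawCancellation = true;
                }
            });

            queue.Add("file01.ext");
            userCancel.Cancel(); // cancelling either source cancels the linked token
            consumer.Wait();
        }
        return sawCancellation;
    }

    public static void Main()
    {
        Console.WriteLine(RunUntilCancelled());
    }
}
```

Whether the single queued item is processed before the cancellation is observed depends on timing, which is the usual trade-off with cooperative cancellation.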

First off, you should be using a higher level of abstraction to solve this problem. You have a bunch of tasks to complete, so use the "task" abstraction. You should be using the Task Parallel Library to do this sort of thing. Let the TPL deal with the question of how many worker threads to create -- the answer could be as low as one if the work is gated on I/O.
If you do want to do your own threading, some good advice:
Do not ever block on the UI thread. That is what is freezing your application. Come up with a protocol by which working threads can communicate with your UI thread, which then does nothing except respond to UI events. Remember that methods of user interface controls like task completion bars must never be called by any thread other than the UI thread.
Do not create 99 threads to read 99 files. That's like getting 99 pieces of mail and hiring 99 assistants to write responses: an extraordinarily expensive solution to a simple problem. If your work is CPU intensive then there is no point in "hiring" more threads than you have CPUs to service them. (That's like hiring 99 assistants in an office that only has four desks. The assistants spend most of their time waiting for a desk to sit at instead of reading your mail.) If your work is disk-intensive then most of those threads are going to be idle most of the time waiting for the disk, which is an even bigger waste of resources.

First, I hope you are using a built-in library for calculating hashes. It's possible to write your own, but it's far safer to use something that has been around for a while.
You may only need to create as many threads as CPUs if your process is CPU intensive. If it is bound by I/O, you might be able to get away with more threads.
I do not recommend loading the entire file into memory. Your hashing library should support updating a chunk at a time. Read a chunk into memory, use it to update the hashes of each algorithm, read the next chunk, and repeat until end of file. The chunked approach will help lower your program's memory demands.
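The chunked approach could look roughly like the sketch below, which feeds each chunk to MD5 and SHA1 in a single pass using HashAlgorithm.TransformBlock and TransformFinalBlock. The class name, method name and the 64 KB buffer size are assumptions for the example, and CRC32 is left out because it would need an additional library; a MemoryStream stands in for the file.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class ChunkedHashDemo
{
    // Returns { md5Hex, sha1Hex } for the stream, reading it once in chunks.
    public static string[] HashStream(Stream input)
    {
        using (var md5 = MD5.Create())
        using (var sha1 = SHA1.Create())
        {
            var buffer = new byte[64 * 1024];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Update both hashes with the same chunk; no output buffer is needed.
                md5.TransformBlock(buffer, 0, read, null, 0);
                sha1.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finish both computations with an empty final block.
            md5.TransformFinalBlock(buffer, 0, 0);
            sha1.TransformFinalBlock(buffer, 0, 0);

            return new[]
            {
                BitConverter.ToString(md5.Hash).Replace("-", ""),
                BitConverter.ToString(sha1.Hash).Replace("-", "")
            };
        }
    }

    public static void Main()
    {
        // Stand-in for File.OpenRead(fileName).
        using (var input = new MemoryStream(Encoding.ASCII.GetBytes("abc")))
        {
            string[] hashes = HashStream(input);
            Console.WriteLine("MD5:  " + hashes[0]);
            Console.WriteLine("SHA1: " + hashes[1]);
        }
    }
}
```

Memory use stays at one buffer regardless of file size, so a 1 GB file costs no more than a 1 MB one.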
As others have suggested, look into the Task Parallel Library, particularly Data Parallelism. It might be as easy as this:
Parallel.ForEach(fileSpecifics, item => CalculateHashes(item));

Check out TPL Dataflow. You can use a throttled ActionBlock which will manage the hard part for you.

If my understanding is correct that you are looking to perform some tasks in the background and not block your UI, then the BackgroundWorker would be an appropriate choice. You mentioned that you got it working some of the time, so my recommendation would be to take what you had in a semi-working state and improve upon it by tracking down the failures. If my hunch is correct, your worker was throwing an exception, which it does not appear you are handling in your code. Unhandled exceptions that bubble out of their containing threads make bad things happen.

This code hashes one file (stream) using two tasks: one for reading, the other for hashing. For a more robust version you should read several chunks ahead.
Because the processor's bandwidth is much higher than the disk's, you gain nothing from hashing more files concurrently unless you use some high-speed flash drive.
public void TransformStream(Stream a_stream, long a_length = -1)
{
Debug.Assert((a_length == -1 || a_length > 0));
if (a_stream.CanSeek)
{
if (a_length > -1)
{
if (a_stream.Position + a_length > a_stream.Length)
throw new IndexOutOfRangeException();
}
if (a_stream.Position >= a_stream.Length)
return;
}
System.Collections.Concurrent.ConcurrentQueue<byte[]> queue =
new System.Collections.Concurrent.ConcurrentQueue<byte[]>();
System.Threading.AutoResetEvent data_ready = new System.Threading.AutoResetEvent(false);
System.Threading.AutoResetEvent prepare_data = new System.Threading.AutoResetEvent(false);
Task reader = Task.Factory.StartNew(() =>
{
long total = 0;
for (; ; )
{
byte[] data = new byte[BUFFER_SIZE];
int readed = a_stream.Read(data, 0, data.Length);
if ((a_length == -1) && (readed != BUFFER_SIZE))
data = data.SubArray(0, readed);
else if ((a_length != -1) && (total + readed >= a_length))
data = data.SubArray(0, (int)(a_length - total));
total += data.Length;
queue.Enqueue(data);
data_ready.Set();
if (a_length == -1)
{
if (readed != BUFFER_SIZE)
break;
}
else if (a_length == total)
break;
else if (readed != BUFFER_SIZE)
throw new EndOfStreamException();
prepare_data.WaitOne();
}
});
Task hasher = Task.Factory.StartNew((obj) =>
{
IHash h = (IHash)obj;
long total = 0;
for (; ; )
{
data_ready.WaitOne();
byte[] data;
queue.TryDequeue(out data);
prepare_data.Set();
total += data.Length;
if ((a_length == -1) || (total < a_length))
{
h.TransformBytes(data, 0, data.Length);
}
else
{
int readed = data.Length;
readed = readed - (int)(total - a_length);
h.TransformBytes(data, 0, data.Length);
}
if (a_length == -1)
{
if (data.Length != BUFFER_SIZE)
break;
}
else if (a_length == total)
break;
else if (data.Length != BUFFER_SIZE)
throw new EndOfStreamException();
}
}, this);
reader.Wait();
hasher.Wait();
}
Rest of code here: http://hashlib.codeplex.com/SourceControl/changeset/view/71730#514336

GetResponse() taking too long

I am working on a winforms application.
I have a function to validate the URL.
private void checkForSPSiteValidity(DataGridView Sites_dataGridView)
{
foreach (DataGridViewRow myRow in SharePointSites_dataGridView.Rows)
{
try
{
DataGridViewImageCell cell = myRow.Cells[CommonCodeClass.status_GridCol] as DataGridViewImageCell;
string url = myRow.Cells[CommonCodeClass.spURL_GridCol].Value.ToString();
WebRequest req = WebRequest.Create(url);
WebResponse res = req.GetResponse();
cell.Value = Image.FromFile(CommonCodeClass.Correct_Icons);
}
catch (WebException ex)
{
Console.WriteLine(ex.Message);
if (ex.Message.Contains("remote name could not be resolved"))
{
DataGridViewImageCell cell = myRow.Cells[CommonCodeClass.status_GridCol] as DataGridViewImageCell;
cell.Value = Image.FromFile(CommonCodeClass.warning_Icon);
}
}
}
}
This code works fine and I get the correct values, but it takes too long to process and most of the time the application hangs.
I am new to threading, so is there a way to implement this with threads?
An example would be really helpful.
If there is any other, better way to do this, please let me know.
Thanks

Check out the BackgroundWorker control. That's one simple way to do it.
HTH.

As you pointed out the solution yourself, you must perform the fetch asynchronously. BackgroundWorker is a good class to start with, especially because it is a native WinForms component.
You can also look into new coming async extensions in C# if you want to solve it in a more general way.

A great way to do this is using a Thread Pool:
http://msdn.microsoft.com/en-us/library/3dasc8as%28v=vs.80%29.aspx
http://www.switchonthecode.com/tutorials/csharp-tutorial-using-the-threadpool
It's simple to implement and would be great at crunching a high volume of requests.
You can also specify a maximum number of threads and loop over sets of 15, 25, 50, etc., so you don't create so many threads that the overhead outweighs the benefit. I would play around with it to find the point where you start to lose performance.
The nice thing is that (see first link) you pass an object (Object threadContext), and it doesn't have to be a single value: it can be an array, a list, etc. cast to object. When working with lists and the like you may have to read up a bit on thread safety, but I feel this is probably more than you are doing with threading at this point.
Please rate if helpful.
