I have 1 thread streaming data and a 2nd (the threadpool) processing the data. The data processing takes around 100ms so I use to second thread so not to hold up the 1st thread.
While the 2nd thread is processing the data the 1st thread adds the data to a dictionary cache then when the 2nd thread is finished it processes the cached values.
My questions is this how should be doing producer /consumer code in C#?
public delegate void OnValue(ulong value);
public class Runner
{
public event OnValue OnValueEvent;
private readonly IDictionary<string, ulong> _cache = new Dictionary<string, ulong>(StringComparer.InvariantCultureIgnoreCase);
private readonly AutoResetEvent _cachePublisherWaitHandle = new AutoResetEvent(true);
public void Start()
{
for (ulong i = 0; i < 500; i++)
{
DataStreamHandler(i.ToString(), i);
}
}
private void DataStreamHandler(string id, ulong value)
{
_cache[id] = value;
if (_cachePublisherWaitHandle.WaitOne(1))
{
IList<ulong> tempValues = new List<ulong>(_cache.Values);
_cache.Clear();
_cachePublisherWaitHandle.Reset();
ThreadPool.UnsafeQueueUserWorkItem(delegate
{
try
{
foreach (ulong value1 in tempValues)
if (OnValueEvent != null)
OnValueEvent(value1);
}
finally
{
_cachePublisherWaitHandle.Set();
}
}, null);
}
else
{
Console.WriteLine(string.Format("Buffered value: {0}.", value));
}
}
}
class Program
{
static void Main(string[] args)
{
Stopwatch sw = Stopwatch.StartNew();
Runner r = new Runner();
r.OnValueEvent += delegate(ulong x)
{
Console.WriteLine(string.Format("Processed value: {0}.", x));
Thread.Sleep(100);
if(x == 499)
{
sw.Stop();
Console.WriteLine(string.Format("Time: {0}.", sw.ElapsedMilliseconds));
}
};
r.Start();
Console.WriteLine("Done");
Console.ReadLine();
}
}
The best practice for setting up the producer-consumer pattern is to use the BlockingCollection class which is available in .NET 4.0 or as a separate download of the Reactive Extensions framework. The idea is that the producers will enqueue using the Add method and the consumers will dequeue using the Take method which blocks if the queue is empty. Like SwDevMan81 pointed out the Albahari site has a really good writeup on how to make it work correctly if you want to go the manual route.
There is a good article on MSDN about Synchronizing the Producer and Consumer. There is also a good example on the Albahari site.
Related
I'm trying to implement a producer/consumer pattern using BlockingCollection<T> so I've written up a simple console application to test it.
public class Program
{
public static void Main(string[] args)
{
var workQueue = new WorkQueue();
workQueue.StartProducingItems();
workQueue.StartProcessingItems();
while (true)
{
}
}
}
public class WorkQueue
{
private BlockingCollection<int> _queue;
private static Random _random = new Random();
public WorkQueue()
{
_queue = new BlockingCollection<int>();
// Prefill some items.
for (int i = 0; i < 100; i++)
{
//_queue.Add(_random.Next());
}
}
public void StartProducingItems()
{
Task.Run(() =>
{
_queue.Add(_random.Next()); // Should be adding items to the queue constantly, but instead adds one and then nothing else.
});
}
public void StartProcessingItems()
{
Task.Run(() =>
{
foreach (var item in _queue.GetConsumingEnumerable())
{
Console.WriteLine("Worker 1: " + item);
}
});
Task.Run(() =>
{
foreach (var item in _queue.GetConsumingEnumerable())
{
Console.WriteLine("Worker 2: " + item);
}
});
}
}
However there are 3 problems with my design:
I don't know the correct way of blocking/waiting in my Main method. Doing a simple empty while loop seems terribly inefficient and CPU usage wasting simply for the sake of making sure the application doesn't end.
There's also another problem with my design, in this simple application I have a producer that produces items indefinitely, and should never stop. In a real world setup, I'd want it to end eventually (e.g. ran out of files to process). In that case, how should I wait for it to finish in the Main method? Make StartProducingItems async and then await it?
Either the GetConsumingEnumerable or Add is not working as I expected. The producer should constantly adding items, but it adds one item and then never adds anymore. This one item is then processed by one of the consumers. Both consumers then block waiting for items to be added, but none are. I know of the Take method, but again spinning on Take in a while loop seems pretty wasteful and inefficient. There is a CompleteAdding method but that then does not allow anything else to ever be added and throws an exception if you try, so that is not suitable.
I know for sure that both consumers are in fact blocking and waiting for new items, as I can switch between threads during debugging:
EDIT:
I've made the changes suggested in one of the comments, but the Task.WhenAll still returns right away.
public Task StartProcessingItems()
{
var consumers = new List<Task>();
for (int i = 0; i < 2; i++)
{
consumers.Add(Task.Run(() =>
{
foreach (var item in _queue.GetConsumingEnumerable())
{
Console.WriteLine($"Worker {i}: " + item);
}
}));
}
return Task.WhenAll(consumers.ToList());
}
GetConsumingEnumerable() is blocking. If you want to add to the queue constantly, you should put the call to _queue.Add in a loop:
public void StartProducingItems()
{
Task.Run(() =>
{
while (true)
_queue.Add(_random.Next());
});
}
Regarding the Main() method you could call the Console.ReadLine() method to prevent the main thread from finishing before you have pressed a key:
public static void Main(string[] args)
{
var workQueue = new WorkQueue();
workQueue.StartProducingItems();
workQueue.StartProcessingItems();
Console.WriteLine("Press a key to terminate the application...");
Console.ReadLine();
}
Hi I would like to know if there is something similiar to the await statement, which is used with tasks, which I can implement with threads in c#?
What I want to do is:
Start Thread A, compute some data and put the result on variable x.
After that variable x is transferred to another thread B and at the same time
Thread A starts again to compute some data, while thread B starts another calculation with the result x.
UPDATE: Ok there seems to be some confusion so I will be more accurate in my description:
I use two sensors which produce data. The data needs to be retrieved in such a way that SensorA data is retrieved (which takes a long time) and immediately after that the data from SensorB must be retrieved in another Thread, while SensorA continues retrieving another data block. The problem is i cant queue the data of both sensors in the same queue, but I need to store the data of both sensor in ONE data structure/object.
My idea was like that:
Get Data from Sensor A in Thread A.
Give result to Thread B and restart Thread A.
While Thread A runs again Thread B gets Data from Sensor B and computes the data from Sensor A and B
You can assume that Thread A always needs a longer time than Thread B
As I said in a comment. This looks like classic Producer/Consumer, for which we can use e.g. a BlockingCollection.
This is a slight modification of the sample from that page:
BlockingCollection<Data> dataItems = new BlockingCollection<Data>(100);
// "Thread B"
Task.Run(() =>
{
while (!dataItems.IsCompleted)
{
Data dataA = null;
try
{
dataA = dataItems.Take();
}
catch (InvalidOperationException) { }
if (dataA != null)
{
var dataB = ReadSensorB();
Process(dataA,dataB);
}
}
Console.WriteLine("\r\nNo more items to take.");
});
// "Thread A"
Task.Run(() =>
{
while (moreItemsToAdd)
{
Data dataA = ReadSensorA();
dataItems.Add(dataA);
}
// Let consumer know we are done.
dataItems.CompleteAdding();
});
And then moreItemsToAdd is just whatever code you need to have to cope with needing to shut this process down.
I'm not sure why you're avoiding the use of tasks? Maybe you're on an older version of .net? If so, BlockingCollection as Damien suggested is also not an option. If you're using "normal" threads, you can use a waithandle to signal results between threads. For example, an AutoResetEvent.
private int a;
private AutoResetEvent newResult = new AutoResetEvent(false);
private void ThreadA()
{
while (true)
{
a = GetSensorA();
newResult.Set();
}
}
private void ThreadB()
{
int b;
while (true)
{
newResult.WaitOne();
b = GetSensorB(); // or before "waitone"
Console.WriteLine(a + b); // do something
}
}
edit: had slight mistake in there with the reset, thanks for pointing out Damien - updated
If you can use .Net 4.5 or later, then the best way to approach this is to use the DataFlow component of the TPL.
(You must use NuGet to install DataFlow; it's not part of the CLR by default.)
Here's a sample compilable console application which demonstrates how to use DataFlow to do it:
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace SensorDemo
{
public sealed class SensorAData
{
public int Data;
}
public sealed class SensorBData
{
public double Data;
}
public sealed class SensorData
{
public SensorAData SensorAData;
public SensorBData SensorBData;
public override string ToString()
{
return $"SensorAData = {SensorAData.Data}, SensorBData = {SensorBData.Data}";
}
}
class Program
{
static void Main()
{
var sensorADataSource = new TransformBlock<SensorAData, SensorData>(
sensorAData => addSensorBData(sensorAData),
dataflowOptions());
var combinedSensorProcessor = new ActionBlock<SensorData>(
data => process(data),
dataflowOptions());
sensorADataSource.LinkTo(combinedSensorProcessor, new DataflowLinkOptions { PropagateCompletion = true });
// Create a cancellation source that will cancel after a few seconds.
var cancellationSource = new CancellationTokenSource(delay:TimeSpan.FromSeconds(20));
Task.Run(() => continuouslyReadFromSensorA(sensorADataSource, cancellationSource.Token));
Console.WriteLine("Started reading from SensorA");
sensorADataSource.Completion.Wait(); // Wait for reading from SensorA to complete.
Console.WriteLine("Completed reading from SensorA.");
combinedSensorProcessor.Completion.Wait();
Console.WriteLine("Completed processing of combined sensor data.");
}
static async Task continuouslyReadFromSensorA(TransformBlock<SensorAData, SensorData> queue, CancellationToken cancellation)
{
while (!cancellation.IsCancellationRequested)
await queue.SendAsync(readSensorAData());
queue.Complete();
}
static SensorData addSensorBData(SensorAData sensorAData)
{
return new SensorData
{
SensorAData = sensorAData,
SensorBData = readSensorBData()
};
}
static SensorAData readSensorAData()
{
Console.WriteLine("Reading from Sensor A");
Thread.Sleep(1000); // Simulate reading sensor A data taking some time.
int value = Interlocked.Increment(ref sensorValue);
Console.WriteLine("Read Sensor A value = " + value);
return new SensorAData {Data = value};
}
static SensorBData readSensorBData()
{
Console.WriteLine("Reading from Sensor B");
Thread.Sleep(100); // Simulate reading sensor B data being much quicker.
int value = Interlocked.Increment(ref sensorValue);
Console.WriteLine("Read Sensor B value = " + value);
return new SensorBData {Data = value};
}
static void process(SensorData value)
{
Console.WriteLine("Processing sensor data: " + value);
Thread.Sleep(1000); // Simulate slow processing of combined sensor values.
}
static ExecutionDataflowBlockOptions dataflowOptions()
{
return new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 1,
BoundedCapacity = 1
};
}
static int sensorValue;
}
}
I have a compute intensive method Calculate that may run for a few seconds, requests come from multiple threads.
Only one Calculate should be executing, a subsequent request should be queued until the initial request completes. If there is already a request queued then the the subsequent request can be discarded (as the queued request will be sufficient)
There seems to be lots of potential solutions but I just need the simplest.
UPDATE: Here's my rudimentaryattempt:
private int _queueStatus;
private readonly object _queueStatusSync = new Object();
public void Calculate()
{
lock(_queueStatusSync)
{
if(_queueStatus == 2) return;
_queueStatus++;
if(_queueStatus == 2) return;
}
for(;;)
{
CalculateImpl();
lock(_queueStatusSync)
if(--_queueStatus == 0) return;
}
}
private void CalculateImpl()
{
// long running process will take a few seconds...
}
The simplest, cleanest solution IMO is using TPL Dataflow (as always) with a BufferBlock acting as the queue. BufferBlock is thread-safe, supports async-await, and more important, has TryReceiveAll to get all the items at once. It also has OutputAvailableAsync so you can wait asynchronously for items to be posted to the buffer. When multiple requests are posted you simply take the last and forget about the rest:
var buffer = new BufferBlock<Request>();
var task = Task.Run(async () =>
{
while (await buffer.OutputAvailableAsync())
{
IList<Request> requests;
buffer.TryReceiveAll(out requests);
Calculate(requests.Last());
}
});
Usage:
buffer.Post(new Request());
buffer.Post(new Request());
Edit: If you don't have any input or output for the Calculate method you can simply use a boolean to act as a switch. If it's true you can turn it off and calculate, if it became true again while Calculate was running then calculate again:
public bool _shouldCalculate;
public void Producer()
{
_shouldCalculate = true;
}
public async Task Consumer()
{
while (true)
{
if (!_shouldCalculate)
{
await Task.Delay(1000);
}
else
{
_shouldCalculate = false;
Calculate();
}
}
}
A BlockingCollection that only takes 1 at a time
The trick is to skip if there are any items in the collection
I would go with the answer from I3aron +1
This is (maybe) a BlockingCollection solution
public static void BC_AddTakeCompleteAdding()
{
using (BlockingCollection<int> bc = new BlockingCollection<int>(1))
{
// Spin up a Task to populate the BlockingCollection
using (Task t1 = Task.Factory.StartNew(() =>
{
for (int i = 0; i < 100; i++)
{
if (bc.TryAdd(i))
{
Debug.WriteLine(" add " + i.ToString());
}
else
{
Debug.WriteLine(" skip " + i.ToString());
}
Thread.Sleep(30);
}
bc.CompleteAdding();
}))
{
// Spin up a Task to consume the BlockingCollection
using (Task t2 = Task.Factory.StartNew(() =>
{
try
{
// Consume consume the BlockingCollection
while (true)
{
Debug.WriteLine("take " + bc.Take());
Thread.Sleep(100);
}
}
catch (InvalidOperationException)
{
// An InvalidOperationException means that Take() was called on a completed collection
Console.WriteLine("That's All!");
}
}))
Task.WaitAll(t1, t2);
}
}
}
It sounds like a classic producer-consumer. I'd recommend looking into BlockingCollection<T>. It is part of the System.Collection.Concurrent namespace. On top of that you can implement your queuing logic.
You may supply to a BlockingCollection any internal structure to hold its data, such as a ConcurrentBag<T>, ConcurrentQueue<T> etc. The latter is the default structure used.
I had to write a console application that called Microsoft Dynamics CRM web service to perform an action on over eight thousand CRM objects. The details of the web service call are irrelevant and not shown here but I needed a multi-threaded client so that I could make calls in parallel. I wanted to be able to control the number of threads used from a config setting and also for the application to cancel the whole operation if the number of service errors reached a config-defined threshold.
I wrote it using Task Parallel Library Task.Run and ContinueWith, keeping track of how many calls (threads) were in progress, how many errors we'd received, and whether the user had cancelled from the keyboard. Everything worked fine and I had extensive logging to assure myself that threads were finishing cleanly and that everything was tidy at the end of the run. I could see that the program was using the maximum number of threads in parallel and, if our maximum limit was reached, waiting until a running task completed before starting another one.
During my code review, my colleague suggested that it would be better to do it with async/await instead of tasks and continuations, so I created a branch and rewrote it that way. The results were interesting - the async/await version was almost twice as slow, and it never reached the maximum number of allowed parallel operations/threads. The TPL one always got to 10 threads in parallel whereas the async/await version never got beyond 5.
My question is: have I made a mistake in the way I have written the async/await code (or the TPL code even)? If I have not coded it wrong, can you explain why the async/await is less efficient, and does that mean it is better to carry on using TPL for multi-threaded code.
Note that the code I tested with did not actually call CRM - the CrmClient class simply thread-sleeps for a duration specified in the config (five seconds) and then throws an exception. This meant that there were no external variables that could affect the performance.
For the purposes of this question I created a stripped down program that combines both versions; which one is called is determined by a config setting. Each of them starts with a bootstrap runner that sets up the environment, creates the queue class, then uses a TaskCompletionSource to wait for completion. A CancellationTokenSource is used to signal a cancellation from the user. The list of ids to process is read from an embedded file and pushed onto a ConcurrentQueue. They both start off calling StartCrmRequest as many times as max-threads; subsequently, every time a result is processed, the ProcessResult method calls StartCrmRequest again, keeping going until all of our ids are processed.
You can clone/download the complete program from here: https://bitbucket.org/kentrob/pmgfixso/
Here is the relevant configuration:
<appSettings>
<add key="TellUserAfterNCalls" value="5"/>
<add key="CrmErrorsBeforeQuitting" value="20"/>
<add key="MaxThreads" value="10"/>
<add key="CallIntervalMsecs" value="5000"/>
<add key="UseAsyncAwait" value="True" />
</appSettings>
Starting with the TPL version, here is the bootstrap runner that kicks off the queue manager:
public static class TplRunner
{
private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();
public static void StartQueue(RuntimeParameters parameters, IEnumerable<string> idList)
{
Console.CancelKeyPress += (s, args) =>
{
CancelCrmClient();
args.Cancel = true;
};
var start = DateTime.Now;
Program.TellUser("Start: " + start);
var taskCompletionSource = new TplQueue(parameters)
.Start(CancellationTokenSource.Token, idList);
while (!taskCompletionSource.Task.IsCompleted)
{
if (Console.KeyAvailable)
{
if (Console.ReadKey().Key != ConsoleKey.Q) continue;
Console.WriteLine("When all threads are complete, press any key to continue.");
CancelCrmClient();
}
}
var end = DateTime.Now;
Program.TellUser("End: {0}. Elapsed = {1} secs.", end, (end - start).TotalSeconds);
}
private static void CancelCrmClient()
{
CancellationTokenSource.Cancel();
Console.WriteLine("Cancelling Crm client. Web service calls in operation will have to run to completion.");
}
}
Here is the TPL queue manager itself:
public class TplQueue
{
private readonly RuntimeParameters parameters;
private readonly object locker = new object();
private ConcurrentQueue<string> idQueue = new ConcurrentQueue<string>();
private readonly CrmClient crmClient;
private readonly TaskCompletionSource<bool> taskCompletionSource = new TaskCompletionSource<bool>();
private int threadCount;
private int crmErrorCount;
private int processedCount;
private CancellationToken cancelToken;
public TplQueue(RuntimeParameters parameters)
{
this.parameters = parameters;
crmClient = new CrmClient();
}
public TaskCompletionSource<bool> Start(CancellationToken cancellationToken, IEnumerable<string> ids)
{
cancelToken = cancellationToken;
foreach (var id in ids)
{
idQueue.Enqueue(id);
}
threadCount = 0;
// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
Task.Run((Action) StartCrmRequest, cancellationToken);
}
return taskCompletionSource;
}
private void StartCrmRequest()
{
if (taskCompletionSource.Task.IsCompleted)
{
return;
}
if (cancelToken.IsCancellationRequested)
{
Program.TellUser("Crm client cancelling...");
ClearQueue();
return;
}
var count = GetThreadCount();
if (count >= parameters.MaxThreads)
{
return;
}
string id;
if (!idQueue.TryDequeue(out id)) return;
IncrementThreadCount();
crmClient.CompleteActivityAsync(new Guid(id), parameters.CallIntervalMsecs).ContinueWith(ProcessResult);
processedCount += 1;
if (parameters.TellUserAfterNCalls > 0 && processedCount%parameters.TellUserAfterNCalls == 0)
{
ShowProgress(processedCount);
}
}
private void ProcessResult(Task<CrmResultMessage> response)
{
if (response.Result.CrmResult == CrmResult.Failed && ++crmErrorCount == parameters.CrmErrorsBeforeQuitting)
{
Program.TellUser(
"Quitting because CRM error count is equal to {0}. Already queued web service calls will have to run to completion.",
crmErrorCount);
ClearQueue();
}
var count = DecrementThreadCount();
if (idQueue.Count == 0 && count == 0)
{
taskCompletionSource.SetResult(true);
}
else
{
StartCrmRequest();
}
}
private int GetThreadCount()
{
lock (locker)
{
return threadCount;
}
}
private void IncrementThreadCount()
{
lock (locker)
{
threadCount = threadCount + 1;
}
}
private int DecrementThreadCount()
{
lock (locker)
{
threadCount = threadCount - 1;
return threadCount;
}
}
private void ClearQueue()
{
idQueue = new ConcurrentQueue<string>();
}
private static void ShowProgress(int processedCount)
{
Program.TellUser("{0} activities processed.", processedCount);
}
}
Note that I am aware that a couple of the counters are not thread safe but they are not critical; the threadCount variable is the only critical one.
Here is the dummy CRM client method:
public Task<CrmResultMessage> CompleteActivityAsync(Guid activityId, int callIntervalMsecs)
{
// Here we would normally call a CRM web service.
return Task.Run(() =>
{
try
{
if (callIntervalMsecs > 0)
{
Thread.Sleep(callIntervalMsecs);
}
throw new ApplicationException("Crm web service not available at the moment.");
}
catch
{
return new CrmResultMessage(activityId, CrmResult.Failed);
}
});
}
And here are the same async/await classes (with common methods removed for the sake of brevity):
public static class AsyncRunner
{
private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();
public static void StartQueue(RuntimeParameters parameters, IEnumerable<string> idList)
{
var start = DateTime.Now;
Program.TellUser("Start: " + start);
var taskCompletionSource = new AsyncQueue(parameters)
.StartAsync(CancellationTokenSource.Token, idList).Result;
while (!taskCompletionSource.Task.IsCompleted)
{
...
}
var end = DateTime.Now;
Program.TellUser("End: {0}. Elapsed = {1} secs.", end, (end - start).TotalSeconds);
}
}
The async/await queue manager:
public class AsyncQueue
{
private readonly RuntimeParameters parameters;
private readonly object locker = new object();
private readonly CrmClient crmClient;
private readonly TaskCompletionSource<bool> taskCompletionSource = new TaskCompletionSource<bool>();
private CancellationToken cancelToken;
private ConcurrentQueue<string> idQueue = new ConcurrentQueue<string>();
private int threadCount;
private int crmErrorCount;
private int processedCount;
public AsyncQueue(RuntimeParameters parameters)
{
this.parameters = parameters;
crmClient = new CrmClient();
}
public async Task<TaskCompletionSource<bool>> StartAsync(CancellationToken cancellationToken,
IEnumerable<string> ids)
{
cancelToken = cancellationToken;
foreach (var id in ids)
{
idQueue.Enqueue(id);
}
threadCount = 0;
// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
await StartCrmRequest();
}
return taskCompletionSource;
}
private async Task StartCrmRequest()
{
if (taskCompletionSource.Task.IsCompleted)
{
return;
}
if (cancelToken.IsCancellationRequested)
{
...
return;
}
var count = GetThreadCount();
if (count >= parameters.MaxThreads)
{
return;
}
string id;
if (!idQueue.TryDequeue(out id)) return;
IncrementThreadCount();
var crmMessage = await crmClient.CompleteActivityAsync(new Guid(id), parameters.CallIntervalMsecs);
ProcessResult(crmMessage);
processedCount += 1;
if (parameters.TellUserAfterNCalls > 0 && processedCount%parameters.TellUserAfterNCalls == 0)
{
ShowProgress(processedCount);
}
}
private async void ProcessResult(CrmResultMessage response)
{
if (response.CrmResult == CrmResult.Failed && ++crmErrorCount == parameters.CrmErrorsBeforeQuitting)
{
Program.TellUser(
"Quitting because CRM error count is equal to {0}. Already queued web service calls will have to run to completion.",
crmErrorCount);
ClearQueue();
}
var count = DecrementThreadCount();
if (idQueue.Count == 0 && count == 0)
{
taskCompletionSource.SetResult(true);
}
else
{
await StartCrmRequest();
}
}
}
So, setting MaxThreads to 10 and CrmErrorsBeforeQuitting to 20, the TPL version on my machine completes in 19 seconds and the async/await version takes 35 seconds. Given that I have over 8000 calls to make this is a significant difference. Any ideas?
I think I'm seeing the problem here, or at least a part of it. Look closely at the two bits of code below; they are not equivalent.
// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
Task.Run((Action) StartCrmRequest, cancellationToken);
}
And:
// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
await StartCrmRequest();
}
In the original code (I am taking it as a given that it is functionally sound) there is a single call to ContinueWith. That is exactly how many await statements I would expect to see in a trivial rewrite if it is meant to preserve the original behaviour.
Not a hard and fast rule and only applicable in simple cases, but nevertheless a good thing to keep an eye out for.
I think you over complicated your solution and ended up not getting where you wanted in either implementation.
First of all, connections to any HTTP host are limited by the service point manager. The default limit for client environments is 2, but you can increase it yourself.
No matter how much threads you spawn, there won't be more active requests than those allwed.
Then, as someone pointed out, await logically blocks the execution flow.
And finally, you spent your time creating an AsyncQueue when you should have used TPL data flows.
When implemented with async/await, I would expect the I/O bound algorithm to run on a single thread. Unlike #KirillShlenskiy, I believe that the bit responsible for "bringing back" to caller's context is not responsible for the slow-down. I think you overrun the thread pool by trying to use it for I/O-bound operations. It's designed primarily for compute-bound ops.
Have a look at ForEachAsync. I feel that's what you're looking for (Stephen Toub's discussion, you'll find Wischik's videos meaningful too):
http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx
(Use degree of concurrency to reduce memory footprint)
http://vimeo.com/43808831
http://vimeo.com/43808833
i've recently come across a producer/consumer pattern c# implementation. it's very simple and (for me at least) very elegant.
it seems to have been devised around 2006, so i was wondering if this implementation is
- safe
- still applicable
Code is below (original code was referenced at http://bytes.com/topic/net/answers/575276-producer-consumer#post2251375)
using System;
using System.Collections;
using System.Threading;
public class Test
{
static ProducerConsumer queue;
static void Main()
{
queue = new ProducerConsumer();
new Thread(new ThreadStart(ConsumerJob)).Start();
Random rng = new Random(0);
for (int i=0; i < 10; i++)
{
Console.WriteLine ("Producing {0}", i);
queue.Produce(i);
Thread.Sleep(rng.Next(1000));
}
}
static void ConsumerJob()
{
// Make sure we get a different random seed from the
// first thread
Random rng = new Random(1);
// We happen to know we've only got 10
// items to receive
for (int i=0; i < 10; i++)
{
object o = queue.Consume();
Console.WriteLine ("\t\t\t\tConsuming {0}", o);
Thread.Sleep(rng.Next(1000));
}
}
}
public class ProducerConsumer
{
readonly object listLock = new object();
Queue queue = new Queue();
public void Produce(object o)
{
lock (listLock)
{
queue.Enqueue(o);
// We always need to pulse, even if the queue wasn't
// empty before. Otherwise, if we add several items
// in quick succession, we may only pulse once, waking
// a single thread up, even if there are multiple threads
// waiting for items.
Monitor.Pulse(listLock);
}
}
public object Consume()
{
lock (listLock)
{
// If the queue is empty, wait for an item to be added
// Note that this is a while loop, as we may be pulsed
// but not wake up before another thread has come in and
// consumed the newly added object. In that case, we'll
// have to wait for another pulse.
while (queue.Count==0)
{
// This releases listLock, only reacquiring it
// after being woken up by a call to Pulse
Monitor.Wait(listLock);
}
return queue.Dequeue();
}
}
}
The code is older than that - I wrote it some time before .NET 2.0 came out. The concept of a producer/consumer queue is way older than that though :)
Yes, that code is safe as far as I'm aware - but it has some deficiencies:
It's non-generic. A modern version would certainly be generic.
It has no way of stopping the queue. One simple way of stopping the queue (so that all the consumer threads retire) is to have a "stop work" token which can be put into the queue. You then add as many tokens as you have threads. Alternatively, you have a separate flag to indicate that you want to stop. (This allows the other threads to stop before finishing all the current work in the queue.)
If the jobs are very small, consuming a single job at a time may not be the most efficient thing to do.
The ideas behind the code are more important than the code itself, to be honest.
You could do something like the following code snippet. It's generic and has a method for enqueue-ing nulls (or whatever flag you'd like to use) to tell the worker threads to exit.
The code is taken from here: http://www.albahari.com/threading/part4.aspx#_Wait_and_Pulse
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace ConsoleApplication1
{
public class TaskQueue<T> : IDisposable where T : class
{
object locker = new object();
Thread[] workers;
Queue<T> taskQ = new Queue<T>();
public TaskQueue(int workerCount)
{
workers = new Thread[workerCount];
// Create and start a separate thread for each worker
for (int i = 0; i < workerCount; i++)
(workers[i] = new Thread(Consume)).Start();
}
public void Dispose()
{
// Enqueue one null task per worker to make each exit.
foreach (Thread worker in workers) EnqueueTask(null);
foreach (Thread worker in workers) worker.Join();
}
public void EnqueueTask(T task)
{
lock (locker)
{
taskQ.Enqueue(task);
Monitor.PulseAll(locker);
}
}
void Consume()
{
while (true)
{
T task;
lock (locker)
{
while (taskQ.Count == 0) Monitor.Wait(locker);
task = taskQ.Dequeue();
}
if (task == null) return; // This signals our exit
Console.Write(task);
Thread.Sleep(1000); // Simulate time-consuming task
}
}
}
}
Back in the day I learned how Monitor.Wait/Pulse works (and a lot about threads in general) from the above piece of code and the article series it is from. So as Jon says, it has a lot of value to it and is indeed safe and applicable.
However, as of .NET 4, there is a producer-consumer queue implementation in the framework. I only just found it myself but up to this point it does everything I need.
These days a more modern option is available using the namespace System.Threading.Tasks.Dataflow. It's async/await friendly and much more versatile.
More info here How to: Implement a producer-consumer dataflow pattern
It's included starting from .Net Core, for older .Nets you may need to install a package with the same name as the namespace.
I know the question is old, but it's the first match in Google for my request, so I decided to update the topic.
A modern and simple way to implement the producer/consumer pattern in C# is to use System.Threading.Channels. It's asynchronous and uses ValueTask's to decrease memory allocations. Here is an example:
public class ProducerConsumer<T>
{
protected readonly Channel<T> JobChannel = Channel.CreateUnbounded<T>();
public IAsyncEnumerable<T> GetAllAsync()
{
return JobChannel.Reader.ReadAllAsync();
}
public async ValueTask AddAsync(T job)
{
await JobChannel.Writer.WriteAsync(job);
}
public async ValueTask AddAsync(IEnumerable<T> jobs)
{
foreach (var job in jobs)
{
await JobChannel.Writer.WriteAsync(job);
}
}
}
Warning: If you read the comments, you'll understand my answer is wrong :)
There's a possible deadlock in your code.
Imagine the following case, for clarity, I used a single-thread approach but should be easy to convert to multi-thread with sleep:
// We create some actions...
object locker = new object();
Action action1 = () => {
lock (locker)
{
System.Threading.Monitor.Wait(locker);
Console.WriteLine("This is action1");
}
};
Action action2 = () => {
lock (locker)
{
System.Threading.Monitor.Wait(locker);
Console.WriteLine("This is action2");
}
};
// ... (stuff happens, etc.)
// Imagine both actions were running
// and there's 0 items in the queue
// And now the producer kicks in...
lock (locker)
{
// This would add a job to the queue
Console.WriteLine("Pulse now!");
System.Threading.Monitor.Pulse(locker);
}
// ... (more stuff)
// and the actions finish now!
Console.WriteLine("Consume action!");
action1(); // Oops... they're locked...
action2();
Please do let me know if this doesn't make any sense.
If this is confirmed, then the answer to your question is, "no, it isn't safe" ;)
I hope this helps.
public class ProducerConsumerProblem
{
private int n;
object obj = new object();
public ProducerConsumerProblem(int n)
{
this.n = n;
}
public void Producer()
{
for (int i = 0; i < n; i++)
{
lock (obj)
{
Console.Write("Producer =>");
System.Threading.Monitor.Pulse(obj);
System.Threading.Thread.Sleep(1);
System.Threading.Monitor.Wait(obj);
}
}
}
public void Consumer()
{
lock (obj)
{
for (int i = 0; i < n; i++)
{
System.Threading.Monitor.Wait(obj, 10);
Console.Write("<= Consumer");
System.Threading.Monitor.Pulse(obj);
Console.WriteLine();
}
}
}
}
public class Program
{
static void Main(string[] args)
{
ProducerConsumerProblem f = new ProducerConsumerProblem(10);
System.Threading.Thread t1 = new System.Threading.Thread(() => f.Producer());
System.Threading.Thread t2 = new System.Threading.Thread(() => f.Consumer());
t1.IsBackground = true;
t2.IsBackground = true;
t1.Start();
t2.Start();
Console.ReadLine();
}
}
output
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer