I have a ConcurrentQueue that gets filled with objects from one thread while another thread takes objects from it and processes them.
If the queue gets big I can "compress" it by removing duplicates: the compression turns the queue into a list, iterates through it, and creates a new queue that holds only distinct values. I then replace the old queue with the new one, and while that swap happens I can't allow objects to be inserted into the queue that gets overwritten, or I will lose them.
My problem is that if I add a lock(obj) {} or some sort of lock handle, I lose a lot of performance. There are a lot of transactions but the processing time per item is very low, so locking looks like what is killing my performance.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using System.IO;
namespace A
{
public abstract class Base
{
private ConcurrentQueue<Data> unProcessed = new ConcurrentQueue<Data>();
private const int MIN_COLLAPSETIME = 30;
private const int MIN_COLLAPSECOUNT = 1000;
private QueueCollapser Collapser;
private ManualResetEventSlim waitForCollapsing = new ManualResetEventSlim(true);
private ManualResetEventSlim waitForWrite = new ManualResetEventSlim();
// Thread signal.
public AutoResetEvent unProcessedEvent = new AutoResetEvent(false);
// exiting
public volatile bool Exiting = false;
private Task task;
public Base()
{
// initiate Collapser
Collapser = new QueueCollapser();
// set up thread
task = new Task(
() =>
{
consumerTask();
}, TaskCreationOptions.LongRunning
);
}
public void Start()
{
task.Start();
}
private void consumerTask()
{
Data data = null;
while (!Exiting)
{
try
{
// do we try to collapse
if (unProcessed.Count > MIN_COLLAPSECOUNT && (DateTime.Now - Collapser.LastCollapse).TotalSeconds > MIN_COLLAPSETIME)
{
waitForCollapsing.Reset();
waitForWrite.Wait();
unProcessed = Collapser.Collapse(unProcessed);
waitForCollapsing.Set();
// I also tried this instead of my own signaling (it behaves like Monitor.Enter):
// lock (this)
// {
//     unProcessed = Collapser.Collapse(unProcessed);
// }
}
if (unProcessed.IsEmpty)
{
// wake the thread after 20 seconds; if nothing is in the queue it just goes back to waiting
unProcessedEvent.WaitOne(20000);
}
var gotOne = unProcessed.TryDequeue(out data);
if (gotOne)
{
ProcessTime(data);
}
}
catch (Exception e)
{
}
}
}
protected abstract void ProcessTime(Data d);
public void AddToQueue(Data d)
{
waitForCollapsing.Wait();
waitForWrite.Reset();
unProcessed.Enqueue(d);
waitForWrite.Set();
unProcessedEvent.Set();
}
// I also tried this instead of my own locking; it behaves like Monitor.Enter
public void AddToQueueAlternate(Data d)
{
lock(this) {
unProcessed.Enqueue(d);
waitForWrite.Set();
unProcessedEvent.Set();
}
}
}
}
Can this be done without locking?
Can I use a more lightweight lock? As of now there is only one thread adding data and one thread reading, and I can keep it that way if that gets me a better lock.
If you want concurrency and no duplicates, you should use a ConcurrentDictionary.
So redeclare your queue:
private ConcurrentDictionary<Data, Data> unProcessed =
new ConcurrentDictionary<Data, Data>();
This will considerably simplify your code while keeping very good performance.
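For example, the add and take sides might then look roughly like this (a sketch, not your exact code; it assumes Data has value-based equality or that you pass an IEqualityComparer<Data> to the dictionary, and TryTakeOne is a made-up helper):
public void AddToQueue(Data d)
{
    // TryAdd rejects duplicates atomically, so no separate collapse pass is needed.
    if (unProcessed.TryAdd(d, d))
        unProcessedEvent.Set(); // only signal when something new was actually added
}

private bool TryTakeOne(out Data data)
{
    // Grab any pending item; TryRemove is atomic, so concurrent adds are safe.
    foreach (var key in unProcessed.Keys)
        if (unProcessed.TryRemove(key, out data))
            return true;
    data = null;
    return false;
}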
Why would you have duplicates?
If the publisher can really add duplicates, then you need some type of concurrent hashing object (Dictionary or HashSet) to detect and prevent this from occurring in the publisher.
You may also want to investigate ReaderWriterLockSlim.
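Applied to the original code, the idea would be roughly this (a sketch, untested; queueSwapLock is a made-up name): ordinary enqueues take the cheap read lock, which many threads can hold at once, and only the rare queue swap takes the exclusive write lock.
private readonly ReaderWriterLockSlim queueSwapLock = new ReaderWriterLockSlim();

public void AddToQueue(Data d)
{
    // Many threads can hold the read lock concurrently; it only
    // excludes the thread doing the queue swap.
    queueSwapLock.EnterReadLock();
    try { unProcessed.Enqueue(d); }
    finally { queueSwapLock.ExitReadLock(); }
    unProcessedEvent.Set();
}

private void CollapseQueue()
{
    // Exclusive: no enqueue/dequeue can run while the reference is swapped.
    queueSwapLock.EnterWriteLock();
    try { unProcessed = Collapser.Collapse(unProcessed); }
    finally { queueSwapLock.ExitWriteLock(); }
}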
Related
I have a method which can be called by multiple threads to write data to a database. To reduce database traffic, I cache the data and write it in bulk.
Now I wanted to know: is there a better approach (for example, a lock-free pattern) to use?
Here is an example of how I do it at the moment:
public class WriteToDatabase : IWriter, IDisposable
{
public WriteToDatabase(PLCProtocolServiceConfig currentConfig)
{
writeTimer = new System.Threading.Timer(Writer);
writeTimer.Change((int)currentConfig.WriteToDatabaseTimer.TotalMilliseconds, Timeout.Infinite);
this.currentConfig = currentConfig;
}
private System.Threading.Timer writeTimer;
private List<PlcProtocolDTO> writeCache = new List<PlcProtocolDTO>();
private readonly PLCProtocolServiceConfig currentConfig;
private bool disposed;
public void Write(PlcProtocolDTO row)
{
lock (this)
{
writeCache.Add(row);
}
}
private void Writer(object state)
{
List<PlcProtocolDTO> oldCache = null;
lock (this)
{
if (writeCache.Count > 0)
{
oldCache = writeCache;
writeCache = new List<PlcProtocolDTO>();
}
}
if (oldCache != null)
{
using (var s = VisuDL.CreateSession())
{
s.Insert(oldCache);
}
}
if (!this.disposed)
writeTimer.Change((int)currentConfig.WriteToDatabaseTimer.TotalMilliseconds, Timeout.Infinite);
}
public void Dispose()
{
this.disposed = true;
writeTimer.Dispose();
Writer(null);
}
}
There are a few issues I can see with the timer-based code.
Even in the new version of the code there is still a chance to lose writes on restart or shutdown.
The Dispose method is not waiting for the completion of the last timer callback that may be currently in progress.
Since timer callbacks run on thread pool threads, which are background threads, they will be aborted when the main thread exits.
There is no limit on the size of the batches; this is going to break when you hit a limit of the underlying storage API
(e.g. SQL databases have a limit on query length and the number of parameters used).
Since you're doing I/O, the implementation should probably be async.
This will behave poorly under load:
in particular, as the load keeps increasing, the batches will get bigger and therefore slower to execute;
a slower batch execution in turn gives the next one additional time to accumulate items, making it even slower, and so on.
Ultimately either writing the batch will fail (if you hit a SQL limit or the query times out) or the application will just run out of memory.
To handle high load you really have only two choices: applying backpressure (i.e. slowing down the producers) or dropping writes.
You might want to allow a limited number of concurrent writers if the database can handle it.
There's a race condition on the disposed field which might result in an ObjectDisposedException in writeTimer.Change.
I think a better pattern that addresses the issues above is the producer-consumer pattern. You can implement it in .NET
with a ConcurrentQueue or with the new System.Threading.Channels API.
Also keep in mind that if your application crashes for any reason you will lose the records that are still buffered.
This is a sample implementation using channels:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;
public interface IWriter<in T>
{
ValueTask WriteAsync(IEnumerable<T> items);
}
public sealed record Options(int BatchSize, TimeSpan Interval, int MaxPendingWrites, int Concurrency);
public class BatchWriter<T> : IWriter<T>, IAsyncDisposable
{
readonly IWriter<T> writer;
readonly Options options;
readonly Channel<T> channel;
readonly Task[] consumers;
public BatchWriter(IWriter<T> writer, Options options)
{
this.writer = writer;
this.options = options;
channel = Channel.CreateBounded<T>(new BoundedChannelOptions(options.MaxPendingWrites)
{
// Choose between backpressure (Wait) or
// various ways to drop writes (DropNewest, DropOldest, DropWrite).
FullMode = BoundedChannelFullMode.Wait,
SingleWriter = false,
SingleReader = options.Concurrency == 1
});
consumers = Enumerable.Range(start: 0, options.Concurrency)
.Select(_ => Task.Run(Start))
.ToArray();
}
async Task Start()
{
var batch = new List<T>(options.BatchSize);
var timer = Task.Delay(options.Interval);
var canRead = channel.Reader.WaitToReadAsync().AsTask();
while (true)
{
if (await Task.WhenAny(timer, canRead) == timer)
{
timer = Task.Delay(options.Interval);
await Flush(batch);
}
else if (await canRead)
{
while (channel.Reader.TryRead(out var item))
{
batch.Add(item);
if (batch.Count == options.BatchSize)
{
await Flush(batch);
}
}
canRead = channel.Reader.WaitToReadAsync().AsTask();
}
else
{
await Flush(batch);
return;
}
}
async Task Flush(ICollection<T> items)
{
if (items.Count > 0)
{
await writer.WriteAsync(items);
items.Clear();
}
}
}
public async ValueTask WriteAsync(IEnumerable<T> items)
{
foreach (var item in items)
{
await channel.Writer.WriteAsync(item);
}
}
public async ValueTask DisposeAsync()
{
channel.Writer.Complete();
await Task.WhenAll(consumers);
}
}
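For example, wiring it up to the database code from the question might look like this (a sketch; DatabaseWriter is a hypothetical adapter around the VisuDL session from the original code, and the Options values are arbitrary):
// Hypothetical adapter: the "real" writer whose calls the BatchWriter batches up.
class DatabaseWriter : IWriter<PlcProtocolDTO>
{
    public ValueTask WriteAsync(IEnumerable<PlcProtocolDTO> items)
    {
        using (var s = VisuDL.CreateSession())
            s.Insert(items.ToList());
        return default;
    }
}

// Inside an async method: bounded to 10,000 pending rows,
// so producers wait (backpressure) when the buffer is full.
await using var batchWriter = new BatchWriter<PlcProtocolDTO>(
    new DatabaseWriter(),
    new Options(BatchSize: 500, Interval: TimeSpan.FromSeconds(5),
                MaxPendingWrites: 10_000, Concurrency: 1));

await batchWriter.WriteAsync(rows); // rows: IEnumerable<PlcProtocolDTO>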
Instead of using a mutable List and protecting it using locks, you could use an ImmutableList, and stop worrying about the possibility of the list being mutated by the wrong thread at the wrong time. With immutable collections it is cheap and easy to pass around snapshots of your data, because you don't need to block the writers (and possibly also the readers) while creating copies of the data. An immutable collection is a snapshot by itself.
Although you don't have to worry about the contents of the collection, you still have to worry about its reference, because updating an immutable collection means replacing the reference to the old collection with a new collection. You don't want multiple threads swapping references in an uncontrollable manner, so you still need some sort of synchronization. You can still use locks, but it is quite easy to avoid locking altogether by using interlocked operations. The example below uses the handy ImmutableInterlocked.Update method, which allows you to do an atomic update-and-swap in a single line:
private ImmutableList<PlcProtocolDTO> writeCache
= ImmutableList<PlcProtocolDTO>.Empty;
public void Write(PlcProtocolDTO row)
{
ImmutableInterlocked.Update(ref writeCache, x => x.Add(row));
}
private void Writer(object state)
{
IList<PlcProtocolDTO> oldCache = Interlocked.Exchange(
ref writeCache, ImmutableList<PlcProtocolDTO>.Empty);
using (var s = VisuDL.CreateSession())
s.Insert(oldCache);
}
private void Dump()
{
foreach (var row in Volatile.Read(ref writeCache))
Console.WriteLine(row);
}
Here is the description of the ImmutableInterlocked.Update method:
Mutates a value in-place with optimistic locking transaction semantics via a specified transformation function. The transformation is retried as many times as necessary to win the optimistic locking race.
This method can be used for updating any type of reference-type variable. Its usage may increase with the advent of the new C# 9 record types, which are immutable by default and are intended to be used as such.
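For instance, a sketch with a hypothetical C# 9 record (requires using System.Collections.Immutable; the Stats type is made up for illustration):
// Hypothetical example: atomically updating an immutable record.
public record Stats(int Count, double Sum);

private Stats stats = new Stats(0, 0);

public void Accumulate(double value)
{
    // Retried automatically if another thread swapped the reference in between.
    ImmutableInterlocked.Update(ref stats,
        s => s with { Count = s.Count + 1, Sum = s.Sum + value });
}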
I am doing some TCP programming and I want to simulate some latency.
Each "message" (a byte[] representing a serialized object) must be delayed by some time t. I thought I could have one method to collect the raw messages:
private Queue<byte[]> rawMessages = new Queue<byte[]>();
private void OnDataReceived(object sender, byte[] data)
{
rawMessages.Enqueue(data);
}
Another method with a while-loop to continuously read from rawMessages and then delay each one:
private Queue<byte[]> delayedRawMessages = new Queue<byte[]>();
delayMessagesInTask = Task.Factory.StartNew(() =>
{
while (true) //TODO: Swap a cancellation token
{
if(rawMessages.Count > 0){
var rawMessage = rawMessages.Dequeue();
//???
//Once the message has been delayed, enqueue it into another buffer.
delayedRawMessages.Enqueue(rawMessage);
}
}
});
I thought that to delay each message, I could have another method spawn a thread that uses Thread.Sleep(t) to wait for time t and then enqueues into delayedRawMessages. I'm sure this would work, but I think there must be a better way. The issue with the Thread.Sleep approach is that message 2 might finish being delayed before message 1... I obviously need the messages to be delayed in the correct order, otherwise I would not be using TCP.
I'm looking for a way to do this that will delay for as close to time t as possible and will not impact the rest of the application by slowing it down.
Does anyone here know of a better way to do this?
I decided to go with the producer/multiple-consumer approach.
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
public class LatencySimulator {
public enum SimulatorType { UP, DOWN };
public SimulatorType type;
public int latency = 0;
public int maxConsumers = 50;
public BlockingCollection<byte[]> inputQueue = new BlockingCollection<byte[]>(new ConcurrentQueue<byte[]>());
public ConcurrentQueue<byte[]> delayedMessagesQueue = new ConcurrentQueue<byte[]>(); // must be thread-safe: multiple consumers enqueue into it concurrently
public void CreateConsumers()
{
for (int i = 0; i < maxConsumers; i++)
{
Task.Factory.StartNew(() => Consumer(),TaskCreationOptions.LongRunning);
}
}
private void Consumer()
{
foreach (var item in inputQueue.GetConsumingEnumerable())
{
Thread.Sleep(latency);
delayedMessagesQueue.Enqueue(item);
}
}
}
To use it, create a new LatencySimulator, set its 'type', max consumers and latency to simulate. Call CreateConsumers() and then populate the inputQueue. When the messages have been delayed, they will appear in the delayedMessagesQueue.
I do not really know if this is an ideal way to achieve my goal, but it works... for now.
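For what it's worth, the multiple-consumer version above cannot strictly guarantee ordering, since consumers may finish their sleeps out of order. One alternative (a sketch with made-up names, untested) is to stamp each message with a due time on arrival and let a single consumer sleep until that deadline; earlier arrivals always come due first, so order is preserved:
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class OrderedDelayQueue
{
    private readonly BlockingCollection<(byte[] Data, DateTime Due)> input =
        new BlockingCollection<(byte[] Data, DateTime Due)>();
    private readonly TimeSpan latency;

    // Delayed messages come out here, in the same order they went in.
    public ConcurrentQueue<byte[]> Delayed { get; } = new ConcurrentQueue<byte[]>();

    public OrderedDelayQueue(TimeSpan latency)
    {
        this.latency = latency;
        Task.Factory.StartNew(Consume, TaskCreationOptions.LongRunning);
    }

    public void Add(byte[] message) =>
        input.Add((message, DateTime.UtcNow + latency)); // stamp the due time on arrival

    private void Consume()
    {
        foreach (var (data, due) in input.GetConsumingEnumerable())
        {
            var wait = due - DateTime.UtcNow;
            if (wait > TimeSpan.Zero)
                Thread.Sleep(wait); // earlier arrivals always have earlier due times
            Delayed.Enqueue(data);
        }
    }
}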
How should the reading of bulk data from a device in C# be handled in .NET 4.0? Specifically I need to read quickly from a USB HID device that emits reports over 26 packets where order must be preserved.
I've tried doing this in a BackgroundWorker thread. It reads one packet from the device at a time and processes it before reading more. This gives reasonably good response times, but it is liable to lose a packet here and there, and the overhead cost of single-packet reads adds up.
while (!( sender as BackgroundWorker ).CancellationPending) {
//read a single packet
//check for header or footer
//process packet data
}
What is the best practice in C# for reading a device like this?
Background:
My USB HID device continuously reports a large amount of data. The data is split over 26 packets and I must preserve the order. Unfortunately the device only marks the first and last packets in each report, so I need to be able to catch all of the packets in between.
For .NET 4 you can use a BlockingCollection to provide a thread-safe queue that can be used by a producer and a consumer. The BlockingCollection.GetConsumingEnumerable() method provides an enumerator which automatically terminates when the queue has been marked as completed using CompleteAdding() and is empty.
Here's some sample code. The payload is an array of ints in this example, but of course you would use whatever data type you need.
Note that for your specific example, you can use the overload of GetConsumingEnumerable() which accepts an argument of type CancellationToken.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
namespace Demo
{
public static class Program
{
private static void Main()
{
var queue = new BlockingCollection<int[]>();
Task.Factory.StartNew(() => produce(queue));
consume(queue);
Console.WriteLine("Finished.");
}
private static void consume(BlockingCollection<int[]> queue)
{
foreach (var item in queue.GetConsumingEnumerable())
{
Console.WriteLine("Consuming " + item[0]);
Thread.Sleep(25);
}
}
private static void produce(BlockingCollection<int[]> queue)
{
for (int i = 0; i < 1000; ++i)
{
Console.WriteLine("Producing " + i);
var payload = new int[100];
payload[0] = i;
queue.Add(payload);
Thread.Sleep(20);
}
queue.CompleteAdding();
}
}
}
For .NET 4.5 and later, you could use the higher-level classes from Microsoft's Task Parallel Library, which has a wealth of functionality (and can be somewhat daunting at first sight).
Here's the same example using TPL DataFlow:
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace Demo
{
public static class Program
{
private static void Main()
{
var queue = new BufferBlock<int[]>();
Task.Factory.StartNew(() => produce(queue));
consume(queue).Wait();
Console.WriteLine("Finished.");
}
private static async Task consume(BufferBlock<int[]> queue)
{
while (await queue.OutputAvailableAsync())
{
var payload = await queue.ReceiveAsync();
Console.WriteLine("Consuming " + payload[0]);
await Task.Delay(25);
}
}
private static void produce(BufferBlock<int[]> queue)
{
for (int i = 0; i < 1000; ++i)
{
Console.WriteLine("Producing " + i);
var payload = new int[100];
payload[0] = i;
queue.Post(payload);
Thread.Sleep(20);
}
queue.Complete();
}
}
}
If missing packets is a concern, do not do your processing and your reading on the same thread. Starting with .NET 4.0 there is the System.Collections.Concurrent namespace, which makes this very easy to do. All you need is a BlockingCollection, which behaves as a queue for your incoming packets.
BlockingCollection<Packet> _queuedPackets = new BlockingCollection<Packet>(new ConcurrentQueue<Packet>());
void readingBackgroundWorker_DoWork(object sender, DoWorkEventArgs e)
{
while (!( sender as BackgroundWorker ).CancellationPending)
{
Packet packet = GetPacket();
_queuedPackets.Add(packet);
}
_queuedPackets.CompleteAdding();
}
void processingBackgroundWorker_DoWork(object sender, DoWorkEventArgs e)
{
List<Packet> report = new List<Packet>();
foreach(var packet in _queuedPackets.GetConsumingEnumerable())
{
report.Add(packet);
if(packet.IsLastPacket)
{
ProcessReport(report);
report = new List<Packet>();
}
}
}
What will happen is that while _queuedPackets is empty, _queuedPackets.GetConsumingEnumerable() will block the thread without consuming any resources. As soon as a packet arrives it will unblock and do the next iteration of the foreach.
When you call _queuedPackets.CompleteAdding(), the foreach on your processing thread will run until the collection is empty and then exit the loop. If you don't want it to "finish up the queue" when you cancel, you can easily change it to quit early. I'm also going to switch to using Tasks instead of BackgroundWorkers because it makes passing in parameters much easier.
void ReadingLoop(BlockingCollection<Packet> queue, CancellationToken token)
{
while (!token.IsCancellationRequested)
{
Packet packet = GetPacket();
queue.Add(packet);
}
queue.CompleteAdding();
}
void ProcessingLoop(BlockingCollection<Packet> queue, CancellationToken token)
{
List<Packet> report = new List<Packet>();
try
{
foreach(var packet in queue.GetConsumingEnumerable(token))
{
report.Add(packet);
if(packet.IsLastPacket)
{
ProcessReport(report);
report = new List<Packet>();
}
}
}
catch(OperationCanceledException)
{
//Do nothing, we don't care that it happened.
}
}
//This would replace your backgroundWorker.RunWorkerAsync() calls;
private void StartUpLoops()
{
var queue = new BlockingCollection<Packet>(new ConcurrentQueue<Packet>());
var cancelRead = new CancellationTokenSource();
var cancelProcess = new CancellationTokenSource();
Task.Factory.StartNew(() => ReadingLoop(queue, cancelRead.Token));
Task.Factory.StartNew(() => ProcessingLoop(queue, cancelProcess.Token));
//You can stop each loop independently by calling cancelRead.Cancel() or cancelProcess.Cancel()
}
So this is a continuation of my last question, which was:
"What is the best way to build a program that is thread safe in terms that it needs to write double values to a file. If the function that saves the values via streamwriter is being called by multiple threads? Whats the best way of doing it?"
I modified some code found on MSDN; how about the following? This version correctly writes everything to the file.
namespace SafeThread
{
class Program
{
static void Main()
{
Threading threader = new Threading();
AutoResetEvent autoEvent = new AutoResetEvent(false);
Thread regularThread =
new Thread(new ThreadStart(threader.ThreadMethod));
regularThread.Start();
ThreadPool.QueueUserWorkItem(new WaitCallback(threader.WorkMethod),
autoEvent);
// Wait for foreground thread to end.
regularThread.Join();
// Wait for background thread to end.
autoEvent.WaitOne();
}
}
class Threading
{
List<double> Values = new List<double>();
static readonly Object locker = new Object();
StreamWriter writer = new StreamWriter("file");
static int bulkCount = 0;
static int bulkSize = 100000;
public void ThreadMethod()
{
lock (locker)
{
while (bulkCount < bulkSize)
Values.Add(bulkCount++);
}
bulkCount = 0;
}
public void WorkMethod(object stateInfo)
{
lock (locker)
{
foreach (double V in Values)
{
writer.WriteLine(V);
writer.Flush();
}
}
// Signal that this thread is finished.
((AutoResetEvent)stateInfo).Set();
}
}
}
Thread and QueueUserWorkItem are the lowest available APIs for threading. I wouldn't use them unless I absolutely, finally, had no other choice. Try the Task class for a much higher-level abstraction. For details, see my recent blog post on the subject.
You can also use BlockingCollection<double> as a proper producer/consumer queue instead of trying to build one by hand with the lowest available APIs for synchronization.
Reinventing these wheels correctly is surprisingly difficult. I highly recommend using the classes designed for this type of need (Task and BlockingCollection, to be specific). They are built into the .NET 4.0 framework and are available as an add-on for .NET 3.5.
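To make that concrete, here is a minimal sketch of the Task + BlockingCollection combination for this file-writing case; the single consumer task owns the StreamWriter, so no lock around the file is needed at all:
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var values = new BlockingCollection<double>();

        // Single consumer: the only code that ever touches the writer.
        var writerTask = Task.Factory.StartNew(() =>
        {
            using (var writer = new StreamWriter("file"))
                foreach (var v in values.GetConsumingEnumerable())
                    writer.WriteLine(v);
        }, TaskCreationOptions.LongRunning);

        // Any number of producer threads can call Add concurrently.
        Parallel.For(0, 100, i => values.Add(i * 0.5));

        values.CompleteAdding(); // signals the consumer to drain and exit
        writerTask.Wait();
    }
}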
The code has the writer as an instance var but uses a static locker. If you had multiple instances writing to different files, there's no reason they would need to share the same lock.
On a related note, since you already have the writer (as a private instance var), you can use it for locking instead of a separate locker object in this case; that makes things a little simpler.
The 'right answer' really depends on what you're looking for in terms of locking/blocking behavior. For instance, the simplest thing would be to skip the intermediate data structure and just have a WriteValues method, such that each thread 'reporting' its results goes ahead and writes them to the file. Something like:
StreamWriter writer = new StreamWriter("file");
public void WriteValues(IEnumerable<double> values)
{
lock (writer)
{
foreach (var d in values)
{
writer.WriteLine(d);
}
writer.Flush();
}
}
Of course, this means worker threads serialize during their 'report results' phases - depending on the performance characteristics, that may be just fine though (5 minutes to generate, 500ms to write, for example).
On the other end of the spectrum, you'd have the worker threads write to a data structure. If you're in .NET 4, I'd recommend just using a ConcurrentQueue rather than doing that locking yourself.
Also, you may want to do the file I/O in bigger batches than those being reported by the worker threads, so you might choose to do the writing from a background thread on some frequency. That end of the spectrum looks something like the code below (you'd remove the Console.WriteLine calls in real code; they're just there so you can see it working in action).
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
public class ThreadSafeFileBuffer<T> : IDisposable
{
private readonly StreamWriter m_writer;
private readonly ConcurrentQueue<T> m_buffer = new ConcurrentQueue<T>();
private readonly Timer m_timer;
public ThreadSafeFileBuffer(string filePath, int flushPeriodInSeconds = 5)
{
m_writer = new StreamWriter(filePath);
var flushPeriod = TimeSpan.FromSeconds(flushPeriodInSeconds);
m_timer = new Timer(FlushBuffer, null, flushPeriod, flushPeriod);
}
public void AddResult(T result)
{
m_buffer.Enqueue(result);
Console.WriteLine("Buffer is up to {0} elements", m_buffer.Count);
}
public void Dispose()
{
Console.WriteLine("Turning off timer");
m_timer.Dispose();
Console.WriteLine("Flushing final buffer output");
FlushBuffer(); // flush anything left over in the buffer
Console.WriteLine("Closing file");
m_writer.Dispose();
}
/// <summary>
/// Since this is only done by one thread at a time (almost always the background flush thread, but one time via Dispose), no need to lock
/// </summary>
/// <param name="unused"></param>
private void FlushBuffer(object unused = null)
{
T current;
while (m_buffer.TryDequeue(out current))
{
Console.WriteLine("Buffer is down to {0} elements", m_buffer.Count);
m_writer.WriteLine(current);
}
m_writer.Flush();
}
}
class Program
{
static void Main(string[] args)
{
var tempFile = Path.GetTempFileName();
using (var resultsBuffer = new ThreadSafeFileBuffer<double>(tempFile))
{
Parallel.For(0, 100, i =>
{
// simulate some 'real work' by waiting for awhile
var sleepTime = new Random().Next(10000);
Console.WriteLine("Thread {0} doing work for {1} ms", Thread.CurrentThread.ManagedThreadId, sleepTime);
Thread.Sleep(sleepTime);
resultsBuffer.AddResult(Math.PI*i);
});
}
foreach (var resultLine in File.ReadAllLines(tempFile))
{
Console.WriteLine("Line from result: {0}", resultLine);
}
}
}
So you're saying you want a bunch of threads to write data to a single file using a StreamWriter? Easy. Just lock the StreamWriter object.
The code here will create 5 threads. Each thread will perform 5 "actions," and at the end of each action it will write 5 lines to a file named "file."
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
namespace ConsoleApplication1 {
class Program {
static void Main() {
StreamWriter Writer = new StreamWriter("file");
Action<int> ThreadProcedure = (i) => {
// A thread may perform many actions and write out the result after each action
// The outer loop here represents the multiple actions this thread will take
for (int x = 0; x < 5; x++) {
// Here is where the thread would generate the data for this action
// We'll simulate work time using a call to Sleep
Thread.Sleep(1000);
// After generating the data the thread needs to lock the Writer before using it.
lock (Writer) {
// Here we'll write a few lines to the Writer
for (int y = 0; y < 5; y++) {
Writer.WriteLine("Thread id = {0}; Action id = {1}; Line id = {2}", i, x, y);
}
}
}
};
//Now that we have a delegate for the thread code lets make a few instances
List<IAsyncResult> AsyncResultList = new List<IAsyncResult>();
for (int w = 0; w < 5; w++) {
AsyncResultList.Add(ThreadProcedure.BeginInvoke(w, null, null));
}
// Wait for all threads to complete
foreach (IAsyncResult r in AsyncResultList) {
r.AsyncWaitHandle.WaitOne();
}
// Flush/Close the writer so all data goes to disk
Writer.Flush();
Writer.Close();
}
}
}
The result should be a file "file" with 125 lines in it with all "actions" performed concurrently and the result of each action written synchronously to the file.
The code you have there is subtly broken - in particular, if the queued work item runs first, then it will flush the (empty) list of values immediately, before terminating, after which point your worker goes and fills up the List (which will end up being ignored). The auto-reset event also does nothing, since nothing ever queries or waits on its state.
Also, since each thread uses a different lock, the locks have no meaning! You need to make sure you hold a single, shared lock whenever accessing the streamwriter. You don't need a lock between the flushing code and the generation code; you just need to make sure the flush runs after the generation finishes.
You're probably on the right track, though - although I'd use a fixed-size array instead of a list, and flush all entries from the array when it gets full. This avoids the possibility of running out of memory if the thread is long-lived.
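A rough sketch of that fixed-size-buffer idea (the class name and buffer size are made up): entries are flushed whenever the array fills, so memory use stays bounded however long the threads live.
using System.IO;

class BufferedResultWriter
{
    private readonly double[] buffer = new double[4096]; // arbitrary size
    private int count;
    private readonly StreamWriter writer = new StreamWriter("file");

    public void Add(double value)
    {
        lock (writer) // single shared lock guards both the buffer and the writer
        {
            buffer[count++] = value;
            if (count == buffer.Length)
                FlushLocked();
        }
    }

    private void FlushLocked() // caller must hold the lock
    {
        for (int i = 0; i < count; i++)
            writer.WriteLine(buffer[i]);
        writer.Flush();
        count = 0;
    }
}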
I've recently come across a producer/consumer pattern C# implementation. It's very simple and (for me at least) very elegant.
It seems to have been devised around 2006, so I was wondering if this implementation is
- safe
- still applicable
Code is below (the original code was referenced at http://bytes.com/topic/net/answers/575276-producer-consumer#post2251375)
using System;
using System.Collections;
using System.Threading;
public class Test
{
static ProducerConsumer queue;
static void Main()
{
queue = new ProducerConsumer();
new Thread(new ThreadStart(ConsumerJob)).Start();
Random rng = new Random(0);
for (int i=0; i < 10; i++)
{
Console.WriteLine ("Producing {0}", i);
queue.Produce(i);
Thread.Sleep(rng.Next(1000));
}
}
static void ConsumerJob()
{
// Make sure we get a different random seed from the
// first thread
Random rng = new Random(1);
// We happen to know we've only got 10
// items to receive
for (int i=0; i < 10; i++)
{
object o = queue.Consume();
Console.WriteLine ("\t\t\t\tConsuming {0}", o);
Thread.Sleep(rng.Next(1000));
}
}
}
public class ProducerConsumer
{
readonly object listLock = new object();
Queue queue = new Queue();
public void Produce(object o)
{
lock (listLock)
{
queue.Enqueue(o);
// We always need to pulse, even if the queue wasn't
// empty before. Otherwise, if we add several items
// in quick succession, we may only pulse once, waking
// a single thread up, even if there are multiple threads
// waiting for items.
Monitor.Pulse(listLock);
}
}
public object Consume()
{
lock (listLock)
{
// If the queue is empty, wait for an item to be added
// Note that this is a while loop, as we may be pulsed
// but not wake up before another thread has come in and
// consumed the newly added object. In that case, we'll
// have to wait for another pulse.
while (queue.Count==0)
{
// This releases listLock, only reacquiring it
// after being woken up by a call to Pulse
Monitor.Wait(listLock);
}
return queue.Dequeue();
}
}
}
The code is older than that - I wrote it some time before .NET 2.0 came out. The concept of a producer/consumer queue is way older than that though :)
Yes, that code is safe as far as I'm aware - but it has some deficiencies:
It's non-generic. A modern version would certainly be generic.
It has no way of stopping the queue. One simple way of stopping the queue (so that all the consumer threads retire) is to have a "stop work" token which can be put into the queue. You then add as many tokens as you have threads. Alternatively, you have a separate flag to indicate that you want to stop. (This allows the other threads to stop before finishing all the current work in the queue.)
If the jobs are very small, consuming a single job at a time may not be the most efficient thing to do.
The ideas behind the code are more important than the code itself, to be honest.
You could do something like the following code snippet. It's generic and has a method for enqueue-ing nulls (or whatever flag you'd like to use) to tell the worker threads to exit.
The code is taken from here: http://www.albahari.com/threading/part4.aspx#_Wait_and_Pulse
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace ConsoleApplication1
{
public class TaskQueue<T> : IDisposable where T : class
{
object locker = new object();
Thread[] workers;
Queue<T> taskQ = new Queue<T>();
public TaskQueue(int workerCount)
{
workers = new Thread[workerCount];
// Create and start a separate thread for each worker
for (int i = 0; i < workerCount; i++)
(workers[i] = new Thread(Consume)).Start();
}
public void Dispose()
{
// Enqueue one null task per worker to make each exit.
foreach (Thread worker in workers) EnqueueTask(null);
foreach (Thread worker in workers) worker.Join();
}
public void EnqueueTask(T task)
{
lock (locker)
{
taskQ.Enqueue(task);
Monitor.PulseAll(locker);
}
}
void Consume()
{
while (true)
{
T task;
lock (locker)
{
while (taskQ.Count == 0) Monitor.Wait(locker);
task = taskQ.Dequeue();
}
if (task == null) return; // This signals our exit
Console.Write(task);
Thread.Sleep(1000); // Simulate time-consuming task
}
}
}
}
Back in the day I learned how Monitor.Wait/Pulse works (and a lot about threads in general) from the above piece of code and the article series it is from. So as Jon says, it has a lot of value to it and is indeed safe and applicable.
However, as of .NET 4, there is a producer-consumer queue implementation in the framework: BlockingCollection. I only just found it myself, but up to this point it does everything I need.
These days a more modern option is available using the System.Threading.Tasks.Dataflow namespace. It's async/await friendly and much more versatile.
More info here: How to: Implement a producer-consumer dataflow pattern
It's included starting from .NET Core; for older versions of .NET you may need to install a package with the same name as the namespace.
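A minimal sketch with an ActionBlock, the simplest producer-consumer shape in that namespace (the capacity and delay values are arbitrary):
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Demo
{
    static async Task Main()
    {
        // The delegate is the consumer; SendAsync is the producer side.
        var block = new ActionBlock<int>(
            async item =>
            {
                await Task.Delay(100); // simulate work
                Console.WriteLine($"Consumed {item}");
            },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 100 });

        for (int i = 0; i < 10; i++)
            await block.SendAsync(i); // awaits (backpressure) when the block is full

        block.Complete();        // no more items
        await block.Completion;  // wait for the consumer to drain
    }
}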
I know the question is old, but it's the first match in Google for my request, so I decided to update the topic.
A modern and simple way to implement the producer/consumer pattern in C# is to use System.Threading.Channels. It's asynchronous and uses ValueTask to decrease memory allocations. Here is an example:
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;
public class ProducerConsumer<T>
{
protected readonly Channel<T> JobChannel = Channel.CreateUnbounded<T>();
public IAsyncEnumerable<T> GetAllAsync()
{
return JobChannel.Reader.ReadAllAsync();
}
public async ValueTask AddAsync(T job)
{
await JobChannel.Writer.WriteAsync(job);
}
public async ValueTask AddAsync(IEnumerable<T> jobs)
{
foreach (var job in jobs)
{
await JobChannel.Writer.WriteAsync(job);
}
}
}
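Usage might look like this (note that the class above never completes the channel, so the consumer loop runs until the process shuts down; you would need to expose completion separately):
var queue = new ProducerConsumer<string>();

// Consumer: drains the channel as items arrive.
var consumer = Task.Run(async () =>
{
    await foreach (var job in queue.GetAllAsync())
        Console.WriteLine($"Processing {job}");
});

// Producer(s):
await queue.AddAsync("job1");
await queue.AddAsync(new[] { "job2", "job3" });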
Warning: If you read the comments, you'll understand my answer is wrong :)
There's a possible deadlock in your code.
Imagine the following case. For clarity I used a single-threaded approach, but it should be easy to convert to multiple threads with Sleep:
// We create some actions...
object locker = new object();
Action action1 = () => {
lock (locker)
{
System.Threading.Monitor.Wait(locker);
Console.WriteLine("This is action1");
}
};
Action action2 = () => {
lock (locker)
{
System.Threading.Monitor.Wait(locker);
Console.WriteLine("This is action2");
}
};
// ... (stuff happens, etc.)
// Imagine both actions were running
// and there's 0 items in the queue
// And now the producer kicks in...
lock (locker)
{
// This would add a job to the queue
Console.WriteLine("Pulse now!");
System.Threading.Monitor.Pulse(locker);
}
// ... (more stuff)
// and the actions finish now!
Console.WriteLine("Consume action!");
action1(); // Oops... they're locked...
action2();
Please do let me know if this doesn't make any sense.
If this is confirmed, then the answer to your question is, "no, it isn't safe" ;)
I hope this helps.
public class ProducerConsumerProblem
{
private int n;
object obj = new object();
public ProducerConsumerProblem(int n)
{
this.n = n;
}
public void Producer()
{
for (int i = 0; i < n; i++)
{
lock (obj)
{
Console.Write("Producer =>");
System.Threading.Monitor.Pulse(obj);
System.Threading.Thread.Sleep(1);
System.Threading.Monitor.Wait(obj);
}
}
}
public void Consumer()
{
lock (obj)
{
for (int i = 0; i < n; i++)
{
System.Threading.Monitor.Wait(obj, 10);
Console.Write("<= Consumer");
System.Threading.Monitor.Pulse(obj);
Console.WriteLine();
}
}
}
}
public class Program
{
static void Main(string[] args)
{
ProducerConsumerProblem f = new ProducerConsumerProblem(10);
System.Threading.Thread t1 = new System.Threading.Thread(() => f.Producer());
System.Threading.Thread t2 = new System.Threading.Thread(() => f.Consumer());
t1.IsBackground = true;
t2.IsBackground = true;
t1.Start();
t2.Start();
Console.ReadLine();
}
}
Output:
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer