Thread safe StreamWriter C# how to do it? 2

This is a continuation of my last question, which was:
"What is the best way to build a program that is thread safe, in terms that it needs to write double values to a file? If the function that saves the values via StreamWriter is being called by multiple threads, what's the best way of doing it?"
I modified some code found on MSDN - how about the following? This one correctly writes everything to the file.
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

namespace SafeThread
{
    class Program
    {
        static void Main()
        {
            Threading threader = new Threading();
            AutoResetEvent autoEvent = new AutoResetEvent(false);

            Thread regularThread =
                new Thread(new ThreadStart(threader.ThreadMethod));
            regularThread.Start();

            ThreadPool.QueueUserWorkItem(new WaitCallback(threader.WorkMethod),
                autoEvent);

            // Wait for foreground thread to end.
            regularThread.Join();

            // Wait for background thread to end.
            autoEvent.WaitOne();
        }
    }

    class Threading
    {
        List<double> Values = new List<double>();
        static readonly Object locker = new Object();
        StreamWriter writer = new StreamWriter("file");
        static int bulkCount = 0;
        static int bulkSize = 100000;

        public void ThreadMethod()
        {
            lock (locker)
            {
                while (bulkCount < bulkSize)
                    Values.Add(bulkCount++);
            }
            bulkCount = 0;
        }

        public void WorkMethod(object stateInfo)
        {
            lock (locker)
            {
                foreach (double V in Values)
                {
                    writer.WriteLine(V);
                    writer.Flush();
                }
            }
            // Signal that this thread is finished.
            ((AutoResetEvent)stateInfo).Set();
        }
    }
}

Thread and QueueUserWorkItem are the lowest-level APIs available for threading. I wouldn't use them unless I absolutely, finally, had no other choice. Try the Task class for a much higher-level abstraction. For details, see my recent blog post on the subject.
You can also use BlockingCollection<double> as a proper producer/consumer queue instead of trying to build one by hand with the lowest-level synchronization APIs.
Reinventing these wheels correctly is surprisingly difficult. I highly recommend using the classes designed for this type of need (Task and BlockingCollection, to be specific). They are built into .NET 4.0 and are available as an add-on for .NET 3.5.
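For illustration, here's a minimal sketch of that combination - the file name and the shape of the producer loop are my own assumptions, not code from the question:
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class BlockingCollectionSketch
{
    static void Main()
    {
        var queue = new BlockingCollection<double>();

        // A single consumer task owns the StreamWriter, so no lock is needed.
        var writerTask = Task.Factory.StartNew(() =>
        {
            using (var writer = new StreamWriter("file"))
            {
                // GetConsumingEnumerable blocks until items arrive and
                // completes once CompleteAdding has been called.
                foreach (var value in queue.GetConsumingEnumerable())
                    writer.WriteLine(value);
            }
        }, TaskCreationOptions.LongRunning);

        // Any number of producer threads can call Add concurrently.
        Parallel.For(0, 100000, i => queue.Add(i));

        queue.CompleteAdding(); // signal "no more items"
        writerTask.Wait();      // wait for the writer to drain the queue
    }
}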

The code has the writer as an instance variable but uses a static locker. If you had multiple instances writing to different files, there's no reason they would need to share the same lock.
On a related note, since you already have the writer (as a private instance variable), you can lock on that instead of using a separate locker object; that makes things a little simpler.
The 'right answer' really depends on what you're looking for in terms of locking/blocking behavior. For instance, the simplest thing would be to skip the intermediate data structure and just have a WriteValues method, so that each thread 'reporting' its results goes ahead and writes them to the file. Something like:
StreamWriter writer = new StreamWriter("file");

public void WriteValues(IEnumerable<double> values)
{
    lock (writer)
    {
        foreach (var d in values)
        {
            writer.WriteLine(d);
        }
        writer.Flush();
    }
}
Of course, this means worker threads serialize during their 'report results' phases - depending on the performance characteristics, that may be just fine (5 minutes to generate, 500 ms to write, for example).
On the other end of the spectrum, you'd have the worker threads write to a data structure. If you're on .NET 4, I'd recommend just using a ConcurrentQueue rather than doing that locking yourself.
Also, you may want to do the file I/O in bigger batches than those being reported by the worker threads, so you might choose to do the writing in a background thread on some frequency. That end of the spectrum looks something like the code below (you'd remove the Console.WriteLine calls in real code; those are just there so you can see it working).
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

public class ThreadSafeFileBuffer<T> : IDisposable
{
    private readonly StreamWriter m_writer;
    private readonly ConcurrentQueue<T> m_buffer = new ConcurrentQueue<T>();
    private readonly Timer m_timer;

    public ThreadSafeFileBuffer(string filePath, int flushPeriodInSeconds = 5)
    {
        m_writer = new StreamWriter(filePath);
        var flushPeriod = TimeSpan.FromSeconds(flushPeriodInSeconds);
        m_timer = new Timer(FlushBuffer, null, flushPeriod, flushPeriod);
    }

    public void AddResult(T result)
    {
        m_buffer.Enqueue(result);
        Console.WriteLine("Buffer is up to {0} elements", m_buffer.Count);
    }

    public void Dispose()
    {
        Console.WriteLine("Turning off timer");
        m_timer.Dispose();
        Console.WriteLine("Flushing final buffer output");
        FlushBuffer(); // flush anything left over in the buffer
        Console.WriteLine("Closing file");
        m_writer.Dispose();
    }

    /// <summary>
    /// Since this is only done by one thread at a time (almost always the
    /// background flush thread, but once via Dispose), no need to lock.
    /// </summary>
    private void FlushBuffer(object unused = null)
    {
        T current;
        while (m_buffer.TryDequeue(out current))
        {
            Console.WriteLine("Buffer is down to {0} elements", m_buffer.Count);
            m_writer.WriteLine(current);
        }
        m_writer.Flush();
    }
}
class Program
{
    static void Main(string[] args)
    {
        var tempFile = Path.GetTempFileName();
        using (var resultsBuffer = new ThreadSafeFileBuffer<double>(tempFile))
        {
            Parallel.For(0, 100, i =>
            {
                // simulate some 'real work' by waiting for a while
                var sleepTime = new Random().Next(10000);
                Console.WriteLine("Thread {0} doing work for {1} ms", Thread.CurrentThread.ManagedThreadId, sleepTime);
                Thread.Sleep(sleepTime);
                resultsBuffer.AddResult(Math.PI * i);
            });
        }
        foreach (var resultLine in File.ReadAllLines(tempFile))
        {
            Console.WriteLine("Line from result: {0}", resultLine);
        }
    }
}

So you're saying you want a bunch of threads to write data to a single file using a StreamWriter? Easy. Just lock the StreamWriter object.
The code here will create 5 threads. Each thread will perform 5 "actions," and at the end of each action it will write 5 lines to a file named "file."
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

namespace ConsoleApplication1 {
    class Program {
        static void Main() {
            StreamWriter Writer = new StreamWriter("file");

            Action<int> ThreadProcedure = (i) => {
                // A thread may perform many actions and write out the result after each action.
                // The outer loop here represents the multiple actions this thread will take.
                for (int x = 0; x < 5; x++) {
                    // Here is where the thread would generate the data for this action.
                    // We'll simulate work time using a call to Sleep.
                    Thread.Sleep(1000);

                    // After generating the data the thread needs to lock the Writer before using it.
                    lock (Writer) {
                        // Here we'll write a few lines to the Writer.
                        for (int y = 0; y < 5; y++) {
                            Writer.WriteLine("Thread id = {0}; Action id = {1}; Line id = {2}", i, x, y);
                        }
                    }
                }
            };

            // Now that we have a delegate for the thread code, let's make a few instances.
            List<IAsyncResult> AsyncResultList = new List<IAsyncResult>();
            for (int w = 0; w < 5; w++) {
                AsyncResultList.Add(ThreadProcedure.BeginInvoke(w, null, null));
            }

            // Wait for all threads to complete.
            foreach (IAsyncResult r in AsyncResultList) {
                r.AsyncWaitHandle.WaitOne();
            }

            // Flush/Close the writer so all data goes to disk.
            Writer.Flush();
            Writer.Close();
        }
    }
}
The result should be a file "file" with 125 lines in it with all "actions" performed concurrently and the result of each action written synchronously to the file.

The code you have there is subtly broken - in particular, if the queued work item runs first, it will flush the (empty) list of values immediately before terminating, after which point your worker goes and fills up the list (which will end up being ignored). The auto-reset event also does nothing, since nothing ever queries or waits on its state.
Also, since each thread uses a different lock, the locks have no meaning! You need to make sure you hold a single, shared lock whenever accessing the StreamWriter. You don't need a lock between the flushing code and the generation code, though; you just need to make sure the flush runs after the generation finishes.
You're probably on the right track, though - although I'd use a fixed-size array instead of a list, and flush all entries from the array when it gets full. This avoids the possibility of running out of memory if the thread is long-lived.
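For instance, a rough sketch of that fixed-size-buffer idea (the names, the buffer size, and the file name are made up for illustration):
private readonly StreamWriter writer = new StreamWriter("file");
private readonly double[] buffer = new double[100000];
private int count;

// Callers are expected to hold the shared lock; the buffer is flushed to the
// writer whenever it fills, so memory stays bounded however long the thread lives.
public void Add(double value)
{
    buffer[count++] = value;
    if (count == buffer.Length)
        FlushBuffer();
}

private void FlushBuffer()
{
    for (int i = 0; i < count; i++)
        writer.WriteLine(buffer[i]);
    writer.Flush();
    count = 0;
}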

Related

How to call a method on a running thread?

In a console application, I am currently starting an array of threads. Each thread is passed an object and runs a method on it. I would like to know how to call a method on the object inside the individual running threads.
Dispatcher doesn't work. SynchronizationContext's Send runs on the calling thread, and Post uses a new thread. I would like to be able to call the method (and pass parameters) on the target thread it's running on - not on the calling thread and not on a new thread.
Update 2: Sample code
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace CallingFromAnotherThread
{
    class Program
    {
        static void Main(string[] args)
        {
            var threadCount = 10;
            var threads = new Thread[threadCount];
            Console.WriteLine("Main on Thread " + Thread.CurrentThread.ManagedThreadId);
            for (int i = 0; i < threadCount; i++)
            {
                Dog d = new Dog();
                threads[i] = new Thread(d.Run);
                threads[i].Start();
            }
            Thread.Sleep(5000);
            // How can I call dog.Bark("woof")
            // on the individual dogs, and make sure they run on the thread they were created on,
            // not on the calling thread and not on a new thread?
        }
    }

    class Dog
    {
        public void Run()
        {
            Console.WriteLine("Running on Thread " + Thread.CurrentThread.ManagedThreadId);
        }

        public void Bark(string text)
        {
            Console.WriteLine(text);
            Console.WriteLine("Barking on Thread " + Thread.CurrentThread.ManagedThreadId);
        }
    }
}
Update 1:
Using SynchronizationContext.Send results in using the calling thread:
Channel created
Main thread 10
SyncData Added for thread 11
Consuming channel ran on thread 11
Calling AddConsumer on thread 10
Consumer added consumercb78b. Executed on thread 10
Calling AddConsumer on thread 10
Consumer added consumer783c4. Executed on thread 10
Using SynchronizationContext.Post results in using a different thread:
Channel created
Main thread 10
SyncData Added for thread 11
Consuming channel ran on thread 11
Calling AddConsumer on thread 12
Consumer added consumercb78b. Executed on thread 6
Calling AddConsumer on thread 10
Consumer added consumer783c4. Executed on thread 7
The target thread must run the code "on itself" - or it is just accessing the object across threads. This is done with some form of event dispatch loop on the target thread itself.
The SynchronizationContext abstraction can and does support this if the underlying provider supports it. For example in either WinForms or WPF (which themselves use the "window message pump") using Post will "run on the UI thread".
Basically, all such constructs follow some variation of the pattern:
// On "target thread"
while (running) {
var action = getNextDelegateFromQueue();
action();
}
// On other thread
postDelegateToQueue(actionToDoOnTargetThread);
It is fairly simple to create a primitive queue system manually - just make sure to use the correct synchronization guards. (Although I am sure there are tidy "solved problem" libraries out there; including wrapping everything up into a SynchronizationContext.)
Here is a primitive version of such a manual queue. Note that there is a race condition (see footnote 1 below) - but, FWIW:
using System;
using System.Collections.Concurrent;
using System.Threading;

namespace DogPark
{
    internal class DogPark
    {
        private readonly string _parkName;
        private readonly Thread _thread;
        private readonly ConcurrentQueue<Action> _actions = new ConcurrentQueue<Action>();
        private volatile bool _isOpen;

        public DogPark(string parkName)
        {
            _parkName = parkName;
            _isOpen = true;
            _thread = new Thread(OpenPark);
            _thread.Name = parkName;
            _thread.Start();
        }

        // Runs in "target" thread
        private void OpenPark(object obj)
        {
            while (true)
            {
                Action action;
                if (_actions.TryDequeue(out action))
                {
                    Program.WriteLine("Something is happening at {0}!", _parkName);
                    try
                    {
                        action();
                    }
                    catch (Exception ex)
                    {
                        Program.WriteLine("Bad dog did {0}!", ex.Message);
                    }
                }
                else
                {
                    // Nothing left!
                    if (!_isOpen && _actions.IsEmpty)
                    {
                        return;
                    }
                }
                Thread.Sleep(0); // Don't toaster CPU
            }
        }

        // Called from external thread
        public void DoItInThePark(Action action)
        {
            if (_isOpen)
            {
                _actions.Enqueue(action);
            }
        }

        // Called from external thread
        public void ClosePark()
        {
            _isOpen = false;
            Program.WriteLine("{0} is closing for the day!", _parkName);
            // Block until queue empty.
            while (!_actions.IsEmpty)
            {
                Program.WriteLine("Waiting for the dogs to finish at {0}, {1} actions left!", _parkName, _actions.Count);
                Thread.Sleep(0); // Don't toaster CPU
            }
            Program.WriteLine("{0} is closed!", _parkName);
        }
    }

    internal class Dog
    {
        private readonly string _name;

        public Dog(string name)
        {
            _name = name;
        }

        public void Run()
        {
            Program.WriteLine("{0} is running at {1}!", _name, Thread.CurrentThread.Name);
        }

        public void Bark()
        {
            Program.WriteLine("{0} is barking at {1}!", _name, Thread.CurrentThread.Name);
        }
    }

    internal class Program
    {
        // "Thread Safe WriteLine" - takes a format string plus object args so
        // that non-string arguments (like _actions.Count above) compile and format.
        public static void WriteLine(string format, params object[] arguments)
        {
            lock (Console.Out)
            {
                Console.Out.WriteLine(format, arguments);
            }
        }

        private static void Main(string[] args)
        {
            Thread.CurrentThread.Name = "Home";

            var yorkshire = new DogPark("Yorkshire");
            var thunderpass = new DogPark("Thunder Pass");

            var bill = new Dog("Bill the Terrier");
            var rosemary = new Dog("Rosie");

            bill.Run();
            yorkshire.DoItInThePark(rosemary.Run);
            yorkshire.DoItInThePark(rosemary.Bark);
            thunderpass.DoItInThePark(bill.Bark);
            yorkshire.DoItInThePark(rosemary.Run);

            thunderpass.ClosePark();
            yorkshire.ClosePark();
        }
    }
}
The output should look about like the following - keep in mind that the exact ordering will change from run to run due to the inherent nature of unsynchronized threads.
Bill the Terrier is running at Home!
Something is happening at Thunder Pass!
Something is happening at Yorkshire!
Rosie is running at Yorkshire!
Bill the Terrier is barking at Thunder Pass!
Something is happening at Yorkshire!
Rosie is barking at Yorkshire!
Something is happening at Yorkshire!
Rosie is running at Yorkshire!
Thunder Pass is closing for the day!
Thunder Pass is closed!
Yorkshire is closing for the day!
Yorkshire is closed!
There is nothing preventing a dog from performing at multiple dog parks simultaneously.
1 There is a race condition present, and it is this: a park may close before the last dog action runs.
This is because the dog park thread dequeues the action before the action is run, while the method to close the dog park only waits until all the actions are dequeued.
There are multiple ways to address it, for instance:
The queue loop could peek first, run the action, and only dequeue it after the action completes, or
A separate volatile isClosed-for-real flag (set from the dog park thread) could be used.
I've left the bug in as a reminder of the perils of threading. (A sketch of the first option follows below.)
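For what it's worth, the first option might look roughly like this inside OpenPark's loop - a sketch only, and it works here only because each park has exactly one consumer thread:
// Peek-use-then-dequeue: the action stays visible in the queue while it runs,
// so ClosePark's emptiness check cannot succeed mid-action.
Action action;
if (_actions.TryPeek(out action))
{
    try
    {
        action();
    }
    catch (Exception ex)
    {
        Program.WriteLine("Bad dog did {0}!", ex.Message);
    }
    _actions.TryDequeue(out action); // remove it only after it has run
}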
A running thread is already executing a method. You cannot directly force that thread to leave the method and enter a new one. However, you can send information to that thread telling it to leave the current method and do something else - but this only works if the executing method can react to that passed information.
In general, you can use threads to call/execute methods, but you cannot call a method ON a running thread.
Edit, based on your updates:
If you want to use the same threads to execute dog.Run and dog.Bark, and do it on the same objects, then you need to modify your code:
static void Main(string[] args)
{
    var threadCount = 10;
    var threads = new Thread[threadCount];
    Console.WriteLine("Main on Thread " + Thread.CurrentThread.ManagedThreadId);

    // Keep the dog objects outside the creation block in order to access them later again. Always useful.
    Dog[] dogs = new Dog[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
        dogs[i] = new Dog();
        threads[i] = new Thread(dogs[i].Run);
        threads[i].Start();
    }

    Thread.Sleep(5000);

    // "How can I call dog.Bark("woof")?" --> here you go:
    for (int i = 0; i < threadCount; i++)
    {
        Dog dog = dogs[i]; // capture the dog per iteration
        threads[i] = new Thread(() => dog.Bark("woof"));
        threads[i].Start();
    }
    // But this will create NEW threads, because the others have exited after finishing Run
    // and have been deleted. Is this a problem for you?
    // Maybe the other threads are still running, causing a parallel execution of Run and Bark.

    // "on the individual dogs and make sure they run on the thread they were created on,
    // not on the calling thread and not on a new thread" -->
    // Instead of Run, call DoActions and loop inside that function, checking for commands from external sources:
    for (int i = 0; i < threadCount; i++)
    {
        threads[i] = new Thread(dogs[i].DoActions);
        threads[i].Start();
    }
    // But in this case there will be sequential execution of actions. No parallel Run and Bark.
}
Inside your dog class:
enum EnumAction
{
    Nothing,
    Run,
    Bark,
    Exit,
};

EnumAction m_enAction;
object m_oLockAction = new object();

void SetAction(EnumAction i_enAction)
{
    Monitor.Enter(m_oLockAction);
    m_enAction = i_enAction;
    Monitor.Exit(m_oLockAction);
}

EnumAction GetAction()
{
    Monitor.Enter(m_oLockAction);
    EnumAction enAction = m_enAction;
    Monitor.Exit(m_oLockAction);
    return enAction;
}

void DoActions()
{
    EnumAction enAction;
    do
    {
        Thread.Sleep(20);
        enAction = GetAction();
        switch (enAction)
        {
            case EnumAction.Run:
                Run();
                break;
            // case ... handle the remaining actions the same way
        }
    } while (enAction != EnumAction.Exit);
}
Got it? ;-)
Sorry for any typos, I was typing on my mobile phone, and I usually use C++CLI.
One more piece of advice: as you read the variable m_enAction inside the thread and write it from outside, you need to ensure that it gets updated properly when accessed from different threads. The threads MUST NOT cache the variable in a register, otherwise they won't see it changing. Use locks (e.g. Monitor) to achieve that. (But do not lock on m_enAction itself, because you can use Monitors only on reference-type objects. Create a dummy object for this purpose.)
I have added the necessary code. Check out the differences between the edits to see the changes.
You cannot run a second method while the first method is running on the same thread. If you want them to run in parallel, you need another thread; however, your object then needs to be thread safe.
Execution on a thread simply means execution of a sequence of instructions. A dispatcher is nothing more than an infinite loop that executes queued methods one after another.
I recommend using tasks instead of threads. Use Parallel.ForEach to run the Dog.Run method on each dog object instance. To run the Bark method, use Task.Run(() => dog.Bark("woof")).
Since you used running and barking dogs as an example, you could write your own "dispatcher": an infinite loop that executes all queued work. In that case you could have all dogs in a single thread. It sounds weird, but you could have an unlimited number of dogs. In the end, only as many threads can execute at the same time as there are CPU cores available.
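A minimal sketch of the task-based suggestion (assuming the Dog class from the question, whose Bark takes a string):
using System.Linq;
using System.Threading.Tasks;

static void RunAndBark(Dog[] dogs)
{
    // Run() on every dog, spread across the thread pool.
    Parallel.ForEach(dogs, dog => dog.Run());

    // Bark() on every dog; which pool thread runs each call is up to the
    // scheduler - the work is not pinned to any particular thread.
    Task.WaitAll(dogs.Select(dog => Task.Run(() => dog.Bark("woof"))).ToArray());
}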

Unordered threads problem

I asked a question about locks here before, and people responded that there was no problem with my lock implementation. But I have caught a problem. Here is the same lock implementation, and I am getting a weird result: I expect to see the numbers starting from 1, but they start from 5. The example is below.
class Program
{
    static object locker = new object();

    static void Main(string[] args)
    {
        for (int j = 0; j < 100; j++)
        {
            (new Thread(new ParameterizedThreadStart(dostuff))).Start(j);
        }
        Console.ReadKey();
    }

    static void dostuff(dynamic input)
    {
        lock (locker)
        {
            Console.WriteLine(input);
        }
    }
}
The code is fine. But you cannot guarantee the order the threads are executed in. When I run the code I get:
0
1
3
5
2
4
6
10
9
11
7
12
8
etc
If you need to run the threads in a specified order, you could look into using ThreadPool.QueueUserWorkItem instead.
class Program
{
    static object locker = new object();
    static EventWaitHandle clearCount
        = new EventWaitHandle(false, EventResetMode.ManualReset);

    static void Main(string[] args)
    {
        for (int j = 0; j < 100; j++)
        {
            ThreadPool.QueueUserWorkItem(dostuff, j);
        }
        clearCount.WaitOne();
    }

    static void dostuff(dynamic input)
    {
        lock (locker)
        {
            Console.WriteLine(input);
            if (input == 99) clearCount.Set();
        }
    }
}
It doesn't make sense to put a lock where you're putting it, as you're not locking code which changes a value shared by multiple threads. The section of code you're locking doesn't change any variables at all.
The reason the numbers are out of order is that the threads aren't guaranteed to start in any particular order, unless you do something like @Mikael Svenson suggests.
For an example of a shared variable, if you use this code:
class Program
{
    static object locker = new object();
    static int count = 0;

    static void Main(string[] args)
    {
        for (int j = 0; j < 100; j++)
        {
            (new Thread(new ParameterizedThreadStart(dostuff))).Start(j);
        }
        Console.ReadKey();
    }

    static void dostuff(object Id)
    {
        lock (locker)
        {
            count++;
            Console.WriteLine("Thread {0}: Count is {1}", Id, count);
        }
    }
}
You'll probably see that the Thread numbers aren't in order, but the count is. If you remove the lock statement, the count won't be in order either.
You have a couple of problems and wrong assumptions here.
Creating 100 threads in this fashion is not recommended.
The threads are not going to execute in the order they are started.
Placing the lock where you have it will effectively serialize the execution of the threads, immediately removing any advantage you were hoping to gain by using threading.
The best approach is to partition your problem into separate, independent chunks which can be computed simultaneously, using as little thread synchronization as possible. These partitions should be executed on a small and fairly static number of threads. You can use the ThreadPool, Parallel, or Task classes for doing this.
I have included a sample pattern using the Parallel.For method. To make the sample easy to understand, let's say you have a list of objects that you want to clone into a separate list. Let's assume the clone operation is expensive and that you want to parallelize the cloning of many objects. Here is how you would do it. Notice the placement and limited use of the lock keyword.
public static void Main()
{
    List<ICloneable> original = GetCloneableObjects();
    List<ICloneable> copies = new List<ICloneable>();
    Parallel.For(0, 100,
        i =>
        {
            ICloneable cloneable = original[i];
            ICloneable copy = cloneable.Clone();
            lock (copies)
            {
                copies.Add(copy);
            }
        });
}

Running multiple threads, starting new one as another finishes

I have an application that has many cases. Each case has many multipage tif files. I need to convert the tif files to pdf files. Since there are so many files, I thought I could thread the conversion process. I'm currently limiting the process to ten conversions at a time (i.e. ten threads). When one conversion completes, another should start.
This is the current setup I'm using.
private void ConvertFiles()
{
    List<AutoResetEvent> semaphores = new List<AutoResetEvent>();
    foreach (String fileName in filesToConvert)
    {
        String file = fileName;
        if (semaphores.Count >= 10)
        {
            WaitHandle.WaitAny(semaphores.ToArray());
        }
        AutoResetEvent semaphore = new AutoResetEvent(false);
        semaphores.Add(semaphore);
        ThreadPool.QueueUserWorkItem(
            delegate
            {
                Convert(file);
                semaphore.Set();
                semaphores.Remove(semaphore);
            }, null);
    }
    if (semaphores.Count > 0)
    {
        WaitHandle.WaitAll(semaphores.ToArray());
    }
}
Using this sometimes results in an exception stating that the WaitHandle.WaitAll() or WaitHandle.WaitAny() array parameters must not exceed a length of 65. What am I doing wrong in this approach, and how can I correct it?
There are a few problems with what you have written.
1st, it isn't thread safe. You have multiple threads adding, removing, and waiting on the list of AutoResetEvents. The individual elements of the List can be accessed on separate threads, but anything that adds, removes, or checks all elements (like the WaitAny call) needs to do so inside a lock.
2nd, there is no guarantee that your code will only process 10 files at a time. The window between when the size of the List is checked and the point where a new item is added is open for multiple threads to get through.
3rd, there is potential for the threads started via QueueUserWorkItem to convert the same file. Without capturing fileName inside the loop, the thread that converts the file will use whatever value is in fileName when it executes, NOT whatever was in fileName when you called QueueUserWorkItem.
This CodeProject article should point you in the right direction for what you are trying to do: http://www.codeproject.com/KB/threads/SchedulingEngine.aspx
EDIT:
var semaphores = new List<AutoResetEvent>();
foreach (String fileName in filesToConvert)
{
    String file = fileName;
    AutoResetEvent[] array;
    lock (semaphores)
    {
        array = semaphores.ToArray();
    }
    if (array.Length >= 10)
    {
        WaitHandle.WaitAny(array);
    }
    var semaphore = new AutoResetEvent(false);
    lock (semaphores)
    {
        semaphores.Add(semaphore);
    }
    ThreadPool.QueueUserWorkItem(
        delegate
        {
            Convert(file);
            lock (semaphores)
            {
                semaphores.Remove(semaphore);
            }
            semaphore.Set();
        }, null);
}
Personally, I don't think I'd do it this way...but, working with the code you have, this should work.
Are you using a real semaphore (System.Threading)? When using semaphores, you typically allocate your max resources up front and the semaphore blocks for you automatically (as you acquire and release). You can go with the WaitAny approach, but I get the feeling that you've chosen the more difficult route.
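A sketch of what that might look like with .NET 4's SemaphoreSlim and CountdownEvent - Convert and filesToConvert are the ones from the question, everything else here is an assumption:
private void ConvertFiles()
{
    // Allow at most 10 conversions in flight at once.
    using (var semaphore = new SemaphoreSlim(10))
    using (var done = new CountdownEvent(1))
    {
        foreach (string fileName in filesToConvert)
        {
            string file = fileName; // capture per iteration
            semaphore.Wait();       // blocks while 10 conversions are running
            done.AddCount();
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { Convert(file); }
                finally
                {
                    semaphore.Release();
                    done.Signal();
                }
            });
        }
        done.Signal(); // remove the initial count
        done.Wait();   // wait for all queued conversions to finish
    }
}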
It looks like you need to remove the handle that triggered the WaitAny function in order to proceed:
if (semaphores.Count >= 10)
{
    int index = WaitHandle.WaitAny(semaphores.ToArray());
    semaphores.RemoveAt(index);
}
So basically I would remove the
semaphores.Remove(semaphore);
call from the thread, use the above to remove the signaled event, and see if that works.
Maybe you shouldn't create so many events?
// input
var filesToConvert = new List<string>();
Action<string> Convert = Console.WriteLine;

// limit
const int MaxThreadsCount = 10;
var fileConverted = new AutoResetEvent(false);
long threadsCount = 0;

// start
foreach (var file in filesToConvert)
{
    if (Interlocked.Read(ref threadsCount) >= MaxThreadsCount) // reached max threads count
        fileConverted.WaitOne(); // wait for one of the started threads to finish

    Interlocked.Increment(ref threadsCount);
    ThreadPool.QueueUserWorkItem(
        delegate
        {
            Convert(file);
            Interlocked.Decrement(ref threadsCount);
            fileConverted.Set();
        });
}

// wait
while (Interlocked.Read(ref threadsCount) > 0) // paranoia?
    fileConverted.WaitOne();

C# producer/consumer

I've recently come across a producer/consumer pattern C# implementation. It's very simple and (for me at least) very elegant.
It seems to have been devised around 2006, so I was wondering if this implementation is
- safe
- still applicable
Code is below (the original code was referenced at http://bytes.com/topic/net/answers/575276-producer-consumer#post2251375)
using System;
using System.Collections;
using System.Threading;

public class Test
{
    static ProducerConsumer queue;

    static void Main()
    {
        queue = new ProducerConsumer();
        new Thread(new ThreadStart(ConsumerJob)).Start();

        Random rng = new Random(0);
        for (int i = 0; i < 10; i++)
        {
            Console.WriteLine("Producing {0}", i);
            queue.Produce(i);
            Thread.Sleep(rng.Next(1000));
        }
    }

    static void ConsumerJob()
    {
        // Make sure we get a different random seed from the
        // first thread
        Random rng = new Random(1);
        // We happen to know we've only got 10
        // items to receive
        for (int i = 0; i < 10; i++)
        {
            object o = queue.Consume();
            Console.WriteLine("\t\t\t\tConsuming {0}", o);
            Thread.Sleep(rng.Next(1000));
        }
    }
}
public class ProducerConsumer
{
    readonly object listLock = new object();
    Queue queue = new Queue();

    public void Produce(object o)
    {
        lock (listLock)
        {
            queue.Enqueue(o);
            // We always need to pulse, even if the queue wasn't
            // empty before. Otherwise, if we add several items
            // in quick succession, we may only pulse once, waking
            // a single thread up, even if there are multiple threads
            // waiting for items.
            Monitor.Pulse(listLock);
        }
    }

    public object Consume()
    {
        lock (listLock)
        {
            // If the queue is empty, wait for an item to be added.
            // Note that this is a while loop, as we may be pulsed
            // but not wake up before another thread has come in and
            // consumed the newly added object. In that case, we'll
            // have to wait for another pulse.
            while (queue.Count == 0)
            {
                // This releases listLock, only reacquiring it
                // after being woken up by a call to Pulse
                Monitor.Wait(listLock);
            }
            return queue.Dequeue();
        }
    }
}
The code is older than that - I wrote it some time before .NET 2.0 came out. The concept of a producer/consumer queue is way older than that though :)
Yes, that code is safe as far as I'm aware - but it has some deficiencies:
It's non-generic. A modern version would certainly be generic.
It has no way of stopping the queue. One simple way of stopping the queue (so that all the consumer threads retire) is to have a "stop work" token which can be put into the queue. You then add as many tokens as you have threads. Alternatively, you have a separate flag to indicate that you want to stop. (This allows the other threads to stop before finishing all the current work in the queue.)
If the jobs are very small, consuming a single job at a time may not be the most efficient thing to do.
The ideas behind the code are more important than the code itself, to be honest.
You could do something like the following code snippet. It's generic and has a method for enqueueing nulls (or whatever flag you'd like to use) to tell the worker threads to exit.
The code is taken from here: http://www.albahari.com/threading/part4.aspx#_Wait_and_Pulse
using System;
using System.Collections.Generic;
using System.Threading;

namespace ConsoleApplication1
{
    public class TaskQueue<T> : IDisposable where T : class
    {
        object locker = new object();
        Thread[] workers;
        Queue<T> taskQ = new Queue<T>();

        public TaskQueue(int workerCount)
        {
            workers = new Thread[workerCount];
            // Create and start a separate thread for each worker
            for (int i = 0; i < workerCount; i++)
                (workers[i] = new Thread(Consume)).Start();
        }

        public void Dispose()
        {
            // Enqueue one null task per worker to make each exit.
            foreach (Thread worker in workers) EnqueueTask(null);
            foreach (Thread worker in workers) worker.Join();
        }

        public void EnqueueTask(T task)
        {
            lock (locker)
            {
                taskQ.Enqueue(task);
                Monitor.PulseAll(locker);
            }
        }

        void Consume()
        {
            while (true)
            {
                T task;
                lock (locker)
                {
                    while (taskQ.Count == 0) Monitor.Wait(locker);
                    task = taskQ.Dequeue();
                }
                if (task == null) return; // This signals our exit
                Console.Write(task);
                Thread.Sleep(1000); // Simulate time-consuming task
            }
        }
    }
}
Back in the day I learned how Monitor.Wait/Pulse works (and a lot about threads in general) from the above piece of code and the article series it is from. So as Jon says, it has a lot of value to it and is indeed safe and applicable.
However, as of .NET 4, there is a producer-consumer queue implementation in the framework. I only just found it myself but up to this point it does everything I need.
These days a more modern option is available in the System.Threading.Tasks.Dataflow namespace. It's async/await friendly and much more versatile.
More info here: How to: Implement a producer-consumer dataflow pattern
It's included starting from .NET Core; for older versions of .NET you may need to install a NuGet package with the same name as the namespace.
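For the sake of illustration, a minimal BufferBlock<T> version of the pattern (a sketch under the assumption that the System.Threading.Tasks.Dataflow package is referenced):
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class DataflowSketch
{
    static async Task Main()
    {
        var buffer = new BufferBlock<int>();

        // Consumer: keeps receiving until the block completes and drains.
        var consumer = Task.Run(async () =>
        {
            while (await buffer.OutputAvailableAsync())
                Console.WriteLine(await buffer.ReceiveAsync());
        });

        // Producer
        for (int i = 0; i < 10; i++)
            await buffer.SendAsync(i);

        buffer.Complete(); // no more items
        await consumer;
    }
}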
I know the question is old, but it's the first match in Google for my request, so I decided to update the topic.
A modern and simple way to implement the producer/consumer pattern in C# is to use System.Threading.Channels. It's asynchronous and uses ValueTasks to decrease memory allocations. Here is an example:
public class ProducerConsumer<T>
{
    protected readonly Channel<T> JobChannel = Channel.CreateUnbounded<T>();

    public IAsyncEnumerable<T> GetAllAsync()
    {
        return JobChannel.Reader.ReadAllAsync();
    }

    public async ValueTask AddAsync(T job)
    {
        await JobChannel.Writer.WriteAsync(job);
    }

    public async ValueTask AddAsync(IEnumerable<T> jobs)
    {
        foreach (var job in jobs)
        {
            await JobChannel.Writer.WriteAsync(job);
        }
    }
}
Warning: If you read the comments, you'll understand my answer is wrong :)
There's a possible deadlock in your code.
Imagine the following case; for clarity I used a single-threaded approach, but it should be easy to convert to multi-threaded with Sleep:
// We create some actions...
object locker = new object();

Action action1 = () =>
{
    lock (locker)
    {
        System.Threading.Monitor.Wait(locker);
        Console.WriteLine("This is action1");
    }
};

Action action2 = () =>
{
    lock (locker)
    {
        System.Threading.Monitor.Wait(locker);
        Console.WriteLine("This is action2");
    }
};

// ... (stuff happens, etc.)

// Imagine both actions were running
// and there's 0 items in the queue

// And now the producer kicks in...
lock (locker)
{
    // This would add a job to the queue
    Console.WriteLine("Pulse now!");
    System.Threading.Monitor.Pulse(locker);
}

// ... (more stuff)
// and the actions finish now!
Console.WriteLine("Consume action!");
action1(); // Oops... they're locked...
action2();
Please do let me know if this doesn't make any sense.
If this is confirmed, then the answer to your question is, "no, it isn't safe" ;)
I hope this helps.
public class ProducerConsumerProblem
{
    private int n;
    object obj = new object();

    public ProducerConsumerProblem(int n)
    {
        this.n = n;
    }

    public void Producer()
    {
        for (int i = 0; i < n; i++)
        {
            lock (obj)
            {
                Console.Write("Producer =>");
                System.Threading.Monitor.Pulse(obj);
                System.Threading.Thread.Sleep(1);
                System.Threading.Monitor.Wait(obj);
            }
        }
    }

    public void Consumer()
    {
        lock (obj)
        {
            for (int i = 0; i < n; i++)
            {
                System.Threading.Monitor.Wait(obj, 10);
                Console.Write("<= Consumer");
                System.Threading.Monitor.Pulse(obj);
                Console.WriteLine();
            }
        }
    }
}

public class Program
{
    static void Main(string[] args)
    {
        ProducerConsumerProblem f = new ProducerConsumerProblem(10);
        System.Threading.Thread t1 = new System.Threading.Thread(() => f.Producer());
        System.Threading.Thread t2 = new System.Threading.Thread(() => f.Consumer());
        t1.IsBackground = true;
        t2.IsBackground = true;
        t1.Start();
        t2.Start();
        Console.ReadLine();
    }
}
output
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer
Producer =><= Consumer

How to effectively log asynchronously?

I am using Enterprise Library 4 on one of my projects for logging (and other purposes). I've noticed that there is some cost to the logging that I am doing, and that I can mitigate it by doing the logging on a separate thread.
The way I am doing this now is that I create a LogEntry object and then I call BeginInvoke on a delegate that calls Logger.Write.
new Action<LogEntry>(Logger.Write).BeginInvoke(le, null, null);
What I'd really like to do is add the log message to a queue and then have a single thread pulling LogEntry instances off the queue and performing the log operation. The benefit of this would be that logging does not interfere with the executing operation, and not every logging operation results in a job getting thrown on the thread pool.
How can I create a shared queue that supports many writers and one reader in a thread safe way? Some examples of a queue implementation that is designed to support many writers (without causing synchronization/blocking) and a single reader would be really appreciated.
Recommendation regarding alternative approaches would also be appreciated, I am not interested in changing logging frameworks though.
I wrote this code a while back, feel free to use it.
using System;
using System.Collections.Generic;
using System.Threading;

namespace MediaBrowser.Library.Logging {
    public abstract class ThreadedLogger : LoggerBase {
        Queue<Action> queue = new Queue<Action>();
        AutoResetEvent hasNewItems = new AutoResetEvent(false);
        volatile bool waiting = false;

        public ThreadedLogger() : base() {
            Thread loggingThread = new Thread(new ThreadStart(ProcessQueue));
            loggingThread.IsBackground = true;
            loggingThread.Start();
        }

        void ProcessQueue() {
            while (true) {
                waiting = true;
                hasNewItems.WaitOne(10000, true);
                waiting = false;

                Queue<Action> queueCopy;
                lock (queue) {
                    queueCopy = new Queue<Action>(queue);
                    queue.Clear();
                }

                foreach (var log in queueCopy) {
                    log();
                }
            }
        }

        public override void LogMessage(LogRow row) {
            lock (queue) {
                queue.Enqueue(() => AsyncLogMessage(row));
            }
            hasNewItems.Set();
        }

        protected abstract void AsyncLogMessage(LogRow row);

        public override void Flush() {
            while (!waiting) {
                Thread.Sleep(1);
            }
        }
    }
}
Some advantages:
It keeps the background logger alive, so it does not need to spin up and spin down threads.
It uses a single thread to service the queue, which means there will never be a situation where 100 threads are servicing the queue.
It copies the queues to ensure the queue is not blocked while the log operation is performed
It uses an AutoResetEvent to ensure the bg thread is in a wait state
It is, IMHO, very easy to follow
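To show how it's meant to be consumed, here is a hypothetical subclass - the file name and LogRow's ToString behavior are my assumptions, not part of the original:
// Hypothetical subclass: a concrete logger only has to supply what
// "write one row" means; queuing and threading come from the base class.
public class FileThreadedLogger : ThreadedLogger
{
    private readonly StreamWriter writer = new StreamWriter("app.log");

    protected override void AsyncLogMessage(LogRow row)
    {
        // Runs only on the single background thread, so no lock is needed here.
        writer.WriteLine(row);
        writer.Flush();
    }
}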
Here is a slightly improved version; keep in mind I performed very little testing on it, but it does address a few minor issues.
public abstract class ThreadedLogger : IDisposable {
    Queue<Action> queue = new Queue<Action>();
    ManualResetEvent hasNewItems = new ManualResetEvent(false);
    ManualResetEvent terminate = new ManualResetEvent(false);
    ManualResetEvent waiting = new ManualResetEvent(false);
    Thread loggingThread;

    public ThreadedLogger() {
        loggingThread = new Thread(new ThreadStart(ProcessQueue));
        loggingThread.IsBackground = true;
        // this is performed from a bg thread, to ensure the queue is serviced from a single thread
        loggingThread.Start();
    }

    void ProcessQueue() {
        while (true) {
            waiting.Set();
            int i = WaitHandle.WaitAny(new WaitHandle[] { hasNewItems, terminate });
            // terminate was signaled
            if (i == 1) return;
            hasNewItems.Reset();
            waiting.Reset();

            Queue<Action> queueCopy;
            lock (queue) {
                queueCopy = new Queue<Action>(queue);
                queue.Clear();
            }

            foreach (var log in queueCopy) {
                log();
            }
        }
    }

    public void LogMessage(LogRow row) {
        lock (queue) {
            queue.Enqueue(() => AsyncLogMessage(row));
        }
        hasNewItems.Set();
    }

    protected abstract void AsyncLogMessage(LogRow row);

    public void Flush() {
        waiting.WaitOne();
    }

    public void Dispose() {
        terminate.Set();
        loggingThread.Join();
    }
}
Advantages over the original:
It's disposable, so you can get rid of the async logger
The flush semantics are improved
It will respond slightly better to a burst followed by silence
Yes, you need a producer/consumer queue. I have one example of this in my threading tutorial - if you look at my "deadlocks / monitor methods" page, you'll find the code in the second half.
There are plenty of other examples online, of course - and .NET 4.0 will ship with one in the framework too (rather more fully featured than mine!). In .NET 4.0 you'd probably wrap a ConcurrentQueue<T> in a BlockingCollection<T>.
The version on that page is non-generic (it was written a long time ago), but you'd probably want to make it generic - it would be trivial to do.
You would call Produce from each "normal" thread, and Consume from one thread, just looping round and logging whatever it consumes. It's probably easiest just to make the consumer thread a background thread, so you don't need to worry about "stopping" the queue when your app exits. That does mean there's a remote possibility of missing the final log entry, though (if it's halfway through writing it when the app exits) - or even more if you're producing faster than it can consume/log.
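In .NET 4 terms, the whole thing collapses to something like this sketch - LogEntry and Logger.Write stand in for whatever the logging framework provides:
using System.Collections.Concurrent;
using System.Threading;

static class AsyncLog
{
    static readonly BlockingCollection<LogEntry> logQueue =
        new BlockingCollection<LogEntry>(new ConcurrentQueue<LogEntry>());

    // Call once at startup; a background thread won't keep the process alive.
    public static void StartConsumer()
    {
        var thread = new Thread(() =>
        {
            // Blocks until items arrive; the loop ends if CompleteAdding is ever called.
            foreach (var entry in logQueue.GetConsumingEnumerable())
                Logger.Write(entry); // the actual (slow) log call
        });
        thread.IsBackground = true;
        thread.Start();
    }

    // Called from any number of producer threads.
    public static void Log(LogEntry entry)
    {
        logQueue.Add(entry);
    }
}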
Here is what I came up with... also see Sam Saffron's answer. This answer is community wiki in case there are any problems that people see in the code and want to update.
/// <summary>
/// A singleton queue that manages writing log entries to the different logging sources (Enterprise Library Logging) off the executing thread.
/// This queue ensures that log entries are written in the order that they were executed, and that logging only utilizes one thread (a BackgroundWorker) at any given time.
/// </summary>
public class AsyncLoggerQueue
{
    // create singleton instance of logger queue
    public static AsyncLoggerQueue Current = new AsyncLoggerQueue();

    private static readonly object logEntryQueueLock = new object();
    private Queue<LogEntry> _LogEntryQueue = new Queue<LogEntry>();
    private BackgroundWorker _Logger = new BackgroundWorker();

    private AsyncLoggerQueue()
    {
        // configure background worker
        _Logger.WorkerSupportsCancellation = false;
        _Logger.DoWork += new DoWorkEventHandler(_Logger_DoWork);
    }

    public void Enqueue(LogEntry le)
    {
        // lock during write
        lock (logEntryQueueLock)
        {
            _LogEntryQueue.Enqueue(le);

            // while locked, check to see if the BW is running; if not, start it
            if (!_Logger.IsBusy)
                _Logger.RunWorkerAsync();
        }
    }

    private void _Logger_DoWork(object sender, DoWorkEventArgs e)
    {
        while (true)
        {
            LogEntry le = null;
            bool skipEmptyCheck = false;
            lock (logEntryQueueLock)
            {
                if (_LogEntryQueue.Count <= 0) // if the queue is empty, then the BW is done
                    return;
                else if (_LogEntryQueue.Count > 1) // if greater than 1, we can skip checking whether anything was enqueued during the logging operation
                    skipEmptyCheck = true;

                // dequeue the LogEntry that will be written to the log
                le = _LogEntryQueue.Dequeue();
            }

            // pass the LogEntry to Enterprise Library
            Logger.Write(le);

            if (!skipEmptyCheck) // if _LogEntryQueue.Count was > 1 before we wrote the last LogEntry, we know to continue without double-checking
            {
                lock (logEntryQueueLock)
                {
                    if (_LogEntryQueue.Count <= 0) // if the queue is still empty, then the BW is done
                        return;
                }
            }
        }
    }
}
I suggest starting by measuring the actual performance impact of logging on the overall system (i.e. by running a profiler), and optionally switching to something faster like log4net (I personally migrated to it from EntLib logging a long time ago).
If that does not work, you can try using this simple method from the .NET Framework:
ThreadPool.QueueUserWorkItem
Queues a method for execution. The method executes when a thread pool thread becomes available.
MSDN Details
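Applied to the call from the question, that would be roughly:
// Hand the write off to a thread pool thread instead of a delegate's BeginInvoke.
ThreadPool.QueueUserWorkItem(_ => Logger.Write(le));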
If that does not work either, then you can resort to something like Jon Skeet has offered and actually code the async logging framework yourself.
In response to Sam Saffron's post: I wanted to call Flush and make sure everything was really finished writing. In my case, I am writing to a database in the queue thread, and all my log events were getting queued up, but sometimes the application stopped before everything finished writing, which is not acceptable in my situation. I changed several chunks of your code, but the main thing I wanted to share was the flush:
public static void FlushLogs()
{
    bool queueHasValues = true;
    while (queueHasValues)
    {
        // wait for the current iteration to complete
        m_waitingThreadEvent.WaitOne();

        lock (m_loggerQueueSync)
        {
            queueHasValues = m_loggerQueue.Count > 0;
        }
    }

    // force MEL to flush all its listeners
    foreach (MEL.LogSource logSource in MEL.Logger.Writer.TraceSources.Values)
    {
        foreach (TraceListener listener in logSource.Listeners)
        {
            listener.Flush();
        }
    }
}
I hope that saves someone some frustration. It is especially apparent in parallel processes logging lots of data.
Thanks for sharing your solution, it set me in a good direction!
--Johnny S
I wanted to say that my previous post was kind of useless. You can simply set AutoFlush to true and you will not have to loop through all the listeners. However, I still had a crazy problem with parallel threads trying to flush the logger. I had to create another boolean that was set to true during the copying of the queue and the execution of the LogEntry writes, and then in the flush routine I had to check that boolean to make sure something was not already in the queue and that nothing was getting processed before returning.
Now multiple threads in parallel can hit this thing, and when I call Flush I know it is really flushed.
public static void FlushLogs()
{
    int queueCount;
    bool isProcessingLogs;
    while (true)
    {
        // wait for the current iteration to complete
        m_waitingThreadEvent.WaitOne();

        // check to see if we are currently processing logs
        lock (m_isProcessingLogsSync)
        {
            isProcessingLogs = m_isProcessingLogs;
        }

        // check to see if more events were added while the logger was processing the last batch
        lock (m_loggerQueueSync)
        {
            queueCount = m_loggerQueue.Count;
        }

        if (queueCount == 0 && !isProcessingLogs)
            break;

        // since something is still in the queue or in flight, back off a bit so we don't spin hot
        Thread.Sleep(400);
    }
}
Just an update:
Using Enterprise Library 5.0 with .NET 4.0, it can easily be done by:
static public void LogMessageAsync(LogEntry logEntry)
{
    Task.Factory.StartNew(() => LogMessage(logEntry));
}
See:
http://randypaulo.wordpress.com/2011/07/28/c-enterprise-library-asynchronous-logging/
An extra level of indirection may help here.
Your first async method call can put messages onto a synchronized Queue and set an event - so the locks happen in the thread pool, not on your worker threads - and then have yet another thread pulling messages off the queue when the event is raised.
If you log something on a separate thread, the message may not be written if the application crashes, which makes it rather useless.
That is why you should always flush after every written entry.
If what you have in mind is a SHARED queue, then I think you are going to have to synchronize the writes to it - the pushes and the pops.
But I still think it's worth aiming at the shared-queue design. In comparison to the I/O of logging, and probably in comparison to the other work your app is doing, the brief amount of blocking for pushes and pops probably won't be significant.
