Handling WF 4.0 long running activity using TPL - c#

I created an activity which executes a web request and stores the result in the database. This process usually takes about an hour, and it makes the workflow engine behave abnormally. I found out that long-running activities like this need to be written differently so that the workflow engine thread is not blocked.
From several blog posts about writing long-running activities I understand that I should use the Bookmark concept, but I didn't find any solution using the TPL and Task.
Is this code correct for handling a long running activity using Tasks?
public sealed class WebSaveActivity : NativeActivity
{
protected override void Execute(NativeActivityContext context)
{
context.CreateBookmark("websave", (activityContext, bookmark, value) =>
{
});
Task.Factory.StartNew(() =>
{
GetAndSave(); // This takes 1 hour to accomplish.
context.RemoveBookmark("websave");
});
}
protected override bool CanInduceIdle
{
get
{
return true;
}
}
}

No, that is not the way bookmarks should be used. A bookmark is used when the workflow has to wait for input from an external process.
For example: I have a document approval workflow and at some point in time the workflow has to wait for a human reviewer to give an OK on the document. Instead of keeping the workflow instance in memory the workflow will be idled and activated again by the runtime when ResumeBookmark is called.
In your situation the workflow cannot be idled because it has an operation running in its context. That operation is your task, which, by the way, is a fire-and-forget task, so critical failures cannot be handled by WF.
As a possible solution, consider having another process call the GetAndSave method and ultimately call ResumeBookmark on the workflow, so that the workflow can be idled by the runtime in the meantime. That process could even be the same process that hosts your workflow.
For an example see this blogpost. Just imagine that instead of waiting for a human to enter something in the console, your long-running task is performed.
You did not specify what comes after your activity, but note that it is possible to return data to the workflow when the bookmark is resumed. So any result of GetAndSave, even if it is just an error code, can be used to decide how to proceed with the other activities in your workflow.
Hope this makes sense to you and you see what I try to outline as a possible solution.
EDIT
A quick note about using Tasks or async/await in WF. AFAIK there are no methods to override that return Tasks, so you either have to block on them with .Wait() or .Result, or forget about them. If you cannot await them, bad things will happen during workflow execution, because other activities might be started before the one using Tasks has completed its work.
When the WF runtime was developed the whole concept of Tasks was still very young so the WF runtime did not / does not cater for them.
EDIT 2:
example implementation (Based on this excellent official documentation)
Your activity will be almost empty:
public sealed class TriggerDownload : NativeActivity<string>
{
[RequiredArgument]
public InArgument<string> BookmarkName { get; set; }
protected override void Execute(NativeActivityContext context)
{
// Create a Bookmark and wait for it to be resumed.
context.CreateBookmark(BookmarkName.Get(context),
new BookmarkCallback(OnResumeBookmark));
}
protected override bool CanInduceIdle
{
get { return true; }
}
public void OnResumeBookmark(NativeActivityContext context, Bookmark bookmark, object obj)
{
// When the Bookmark is resumed, assign its value to
// the Result argument. (This depends on whether you have a result on your GetData method like a string with a result code or something)
Result.Set(context, (string)obj);
}
}
It signals the workflow runtime that the workflow can be idled and how it can be resumed.
Now, for the workflow runtime configuration:
WorkflowApplication wfApp = new WorkflowApplication(<Your WF>);
// Workflow lifecycle events omitted except idle.
AutoResetEvent idleEvent = new AutoResetEvent(false);
wfApp.Idle = delegate(WorkflowApplicationIdleEventArgs e)
{
idleEvent.Set();
};
// Run the workflow.
wfApp.Run();
// Wait for the workflow to go idle before starting the download
idleEvent.WaitOne();
// Start the download and resume the bookmark when finished.
// The bookmark name ("GetData" here) must match the BookmarkName argument of the activity.
var downloadResult = await Task.Run(() => GetAndSave());
BookmarkResumptionResult resumptionResult = wfApp.ResumeBookmark(new Bookmark("GetData"), downloadResult);
// Possible BookmarkResumptionResult values:
// Success, NotFound, or NotReady
Console.WriteLine("BookmarkResumptionResult: {0}", resumptionResult);

I just saw your related question here: How to write a long running activity to call web services in WF 4.0
Another way is to implement your activity as an AsyncCodeActivity:
namespace MyLibrary.Activities
{
using System;
using System.Activities;
public sealed class MyActivity : AsyncCodeActivity
{
protected override IAsyncResult BeginExecute(AsyncCodeActivityContext context, AsyncCallback callback, object state)
{
var delegateToLongOperation = new Func<bool>(this.LongRunningSave);
context.UserState = delegateToLongOperation;
return delegateToLongOperation.BeginInvoke(callback, state);
}
protected override void EndExecute(AsyncCodeActivityContext context, IAsyncResult result)
{
var longOperationDelegate = (Func<bool>) context.UserState;
var longOperationResult = longOperationDelegate.EndInvoke(result);
// Can continue your activity logic here.
}
private bool LongRunningSave()
{
// Logic to perform the save.
return true;
}
}
}
The workflow instance stays in memory, but at the very least the workflow runtime can handle its normal scheduling tasks without one of its threads being taken up by a long running process.
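If the long-running work is already expressed as a Task, the Begin/End pair above can be bridged to it, because Task implements IAsyncResult. What follows is only a minimal sketch of such a bridge, independent of WF: TaskApmBridge is a hypothetical helper name, and the assumption is that an activity would call Begin from BeginExecute and End from EndExecute.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not part of WF): exposes Task-based work through a Begin/End pair.
public static class TaskApmBridge
{
    // Begin half: runs the work on the thread pool and returns a Task as the IAsyncResult.
    public static IAsyncResult Begin<T>(Func<T> work, AsyncCallback callback, object state)
    {
        // Passing `state` here makes it available through IAsyncResult.AsyncState.
        var tcs = new TaskCompletionSource<T>(state);
        Task.Run(work).ContinueWith(t =>
        {
            if (t.IsFaulted) tcs.TrySetException(t.Exception.InnerExceptions);
            else if (t.IsCanceled) tcs.TrySetCanceled();
            else tcs.TrySetResult(t.Result);
            // Notify the caller only after the result (or failure) has been recorded.
            callback?.Invoke(tcs.Task);
        }, TaskScheduler.Default);
        return tcs.Task;
    }

    // End half: returns the result, or rethrows the original exception from the work.
    public static T End<T>(IAsyncResult asyncResult)
    {
        return ((Task<T>)asyncResult).GetAwaiter().GetResult();
    }
}
```

Unlike the fire-and-forget task in the question, a failure inside the work surfaces when End is called, so the caller can react to it.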

Related

Avoiding allocations and maintaining concurrency when wrapping a callback-based API with an async API on a hot path

I have read a number of articles and questions here on StackOverflow about wrapping a callback based API with a Task based one using a TaskCompletionSource, and I'm trying to use that sort of technique when communicating with a Solace PubSub+ message broker.
My initial observation was that this technique seems to shift responsibility for concurrency. For example, the Solace broker library has a Send() method which can possibly block, and then we get a callback after the network communication is complete to indicate "real" success or failure. So this Send() method can be called very quickly, and the vendor library limits concurrency internally.
When you put a Task around that it seems you either serialize the operations ( foreach message await SendWrapperAsync(message) ), or take over responsibility for concurrency yourself by deciding how many tasks to start (eg, using TPL dataflow).
In any case, I decided to wrap the Send call with a guarantor that will retry forever until the callback indicates success, as well as take responsibility for concurrency. This is a "guaranteed" messaging system. Failure is not an option. This requires that the guarantor can apply backpressure, but that's not really in the scope of this question. I have a couple of comments about it in my example code below.
What it does mean is that my hot path, which wraps the send + callback, is "extra hot" because of the retry logic. And so there's a lot of TaskCompletionSource creation here.
The vendor's own documentation makes recommendations about reusing their Message objects where possible rather than recreating them for every Send. I have decided to use a Channel as a ring buffer for this. But that made me wonder - is there some alternative to the TaskCompletionSource approach - maybe some other object that can also be cached in the ring buffer and reused, achieving the same outcome?
I realise this is probably an overzealous attempt at micro-optimisation, and to be honest I am exploring several aspects of C# which are above my pay grade (I'm a SQL guy, really), so I could be missing something obvious. If the answer is "you don't actually need this optimisation", that's not going to put my mind at ease. If the answer is "that's really the only sensible way", my curiosity would be satisfied.
Here is a fully functioning console application which simulates the behaviour of the Solace library in the MockBroker object, and my attempt to wrap it. My hot path is the SendOneAsync method in the Guarantor class. The code is probably a bit too long for SO, but it is as minimal a demo I could create that captures all of the important elements.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
internal class Message { public bool sent; public int payload; public object correlator; }
// simulate third party library behaviour
internal class MockBroker
{
public bool TrySend(Message m, Action<Message> callback)
{
if (r.NextDouble() < 0.5) return false; // simulate chance of immediate failure / "would block" response
Task.Run(() => { Thread.Sleep(100); m.sent = r.NextDouble() < 0.5; callback(m); }); // simulate network call
return true;
}
private Random r = new();
}
// Turns MockBroker into a "guaranteed" sender with an async concurrency limit
internal class Guarantor
{
public Guarantor(int maxConcurrency)
{
_broker = new MockBroker();
// avoid message allocations in SendOneAsync
_ringBuffer = Channel.CreateBounded<Message>(maxConcurrency);
for (int i = 0; i < maxConcurrency; i++) _ringBuffer.Writer.TryWrite(new Message());
}
// real code pushing into a T.T.T.DataFlow block with bounded capacity and parallelism
// execution options both equal to maxConcurrency here, providing concurrency and backpressure
public async Task Post(int payload) => await SendOneAsync(payload);
private async Task SendOneAsync(int payload)
{
Message msg = await _ringBuffer.Reader.ReadAsync();
msg.payload = payload;
// send must eventually succeed
while (true)
{
// *** can this allocation be avoided? ***
var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
msg.correlator = tcs;
// class method in real code, inlined here to make the logic more apparent
Action<Message> callback = (msg) => (msg.correlator as TaskCompletionSource<bool>).SetResult(msg.sent);
if (_broker.TrySend(msg, callback) && await tcs.Task) break;
else
{
// simple demo retry logic
Console.WriteLine($"retrying {msg.payload}");
await Task.Delay(500);
}
}
// real code raising an event here to indicate successful delivery
await _ringBuffer.Writer.WriteAsync(msg);
Console.WriteLine(payload);
}
private Channel<Message> _ringBuffer;
private MockBroker _broker;
}
internal class Program
{
private static async Task Main(string[] args)
{
// at most 10 concurrent sends
Guarantor g = new(10);
// hacky simulation since in this demo there's nothing generating continuous events,
// no DataFlowBlock providing concurrency (it will be limited by the Channel instead),
// and nobody to notify when messages are successfully sent
List<Task> sends = new(100);
for (int i = 0; i < 100; i++) sends.Add(g.Post(i));
await Task.WhenAll(sends);
}
}
Yes, you can avoid the allocation of TaskCompletionSource instances, by using lightweight ValueTasks instead of Tasks. At first you need a reusable object that can implement the IValueTaskSource<T> interface, and the Message seems like the perfect candidate. For implementing this interface you can use the ManualResetValueTaskSourceCore<T> struct. This is a mutable struct, so it should not be declared as readonly. You just need to delegate the interface methods to the corresponding methods of this struct with the very long name:
using System.Threading.Tasks.Sources;
internal class Message : IValueTaskSource<bool>
{
public bool sent; public int payload; public object correlator;
private ManualResetValueTaskSourceCore<bool> _source; // Mutable struct, not readonly
public void Reset() => _source.Reset();
public short Version => _source.Version;
public void SetResult(bool result) => _source.SetResult(result);
ValueTaskSourceStatus IValueTaskSource<bool>.GetStatus(short token)
=> _source.GetStatus(token);
void IValueTaskSource<bool>.OnCompleted(Action<object> continuation,
object state, short token, ValueTaskSourceOnCompletedFlags flags)
=> _source.OnCompleted(continuation, state, token, flags);
bool IValueTaskSource<bool>.GetResult(short token) => _source.GetResult(token);
}
The three members GetStatus, OnCompleted and GetResult are required for implementing the interface. The other three members (Reset, Version and SetResult) will be used for creating and controlling the ValueTask<bool>s.
Now let's wrap the TrySend method of the MockBroker class in an asynchronous method TrySendAsync that returns a ValueTask<bool>:
static class MockBrokerExtensions
{
public static ValueTask<bool> TrySendAsync(this MockBroker source, Message message)
{
message.Reset();
bool result = source.TrySend(message, m => m.SetResult(m.sent));
if (!result) message.SetResult(false);
return new ValueTask<bool>(message, message.Version);
}
}
The message.Reset(); resets the IValueTaskSource<bool>, and declares that the previous asynchronous operation has completed. An IValueTaskSource<T> supports only one asynchronous operation at a time, the produced ValueTask<T> can be awaited only once, and it can no longer be awaited after the next Reset(). That's the price you have to pay for avoiding the allocation of an object: you must follow stricter rules. If you try to bend the rules (intentionally or unintentionally), the ManualResetValueTaskSourceCore<T> will start throwing InvalidOperationExceptions all over the place.
Now let's use the TrySendAsync extension method:
while (true)
{
if (await _broker.TrySendAsync(msg)) break;
// simple demo retry logic
Console.WriteLine($"retrying {msg.payload}");
await Task.Delay(500);
}
You can print GC.GetTotalAllocatedBytes(true) to the Console before and after the whole operation to see the difference. Make sure to run the application in Release mode to see the real picture. You might find that the difference is not that impressive, because the size of a TaskCompletionSource instance is pretty small compared to the bytes allocated by Task.Delay and by all the strings generated for writing to the Console.
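The one-operation-at-a-time rule can also be seen in isolation, without the broker. This is a minimal sketch under stated assumptions: ReusableSignal and ReusableSignalDemo are hypothetical names invented for illustration, and the "callback" is simulated with Task.Run. A single source object serves three consecutive awaits, each preceded by a Reset.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical reusable awaitable: one instance backs many sequential ValueTask<int>s.
public sealed class ReusableSignal : IValueTaskSource<int>
{
    private ManualResetValueTaskSourceCore<int> _core; // mutable struct, so not readonly

    // Starts a new operation. The previous ValueTask must already have been awaited.
    public ValueTask<int> NextAsync()
    {
        _core.Reset();
        return new ValueTask<int>(this, _core.Version);
    }

    // Completes the current operation (the role played by the broker callback).
    public void Complete(int value) => _core.SetResult(value);

    int IValueTaskSource<int>.GetResult(short token) => _core.GetResult(token);
    ValueTaskSourceStatus IValueTaskSource<int>.GetStatus(short token) => _core.GetStatus(token);
    void IValueTaskSource<int>.OnCompleted(Action<object> continuation, object state,
        short token, ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}

public static class ReusableSignalDemo
{
    // Awaits the same source three times in sequence and returns the sum of the values.
    public static async Task<int> RunAsync()
    {
        var signal = new ReusableSignal();
        int sum = 0;
        for (int i = 1; i <= 3; i++)
        {
            ValueTask<int> pending = signal.NextAsync();
            int value = i;
            _ = Task.Run(() => signal.Complete(value)); // simulated callback
            sum += await pending; // each ValueTask is awaited exactly once
        }
        return sum;
    }
}
```

Holding on to `pending` and awaiting it again after the next NextAsync() call would violate the rules above, because its token no longer matches the source's current Version.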

How do I prevent my Rx test from hanging?

I am reproducing my Rx issue with a simplified test case below. The test below hangs. I am sure it is a small, but fundamental, thing that I am missing, but can't put my finger on it.
public class Service
{
private ISubject<double> _subject = new Subject<double>();
public void Reset()
{
_subject.OnNext(0.0);
}
public IObservable<double> GetProgress()
{
return _subject;
}
}
public class ObTest
{
[Fact]
private async Task SimpleTest()
{
var service = new Service();
var result = service.GetProgress().Take(1);
var task = Task.Run(async () =>
{
service.Reset();
});
await result;
}
}
UPDATE
My attempt above was to simplify the problem a little and understand it. In my case GetProgress() is a merge of various Observables that publish the download progress. One of these Observables is a Subject<double> that publishes 0 every time somebody calls a method to delete the download.
The race condition identified by Enigmativity and Theodor Zoulias may(??) happen in real life. I display a view which attempts to get the progress, however, quick fingers delete it just in time.
What I need to understand a bit more is if the download is started again (subscription has taken place by now, by virtue of displaying a view, which has already made the subscription) and somebody again deletes it.
public class Service
{
private ISubject<double> _deleteSubject = new Subject<double>();
public void Reset()
{
_deleteSubject.OnNext(0.0);
}
public IObservable<double> GetProgress()
{
return _deleteSubject.Merge(downloadProgress);
}
}
Your code isn't hanging. It's awaiting an observable that sometimes never gets a value.
You have a race condition.
The Task.Run is sometimes executing to completion before the await result creates the subscription to the observable - so it never sees the value.
Try this code instead:
private async Task SimpleTest()
{
var service = new Service();
var result = service.GetProgress().Take(1);
var awaiter = result.GetAwaiter();
var task = Task.Run(() =>
{
service.Reset();
});
await awaiter;
}
The line await result creates a subscription to the observable. The problem is that the notification _subject.OnNext(0.0) may occur before this subscription, in which case the value will pass unobserved, and the await result will continue waiting for a notification forever. In this particular example the notification is always missed, at least on my PC, because the subscription is delayed for around 30 msec (measured with a Stopwatch), which is longer than the time needed for the task that resets the service to complete, probably because the JIT compiler must load and compile some Rx-related assembly. The situation changes when I do a warm-up by calling new Subject<int>().FirstAsync().Subscribe() before running the example. In that case the notification is observed almost always, and the hanging is avoided.
I can think of two robust solutions to this problem.
The solution suggested by Enigmativity, to create an awaitable subscription before starting the task that resets the service. This can be done with either GetAwaiter or ToTask.
To use a ReplaySubject<T> instead of a plain vanilla Subject<T>.
From the documentation of ReplaySubject<T>: "Represents an object that is both an observable sequence as well as an observer. Each notification is broadcasted to all subscribed and future observers, subject to buffer trimming policies."
The ReplaySubject will cache the value so that it can be observed by the future subscription, eliminating the race condition. You could initialize it with a bufferSize of 1 to minimize the memory footprint of the buffer.

how to propagate some data to main process from TPL tasks while tasks are running

I have a situation where I create a list of long running tasks which monitors some system/network resources and then sends email, logs into a txt file, and calls a web service when some conditions are met. Then begins monitoring again. These tasks are created in a windows service and hence will be running all the time.
I want them to raise events or something similar to notify the parent class (which created them), so that the parent performs the three operations I mentioned above instead of each object in the tasks doing it by itself.
And how can I make sure that only a single task uses the parent class's method at any one time? As email and a web service call are involved, two concurrent requests may break the code.
UPDATE
These Watchers are of three types, each implements the following interface.
public interface IWatcher
{
void BeginWatch();
}
The classes that implement it are:
//this watcher is responsible for watching over a sql query result
public class DBWatcher : IWatcher
{
....
void BeginWatch()
{
//Here a timer is created which continuously checks the SQL query result.
//And would Call SERVICE, send an EMAIL and LOG into a file
Timer watchIterator = new Timer(this._intervalMinutes * 60000);
watchIterator.Elapsed += new ElapsedEventHandler(_watchIterator_Elapsed);
watchIterator.Start();
}
void _watchIterator_Elapsed(object sender, ElapsedEventArgs e)
{
//1. Check Query result
//3. Call SERVICE, send an EMAIL and LOG into a file if result is not as was expected
//I have done the work to this part!
//And I can do the functions as follows .. it should be simple.
//*********************
//SendEmail();
//LogIntoFile();
//CallService();
//But I want the above three methods to be present in one place so I don't have to replicate the same functionality in different watchers.
//One approach could be to create a separate class and wrap the above mentioned functions in it, create an instance of that class here and call them.
//Second option, which I am interested in but don't know how to do, is to have this functionality in the parent class which actually creates the tasks and have each watcher use it from HERE ...
}
....
}
//this watcher is responsible for watching over Folder
public class FolderWatcher : IWatcher
{
....
void BeginWatch()
{
///Same as above
}
....
}
First I create a List from an XML file. It can contain multiple instances of DBWatcher, each of which will continuously watch a different query result, and of FolderWatcher, each of which will continuously watch a different folder.
After the List is created, I call the following function to create a separate Task. I call this function MANY times to create a different set of watchers.
private void _createWatcherThread(IWatcher wat, CancellationTokenSource cancellationToken)
{
//This represents a watcher that will watch some specific area for any activities
IWatcher watcher = wat.Copy();
bool hasWatchBegin = false;
try
{
//run forever
for (;;)
{
//dispose the watcher and stop this thread if CANCEL token has been issued
if (cancellationToken.IsCancellationRequested)
{
((IDisposable)watcher).Dispose();
break;
}
else if (!hasWatchBegin)
{
//This method of a watcher class creates a timer. which will
//continously check the status after a few minutes... So its the
//timer's elapsed method in Watcher object which will send the mail
//& call service etc to update the admin of current status of the watcher.
//this will be called only once in a watcher!
watcher.BeginWatch();
hasWatchBegin = true;
}
}
}
catch (Exception ex)
{
//Watcher has thrown an exception.
//Again, do the following operations
//*********************
//SendEmail();
//LogIntoFile();
//CallService();
}
}
Provided you make your email, logging & webservice calls threadsafe you can pass references to the code which sends to each of these sinks as a closure (Here's Jon Skeet's excellent explanation of c# closures) into the monitoring tasks. Here's an example where you need to launch multiple tasks:
...
void Email(string message){}
void Log(string message){}
void CallWebService(string message){}
void RunMonitoringTask()
{
var task = Task.Factory.StartNew(() =>
{
string message = Monitor();
if (ShouldNotify(message))
{
Email(message);
Log(message);
CallWebService(message);
}
});
}
...
EDIT
vs. an infinite monitor loop triggering tasks when necessary:
...
void Email(string message){}
void Log(string message){}
void CallWebService(string message){}
void MonitorLoop()
{
while (true)
{
// Monitor() returns the next status message, as in the example above
string message = Monitor();
if (ShouldNotify(message))
{
var task = Task.Factory.StartNew(() =>
{
Email(message);
Log(message);
CallWebService(message);
});
}
}
}
...
As for how to implement these classes, I'd recommend an approach where each of these sinks accepts the message and then offloads it to its own processing thread/task, to avoid blocking your monitoring tasks and holding up the other notifications.
The Progress class is just perfect for this task. It is a means of allowing a long running process to notify someone (usually the caller) of the current progress of that operation.
Progress<string> progress = new Progress<string>();
progress.ProgressChanged += (s, data) => Console.WriteLine(data);
for (int i = 0; i < 2; i++)
Task.Run(() => DoWork(progress));
public static void DoWork(IProgress<string> progress)
{
int i = 0;
while (true)
{
Thread.Sleep(500);//placeholder for real work
progress.Report(i++.ToString());
}
}
If you have different types of information to report at different times then just pass in multiple IProgress instances to the worker method. (Or, if you are reporting the progress of several types of data at the same time wrap all of the data in a composite object.)
Also note that this is capable of handling the synchronization that you have said that you need. A Progress instance captures the value of SynchronizationContext.Current at the time it is created, and marshals all of the handlers for the ProgressChanged event into that sync context. So if your application already has a context (e.g. a UI context from a desktop application) then you get that for free. If you don't have one (e.g. it's a console application) then you'll need to either manually synchronize the event handler with, say, a lock, or create your own SynchronizationContext to set as the current context.
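For a process without a synchronization context, one way to get the "only one notification at a time" guarantee is a small IProgress<T> implementation that invokes its handler under a lock. This is only a sketch: SerializedProgress<T> and SerializedProgressDemo are hypothetical names, not framework types. Unlike Progress<T>, the handler here runs synchronously on the reporting thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: an IProgress<T> whose handler never runs concurrently with itself.
public sealed class SerializedProgress<T> : IProgress<T>
{
    private readonly object _gate = new object();
    private readonly Action<T> _handler;

    public SerializedProgress(Action<T> handler)
    {
        _handler = handler ?? throw new ArgumentNullException(nameof(handler));
    }

    // Called by the workers; blocks until any other in-flight report has finished.
    public void Report(T value)
    {
        lock (_gate)
        {
            _handler(value);
        }
    }
}

public static class SerializedProgressDemo
{
    // Starts several workers that report concurrently; returns how many reports arrived.
    public static int Run(int workers, int reportsPerWorker)
    {
        // List<T> is not thread-safe; it is protected by the lock inside SerializedProgress.
        var received = new List<string>();
        var progress = new SerializedProgress<string>(received.Add);

        var tasks = new Task[workers];
        for (int w = 0; w < workers; w++)
        {
            int id = w;
            tasks[w] = Task.Run(() =>
            {
                for (int i = 0; i < reportsPerWorker; i++)
                    progress.Report($"watcher {id}: event {i}");
            });
        }
        Task.WaitAll(tasks);
        return received.Count;
    }
}
```

In the watcher scenario the handler would be the parent's method that sends the email, writes the log and calls the web service, so those three operations never overlap.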

c# Task class and memory leak

I have an application which handles data from a text file: it reads a line from the file, handles it, and then writes the result to another file. After handling one row it handles the next one until the whole file is done. Some rows in the file are very time-consuming to handle, so I decided to put the handling logic in a separate thread, and if handling takes longer than 10 seconds I kill the thread. My code looks like this:
public class Handler
{
public void Handle(string row)
{
// Perform handling
}
}
public class Program
{
private static bool HandleRow(string row)
{
Task task = new Task(() => new Handler().Handle(row));
task.Start(); // updated
var waitResult = task.Wait(timeout); // timeout is 10 sec.
if(waitResult == false || task.IsFaulted)
return false;
return true;
}
public static void Main()
{
foreach(var row in GetRowsToHandle())
HandleRow(row);
}
}
but somehow when running the program I get out of memory exception. It seems that memory is not released properly.
Does anyone know why memory leaks might happen?
UPDATED
I forgot to include task.Start() in my code snippet. I have added it now.
Task is Disposable : task.Dispose();
Your 10s timeout only times out the task. It doesn't stop Handle() from executing (if indeed it ever starts - I can't see a Start there). It just means you locally see a timeout on task.
Also, it depends in part on what GetRowsToHandle() does - does it return a non-buffered sequence, or is it a list, etc.
While Task does support cancellation, this requires co-operation from the implementation. To be honest, since you aren't doing anything async you might be better off just handling your own "have I taken too long" basic timeout in Handle(). A thread-abort (the other option) is not to be recommended.
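A cooperative "have I taken too long" check inside the handler could look like the sketch below. It assumes the work can be split into steps between which a token can be checked. RowHandler, `steps` and `stepMilliseconds` are hypothetical names, and Thread.Sleep only stands in for the real per-row work.

```csharp
using System;
using System.Threading;

public static class RowHandler
{
    // Returns true if the row was handled within the timeout, false if it gave up.
    public static bool Handle(string row, TimeSpan timeout, int steps, int stepMilliseconds)
    {
        // The token is cancelled automatically once the timeout elapses.
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                for (int i = 0; i < steps; i++)
                {
                    // The work itself checks the token, so it really stops on timeout.
                    cts.Token.ThrowIfCancellationRequested();
                    Thread.Sleep(stepMilliseconds); // placeholder for one unit of real work
                }
                return true;
            }
            catch (OperationCanceledException)
            {
                return false;
            }
        }
    }
}
```

Because the handler stops itself, no abandoned work keeps running (and holding memory) after the caller has moved on to the next row.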

How to effectively log asynchronously?

I am using Enterprise Library 4 on one of my projects for logging (and other purposes). I've noticed that there is some cost to the logging that I am doing that I can mitigate by doing the logging on a separate thread.
The way I am doing this now is that I create a LogEntry object and then I call BeginInvoke on a delegate that calls Logger.Write.
new Action<LogEntry>(Logger.Write).BeginInvoke(le, null, null);
What I'd really like to do is add the log message to a queue and then have a single thread pulling LogEntry instances off the queue and performing the log operation. The benefit of this would be that logging is not interfering with the executing operation and not every logging operation results in a job getting thrown on the thread pool.
How can I create a shared queue that supports many writers and one reader in a thread safe way? Some examples of a queue implementation that is designed to support many writers (without causing synchronization/blocking) and a single reader would be really appreciated.
Recommendation regarding alternative approaches would also be appreciated, I am not interested in changing logging frameworks though.
I wrote this code a while back, feel free to use it.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace MediaBrowser.Library.Logging {
public abstract class ThreadedLogger : LoggerBase {
Queue<Action> queue = new Queue<Action>();
AutoResetEvent hasNewItems = new AutoResetEvent(false);
volatile bool waiting = false;
public ThreadedLogger() : base() {
Thread loggingThread = new Thread(new ThreadStart(ProcessQueue));
loggingThread.IsBackground = true;
loggingThread.Start();
}
void ProcessQueue() {
while (true) {
waiting = true;
hasNewItems.WaitOne(10000,true);
waiting = false;
Queue<Action> queueCopy;
lock (queue) {
queueCopy = new Queue<Action>(queue);
queue.Clear();
}
foreach (var log in queueCopy) {
log();
}
}
}
public override void LogMessage(LogRow row) {
lock (queue) {
queue.Enqueue(() => AsyncLogMessage(row));
}
hasNewItems.Set();
}
protected abstract void AsyncLogMessage(LogRow row);
public override void Flush() {
while (!waiting) {
Thread.Sleep(1);
}
}
}
}
Some advantages:
It keeps the background logger alive, so it does not need to spin up and spin down threads.
It uses a single thread to service the queue, which means there will never be a situation where 100 threads are servicing the queue.
It copies the queues to ensure the queue is not blocked while the log operation is performed
It uses an AutoResetEvent to ensure the bg thread is in a wait state
It is, IMHO, very easy to follow
Here is a slightly improved version, keep in mind I performed very little testing on it, but it does address a few minor issues.
public abstract class ThreadedLogger : IDisposable {
Queue<Action> queue = new Queue<Action>();
ManualResetEvent hasNewItems = new ManualResetEvent(false);
ManualResetEvent terminate = new ManualResetEvent(false);
ManualResetEvent waiting = new ManualResetEvent(false);
Thread loggingThread;
public ThreadedLogger() {
loggingThread = new Thread(new ThreadStart(ProcessQueue));
loggingThread.IsBackground = true;
// this is performed from a bg thread, to ensure the queue is serviced from a single thread
loggingThread.Start();
}
void ProcessQueue() {
while (true) {
waiting.Set();
int i = ManualResetEvent.WaitAny(new WaitHandle[] { hasNewItems, terminate });
// terminate was signaled
if (i == 1) return;
hasNewItems.Reset();
waiting.Reset();
Queue<Action> queueCopy;
lock (queue) {
queueCopy = new Queue<Action>(queue);
queue.Clear();
}
foreach (var log in queueCopy) {
log();
}
}
}
public void LogMessage(LogRow row) {
lock (queue) {
queue.Enqueue(() => AsyncLogMessage(row));
}
hasNewItems.Set();
}
protected abstract void AsyncLogMessage(LogRow row);
public void Flush() {
waiting.WaitOne();
}
public void Dispose() {
terminate.Set();
loggingThread.Join();
}
}
Advantages over the original:
It's disposable, so you can get rid of the async logger
The flush semantics are improved
It will respond slightly better to a burst followed by silence
Yes, you need a producer/consumer queue. I have one example of this in my threading tutorial - if you look at my "deadlocks / monitor methods" page you'll find the code in the second half.
There are plenty of other examples online, of course - and .NET 4.0 will ship with one in the framework too (rather more fully featured than mine!). In .NET 4.0 you'd probably wrap a ConcurrentQueue<T> in a BlockingCollection<T>.
The version on that page is non-generic (it was written a long time ago) but you'd probably want to make it generic - it would be trivial to do.
You would call Produce from each "normal" thread, and Consume from one thread, just looping round and logging whatever it consumes. It's probably easiest just to make the consumer thread a background thread, so you don't need to worry about "stopping" the queue when your app exits. That does mean there's a remote possibility of missing the final log entry though (if it's half way through writing it when the app exits) - or even more if you're producing faster than it can consume/log.
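With the framework types mentioned above, a many-writers/one-reader logger might be sketched as follows. QueuedLogger is a hypothetical name, and the `write` delegate stands in for the real sink (for example a call to Logger.Write). Unlike a bare background thread, Dispose drains the queue before returning, so the final entries are not lost.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch: many producers call Log, a single thread performs the writes.
public sealed class QueuedLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(new ConcurrentQueue<string>());
    private readonly Thread _consumer;
    private readonly Action<string> _write;

    public QueuedLogger(Action<string> write)
    {
        _write = write ?? throw new ArgumentNullException(nameof(write));
        _consumer = new Thread(Consume) { IsBackground = true };
        _consumer.Start();
    }

    // Safe to call from any thread; returns as soon as the message is queued.
    public void Log(string message) => _queue.Add(message);

    private void Consume()
    {
        // Blocks while the queue is empty; ends after CompleteAdding once the queue is drained.
        foreach (string message in _queue.GetConsumingEnumerable())
            _write(message);
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // no more entries accepted
        _consumer.Join();        // wait until everything queued has been written
        _queue.Dispose();
    }
}
```

Entries come out in the order they were queued, because the underlying ConcurrentQueue<T> is FIFO and only one thread consumes it.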
Here is what I came up with... also see Sam Saffron's answer. This answer is community wiki in case there are any problems that people see in the code and want to update.
/// <summary>
/// A singleton queue that manages writing log entries to the different logging sources (Enterprise Library Logging) off the executing thread.
/// This queue ensures that log entries are written in the order that they were executed and that logging is only utilizing one thread (backgroundworker) at any given time.
/// </summary>
public class AsyncLoggerQueue
{
//create singleton instance of logger queue
public static AsyncLoggerQueue Current = new AsyncLoggerQueue();
private static readonly object logEntryQueueLock = new object();
private Queue<LogEntry> _LogEntryQueue = new Queue<LogEntry>();
private BackgroundWorker _Logger = new BackgroundWorker();
private AsyncLoggerQueue()
{
//configure background worker
_Logger.WorkerSupportsCancellation = false;
_Logger.DoWork += new DoWorkEventHandler(_Logger_DoWork);
}
public void Enqueue(LogEntry le)
{
//lock during write
lock (logEntryQueueLock)
{
_LogEntryQueue.Enqueue(le);
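//while locked check to see if the BW is running, if not start it
//note: IsBusy only resets after the worker's completion has been processed, so an entry enqueued just as DoWork returns can sit in the queue until the next Enqueue call
if (!_Logger.IsBusy)
_Logger.RunWorkerAsync();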
}
}
private void _Logger_DoWork(object sender, DoWorkEventArgs e)
{
while (true)
{
LogEntry le = null;
bool skipEmptyCheck = false;
lock (logEntryQueueLock)
{
if (_LogEntryQueue.Count <= 0) //if queue is empty then BW is done
return;
else if (_LogEntryQueue.Count > 1) //if greater than 1 we can skip checking to see if anything has been enqueued during the logging operation
skipEmptyCheck = true;
//dequeue the LogEntry that will be written to the log
le = _LogEntryQueue.Dequeue();
}
//pass LogEntry to Enterprise Library
Logger.Write(le);
if (skipEmptyCheck) //if LogEntryQueue.Count was > 1 before we wrote the last LogEntry we know to continue without double checking
{
lock (logEntryQueueLock)
{
if (_LogEntryQueue.Count <= 0) //if queue is now empty then BW is done
return;
}
}
}
}
}
I suggest starting by measuring the actual performance impact of logging on the overall system (e.g. by running a profiler) and optionally switching to something faster like log4net (I personally migrated to it from EntLib logging a long time ago).
If this does not work, you can try using this simple method from .NET Framework:
ThreadPool.QueueUserWorkItem
Queues a method for execution. The method executes when a thread pool thread becomes available.
MSDN Details
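A minimal sketch of what that might look like. The names ThreadPoolLogging, LogAsync and writeEntry are made up for illustration; writeEntry stands in for your real logging call. Keep in mind that work items can run concurrently and in any order, so the logging call must be thread-safe and entries are not guaranteed to be written in the order they were queued.

```csharp
using System;
using System.Threading;

// Hypothetical helper; only ThreadPool.QueueUserWorkItem is the real framework API here.
public static class ThreadPoolLogging
{
    // writeEntry is an assumed stand-in for the actual logging call.
    public static void LogAsync(string message, Action<string> writeEntry)
    {
        // The second argument is handed to the callback as its state object.
        ThreadPool.QueueUserWorkItem(state =>
        {
            try
            {
                writeEntry((string)state);
            }
            catch (Exception)
            {
                // Swallowed on purpose in this sketch: an unhandled exception
                // on a thread pool thread would bring the process down.
            }
        }, message);
    }
}
```

The caller returns immediately, e.g. `ThreadPoolLogging.LogAsync("request finished", Console.WriteLine);`.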
If this does not work either then you can resort to something like Jon Skeet has offered and actually code the async logging framework yourself.
In response to Sam Saffron's post, I wanted to call flush and make sure everything was really finished writing. In my case, I am writing to a database in the queue thread and all my log events were getting queued up, but sometimes the application stopped before everything was finished writing, which is not acceptable in my situation. I changed several chunks of your code but the main thing I wanted to share was the flush:
public static void FlushLogs()
{
bool queueHasValues = true;
while (queueHasValues)
{
//wait for the current iteration to complete
m_waitingThreadEvent.WaitOne();
lock (m_loggerQueueSync)
{
queueHasValues = m_loggerQueue.Count > 0;
}
}
//force MEL to flush all its listeners
foreach (MEL.LogSource logSource in MEL.Logger.Writer.TraceSources.Values)
{
foreach (TraceListener listener in logSource.Listeners)
{
listener.Flush();
}
}
}
I hope that saves someone some frustration. It is especially apparent in parallel processes logging lots of data.
Thanks for sharing your solution, it pointed me in a good direction!
--Johnny S
I wanted to say that my previous post was kind of useless. You can simply set AutoFlush to true and you will not have to loop through all the listeners. However, I still had a crazy problem with parallel threads trying to flush the logger. I had to create another boolean that was set to true during the copying of the queue and executing the LogEntry writes, and then in the flush routine I had to check that boolean to make sure that nothing was still in the queue and that nothing was being processed before returning.
Now multiple threads in parallel can hit this thing and when I call flush I know it is really flushed.
public static void FlushLogs()
{
int queueCount;
bool isProcessingLogs;
while (true)
{
//wait for the current iteration to complete
m_waitingThreadEvent.WaitOne();
//check to see if we are currently processing logs
lock (m_isProcessingLogsSync)
{
isProcessingLogs = m_isProcessingLogs;
}
//check to see if more events were added while the logger was processing the last batch
lock (m_loggerQueueSync)
{
queueCount = m_loggerQueue.Count;
}
if (queueCount == 0 && !isProcessingLogs)
break;
//something is still queued or being written, so pause briefly before checking again
Thread.Sleep(400);
}
}
Just an update:
Using Enterprise Library 5.0 with .NET 4.0, it can easily be done with:
public static void LogMessageAsync(LogEntry logEntry)
{
Task.Factory.StartNew(() => LogMessage(logEntry));
}
See:
http://randypaulo.wordpress.com/2011/07/28/c-enterprise-library-asynchronous-logging/
An extra level of indirection may help here.
Your first async method call can put messages onto a synchronized Queue and set an event -- so the locks are happening in the thread-pool, not on your worker threads -- and then have yet another thread pulling messages off the queue when the event is raised.
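Roughly, the queue-plus-event part could be sketched like this. All names here (SignalledLogQueue, writeEntry) are made up for illustration, and it uses a plain Queue<T> guarded by a lock together with an AutoResetEvent rather than any particular "synchronized queue" type. Enqueue is what your thread-pool callback would call; the writer thread wakes up whenever the event is set.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical name; a sketch of "queue + event + dedicated writer thread".
public sealed class SignalledLogQueue : IDisposable
{
    private readonly Queue<string> queue = new Queue<string>();
    private readonly object sync = new object();
    private readonly AutoResetEvent itemQueued = new AutoResetEvent(false);
    private readonly Thread writer;
    private readonly Action<string> writeEntry; // assumed stand-in for the real log writer
    private volatile bool stopping;

    public SignalledLogQueue(Action<string> writeEntry)
    {
        this.writeEntry = writeEntry;
        writer = new Thread(WriteLoop) { IsBackground = true };
        writer.Start();
    }

    // The lock is held only long enough to push the entry, then the writer is signalled.
    public void Enqueue(string message)
    {
        lock (sync) { queue.Enqueue(message); }
        itemQueued.Set();
    }

    private void WriteLoop()
    {
        while (true)
        {
            itemQueued.WaitOne();
            // Read the flag before draining, so everything queued before Dispose is still written.
            bool stopRequested = stopping;
            // Drain the whole queue: one signal may cover several entries.
            while (true)
            {
                string next;
                lock (sync)
                {
                    if (queue.Count == 0) break;
                    next = queue.Dequeue();
                }
                writeEntry(next); // the slow I/O happens outside the lock
            }
            if (stopRequested) return;
        }
    }

    public void Dispose()
    {
        stopping = true;
        itemQueued.Set(); // wake the writer so it can drain and exit
        writer.Join();
        itemQueued.Dispose();
    }
}
```

Because the slow write happens outside the lock, producers only ever block for the duration of a push.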
If you log something on a separate thread, the message may not be written if the application crashes, which makes it rather useless.
The same reasoning is why you should always flush after every written entry.
If what you have in mind is a SHARED queue, then I think you are going to have to synchronize the writes to it, the pushes and the pops.
But, I still think it's worth aiming at the shared queue design. In comparison to the IO of logging and probably in comparison to the other work your app is doing, the brief amount of blocking for the pushes and the pops will probably not be significant.
