Semaphore and SemaphoreSlim usage Best Practices - C#

I have created a semaphore instance at the top of this class:
public static SemaphoreSlim _zReportSemaphore = new SemaphoreSlim(1, 500);
And somewhere in my code I need to retrieve and send some data:
while (_isRunning)
{
    try
    {
        xBsonDocument = null;
        //I think its very clear in this line...
        MongoDBDAO.xGetInstance().GetZReportData(ref xBsonDocument);

        foreach (BsonDocument item in xBsonDocument)
        {
            try
            {
                ThreadObject xThreadObject = new ThreadObject();
                xThreadObject.m_strTerminalId = item.GetValue("_id")["TERMINAL_ID"].ToString();
                xThreadObject.m_strZNo = item.GetValue("_id")["Z_NO"].ToString();

                m_xBuildAndSendZReportThread =
                    new Thread(new ParameterizedThreadStart(vBuildAndSendZReport));
                m_xBuildAndSendZReportThread.Start(xThreadObject);
            }
            catch (Exception xException)
            {
                xException.TraceError();
                continue;
            }
            Thread.Sleep(m_litleStepQTime);
        }
    }
    catch (Exception xException)
    {
        Thread.Sleep(m_bigStepQTime);
        Trace.vInsertError(xException);
        continue;
    }
    Thread.Sleep(m_iSleepTime);
}
This thread's target method sends files to FTP:
private void vBuildAndSendZReport(object prm_objParameters)
{
    _zReportSemaphore.Wait();
    RetriveDataFromMongoAndSend();
    _zReportSemaphore.Release();
}
In this structure, if I don't use a semaphore it works fine, but sometimes the number of threads overloads the CPU or memory and the machine ends up crashing.
1- How can I control resource usage (balancing, isolating threads, etc.) with this slim semaphore?
2- Can I use SemaphoreSlim for this type of job in production? What are the advantages and disadvantages of organizing a workflow like this? Does it improve performance in my specific case?
3- Is there an alternative that provides system resource management and also wraps up the technical exception handling?
Update:
I asked this question during a job I did a long time ago. After solving the problem, I realized I never came back to it.
In the example above, the report-sending job was running in a file-sharing environment. Other solutions are possible, such as using a CDN.
The question was: why should I use a thread if it can't keep me informed about what it's doing and doesn't tell me whether it produced successful results? Why should I use SemaphoreSlim, for example!?
Yes, of course it can be done with async programming, but I didn't want to pull that library into the environment in question. It had to stay as it was. I'm sure this situation comes up in plenty of codebases.
My solution was this: I eliminated the possibility of the exception in the code that was throwing it, and I synchronized the conflict with the thread outside the app. I built something like a thread pool, and it worked flawlessly as a consumer, driven by a custom timing mechanism I set up.
Regardless, I still agree: a thread should be set up to carry information about the job it is doing. I'm not talking about putting a Mutex in between; the thread itself can carry this information.
By the way, I gave points to those who answered, because they made the right comments for the question as asked.

This is the first hit on Google for "Semaphore and SemaphoreSlim usage Best Practices", so I would like to add 1 remark:
At least, this code
semaphore.Wait();
DoSomeThing();
semaphore.Release();
should be at the minimum
semaphore.Wait();
try
{
    DoSomeThing();
}
finally
{
    semaphore.Release();
}
Otherwise you might end up NEVER releasing the semaphore again if an exception occurs in DoSomeThing...
And in async programming, consider using
await semaphore.WaitAsync();
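For example, a minimal async sketch (DoSomeThingAsync is just a placeholder for whatever needs to run under the semaphore):
private static readonly SemaphoreSlim semaphore = new SemaphoreSlim(1, 1);

private static async Task DoWorkAsync()
{
    // WaitAsync does not block a thread while waiting for the semaphore.
    await semaphore.WaitAsync();
    try
    {
        await DoSomeThingAsync(); // placeholder for the protected work
    }
    finally
    {
        semaphore.Release(); // released even if the awaited work throws
    }
}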

Is there any event when the semaphore ends all its threads?
No. It's not even clear what that might mean. For example, what do you want to happen if, due to thread-scheduling issues, you have just one running thread in the semaphore at the moment, and that thread completes, releasing the semaphore, before one or more other threads even get to try to acquire the semaphore?
The semaphore has no way to detect this condition as different from every thread being done.
If you want to know when some collection of asynchronous operations has completed, you'll need to wait on that specifically. You have a number of options in .NET, including:
Call Thread.Join() on all of the thread objects you've started.
Use Task to run your asynchronous tasks instead of Thread, and use Task.WhenAll() (or less preferably, Task.WaitAll()) to wait for them to complete.
Use CountdownEvent. Call AddCount() for each task you start, have each task call Signal() when it's done, and then wait on the CountdownEvent.
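As a rough sketch of the CountdownEvent option (WorkItem, workItems and DoWork are placeholders, not types or methods from your code):
static void RunAndWaitAll(IEnumerable<WorkItem> workItems)
{
    using (var countdown = new CountdownEvent(1)) // start at 1 so the count cannot reach zero early
    {
        foreach (var item in workItems)
        {
            countdown.AddCount();
            Task.Run(() =>
            {
                try { DoWork(item); }           // placeholder for the task body
                finally { countdown.Signal(); } // always signal, even if DoWork throws
            });
        }

        countdown.Signal(); // remove the initial count added above
        countdown.Wait();   // blocks until every started task has signalled
    }
}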
By the way, the code you posted is suspect in other ways:
Why are you specifying a maximum count for the SemaphoreSlim, and why is this maximum not the same as your initial count? Do you actually expect to call Release() more often than you call Wait()?
Code that calls Thread.Sleep() is often incorrect. It's not clear why you are doing that, but it's likely there are better ways to solve whatever issue you're trying to address with those calls.
Without a good Minimal, Complete, and Verifiable example, I can't say for sure that those things are wrong. But there's a low likelihood of them being right. :)

Related

How to handle multiple tasks running in parallel at different intervals inside a C# based Windows service?

I already have some experience in working with threads in Windows but most of that experience comes from using Win32 API functions in C/C++ applications. When it comes to .NET applications however, I am often not sure about how to properly deal with multithreading. There are threads, tasks, the TPL and all sorts of other things I can use for multithreading but I never know when to use which of those options.
I am currently working on a C# based Windows service which needs to periodically validate different groups of data from different data sources. Implementing the validation itself is not really an issue for me but I am unsure about how to handle all of the validations running simultaneously.
I need a solution for this which allows me to do all of the following things:
Run the validations at different (predefined) intervals.
Control all of the different validations from one place so I can pause and/or stop them if necessary, for example when a user stops or restarts the service.
Use the system resources as efficiently as possible to avoid performance issues.
So far I've only had one similar project before where I simply used Thread objects combined with a ManualResetEvent and a Thread.Join call with a timeout to notify the threads about when the service is stopped. The logic inside those threads to do something periodically then looked like this:
while (!shutdownEvent.WaitOne(0))
{
    if (DateTime.Now > nextExecutionTime)
    {
        // Do something
        nextExecutionTime = nextExecutionTime.AddMinutes(interval);
    }
    Thread.Sleep(1000);
}
While this did work as expected, I've often heard that using threads directly like this is considered "oldschool" or even a bad practice. I also think that this solution does not use threads very efficiently, as they are just sleeping most of the time. How can I achieve something like this in a more modern and efficient way?
If this question is too vague or opinion-based then please let me know and I will try my best to make it as specific as possible.
Question feels a bit broad but we can use the provided code and try to improve it.
Indeed, the problem with the existing code is that for the majority of the time it holds a thread blocked while doing nothing useful (sleeping). The thread also wakes up every second only to check the interval and, in most cases, go back to sleep because it's not validation time yet. Why does it do that? Because if you slept for a longer period, you might block for a long time when you signal shutdownEvent and then join the thread; Thread.Sleep doesn't provide a way to be interrupted on request.
To solve both problems we can use:
Cooperative cancellation in the form of CancellationTokenSource + CancellationToken.
Task.Delay instead of Thread.Sleep.
For example:
async Task ValidationLoop(CancellationToken ct) {
    while (!ct.IsCancellationRequested) {
        try {
            var now = DateTime.Now;
            if (now >= _nextExecutionTime) {
                // do something
                _nextExecutionTime = _nextExecutionTime.AddMinutes(1);
            }
            var waitFor = _nextExecutionTime - now;
            if (waitFor.Ticks > 0) {
                await Task.Delay(waitFor, ct);
            }
        }
        catch (OperationCanceledException) {
            // expected, just exit
            // otherwise, let it go and handle cancelled task
            // at the caller of this method (returned task will be cancelled).
            return;
        }
        catch (Exception) {
            // either have global exception handler here
            // or expect the task returned by this method to fail
            // and handle this condition at the caller
        }
    }
}
Now we do not hold a thread any more, because await Task.Delay doesn't block one. Instead, after the specified time interval it will execute the subsequent code on a free thread pool thread (it's more complicated than this, but we won't go into details here).
We also don't need to wake up every second for no reason, because Task.Delay accepts a cancellation token as a parameter. When that token is signalled, Task.Delay is immediately interrupted with an exception, which we expect and which breaks us out of the validation loop.
To stop the provided loop you need to use CancellationTokenSource:
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
You pass its _cts.Token into the provided method. Then, when you want to signal cancellation, just call:
_cts.Cancel();
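Putting it together in a service-style host might look roughly like this (a sketch; the OnStartService/OnStopService names and the _validationTask field are illustrative, not part of the original code):
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
private Task _validationTask;

public void OnStartService()
{
    // Start the loop; keep the returned task so we can wait for it on shutdown.
    _validationTask = ValidationLoop(_cts.Token);
}

public void OnStopService()
{
    _cts.Cancel();          // interrupts the Task.Delay inside the loop
    _validationTask.Wait(); // the loop catches the cancellation and returns, so this completes promptly
}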
To further improve resource management: IF your validation code uses any IO operations (reading files from disk, network access, database access, etc.), use the async versions of those operations. Then no unnecessary threads are held blocked while waiting on IO either.
Now you don't need to manage threads yourself anymore; instead, you operate in terms of the tasks you need to perform, letting the framework / OS manage threads for you.
You should use Microsoft's Reactive Framework (aka Rx) - NuGet System.Reactive and add using System.Reactive.Linq; - then you can do this:
Subject<bool> starter = new Subject<bool>();

IObservable<Unit> query =
    starter
        .StartWith(true)
        .Select(x => x
            ? Observable.Interval(TimeSpan.FromSeconds(5.0)).SelectMany(y => Observable.Start(() => Validation()))
            : Observable.Never<Unit>())
        .Switch();

IDisposable subscription = query.Subscribe();
That fires off the Validation() method every 5.0 seconds.
When you need to pause and resume, do this:
starter.OnNext(false);
// Now paused
starter.OnNext(true);
// Now restarted.
When you want to stop it all call subscription.Dispose().

Does async (one task) make sense in MVC?

I am using async/await in MVC, but only when I have more than one task (WaitAll).
I understand that awaiting a single task is useful for keeping the UI free in WPF or Windows Forms, but does it make sense in MVC to have only one task and await it?
I've seen it a lot in code, in MVC, but I don't get the advantages.
HTTP requests are handled by thread pool threads.
If you block a thread, it will not be able to do other work. Because the total number of threads is limited, this can lead to thread pool starvation, and new requests will be rejected with a 503.
Using async code, the thread is released back to the thread pool, becoming available to handle new requests or the continuations of async code.
Like on client UIs, on the server, async code is all about responsiveness, not performance. Requests will take longer but your server will be able to handle more requests.
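As a minimal illustration in an MVC controller (IReportService and its methods are placeholders, not a real API):
public class ReportsController : Controller
{
    private readonly IReportService _reportService; // placeholder dependency

    public ReportsController(IReportService reportService)
    {
        _reportService = reportService;
    }

    // Blocking version: the request thread is held for the whole duration of the call.
    public ActionResult Blocking(int id)
    {
        var report = _reportService.GetReport(id);
        return Json(report, JsonRequestBehavior.AllowGet);
    }

    // Async version: the thread goes back to the pool while the call is in flight.
    public async Task<ActionResult> NonBlocking(int id)
    {
        var report = await _reportService.GetReportAsync(id);
        return Json(report, JsonRequestBehavior.AllowGet);
    }
}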
It depends on what you are trying to achieve. For instance, if you have multiple calls to multiple services you can always do it in a way that only the last call makes the rest of the system "wait".
You can optimise your code so that X asynchronous calls to services start (almost) at the same time without having to await one another.
public async Task RunSomethings()
{
    // All three calls are started here and run concurrently.
    var something1 = AsyncCall1();
    var something2 = AsyncCall2();
    var something3 = await AsyncCall3();

    // Await the first two as well so their results (and any exceptions) are observed.
    await something1;
    await something2;
}

private async Task<Something1> AsyncCall1()
{
    return await Something1(); // Something1() stands in for the real service call
}

private async Task<Something2> AsyncCall2()
{
    return await Something2();
}

private async Task<Something3> AsyncCall3()
{
    return await Something3();
}
I hope it helps.
Good question. Using asynchronous methods is all about using resources effectively as well as giving a good user experience. Any time you need to call on a resource that could take time to collect, it's good to use an async call.
This does a few things. First, while a worker thread is waiting for data, it can be put on 'hold', so to speak, and that worker thread can do something else until the data is returned, an error is returned, or the call just times out.
This gives you the second advantage: the interface the user is using to call the resource is released, temporarily, to do other things. And overall, fewer server resources are consumed by idle processes.
I'd recommend watching the videos on this page here: https://channel9.msdn.com/Series/Three-Essential-Tips-for-Async
It's probably the clearest explanation that can help leapfrog your learning on async.

Need a queue of jobs to be processed by threads

I have some work (a job) that is in a queue (so there are several of them) and I want each job to be processed by a thread.
I was looking at Rx but this is not what I wanted, and then I came across the Task Parallel Library.
Since my work will be done in a web application I do not want the client to be waiting for each job to finish, so I have done the following:
public void FromWebClientRequest(int[] ids)
{
    // I will get the objects for the ids from a repository using a container (UNITY)
    ThreadPool.QueueUserWorkItem(delegate
    {
        DoSomeWorkInParallel(ids, container);
    });
}

private static void DoSomeWorkInParallel(int[] ids, IUnityContainer container)
{
    Parallel.ForEach(ids, id =>
    {
        // Some work will be done here...
        var repository = container.Resolve...
    });
    // Here all the work will be done.
    container.Resolve<ILogger>().Log("finished all work");
}
I would call the above code on a web request and then the client will not have to wait.
Is this the correct way to do this?
TIA
From the MSDN docs I see that Unity's IContainer Resolve method is not thread-safe (or at least it isn't documented to be). This means you need to do the resolves outside of the threaded loop. Edit: changed to Task.
public void FromWebClientRequest(int[] ids)
{
    IRepoType repoType = container.Resolve<IRepoType>();
    ILogger logger = container.Resolve<ILogger>();

    // Remove LongRunning if your operations are not blocking (i.e. reading or downloading files, long-running queries, etc.).
    // PreferFairness is here to try to complete the requests that came first, so clients are more likely
    // to be served "first come, first served" in case of high CPU use with a lot of requests.
    Task.Factory.StartNew(() => DoSomeWorkInParallel(ids, repoType, logger),
        TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
}
private static void DoSomeWorkInParallel(int[] ids, IRepoType repository, ILogger logger)
{
    // If there are blocking operations inside this loop, you ought to convert it to tasks with LongRunning.
    // Why? To force more threads than would usually be used for the loop and try to saturate CPU use,
    // which would otherwise be idle most of the time.
    // Beware of doing this if you work against a non-clustered database, since you can saturate it and create
    // a bottleneck there; you should try it and see how it handles your workload.
    Parallel.ForEach(ids, id =>
    {
        // Some work will be done here...
        // use repository
    });
    logger.Log("finished all work");
}
Plus, as fiver stated, if you have .NET 4 then Tasks are the way to go.
Why go with Task (question in the comments):
If your FromWebClientRequest method were fired insanely often, you would fill the thread pool, and overall system performance would probably not be as good as with the fine-grained .NET 4 approach. This is where Task enters the game. Each task is not its own thread; the new .NET 4 thread pool creates enough threads to maximize performance on a system, and you do not need to worry about how many CPUs there are or how many thread context switches would occur.
Some MSDN quotes about the ThreadPool:
"When all thread pool threads have been assigned to tasks, the thread pool does not immediately begin creating new idle threads. To avoid unnecessarily allocating stack space for threads, it creates new idle threads at intervals. The interval is currently half a second, although it could change in future versions of the .NET Framework."
"The thread pool has a default size of 250 worker threads per available processor."
"Unnecessarily increasing the number of idle threads can also cause performance problems. Stack space must be allocated for each thread. If too many tasks start at the same time, all of them might appear to be slow. Finding the right balance is a performance-tuning issue."
By using Tasks you discard those issues.
Another good thing is that you can fine-tune the type of operation to run. This is important if your tasks do run blocking operations; that is a case where more threads should be allocated concurrently, since they would mostly be waiting. The ThreadPool cannot achieve this automagically:
Task.Factory.StartNew(() => DoSomeWork(), TaskCreationOptions.LongRunning);
And of course you are able to make it finish on demand without resorting to ManualResetEvent:
var task = Task.Factory.StartNew(() => DoSomeWork());
task.Wait();
Besides this, you don't have to change the Parallel.ForEach if you don't expect exceptions or blocking, since it is part of the .NET 4 Task Parallel Library and is (usually) well optimized on the .NET 4 pool, just as Tasks are.
However, if you do go with Tasks instead of the parallel loop, remove LongRunning from the caller Task, since Parallel.ForEach is a blocking operation and starting tasks (with fiver's loop) is not. But this way you lose the rough first-come-first-served optimization, or you have to apply it to a lot more Tasks (one spawned per id), which would probably give less correct behaviour. Another option is to wait on all the tasks at the end of DoSomeWorkInParallel.
Another way is to use Tasks:
public static void FromWebClientRequest(int[] ids)
{
    foreach (var id in ids)
    {
        Task.Factory.StartNew(i =>
        {
            Wl(i);
        }, id);
    }
}
"I would call the above code on a web request and then the client will not have to wait."
This will work provided the client does not need an answer (like Ok/Fail).
"Is this the correct way to do this?"
Almost. You use Parallel.ForEach (TPL) for the jobs but run it from a 'plain' Threadpool job. Better to use a Task for the outer job as well.
Also, handle all exceptions in that outer Task. And be careful about the thread-safety of the container etc.
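For illustration, the outer job as a Task with its exceptions observed might look roughly like this, building on the snippet above (a sketch, not a drop-in):
public void FromWebClientRequest(int[] ids)
{
    IRepoType repoType = container.Resolve<IRepoType>();
    ILogger logger = container.Resolve<ILogger>();

    Task.Factory.StartNew(() => DoSomeWorkInParallel(ids, repoType, logger))
        .ContinueWith(t =>
        {
            // Observe the exception so it does not go unhandled, and log it.
            logger.Log("job failed: " + t.Exception.GetBaseException().Message);
        }, TaskContinuationOptions.OnlyOnFaulted);
}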

When implementing time-constrained methods, should I abort the worker thread or let it run its course?

I'm currently writing a web services based front-end to an existing application. To do that, I'm using the WCF LOB Adapter SDK, which allows one to create custom WCF bindings that expose external data and operations as web services.
The SDK provides a few interfaces to implement, and some of their methods are time-constrained: the implementation is expected to complete its work within a specified timespan or throw a TimeoutException.
Investigations led me to the question "Implement C# Generic Timeout", which wisely advises to use a worker thread. Armed with that knowledge, I can write:
public MetadataRetrievalNode[] Browse(string nodeId, int childStartIndex,
    int maxChildNodes, TimeSpan timeout)
{
    Func<MetadataRetrievalNode[]> work = () => {
        // Return computed metadata...
    };

    IAsyncResult result = work.BeginInvoke(null, null);

    if (result.AsyncWaitHandle.WaitOne(timeout)) {
        return work.EndInvoke(result);
    } else {
        throw new TimeoutException();
    }
}
However, the consensus is not clear about what to do with the worker thread if it times out. One can just forget about it, like the code above does, or one can abort it:
public MetadataRetrievalNode[] Browse(string nodeId, int childStartIndex,
    int maxChildNodes, TimeSpan timeout)
{
    Thread workerThread = null;

    Func<MetadataRetrievalNode[]> work = () => {
        workerThread = Thread.CurrentThread;
        // Return computed metadata...
    };

    IAsyncResult result = work.BeginInvoke(null, null);

    if (result.AsyncWaitHandle.WaitOne(timeout)) {
        return work.EndInvoke(result);
    } else {
        workerThread.Abort();
        throw new TimeoutException();
    }
}
Now, aborting a thread is widely considered as wrong. It breaks work in progress, leaks resources, messes with locking and does not even guarantee the thread will actually stop running. That said, HttpResponse.Redirect() aborts a thread every time it's called, and IIS seems to be perfectly happy with that. Maybe it's prepared to deal with it somehow. My external application probably isn't.
On the other hand, if I let the worker thread run its course, apart from the resource contention increase (less available threads in the pool), wouldn't memory be leaked anyway, because work.EndInvoke() never gets called? More specifically, wouldn't the MetadataRetrievalNode[] array returned by work remain around forever?
Is this only a matter of choosing the lesser of two evils, or is there a way not to abort the worker thread and still reclaim the memory used by BeginInvoke()?
Well, first off, Thread.Abort is not nearly as bad as it used to be. There were several improvements made to the CLR in 2.0 that fixed several of the major issues with aborting threads. It is still bad, mind you, so avoiding it is the best course of action. If you must resort to aborting threads then at the very least you should consider tearing down the application domain from which the abort originated. That is going to be incredibly invasive in most scenarios and would not resolve the possible corruption of unmanaged resources.
Aside from that, aborting in this case is going to have other implications. The most important being that you are attempting to abort a ThreadPool thread. I am really not sure what the end result of that would be and it could be different depending on which version of the framework is in play.
The best course of action is to have your Func<MetadataRetrievalNode[]> delegate poll a variable at safe points to see if it should terminate execution on its own.
public MetadataRetrievalNode[] Browse(string nodeId, int childStartIndex, int maxChildNodes, TimeSpan timeout)
{
    bool terminate = false;

    Func<MetadataRetrievalNode[]> work =
        () =>
        {
            // Do some work.
            Thread.MemoryBarrier(); // Ensure a fresh read of the terminate variable.
            if (terminate) throw new InvalidOperationException();

            // Do some work.
            Thread.MemoryBarrier(); // Ensure a fresh read of the terminate variable.
            if (terminate) throw new InvalidOperationException();

            // Return computed metadata...
        };

    IAsyncResult result = work.BeginInvoke(null, null);
    terminate = !result.AsyncWaitHandle.WaitOne(timeout);
    return work.EndInvoke(result); // This blocks until the delegate completes.
}
The tricky part is how to deal with blocking calls inside your delegate. Obviously, you cannot check the terminate flag if the delegate is in the middle of a blocking call. But, assuming the blocking call is initiated from one of the canned BCL waiting mechanisms (WaitHandle.WaitOne, Monitor.Wait, etc.), then you could use Thread.Interrupt to "poke" it, and that should immediately unblock it.
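As a rough sketch of that "poke" inside the Browse pattern from above (illustrative only; the ManualResetEvent stands in for whatever the real work blocks on, and the race on capturing workerThread is ignored here):
Thread workerThread = null;
var pretendBlockingCall = new ManualResetEvent(false); // stands in for a real blocking call

Func<MetadataRetrievalNode[]> work = () =>
{
    workerThread = Thread.CurrentThread;
    try
    {
        pretendBlockingCall.WaitOne(); // the delegate blocks here
    }
    catch (ThreadInterruptedException)
    {
        // The caller timed out and interrupted us; give up cooperatively.
        return null;
    }
    // Return computed metadata...
    return new MetadataRetrievalNode[0];
};

IAsyncResult result = work.BeginInvoke(null, null);
if (!result.AsyncWaitHandle.WaitOne(timeout))
{
    if (workerThread != null)
    {
        workerThread.Interrupt(); // unblocks the WaitOne inside the delegate
    }
    throw new TimeoutException();
}
return work.EndInvoke(result);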
The answer depends on the type of work your worker thread is performing. My guess is it's working with external resources like a data connection. Thread.Abort() is indeed evil in any case of threads working with hooks to unmanaged resources, no matter how well-wrapped.
Basically, you want your service to give up if it times out. At this point, theoretically, the caller no longer cares how long the thread's going to take; it only cares that it's "too long", and should move on. Barring a bug in the worker thread's running method, it WILL end eventually; the caller just no longer cares when because it's not waiting any longer.
Now, if the reason the thread timed out is because it's caught in an infinite loop, or is told to wait forever on some other operation like a service call, then you have a problem that you should fix, but the fix is not to kill the thread. That would be analogous to sending your kid into a grocery store to buy bread while you wait in the car. If your kid keeps spending 15 minutes in the store when you think it should take 5, you eventually get curious, go in and find out what they're doing. If it's not what you thought they should be doing, like they've spent all the time looking at pots & pans, you "correct" their behavior for future occasions. If you go in and see your kid standing in a long checkout line, then you just start waiting longer. In neither of these cases should you press the button that detonates the explosive vest they're wearing; that just makes a big mess that will likely interfere with the next kid's ability to do the same errand later.

What's the best way to unit test from multiple threads?

This kind of follows on from another question of mine.
Basically, once I have the code to access the file (I will review the answers there in a minute), what would be the best way to test it?
I am thinking of creating a method which just spawns lots of BackgroundWorkers or something, tells them all to load/save the file, and tests with varying file/object sizes. Then, get a response back from the threads to see whether it failed/succeeded/made the world implode, etc.
Can you guys offer any suggestions on the best way to approach this? As I said before, this is all kinda new to me :)
Edit
Following ajmastrean's post:
I am using a console app to test with Debug.Asserts :)
Update
I originally rolled with BackgroundWorker to deal with the threading (since I am used to it from Windows development), but when I was performing tests where multiple operations (threads) needed to complete before continuing, I realised it was going to be a bit of a hack to get it to do this.
I then followed up on ajmastrean's post and realised I should really be using the Thread class for working with concurrent operations. I will now refactor using this method (albeit a different approach).
In .NET, you can't wait for ThreadPool threads to finish without setting up ManualResetEvents or AutoResetEvents. I find these overkill for a quick test method (not to mention kind of complicated to create, set, and manage). BackgroundWorker is also a bit complex, with the callbacks and such.
Something I have found that works is
Create an array of threads.
Setup the ThreadStart method of each thread.
Start each thread.
Join on all threads (blocks the current thread until all other threads complete or abort)
public static void MultiThreadedTest()
{
    Thread[] threads = new Thread[count];

    for (int i = 0; i < threads.Length; i++)
    {
        threads[i] = new Thread(DoSomeWork);
    }

    foreach (Thread thread in threads)
    {
        thread.Start();
    }

    foreach (Thread thread in threads)
    {
        thread.Join();
    }
}
@ajmastrean, since unit test results must be predictable, we need to synchronize the threads somehow. I can't see a simple way to do it without using events.
I found that ThreadPool.QueueUserWorkItem gives me an easy way to test such use cases:
// Assumed declarations for this snippet: two events used to coordinate the two work items.
var event1 = new AutoResetEvent(false);
var event2 = new AutoResetEvent(false);

ThreadPool.QueueUserWorkItem(x =>
{
    File.Open(fileName, FileMode.Open);
    event1.Set();     // Start 2nd thread;
    event2.WaitOne(); // Blocking the file;
});

ThreadPool.QueueUserWorkItem(x =>
{
    try
    {
        event1.WaitOne();      // Waiting until 1st thread opens the file
        File.Delete(fileName); // Simulating conflict
    }
    catch (IOException e)
    {
        Debug.Write("File access denied");
    }
});
Your idea should work fine. Basically you just want to spawn a bunch of threads and make sure the ones writing the file take long enough to actually make the readers wait. If all of your threads return without error, and without blocking forever, then the test succeeds.
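A minimal sketch of that kind of test, in the spirit of the Debug.Assert console-app approach mentioned above (SaveToFile and LoadFromFile are placeholders for the code under test):
public static void ReaderWriterStressTest()
{
    int errorCount = 0;
    var threads = new List<Thread>();

    // A few writers; SaveToFile should be slow enough that readers really do have to wait.
    for (int i = 0; i < 4; i++)
    {
        threads.Add(new Thread(() =>
        {
            try { SaveToFile(); }
            catch (Exception) { Interlocked.Increment(ref errorCount); }
        }));
    }

    // Many readers.
    for (int i = 0; i < 16; i++)
    {
        threads.Add(new Thread(() =>
        {
            try { LoadFromFile(); }
            catch (Exception) { Interlocked.Increment(ref errorCount); }
        }));
    }

    foreach (var thread in threads) thread.Start();
    foreach (var thread in threads) thread.Join(); // hangs here if anything blocks forever

    Debug.Assert(errorCount == 0, "concurrent load/save produced errors");
}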
