I'm trying to understand how the AsyncLock works.
First of all, here's a snippet to prove that it actually works:
var l = new AsyncLock();
var tasks = new List<Task>();
while (true)
{
    Console.ReadLine();
    var i = tasks.Count + 1;
    tasks.Add(Task.Run(async () =>
    {
        Console.WriteLine($"[{i}] Acquiring lock ...");
        using (await l.LockAsync())
        {
            Console.WriteLine($"[{i}] Lock acquired");
            await Task.Delay(-1);
        }
    }));
}
By "works" I mean that you can run as many tasks as you want (by hitting Enter) and the number of threads doesn't grow. If you replace it with traditional lock, you'll see that the new threads are started, which is what we try to avoid.
But the first thing you see in the source code is... the lock
Can somebody please explain to me how this works, why it doesn't block, and what I'm missing here?
Can somebody please explain to me how this works, why it doesn't block, and what I'm missing here?
The short answer is that lock is just an internal mechanism used to guarantee thread safety. The lock is never exposed in any way, and there's no way for any thread to hold that lock for any real amount of time. In this way, it's similar to the locks used internally by various concurrent collections.
There is an alternate approach that uses lock-free programming, but I have found lock-free programming to be extremely difficult to write, read, and maintain. A great example of this (which is sadly not online) was a bunch of Dr. Dobb's articles in the late '90s, each one trying to out-do the last with a better lock-free queue implementation. It turns out they were all faulty - in some cases, the bugs took more than a decade to find.
For my own code, I do not use lock-free programming, except where the correctness of the code is trivially obvious.
As far as the async lock vs lock concepts, I'm going to take a stab at explaining this. There's a feeling I get that I have only felt when working with asynchronous coordination primitives. It's something I've thought a lot about writing a blog post on, but I don't have the right words to make it understandable. That said, here goes...
Asynchronous coordination primitives exist on a completely different plane than normal coordination primitives. Synchronous primitives block threads and signal threads. Asynchronous primitives just work on plain objects; the blocking or signaling is just "by convention".
So, with a normal lock, the calling code must take the lock immediately. But with an asynchronous "lock", the attempted lock is just a request, just an object. The calling code doesn't even need to await it. It's possible to request several locks and await them all together with Task.WhenAll. Or even combine them with other things; code can do crazy things like (a)wait for two locks to both be free or for a signal (like AsyncManualResetEvent) to be sent, and then cancel the lock requests if the signal comes in first.
From a thread perspective, it's kinda-sorta like user-mode thread scheduling. There's also some similarities to cooperative multitasking (as opposed to preemptive). But overall, the asynchronous primitives are "lifted" to a different plane, where one works only with objects and blocks of code, not threads.
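To make that concrete, here's a minimal sketch of composing lock requests, assuming Nito.AsyncEx (whose LockAsync() result can be converted to a Task via AsTask()). Note that neither request blocks the calling thread; they're just objects until awaited:

using System;
using System.Threading.Tasks;
using Nito.AsyncEx;

var lockA = new AsyncLock();
var lockB = new AsyncLock();

// Each call just produces a request object; nothing blocks here.
Task<IDisposable> requestA = lockA.LockAsync().AsTask();
Task<IDisposable> requestB = lockB.LockAsync().AsTask();

// Await both requests together as one composed operation.
IDisposable[] keys = await Task.WhenAll(requestA, requestB);
try
{
    // Both locks are held here.
}
finally
{
    foreach (var key in keys)
        key.Dispose(); // disposing a key releases its lock
}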
The lock inside AsyncLock is being released very quickly. Each task that tries to acquire the AsyncLock successfully acquires its internal lock, and the actual locking logic is handled with a queue.
By wrapping LockAsync() in a using block, the lock is released when the block ends: LockAsync returns a disposable Key object, which is disposed at the end of the using block, and disposing it releases the lock. See https://github.com/StephenCleary/AsyncEx/blob/master/src/Nito.AsyncEx.Coordination/AsyncLock.cs#L182-L185
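For intuition, here is a simplified sketch (not the actual Nito.AsyncEx source, just an illustration of the pattern described above): the internal lock is only ever held for a few instructions while the real lock state - a flag plus a queue of waiters - is updated.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class SimpleAsyncLock
{
    private readonly object _mutex = new object(); // internal lock, held only briefly
    private readonly Queue<TaskCompletionSource<IDisposable>> _waiters =
        new Queue<TaskCompletionSource<IDisposable>>();
    private bool _taken;

    public Task<IDisposable> LockAsync()
    {
        lock (_mutex) // never held across an await, so it can't block for long
        {
            if (!_taken)
            {
                _taken = true;
                return Task.FromResult<IDisposable>(new Key(this));
            }
            var tcs = new TaskCompletionSource<IDisposable>(
                TaskCreationOptions.RunContinuationsAsynchronously);
            _waiters.Enqueue(tcs);
            return tcs.Task; // caller awaits this; no thread is blocked
        }
    }

    private void Release()
    {
        TaskCompletionSource<IDisposable> next = null;
        lock (_mutex)
        {
            if (_waiters.Count > 0)
                next = _waiters.Dequeue(); // hand ownership to the next waiter
            else
                _taken = false;
        }
        next?.SetResult(new Key(this)); // completed outside the internal lock
    }

    private sealed class Key : IDisposable
    {
        private SimpleAsyncLock _owner;
        public Key(SimpleAsyncLock owner) { _owner = owner; }
        public void Dispose()
        {
            _owner?.Release();
            _owner = null; // make double-dispose harmless
        }
    }
}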
Recently I stumbled upon a Parallel.For loop that performs way better than a regular for loop for my purposes.
This is how I use it:
Parallel.For(0, values.Count, i => Products.Add(GetAllProductByID(values[i])));
It made my application work a lot faster, but still not fast enough. My question to you guys is:
Does Parallel.ForEach perform faster than Parallel.For?
Is there some "hybrid" method with which I can combine my Parallel.For loop to make it perform even faster (i.e. use more CPU power)? If yes, how?
Can someone help me out with this?
If you want to play with parallelism, I suggest using Parallel LINQ (PLINQ) instead of Parallel.For / Parallel.ForEach, e.g.
var Products = Enumerable
    .Range(0, values.Count)
    .AsParallel()
    //.WithDegreeOfParallelism(10) // <- if you want, say, 10 threads
    .Select(i => GetAllProductByID(values[i]))
    .ToList(); // <- this is thread safe now
With the help of the With* methods (e.g. WithDegreeOfParallelism) you can try tuning your implementation.
There are two related concepts: asynchronous programming and multithreading. Basically, to do things "in parallel" or asynchronously, you can either create new threads or work asynchronously on the same thread.
Keep in mind that either way you'll need some mechanism to prevent race conditions. From the Wikipedia article I linked to, a race condition is defined as follows:
A race condition or race hazard is the behavior of an electronic, software or other system where the output is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when events do not happen in the order the programmer intended.
As a few people have mentioned in the comments, you can't rely on the standard List class to be thread-safe - i.e. it might behave in unexpected ways if you're updating it from multiple threads. Microsoft now offers special "built-in" collection classes (in the System.Collections.Concurrent namespace) that behave in the expected way if you're updating them asynchronously or from multiple threads.
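As a hedged illustration, here's the asker's loop from above rewritten with one of those concurrent collections (Product, values and GetAllProductByID are the asker's own members, assumed here):

using System.Collections.Concurrent;
using System.Threading.Tasks;

// ConcurrentBag is safe to add to from multiple threads at once.
var products = new ConcurrentBag<Product>();
Parallel.For(0, values.Count, i =>
    products.Add(GetAllProductByID(values[i])));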
For well-documented libraries (and Microsoft's generally pretty good about this in their documentation), the documentation will often explicitly state whether the class or method in question is thread-safe. For example, in the documentation for System.Collections.Generic.List, it states the following:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
In terms of asynchronous programming (vs. multithreading), my standard illustration of this is as follows: suppose you go to a restaurant with 10 people. When the waiter comes by, the first person he asks for his order isn't ready; however, the other 9 people are. Thus, the waiter asks the other 9 people for their orders and then comes back to the original guy. (It's definitely not the case that they'll get a second waiter to wait for the original guy to be ready to order, and doing so probably wouldn't save much time anyway.) That's how async/await typically works (the exception being that some of the Task Parallel Library calls, like Task.Run(...), actually are executing on other threads - in our illustration, bringing in a second waiter - so make sure you check the documentation for which is which).
Basically, which you choose (asynchronously on the same thread or creating new threads) depends on whether you're trying to do something that's I/O-bound (i.e. you're just waiting for an operation to complete or for a result) or CPU-bound.
If your main purpose is to wait for a result from eBay, it would probably be better to work asynchronously in the same thread, as you may not get much of a performance benefit from using multithreading. Think back to our analogy: bringing in a second waiter just to wait for the first guy to be ready to order isn't necessarily any better than just having the waiter come back to him.
I'm not sitting in front of an IDE so forgive me if this syntax isn't perfect, but here's an approximate idea of what you can do:
public async Task GetResults(int[] productIDsToGet)
{
    var tasks = new List<Task>();
    foreach (int productID in productIDsToGet)
    {
        Task task = GetResultFromEbay(productID);
        tasks.Add(task);
    }
    // Wait for all of the tasks to complete
    await Task.WhenAll(tasks);
}

private async Task GetResultFromEbay(int productIdToGet)
{
    // Get result asynchronously from eBay
}
What intent is expressed here?
lock (Locker)
{
    Task.Factory.StartNew(() =>
    {
        foreach (var item in this.MyNonCurrentCollection)
        {
            // modify non-concurrent collection
        }
    }, CancellationToken.None, TaskCreationOptions.None, TaskScheduler.FromCurrentSynchronizationContext())
    .ContinueWith(t => this.RaisePropertyChanged("MyNonCurrentCollection"));
}
Will the system lock (queue) until the Task completes, or will the system lock only to start a new Task? The latter implies that this lock is kind of useless, right? I am just trying to discover the intent behind someone else's code. The ideal here is to protect MyNonCurrentCollection from being modified by another thread.
Will the system lock (queue) until the Task completes
No.
will the system lock only to start a new Task?
Yes.
The latter implies that this lock is kind of useless, right?
It would seem so, although you can't always be sure without seeing the full context. For example, sometimes I'll write code that needs to check if it should start a task, based on a resource that requires locking, thus locking around code that just starts the task might be appropriate. If you're not doing anything besides starting the task though, that's probably not the case.
The ideal here is to protect MyNonCurrentCollection from being modified by another thread.
This does nothing to prevent that.
Side note, modifying a collection inside of a foreach over that collection is a bad idea. Some collections will be nice enough to just throw some sort of concurrent modification exception. Less nice collections will just produce mangled results.
The system will lock until the task is instantiated and kicked off. Task.Factory.StartNew is asynchronous. Your lock should not be acquired for very long, even if the task takes a while.
Inside the task, you should actually be locking around the shared resource, not around the creation of the task. The lock will not have any effect on the safety of the resource unless the task completes extremely quickly and gets preemptively scheduled before the lock is exited.
This is a bug, yes.
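For illustration, here's a hedged sketch of a corrected version: the lock moves inside the task, around the actual use of the shared resource, and the collection is snapshotted before enumerating so the foreach doesn't observe its own modifications (Locker, MyNonCurrentCollection and RaisePropertyChanged are the original code's members; ToList requires System.Linq):

Task.Factory.StartNew(() =>
{
    lock (Locker) // every other writer must also lock on Locker
    {
        foreach (var item in this.MyNonCurrentCollection.ToList()) // snapshot
        {
            // modify non-concurrent collection
        }
    }
}, CancellationToken.None, TaskCreationOptions.None, TaskScheduler.FromCurrentSynchronizationContext())
.ContinueWith(t => this.RaisePropertyChanged("MyNonCurrentCollection"));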
I have several actions that I want to execute in the background, but they have to be executed synchronously one after the other.
I was wondering if it's a good idea to use the Task.ContinueWith method to achieve this. Do you foresee any problems with this?
My code looks something like this:
private object syncRoot = new object();
private Task latestTask;

public void EnqueueAction(System.Action action)
{
    lock (syncRoot)
    {
        if (latestTask == null)
            latestTask = Task.Factory.StartNew(action);
        else
            latestTask = latestTask.ContinueWith(tsk => action());
    }
}
There is one flaw with this, which I recently discovered myself because I am also using this method of ensuring tasks execute sequentially.
In my application I had thousands of instances of these mini-queues and quickly discovered I was having memory issues. Since these queues were often idle, I was holding onto the last completed task object for a long time and preventing garbage collection. Since the result object of the last completed task was often over 85,000 bytes, it was allocated on the Large Object Heap (which does not perform compaction during garbage collection). This resulted in fragmentation of the LOH and the process continuously growing in size.
As a hack to avoid this, you can schedule a no-op task right after the real one within your lock. For a real solution, I will need to move to a different method of controlling the scheduling.
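A hedged sketch of that hack, applied to the EnqueueAction method from the question (it matters most when the queued tasks are Task<T> instances carrying large results):

lock (syncRoot)
{
    var work = (latestTask == null)
        ? Task.Factory.StartNew(action)
        : latestTask.ContinueWith(tsk => action());

    // Keep only a result-free no-op continuation as the tail, so an idle
    // queue doesn't pin the previous task's (possibly large) result object.
    latestTask = work.ContinueWith(tsk => { });
}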
This should work as designed (using the fact that TPL will schedule the continuation immediately if the corresponding task has already completed).
Personally, in this case I would just use a dedicated thread drawing tasks from a concurrent queue (ConcurrentQueue) - this is more explicit but easier to follow when reading the code, especially if you want to find out, e.g., how many tasks are currently queued.
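A minimal sketch of that dedicated-thread approach, assuming BlockingCollection<T> (which wraps a ConcurrentQueue<T> by default and lets the worker block while the queue is idle):

using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class ActionQueue : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _worker;

    public ActionQueue()
    {
        _worker = new Thread(() =>
        {
            // Blocks until an item arrives or CompleteAdding is called,
            // so actions run strictly one after the other, in order.
            foreach (var action in _queue.GetConsumingEnumerable())
                action();
        }) { IsBackground = true };
        _worker.Start();
    }

    // Easy to inspect, e.g. to see how many tasks are currently queued.
    public int PendingCount { get { return _queue.Count; } }

    public void EnqueueAction(Action action)
    {
        _queue.Add(action);
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // let the worker drain and exit
        _worker.Join();
    }
}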
I used this snippet and it seems to work as designed.
The number of instances in my case does not run into the thousands, but stays in the single digits.
Nevertheless, no issues so far.
I would be interested in the ConcurrentQueue example, if there is one.
Thanks
I'm working on a big multi-threaded project. Yesterday I had a deadlock (my first one ever), and I traced it by adding Console.WriteLine("FunctionName: Lock on VariableName") and Console.WriteLine("FunctionName: Unlocking VariableName") calls. Adding all those was quite some work.
First of all, the program has a main loop that runs twice per second; that loop pulses some other threads to complete their work after the main loop has processed. Now what happened was that I had one thread in a wait state, waiting to be pulsed; when it was pulsed, it called another method that would also wait to get pulsed, but that pulse had already happened, and the thread wouldn't be pulsed again until the action actually completed.
Now what I want to do is override the Monitor.Enter and Monitor.Exit functions, without wrapping them in a class.
I've heard a lot about Reflection, but I have no idea how to apply it for this purpose. I know the easiest way to achieve it all is by just using a wrapper class, but then the lock keyword won't work anymore, and I'd have to convert all locks into Monitor.Enter followed by try { } finally { Monitor.Exit }, which is a huge amount of work.
So my question: How to override the Monitor.Enter and Monitor.Exit functions, while keeping access to the base function to do the actual lock?
And if that's impossible: How to override the lock statement to call my wrapper class instead of the Monitor.Enter and Monitor.Exit functions?
EDIT FOR CLARITY:
I'm asking this just to let me log when the locks happen, to make the debugging process easier. That also means I don't want to create my own locking mechanism; I just want to log when a lock is established and when it's released.
The logging code will also not be executed most of the time, only when I come across a threading problem.
It sounds like you're looking for lock helpers. Jon Skeet's MiscUtil has some:
http://www.yoda.arachsys.com/csharp/miscutil/usage/locking.html
The idea is that you replace your lock statements with using statements and thus preserve the try-finally structure:
class Example
{
    SyncLock padlock = new SyncLock();

    void Method1()
    {
        using (padlock.Lock())
        {
            // Now own the padlock
        }
    }

    void Method2()
    {
        using (padlock.Lock())
        {
            // Now own the padlock
        }
    }
}
With regards to deadlock prevention, the library offers a specialized ordered lock:
class Example
{
    OrderedLock inner = new OrderedLock("Inner");
    OrderedLock outer = new OrderedLock("Outer");

    Example()
    {
        outer.InnerLock = inner;
    }
}
Of course, you could extend Jon's helpers, or simply create your own (for logging purposes, etc). Check out the link above for more information.
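As a hedged sketch of the roll-your-own option (all names here are made up for illustration, not part of MiscUtil), a disposable wrapper can log acquire and release in one place instead of hand-writing WriteLine pairs at every call site:

using System;
using System.Threading;

public sealed class LoggingLock
{
    private readonly object _mutex = new object();
    private readonly string _name;

    public LoggingLock(string name) { _name = name; }

    public IDisposable Lock()
    {
        Console.WriteLine("Acquiring lock on " + _name + " ...");
        Monitor.Enter(_mutex);
        Console.WriteLine("Lock acquired on " + _name);
        return new Releaser(this);
    }

    private sealed class Releaser : IDisposable
    {
        private readonly LoggingLock _owner;
        public Releaser(LoggingLock owner) { _owner = owner; }
        public void Dispose()
        {
            Monitor.Exit(_owner._mutex); // must run on the acquiring thread
            Console.WriteLine("Lock released on " + _owner._name);
        }
    }
}

// Usage, mirroring the SyncLock pattern above:
// using (padlock.Lock()) { /* critical section */ }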
Don't do it! That sounds bonkers ;-)
A deadlock occurs when 2 (or more) threads are all waiting to simultaneously hold 2 (or more) locks, and each thread holds one lock while waiting for the other.
You can often redesign your code so each thread only requires a single lock - which makes deadlock impossible.
Failing that, you can make a thread give up the first lock if it can't acquire the second lock.
That's a very bad idea. I've never had to override Monitor.Enter / Exit or lock to overcome a deadlock. Please consider redesigning your code!
For example, use ManualResetEvent for the pulsing.
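A minimal sketch of that suggestion: unlike Monitor.Pulse, a ManualResetEvent latches its signal, so a pulse that fires before the wait begins is not lost - which is exactly the failure mode described in the question:

using System.Threading;

var workReady = new ManualResetEvent(false); // starts unsignaled

// Signaling thread: the event stays set until it is explicitly reset.
workReady.Set();

// Waiting thread: returns immediately if Set already happened.
workReady.WaitOne();
workReady.Reset(); // re-arm for the next cycle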
Alright...I've given the site a fair search and have read over many posts about this topic. I found this question: Code for a simple thread pool in C# especially helpful.
However, as it always seems, what I need varies slightly.
I have looked over the MSDN example and adapted it to my needs somewhat. The example I refer to is here: http://msdn.microsoft.com/en-us/library/3dasc8as(VS.80,printer).aspx
My issue is this. I have a fairly simple set of code that loads a web page via the HttpWebRequest and WebResponse classes and reads the results via a Stream. I fire off this method in a thread as it will need to be executed many times. The method itself is pretty short, but the number of times it needs to be fired (with varied data each time) varies. It can be anywhere from 1 to 200.
Everything I've read seems to indicate the ThreadPool class being the prime candidate. Here is where things get tricky. I might need to fire off this thing, say, 100 times, but I can only have 3 threads at most running (for this particular task).
I've tried setting the MaxThreads on the ThreadPool via:
ThreadPool.SetMaxThreads(3, 3);
I'm not entirely convinced this approach is working. Furthermore, I don't want to clobber other web sites or programs running on the system this will be running on. So, by limiting the # of threads on the ThreadPool, can I be certain that this pertains to my code and my threads only?
The MSDN example uses the event-driven approach and calls WaitHandle.WaitAll(doneEvents); which is how I'm doing this.
So the heart of my question is, how does one ensure or specify a maximum number of threads that can be run for their code, but have the code keep running more threads as the previous ones finish up until some arbitrary point? Am I tackling this the right way?
Sincerely,
Jason
Okay, I've added a semaphore approach and completely removed the ThreadPool code. It seems simple enough. I got my info from: http://www.albahari.com/threading/part2.aspx
It's this example that showed me how:
[text below here is a copy/paste from the site]
A Semaphore with a capacity of one is similar to a Mutex or lock, except that the Semaphore has no "owner" – it's thread-agnostic. Any thread can call Release on a Semaphore, while with Mutex and lock, only the thread that obtained the resource can release it.
In the following example, ten threads execute a loop with a Sleep statement in the middle. A Semaphore ensures that no more than three threads can execute that Sleep statement at once:
class SemaphoreTest
{
    static Semaphore s = new Semaphore(3, 3); // Available=3; Capacity=3

    static void Main()
    {
        for (int i = 0; i < 10; i++)
            new Thread(Go).Start();
    }

    static void Go()
    {
        while (true)
        {
            s.WaitOne();
            Thread.Sleep(100); // Only 3 threads can get here at once
            s.Release();
        }
    }
}
Note: if you are limiting this to "3" just so you don't overwhelm the machine running your app, I'd make sure this is a problem first. The threadpool is supposed to manage this for you. On the other hand, if you don't want to overwhelm some other resource, then read on!
You can't manage the size of the threadpool (or really much of anything about it).
In this case, I'd use a semaphore to manage access to your resource. In your case, your resource is running the web scrape, or calculating some report, etc.
To do this, in your static class, create a semaphore object:
System.Threading.Semaphore S = new System.Threading.Semaphore(3, 3);
Then, in each thread, you do this:
// wait your turn (decrement the shared S declared above;
// do NOT create a new Semaphore per thread, or the limit won't be shared)
S.WaitOne();
try
{
    // do your thing
}
finally
{
    // release so others can go (increment)
    S.Release();
}
Each thread will block on the S.WaitOne() until it is given the signal to proceed. Once S has been decremented 3 times, all threads will block until one of them increments the counter.
This solution isn't perfect.
If you want something a little cleaner, and more efficient, I'd recommend going with a BlockingQueue approach wherein you enqueue the work you want performed into a global Blocking Queue object.
Meanwhile, you have three threads (which you created yourself - not from the thread pool) popping work out of the queue to perform. This isn't that tricky to set up and is very fast and simple.
Examples:
Best threading queue example / best practice
Best method to get objects from a BlockingQueue in a concurrent program?
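For completeness, a hedged sketch of that three-worker setup applied to this question (FetchPage is a placeholder for the asker's HttpWebRequest/WebResponse logic):

using System.Collections.Concurrent;
using System.Threading;

class ThrottledFetcher
{
    // BlockingCollection wraps a ConcurrentQueue by default.
    static readonly BlockingCollection<string> Urls = new BlockingCollection<string>();

    static void Main()
    {
        for (int i = 0; i < 3; i++) // three workers: the limit from the question
        {
            new Thread(() =>
            {
                // At most three fetches run at once; workers exit when
                // the queue is drained after CompleteAdding.
                foreach (string url in Urls.GetConsumingEnumerable())
                    FetchPage(url);
            }).Start();
        }

        // Producer side: enqueue however many requests there are (1 to 200).
        Urls.Add("http://example.com/page1");
        Urls.CompleteAdding();
    }

    // Placeholder for the asker's HttpWebRequest/WebResponse code.
    static void FetchPage(string url) { /* ... */ }
}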
It's a static class like any other, which means that anything you do with it affects every other thread in the current process. It doesn't affect other processes.
I consider this one of the larger design flaws in .NET, however. Who came up with the brilliant idea of making the thread pool static? As your example shows, we often want a thread pool dedicated to our task, without having it interfere with unrelated tasks elsewhere in the system.