The given code attempts to run a fire-and-forget Task while acquiring a lock to avoid race conditions on a cached element.
While debugging the whole application I noticed bizarre behavior: the same breakpoint was hit more than once with the same key. That said, I'm not sure whether my assumptions hold, namely that ConcurrentDictionary guarantees only one writer per element, and that locking inside the Task.Run block ensures that this Task, and only this Task, has access to the element itself.
Task.Run(() =>
{
lock (GetLockContext(key, region))
{
try
{
object newValue = cacheInterceptor.Intercept<T>(ctx);
Put(refreshContext.Key, newValue, refreshContext.Region, refreshAction);
}
catch (Exception ex)
{
refreshContext.UpdteRefreshNeeded(true);
LogError("HANDLE_REFRESH", null, null, ex);
}
}
});
private object GetLockContext(string key, string region)
{
string ctx = key + region;
object lckCtx = CTX_REPO.GetOrAdd(ctx, (dontcare) => new object());
return lckCtx;
}
private static readonly ConcurrentDictionary<string, object> CTX_REPO = new ConcurrentDictionary<string, object>();
The lock (GetLockContext(key, region)) should guarantee that only one thread in the current process will enter the critical region for the same key/region combination. The only problems that I can think of are:
The key + region generates the same combination for different key/region pairs. For example "ab" + "c" and "a" + "bc".
Your application is hosted, and the host starts a new process before the currently running process has terminated.
As a side note instead of locking synchronously you could consider using an asynchronous locker like the SemaphoreSlim class, in order to avoid blocking ThreadPool threads. You could check out this question for ideas:
Asynchronous locking based on a key.
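As a rough sketch of both suggestions above (a collision-free key and an asynchronous lock), something like the following could work. KeyedLock and RunExclusiveAsync are hypothetical names, not part of the original code, and the semaphores are never removed from the dictionary, just like the lock objects in CTX_REPO.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original code.
public static class KeyedLock
{
    // A value tuple keeps key and region separate, so ("ab", "c") and ("a", "bc")
    // map to different semaphores, unlike the concatenated string key + region.
    private static readonly ConcurrentDictionary<(string Key, string Region), SemaphoreSlim> Locks =
        new ConcurrentDictionary<(string Key, string Region), SemaphoreSlim>();

    public static async Task RunExclusiveAsync(string key, string region, Func<Task> action)
    {
        SemaphoreSlim gate = Locks.GetOrAdd((key, region), _ => new SemaphoreSlim(1, 1));

        // WaitAsync queues the caller without blocking a ThreadPool thread.
        await gate.WaitAsync().ConfigureAwait(false);
        try
        {
            await action().ConfigureAwait(false);
        }
        finally
        {
            // Always release, even if the action throws.
            gate.Release();
        }
    }
}
```

Unlike lock, a SemaphoreSlim is not reentrant, so the action must not call RunExclusiveAsync again for the same key/region pair.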
Sometimes I encounter async/await code that accesses fields of an object. For example this snippet of code from the Stateless project:
private readonly Queue<QueuedTrigger> _eventQueue = new Queue<QueuedTrigger>();
private bool _firing;
async Task InternalFireQueuedAsync(TTrigger trigger, params object[] args)
{
if (_firing)
{
_eventQueue.Enqueue(new QueuedTrigger { Trigger = trigger, Args = args });
return;
}
try
{
_firing = true;
await InternalFireOneAsync(trigger, args).ConfigureAwait(false);
while (_eventQueue.Count != 0)
{
var queuedEvent = _eventQueue.Dequeue();
await InternalFireOneAsync(queuedEvent.Trigger, queuedEvent.Args).ConfigureAwait(false);
}
}
finally
{
_firing = false;
}
}
If I understand correctly, the await **.ConfigureAwait(false) indicates that the code executed after this await does not necessarily have to run on the same context. So the while loop here could be executed on a ThreadPool thread. I don't see what is making sure that the _firing and _eventQueue fields are synchronized; for example, what is creating a lock/memory-fence/barrier here? So my question is: do I need to make the fields thread-safe, or is something in the async/await structure taking care of this?
Edit: to clarify my question: in this case InternalFireQueuedAsync should always be called on the same thread. In that case only the continuation could run on a different thread, which makes me wonder whether I need synchronization mechanisms (like an explicit barrier) to make sure the values are synchronized, to avoid the issue described here: http://www.albahari.com/threading/part4.aspx
Edit 2: there is also a small discussion at stateless:
https://github.com/dotnet-state-machine/stateless/issues/294
I don't see what is making sure that the _firing and _eventQueue fields are synchronized; for example, what is creating a lock/memory-fence/barrier here? So my question is: do I need to make the fields thread-safe, or is something in the async/await structure taking care of this?
await will ensure all necessary memory barriers are in place. However, that doesn't make them "thread-safe".
in this case InternalFireQueuedAsync should always be called on the same thread.
Then _firing is fine, and doesn't need volatile or anything like that.
However, the usage of _eventQueue is incorrect. Consider what happens when a thread pool thread has resumed the code after the await: it is entirely possible that Queue<T>.Count or Queue<T>.Dequeue() will be called by a thread pool thread at the same time Queue<T>.Enqueue is called by the main thread. This is not threadsafe.
If the main thread calling InternalFireQueuedAsync is a thread with a single-threaded context (such as a UI thread), then one simple fix is to remove all the instances of ConfigureAwait(false) in this method.
To be safe, you should mark field _firing as volatile - that will guarantee the memory barrier and be sure that the continuation part, which might run on a different thread, will read the correct value. Without volatile, the compiler, the CLR or the JIT compiler, or even the CPU may do some optimizations that cause the code to read a wrong value for it.
As for _eventQueue, you don't modify the field, so marking it as volatile is useless. If only one thread calls 'InternalFireQueuedAsync', you don't access it from multiple threads concurrently, so you are ok.
However, if multiple threads call InternalFireQueuedAsync, you will need to use a ConcurrentQueue instead, or lock your access to _eventQueue. In that case you should also lock your access to _firing, access it using Interlocked, or replace it with a ManualResetEvent.
ConfigureAwait(false) means that the Context is not captured to run the continuation. Using the Thread Pool Context does not mean that continuations are run in parallel. Using await before and within the while loop ensures that the code (continuations) are run sequentially so no need to lock in this case.
You may, however, have a race condition when checking the _firing value.
Use lock or ConcurrentQueue.
Solution with lock:
private readonly Queue<QueuedTrigger> _eventQueue = new Queue<QueuedTrigger>();
private bool _firing;
private object _eventQueueLock = new object();
async Task InternalFireQueuedAsync(TTrigger trigger, params object[] args)
{
if (_firing)
{
lock(_eventQueueLock)
_eventQueue.Enqueue(new QueuedTrigger { Trigger = trigger, Args = args });
return;
}
try
{
_firing = true;
await InternalFireOneAsync(trigger, args).ConfigureAwait(false);
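while (true)
{
QueuedTrigger queuedEvent;
// Hold the lock only while touching the queue; await is not allowed inside a lock block.
lock(_eventQueueLock)
{
if (_eventQueue.Count == 0)
break;
queuedEvent = _eventQueue.Dequeue();
}
await InternalFireOneAsync(queuedEvent.Trigger, queuedEvent.Args).ConfigureAwait(false);
}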
}
finally
{
_firing = false;
}
}
Solution with ConcurrentQueue:
private readonly ConcurrentQueue<QueuedTrigger> _eventQueue = new ConcurrentQueue<QueuedTrigger>();
private bool _firing;
async Task InternalFireQueuedAsync(TTrigger trigger, params object[] args)
{
if (_firing)
{
_eventQueue.Enqueue(new QueuedTrigger { Trigger = trigger, Args = args });
return;
}
try
{
_firing = true;
await InternalFireOneAsync(trigger, args).ConfigureAwait(false);
// TryDequeue is atomic, so neither a lock nor a separate Count check is needed.
while (_eventQueue.TryDequeue(out QueuedTrigger queuedEvent))
{
await InternalFireOneAsync(queuedEvent.Trigger, queuedEvent.Args).ConfigureAwait(false);
}
}
finally
{
_firing = false;
}
}
Please tell me whether my understanding is correct.
1. A different thread cannot enter the same critical section using the same lock just because the first thread called Monitor.Wait, right? The Wait method only allows a different thread to acquire the same monitor, i.e. the same synchronization lock, but only for a different critical section and never for the same critical section.
Is my understanding correct?
Because if the Wait method meant that anyone can now enter this same critical section using this same lock, then that would defeat the whole purpose of synchronization, right?
So, in the code below (written in notepad, so please forgive any typos), ThreadProc2 can only use syncLock to enter the code in ThreadProc2 and not in ThreadProc1 while a previous thread that held and subsequently relinquished the lock was executing ThreadProc1, right?
2. Two or more threads can use the same synchronization lock to run different pieces of code at the same time, right? Same question as above, basically, but just confirming for the sake of symmetry with point 3 below.
3. Two or more threads can use a different synchronization lock to run the same piece of code, i.e. to enter the same critical section.
class Foo
{
private static object syncLock = new object();
public void ThreadProc1()
{
try
{
Monitor.Enter(syncLock);
Monitor.Wait(syncLock);
Thread.Sleep(1000);
}
finally
{
if (Monitor.IsEntered(syncLock))
{
Monitor.Exit(syncLock);
}
}
}
public void ThreadProc2()
{
bool acquired = false;
try
{
// Calling TryEnter instead of
// Enter just for the sake of variety
Monitor.TryEnter(syncLock, ref acquired);
if (acquired)
{
Thread.Sleep(200);
Monitor.Pulse(syncLock);
}
}
finally
{
if (acquired)
{
Monitor.Exit(syncLock);
}
}
}
}
Update
The following illustration confirms that #3 is correct although I don't think it will be a nice thing to do.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
namespace DifferentSyncLockSameCriticalSection
{
class Program
{
static void Main(string[] args)
{
var sathyaish = new Person { Name = "Sathyaish Chakravarthy" };
var superman = new Person { Name = "Superman" };
var tasks = new List<Task>();
// Must not lock on string so I am using
// an object of the Person class as a lock
tasks.Add(Task.Run( () => { Proc1(sathyaish); } ));
tasks.Add(Task.Run(() => { Proc1(superman); }));
Task.WaitAll(tasks.ToArray());
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
static void Proc1(object state)
{
// Although this would be a very bad practice
lock(state)
{
try
{
Console.WriteLine((state.ToString()).Length);
}
catch(Exception ex)
{
Console.WriteLine(ex.Message);
}
}
}
}
class Person
{
public string Name { get; set; }
public override string ToString()
{
return Name;
}
}
}
When a thread calls Monitor.Wait it is suspended and the lock released. This will allow another thread to acquire the lock, update some state, and then call Monitor.Pulse in order to communicate to other threads that something has happened. You must have acquired the lock in order to call Pulse. Before Monitor.Wait returns the framework will reacquire the lock for the thread that called Wait.
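A small sketch may make the release-and-reacquire behavior concrete. Here two threads run the same method, so they compete for the same lock block; the class and member names are made up for illustration. If Wait did not release the lock, the second thread could never get in and the program would hang.

```csharp
using System.Threading;

// Hypothetical demo class, not taken from the question.
public static class WaitReleasesLockDemo
{
    private static readonly object Gate = new object();
    private static bool _signaled;
    private static int _entered;

    // Both threads run this same method, so they contend for the same lock block.
    private static void Section()
    {
        lock (Gate)
        {
            _entered++;
            if (_entered == 1)
            {
                // First thread in: Wait releases Gate and suspends this thread.
                while (!_signaled)
                {
                    Monitor.Wait(Gate);
                }
                // Wait has reacquired Gate before returning.
            }
            else
            {
                // The second thread is inside the same block while the first sits in Wait.
                _signaled = true;
                Monitor.Pulse(Gate);
            }
        }
    }

    // Returns how many threads entered the lock block.
    public static int Run()
    {
        _signaled = false;
        _entered = 0;
        var first = new Thread(Section);
        var second = new Thread(Section);
        first.Start();
        second.Start();
        first.Join();
        second.Join();
        return _entered;
    }
}
```

Run() returns 2: the monitor is tied to the object, not to a particular piece of code, so once Wait has released it any thread can acquire it from any lock statement on that object.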
In order for two threads to communicate with each other they need to use the same synchronization primitive. In your example you've used a monitor, but you usually need to combine this with some kind of test that the Wait returned in response to a Pulse. This is because it is technically possible for Wait to return even if Pulse wasn't called (although this doesn't happen in practice).
It's also worth remembering that a call to Pulse isn't "sticky", so if nobody is waiting on the monitor then Pulse does nothing and a subsequent call to Wait will miss the fact that Pulse was called. This is another reason why you tend to record the fact that something has been done before calling Pulse (see the example below).
It's perfectly valid for two different threads to use the same lock to run different bits of code - in fact this is the typical use-case. For example, one thread acquires the lock to write some data and another thread acquires the lock to read the data. However, it's important to realize that they don't run at the same time. The act of acquiring the lock prevents another thread from acquiring the same lock, so any thread attempting to acquire the lock when it is already locked will block until the other thread releases the lock.
In point 3 you ask:
Two or more threads can use a different synchronization lock to run
the same piece of code, i.e. to enter the same critical section.
However, if two threads are using different locks then they are not entering the same critical section. The critical section is denoted by the lock that protects it - if they're different locks then they are different sections that just happen to access some common data within the section. You should avoid doing this as it can lead to some difficult to debug data race conditions.
Your code is a bit over-complicated for what you're trying to accomplish. For example, let's say we've got 2 threads, and one will signal when there is data available for another to process:
class Foo
{
private readonly object syncLock = new object();
private bool dataAvailable = false;
public void ThreadProc1()
{
lock(syncLock)
{
while(!dataAvailable)
{
// Release the lock and suspend
Monitor.Wait(syncLock);
}
// Now process the data
}
}
public void ThreadProc2()
{
LoadData();
lock(syncLock)
{
dataAvailable = true;
Monitor.Pulse(syncLock);
}
}
private void LoadData()
{
// Gets some data
}
}
I am using the Impersonator class (see http://www.codeproject.com/KB/cs/zetaimpersonator.aspx) to switch the user context at runtime.
At the same time, I am now restructuring my program from a single-threaded design to a multi-threaded one (using TPL, mainly the Task type).
As this impersonation happens through native API functions at the thread level, I was wondering how far TPL is compatible with it. If I change the user context inside a task, is that user context still set when the task is finished and the thread returns to the ThreadPool? Will other tasks started inside this task implicitly use that context?
I tried to find out by myself with unit testing, and my deduction from the first unit test:
Tasks started inside a thread while impersonated "magically" inherit the user context.
The inherited impersonation is not revoked when the origin task/thread does its Impersonation.Undo().
The second unit test shows that if the impersonation is not explicitly revoked, the user context "survives" on the thread returning to the thread pool, and other following tasks may now be randomly run in different user contexts, depending on the thread they are assigned to.
My question: Is there a better way to realize impersonation than via native API calls? Maybe one that is more focused on TPL and bound to a task instead of a thread? If there is a chance to mitigate the risk of executing tasks in a random context, I would gladly take it.
These are the 2 unit tests I wrote. You will have to modify the code slightly to use your own mechanism for receiving user credentials if you want to run the tests yourself, and the log4net calls are easily removed.
Yeah, I know, Thread.Sleep() is bad style, I am guilty of having been lazy there... ;-)
private string RetrieveIdentityUser()
{
var windowsIdentity = WindowsIdentity.GetCurrent();
if (windowsIdentity != null)
{
return windowsIdentity.Name;
}
return null;
}
[TestMethod]
[TestCategory("LocalTest")]
public void ThreadIdentityInheritanceTest()
{
string user;
string pw;
Security.Decode(CredentialsIdentifier, out user, out pw);
string userInMainThread = RetrieveIdentityUser();
string userInTask1BeforeImpersonation = null;
string userInTask1AfterImpersonation = null;
string userInTask2 = null;
string userInTask3 = null;
string userInTask2AfterImpersonationUndo = null;
var threadlock = new object();
lock (threadlock)
{
new Task(
() =>
{
userInTask1BeforeImpersonation = RetrieveIdentityUser();
using (new Impersonator(user, Domain, pw))
{
userInTask1AfterImpersonation = RetrieveIdentityUser();
lock (threadlock)
{
Monitor.Pulse(threadlock);
}
new Task(() =>
{
userInTask2 = RetrieveIdentityUser();
Thread.Sleep(200);
userInTask2AfterImpersonationUndo = RetrieveIdentityUser();
}).Start();
Thread.Sleep(100);
}
}).Start();
Monitor.Wait(threadlock);
RetrieveIdentityUser();
new Task(() => { userInTask3 = RetrieveIdentityUser(); }).Start();
Thread.Sleep(300);
Assert.IsNotNull(userInMainThread);
Assert.IsNotNull(userInTask1BeforeImpersonation);
Assert.IsNotNull(userInTask1AfterImpersonation);
Assert.IsNotNull(userInTask2);
Assert.IsNotNull(userInTask3);
// context in both threads equal before impersonation
Assert.AreEqual(userInMainThread, userInTask1BeforeImpersonation);
// context has changed in task1
Assert.AreNotEqual(userInTask1BeforeImpersonation, userInTask1AfterImpersonation);
// impersonation to the expected user
Assert.AreEqual(Domain + "\\" + user, userInTask1AfterImpersonation);
// impersonation is inherited
Assert.AreEqual(userInTask1AfterImpersonation, userInTask2);
// a newly started task from the main thread still shows original user context
Assert.AreEqual(userInMainThread, userInTask3);
// inherited impersonation is not revoked
Assert.AreEqual(userInTask2, userInTask2AfterImpersonationUndo);
}
}
[TestMethod]
[TestCategory("LocalTest")]
public void TaskImpersonationTest()
{
int tasksToRun = 100; // must be more than the minimum thread count in ThreadPool
string userInMainThread = RetrieveIdentityUser();
var countdownEvent = new CountdownEvent(tasksToRun);
var exceptions = new List<Exception>();
object threadLock = new object();
string user;
string pw;
Security.Decode(CredentialsIdentifier, out user, out pw);
for (int i = 0; i < tasksToRun; i++)
{
new Task(() =>
{
try
{
try
{
Logger.DebugFormat("Executing task {0} on thread {1}...", Task.CurrentId, Thread.CurrentThread.GetHashCode());
Assert.AreEqual(userInMainThread, RetrieveIdentityUser());
//explicitly not disposing impersonator / reverting impersonation
//to see if a thread reused by TPL has its user context reset
// ReSharper disable once UnusedVariable
var impersonator = new Impersonator(user, Domain, pw);
Assert.AreEqual(Domain + "\\" + user, RetrieveIdentityUser());
}
catch (Exception e)
{
lock (threadLock)
{
var newException = new Exception(string.Format("Task {0} on Thread {1}: {2}", Task.CurrentId, Thread.CurrentThread.GetHashCode(), e.Message));
exceptions.Add(newException);
Logger.Error(newException);
}
}
}
finally
{
countdownEvent.Signal();
}
}).Start();
}
if (!countdownEvent.Wait(TimeSpan.FromSeconds(5)))
{
throw new TimeoutException();
}
Assert.IsTrue(exceptions.Any());
Assert.AreEqual(typeof(AssertFailedException), exceptions.First().InnerException.GetType());
}
}
The WindowsIdentity (which is changed through impersonation) is stored in a SecurityContext. You can determine how this impersonation "flows" in various ways. Since you mentioned that you are using P/Invoke, note the caveat in the SecurityContext documentation:
The common language runtime (CLR) is aware of impersonation operations
performed using only managed code, not of impersonation performed
outside of managed code, such as through platform invoke...
However, I'm not entirely certain that the impersonation is really inherited from one task to the other here, or whether the behavior you are observing is due to task inlining. Under certain circumstances, a new task might execute synchronously on the same thread-pool thread. You can find an excellent discussion of this here.
Nonetheless, if you want to make sure that certain tasks are always running under impersonation, while others don't, may I suggest looking into custom Task Schedulers. There is documentation on MSDN on how to write your own, including code-samples for a few common types of schedulers that you can use as a starting point.
Since impersonation is a per-thread setting, you could have your own task scheduler that keeps around one thread (or a few threads) that are running under impersonation when they execute tasks. This could also reduce the number of times you have to switch in and out of impersonation when you have many small units of work.
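A minimal sketch of such a scheduler, assuming a single dedicated thread, could look like this. DedicatedThreadTaskScheduler and its threadSetup hook are hypothetical names; the hook is where the impersonation call would go, and nothing here is specific to the Impersonator class.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: every task queued to this scheduler runs on one dedicated thread.
public sealed class DedicatedThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private readonly Thread _thread;

    // threadSetup is an assumed hook; impersonation would be switched on here, once.
    public DedicatedThreadTaskScheduler(Action threadSetup)
    {
        _thread = new Thread(() =>
        {
            threadSetup();
            // Blocks until a task is queued; ends after CompleteAdding is called.
            foreach (Task task in _tasks.GetConsumingEnumerable())
            {
                TryExecuteTask(task);
            }
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    public int ThreadId => _thread.ManagedThreadId;

    protected override void QueueTask(Task task) => _tasks.Add(task);

    // Allow inlining only when the caller is already on the dedicated thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return Thread.CurrentThread == _thread && TryExecuteTask(task);
    }

    protected override IEnumerable<Task> GetScheduledTasks() => _tasks.ToArray();

    public void Dispose() => _tasks.CompleteAdding();
}
```

Tasks are directed to it with Task.Factory.StartNew(work, CancellationToken.None, TaskCreationOptions.None, scheduler). Note that Task.Run always uses the default scheduler, so work started that way would still land on ordinary ThreadPool threads.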
Here is the scenario:
I have a proxy that is shared among all the threads.
If this proxy gets blocked, then only ONE thread needs to dequeue a proxy from ProxyQueue, not all of them.
For dequeuing I am using Interlocked right now, so that only one thread at a time can enter the function.
private static volatile string httpProxy = "1.1.1.1";
private static int usingResource = 0;
string localHttpProxy;
try
{
HttpWebRequest oReqReview = (HttpWebRequest)WebRequest.Create(url);
if (IsHttpProxyDequeue)
{
oReqReview.Proxy = new WebProxy(httpProxy, 8998);
localHttpProxy = httpProxy;
}
HttpWebResponse respReview = (HttpWebResponse)oReqReview.GetResponse();
DoSomthing();
}
catch (Exception ex)
{
if (0 == Interlocked.Exchange(ref usingResource, 1))
{
if (ex.Message == "The remote server returned an error: (403) Forbidden." && httpProxy == localHttpProxy)
{
IsHttpProxyDequeue = currentQueueProxy.TryDequeue(out httpProxy);
}
Interlocked.Exchange(ref usingResource, 0);
}
}
Interlocked.Exchange does not block. It merely performs the swap of the value and reports the results. If the initial value of usingResource is 0 and three threads hit Interlocked.Exchange at exactly the same time, on one thread the Exchange() will return zero and set usingResource to 1, and on the other two threads Exchange() will return 1. Threads 2 and 3 will immediately continue executing with the first statement following the if block.
If you want threads 2 and 3 to block waiting for thread one to finish, then you should use something like a mutex lock, like the C# lock(object) syntax. Locks block threads.
Interlocked.Exchange does not block threads. Interlocked.Exchange is useful when writing non-blocking thread coordination. Interlocked.Exchange says "If I get the special value from this swap I'll take a detour and do this special operation, otherwise I'll just continue doing this other thing without waiting."
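To illustrate the point, here is a small sketch in which many callers hit Interlocked.Exchange on the same flag. ExchangeDemo and CountWinners are hypothetical names; the flag is deliberately never reset, so exactly one caller can observe the old value 0.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not taken from the question.
public static class ExchangeDemo
{
    private static int _usingResource;

    // Returns how many of the concurrent callers saw the old value 0.
    public static int CountWinners(int callers)
    {
        _usingResource = 0;
        int winners = 0;
        Parallel.For(0, callers, _ =>
        {
            // Never blocks: every caller returns immediately with the previous value.
            if (Interlocked.Exchange(ref _usingResource, 1) == 0)
            {
                Interlocked.Increment(ref winners);
            }
            // Callers that read 1 simply fall through; they do not wait for the winner.
        });
        return winners;
    }
}
```

CountWinners returns 1 regardless of the number of callers. In the question's code the flag is set back to 0 afterwards, which is what lets a later thread win again.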
Interlocked does provide synchronization on that value, so if multiple threads reach that point at the same time, only one of them will get a 0 back. All others will get a 1 back until the value is set back to 0.
You have a race condition in your code, which is probably what's causing the problem. Consider this sequence of events:
Thread A sees that `IsProxyDequeue` is `false`
Thread A calls `Interlocked.Exchange` and gets a value of '0'
Thread A logs the error
Thread B sees that `IsProxyDequeue` is `false`
Thread A dequeues the proxy and sets `usingResource` back to `0`
Thread B calls `Interlocked.Exchange` and gets a value of `0`
At this point, Thread B is also going to dequeue the proxy.
You'll need to come up with a different way of providing the synchronization. I suspect you'll want something like:
object lockObj = new object();
lock (lockObj)
{
if (!IsProxyDequeue)
{
// get the proxy
IsProxyDequeue = true;
}
oReqReview.Proxy = new WebProxy(httpProxy, 8989);
}
If you want to avoid the race condition, but you don't want other threads to block, then use Monitor.TryEnter rather than lock.
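A hedged sketch of the Monitor.TryEnter variant follows. ProxySwitcher and TrySwap are made-up names, and the swap delegate stands in for the dequeue logic, which should re-check inside the lock that the shared proxy is still the one that failed.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not taken from the question.
public static class ProxySwitcher
{
    private static readonly object SwapLock = new object();

    // Runs swap if no other thread is currently swapping; otherwise returns false at once.
    public static bool TrySwap(Action swap)
    {
        bool taken = false;
        try
        {
            // Does not wait: taken stays false when another thread holds the lock.
            Monitor.TryEnter(SwapLock, ref taken);
            if (!taken)
            {
                return false;
            }
            swap();
            return true;
        }
        finally
        {
            if (taken)
            {
                Monitor.Exit(SwapLock);
            }
        }
    }
}
```

Threads that get false back can simply retry their request with whatever proxy is current at that point.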
I have a search application that takes some time (10 to 15 seconds) to return results for some requests. It's not uncommon to have multiple concurrent requests for the same information. As it stands, I have to process those independently, which makes for quite a bit of unnecessary processing.
I've come up with a design that should allow me to avoid the unnecessary processing, but there's one lingering problem.
Each request has a key that identifies the data being requested. I maintain a dictionary of requests, keyed by the request key. The request object has some state information and a WaitHandle that is used to wait on the results.
When a client calls my Search method, the code checks the dictionary to see if a request already exists for that key. If so, the client just waits on the WaitHandle. If no request exists, I create one, add it to the dictionary, and issue an asynchronous call to get the information. Again, the code waits on the event.
When the asynchronous process has obtained the results, it updates the request object, removes the request from the dictionary, and then signals the event.
This all works great. Except I don't know when to dispose of the request object. That is, since I don't know when the last client is using it, I can't call Dispose on it. I have to wait for the garbage collector to come along and clean up.
Here's the code:
class SearchRequest: IDisposable
{
public readonly string RequestKey;
public string Results { get; set; }
public ManualResetEvent WaitEvent { get; private set; }
public SearchRequest(string key)
{
RequestKey = key;
WaitEvent = new ManualResetEvent(false);
}
public void Dispose()
{
WaitEvent.Dispose();
GC.SuppressFinalize(this);
}
}
ConcurrentDictionary<string, SearchRequest> Requests = new ConcurrentDictionary<string, SearchRequest>();
string Search(string key)
{
SearchRequest req;
bool addedNew = false;
req = Requests.GetOrAdd(key, (s) =>
{
// Create a new request.
var r = new SearchRequest(s);
Console.WriteLine("Added new request with key {0}", key);
addedNew = true;
return r;
});
if (addedNew)
{
// A new request was created.
// Start a search.
ThreadPool.QueueUserWorkItem((obj) =>
{
// Get the results
req.Results = DoSearch(req.RequestKey); // DoSearch takes several seconds
// Remove the request from the pending list
SearchRequest trash;
Requests.TryRemove(req.RequestKey, out trash);
// And signal that the request is finished
req.WaitEvent.Set();
});
}
Console.WriteLine("Waiting for results from request with key {0}", key);
req.WaitEvent.WaitOne();
return req.Results;
}
Basically, I don't know when the last client will be released. No matter how I slice it here, I have a race condition. Consider:
Thread A Creates a new request, starts Thread B, and waits on the wait handle.
Thread B Begins processing the request.
Thread C detects that there's a pending request, and then gets swapped out.
Thread B Completes the request, removes the item from the dictionary, and sets the event.
Thread A's wait is satisfied, and it returns the result.
Thread C wakes up, calls WaitOne, is released, and returns the result.
If I use some kind of reference counting so that the "last" client calls Dispose, then the object would be disposed by Thread A in the above scenario. Thread C would then die when it tried to wait on the disposed WaitHandle.
The only way I can see to fix this is to use a reference counting scheme and protect access to the dictionary with a lock (in which case using ConcurrentDictionary is pointless) so that a lookup is always accompanied by an increment of the reference count. Whereas that would work, it seems like an ugly hack.
Another solution would be to ditch the WaitHandle and use an event-like mechanism with callbacks. But that, too, would require me to protect the lookups with a lock, and I have the added complication of dealing with an event or a naked multicast delegate. That seems like a hack, too.
This probably isn't a problem currently, because this application doesn't yet get enough traffic for those abandoned handles to add up before the next GC pass comes and cleans them up. And maybe it won't ever be a problem? It worries me, though, that I'm leaving them to be cleaned up by the GC when I should be calling Dispose to get rid of them.
Ideas? Is this a potential problem? If so, do you have a clean solution?
Consider using Lazy<T> for SearchRequest.Results maybe? But that would probably entail a bit of redesign. Haven't thought this out completely.
But what would probably be almost a drop-in replacement for your use case is to implement your own Wait() and Set() methods in SearchRequest. Something like:
readonly object _resultLock = new object();
bool _hasResult;
void Wait()
{
lock(_resultLock)
{
while (!_hasResult)
Monitor.Wait(_resultLock);
}
}
void Set(string results)
{
lock(_resultLock)
{
Results = results;
_hasResult = true;
Monitor.PulseAll(_resultLock);
}
}
No need to dispose. :)
I think that your best bet to make this work is to use the TPL for all of your multi-threading needs. That's what it is good at.
As per my comment on your question, you need to keep in mind that ConcurrentDictionary does have side-effects. If multiple threads try to call GetOrAdd at the same time then the factory can be invoked for all of them, but only one will win. The values produced for the other threads will just be discarded, however by then the compute has been done.
Since you also said that doing searches is expensive, the cost of taking a lock and then using a standard dictionary would be minimal.
So this is what I suggest:
private Dictionary<string, Task<string>> _requests
= new Dictionary<string, Task<string>>();
public string Search(string key)
{
Task<string> task;
lock (_requests)
{
if (_requests.ContainsKey(key))
{
task = _requests[key];
}
else
{
task = Task<string>
.Factory
.StartNew(() => DoSearch(key));
_requests[key] = task;
task.ContinueWith(t =>
{
lock(_requests)
{
_requests.Remove(key);
}
});
}
}
return task.Result;
}
This option nicely runs the search, remembers the task throughout the duration of the search and then removes it from the dictionary when it completes. All requests for the same key while a search is executing get the same task and so will get the same result once the task is complete.
I've tested the code and it works.
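For comparison, here is a sketch of how the Lazy<T> idea mentioned earlier could be combined with ConcurrentDictionary so that no explicit lock is needed. This is an alternative under assumptions, not the code from either answer: SearchCoalescer is a hypothetical name, and the search function is passed in as a delegate instead of calling DoSearch directly.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch, not the code from the answers above.
public sealed class SearchCoalescer
{
    private readonly ConcurrentDictionary<string, Lazy<Task<string>>> _requests =
        new ConcurrentDictionary<string, Lazy<Task<string>>>();
    private readonly Func<string, string> _doSearch;

    public SearchCoalescer(Func<string, string> doSearch)
    {
        _doSearch = doSearch;
    }

    public string Search(string key)
    {
        // GetOrAdd may build several Lazy wrappers under contention, but only the stored
        // one is ever asked for its Value, so the search itself starts once per key.
        Lazy<Task<string>> lazy = _requests.GetOrAdd(
            key,
            k => new Lazy<Task<string>>(() => StartSearch(k), LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value.Result;
    }

    private Task<string> StartSearch(string key)
    {
        Task<string> task = Task.Run(() => _doSearch(key));
        // Forget the entry once the search is done so a later request searches again.
        task.ContinueWith(t => { _requests.TryRemove(key, out _); });
        return task;
    }
}
```

The wrappers that lose the GetOrAdd race are cheap to create and are discarded without ever running a search. As with the Task-based version above, nothing here needs to be disposed.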