I have a producer/consumer process. The consumed object has an integer ID property, and I want only one object with a given ID to be consumed at a time. How can I do this?
Maybe I can do something like this, but I don't like it: too many objects get created when only one or two objects with the same ID might be consumed per day, and the lock(_lockers) itself costs time:
private readonly Dictionary<int, object> _lockers = new Dictionary<int, object>();

private object GetLocker(int id)
{
    lock (_lockers)
    {
        if (!_lockers.ContainsKey(id))
            _lockers.Add(id, new object());
        return _lockers[id];
    }
}

private void Consume(T notif)
{
    lock (GetLocker(notif.ID))
    {
        ...
    }
}
NB: Same question with the ID property being of type string (in that case maybe I can lock over string.Intern(currentObject.ID)).
As indicated in a comment, one approach would be to have a fixed pool of locks (say 32), and take the ID modulo 32 to determine which lock to take. This would result in some false sharing of locks. 32 is a number picked out of the air; the right value would depend on your distribution of ID values, how many consumers you have, etc.
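That striped-lock approach can be sketched as follows; the pool size of 32 and the Consume body are placeholders, not from the question (GetLocker is public here only so the mapping is easy to demonstrate):

```csharp
using System;

public class StripedConsumer<T>
{
    // Fixed pool of locks; IDs that map to the same stripe share a lock.
    private const int StripeCount = 32;
    private readonly object[] _stripes;

    public StripedConsumer()
    {
        _stripes = new object[StripeCount];
        for (int i = 0; i < StripeCount; i++)
            _stripes[i] = new object();
    }

    public object GetLocker(int id)
    {
        // Mask off the sign bit so negative IDs also map to a valid stripe.
        return _stripes[(id & int.MaxValue) % StripeCount];
    }

    public void Consume(T notif, int id)
    {
        lock (GetLocker(id))
        {
            // process notif; two objects with the same ID can never be here
            // at once, though IDs 32 apart will also (falsely) share a lock
        }
    }
}
```

Two IDs that differ by a multiple of 32 contend on the same lock (the false sharing mentioned above), which is usually an acceptable trade against allocating one lock object per ID.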
Can you make your IDs to be unique for each object? If so, you could just apply a lock on the object itself.
First off, have you profiled to establish that lock(_lockers) is indeed a bottleneck? Because if it's not broken, don't fix it.
Edit: I didn't read carefully enough, this is about the (large) number of helper objects created.
I think Damien's got a good idea for that, I'll leave this bit about the strings:
Regarding:

NB: Same question with the ID property being of type string (in that case maybe I can lock over string.Intern(currentObject.ID))
No, bad idea. You can lock on a string, but then you have to worry about whether the strings have been interned. It's hard to be sure the instances are unique.
I would consider a synced FIFO queue as a separate class/singleton for all your produced objects: the producers enqueue the objects and the consumers dequeue them, so the actual objects no longer require any synchronization. The synchronization is then done outside the actual objects.
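For example, .NET 4's BlockingCollection<T> provides exactly such a synced FIFO queue out of the box; the Notification type below is just an illustration, not from the question:

```csharp
using System;
using System.Collections.Concurrent;

public class Notification { public int ID; }

public class Pipeline
{
    private readonly BlockingCollection<Notification> _queue =
        new BlockingCollection<Notification>(new ConcurrentQueue<Notification>());

    public int Processed; // counts items the consumer has handled

    // Producers simply enqueue; the collection handles the synchronization.
    public void Produce(Notification n)
    {
        _queue.Add(n);
    }

    // A consumer thread runs this; GetConsumingEnumerable blocks until items
    // arrive and finishes once CompleteAdding has been called.
    public void RunConsumer()
    {
        foreach (var n in _queue.GetConsumingEnumerable())
        {
            // only this consumer touches n now
            Processed++;
        }
    }

    public void Shutdown()
    {
        _queue.CompleteAdding();
    }
}
```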
How about assigning IDs from a pool of ID objects and locking on these?
When you create your item:
var item = CreateItem();
// look up (or create) the shared ID instance for this item's integer key
ID id = IDPool.Instance.Get(itemKey); // itemKey: the raw integer ID for this item
// assign the shared id to the object
item.ID = id;
the ID pool creates and maintains shared ID instances:
class IDPool
{
    // ID is assumed to be a reference type, e.g. class ID { }
    private readonly Dictionary<int, ID> ids = new Dictionary<int, ID>();

    public ID Get(int id)
    {
        // Gets the ID from the shared pool, or creates a new instance in the pool.
        // Always returns the same ID instance for a given integer.
        lock (ids)
        {
            ID result;
            if (!ids.TryGetValue(id, out result))
                ids.Add(id, result = new ID());
            return result;
        }
    }
}
you then lock on ID which is now a reference in your Consume method:
private void Consume(T notif)
{
    lock (notif.ID)
    {
        ...
    }
}
This is not the optimal solution and only moves the problem to a different place, but if you believe you have pressure on the lock, you may get a performance improvement using this approach (given that, e.g., your objects are created on a single thread, you then do not need to synchronize the ID pool).
See How to: Synchronize a Producer and a Consumer Thread (C# Programming Guide)
In addition to simply preventing simultaneous access with the lock keyword, further synchronization is provided by two event objects. One is used to signal the worker threads to terminate, and the other is used by the producer thread to signal to the consumer thread when a new item has been added to the queue. These two event objects are encapsulated in a class called SyncEvents. This allows the events to be passed to the objects that represent the consumer and producer threads easily.
--Edit--
A simple code snippet that I wrote sometime back; see if this helps. I think this is what weismat is pointing towards?
--Edit--
How about the following: create a class, say CCustomer, that holds:

an object of type object (to lock on)

a bool, for instance bool bInProgress

and keep a Dictionary of these, keyed by ID. Now when you do the check:

if (!_lockers.ContainsKey(id))
    _lockers.Add(id, new CCustomer(/* bInProgress = true */));
return _lockers[id]; // here you can check the bInProgress value and respond accordingly
The Thread.GetNamedDataSlot method acquires a slot name that can be used with Thread.SetData.
Can the result of the GetNamedDataSlot function be cached (and reused across all threads) or should it be invoked in/for every thread?
The documentation does not explicitly say it "shouldn't" be re-used although it does not say it can be either. Furthermore, the example shows GetNamedDataSlot used at every GetData/SetData site; even within the same thread.
For example (note that the BarSlot slot is not created/assigned on each specific thread from which the TLS is accessed):
public class Foo
{
    private static LocalDataStoreSlot BarSlot = Thread.GetNamedDataSlot("foo_bar");

    public static void SetMethodCalledFromManyThreads(string awesome)
    {
        Thread.SetData(BarSlot, awesome);
    }

    public static void ReadMethodCalledFromManyThreads()
    {
        Console.WriteLine("Data:" + Thread.GetData(BarSlot));
    }
}
I ask this question in relation to code structure; any micro performance gains, if any, are a freebie. Any critical issues or performance degradation with the reuse would make it not a viable option.
Can the result of the GetNamedDataSlot function be cached (and reused across all threads) or should it be invoked in/for every thread?
Unfortunately, the documentation isn't 100% clear on this point. Some interesting passages include…
From Thread.GetNamedDataSlot Method (String):
Data slots are unique per thread. No other thread (not even a child thread) can get that data
And from LocalDataStoreSlot Class:
The data slots are unique per thread or context; their values are not shared between the thread or context objects
At best, these make clear that each thread gets its own copy of the data. But the passages can be read to mean either that the LocalDataStoreSlot itself is per-thread, or simply the data to which it refers is per-thread. I believe it's the latter, but I can't point to a specific MSDN page that says so.
So, we can look at the implementation details:
There is a single slot manager per process, which is used to maintain all of the per-thread slots. A LocalDataStoreSlot returned in one thread can be passed to another thread and used there, and it would be owned by the same manager, and use the same slot index (because the slot table is also per-process). It also happens that the Thread.SetData() method will implicitly create the thread-local data store for that slot if it doesn't already exist.
The Thread.GetData() method simply returns null if you haven't already set a value or the thread-local data store hasn't been created. So, the behavior of GetData() remains consistent whether or not you have called SetData() in that thread already.
Since the slots are managed at a process-level basis, you can reuse the LocalDataStoreSlot values across threads. Once allocated, the slot is used up for all threads, and the data stored for that slot will be unique for each thread. Sharing the LocalDataStoreSlot value across threads shares the slot, but even for a single slot, you get thread-local storage for each thread.
Indeed, looking at it this way, the implementation you show would be the desirable way to use this API. After all, it's an alternative to [ThreadStatic], and the only way to ensure a different LocalDataStoreSlot value for each thread in your code would be either to use [ThreadStatic] (which if you wanted to use, you should have just used for the data itself), or to maintain your own dictionary of LocalDataStoreSlot values, indexed presumably by Thread.ManagedThreadId.
Personally, I'd just use [ThreadStatic]. MSDN even recommends this, and it has IMHO clearer semantics. But if you want to use LocalDataStoreSlot, it seems to me that the implementation you have is correct.
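For comparison, a [ThreadStatic] version of the code from the question might look like this (the Read method here also returns the value, purely so the behaviour is easy to check):

```csharp
using System;
using System.Threading;

public static class Foo
{
    // Each thread sees its own copy of this field;
    // on a thread that has never set it, the value is null.
    [ThreadStatic]
    private static string barData;

    public static void SetMethodCalledFromManyThreads(string awesome)
    {
        barData = awesome;
    }

    public static string ReadMethodCalledFromManyThreads()
    {
        Console.WriteLine("Data:" + barData);
        return barData;
    }
}
```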
Think of a network of nodes (update: 'network of nodes' meaning objects in the same application domain, not a network of independent applications) passing objects to each other (and doing some processing on them). Is there a pattern in C# for restricting the access to an object to only the node that is actually processing it?
Main motivation: Ensuring thread-safety (no concurrent access) and object consistency (regarding the data stored in it).
V1: I thought of something like this:
class TransferredObject
{
    public class AuthLock
    {
        public bool AllowOwnerChange { get; private set; }
        public void Unlock() { AllowOwnerChange = true; }
    }

    private AuthLock currentOwner;

    public AuthLock Own()
    {
        if (currentOwner != null && !currentOwner.AllowOwnerChange)
            throw new Exception("Cannot change owner, current lock is not released.");
        return currentOwner = new AuthLock();
    }

    public void DoSomething(AuthLock authentification)
    {
        if (currentOwner != authentification)
            throw new Exception("Don't you dare!");
        // be sure that this is only executed by the one holding the lock
        // Do something...
    }
}
class ProcessingNode
{
    public void UseTheObject(TransferredObject x)
    {
        // take ownership
        var auth = x.Own();
        // do processing
        x.DoSomething(auth);
        // release ownership
        auth.Unlock();
    }
}
V2: That's quite a lot of overhead, so a less 'strict' implementation would perhaps be to skip the checking and rely on the lock/unlock logic alone:
class TransferredObject
{
    private bool isLocked;

    public void Lock()
    {
        if (isLocked)
            throw new Exception("Cannot lock, object is already locked.");
        isLocked = true;
    }

    public void Unlock() { isLocked = false; }

    public void DoSomething()
    {
        if (!isLocked)
            throw new Exception("Don't you dare!");
        // Do something...
    }
}

class ProcessingNode
{
    public void UseTheObject(TransferredObject x)
    {
        // take ownership
        x.Lock();
        // do processing
        x.DoSomething();
        // release ownership
        x.Unlock();
    }
}
However: this looks a bit unintuitive (and having to pass the auth instance with every call is ugly). Is there a better approach? Or is this a problem 'made by design'?
To clarify your question: you seek to implement the rental threading model in C#. A brief explanation of different ways to handle concurrent access to an object would likely be helpful.
Single-threaded: all accesses to the object must happen on the main thread.
Free-threaded: any access to the object may happen on any thread; the developer of the object is responsible for ensuring the internal consistency of the object. The developer of the code consuming the object is responsible for ensuring that "external consistency" is maintained. (For example, a free-threaded dictionary must maintain its internal state consistently when adds and removes happen on multiple threads. An external caller must recognize that the answer to the question "do you contain this key?" might change due to an edit from another thread.)
Apartment threaded: all accesses to a given instance of an object must happen on the thread that created the object, but different instances can be affinitized to different threads. The developer of the object must ensure that internal state which is shared between objects is safe for multithreaded access but state which is associated with a given instance will only ever be read or written from a single thread. Typically UI controls are apartment threaded and must be in the apartment of the UI thread.
Rental threaded: access to a given instance of an object must happen from only a single thread at any one time, but which thread that is may change over time.
So now let's consider some questions that you should be asking:
Is the rental model a reasonable way to simplify my life, as the author of an object?
Possibly.
The point of the rental model is to achieve some of the benefits of multithreading without taking on the cost of implementing and testing a free-threaded model. Whether those increased benefits and lowered costs are a good fit, I don't know. I personally am skeptical of the value of shared memory in multithreaded situations; I think the whole thing is a bad idea. But if you're bought into the crazy idea that multiple threads of control in one program modifying shared memory is goodness, then maybe the rental model is for you.
The code you are writing is essentially an aid to the caller of your object to make it easier for the caller to obey the rules of the rental model and easier to debug the problem when they stray. By providing that aid to them you lower their costs, at some moderate increase to your own costs.
The idea of implementing such an aid is a good one. The original implementations of VBScript and JScript at Microsoft back in the 1990s used a variation on the apartment model, whereby a script engine would transit from a free-threaded mode into an apartment-threaded mode. We wrote a lot of code to detect callers that were violating the rules of our model and produce errors immediately, rather than allowing the violation to produce undefined behaviour at some unspecified point in the future.
Is my code correct?
No. It's not threadsafe! The code that enforces the rental model and detects violations of it cannot itself assume that the caller is correctly using the rental model! You need to introduce memory barriers to ensure that the various threads reading and writing your lock bools are not moving those reads and writes around in time. Your Own method is chock full of race conditions. This code needs to be very, very carefully designed and reviewed by an expert.
My recommendation - assuming again that you wish to pursue a shared memory multithreaded solution at all - is to eliminate the redundant bool; if the object is unowned then the owner should be null. I don't usually advocate a low-lock solution, but in this case you might consider looking at Interlocked.CompareExchange to do an atomic compare-and-swap on the field with a new owner. If the compare to null fails then the user of your API has a race condition which violates the rental model. This introduces a memory barrier.
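A sketch of that suggestion; the member names are assumptions, and it intentionally mirrors the V1 API with the redundant bool removed:

```csharp
using System;
using System.Threading;

public class TransferredObject
{
    // null means "unowned"; no separate bool is needed.
    private object currentOwner;

    public object Own()
    {
        var token = new object();
        // Atomically install the token only if the current owner is null.
        // CompareExchange returns the value that was in the field before.
        if (Interlocked.CompareExchange(ref currentOwner, token, null) != null)
            throw new InvalidOperationException("Already owned: rental model violated.");
        return token;
    }

    public void Release(object token)
    {
        // Atomically clear the owner only if the caller holds the token.
        if (Interlocked.CompareExchange(ref currentOwner, null, token) != token)
            throw new InvalidOperationException("Caller does not hold the ownership token.");
    }
}
```

The Interlocked call doubles as the memory barrier mentioned above, so a racing second Own() is detected deterministically instead of corrupting state.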
Maybe your example is too simplified and you really need this complex ownership thing, but the following code should do the job:
class TransferredObject
{
    private object _lockObject = new object();

    public void DoSomething()
    {
        lock (_lockObject)
        {
            // TODO: your code here...
        }
    }
}
Your TransferredObject has an atomic method DoSomething that changes some state and should not run multiple times at the same time. So, just put a lock into it to synchronize the critical section.
See http://msdn.microsoft.com/en-us/library/c5kehkcz%28v=vs.90%29.aspx
I have written a static class which is a repository of some functions that I call from different classes.
public static class CommonStructures
{
    public struct SendMailParameters
    {
        public string To { get; set; }
        public string From { get; set; }
        public string Subject { get; set; }
        public string Body { get; set; }
        public string Attachment { get; set; }
    }
}

public static class CommonFunctions
{
    private static readonly object LockObj = new object();

    public static bool SendMail(SendMailParameters sendMailParam)
    {
        lock (LockObj)
        {
            try
            {
                //send mail
                return true;
            }
            catch (Exception ex)
            {
                //some exception handling
                return false;
            }
        }
    }

    private static readonly object LockObjCommonFunction2 = new object();

    public static int CommonFunction2(int i)
    {
        lock (LockObjCommonFunction2)
        {
            int returnValue = 0;
            try
            {
                //send operation
                return returnValue;
            }
            catch (Exception ex)
            {
                //some exception handling
                return returnValue;
            }
        }
    }
}
Question 1: For my second method CommonFunction2, do I use a new static lock, i.e. LockObjCommonFunction2 in this example, or can I reuse the same lock object LockObj defined at the beginning of the class?
Question 2: Is there anything which might lead to threading-related issues, or can I improve the code to be thread-safe?
Question 3: Can there be any issues in passing a common class instead of a struct, in this example SendMailParameters (which I use to wrap up all parameters, instead of having multiple parameters to the SendMail function)?
Regards,
MH
Question 1: For my second method CommonFunction2, do I use a new static lock, i.e. LockObjCommonFunction2 in this example, or can I reuse the same lock object LockObj defined at the beginning of the class?
If you want to synchronize these two methods, then you need to use the same lock for them. For example, if thread1 is accessing your Method1 and thread2 is accessing your Method2 and you want them not to run both bodies concurrently, use the same lock. But if you just want to restrict concurrent access to Method1 or Method2 individually, use different locks.
Question 2: Is there anything which might lead to threading-related issues, or can I improve the code to be thread-safe?
Always remember that shared resources (e.g. static variables, files) are not thread-safe, since they can be accessed by all threads; you need to apply some kind of synchronization (via locks, signals, mutexes, etc.).
Question 3: Can there be any issues in passing a common class instead of a struct, in this example SendMailParameters (which I use to wrap up all parameters, instead of having multiple parameters to the SendMail function)?
As long as you apply proper synchronizations, it would be thread-safe. For structs, look at this as a reference.
Bottom line: you need to apply correct synchronization for anything that lives in shared memory. You should also take note of the scope of the threads you spawn and the state of the variables each method uses. Do they change that state, or just depend on the variable's internal state? Does the thread always create its own object, even when it's static/shared? If yes, it should be thread-safe. Otherwise, if it reuses a shared resource, you should apply proper synchronization. And most of all, even without a shared resource, deadlocks can still happen, so remember the basic rules in C# to avoid deadlocks. P.S. thanks to Euphoric for sharing Eric Lippert's article.
But be careful with your synchronization. As much as possible, limit its scope to only where the shared resource is actually modified, because otherwise it can create inconvenient bottlenecks in your application and performance will be greatly affected.
static readonly object _lock = new object();
static SomeClass sc = new SomeClass();

static void workerMethod()
{
    //assuming this method is called by multiple threads
    longProcessingMethod();
    modifySharedResource(sc);
}

static void modifySharedResource(SomeClass sc)
{
    //do something
    lock (_lock)
    {
        //where sc is modified
    }
}

static void longProcessingMethod()
{
    //a long process
}
You can reuse the same lock object as many times as you like, but that means that none of the areas of code surrounded by that same lock can be accessed at the same time by various threads. So you need to plan accordingly, and carefully.
Sometimes it's better to use one lock object for multiple locations: for instance, if multiple functions edit the same array. Other times, more than one lock object is better, because even if one section of code is locked, the other can still run.
Multi-threaded coding is all about careful planning...
To be super duper safe, at the expense of potentially writing much slower code... you can add an accessor to your static class surround by a lock. That way you can make sure that none of the methods of that class will ever be called by two threads at the same time. It's pretty brute force, and definitely a 'no-no' for professionals. But if you're just getting familiar with how these things work, it's not a bad place to start learning.
1) As to the first, it depends on what you want to have:
As is (two separate lock objects): no two threads will execute the same method at the same time, but they can execute different methods at the same time.
If you change to a single lock object, then no two threads will concurrently execute any of the sections guarded by that shared lock.
2) In your snippet there is nothing that strikes me as wrong, but there is not much code there. If your repository calls methods from itself then you can have a problem, and there is a world of issues that you can run into :)
3) As to structs, I would not use them. Use classes; it is better/easier that way. There is another bag of issues related to structs, and you just don't need those problems.
The number of lock objects to use depends on what kind of data you're trying to protect. If you have several variables that are read/updated on multiple threads, you should use a separate lock object for each independent variable. So if you have 10 variables that form 6 independent variable groups (as far as how you intend to read / write them), you should use 6 lock objects for best performance. (An independent variable is one that's read / written on multiple threads without affecting the value of other variables. If 2 variables must be read together for a given action, they're dependent on each other so they'd have to be locked together. I hope this is not too confusing.)
Locked regions should be as short as possible for maximum performance - every time you lock a region of code, no other thread can enter that region until the lock is released. If you have a number of independent variables but use too few lock objects, your performance will suffer because your locked regions will grow longer.
Having more lock objects allows for higher parallelism since each thread can read / write a different independent variable - threads will only have to wait on each other if they try to read / write variables that are dependent on each other (and thus are locked through the same lock object).
In your code you must be careful with your SendMailParameters input parameter - if this is a reference type (class, not struct) you must make sure that its properties are locked or that it isn't accessed on multiple threads. If it's a reference type, it's just a pointer and without locking inside its property getters / setters, multiple threads may attempt to read / write some properties of the same instance. If this happens, your SendMail() function may end up using a corrupted instance. It's not enough to simply have a lock inside SendMail() - properties and methods of SendMailParameters must be protected as well.
Suppose I have this method:
void Foo(int bar)
{
    // do stuff
}
Here is the behavior I want Foo to have:
If thread 1 calls Foo(1) and thread 2 calls Foo(2), both threads can run concurrently.
If thread 1 calls Foo(1) and thread 2 calls Foo(1), both threads cannot run concurrently.
Is there a good, standard way in .net to specify this type of behavior? I have a solution that uses a dictionary of objects to lock on, but that feels kind of messy.
Use a dictionary that provides different lock objects for the different arguments. Set up the dictionary when you instantiate the underlying object (or statically, if applicable):
var locks = new Dictionary<int, object>() {
    { 1, new Object() },
    { 2, new Object() },
    …
};
And then use it inside your method:
void Foo(int bar) {
    lock (locks[bar]) {
        …
    }
}
I wouldn’t say that this solution is messy, on the contrary: providing a fine lock granularity is commendable and since locks on value types don’t work in .NET, having a mapping is the obvious solution.
Be careful though: the above only works as long as the dictionary isn’t concurrently modified and read. It is therefore best to treat the dictionary as read-only after its set-up.
Bottom line: you can't lock on value types.
The dictionary you're using is the best approach I can think of. It's kludgey, but it works.
Personally, I'd pursue an architectural solution that makes the locking unnecessary, but I don't know enough about your system to give you pointers there.
Using a Dictionary is not enough; you should use ConcurrentDictionary or implement a data structure that supports multi-threaded access.
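On .NET 4 and later, ConcurrentDictionary.GetOrAdd keeps the per-argument lock map both thread-safe and short; the LockFor helper below is exposed only to make the mapping visible:

```csharp
using System;
using System.Collections.Concurrent;

public class Worker
{
    private readonly ConcurrentDictionary<int, object> _locks =
        new ConcurrentDictionary<int, object>();

    // Returns the one lock object shared by every caller passing the same bar.
    public object LockFor(int bar)
    {
        return _locks.GetOrAdd(bar, _ => new object());
    }

    public void Foo(int bar)
    {
        lock (LockFor(bar))
        {
            // do stuff; Foo(1) and Foo(2) can run concurrently,
            // two simultaneous Foo(1) calls cannot
        }
    }
}
```

Note that GetOrAdd may invoke the value factory more than once under contention, but only one of the created objects is ever installed and returned, which is exactly what locking requires.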
Creating a Dictionary<> just so you can lock on a value seems overkill to me. I got this working using a string. There are people (e.g. Jon Skeet) who do not like this approach, and for valid reasons; see this post: Is it OK to use a string as a lock object?
But I have a way to mitigate those concerns: intern the string on the fly and combine it with a unique identifier.
// you should insert your own guid here
string lockIdentifier = "a8ef3042-e866-4667-8673-6e2268d5ab8e";

public void Foo(int bar)
{
    lock (string.Intern(string.Format("{0}-{1}", lockIdentifier, bar)))
    {
        // do stuff
    }
}
What happens is that distinct values are stored in the string intern pool (which crosses AppDomain boundaries). Adding lockIdentifier to the string ensures that it won't conflict with interned strings used in other applications, meaning the lock will only take effect within your own application.
So the intern pool will return a reference to an interned string - this is ok to lock on.
I have been reading around and am getting conflicting answers on whether or not I should use SyncLock on properties.
I have a multi-threaded application that needs to get/set properties across threads on instance objects. It is currently implemented without SyncLock and I have not noticed any problems so far. I am using SyncLock in common static methods, but I'd like to implement my instance classes properly, in a thread-safe way.
Any feedback would be greatly appreciated.
A good rule of thumb is that you need to lock if any of the following conditions hold true:
if any field of an object is going to be modified on more than one thread
if any modifications involve accessing more than one field
if any modifiable field is a Double, Decimal, or structured value type
if any modifications involve read-modify-write (i.e. adding to a field or setting one field with the value from another)
then you probably need to lock in every method or property that accesses those fields.
EDIT: Keep in mind that locking inside of a class is rarely sufficient -- what you need to do is make sure that things don't go wrong across the span of an entire logical operation.
As #Bevan points out, if calling code needs to access an object more than once, the client code should take out its own lock on the object for the entire duration of its work to ensure that another thread doesn't get "in between" its accesses and foul up its logic.
You also need to take care that if anything needs to take out multiple locks at once, that they always be taken in the same order. If thread 1 has a lock on instance A and tries to lock instance B, and thread 2 has a lock on instance B and tries to get a lock on instance A, both threads are stuck and unable to proceed -- you have a deadlock.
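A common way to guarantee that consistent order is to rank the locks by some stable key and always acquire them in rank order; the Account type here is purely illustrative:

```csharp
using System;

public class Account
{
    public int Id;              // stable key used to rank the locks
    public decimal Balance;
    public readonly object Gate = new object();
}

public static class Bank
{
    public static void Transfer(Account from, Account to, decimal amount)
    {
        // Always lock the lower-Id account first, so two concurrent
        // transfers between the same pair can never deadlock.
        var first = from.Id < to.Id ? from : to;
        var second = from.Id < to.Id ? to : from;
        lock (first.Gate)
        {
            lock (second.Gate)
            {
                from.Balance -= amount;
                to.Balance += amount;
            }
        }
    }
}
```

With this rule, thread 1 calling Transfer(A, B) and thread 2 calling Transfer(B, A) both try to lock A first, so one simply waits instead of deadlocking.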
You can't make an object thread safe just by surrounding individual methods with locks. All you end up doing is serialising (slowing down) access to the object.
Consider this minor example:
var myObject = ...
var myThreadSafeList = ...
if (!myThreadSafeList.Contains(myObject))
{
myThreadSafeList.Add(myObject);
}
Even if myThreadSafeList has every method locked, this isn't threadsafe because another thread can alter the contents of the list between the calls to Contains() and Add().
In the case of this list, an additional method is needed: AddIfMissing():
var myObject = ...
var myThreadSafeList = ...
myThreadSafeList.AddIfMissing(myObject);
Only by moving the logic into the object can you surround both operations with the lock and make it safe.
Without further details, it's hard to comment further, but I'd suggest the following:
Make all properties read-only, and allow anyone to read them at any time
Provide mutator methods that take sets of properties that get modified together, and make the changes atomically within a lock
To illustrate:
public class Person {
    private readonly object padLock = new object();

    public string FullName { get; private set; }
    public string FamilyName { get; private set; }
    public string KnownAs { get; private set; }

    public void SetNames(string full, string family, string known) {
        lock (padLock) {
            ...
        }
    }
}