What is the proper way to lock code areas - c#

Which is better:
to have a large code area inside one lock statement,
or
to have small locks inside a large area?
The exchanges collection in this sample does not change.
lock (padLock)
{
    foreach (string ex in exchanges)
    {
        sub.Add(x.ID, new Subscription(ch, queue.QueueName, true));
        // .........
    }
}

or

foreach (string ex in exchanges)
{
    lock (padLock)
    {
        sub.Add(x.ID, new Subscription(ch, queue.QueueName, true));
    }
    // .....
}

The wider the lock, the less you gain from multithreading, and vice versa.
So the use of locks depends entirely on your logic. Lock only the things, and the places, that change and must be run by only one thread at a time.
If you are locking just to use the collection sub, use the smaller lock; but if multiple such foreach loops run in parallel and must not interleave, use the wider one.

Good practice is to lock only the area that you want executed by a single thread at a given time.
If that area is the whole foreach loop, then the first approach is fine;
but if it is just the one line shown in the second snippet, go for the second approach.

In this particular case, the best option is the first one: otherwise you're just wasting time repeatedly locking and unlocking, since you have to execute the entire loop anyway. There's little opportunity for parallelism in a loop whose body is a single atomic operation.
For more general advice on critical section sizes check this article: http://software.intel.com/en-us/articles/managing-lock-contention-large-and-small-critical-sections/

I think there are two different questions:
1. Which would be correct?
2. Which would give better performance?
The correctness question is complicated. It depends on your data structures, and how you intend the lock to protect them. If the "sub" object is not thread-safe, you definitely need the big lock.
The performance question is simpler and less important (but for some reason, I think you're focused on it more).
Many small locks may be slower, because they just do more work. But if you manage to run a large portion of the loop code without lock, you get some concurrency, so it would be better.

You can't effectively judge which is "right" with the given code snippets. The first example says it is not OK for people to see sub with only part of the content from exchanges. The second example says it is OK for people to see sub with only part of the content from exchanges.
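To make that visibility difference concrete, here is a minimal compilable sketch of both granularities (the type names and the string payload are hypothetical stand-ins for the question's Subscription objects). A reader that takes the same lock sees either none or all of the coarse loop's additions, but can observe the fine-grained loop half-finished:

```csharp
using System.Collections.Generic;

class GranularityDemo
{
    private readonly object padLock = new object();
    private readonly Dictionary<string, string> sub = new Dictionary<string, string>();

    // Coarse: one lock held for the whole loop. Other threads that lock
    // padLock see the dictionary before or after the loop, never mid-way.
    public void AddAllCoarse(IEnumerable<string> exchanges)
    {
        lock (padLock)
        {
            foreach (string ex in exchanges)
                sub[ex] = "subscription for " + ex;
        }
    }

    // Fine: lock taken and released once per item. Other threads can run
    // between iterations and observe a partially filled dictionary.
    public void AddAllFine(IEnumerable<string> exchanges)
    {
        foreach (string ex in exchanges)
        {
            lock (padLock)
            {
                sub[ex] = "subscription for " + ex;
            }
        }
    }

    public int CountSnapshot()
    {
        lock (padLock) { return sub.Count; }
    }
}
```

Which one is correct depends on whether other threads may see sub in an intermediate state, which is exactly the distinction the previous answer draws.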

Related

Iterate and do operations on a generic collection concurrently

I've created a game emulation program using C# async sockets. I need to add to, remove from, and iterate over a collection (a list that holds clients) concurrently. I am currently using "lock", but it's a huge performance drop. I also do not want to use "local lists/copies" to keep the list up to date. I've heard about "ConcurrentBag", but I am not sure how thread-safe it is for iteration (for instance, if one thread removes an element from the list while another thread is iterating over it).
What do you suggest?
Edit: here is the situation.
This is when a packet is sent to all the users in a room:
lock (parent.gameClientList)
{
    for (int i = 0; i <= parent.gameClientList.Count() - 1; i++)
    {
        if (parent.gameClientList[i].zoneId == zoneId)
            parent.gameClientList[i].SendXt(packetElements); // if the room matches, SendXt sends a packet
    }
}
When a new client connects
Client connectedClient = new Client(socket, this);
lock (gameClientList)
{
gameClientList.Add(connectedClient);
}
Same case when a client disconnects.
I am asking for a better alternative (performance-wise) because the locks slow down everything.
It sounds like the problem is that you're doing all the work within your foreach loop, and it's locking out the add/remove methods for too long. The way around this is to quickly make a copy of the collection while it's locked, and then you can close the lock and iterate on the copy.
Thing[] copy;
lock(myLock) {
copy = _collection.ToArray();
}
foreach(var thing in copy) {...}
The drawback is that by the time you get around to operating on some object in that copy, it may have been removed from the original collection, so maybe you don't want to operate on it anymore. Whether that's acceptable is another requirement you'll have to figure out. If it's a problem, a simple option is to lock each iteration of the loop, which of course slows things down, but at least it won't hold the lock for the entire duration of the loop:
foreach (var thing in copy)
{
    lock (myLock)
    {
        if (_collection.Contains(thing)) // check that it's still in the original collection
            DoWork(thing); // moving this outside the lock would make the app snappier, but could cause problems if DoWork does something "dangerous"
    }
}
If this is what you meant by "local copies", then you can disregard this option, but I figured I'd offer it in case you meant something else.
Every time you do something concurrently, you pay some overhead for coordination (i.e. locks). I suggest you look for the actual bottleneck in your process. You have a shared-memory model, as opposed to a message-passing model. If you need to modify the entire collection at once, there may not be a good solution. But if you make changes in a particular order, you can leverage that order to avoid delays. Locks are an implementation of pessimistic concurrency; you could switch to an optimistic concurrency model instead. With one, the cost is waiting; with the other, the cost is retrying. Again, the right solution depends on your use case.
One problem with ConcurrentBag is that it is unordered, so you cannot pull items out by index the way you are doing currently. However, you can iterate it via foreach to get the same effect. This iteration is thread-safe; it will not go berserk if an item is added or removed while the iteration is happening.
There is another problem with ConcurrentBag, though. Its enumerator actually copies the contents to a new List internally to work correctly, so even picking off a single item via the enumerator is an O(n) operation. You can verify this by disassembling it.
However, based on context clues from your update, I assume this collection will be small. There appears to be one entry per game client, so it will probably hold only a small number of items. If that is correct, the performance of the GetEnumerator method will be mostly insignificant.
You should also consider ConcurrentDictionary as well. I noticed that you are trying to match items from the collection based on zoneId. If you store the items in the ConcurrentDictionary keyed by zoneId then you would not need to iterate the collection at all. Of course, this assumes that there is only one entry per zoneId which may not be the case.
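If several clients can share a zoneId, one way to keep the O(1) lookup is a ConcurrentDictionary keyed by zoneId whose values hold that zone's members. This is only a sketch under that assumption; Client and SendXt here are simplified stand-ins for the question's types:

```csharp
using System.Collections.Concurrent;

class Client
{
    public int ZoneId;
    public void SendXt(string packetElements) { /* send over the socket */ }
}

class Rooms
{
    private readonly ConcurrentDictionary<int, ConcurrentBag<Client>> byZone =
        new ConcurrentDictionary<int, ConcurrentBag<Client>>();

    public void Add(Client c)
    {
        // GetOrAdd atomically creates the zone's bag on first use.
        byZone.GetOrAdd(c.ZoneId, _ => new ConcurrentBag<Client>()).Add(c);
    }

    public void Broadcast(int zoneId, string packetElements)
    {
        // O(1) lookup by zone, then a thread-safe enumeration of just
        // that zone's members -- no scan of the full client list.
        ConcurrentBag<Client> clients;
        if (byZone.TryGetValue(zoneId, out clients))
        {
            foreach (var c in clients)
                c.SendXt(packetElements);
        }
    }

    public int CountInZone(int zoneId)
    {
        ConcurrentBag<Client> clients;
        return byZone.TryGetValue(zoneId, out clients) ? clients.Count : 0;
    }
}
```

Note that ConcurrentBag has no targeted Remove, so handling disconnects cleanly would need a different per-zone value type (for example a nested ConcurrentDictionary keyed by client) in a real design.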
You mentioned that you did not want to use "local lists/copies", but you never said why. I think you should reconsider this for the following reasons.
Iterations could be lock-free.
Adding and removing items appears to be infrequent, based on context clues from your code.
There are a couple of patterns you can use to make the list copying strategy work really well. I talk about them in my answers here and here.

lock performance cloning object or not

I have a DataTable that can contain a large number of DataRows; this DataTable can be accessed from several threads, and some threads can also change values in some rows. There are search functions that lock the DataTable, search it using LINQ, and return the expected value, something like:
lock (tableContent)
{
    var t = (from row in tableContent.AsEnumerable()
             where row[fieldName] != DBNull.Value && row.Field<T>(fieldName).Equals(someValue)
             select row);
    if (t.Any())
    { ... }
}
The question is: If I clone (locking the original) the DataTable and search into the cloned object, will the locked period be faster than searching directly into the original?
I think the cloning operation will take O(n) to copy each row, so the time will be the same as the search, but I don't know whether there are optimizations (memory copy, ...) that reduce the cloning time or something similar.
Clone is O(n), but that doesn't tell the full story. A clone can be a shallow clone (just copy the references in the table) or a deep clone (copy the objects themselves). A deep clone can be a very expensive operation. Searching time can vary, too, from a quick search that checks just a single integer field, to a complex search that compares multiple values and is pretty expensive. In addition, if your data is sorted on the field that you're searching, then search is O(log n), which will be considerably faster than O(n).
If you need to take into account the possibility that somebody will add, modify, or delete rows, then you either have to lock or clone. If you're doing a single search, then cloning doesn't really make sense because you'd have to lock the table in order to clone it. And cloning will most likely take longer than searching, unless your searches are unusually expensive.
You say that modifications are rare and searches are frequent. In that case, I would suggest that you use a reader-writer lock, which will support an unlimited number of readers, or one writer. In .NET, you probably want the ReaderWriterLockSlim. Using that, your code would look like this:
private ReaderWriterLockSlim tableLock = new ReaderWriterLockSlim();
public bool Search(string s)
{
tableLock.EnterReadLock();
try
{
// do the search here
return result;
}
finally
{
tableLock.ExitReadLock();
}
}
Any number of readers can be searching the table concurrently. Provided, of course, that they don't modify it. If you want to modify the table, you have to acquire the write lock:
public void Modify(string s)
{
tableLock.EnterWriteLock();
try
{
// do the modification here
return;
}
finally
{
tableLock.ExitWriteLock();
}
}
When a thread tries to enter the write lock, it has to wait for all existing readers to exit. Readers that come in after the write lock was requested have to wait for the existing readers to exit, and for the writer to acquire and then release the lock.
The reader/writer lock works really well for me in situations similar to the one you described: frequent reads and infrequent writes. It's worth looking into, if nothing else because it's so easy to test.
The reader/writer lock still works well if searches and updates are approximately equal because it still allows multiple readers when possible. Come to think of it, it even works well if writes are much more frequent, again because it will allow multiple reads when possible. I almost always use ReaderWriterLockSlim when I have a data structure that can be searched and updated.
There are other solutions, but they involve custom data structures that can be much more difficult to implement and maintain. I'd suggest you give the reader/writer lock a try. If after that, profiling shows that threads are still waiting on the lock and slowing your application's response time, you can look into alternatives.
I'm a little concerned, though, that you're doing more than just searching. Your sample selects a bunch of rows and then you do if (t.Any()) { ... }. Just what are you doing in that { ... }? If that takes a long time, you might be better off making your code clone just the rows that you select. You can then release the lock and party on the result set to your heart's content without affecting other threads that need to access the data structure.
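That "clone just the rows you select" idea might look like the sketch below, reusing the ReaderWriterLockSlim suggested above (the field names are illustrative; row.ItemArray copies the cell values out, so the DataTable is no longer touched after the lock is released):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Threading;

class TableSearcher
{
    private readonly DataTable tableContent;
    private readonly ReaderWriterLockSlim tableLock = new ReaderWriterLockSlim();

    public TableSearcher(DataTable table) { tableContent = table; }

    // Returns copies of the matching rows' values. ToList() forces the
    // query to run while the read lock is held; ItemArray clones each
    // row's cells, so callers can work on the result lock-free.
    public List<object[]> SnapshotMatches(string fieldName, object someValue)
    {
        tableLock.EnterReadLock();
        try
        {
            return tableContent.AsEnumerable()
                .Where(r => r[fieldName] != DBNull.Value && r[fieldName].Equals(someValue))
                .Select(r => r.ItemArray)
                .ToList();
        }
        finally
        {
            tableLock.ExitReadLock();
        }
    }
}
```

The lock is held only for the O(matches) copy, not for whatever expensive work happens in the { ... } afterwards.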

ReaderWriteLockSlim or Lock

I'm using a ConcurrentBag to store objects at run time. At some point I need to empty the bag and store its content in a list. This is what I do:
IList<T> list = new List<T>();
lock (bag)
{
T pixel;
while (bag.TryTake(out pixel))
{
list.Add(pixel);
}
}
My question is about synchronization. As far as I have read, lock is faster than other synchronization methods (source: http://www.albahari.com/threading/part2.aspx).
Performance is my second concern; I'd like to know if I can use ReaderWriterLockSlim at this point. What would be the benefit of using ReaderWriterLockSlim? The reason is that I don't want this operation to block incoming requests.
If yes, should I use an upgradeable lock?
Any ideas? Comments?
I'm not sure why you're using the lock. The whole idea behind ConcurrentBag is that it's concurrent.
Unless you're just trying to prevent some other thread from taking things or adding things to the bag while you're emptying it.
Re-reading your question, I'm pretty sure you don't want to synchronize access here at all. ConcurrentBag allows multiple threads to Take and Add, without you having to do any explicit synchronization.
If you lock the bag, then no other thread can add or remove things while your code is running. Assuming, of course, that you protect every other access to the bag with a lock. And once you do that, you've completely defeated the purpose of having a lock-free concurrent data structure. Your data structure has become a poorly-performing list that's controlled by a lock.
Same thing if you use a reader-writer lock. You'd have to synchronize every access.
You don't need to add any explicit synchronization in this case. Ditch the lock.
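Under that advice, the drain from the question reduces to a plain TryTake loop with no lock at all (a sketch; the extension-method wrapper is my addition):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

static class BagExtensions
{
    // TryTake is already thread-safe, so no explicit lock is needed.
    // An Add racing with the drain either lands an item in time to be
    // taken here, or leaves it in the bag for a later drain; the bag
    // is never corrupted either way.
    public static List<T> Drain<T>(this ConcurrentBag<T> bag)
    {
        var list = new List<T>();
        T pixel;
        while (bag.TryTake(out pixel))
            list.Add(pixel);
        return list;
    }
}
```

The only thing this does not give you is a point-in-time snapshot: items added while the loop runs may or may not end up in the list.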
Lock is great when threads do a lot of operations in a row (bursty, low contention).
RWSlim is great when you have many more read locks than write locks (read-heavy, high read contention).
Lock-free is great when you need multiple readers and/or writers all working at the same time (a mix of reads and writes, lots of contention).

Parallel.ForEach not spinning up new threads

Hello all, we have a very IO-intensive operation that we wrote using Parallel.ForEach from Microsoft's Parallel Extensions for the .NET Framework. We need to delete a large number of files, and we represent the files to be deleted as a list of lists. Each nested list has 1000 messages in it, and we have 50 of these lists. The issue here is that when I look in the logs afterwards, I only see one thread executing inside of our Parallel.ForEach block.
Here's what the code looks like:
List<List<Message>> expiredMessagesLists = GetNestedListOfMessages();
foreach (List<Message> subList in expiredMessagesLists)
{
Parallel.ForEach(subList, msg =>
{
try
{
Logger.LogEvent(TraceEventType.Information, "Purging Message {0} ({1}) on Thread {2}", msg.MessageID, msg.ExtensionID, Thread.CurrentThread.Name);
DeleteMessageFiles(msg);
}
catch (Exception ex)
{
Logger.LogException(TraceEventType.Error, ex);
}
});
}
I wrote some sample code with a simpler data structure and no IO logic, and I could see several different threads executing within the Parallel.ForEach block. Are we doing something incorrect with Parallel.ForEach in the code above? Could it be the list of lists that's tripping it up, or is there some sort of threading limitation for IO operations?
There are a couple of possibilities.
First off, in most cases, Parallel.ForEach will not spawn a new thread. It uses the .NET 4 ThreadPool (all of the TPL does), and will reuse ThreadPool threads.
That being said, Parallel.ForEach uses a partitioning strategy based on the size of the List being passed to it. My first guess is that your "outer" list has many messages, but the inner list only has one Message instance, so the ForEach partitioner is only using a single thread. With one element, Parallel is smart enough to just use the main thread, and not spin work onto a background thread.
Normally, in situations like this, it's better to parallelize the outer loop, not the inner loop. That will usually give you better performance (since you'll have larger work items), although it's difficult to know without having a good sense of the loop sizes plus the size of the Unit of Work. You could also, potentially, parallelize both the inner and outer loops, but without profiling, it'd be difficult to tell what would be the best option.
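Parallelizing the outer loop might look like the sketch below. It is a simplified stand-in: the int payload and the Interlocked counter replace Message and DeleteMessageFiles, whose definitions I don't have:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class OuterLoopDemo
{
    private static int deleted;

    // One work item per sublist: the partitioner now sees all 50
    // sublists at once, instead of being invoked 50 separate times
    // over one 1000-item list each.
    public static int DeleteAll(List<List<int>> expiredMessagesLists)
    {
        deleted = 0;
        Parallel.ForEach(expiredMessagesLists, subList =>
        {
            foreach (int msg in subList)
                Interlocked.Increment(ref deleted); // DeleteMessageFiles(msg) in the real code
        });
        return deleted; // Parallel.ForEach blocks until every item is done
    }
}
```

The try/catch around each delete from the original code would go inside the inner foreach, exactly as before.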
One other possibility:
Try using Thread.ManagedThreadId instead of Thread.CurrentThread.Name for your logging. Since Parallel uses ThreadPool threads, the Name is often identical across multiple threads. You may think you're only using a single thread when you're in fact using more than one.
The assumption underlying your code is that it is possible to delete files in parallel. I'm not saying it isn't (I'm no expert on the matter), but I wouldn't be surprised if that is simply not possible for most hardware. You are, after all, performing an operation with a physical object (your hard disk) when you do this.
Suppose you had a class, Person, with a method called RaiseArm(). You could always try shooting off RaiseArm() on 100 different threads, but the Person is only ever going to be able to raise two at a time...
Like I said, I could be wrong. This is just my suspicion.

Another locking question

I'm trying to get my multithreading understanding locked down. I'm doing my best to teach myself, but some of these issues need clarification.
I've gone through three iterations with a piece of code, experimenting with locking.
In this code, the only thing that needs locking is this.managerThreadPriority.
First, the simple, procedural approach, with minimalistic locking.
var managerThread = new Thread
(
new ThreadStart(this.ManagerThreadEntryPoint)
);
lock (this.locker)
{
managerThread.Priority = this.managerThreadPriority;
}
managerThread.Name = string.Format("Manager Thread ({0})", managerThread.GetHashCode());
managerThread.Start();
Next, a single statement to create and launch a new thread, but the lock appears to be scoped too large, to include the creation and launching of the thread. The compiler doesn't somehow magically know that the lock can be released after this.managerThreadPriority is used.
This kind of naive locking should be avoided, I would assume.
lock (this.locker)
{
new Thread
(
new ThreadStart(this.ManagerThreadEntryPoint)
)
{
Priority = this.managerThreadPriority,
Name = string.Format("Manager Thread ({0})", GetHashCode())
}
.Start();
}
Last, a single statement to create and launch a new thread, with an "embedded" lock around only the shared field.
new Thread
(
new ThreadStart(this.ManagerThreadEntryPoint)
)
{
Priority = new Func<ThreadPriority>(() =>
{
lock (this.locker)
{
return this.managerThreadPriority;
}
})(),
Name = string.Format("Manager Thread ({0})", GetHashCode())
}
.Start();
Care to comment about the scoping of lock statements? For example, if I need to use a field in an if statement and that field needs to be locked, should I avoid locking the entire if statement? E.g.
bool isDumb;
lock (this.locker) isDumb = this.FieldAccessibleByMultipleThreads;
if (isDumb) ...
Vs.
lock (this.locker)
{
if (this.FieldAccessibleByMultipleThreads) ...
}
1) Before you even start the other thread, you don't have to worry about shared access to it at all.
2) Yes, you should lock all access to shared mutable data. (If it's immutable, no locking is required.)
3) Don't use GetHashCode() to indicate a thread ID. Use Thread.ManagedThreadId. I know, there are books which recommend Thread.GetHashCode() - but look at the docs.
Care to comment about the scoping of lock statements? For example, if I need to use a field in an if statement and that field needs to be locked, should I avoid locking the entire if statement?
In general, it should be scoped for the portion of code that needs the resource being guarded, and no more than that. This is so it can be available for other threads to make use of it as soon as possible.
But it depends on whether the resource you are locking is part of a bigger picture that has to maintain consistency, or whether it is a standalone resource not related directly to any other.
If you have interrelated parts that need to all change in a synchronized manner, that whole set of parts needs to be locked for the duration of the whole process.
If you have an independent, single item uncoupled to anything else, then only that one item needs to be locked long enough for a portion of the process to access it.
Another way to say it is, are you protecting synchronous or asynchronous access to the resource?
Synchronous access needs to hold on to it longer in general because it cares about a bigger picture that the resource is a part of. It must maintain consistency with related resources. You may very well wrap an entire for-loop in such a case if you want to prevent interruptions until all are processed.
Asynchronous access should hold onto it as briefly as possible. Thus, the more appropriate place for the lock would be inside portions of code, such as inside a for-loop or if-statement so you can free up the individual elements right away even before other ones are processed.
Aside from these two considerations, I would add one more. Avoid nesting of locks involving two different locking objects. I have learned by experience that it is a likely source of deadlocks, particularly if other parts of the code use them. If the two objects are part of a group that needs to be treated as a single whole all the time, such nesting should be refactored out.
There is no need to lock anything before you have started any threads.
If you are only going to read a variable there's no need for locks either. It's when you mix reads and writes that you need to use mutexes and similar locking, and you need to lock in both the reading and the writing thread.
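A minimal illustration of locking in both the reading and the writing thread (the counter itself is hypothetical; the point is that the same lock object guards both sides):

```csharp
class SharedCounter
{
    private readonly object gate = new object();
    private long value;

    public void Increment()
    {
        lock (gate) { value++; } // the writer takes the lock...
    }

    public long Read()
    {
        lock (gate) { return value; } // ...and so does every reader
    }
}
```

For a long on a 32-bit runtime the read-side lock is not optional: a bare 64-bit read can tear, returning a value that was never written.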
