We have a situation where we want to limit the number of parallel requests our application can make to its application server. We potentially have 100+ background threads running that will at some point want to call the application server, but we only want 5 threads to be able to call SendMessage() (or whatever the method will be) at any one time. What is the best way of achieving this?
I have considered using some sort of gatekeeper object that blocks threads coming into the method until the number of threads executing in it has dropped below the threshold. Would this be a reasonable solution or am I overlooking the fact that this might be dirty/dangerous?
We are developing in C#.NET 3.5.
Thanks,
Steve
Use a semaphore
http://msdn.microsoft.com/en-us/library/system.threading.semaphore.aspx
Limits the number of threads that can access a resource or pool of resources concurrently.
You want a semaphore... System.Threading.Semaphore
public static class MyClass
{
private static Semaphore sem = new Semaphore(5, 5);
public static void SendMessage()
{
sem.WaitOne();
try
{
// Call the application server here; at most 5 threads run this at once.
}
finally
{
sem.Release(1);
}
}
}
Alternatively, if you only want a single thread to be able to call a method at a given time, .NET also exposes a concept equivalent to Java's synchronized keyword:
[System.Runtime.CompilerServices.MethodImpl(System.Runtime.CompilerServices.MethodImplOptions.Synchronized)]
The Semaphore class was designed for exactly this scenario.
Design Pattern Approach:
- Use command pattern with five Executor threads and wrap your requests in Command classes.
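The command-pattern suggestion above could look roughly like the following sketch. It uses only Monitor and Queue<T> so that it works on .NET 3.5. The names ICommand, DelegateCommand and CommandExecutor are hypothetical and chosen for illustration only. Unlike the semaphore approach, callers do not block here; they hand the work to the executor threads.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical command interface (an assumption for this sketch).
public interface ICommand
{
    void Execute();
}

// Hypothetical helper that wraps a delegate as a command.
public sealed class DelegateCommand : ICommand
{
    private readonly Action _action;
    public DelegateCommand(Action action) { _action = action; }
    public void Execute() { _action(); }
}

// A fixed set of worker threads draining a shared queue of commands.
public sealed class CommandExecutor
{
    private readonly Queue<ICommand> _queue = new Queue<ICommand>();
    private readonly object _sync = new object();
    private readonly List<Thread> _workers = new List<Thread>();
    private bool _shuttingDown;

    public CommandExecutor(int workerCount)
    {
        for (int i = 0; i < workerCount; i++)
        {
            Thread worker = new Thread(WorkLoop);
            worker.IsBackground = true;
            _workers.Add(worker);
            worker.Start();
        }
    }

    // Called by any background thread; returns immediately.
    public void Enqueue(ICommand command)
    {
        lock (_sync)
        {
            if (_shuttingDown)
                throw new InvalidOperationException("Executor is shutting down.");
            _queue.Enqueue(command);
            Monitor.Pulse(_sync); // wake one waiting worker
        }
    }

    // Lets already queued commands finish, then stops the workers.
    public void Shutdown()
    {
        lock (_sync)
        {
            _shuttingDown = true;
            Monitor.PulseAll(_sync);
        }
        foreach (Thread worker in _workers) worker.Join();
    }

    private void WorkLoop()
    {
        while (true)
        {
            ICommand command;
            lock (_sync)
            {
                while (_queue.Count == 0)
                {
                    if (_shuttingDown) return;
                    Monitor.Wait(_sync); // releases the lock while waiting
                }
                command = _queue.Dequeue();
            }
            try { command.Execute(); } // runs outside the lock
            catch (Exception) { /* a real implementation would log or report this */ }
        }
    }
}
```

With new CommandExecutor(5), no more than five commands execute at once, because only five threads ever call Execute().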
Related
I have read that to do a fake async method in this way it is a bad idea:
public int myMethodSync()
{
//sync operations
return result;
}
public async Task<int> myMethodAsync()
{
return await Task.Run(() => myMethodSync());
}
One of the reasons I have read is that ASP.NET, for example, can have scalability problems with this kind of library: tasks use the thread pool, and ASP.NET needs the thread pool to serve each request. The library could therefore consume all the thread pool threads and block ASP.NET. So it is better to let the client decide how to use the thread pool.
If I am not wrong, Parallel.Invoke also uses the thread pool to run methods in parallel, so I guess that if my library uses Parallel.Invoke, Parallel.ForEach or any other way of running code in parallel, I would have the same problem. Is that true?
My idea is to run two methods in parallel because they are independent and I could get better performance that way. So I would have something like this:
public int myMainMethodSync()
{
int result01 = myMethod01Sync();
int result02 = myMethod02Sync();
return result01 + result02;
}
private int myMethod01Sync()
{
// sync operations
return 0; // placeholder result
}
private int myMethod02Sync()
{
// sync operations
return 0; // placeholder result
}
public int myMainMethodAsync()
{
Task<int> myTsk01 = Task.Run(() => myMethod01Sync());
Task<int> myTsk02 = Task.Run(() => myMethod02Sync());
Task.WhenAll(myTsk01, myTsk02);
return myTsk01.Result + myTsk02.Result;
}
public int myMainMethodParallel()
{
int result01 = 0;
int result02 = 0;
Parallel.Invoke(() => result01 = myMethod01Sync(),
() => result02 = myMethod02Sync());
return result01 + result02;
}
The idea is to have a sync method that runs the two methods sequentially, so the client who uses the library knows that this method will not use the thread pool.
Then I have two options to run the methods at the same time: tasks or Parallel.Invoke.
In the case of tasks, I am using a fake async method because I am wrapping the sync methods inside tasks, which use two threads from the thread pool. If I am not wrong, this is not recommended.
The other option is to use Parallel.Invoke, which also uses threads from the thread pool, so I guess it has the same problem as tasks and is not recommended either.
In my case I would prefer to use tasks, because I can decide with a condition whether to run myMethod02Sync, for example, so I could save the cost of assigning a thread to the second method when I know it is not needed. I guess this is not possible with Parallel.Invoke.
However, since I also implement a sync method, I let the client choose the method it considers best for its case. So is it really a bad option to use tasks in the async method?
If both solutions, tasks and Parallel.Invoke, are bad, is it then not recommended to run parallel code in libraries, and only to use it at the top level, in the UI or client of the library? I guess that would make the use of parallelism very restrictive, because the top level (the UI) could not tell the library whether to use threads or not if the library had no parallel methods.
In summary, is my solution, exposing sync and async methods, a bad idea? Is it a bad idea to use tasks or parallel code in libraries? If one of them is the better option, which one?
Thanks.
is my solution, expose sync and async methods a bad idea?
Let me reformulate the question to make it more general:
Is it a good idea to expose two versions of a method with different performance characteristics?
I think that most of the time, it is a bad idea. The API of your library should be clear, you should not make the users of your library constantly keep choosing between the two options. I think it's your responsibility as a library author to make the decision, even if it's going to be the wrong one for some of your users.
If the differences between the two options are dramatic, you could consider some approach that lets your users choose between them. But I think having two separate methods is the wrong choice; something like an optional parameter would be a better approach, because it means there is a clear default.
The one exception I can think of is if the signatures of the two methods are different, like with truly async methods. But I don't think that applies to your use of Tasks to parallelize CPU-bound methods.
Is it bad idea to use task or parallel code in the libraries?
I think you should use them cautiously. You are right that your users might not be happy if your library uses more resources (here, threads) to make itself faster. On the other hand, most methods of parallelizing code are smart enough that if the amount of available thread pool threads is limited, they will still work fine. So, if you measured that the speedup gained by parallelizing your code is significant, I think it's okay to do it.
If one of them it is better option, which one?
I think this is more a matter of which one you prefer as a matter of code style. The performance characteristics of Parallel.Invoke() with two actions and synchronously waiting for two Tasks should be comparable.
Though keep in mind that your call to Task.WhenAll doesn't really do anything, since WhenAll only returns a Task that completes when all its component Tasks complete, and you discard that Task. You could instead use Task.WaitAll, but I'm not sure what the point would be, since you're already implicitly waiting for both Tasks by accessing their Results.
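To illustrate the point about Task.WhenAll, here is a minimal sketch of what actually consuming its result looks like. The class name ParallelSum and the two stand-in method bodies are assumptions for illustration. Whether a Task-returning wrapper is appropriate for a library at all is a separate question, as discussed above.

```csharp
using System.Threading.Tasks;

public static class ParallelSum
{
    // Stand-ins for the two independent CPU-bound methods from the question.
    private static int MyMethod01Sync() { return 1; }
    private static int MyMethod02Sync() { return 2; }

    public static async Task<int> MyMainMethodAsync()
    {
        Task<int> task01 = Task.Run(() => MyMethod01Sync());
        Task<int> task02 = Task.Run(() => MyMethod02Sync());

        // WhenAll only has an effect if the Task it returns is awaited (or otherwise observed).
        int[] results = await Task.WhenAll(task01, task02);
        return results[0] + results[1];
    }
}
```

Here the method really is asynchronous from the caller's point of view: the calling thread is not blocked while the two tasks run.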
What I'm trying to accomplish: I have an action block with MaxDegreeOfParallelism = 4. I want to create one local instance of a session object for each parallel path, so I want a total of 4 session objects. If these were threads, I would create something like:
ThreadLocal<Session> sessionPerThread = new ThreadLocal<Session>(() => new Session());
I know blocks are not threads so I'm looking for something similar but for blocks. Any way to create this?
This block is in a service and runs for months on end. During that time period tons of threads are used for each concurrent slot of the block so thread local storage is not appropriate. I need something tied to the logical block slot. Also this block never completes, it runs the entire lifetime of the service.
Note: The above suggested answer is not valid for what I am asking. I'm specifically asking for something different than thread local and the above answer is using thread local. This is a different question entirely.
As it sounds like you already know, Dataflow blocks provide absolutely no guarantee of correlation between blocks, execution, and threads. Even with max parallelism set to 4, all 4 tasks could be executing on the same thread. Or an individual task may execute on many threads.
Given that you ultimately want to reuse n instances of an expensive service for your n degrees of parallelism, let's take dataflow completely out of the picture for a minute, since it doesn't help (or directly hinder) you from any general solution to this problem. It's actually quite simple. You can use a ConcurrentStack<T>, where T is the type of your service that is expensive to instantiate. You have code that appears at the top of the method (or delegate) that represents one of your parallel units of work:
// Must be initialized before DoWork() runs; TryPop on a null field would throw.
private readonly ConcurrentStack<T> reusableServices = new ConcurrentStack<T>();
private void DoWork() {
T service;
if (!this.reusableServices.TryPop(out service)) {
service = new T(); // expensive construction
}
// Use your shared service.
//// Code here.
// Put the service back when we're done with it so someone else can use it.
this.reusableServices.Push(service);
}
This way, you can quickly see that you create at most as many instances of your expensive service as you have simultaneous executions of DoWork(). You don't even have to hard-code the degree of parallelism you expect. And it's orthogonal to how you actually schedule that parallelism (so thread pool, Dataflow, PLINQ, etc. doesn't matter).
So you can just use DoWork() as your Dataflow block's delegate and you're set to go.
Of course, there's nothing magical about ConcurrentStack<T> here, except that the thread safety of push and pop is built into the type, so you don't have to add locking yourself.
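As a rough, self-contained check of this pooling idea, the sketch below uses Parallel.For with MaxDegreeOfParallelism in place of a Dataflow block, so it needs no extra package. The Session class here is a hypothetical stand-in that only counts how many instances were constructed.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the expensive session object.
public sealed class Session
{
    private static int _created;
    public static int CreatedCount { get { return Volatile.Read(ref _created); } }

    public Session() { Interlocked.Increment(ref _created); }

    public void Process(int item) { Thread.Sleep(1); } // simulated work
}

public static class SessionPoolDemo
{
    private static readonly ConcurrentStack<Session> Pool = new ConcurrentStack<Session>();

    private static void DoWork(int item)
    {
        Session session;
        if (!Pool.TryPop(out session))
        {
            session = new Session(); // only when no idle session is available
        }
        try
        {
            session.Process(item);
        }
        finally
        {
            Pool.Push(session); // hand the session back for reuse
        }
    }

    // Runs itemCount work items and returns how many sessions were constructed.
    public static int Run(int itemCount, int maxParallelism)
    {
        ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        Parallel.For(0, itemCount, options, i => DoWork(i));
        return Session.CreatedCount;
    }
}
```

Calling SessionPoolDemo.Run(200, 4) should construct no more than 4 sessions, regardless of how many different threads end up executing the work items.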
I have written a program in C#. Now I finished all the functionality and it works. But only running with one thread. I'm doing a lot of calculation and sometimes loading about 300 MB or more of measurement files into the application.
I now want to make the program multithreaded because the user experience is really bad during intense processing or I/O operations.
What is the best way to refactor the program so that it can be made multithreaded without too much effort? I know this is stuff I should have thought about before, but I haven't.
I used the singleton pattern for about 3 big and important modules which are involved in nearly every other functionality of the program.
I used a more or less clean MVC (Model View Control) architecture. So I wonder if it is maybe possible to let the User Interface run in one thread and the rest of the application in another.
If not, loading and parsing 300MB, creating objects will take about 3 minutes to finish. In this time the user gets no response from the GUI. :/
UPDATE:
My singletons are used as a kind of storage. One singleton stores the objects of the parsed measurement files, while the other singleton stores the results. I have different calculations that use the same measurement files and create results, which they save using the other singleton. This is one problem.
The second is to keep the GUI responsive to user actions, or at least to avoid the warning that the window is not responding.
Thank you all for the advice. I will try it. Sorry for the late answer.
Generally, I avoid the singleton pattern because it creates a lot of issues down the road, particularly in testing. However, there is a fairly simple solution to making this work for multiple threads, if what you want is a singleton per thread. Put your singleton reference in a field (not a property) and decorate it with the ThreadStaticAttribute:
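public class MySingleton
{
// No field initializer here: with [ThreadStatic] it would run only once,
// on the first thread, and every other thread would see null.
[ThreadStatic]
private static MySingleton _instance;
// Create the instance lazily, once per thread.
public static MySingleton Instance { get { return _instance ?? (_instance = new MySingleton()); } }
}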
Now each thread will have its own instance of MySingleton.
The easiest way is to move all calculations to one separate thread and update the GUI using Invoke/InvokeRequired.
public partial class MyForm : Form
{
Thread _workerThread;
public MyForm()
{
_workerThread = new Thread(Calculate);
}
public void StartCalc()
{
_workerThread.Start();
}
public void Calculate()
{
//call singleton here
}
// true if the user is allowed to change calc settings
public bool CanUpdateSettings
{
get { return !_workerThread.IsAlive; }
}
}
In this way you get a responsive GUI while the calculations are running.
The application will be thread safe as long as you don't allow the user to make changes during a running calculation.
Using several threads for doing the calculations is a much more complex story which we need more information for to give you a proper answer.
You can use TPL
You can make the loops parallel with the TPL; furthermore, it is built into .NET 4.0, so you don't have to change your program much.
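As a minimal sketch of what parallelizing a loop with the TPL can look like: the Calculate method below is a hypothetical placeholder for one independent calculation, and the approach assumes the iterations do not share mutable state.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelLoopDemo
{
    // Hypothetical CPU-bound calculation on a single measurement value.
    private static double Calculate(double value)
    {
        return Math.Sqrt(value) * 2.0;
    }

    public static double[] CalculateAll(double[] measurements)
    {
        double[] results = new double[measurements.Length];

        // Iterations may run on several thread pool threads at once.
        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, measurements.Length, i =>
        {
            results[i] = Calculate(measurements[i]);
        });

        return results;
    }
}
```

Note that Parallel.For blocks the calling thread until the loop finishes, so calling it from the UI thread would still freeze the GUI; it would have to be started from a background thread.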
What is a best approach to make a function or set of statements thread safe in C#?
Don't use shared read/write state when possible. Go with immutable types.
Take a look at the C# lock statement. Read Jon Skeet's article on multi threading in .net.
It depends on what you're trying to accomplish.
If you want to make sure that in any given time only one thread would run a specific code use lock or Monitor:
public void Func(...)
{
lock(syncObject)
{
// only one thread can enter this code
}
}
If, on the other hand, you want multiple threads to run the same code but do not want them to cause race conditions by changing the same location in memory, don't write to static/shared objects that can be reached by multiple threads at the same time.
BTW - If you want to create a static object that would be shared only within a single thread use the ThreadStatic attribute (http://msdn.microsoft.com/en-us/library/system.threadstaticattribute(VS.71).aspx).
Use a lock statement around shared state variables. Once you have ensured thread safety, run the code through a profiler to find bottlenecks and optimize those places with more advanced multithreading constructs.
The best approach will vary depending on your exact problem at hand.
The simplest approach in C# is to "lock" resources shared by multiple threads using a lock statement. This creates a block of code which can only be accessed by one thread at a time: the one which has obtained the "lock" object. For example, this property is thread safe using the lock syntax:
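public class MyClass
{
// Lock on a private object rather than on "this", so outside code cannot take the same lock.
private readonly object _sync = new object();
private int _myValue;
public int MyProperty
{
get
{
lock(_sync)
{
return _myValue;
}
}
set
{
lock(_sync)
{
_myValue = value;
}
}
}
}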
A thread acquires the lock at the start of the block and only releases it at the end of the block. If the lock is not available, the thread will wait until it is. Obviously, direct access to the private variable within the class is not thread-safe, so all threads must access the value through the property to be safe.
This is by far the simplest way for threads to have safe access to shared data, however it only touches the tip of the iceberg of techniques for threading.
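To make the effect of a lock visible, here is a small sketch in which several threads increment a shared counter. The Counter and CounterDemo names are assumptions for illustration. Without the lock, the read-modify-write in _value++ could lose updates; with it, the final count is deterministic.

```csharp
using System.Threading;

public sealed class Counter
{
    private readonly object _sync = new object();
    private int _value;

    public void Increment()
    {
        lock (_sync) { _value++; } // only one thread at a time runs this statement
    }

    public int Value
    {
        get { lock (_sync) { return _value; } }
    }
}

public static class CounterDemo
{
    public static int Run(int threadCount, int incrementsPerThread)
    {
        Counter counter = new Counter();
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++) counter.Increment();
            });
            threads[i].Start();
        }

        foreach (Thread t in threads) t.Join(); // wait for all increments to finish
        return counter.Value;
    }
}
```

For a plain counter like this, Interlocked.Increment would be a lighter alternative; the lock is used here because it generalizes to larger critical sections.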
Write the function in such a way that:
It does not modify its parameters in any way
It does not access any state outside of its local variables.
Otherwise, race conditions MAY occur. The code must be thoroughly examined for such conditions and appropriate thread synchronization must be implemented (locks, etc...). Writing code that does not require synchronization is the best way to make it thread-safe. Of course, this is often not possible - but should be the first option considered in most situations.
There's a lot to understand when learning what "thread safe" means and all the issues that are introduced (synchronization, etc).
I'd recommend reading through this page in order to get a better feel for what you're asking: Threading in C#. It gives a pretty comprehensive overview of the subject, which sounds like it could be pretty helpful.
And Mehrdad's absolutely right -- go with immutable types if you can help it.
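As a small illustration of the immutable-types advice, the sketch below shows a type whose state cannot change after construction, so it can be shared between threads without locks. The Measurement type and its members are hypothetical.

```csharp
// All fields are readonly and set once in the constructor.
public sealed class Measurement
{
    private readonly string _name;
    private readonly double _value;

    public Measurement(string name, double value)
    {
        _name = name;
        _value = value;
    }

    public string Name { get { return _name; } }
    public double Value { get { return _value; } }

    // "Changing" a value produces a new object; the original stays untouched,
    // so threads still reading it are unaffected.
    public Measurement WithValue(double value)
    {
        return new Measurement(_name, value);
    }
}
```

A thread that needs a modified measurement calls WithValue and works on its own copy, which removes the need for synchronization on the object itself.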
Alright...I've given the site a fair search and have read over many posts about this topic. I found this question: Code for a simple thread pool in C# especially helpful.
However, as it always seems, what I need varies slightly.
I have looked over the MSDN example and adapted it to my needs somewhat. The example I refer to is here: http://msdn.microsoft.com/en-us/library/3dasc8as(VS.80,printer).aspx
My issue is this. I have a fairly simple set of code that loads a web page via the HttpWebRequest and WebResponse classes and reads the results via a Stream. I fire off this method in a thread, as it will need to be executed many times. The method itself is pretty short, but the number of times it needs to be fired (with different data each time) varies. It can be anywhere from 1 to 200.
Everything I've read seems to indicate the ThreadPool class is the prime candidate. Here is where things get tricky. I might need to fire off this thing, say, 100 times, but I can only have 3 threads at most running (for this particular task).
I've tried setting the MaxThreads on the ThreadPool via:
ThreadPool.SetMaxThreads(3, 3);
I'm not entirely convinced this approach is working. Furthermore, I don't want to clobber other web sites or programs running on the system this will be running on. So, by limiting the # of threads on the ThreadPool, can I be certain that this pertains to my code and my threads only?
The MSDN example uses the event-driven approach and calls WaitHandle.WaitAll(doneEvents);, which is how I'm doing this.
So the heart of my question is, how does one ensure or specify a maximum number of threads that can be run for their code, but have the code keep running more threads as the previous ones finish up until some arbitrary point? Am I tackling this the right way?
Sincerely,
Jason
Okay, I've added a semaphore approach and completely removed the ThreadPool code. It seems simple enough. I got my info from: http://www.albahari.com/threading/part2.aspx
It's this example that showed me how:
[text below here is a copy/paste from the site]
A Semaphore with a capacity of one is similar to a Mutex or lock, except that the Semaphore has no "owner" – it's thread-agnostic. Any thread can call Release on a Semaphore, while with Mutex and lock, only the thread that obtained the resource can release it.
In this following example, ten threads execute a loop with a Sleep statement in the middle. A Semaphore ensures that not more than three threads can execute that Sleep statement at once:
class SemaphoreTest
{
static Semaphore s = new Semaphore(3, 3); // Available=3; Capacity=3
static void Main()
{
for (int i = 0; i < 10; i++)
new Thread(Go).Start();
}
static void Go()
{
while (true)
{
s.WaitOne();
Thread.Sleep(100); // Only 3 threads can get here at once
s.Release();
}
}
}
Note: if you are limiting this to "3" just so you don't overwhelm the machine running your app, I'd make sure this is a problem first. The threadpool is supposed to manage this for you. On the other hand, if you don't want to overwhelm some other resource, then read on!
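You can't usefully manage the size of the thread pool for just your own code (or really much of anything about it): ThreadPool.SetMaxThreads applies to the whole process, and it cannot set the maximum below the number of processors on the machine.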
In this case, I'd use a semaphore to manage access to your resource. In your case, your resource is running the web scrape, or calculating some report, etc.
To do this, in your static class, create a semaphore object:
static System.Threading.Semaphore S = new System.Threading.Semaphore(3, 3);
Then, in each thread, you do this (using the shared semaphore above; do not create a new one per thread):
// wait your turn (decrement)
S.WaitOne();
try
{
// do your thing
}
finally {
// release so others can go (increment)
S.Release();
}
Each thread will block on the S.WaitOne() until it is given the signal to proceed. Once S has been decremented 3 times, all threads will block until one of them increments the counter.
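The behaviour described above can be checked with a small sketch that records the highest number of jobs ever inside the guarded section at once. The ThrottledJobs class, the Thread.Sleep stand-in for the web request, and the use of one dedicated thread per job are assumptions made to keep the example short.

```csharp
using System.Threading;

public static class ThrottledJobs
{
    // Runs jobCount jobs and returns the peak number that were inside the
    // semaphore-guarded section at the same time.
    public static int Run(int jobCount, int maxConcurrent)
    {
        int running = 0;
        int peak = 0;

        using (Semaphore gate = new Semaphore(maxConcurrent, maxConcurrent))
        {
            Thread[] threads = new Thread[jobCount];
            for (int i = 0; i < jobCount; i++)
            {
                threads[i] = new Thread(() =>
                {
                    gate.WaitOne(); // wait your turn (decrement)
                    try
                    {
                        int now = Interlocked.Increment(ref running);

                        // Record the peak concurrency seen so far.
                        int seen;
                        do { seen = Volatile.Read(ref peak); }
                        while (now > seen &&
                               Interlocked.CompareExchange(ref peak, now, seen) != seen);

                        Thread.Sleep(5); // stand-in for the web request
                        Interlocked.Decrement(ref running);
                    }
                    finally
                    {
                        gate.Release(); // let the next waiting thread in (increment)
                    }
                });
                threads[i].Start();
            }

            foreach (Thread t in threads) t.Join();
        }

        return peak;
    }
}
```

With ThrottledJobs.Run(20, 3), all 20 threads are started, but the returned peak should never exceed 3.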
This solution isn't perfect.
If you want something a little cleaner, and more efficient, I'd recommend going with a BlockingQueue approach wherein you enqueue the work you want performed into a global Blocking Queue object.
Meanwhile, you have three threads (which you created--not in the threadpool), popping work out of the queue to perform. This isn't that tricky to setup and is very fast and simple.
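A rough sketch of the blocking-queue approach follows, assuming .NET 4 or later, where BlockingCollection<T> is available. The class name BlockingQueueDemo is hypothetical, and the integer work items are a stand-in for the real work, which would be URLs or request descriptions.

```csharp
using System.Collections.Concurrent;
using System.Threading;

public static class BlockingQueueDemo
{
    // Processes all work items on a fixed number of dedicated threads and
    // returns the sum of the items as a simple, checkable result.
    public static int SumAll(int[] workItems, int workerCount)
    {
        int sum = 0;

        using (BlockingCollection<int> queue = new BlockingCollection<int>())
        {
            Thread[] workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding
                    // has been called and the queue has been drained.
                    foreach (int item in queue.GetConsumingEnumerable())
                    {
                        Interlocked.Add(ref sum, item); // stand-in for the real work
                    }
                });
                workers[i].Start();
            }

            foreach (int item in workItems) queue.Add(item);
            queue.CompleteAdding(); // no more work will arrive

            foreach (Thread w in workers) w.Join();
        }

        return sum;
    }
}
```

Because only workerCount threads ever take items from the queue, the concurrency limit follows from the design and needs no semaphore.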
Examples:
Best threading queue example / best practice
Best method to get objects from a BlockingQueue in a concurrent program?
ThreadPool is a static class like any other, which means that anything you do with it affects every other thread in the current process. It doesn't affect other processes.
I consider this one of the larger design flaws in .NET, however. Who came up with the brilliant idea of making the thread pool static? As your example shows, we often want a thread pool dedicated to our task, without having it interfere with unrelated tasks elsewhere in the system.