I have some common code that I seem to use often in some of our in-house applications.
public State SomeState { get; set; }

Thread thread = new Thread(ThreadProc);
thread.Start();

void ThreadProc()
{
    while (isTrue)
    {
        // Changes SomeState after ping.
        // Can take up to 5 seconds to complete if the service is unavailable.
        PingServiceTocheckState();
        UpdateUI();
        Thread.Sleep(200);
    }
}
void UpdateUI()
{
    if (InvokeRequired)
    {
        Invoke(new MethodInvoker(UpdateUI));
        return;
    }

    if (SomeState == State.Up)
    {
        // set these UI values (txt.Text = ...)
    }
    ......
}
And the constant action may be updating the UI on a Windows Form, or constantly reading from a network stream.
I feel this is probably outdated when compared to TPL and Async.
Am I correct to feel it can be better managed using Async/TPL?
And if so, could someone give me an example of how I would accomplish the above using these new ideas?
*Note: I currently use async occasionally and I am trying to make it more prevalent in my coding, but I am a little lost on the best method for creating a long-running "background thread" using async.
Edit:
I have updated my example with a little more detail relating to my current project. I have code similar to this in a UserControl. I then have a Windows Form which adds ~50 of these controls dynamically. Without the thread the form basically hangs; with the thread everything runs smoothly. I just figured that rather than managing the thread myself, async/TPL would manage the tasks (ping and update) more efficiently than I could.
The TPL is well suited to tasks that are short in duration, since you are basically using up workers from a pool. If your operation is continuous, a dedicated thread is entirely appropriate.

If, however, the PerformAction is actually blocking while waiting on data coming into a queue, and only briefly and occasionally has anything to do, then yes: the TPL may come back into play. But then there is a question of sequential vs concurrent programming: at the moment, in that scenario, you are processing things sequentially. The TPL makes no such guarantee and is intended to process things concurrently. It is of course possible to write code that uses the TPL (or just the regular ThreadPool) to process a sequential queue, but it is a bit more complicated, and there are edge cases and thread races around the "is there a current worker? do I need to start one? is it me?" questions.
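For reference, an async/await version of the loop from the question might look like the sketch below. This is an illustration only: PingServiceTocheckState and UpdateUI are the methods from the question, and a CancellationToken replaces the isTrue flag.

```csharp
// Sketch, assuming this lives inside the form/UserControl.
private async Task PollStateAsync(CancellationToken token)
{
    while (!token.IsCancellationRequested)
    {
        // Run the (potentially 5-second) blocking ping on a pool thread
        // so the UI thread is never blocked.
        await Task.Run(() => PingServiceTocheckState());

        // The await captured the WinForms SynchronizationContext, so
        // execution resumes here on the UI thread: no Invoke needed.
        UpdateUI();

        await Task.Delay(200, token);
    }
}
```

Started from the UI thread, e.g. `var cts = new CancellationTokenSource(); var poll = PollStateAsync(cts.Token);`, with `cts.Cancel()` called on shutdown.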
Thread, the TPL and async all stand on the shoulders of the same OS kernel objects and artifacts. They are just different classes that provide different functionality to you.
None of them gives you true low-level control over threading, but the TPL, when it can, splits your task among the different cores of your CPU and in general finds the best possible way of parallelising your work on modern multi-core processors, so it tends to be faster and more lightweight.
So the TPL is the suggested library for multithreading in your app, if you have and can use the most up-to-date .NET Framework.
Related
I've looked over multiple similar questions on SO, but I still couldn't answer my own question.
I have a console app (an Azure WebJob actually) which does file processing and DB management. Some heavy data is being downloaded from multiple sources and processed on the DB.
Here's an example of my code:
var dbLongIndpendentProcess = doProcesAsync();

var myfilesTasks = files.Select(file => Task.Run(
    async () =>
    {
        // files processing
    }));

await Task.WhenAll(myfilesTasks);
await dbLongIndpendentProcess;
// continue with other stuff
It all works fine and does what I am expecting it to do. There are other tasks running in this whole process, but I guess the idea is clear from the code above.
My question: Is this a fair way of approaching this, or would I get more performance (or sense?) by doing the good old "manual" multithreading? The main reason I chose this approach was that it's simple and straightforward.
However, wasn't async/await primarily aimed at doing asynchronous work so as not to block the main (UI) thread? Here I don't have any UI and I am not doing anything event-driven.
Thanks,
I don't think you're multithreading by using this approach (except for the single Task.Run); async doesn't generally run things on separate threads, it only prevents things from blocking. See: https://msdn.microsoft.com/en-gb/library/mt674882.aspx#Anchor_5
The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread, but a background thread doesn't help with a process that's just waiting for results to become available.
It would be much better to use tasks for the things you want to multithread; then you can take better advantage of machine cores and resources. You might want to look at a task-based solution such as pipelining (which may work in this scenario): https://msdn.microsoft.com/en-gb/library/ff963548.aspx or another alternative.
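As a rough sketch of that split (DownloadAsync and Process are hypothetical stand-ins for the real IO and CPU work), IO-bound steps are awaited so no thread is held while waiting, and CPU-bound steps are pushed to pool threads:

```csharp
// Per file: await the IO, then move the CPU-bound processing
// to a pool thread with Task.Run.
var processing = files.Select(async file =>
{
    byte[] data = await DownloadAsync(file);     // async IO, no thread blocked
    return await Task.Run(() => Process(data));  // CPU-bound work on the pool
});

var results = await Task.WhenAll(processing);
```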
I'm currently picking up C# again and developing a simple application that sends broadcast messages which, when received, are shown on a Windows Form.
I have a discovery class with two threads, one that broadcasts every 30 seconds, the other thread listens on a socket. It is in a thread because of the blocking call:
if (listenSocket.Poll(-1, SelectMode.SelectRead))
The first thread works much like a timer in a class library, it broadcasts the packet and then sleeps for 30 seconds.
Now in principle it works fine: when a packet is received I raise an event and the WinForm places it in a list. The problems start with the form, though, because the main UI thread requires Invoke. Right now I only have two threads, but this doesn't seem to be the most effective approach in the long run, and it will become complex as the number of threads grows.
I have explored Tasks, but these seem to be more oriented towards one-off long-running tasks (much like the BackgroundWorker for a form).
Most threading examples I find all report to the console and do not have the problems of Invoke and locking of variables.
As I'm using .NET 4.5, should I move to Tasks or stick with threads?
Async programming will still delegate some aspects of your application to a different thread (thread pool); if you try to update the GUI from such a thread you are going to have similar problems as you have today with regular threads.
However, there are many techniques in async/await that allow you to delegate to a background thread and yet put in a kind of wait point saying "please continue here on the GUI thread when you are finished with that operation", which effectively allows you to update the GUI without Invoke and keep it responsive. I am talking about ConfigureAwait, but there are other techniques as well.
If you don't know the async/await mechanism yet, it will take some investment of your time to learn all these new things, but you'll find it very rewarding. It is up to you to decide whether you are willing to spend a few days learning and experimenting with a technology that is new to you.
Google around a bit on async/await; there are some excellent articles from Stephen Cleary, for instance http://blog.stephencleary.com/2012/02/async-and-await.html
Firstly, if you're worried about scalability you should probably start off with an approach that scales easily. The ThreadPool would work nicely. Tasks are based on the ThreadPool as well, and they allow for more complex situations like tasks/threads firing in a sequence (also based on a condition), synchronisation, etc. In your case (server and client) this seems unneeded.
Secondly, it looks to me that you are worried about a bottleneck scenario where more than one thread tries to access a common resource like the UI or a DB. With DBs, don't worry: they handle multiple access well. But in the case of the UI, or another not-multithread-friendly resource, you have to manage parallel access yourself. I would suggest something like BlockingCollection, which is a nice way to implement the "many producers, one consumer" pattern. This way you could have multiple threads adding stuff and just one thread reading from the collection and passing it on to the single-threaded resource like the UI.
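A minimal sketch of that pattern with BlockingCollection (the string packet type and the form.AddToList call are hypothetical placeholders):

```csharp
var packets = new BlockingCollection<string>();

// Producers: any number of threads (e.g. the listener) can add items:
// packets.Add(receivedPacket);

// Single consumer: the only code that ever touches the UI list.
var consumer = new Thread(() =>
{
    foreach (var packet in packets.GetConsumingEnumerable())
    {
        // Hand the packet to the single-threaded resource; on WinForms
        // this one place still marshals via BeginInvoke.
        form.BeginInvoke((Action)(() => form.AddToList(packet)));
    }
});
consumer.IsBackground = true;
consumer.Start();

// Call packets.CompleteAdding() on shutdown to end the consumer loop.
```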
BTW, Tasks can also be long-running, i.e. run loops. Check this documentation.
For a given task T, and blocks of code (wrapped in methods here) m1, m2 and m3,
is there a way to force any one of them, say m2, to run uninterruptedly? That is, if the thread running this program reaches its time-slice limit while in the middle of executing m2, can it be made to stay on the processor until m2 finishes, only then making room for another thread?
Is this possible ?
Example:
class Program
{
    static void Main(string[] args)
    {
        Task task = new Task(() =>
        {
            m1();
            // I want to assure m2 runs uninterruptedly, i.e., that the
            // running thread does not stop while executing m2
            m2();
            m3();
        });
    }

    private static void m1()
    {
        // whatever1...
    }

    private static void m2()
    {
        // whatever2...
    }

    private static void m3()
    {
        // whatever3...
    }
}
Answering everybody's "why did you come up with this?": it popped up while trying to find a solution to this problem. What I thought was: I need to guarantee that my switches from and to the main window handle are not interrupted/preempted.
Whether this is a plausible solution for my real problem or not - I thought the question was worth asking.
No. You are running your code on a preemptive OS, so there is nothing in your power to prevent preemption.
What is your real problem? Try to describe the actual problem you have and why you came up with this requirement; then perhaps one can offer advice on your actual problem.
After update
Remember this simple rule of thumb for multi-threaded programming: threads are almost never the right answer; event-driven programming almost always is. Unless you have CPU-intensive computation that can scale up on distinct data sets, there is seldom a need to add threads. Spawning a new thread (or Task) simply to wait for IO (navigate to a URL...) is an anti-pattern. Unfortunately, 99.999% of multi-threaded programming examples simply use threads to simplify the control flow.
With that in mind, consider using an event-driven model, for example EventFiringWebDriver. You will only have one thread, your main thread, and will react in it to your web driver events (URLs downloaded, tabs opened, etc.).
This is one of those situations where the correct answer depends on why you are trying to do what you are asking. Doing this right is much harder than it sounds.
In the general case, yes: there are ways you can make that happen. However, I assume from the context that you are wondering whether you can make that happen on a stock Windows OS. The short answer is no. Windows has a preemptively multitasking kernel, and there is nothing a user process can do to prevent it from interrupting the thread when the time slice ends.
However, that's not the whole story. You can set your thread's priority to "REALTIME", which will keep it from being preempted by any other thread. The scheduler will still preempt you, but because no one else has a higher priority it will come right back to you. This document explains how to do that.
Note that this is known to be a bad idea. If not properly managed it will take over the whole machine and bring it to a screeching halt.
For most users, the suggestion is to use the Multimedia Class Scheduler Service instead.
If you really need real-time services for your software (that is, being able to run something with hard guarantees about quality of service), you might look at real-time extensions for Windows or Linux, or at one of the fully real-time systems like VxWorks or QNX.
Not without switching to an embedded version of Windows that supports real-time extensions. And even then you would need to switch from C# to a native language to be able to use those scheduling features.
The best you can do is raise the thread's priority during the execution of m2 to make it less likely that it will get scheduled away. However, if you are going to do that you should not use Task.Run; use an actual Thread object instead, as you should not change priorities on a ThreadPool thread (which is what Task.Run uses).
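A sketch of that advice, using the m1/m2/m3 from the question. Note that raising priority only narrows the window for preemption; it does not eliminate it:

```csharp
var thread = new Thread(() =>
{
    m1();

    // Raise priority only around m2, and always restore it.
    Thread.CurrentThread.Priority = ThreadPriority.Highest;
    try
    {
        m2();
    }
    finally
    {
        Thread.CurrentThread.Priority = ThreadPriority.Normal;
    }

    m3();
});
thread.Start();
```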
In my C# project I have a form that is displayed which shows a progress bar.
Some processing is then done which involves communication over the internet.
While this processing is occurring, the form says "Not Responding" when run on Windows 7; on XP the form just goes white.
Either way, it is not acceptable.
Do I need to use threads to solve this problem?
What are some of the basics of threads that I would need to use in this scenario?
Your processing must be done within a thread.
From that thread you then have to invoke the progress bar update to show the progress:
progressBar1.Invoke((MethodInvoker)delegate
{
    // Cast to double first so integer division doesn't truncate to 0.
    progressBar1.Value = (int)((i / (double)limit) * 100);
});
Yes, you have to use threads to keep your UI responsive while something gets done in the background. But this question cannot be answered with just "use threads to solve it", because there are many forms in which you could use threads (BackgroundWorker, ThreadPool, async IO, creating a Thread, the Task Parallel Library, the CCR, and more, for every kind of parallelisation scenario).
As you said, you are doing some processing which needs to connect to the internet. Where is most of the time spent? If it is IO over the network that takes most of the time, then asynchronous IO probably makes a lot of sense. If the time is spent in one huge processing operation, then a BackgroundWorker is perfect; but if this processing can be broken down further into smaller chunks of parallel processing tasks, then the TPL or the ThreadPool is preferred. So far I am only talking about processing that happens on a Windows Forms event while keeping the UI responsive; based on the scenario there are numerous other options you could use to make threading work for you.
Async IO doesn't look like threading; it matches the eventing model of WinForms more closely, so you could look at that if you are very comfortable with event-based programming.
The ThreadPool looks more like a queue of workers to which you keep throwing all the work that needs to be done; the framework figures out how many threads to run based on the kind of machine you are using (dual core, quad core, etc.) and gets your work items done in an optimal way.
Bottom line: there is no single answer for using one over the other; instead, the threading model needs to be chosen based on the kind of problem you are solving.
A cheaper option is to add the line Application.DoEvents() inside whatever loops your app is running, which will cause it to process messages each time it gets there (though DoEvents can introduce reentrancy bugs, so use it sparingly).
If you use System.Net.WebClient, you can use DownloadDataAsync() to communicate in a non-blocking way.
The System.Net.Sockets.Socket class provides non-blocking communication, too.
Sockets example
WebClient example
Yes, but a better way is to use the BackgroundWorker component. It is a wrapper over threads, so you don't need to manage the threads yourself. You can get more info from Background worker on MSDN.
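A minimal BackgroundWorker sketch, where DoOneNetworkStep is a hypothetical stand-in for one slice of the internet work:

```csharp
var worker = new BackgroundWorker { WorkerReportsProgress = true };

worker.DoWork += (s, e) =>
{
    // Runs on a background thread.
    for (int i = 1; i <= 100; i++)
    {
        DoOneNetworkStep(i);
        worker.ReportProgress(i);   // marshalled to the UI thread for us
    }
};

worker.ProgressChanged += (s, e) => progressBar1.Value = e.ProgressPercentage;
worker.RunWorkerCompleted += (s, e) => { /* re-enable buttons, etc. */ };

worker.RunWorkerAsync();
```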
As long as the program remains inside the function doing the processing, the UI will not update. That is why you may need to start a background thread, so that your UI can continue functioning while the background thread does the work. An alternative is to use asynchronous functions.
example of background thread
From your description I'll assume that all your work is currently being done on a single thread: the main thread, which is also used for your GUI.
The progress bar can only update when that main thread gets a chance to check its state and apply any expected changes.
Therefore it is important that your processing work does not occupy the main thread for extended periods of time.
There are two main approaches to handling this:
1. Stepping the processing activity. Break the processing down into a number of serial tasks, each short in nature, and progressively call each of these serial tasks in the OnIdle event on your main thread.
2. Using a background thread. See other answers giving more detail on how this would work.
The stepping approach can be useful if you want to avoid the subtleties of thread synchronisation. The threading approach is probably better, but only essential if it is impossible to guarantee short serial steps.
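One way to sketch the stepping approach is with a WinForms Timer, a close variation of the OnIdle technique that guarantees regular short callbacks on the UI thread (ProcessNextStep is a hypothetical method that does one short slice of the work and returns true when done):

```csharp
var stepTimer = new System.Windows.Forms.Timer { Interval = 10 };

stepTimer.Tick += (s, e) =>
{
    // Each tick runs on the UI thread, so the step must return quickly.
    bool finished = ProcessNextStep();
    if (finished)
        stepTimer.Stop();
};

stepTimer.Start();
```

OnIdle works similarly, but only fires as the message queue empties; the timer keeps the steps flowing at a predictable rate.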
We have a situation where our application needs to process a series of files and rather than perform this function synchronously, we would like to employ multi-threading to have the workload split amongst different threads.
Each item of work is:
1. Open a file for read only
2. Process the data in the file
3. Write the processed data to a Dictionary
We would like to perform each file's work on a new thread.
Is this possible, and would we be better off using the ThreadPool or spawning new threads, keeping in mind that each item of "work" only takes 30 ms but hundreds of files may need to be processed?
Any ideas for making this more efficient are appreciated.
EDIT: At the moment we are making use of the ThreadPool to handle this. If we have 500 files to process we cycle through the files and allocate each "unit of processing work" to the threadpool using QueueUserWorkItem.
Is it suitable to make use of the threadpool for this?
I would suggest you use ThreadPool.QueueUserWorkItem(...); with this, the threads are managed by the system and the .NET Framework. The chances of messing things up with your own thread pool are much higher, so I would recommend using the ThreadPool provided by .NET.
It's very easy to use,
ThreadPool.QueueUserWorkItem(new WaitCallback(YourMethod), ParameterToBeUsedByMethod);

void YourMethod(object o)
{
    // your code here...
}
For more reading please follow the link http://msdn.microsoft.com/en-us/library/3dasc8as%28VS.80%29.aspx
Hope this helps.
I suggest you have a finite number of threads (say four) and give each its own pool of work. I.e., if you have 400 files to process, split them evenly at 100 files per thread. You then spawn the threads, pass each its work, and let them run until they have finished their specific work.
You only have a certain amount of IO bandwidth, so having too many threads will not provide any benefit; also remember that creating a thread takes a small amount of time.
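A sketch of that fixed-thread split (ProcessFile is a hypothetical per-file method):

```csharp
const int threadCount = 4;
var threads = new List<Thread>();

for (int t = 0; t < threadCount; t++)
{
    // Each thread gets files t, t + 4, t + 8, ... (an even split).
    // ToList() runs here, so the loop variable t is captured safely.
    var myFiles = files.Where((f, i) => i % threadCount == t).ToList();

    var thread = new Thread(() =>
    {
        foreach (var file in myFiles)
            ProcessFile(file);
    });
    threads.Add(thread);
    thread.Start();
}

foreach (var thread in threads)
    thread.Join();
```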
Instead of dealing with threads or managing thread pools directly, I would suggest using a higher-level library like Parallel Extensions (PEX):
var filesContent = from file in enumerableOfFilesToProcess
                   select new
                   {
                       File = file,
                       Content = File.ReadAllText(file)
                   };

var processedContent = from content in filesContent
                       select new
                       {
                           content.File,
                           ProcessedContent = ProcessContent(content.Content)
                       };

var dictionary = processedContent
    .AsParallel()
    .ToDictionary(c => c.File);
PEX will handle thread management according to available cores and load, while you get to concentrate on the business logic at hand (wow, that sounded like a commercial!)
PEX is part of .NET Framework 4.0, but a back-port to 3.5 is also available as part of the Reactive Framework.
I suggest using the CCR (Concurrency and Coordination Runtime); it will handle the low-level threading details for you. As for your strategy, one thread per work item may not be the best approach depending on how you attempt to write to the dictionary, because you may create heavy contention, since dictionaries aren't thread-safe.
Here's some sample code using the CCR, an Interleave would work nicely here:
Arbiter.Activate(dispatcherQueue, Arbiter.Interleave(
    new TeardownReceiverGroup(Arbiter.Receive<bool>(
        false, mainPort, new Handler<bool>(Teardown))),
    new ExclusiveReceiverGroup(Arbiter.Receive<object>(
        true, mainPort, new Handler<object>(WriteData))),
    new ConcurrentReceiverGroup(Arbiter.Receive<string>(
        true, mainPort, new Handler<string>(ReadAndProcessData)))));

public void WriteData(object data)
{
    // write data to the dictionary;
    // this code is never executed in parallel, so no synchronization is needed
}

public void ReadAndProcessData(string s)
{
    // this code gets scheduled to be executed in parallel;
    // the CCR takes care of the task scheduling for you
}

public void Teardown(bool b)
{
    // clean up when all tasks are done
}
In the long run, I think you'll be happier if you manage your own threads. This will let you control how many are running and make it easy to report status.
Build a worker class that does the processing and give it a callback routine to return results and status.
For each file, create a worker instance and a thread to run it. Put the thread in a Queue.
Peel threads off the queue, up to the maximum you want to run simultaneously. As each thread completes, go get another one. Adjust the maximum and measure throughput. I prefer to use a Dictionary to hold running threads, keyed by their ManagedThreadId.
To stop early, just clear the queue.
Use locking around your thread collections to preserve your sanity.
Use ThreadPool.QueueUserWorkItem to execute each independent task. Definitely don't create hundreds of threads. That is likely to cause major headaches.
The general rule for the ThreadPool is: use it if you don't want to worry about when the threads finish (or use mutexes to track them), or about stopping the threads.
So, do you need to worry about when the work is done? If not, the ThreadPool is the best option. If you want to track the overall progress or stop threads, then your own collection of threads is best.
The ThreadPool is generally more efficient because it re-uses threads. This question will give you a more detailed discussion.
Hth
Using the ThreadPool for each individual task is definitely a bad idea. From my experience this tends to hurt performance more than help it. The first reason is that a considerable amount of overhead is required just to allocate a task for the ThreadPool to execute. By default, each application is assigned its own ThreadPool, initialized with a capacity of ~100 threads. When you are executing 400 operations in parallel, it does not take long to fill the queue with requests, and now you have ~100 threads all competing for CPU cycles. Yes, the .NET Framework does a great job of throttling and prioritizing the queue; however, I have found that the ThreadPool is best left for long-running operations that probably won't occur very often (loading a configuration file, or random web requests). Using the ThreadPool to fire off a few operations at random is much more efficient than using it to execute hundreds of requests at once. Given the current information, the best course of action would be something similar to this:
Create a System.Threading.Thread (or use a SINGLE ThreadPool thread) with a queue that the application can post requests to
Use the FileStream's BeginRead and BeginWrite methods to perform the IO operations. This will cause the .NET Framework to use native APIs to thread and execute the IO (IOCP).
This gives you two advantages. One is that your requests will still get processed in parallel while allowing the operating system to manage file system access and threading. The second is that, because the bottleneck on the vast majority of systems will be the HDD, you can implement custom priority sorting and throttling on your request thread to gain greater control over resource usage.
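A sketch of the BeginRead part (path and the hand-off to the request queue are placeholders; the FileOptions.Asynchronous flag is what makes the stream use overlapped IO/IOCP):

```csharp
var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                        FileShare.Read, 4096, FileOptions.Asynchronous);
var buffer = new byte[4096];

fs.BeginRead(buffer, 0, buffer.Length, ar =>
{
    // The completion callback runs on an IO completion port thread.
    int bytesRead = fs.EndRead(ar);
    if (bytesRead > 0)
    {
        // post (buffer, bytesRead) to the processing queue, then
        // issue the next BeginRead for the following chunk...
    }
    else
    {
        fs.Dispose();
    }
}, null);
```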
Currently I am writing a similar application, and using this method is both efficient and fast... Without any threading or throttling, my application was using only 10-15% CPU, which can be acceptable for some operations depending on the processing involved; however, it made my PC as slow as if an application were using 80%+ of the CPU. This was the file system access. The ThreadPool and IOCP functions do not care whether they are bogging the PC down, so don't get confused: they are optimized for performance, even if that performance means your HDD is squealing like a pig.
The only problem I have had is that memory usage ran a little high (50+ MB) during the testing phase, with approximately 35 streams open at once. I am currently working on a solution similar to the MSDN recommendation for SocketAsyncEventArgs, using a pool to allow x number of requests to operate simultaneously, which ultimately led me to this forum post.
Hope this helps somebody with their decision making in the future :)