The scenario is that there is lets say 1 TB of objects each of ten mb in database.
I have a function named MATCH() which has a query object, whose return type is double, and in this function I have mathematical calculations. I have a check that if the value of the result is in between 0 and 1 then i have:
double[ ] Result=new double[eg 1000]
How can i do this, as the system has
2 GB RAM - Performance.
Which section I should lock, use
mutex or use thread pool? - Thread
Safety
How many threads can I run
simultaneously, specifically compared
to a BackgroundWorker?
Please give me architecture of the program. (ED: I reckon just ignore this line.)
Here are some things about threads that could help you.
In reality you never need more than one thread per cpu, more threads would just add more overhead on the scheduler. However, thread often block, like it would if you query data over a database, so it is not feasible to keep only one thread per cpu, you will probably need more to get the CPU usage to 100%.
That said, in your scenario, having more than one or two threads querying data over the same database won't help you much, because the database is the overhead. I would consider creating only one or two thread that simultaneously query data to the database, or better use the asynchronous pattern and use the Command.BeginExecute...() method and allow only a few simultaneous query in parallel. When the querying is done, you can now queue the processing you have to do on the data, this could be done on the .Net ThreadPool or in a custom thread pool containing only one thread per cpu if the processing of the data takes longer than querying it.
If this is an application that does not have a UI, use the ThreadPool. You can set the maximum number of threads to use, and since this seems like a specialized application, tinker with it until you have it just right.
ThreadPool examples here (MSDN).
Related
I have an ASP .NET MVC 2 - application, which calls functions of a C#-DLL.
The DLL itself is multithreaded. In the worst case it uses up to 200 threads, which do not run very long.
I use asynchronous delegates in order to generate the threads. In order to speed up the initialization of the delegates, I calculate the number of threads I need in advance and give it to the ThreadPool:
ThreadPool.SetMinThreads(my_num_threads, ...);
I just wonder, if I need to do this early enough, such that the ThreadPool has enough time to create the threads? Do I have to consider, when I set the size of the ThreadPool or are the threads available immediately after I call SetMinThreads?
Furthermore, if I set the size outside the DLL in my ASP .NET MVC-application (before I call the DLL), will this setting be available/visible for the DLL?
They share the same application domain so that setting ThreadPool anywhere affects all. Note that this also impacts on the ASP.NET framework which will use ThreadPool itself for all its own async tasks etc.
With that in mind, if you knew roughly the minimum number of threads you wanted you could set that on application startup ready for use later on.
However, 200 threads seems somewhat excessive, to put it into context my Chrome with about 8 tabs open uses about 35 and my SQL Server ~50. What are you doing that demands so many?
Also realise that eventually you'll reach a limit where performance will degrade as there are so many threads that must be serviced. Microsoft says such on MSDN:
You can use the SetMinThreads method to increase the minimum number of threads. However, unnecessarily increasing these values can cause performance problems. If too many tasks start at the same time, all of them might appear to be slow. In most cases, the thread pool will perform better with its own algorithm for allocating threads. Reducing the minimum to less than the number of processors can also hurt performance.
I am using Multithreading in my while loop ,
as
while(reader.read())
{
string Number= (reader["Num"] != DBNull.Value) ? reader["Num"].ToString() : string.Empty;
threadarray[RowCount] = new Thread(() =>
{
object ID= (from r in Datasetds.Tables[0].AsEnumerable()
where r.Field<string>("Num") == Number
select r.Field<int>("ID")).First<int>();
});
threadarray[RowCount].Start();
RowCount++;
}
But with sequential execution ,for 200 readers it just takes 0.4 s
but with threading it takes 1.1 sec... This is an example but i have same problem when i execute it with number of lines of code in threading with multiple database operations.
for sequential it takes 10 sec for threading it takes more...
can anyone please suggest me?
Thanks...
Threading is not always quicker and in many cases can be slower (such as the case seen here). There are plenty of reasons why but the two most significant are
Creating a thread is a relatively expensive OS operation
Context switching (where the CPU stops working on one thread and starts working on another) is again a relatively expensive operation
Creating 200 threads will take a fair amount of time (with the default stack size this will allocate 200MB of memory for stacks alone), and unless you have a machine with 200 cores the OS will also need to spend a fair amount of time context switching between those threads too.
The end result is that the time that the machine spends creating threads and switching between them simply outstrips the amount of time that the machine spends doing any work. You may see a performance improvement if you reduce the number of threads being used. Try starting with 1 thread for each core that your machine has.
Multithreading where you have more threads than cores is generally only useful in scenarios where the CPU is hanging around waiting for things to happen (like disk IO or network communication). This isn't the case here.
Threading isn't always the solution and the way you're using it is definitely not thread-safe. Things like disk I/O or other bottlenecks won't benefit from threading in certain circumstances.
Also, there is a cost for starting up threads. Not that I would recommend it for your situation, but check out the TPL. http://msdn.microsoft.com/en-us/library/dd460717.aspx
Multithreading usually is a choice for non blocking execution. Like everything on earth it has its associated costs.
For the commodity of parallel execution, we pay with performance.
Usually there is nothing faster then sequential execution of a single task.
It's hard to suggest something real, in you concrete scenario.
May be you can think about multiple process execution, instead of multiple threads execution.
But I repeast it's hard to tell if you will get benefits from this, without knowing application complete architecture and requirements.
it seems you are creating a thread for each read(). so if it has 200 read(), you have 200 threads running (maybe fewer since some may finished quickly). depends on what you are doing in the thread, 200 threads running at the same time may actually slow down the system because of overheads like others mentioned.
multuthread helps you when 1) the job in the thread takes some time to finish; 2) you have a control on how many threads running at the same time.
in your case, you need try, say 10 threads. if 10 threads are running, wait until 1 of them finished, then allocate the thread to a new read().
if the job in the thread does not take much time, then better use single thread.
Sci Fi author and technologist Jerry Pournelle once said that in an ideal world every process should have its own processor. This is not an ideal world, and your machine has probably 1 - 4 processors. Your Windows system alone is running scads of processes even when you yourself are daydreaming. I just counted the processes running on my Core 2 Quad XP machine, and SYSTEM is running 65 processes. That's 65 processes that have to be shared between 4 processors. Add more threads and each one gets only a slice of processor power.
If you had a Beowulf Cluster you could share threads off into individual machines and you'd probably get very good timings. But your machine can't do this with only 4 processors. The more you ask it to do the worse performance is going to be.
in my game I have entities (about 2000-5000) that do very performance heavy calculations every frame (or every 2nd or 3rd).
I know that parallel programming will help in this scenario.
The question is how do I use multiple threads / cpu-cores in the best way possible.
I tried the static Parallel class in .NET, but the startup cost for every Parallel.For / Parallel.ForEach is simply to big.
Does it start a number of threads each time I call a function of the Parallel class ??
The game's target fps is 60 fps, so whatevery approach I use, it should never involve starting multiple threads every game-frame!
So my question is:
What other alternatives are there to parallel-process multiple entities?
Should I create the threads myself, and if so, what pitfalls are there?
Is the threadpool a good alternative?
By default the Parallel.For/Parallel.Foreach uses ThreadPool thread. Infact TPL api like Parallel class, Task instance etc uses ThreadPool thread.
The number of threads started by TPL will depend upon lots of factors like number of available ideal thread in threadpool etc. So assuming TPL will create lots of threads as soon as you will submit request will not be entirely correct.
Important point is that you can control, the number of threads used the TPL for executing the tasks via MaxDegreeOfParallelism flag. By setting this value to say number "X", you limit the number of concurrent items being processed to "X". Personally, I feel this is a very powerful feature of TPL library.
Also, if you do not want your task to block other requests in queue, then you can set the TaskCreationOptions to be of type "LongRunning", in that case, TPL will run the task on a dedicated thread.
In your case, you can try using Parallel.For/ForEach and set the MaxDegreeOfParalleleism value to some optimal value and see how that behaves.
You need some computers or servers to test your parallel programmed game. Other wise on a single computer, since it will work on thread, it will cost more.
Additionally, since the game is calculated and drawn 60 times in a second, working on threads is not logical, your game is already running on threads.
Maybe you have to change or optimize your algorithm. Do not let everything is done by CPU, use GPU as much as possible.
There isn't a simple answer.
When every entity of your game can be calculated without knowing any other entity then you can simply move that calcultion to a given number of threads. A threadpool would be a good start.
You could set this up when your programm starts.
But even when you do so, you have to check if switching threads performs better than having one single thread. Context switch is here the keyword you can search for.
But what happens when entity A on thread A does a calculation with affects entity B on thread B and forces entity B to be destroyed which forces entity C on thread C to be destroyed also?
And now to make it harder :P entity C was calculated before entity B, before entity A...
What should happen? Is it ok that those entities will be cleaned up in 2 or 3 frames in the future?
Does it affect or break your game logic?
If so then things are getting complicated.
We have a situation where our application needs to process a series of files and rather than perform this function synchronously, we would like to employ multi-threading to have the workload split amongst different threads.
Each item of work is:
1. Open a file for read only
2. Process the data in the file
3. Write the processed data to a Dictionary
We would like to perform each file's work on a new thread?
Is this possible and should be we better to use the ThreadPool or spawn new threads keeping in mind that each item of "work" only takes 30ms however its possible that hundreds of files will need to be processed.
Any ideas to make this more efficient is appreciated.
EDIT: At the moment we are making use of the ThreadPool to handle this. If we have 500 files to process we cycle through the files and allocate each "unit of processing work" to the threadpool using QueueUserWorkItem.
Is it suitable to make use of the threadpool for this?
I would suggest you to use ThreadPool.QueueUserWorkItem(...), in this, threads are managed by the system and the .net framework. The chances of you meshing up with your own threadpool is much higher. So I would recommend you to use Threadpool provided by .net .
It's very easy to use,
ThreadPool.QueueUserWorkItem(new WaitCallback(YourMethod), ParameterToBeUsedByMethod);
YourMethod(object o){
Your Code here...
}
For more reading please follow the link http://msdn.microsoft.com/en-us/library/3dasc8as%28VS.80%29.aspx
Hope, this helps
I suggest you have a finite number of threads (say 4) and then have 4 pools of work. I.e. If you have 400 files to process have 100 files per thread split evenly. You then spawn the threads, and pass to each their work and let them run until they have finished their specific work.
You only have a certain amount of I/O bandwidth so having too many threads will not provide any benefits, also remember that creating a thread also takes a small amount of time.
Instead of having to deal with threads or manage thread pools directly I would suggest using a higher-level library like Parallel Extensions (PEX):
var filesContent = from file in enumerableOfFilesToProcess
select new
{
File=file,
Content=File.ReadAllText(file)
};
var processedContent = from content in filesContent
select new
{
content.File,
ProcessedContent = ProcessContent(content.Content)
};
var dictionary = processedContent
.AsParallel()
.ToDictionary(c => c.File);
PEX will handle thread management according to available cores and load while you get to concentrate about the business logic at hand (wow, that sounded like a commercial!)
PEX is part of the .Net Framework 4.0 but a back-port to 3.5 is also available as part of the Reactive Framework.
I suggest using the CCR (Concurrency and Coordination Runtime) it will handle the low-level threading details for you. As for your strategy, one thread per work item may not be the best approach depending on how you attempt to write to the dictionary, because you may create heavy contention since dictionaries aren't thread safe.
Here's some sample code using the CCR, an Interleave would work nicely here:
Arbiter.Activate(dispatcherQueue, Arbiter.Interleave(
new TeardownReceiverGroup(Arbiter.Receive<bool>(
false, mainPort, new Handler<bool>(Teardown))),
new ExclusiveReceiverGroup(Arbiter.Receive<object>(
true, mainPort, new Handler<object>(WriteData))),
new ConcurrentReceiverGroup(Arbiter.Receive<string>(
true, mainPort, new Handler<string>(ReadAndProcessData)))));
public void WriteData(object data)
{
// write data to the dictionary
// this code is never executed in parallel so no synchronization code needed
}
public void ReadAndProcessData(string s)
{
// this code gets scheduled to be executed in parallel
// CCR take care of the task scheduling for you
}
public void Teardown(bool b)
{
// clean up when all tasks are done
}
In the long run, I think you'll be happier if you manage your own threads. This will let you control how many are running and make it easy to report status.
Build a worker class that does the processing and give it a callback routine to return results and status.
For each file, create a worker instance and a thread to run it. Put the thread in a Queue.
Peel threads off of the queue up to the maximum you want to run simultaneously. As each thread completes go get another one. Adjust the maximum and measure throughput. I prefer to use a Dictionary to hold running threads, keyed by their ManagedThreadId.
To stop early, just clear the queue.
Use locking around your thread collections to preserve your sanity.
Use ThreadPool.QueueUserWorkItem to execute each independent task. Definitely don't create hundreds of threads. That is likely to cause major headaches.
The general rule for using the ThreadPool is if you don't want to worry about when the threads finish (or use Mutexes to track them), or worry about stopping the threads.
So do you need to worry about when the work is done? If not, the ThreadPool is the best option. If you want to track the overall progress, stop threads then your own collection of threads is best.
ThreadPool is generally more efficient if you are re-using threads. This question will give you a more detailed discussion.
Hth
Using the ThreadPool for each individual task is definitely a bad idea. From my experience this tends to hurt performance more than helping it. The first reason is that a considerable amount of overhead is required just to allocate a task for the ThreadPool to execute. By default, each application is assigned it's own ThreadPool that is initialized with ~100 thread capacity. When you are executing 400 operations in a parallel, it does not take long to fill the queue with requests and now you have ~100 threads all competing for CPU cycles. Yes the .NET framework does a great job with throttling and prioritizing the queue, however, I have found that the ThreadPool is best left for long-running operations that probably won't occur very often (loading a configuration file, or random web requests). Using the ThreadPool to fire off a few operations at random is much more efficient than using it to execute hundreds of requests at once. Given the current information, the best course of action would be something similar to this:
Create a System.Threading.Thread (or use a SINGLE ThreadPool thread) with a queue that the application can post requests to
Use the FileStream's BeginRead and BeginWrite methods to perform the IO operations. This will cause the .NET framework to use native API's to thread and execute the IO (IOCP).
This will give you 2 leverages, one is that your requests will still get processed in parallel while allowing the operating system to manage file system access and threading. The second is that because the bottleneck of the vast majority of systems will be the HDD, you can implement a custom priority sort and throttling to your request thread to give greater control over resource usage.
Currently I have been writing a similar application and using this method is both efficient and fast... Without any threading or throttling my application was only using 10-15% CPU, which can be acceptable for some operations depending on the processing involved, however, it made my PC as slow as if an application was using 80%+ of the CPU. This was the file system access. The ThreadPool and IOCP functions do not care if they are bogging the PC down, so don't get confused, they are optimized for performance, even if that performance means your HDD is squeeling like a pig.
The only problem I have had is memory usage ran a little high (50+ mb) during the testing phaze with approximately 35 streams open at once. I am currently working on a solution similar to the MSDN recommendation for SocketAsyncEventArgs, using a pool to allow x number of requests to be operating simultaneously, which ultimately led me to this forum post.
Hope this helps somebody with their decision making in the future :)
I am working on a C# application that works with an array. It walks through it (meaning that at one time only a narrow part of the array is used). I am considering adding threads in it to make it perform faster (it runs on a dualcore computer). The problem is that I do not know if it would actually help, because threads cost something and this cost could easily be more than the parallel gain... So how do I determine if threading would help?
Try writing some benchmarks that mimic, as closely as possible, the real-world conditions in which your software will actually be used.
Test and time the single-threaded version. Test and time the multi-threaded version. Compare the two sets of results.
If your application is CPU bound (i.e. it isn't spending time trying to read files or waiting for data from a device) and there is little to no sharing of live data (data being altered, if its read only its fine) between the threads then you can pretty much increase the speed by 50->75% by adding another thread (as long as it still remains CPU bound of course).
The main overhead in multithreading comes from 2 places.
Creation & initialization of the thread. Creating a thread requires quite a few resources to be allocated and involves swaps between kernel and user mode, this is expensive though a once off per thread so you can pretty much ignore it if the thread is running for any reasonable amount of time. The best way to mitigate this problem is to use a thread pool as it will keep the thread on hand and not need to be recreated.
Handling synchronization of data. If one thread is reading from data that another is writing, bad things will generally happen (worse if both are changing it). This requires you to lock your data before altering it so that no thread reads a half written value. These locks are generally quite slow as well. To mitigate this problem, you need to design your data layout so that the threads don't need to read or write to the same data as much as possible. If you do need a lot of these locks it can then become slower than the single thread option.
In short, if you are doing something that requires the CPU's to share a lot of data, then multi-threading it will be slower and if the program isn't CPU bound there will be little or no difference (could be a lot slower depending on what it is bound to, e.g. a cd/hard drive). If your program matches these conditions, then it will PROBABLY be worthwhile to add another thread (though the only way to be certain would be profiling).
One more little note, you should only create as many CPU bound threads as you have physical cores (threads that idle most of the time, such as a GUI message pump thread, can be ignored for this condition).
P.S. You can reduce the cost of locking data by using a methodology called "lock-free programming", though this something that should really only be attempted by people with a lot of experience with multi-threading and a clear understanding of their target architecture (including how the cache is treated and the memory bus).
I agree with Luke's answer. Benchmark it, it's the only way to be sure.
I can also give a prediction of the results - the fastest version will be when the number of threads matches the number of cores, EXCEPT if the array is very small and each thread would have to process just a few items, the setup/teardown times might get larger than the processing itself. How few - that depends on what you do. Again - benchmark.
I'd advise to find out a "minimum number of items for a thread to be useful". Then, when you are deciding how many threads to spawn (or take from a pool), check how many cores the computer has and how many items there are. Spawn as many threads as possible, but no more than the computer has cores, and not so many that each thread would have less than the minimum number of items to process.
For example if the minimum number of items is, say, 1000; and the computer has 4 cores; and your list contains 2500 items, you would spawn just 2 threads, because more threads would be inefficient (each would process less than 1000 items).
Making a step by step list for Luke's idea:
Make a single threaded test app
Download Sysinternals Process Monitor and run it
Run your test app and find it on the process list (remember to run it as a release build outside of Visual Studio)
Double click the process and select the Performance Graph tab
Observe the CPU time used by your process
If the CPU time is sittling flat 50% for more than a few seconds, you can probably speed your overall process up using threads (assuming the bunch of stuff Mr Peters refered to holds true)
(However, the best you can do on a duel core machine is to halve the time it takes to run. If your process only take 4 seconds, it might not be worth getting it to run in 2 seconds)
Using the task parallel library / Rx provides a friendlier interface than System.Threading.ThreadPool, which might make your world a bit easier.
You miss imho one item, which is that it is not always about execution time. There is:
The problem to koop a UI operational during an operation. Even if the UI is "dormant", a nonresponsive message pump makes a worse impression.
The possibility to use a thread pool to actually not ahve to start / stop threads all the time. I use thread pools very extensively, and various parts of the applications keep them busy.
Anyhow, ignoring my point 1 - where you may go multi threaded without speeding things up in order to keep your UI responsive - I would say it is always then faster when you can actually either split up work (so you can keep more than one core busy) or offload it for othe reasons.