I have a game in XNA which needs to do network calls. In the update method I determine what needs to be sent and then add it to a list of stuff to send. Then I run the network call. This slows down the application alot obviously. So I first tried creating a new thread like this in the update to make it do it on a seperate thread:
Thread thread;
thread = new Thread(
new ThreadStart(DoNetworkThing));
thread.Start();
I presume creating threads has overhead etc which results in this being even slower.
Finally, I made a method that has while(true){DoNetworkThing();} in it which will keep looping round and run the network call over and over again(it does check if its already busy with one, and whether there is stuff to send). That method I called in the LoadContent method in a thread, so it will run alongside the game in its own thread.
But that is really slow too.
So what am I doing wrong? What is the best way of doing this?
Thanks
I had this exact problem when initially attempting to use threads on XNA -- adding a thread slowed everything down. It turned out that thread affinity was the issue.
On Xbox 360, by default, all threads run on the same processor core; this behaviour differs from Windows, where the kernel places threads onto other cores for you. (See answers on this thread at MSDN social for detail.)
To get around it, you need to set the affinity for your thread to another core in your thread function:
void DoNetworkThing()
{
#ifdef XBOX
Thread.SetProcessorAffinity(3); // see note below
#endif
/* your code goes here */
}
Thread thread = new Thread(
new ThreadStart(DoNetworkThing));
thread.Start();
The documentation for Thread.SetProcessorAffinity states that on XNA, cores 0 and 2 are reserved for the framework; core 1 and cores 3-5 are free for your use. The main thread (the thread containing your main() function) will be on core 1. The code above arbitrarily sets the thread to run on core 3, but you can choose a different core, or add code to programmatically choose a core (which you may want to do if you have more than a few threads).
Finally, note the #ifdef guard - Thread.SetProcessorAffinity is only available on Xbox; it won't even compile on Windows!
OK, so first of all- creating a new thread does have some minimal overhead, but you're only creating the thread once (or a few times)... so unless you're creating hundreds of threads (which you shouldn't be), then you will not need to worry about the overhead.
Let's look at your first example:
Thread thread;
thread = new Thread(new ThreadStart(DoNetworkThing));
// you should set the thread to background, unless you
// want it to live on even after your application closes
thread.IsBackground = true;
thread.Start();
If you were sticking with that model, the DoNetworkThing function would look like this:
void DoNetworkThing()
{
while(someConditionIsTrue)
{
// do the networking stuff here
}
}
I presume that in your next attempt you did something like this:
Thread thread = new Thread(()=>
{
while(true)
{
DoNetworkThing();
}
});
thread.IsBackground = true;
thread.Start();
Both approaches are fine, but the only difference is the content of DoNetworkingThing. In the second approach it will look like this:
void DoNetworkThing()
{
// do the networking thing, but note
// that this time your infinite while
// loop is outside the function
}
Now you said that both of these attempts are really slow, but nothing given in your examples would indicate that there should be any noticeable performance impact. It would be great if you can:
Give us an example that would demonstrate the slow down.
Tell us how many cores you have on the machine that the game is running on.
Finally, if you're mingling with threads then I would STRONGLY suggest that you pickup a good book on multithreading and really familiarize yourself with the concepts behind it, go through the exercises and write a couple of simple programs. It will take you a couple of months, but if you're not familiar with threading then it will help you learn how to avoid a lot of ugly mistakes.
A lot of people recommend Joe Duffy's Concurrent Programming on Windows, but feel free to check some other C# multithreading books too.
It might be worthwhile to check if using a ThreadPool improves performance. Creating many threads may be expensive.
http://msdn.microsoft.com/en-us/library/h4732ks0(v=vs.100).aspx
Related
Basically I'm working on this beat detection algorithm, the strange thing i am encountering right now is that when i split the work load on to another thread. So now i have one main thread and another worker thread. My worker thread somehow is always working faster than the main thread. It seems strange because what i have learned is that the main thread should theoretically always be faster because it is not taking time to initialise the thread. However what i get is even i pass a extra 1024 samples to the worker thread( they are both working with around 30 million samples currently) it is still faster than the main thread. Is it because i have applications running on my main thread? I'm really confused right now. here is the code
UnityEngine.Debug.Log ("T800 Start");
Step3 s1= new Step3();
Step3WOMT s2= new Step3WOMT();
System.Object tempObj= samples2 as System.Object;
float[] tempArray = new float[eS.Length/ 2];
System.Threading.ParameterizedThreadStart parameterizedts = new System.Threading.ParameterizedThreadStart(s1.DoStep3);
System.Threading.Thread T1 = new System.Threading.Thread(parameterizedts);
T1.Start (tempObj);
s2.DoStep3(samples1);
UnityEngine.Debug.Log ("s2");
//UnityEngine.Debug.Log (stopwatch.ElapsedMilliseconds);
T1.Join();
Don't worry I'm only using c# features in the multithread so I believe it should be fine. What i am really confused about is that if i comment out the T1.join(); line the whole thing somehow go even slower. Im genuinely confused right now as there seems no reasonable answer to this question.
T1.join() does all the magic. It allow main thread to wait till all the worker threads are complete. Is that necessary ? depends on ur application. it is expected for a main thread to wait for the end of execution of its worker threads.
Your system must have multiple cores.
It's possible that Thread.Start can return not immediately after the thread is initialized. Anyway you should use ThreadPool.EnqueueUserWorkItem and ManualResetEvent to wait instead of join.
To see real results your samples count must be big enough so that thread initialization time is minimal compared to the execution time of your code. ThreadPool often doesn't have to initialize a new thread but it still takes some time to launch your code. I think you should not use multithreading for tasks which takes <~50ms.
If you compare the execution time for big samples count (takes few seconds) you'll see that there is no difference in the performance of the main thread and the background one (unless the main thread have higher priority).
In my HFT trading application I have several places where I receive data from network. In most cases this is just a thread that only receives and process data. Below is part of such processing:
public Reciver(IPAddress mcastGroup, int mcastPort, IPAddress ipSource)
{
thread = new Thread(ReceiveData);
s = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
s.ReceiveBufferSize = ReceiveBufferSize;
var ipPort = new IPEndPoint(LISTEN_INTERFACE/* IPAddress.Any*/, mcastPort);
s.Bind(ipPort);
option = new byte[12];
Buffer.BlockCopy(mcastGroup.GetAddressBytes(), 0, option, 0, 4);
Buffer.BlockCopy(ipSource.GetAddressBytes(), 0, option, 4, 4);
Buffer.BlockCopy(/*IPAddress.Any.GetAddressBytes()*/LISTEN_INTERFACE.GetAddressBytes(), 0, option, 8, 4);
}
public void ReceiveData()
{
byte[] byteIn = new byte[4096];
while (needReceive)
{
if (IsConnected)
{
int count = 0;
try
{
count = s.Receive(byteIn);
}
catch (Exception e6)
{
Console.WriteLine(e6.Message);
Log.Push(LogItemType.Error, e6.Message);
return;
}
if (count > 0)
{
OnNewMessage(new NewMessageEventArgs(byteIn, count));
}
}
}
}
This thread works forever once created. I just wonder if I should configure this thread to run on certain core? As I need lowest latency I want to avoid context switch. As I want to avoid context switch I better to run the same thread on the same processor core, right?
Taking into account that i need lowest latency is that correct that:
It would be better to set "thread afinity" for the most part of the "long-running" threads?
It would be better to set "thread afinity" for the thread from my example above?
I rewriting above code to c++ right now to port to Linux later if this is important however I assume that my question is more about hardware than language or OS.
I think the algorithm that has as little latency as possible would be to pin your threads to one core and set them to realtime priority (or whatever is the highest one).
This will cause the OS to evict any other thread which happens to use that core.
Hopefully the CPU cache will still contain useful data when your thread gets scheduled there. For that reason I like the idea of pinning to a core.
You should probably set your entire process to a high priority class and minimize other activity on your box. Also turn off unused hardware because it might generate interrupts. Fix your NIC's interrupts to a different CPU core (some better NICs can do that).
As I want to avoid context switch I better to run the same thread on the same processor core, right?
No. A context switch will not necessarily be avoided by setting affinity to one CPU. You have no control over context switches, they are in the hands of the OS thread scheduler. They occur when a thread quantum (time slice) has elapsed or when a higher priority thread interrupts your thread.
Latency you talk about, I assume is network or memory latency, is not at all avoided by setting thread affinity. Memory latency can be avoided by making your code cache friendly (ie it can all be in the L1 - L2 caches, for example). Network latency is really just part of any network, and not something I suspect you can do much about.
As Tony The Lion has already answered your question, I would like to address your comment:
"why not setting thread afinity to my code? why thread from my example need to travel between cores?"
Your thread doesn't travel anywhere.
Context switch happens when OS thread scheduler decides to give your thread a slice of time to execute. Then the environment is prepared for your thread, e.g. the CPU registers
are set up to correct values etc. This is called context switch.
So regardless of thread affinity, the same CPU setup work has to be done, whether it is the same CPU/core which was used in previous slice when your thread was running or another one. And at this moments, your computer has more info to do it properly then you do at compile time.
You seem to believe that thread somehow resides on the CPU, but it is not so. What you use is a logical thread and there can be hundreds or even thousands of them. Common CPUs, OTOH, usually have 1 or 2 hardware threads per core, and your logical thread gets mapped to one of these every time it is scheduled, even if OS always picks the same HW thread.
EDIT: it seems that you have already picked the answer you want to hear and I don't like long discussion threads on answers so I will put it here.
you should try and measure it. I believe that you will be dissapointed
running some threads on high priority thread might easily mess up other processes
you are worried about context switch latency, but you have no problems that GC thread will freeze your thread? BTW, on which core will your GC thread run? :)
what if your highest priority thread blocks GC thread? memory leaks? do you know what is priority of that thread so you are sure it would work?
really, why not C or hand optimized assembly if microseconds are important?
as someone suggested, you should use an RTOS if you want to control this aspect of execution
it doesn't seem likely that your data travels through data center just 4-5 times slower than it takes to setup a thread context on one machine, but who knows...
In my C#/.NET 3.5 program I am using Threadpool threads ( delegate+BeginInvoke/EndInvoke) to parallelize and speed up some file loading. SystemInternals tool ProcessExplorer shows that number of threads in process is increasing over time, while I would expect to stay the same. Looks like some Threads/Threads handles stay hanging around for no reason.
Interestingly enough, I can not find pattern how threads grow and seems that happen sporadically, without repeatable pattern each time I start application. I spend some time analyzing and here are some observations:
1) code looks like this:
ArrayList IAsyncResult_s = new ArrayList();
AsyncProcessing thread1 = processRasterLayer;
... ArrayList filesToRender....
foreach (string FileName in filesToRender)
{
string fileName2 = FileName;
GeoImage partialImage1;
IAsyncResult asyncResult = thread1.BeginInvoke(
fileName2, .....,
out partialImage1, ..., null, null);
IAsyncResult_s.Add(asyncResult);
asyncResult = null;
}
.................
//block and render all
foreach (IAsyncResult asyncResult in IAsyncResult_s)
{
GeoImage partialImage1;
thread1.EndInvoke(
out partialImage1, , asyncResult);
//render image.. some calls to render partial image here
partialImage1.Dispose();
partialImage1 = null;
}
IAsyncResult_s.Clear();
IAsyncResult_s = null;
thread1 = null;
2) Number of Process Threads
My trace shows that during execution inside loop, ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads); gives numbers like 493, 1000.
At the end of loops , , ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads); gives numbers 500, 1000. So, number of available thread returns to same
Number of process threads reported by SystemInternals ProcessExplorer and API System.Diagnostics.Process.GetCurrentProcess().Threads.Count is the 16 before loops, and around 21 after loops.
If I call againg those loops, number of threads in process grows, but not by fixed nubmer each time, but grows 1-4 each time I repeat above code, so grows like 16->21->22->26->31...
3)Forced garbage collection didn’t htelp
I tried to froce garbage collection to get rid of those extra threads, but that didn’t removed them from process.
4)Profling tools
I was using RedGates Memory and Performace profilers, but hasen’t found obvious reason. I saw several extra threas and their object (ThreadContext etc) hanging, but saw no object holding those threads in memory. I am prety sure those extra threads were involved into loops work above, since I added thread name inside calls, and they still have that name I gave them.
5) Intelitrace
Intelitrace debuging showed also extra threads hanging. They still have names I gave them. But interestingly, it also showed that same thread that is hanging now, was used by above loop in the past, but also same thread was executing some timer related evens from timers form my code.
6) Locating issue
So, When I disable above loops that process filse Asynchroniously, and load files sequentialy, I do not have extra threads, and number of threads in my application is constant and and around 16.
7) Regarding SetMaxThreads :Here how it looks on my machine (XP, .NET 3.5):
Code like this:
ThreadPool.GetAvailableThreads(out AvailableWorkerThreads, out AvailableCompletionPortThreads);
ThreadPool.GetMaxThreads(out MaxWorkerThreads, out MaxCompletionPortThreads);
ThreadPool.GetMinThreads(out MinWorkerThreads, out MinCompletionPortThreads);
Gives result:
MinWorkerThreads:2 MaxWorkerThreads:500 MinCompletionPortThreads:2 MaxCompletionPortThreads:1000 AvailableWorkerThreads:500 AvailableCompletionPortThreads:1000
My app is using maybe 8 worker threads at the same time. I see no problem with SetMaxThreads.
8)
Functionally, I have no problems so far with this solution above. But somehow, if tools report that number of threads in my app is growing, it looks like “resource leak” of some kind, and I would like to address it. It looks like some thread handles are hanging around for no reason.
9) Here is one interesting article. It sasy that thread resources are cleaned once EndInvoke is called. I am doing so in my code. Article sasy: ..”. Because EndInvoke cleans up after the spawned thread, you must make sure that an EndInvoke is called for each BeginInvoke.” “If the thread pool thread has exited, EndInvoke does the following: It cleans up the exited thread's loose ends and disposes of its resources.” See: http://en.csharp-online.net/Asynchronous_Programming%E2%80%94BeginInvoke_EndInvoke
10) Another interesting article. Author says he had thread handle leaks because he was creating controls from non-gui thread. It is pretty elaborate article, see: http://msmvps.com/blogs/senthil/archive/2008/05/29/the-case-of-the-leaking-thread-handles.aspx
11) Another interesting article. It talks about ThreadPool.SetMinThreads property. It seems that it is not ThreadPool.SetMaxThreads but ThreadPool.SetMinThreads that enables useful control over ThradPool. This article is an eye-opener for me, and made me think about how ThreadPool works and performance problems it might cause. Article is: http_://www.dotnetperls.com/threadpool-setminthreads . Anoter similar one is : http_://www.codeproject.com/Articles/3813/NET-s-ThreadPool-Class-Behind-The-Scenes
12) Another interesting article. It is talking about throttling issue with ThreadPool. Article mentions ThreadPool limit of 2 new threads per second increase. See http_://social. msdn. microsoft. com/forums/en-US/clr/thread/3325cb32-371b-4f3e-965f-6ca88538dc3e/
13) So, in maybe 30 tests I saw only 2 times that number of threads allocated would shrink. But, it did happen. I saw once thread number going like 16->....->31->61-> ->30->16. So, it went back to 16. It doesn’t happen often, and it is not about time waited, it was like big activity in process, followed by a period of constant low level activity.
14) ThreadPool.SetMinThreads Method documentation. It talks about 2 new threads per second limit for threadpool. It is not clear if setting this property would remove that limit. http_://msdn.microsoft. com/en-ca/library/system. threading.threadpool.setminthreads(v=vs.90).aspx
So the answer is: there's no leak here. This is how the thread pool works. It keeps around threads that finished working so you don't have to pay the price of thread creation next time you need one. If you have many concurrent work items then the number of threads in the pool will increase but they'll max out at MaxWorkerThreads. (And it has nothing to do with the garbage collector.)
See this article for more info:
http://msdn.microsoft.com/en-us/library/0ka9477y.aspx
i would consider a consumer producer pattern. the idea behind a threadpool is to recycle threads, not create hundreds of new. in best case you have for each cpu one thread, and queue the work. this will be sure faster as you avoid useless context switches and waits for creating new threads, as far as i remember the net threadpool waits about one second until a new thread is created, to give other threads a chance to get recycled.
If I have Thread A which is the main Application Thread and a secondary Thread. How can I check if a function is being called within Thread B?
Basically I am trying to implement the following code snippit:
public void ensureRunningOnCorrectThread()
{
if( function is being called within ThreadB )
{
performIO()
}
else
{
// call performIO so that it is called (invoked?) on ThreadB
}
}
Is there a way to perform this functionality within C# or is there a better way of looking at the problem?
EDIT 1
I have noticed the following within the MSDN documentation, although Im a dit dubious as to whether or not its a good thing to be doing! :
// if function is being called within ThreadB
if( System.Threading.Thread.CurrentThread.Equals(ThreadB) )
{
}
EDIT 2
I realise that Im looking at this problem in the wrong way (thanks to the answers below who helped me see this) all I care about is that the IO does not happen on ThreadA. This means that it could happen on ThreadB or indeed anyother Thread e.g. a BackgroundWorker. I have decided that creating a new BackgroundWorker within the else portion of the above f statement ensures that the IO is performed in a non-blocking fashion. Im not entirely sure that this is the best solution to my problem, however it appears to work!
Here's one way to do it:
if (System.Threading.Thread.CurrentThread.ManagedThreadId == ThreadB.ManagedThreadId)
...
I don't know enough about .NET's Thread class implementation to know if the comparison above is equivalent to Equals() or not, but in absence of this knowledge, comparing the IDs is a safe bet.
There may be a better (where better = easier, faster, etc.) way to accomplish what you're trying to do, depending on a few things like:
what kind of app (ASP.NET, WinForms, console, etc.) are you building?
why do you want to enforce I/O on only one thread?
what kind of I/O is this? (e.g. writes to one file? network I/O constrained to one socket? etc.)
what are your performance constraints relative to cost of locking, number of concurrent worker threads, etc?
whether the "else" clause in your code needs to be blocking, fire-and-forget, or something more sophisticated
how you want to deal with timeouts, deadlocks, etc.
Adding this info to your question would be helpful, although if yours is a WinForms app and you're talking about user-facing GUI I/O, you can skip the other questions since the scenario is obvious.
Keep in mind that // call performIO so that it is called (invoked?) on ThreadB implementation will vary depending on whether this is WinForms, ASP.NET, console, etc.
If WinForms, check out this CodeProject post for a cool way to handle it. Also see MSDN for how this is usually handled using InvokeRequired.
If Console or generalized server app (no GUI), you'll need to figure out how to let the main thread know that it has work waiting-- and you may want to consider an alternate implementation which has a I/O worker thread or thread pool which just sits around executing queued I/O requests that you queue to it. Or you might want to consider synchronizing your I/O requests (easier) instead of marshalling calls over to one thread (harder).
If ASP.NET, you're probably implementing this in the wrong way. It's usually more effective to use ASP.NET async pages and/or to (per above) synchronize snchronizing to your I/O using lock{} or another synchronization method.
What you are trying to do is the opposite of what the InvokeRequired property of a windows form control does, so if it's a window form application, you could just use the property of your main form:
if (InvokeRequired) {
// running in a separate thread
} else {
// running in the main thread, so needs to send the task to the worker thread
}
The else part of your snippet, Invoking PerformIO on ThreadB is only going to work when ThreadB is the Main thread running a Messageloop.
So maybe you should rethink what you are doing here, it is not a normal construction.
Does your secondary thread do anything else besides the performIO() function? If not, then an easy way to do this is to use a System.Threading.ManualResetEvent. Have the secondary thread sit in a while loop waiting for the event to be set. When the event is signaled, the secondary thread can perform the I/O processing. To signal the event, have the main thread call the Set() method of the event object.
using System.Threading;
static void Main(string[] args)
{
ManualResetEvent processEvent = new ManualResetEvent(false);
Thread thread = new Thread(delegate() {
while (processEvent.WaitOne()) {
performIO();
processEvent.Reset(); // reset for next pass...
}
});
thread.Name = "I/O Processing Thread"; // name the thread
thread.Start();
// Do GUI stuff...
// When time to perform the IO processing, signal the event.
processEvent.Set();
}
Also, as an aside, get into the habit of naming any System.Threading.Thread objects as they are created. When you create the secondary thread, set the thread name via the Name property. This will help you when looking at the Threads window in Debug sessions, and it also allows you to print the thread name to the console or the Output window if the thread identity is ever in doubt.
Alright...I've given the site a fair search and have read over many posts about this topic. I found this question: Code for a simple thread pool in C# especially helpful.
However, as it always seems, what I need varies slightly.
I have looked over the MSDN example and adapted it to my needs somewhat. The example I refer to is here: http://msdn.microsoft.com/en-us/library/3dasc8as(VS.80,printer).aspx
My issue is this. I have a fairly simple set of code that loads a web page via the HttpWebRequest and WebResponse classes and reads the results via a Stream. I fire off this method in a thread as it will need to executed many times. The method itself is pretty short, but the number of times it needs to be fired (with varied data for each time) varies. It can be anywhere from 1 to 200.
Everything I've read seems to indicate the ThreadPool class being the prime candidate. Here is what things get tricky. I might need to fire off this thing say 100 times, but I can only have 3 threads at most running (for this particular task).
I've tried setting the MaxThreads on the ThreadPool via:
ThreadPool.SetMaxThreads(3, 3);
I'm not entirely convinced this approach is working. Furthermore, I don't want to clobber other web sites or programs running on the system this will be running on. So, by limiting the # of threads on the ThreadPool, can I be certain that this pertains to my code and my threads only?
The MSDN example uses the event drive approach and calls WaitHandle.WaitAll(doneEvents); which is how I'm doing this.
So the heart of my question is, how does one ensure or specify a maximum number of threads that can be run for their code, but have the code keep running more threads as the previous ones finish up until some arbitrary point? Am I tackling this the right way?
Sincerely,
Jason
Okay, I've added a semaphore approach and completely removed the ThreadPool code. It seems simple enough. I got my info from: http://www.albahari.com/threading/part2.aspx
It's this example that showed me how:
[text below here is a copy/paste from the site]
A Semaphore with a capacity of one is similar to a Mutex or lock, except that the Semaphore has no "owner" – it's thread-agnostic. Any thread can call Release on a Semaphore, while with Mutex and lock, only the thread that obtained the resource can release it.
In this following example, ten threads execute a loop with a Sleep statement in the middle. A Semaphore ensures that not more than three threads can execute that Sleep statement at once:
class SemaphoreTest
{
static Semaphore s = new Semaphore(3, 3); // Available=3; Capacity=3
static void Main()
{
for (int i = 0; i < 10; i++)
new Thread(Go).Start();
}
static void Go()
{
while (true)
{
s.WaitOne();
Thread.Sleep(100); // Only 3 threads can get here at once
s.Release();
}
}
}
Note: if you are limiting this to "3" just so you don't overwhelm the machine running your app, I'd make sure this is a problem first. The threadpool is supposed to manage this for you. On the other hand, if you don't want to overwhelm some other resource, then read on!
You can't manage the size of the threadpool (or really much of anything about it).
In this case, I'd use a semaphore to manage access to your resource. In your case, your resource is running the web scrape, or calculating some report, etc.
To do this, in your static class, create a semaphore object:
System.Threading.Semaphore S = new System.Threading.Semaphore(3, 3);
Then, in each thread, you do this:
System.Threading.Semaphore S = new System.Threading.Semaphore(3, 3);
try
{
// wait your turn (decrement)
S.WaitOne();
// do your thing
}
finally {
// release so others can go (increment)
S.Release();
}
Each thread will block on the S.WaitOne() until it is given the signal to proceed. Once S has been decremented 3 times, all threads will block until one of them increments the counter.
This solution isn't perfect.
If you want something a little cleaner, and more efficient, I'd recommend going with a BlockingQueue approach wherein you enqueue the work you want performed into a global Blocking Queue object.
Meanwhile, you have three threads (which you created--not in the threadpool), popping work out of the queue to perform. This isn't that tricky to setup and is very fast and simple.
Examples:
Best threading queue example / best practice
Best method to get objects from a BlockingQueue in a concurrent program?
It's a static class like any other, which means that anything you do with it affects every other thread in the current process. It doesn't affect other processes.
I consider this one of the larger design flaws in .NET, however. Who came up with the brilliant idea of making the thread pool static? As your example shows, we often want a thread pool dedicated to our task, without having it interfere with unrelated tasks elsewhere in the system.