I have an app that needs to process some packets from a UDP broadcast, and write the result to a SQL database.
At times the rate of incoming packets gets quite high, and I find that the underlying socket buffer overflows, even at 8 MB, because the database writes are slow.
To solve this I have come up with two options: cache some of the DB writes,
and/or marshal the DB writes onto a separate, sequentially operating thread (the order of DB writes must be preserved).
My question is how to marshal the work most effectively, using built-in structures/features of C#.
Should I just have a synchronized queue, put delegate work items onto it, and have a single thread service it in an event loop?
The easiest way is to create a separate thread for the UDP reading.
The thread just receives the packets and puts them into a BlockingCollection.
(I don't know how you do the packet reading, so this is pseudocode:)
BlockingCollection<Packet> _packets = new BlockingCollection<Packet>();
...

while (true)
{
    var packet = udp.GetNextPacket(); // pseudocode; replace with your actual receive call
    _packets.Add(packet);
}
This ensures you don't miss any packets because the reading thread is busy doing something else.
Then you create a worker thread that does the same thing as your current code, but reads from the BlockingCollection instead:
while (true)
{
    var nextPacket = _packets.Take(); // blocks until a packet is available
    ...
}
If your packets are independent (which they should be, since you use UDP), you can fire up several worker threads.
If you are using .NET 4, Tasks and TaskSchedulers can help you out here. On MSDN you'll find the LimitedConcurrencyLevelTaskScheduler sample. Create a single instance of it with a concurrency level of 1, and each time you have work to do, create a Task and schedule it with that scheduler.
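A minimal sketch of that, assuming the MSDN LimitedConcurrencyLevelTaskScheduler sample class has been copied into your project (Packet and WriteToDatabase are placeholders for your own types):

TaskScheduler scheduler = new LimitedConcurrencyLevelTaskScheduler(1);
TaskFactory dbWriter = new TaskFactory(scheduler);

void EnqueueDbWrite(Packet packet)
{
    // With a concurrency level of 1, tasks run one at a time in the order
    // they were scheduled, so the order of DB writes is preserved.
    dbWriter.StartNew(() => WriteToDatabase(packet));
}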
I have a problem with scalability and processing, and I want to get the opinion of the Stack Overflow community.
I basically have XML data coming down a socket and I want to process that data. For each XML line sent, processing can include writing to a text file, opening a socket to another server, and running various database queries, all of which take time.
At the minute my solution involves the following threads:
Thread 1
Accepts incoming sockets and thus generates child threads that handle each socket (there will only be a couple of incoming sockets from clients). When an XML line comes through (via the ReadLine() method on a StreamReader), I put this line into a Queue, which is accessible via a static method on a class. This static method contains locking logic to ensure that the program is thread-safe (I could of course use a ConcurrentQueue for this instead of manual locking).
Threads 2-5
Constantly take XML lines from the queue and process them one at a time (database queries, file writes, etc.).
This method seems to be working, but I was curious whether there is a better way of doing things, because it seems very crude. If I move the processing that threads 2-5 do into thread 1, performance becomes extremely slow, which I expected, so I created the worker threads (2-5).
I appreciate I could replace threads 2-5 with a thread pool, but the thread pool would still be reading from the same Queue of XML lines, so I wondered whether there is a more efficient way of processing these events instead of using the Queue?
A queue [1] is the right approach. But I would certainly move from manual thread control to the thread pool (so I don't need to do thread management) and let it manage the number of threads. [2]
But in the end there is only so much processing a single computer (however expensive) can do. At some point one of memory size, CPU-memory bandwidth, storage IO, network IO, … is going to be saturated. At that point, using an external queuing system (MSMQ, WebSphere MQ, RabbitMQ, …) with each task being a separate message allows many workers on many computers to process the data (the "competing consumers" pattern).
[1] I would move immediately to ConcurrentQueue: getting locking right is hard, and the less of it you need to do yourself the better.
[2] At some point you might find you need more control than the thread pool provides; that is the time to switch to a custom thread pool. But prototype and test: it is quite possible your implementation will actually be worse. See paragraph 2.
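As a rough sketch of that suggestion (ProcessLine stands in for the work threads 2-5 currently do):

// Each XML line goes straight to the default .NET thread pool; the pool's
// internal work queue replaces the hand-rolled locked Queue.
void OnXmlLineReceived(string xmlLine)
{
    ThreadPool.QueueUserWorkItem(state => ProcessLine((string)state), xmlLine);
}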
I have a program which processes price data coming from a broker. The pseudocode is as follows:
Process[] process = new Process[50];

void tickEvent(object sender, EventArgs e)
{
    int contractNumber = e.contractNumber;
    doPriceProcess(process[contractNumber], e);
}
Now I would like to use multithreading to speed up my program: if the data are for different contract numbers, I would like to fire off different threads to speed up the processing. However, if the data are from the same contract, I would like the program to wait until the current processing finishes before continuing with the next data. How do I do it?
Can you provide some code please?
Thanks in advance~
You have many high-level architectural decisions to make here:
How many ticks do you expect to come from that broker?
After all, you should have some kind of dispatcher here.
Here is a simple description of what basically needs to be done:
Encapsulate the incoming ticks in packages, best as single commands that have all the data needed
Have a queue where you can easily (and thread-safely) store those commands
Have a dispatcher that takes an item off the queue and assigns some worker to do the command (or lets the command execute itself)
Having a worker, you can have multiple threads, processes or whatsoever to work multiple commands seamlessly
Maybe you want to do some dispatching already for the input queue, depending on how many requests you want to be able to complete per time unit
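A minimal sketch of that shape (ITickCommand, the queue name and the dispatch loop are illustrative, not a prescribed API):

// Each incoming tick is wrapped in a command object carrying all its data.
public interface ITickCommand
{
    void Execute();
}

BlockingCollection<ITickCommand> _commands = new BlockingCollection<ITickCommand>();

// Dispatcher: takes commands off the queue and hands each to a pool worker.
void DispatchLoop()
{
    foreach (ITickCommand command in _commands.GetConsumingEnumerable())
    {
        ITickCommand current = command; // copy for the closure (pre-C# 5 foreach capture)
        ThreadPool.QueueUserWorkItem(_ => current.Execute());
    }
}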
Here is some more information that can be helpful:
Command pattern in C#
Reactor pattern (with sample code)
Rather than holding onto an array of Processes, I would hold onto an array of BlockingCollections. Each blocking collection can correspond to a particular contract. Then you can have producer threads that add work to the end of the corresponding contract's queue, and consumer threads that take work from those collections. You can ensure that each thread (I would use threads for this, not processes) handles 1-n different queues, but that each queue is handled by no more than one thread. That way you can ensure that no bits of work from the same contract are worked on in parallel.
The threading aspect of this can be handled effectively using C#'s Task class. For each BlockingCollection you can create a consumer task whose body will pretty much just be:
foreach (SomeType item in blockingCollections[contractNumber].GetConsumingEnumerable())
    processItem(item);
However, by using Tasks you let the computer schedule them as it sees fit. If it notices most of them sitting around waiting on empty queues, it will have just a few (or just one) actual threads rotating between the tasks. If they have enough to do, and your computer can clearly support the load of additional threads, it will add more (possibly adding/removing dynamically as it goes). By letting much smarter people than you or I handle that scheduling, it's much more likely to be efficient without under- or over-parallelizing.
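Putting that together, a rough sketch (Tick, TickEventArgs and ProcessItem are stand-ins for the question's types, and for simplicity this uses one consumer task per queue rather than 1-n):

BlockingCollection<Tick>[] queues = new BlockingCollection<Tick>[50];

void Start()
{
    for (int i = 0; i < queues.Length; i++)
    {
        queues[i] = new BlockingCollection<Tick>();
        BlockingCollection<Tick> queue = queues[i]; // capture the queue, not the loop index

        // LongRunning hints that this task blocks, so the scheduler gives it
        // its own thread instead of tying up a thread-pool thread.
        Task.Factory.StartNew(() =>
        {
            foreach (Tick item in queue.GetConsumingEnumerable())
                ProcessItem(item); // ticks for one contract stay strictly in order
        }, TaskCreationOptions.LongRunning);
    }
}

// Producer side: ticks for the same contract always land in the same queue.
void tickEvent(object sender, TickEventArgs e)
{
    queues[e.ContractNumber].Add(e.Tick);
}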
I've run into a problem while writing an async multi-server network app in C#. I have many jobs being taken care of by the thread pool and these include the writes to the network sockets. This ended up allowing for the case where more than one thread could write to the socket at the same time and discombobulate my outgoing messages. My idea for getting around this was to implement a queue system where whenever data got added to the queue, the socket would write it.
My problem is, I can't quite wrap my head around the architecture of something of this nature. I imagine having a queue object that fires an event whenever data gets added to the queue. The event handler then writes the data held in the queue, but that won't work: if two threads add to the queue simultaneously, even if the queue is thread-safe, events will still fire for both, and I'll run into the same problem. So then maybe there's some way to hold off an event while another is in progress, but then how do I continue that event once the first finishes, without simply blocking the thread on some mutex or something? This wouldn't be so hard if I weren't trying to stay strict with my "block nothing" architecture, but this particular application requires that I let the thread pool threads keep doing their thing.
Any ideas?
While similar to Porges' answer, this differs a bit in implementation.
First, I usually don't queue the bytes to send, but queue objects and serialize them in the sending thread, but I guess that's a matter of taste.
But the bigger difference is in the use of a ConcurrentQueue (in addition to the BlockingCollection).
So I'd end up with code similar to:
BlockingCollection<Packet> sendQueue = new BlockingCollection<Packet>(new ConcurrentQueue<Packet>());

while (true)
{
    var packet = sendQueue.Take(); // this blocks if there are no items in the queue
    SendPacket(packet); // send your packet here
}
The key takeaway here is that you have one thread that loops over this code, while all other threads can add to the queue in a thread-safe way (both BlockingCollection and ConcurrentQueue are thread-safe).
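On the producer side, any other thread just adds to the same collection, for example:

sendQueue.Add(packet); // safe to call from any thread; the collection synchronizes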
Have a look at Processing a queue of items asynchronously in C#, where I answered a similar question.
Sounds like you need one thread writing to the socket synchronously and a bunch of threads writing to a queue for that thread to process.
You can use a blocking collection (BlockingCollection<T>) to do the hard work:
// somewhere there is a queue:
BlockingCollection<byte[]> queue = new BlockingCollection<byte[]>();

// in socket-writing thread, read from the queue and send the messages:
foreach (byte[] message in queue.GetConsumingEnumerable())
{
    // just an example... obviously you'd need error handling and stuff here
    socket.Send(message);
}

// in the other threads, just enqueue messages to be sent:
queue.Add(someMessage);
The BlockingCollection will handle all synchronization. You can also enforce a maximum queue length and other fun things.
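For example, a hypothetical bound of 1000 pending messages makes Add block once the writer falls behind, instead of letting the queue grow without limit:

// bounded to 1000 pending messages; Add blocks while the queue is full
BlockingCollection<byte[]> queue = new BlockingCollection<byte[]>(1000);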
I don't know C#, but what I would do is have the event trigger the socket manager to start pulling from the queue and writing things out one at a time. If it is already going, the trigger won't do anything, and once there is nothing left in the queue, it stops.
This solves the problem of two threads writing to the queue simultaneously, because the second event would be a no-op.
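In C# terms, that "no-op if already going" trigger can be sketched with a ConcurrentQueue and an Interlocked flag (the names and the socket.Send call are illustrative):

ConcurrentQueue<byte[]> _pending = new ConcurrentQueue<byte[]>();
int _sending; // 0 = idle, 1 = a drain is in progress

void Enqueue(byte[] message)
{
    _pending.Enqueue(message);
    // Only the thread that flips the flag from 0 to 1 starts a drain;
    // every other trigger sees it is already going and does nothing.
    if (Interlocked.CompareExchange(ref _sending, 1, 0) == 0)
        ThreadPool.QueueUserWorkItem(_ => Drain());
}

void Drain()
{
    byte[] message;
    while (_pending.TryDequeue(out message))
        socket.Send(message);
    Interlocked.Exchange(ref _sending, 0);
    // A message may arrive between the last TryDequeue and the flag reset;
    // re-check so nothing is left stranded in the queue.
    if (!_pending.IsEmpty && Interlocked.CompareExchange(ref _sending, 1, 0) == 0)
        Drain();
}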
You could have a thread-safe queue that all your worker threads write their results to. Then have another thread that polls the queue and sends results when it sees them waiting.
I have a C# HttpListener that runs on a single thread and parses data sent to it by another program. My main problem is that not all the data sent to the server is received. I can only assume this is due to the limitations of it running on a single thread. I have searched high and low for a simple multithreading solution so it can receive all the data sent to it, and came up empty-handed. Any help in transforming this into a multi-threaded application would be much appreciated.
private void frmMain_Load(object sender, EventArgs e)
{
    Thread t = new Thread(new ThreadStart(ThreadProc));
    t.Start();
}

public static void ThreadProc()
{
    while (true)
    {
        WebBot.SimpleListenerExample(new string[] { "http://localhost:13274/" });
        //Thread t = new Thread(new ThreadStart(ThreadProc));
        //t.Start();
        Application.DoEvents();
    }
}
First things first: verify that your hypothesis is indeed correct. You need to check:
How much data is sent
How much data is received
How long does it take to send the data
How long does it take to operate on the data
HTTP works over TCP, which generally guarantees delivery, so even if it takes a long time, your server should be getting all the incoming information.
That said, if you still want to make the process multi-threaded, I would recommend the following design:
One thread like you have right now (LISTENER THREAD), that accepts incoming data.
Another set of threads that will process the incoming data (WORKER THREADS).
The listener thread will only receive the data and place it in a queue.
The worker threads will take items off the queue and operate on the data (a sketch follows after the notes below).
Several notes and things to think about, though:
Take care of thread synchronization - specifically, you need to protect the queue.
Think about whether it matters which worker thread gets the data. If there are several chunks that need to be handled by a specific worker thread, you'll need to address this problem.
In some cases, if there is a very high load on the listener thread, the queue may become a bottleneck, or more precisely, the locking on the queue may become the bottleneck. In that case I would recommend moving to a model of N queues for N worker threads, and have the listener just pick one in round-robin fashion. This minimizes the locking, and in fact, since each queue has one reader and one writer, you can even get away without a lock (but that is out of scope for this answer).
Yet another option would be to use a thread pool. A thread pool is a pool of threads that hibernate until they are needed. When the listener gets incoming input, it allocates it to a free thread, or enlarges the pool if needed; this way you don't have a queue, and your threads are used optimally.
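A minimal sketch of the listener/worker split described above, using a BlockingCollection so the queue protection from the first note comes for free (HandleRequest is a placeholder for your parsing code):

BlockingCollection<HttpListenerContext> requests = new BlockingCollection<HttpListenerContext>();

// LISTENER THREAD: accept requests and hand them off as fast as possible.
void ListenLoop(HttpListener listener)
{
    while (true)
        requests.Add(listener.GetContext()); // blocks until a request arrives
}

// WORKER THREADS: run several of these, all draining the same queue.
void WorkerLoop()
{
    foreach (HttpListenerContext context in requests.GetConsumingEnumerable())
        HandleRequest(context);
}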
Simplest Embedded Web Server Ever with HttpListener may help you get started.
I'm creating a server-type application at the moment which will do the usual listening for connections from external clients and, when they connect, handle requests, etc.
At the moment, my implementation creates a pair of threads every time a client connects. One thread simply reads requests from the socket and adds them to a queue, and the second reads the requests from the queue and processes them.
I'm basically looking for opinions on whether or not you think having all of these threads is overkill, and importantly whether this approach is going to cause me problems.
It is important to note that most of the time these threads will be idle - I use wait handles (ManualResetEvent) in both threads. The Reader thread waits until a message is available and, if so, reads it and dumps it in a queue for the Process thread. The Process thread waits until the reader signals that a message is in the queue (again, using a wait handle). Unless a particular client is really hammering the server, these threads will just sit waiting. Is this costly?
I've done a bit of testing - I had 1,000 clients connected, continually nagging the server (so 2,000+ threads), and it seemed to cope quite well.
I think your implementation is flawed. This kind of design doesn't scale because creating threads is expensive and there is a limit on how many threads can be created.
That is the reason most implementations of this type use a thread pool. It makes it easy to put a cap on the maximum number of threads while easily managing new connections and reusing the threads when the work is finished.
If all you are doing with your thread is putting items in a queue, then use the ThreadPool.QueueUserWorkItem method to use the default .NET thread pool.
You haven't given enough information in your question to say for definite, but perhaps you now only need one other thread, constantly running and clearing down the queue; you can use a wait handle to signal when something has been added.
Just make sure to synchronise access to your queue or things will go horribly wrong.
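As a rough sketch of that suggestion (Request and Process are placeholders for your own message type and handler):

// Each message read from the socket becomes a pool work item; the pool's own
// queue and thread management replace the per-client worker threads.
void OnMessageReceived(Request request)
{
    ThreadPool.QueueUserWorkItem(state => Process((Request)state), request);
}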
I advise using the following pattern. First you need a thread pool, built-in or custom. Have a thread that checks whether there is something available to read; if there is, it picks a reader thread. The reading thread then puts the data into a queue, and a thread from the pool of processing threads picks it up. This minimizes the number of threads and the time spent in a waiting state.