Share a BlockingCollection across multiple tasks - C#

Here's my scenario. I am getting a large amount of data in chunks from an external data source, and I have to write it locally to two places. One of the destinations is very slow to write to, but the other one is super fast (and I cannot rely on reading the data back from it to write to the slow destination). To accomplish that, I am using a Producer-Consumer pattern (using BlockingCollection).
The issue I have right now is that I have to queue the data in two BlockingCollections, and that takes way too much memory. My code looks very similar to the example below, but I would really like to drive the two Tasks from a single queue. Does anybody know the proper way to do that? Are there any inefficiencies in the code below?
class Program
{
    const int MaxNumberOfWorkItems = 15;
    static BlockingCollection<int> slowBC = new BlockingCollection<int>(MaxNumberOfWorkItems);
    static BlockingCollection<int> fastBC = new BlockingCollection<int>(MaxNumberOfWorkItems);

    static void Main(string[] args)
    {
        Task slowTask = Task.Factory.StartNew(() =>
        {
            foreach (var item in slowBC.GetConsumingEnumerable())
            {
                Console.WriteLine("SLOW -> " + item);
                Thread.Sleep(25);
            }
        });

        Task fastTask = Task.Factory.StartNew(() =>
        {
            foreach (var item in fastBC.GetConsumingEnumerable())
            {
                Console.WriteLine("FAST -> " + item);
            }
        });

        // Populating two BlockingCollections with the same data. How can I have a single collection?
        for (int i = 0; i < 100; i++)
        {
            while (slowBC.TryAdd(i) == false)
            {
                Console.WriteLine("Wait for slowBC...");
            }
            while (fastBC.TryAdd(i) == false)
            {
                Console.WriteLine("Wait for fastBC...");
            }
        }

        slowBC.CompleteAdding();
        fastBC.CompleteAdding();

        Task.WaitAll(slowTask, fastTask);
        Console.ReadLine();
    }
}

Using a producer-consumer queue to transfer single ints is extremely inefficient. You are receiving the data in chunks, so why not type the queue as a chunk reference and send the whole chunk, immediately creating/depooling a new chunk into the same reference variable to receive the next lot of data? This is how P-C queues are normally used for non-trivial amounts of data - queueing refs/pointers, not the actual data. Threads have shared memory spaces (which some developers seem to think just causes problems), so use that: queue pointers/refs and safely transfer MB of data as one pointer. As long as you, IN THE NEXT LINE OF CODE, always create/depool a new chunk after queueing off the old one, the producer and consumer threads can never be operating on the same chunk.
Queueing chunk references is powers-of-10 times more efficient for large chunks.
Send the chunk references to the fast link, then just 'forward' them to the slow link from there.
You may need overall flow control if the slow link is not to back up your system and cause eventual OOM errors. What I usually do is fix an 'overall' quota for the total buffer size and create a pool of chunks at startup (the pool is just another BlockingCollection, populated with new chunks at startup). The producer thread depools chunks, fills them with data, and queues them to the FAST thread. The FAST thread processes received chunks and then queues the chunk references on to the SLOW thread. The SLOW thread processes the same data and then repools the 'used' chunk for re-use by the producer thread. This forms a flow-controlled system: if the SLOW thread is too slow, the producer eventually tries to depool a chunk from an empty pool and blocks there until the SLOW thread repools some used chunks, which signals the producer thread to run again. You may need some policy in the slow thread that times out its operations and dumps its chunk early, dropping data - you must decide on a policy for that given your overall requirements. It is obviously impossible to continually queue data to a fast and a slow consumer forever without memory overflow unless the slow consumer dumps some data.
Edit - Oh, and yes, using a pool eliminates GC on the used chunks, further increasing performance.
One overall flow policy would be to never dump any data in the slow thread. With continual high data flow, the chunks will all end up on the queue between the fast and slow threads, and the producer thread will indeed block on the empty pool. If the data arrives over TCP, the network connection will then apply its own flow control to stop the peer sending any more data. This extends the flow control all the way from your slow thread back to the peer.
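Here is a minimal sketch of that pooled-chunk pipeline, assuming a trivial Chunk class and illustrative sizes for the chunk buffer and pool quota (console writes stand in for the real fast and slow writers):
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class Chunk
{
    public int[] Data = new int[1024]; // reusable buffer
    public int Count;                  // number of valid items
}

class Program
{
    static void Main()
    {
        var pool = new BlockingCollection<Chunk>();
        var toFast = new BlockingCollection<Chunk>();
        var toSlow = new BlockingCollection<Chunk>();

        // Fixed quota of chunks created at startup; this bounds total memory.
        for (int i = 0; i < 8; i++) pool.Add(new Chunk());

        Task fast = Task.Run(() =>
        {
            foreach (var chunk in toFast.GetConsumingEnumerable())
            {
                Console.WriteLine("FAST -> chunk of " + chunk.Count);
                toSlow.Add(chunk); // forward the same reference to the slow writer
            }
            toSlow.CompleteAdding();
        });

        Task slow = Task.Run(() =>
        {
            foreach (var chunk in toSlow.GetConsumingEnumerable())
            {
                Thread.Sleep(25); // simulate the slow write
                Console.WriteLine("SLOW -> chunk of " + chunk.Count);
                pool.Add(chunk);  // repool the used chunk; no GC pressure
            }
        });

        // Producer: Take() blocks on an empty pool when the slow writer falls behind,
        // which is exactly the flow control described above.
        for (int n = 0; n < 100; n++)
        {
            Chunk c = pool.Take();
            c.Count = 1;
            c.Data[0] = n; // fill with the received data
            toFast.Add(c);
        }
        toFast.CompleteAdding();
        Task.WaitAll(fast, slow);
    }
}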

Related

Async/await again. Example with network requests

I completely don't understand the practical meaning of async/await.
I just started learning async/await, and I know that there are already a huge number of topics on it. If I understand correctly, async/await is not needed anywhere except for operations with a long wait in a thread, provided this is not a long calculation - for example, a database response, a network request, or file handling. Many people write that async/await is also needed so as not to block the main thread. And here it is completely unclear to me why that thread should be blocked in the first place - don't block without async/await, just create a task. So I'm trying to create code that will wait a long time for a response from the network.
I created an example. I can see with my own eyes, through the Windows Task Manager, that the while (i < int.MaxValue) operation is processed first, taking up the entire processor resource, even though I launched DownloadFile first. Only later, when the processor is freed, do I see that the file downloads are in progress. On my machine, the example runs in ~54 seconds.
Question: how can I first run DownloadFile asynchronously, so that the threads do not idle uselessly but can do while (i < int.MaxValue)?
using System.Net;

string PathProject = Directory.GetParent(Directory.GetCurrentDirectory()).Parent.Parent.Parent.FullName;
// Create folder 1 in the project folder
DirectoryInfo Path = new DirectoryInfo($"{PathProject}\\1");
int Iterations = Environment.ProcessorCount * 3;
string file = "https://s182vla.storage.yandex.net/rdisk/82b08d86b9920a5e889c6947e4221eb1350374db8d799ee9161395f7195b0b0e/62f75403/geIEA69cusBRNOpxmtup5BdJ7AbRoezTJE9GH4TIzcUe-Cp7uoav-lLks4AknK2SfU_yxi16QmxiuZOGFm-hLQ==?uid=0&filename=004%20-%2002%20Lesnik.mp3&disposition=attachment&hash=e0E3gNC19eqNvFi1rXJjnP1y8SAS38sn5%2ByGEWhnzE5cwAGsEnlbazlMDWSjXpyvq/J6bpmRyOJonT3VoXnDag%3D%3D&limit=0&content_type=audio%2Fmpeg&owner_uid=160716081&fsize=3862987&hid=98984d857027117759bc5ce6092eaa6a&media_type=audio&tknv=v2&rtoken=k9xogU6296eg&force_default=no&ycrid=na-2bc914314062204f1cbf810798018afd-downloader16e&ts=5e61a6daac6c0&s=eef8b08190dc7b22befd6bad89e1393b394869a1668d9b8af3730cce4774e8ad&pb=U2FsdGVkX1__q3AvjJzgzWG4wVR80Oh8XMl-0Dlfyu9FhqAYQVVkoBV0dtBmajpmOkCXKUXPbREOS-MZCxMNu2rkAkKq_n-AXcZ85svtSFs";
List<Task> tasks = new List<Task>();

void MyMethod1(int i)
{
    WebClient client = new WebClient();
    client.DownloadFile(file, $"{Path}\\{i}.mp3");
}

void MyMethod2()
{
    int i = 0;
    while (i < int.MaxValue)
    {
        i++;
    }
}

DateTime dateTimeStart = DateTime.Now;

for (int i = 0; i < Iterations; i++)
{
    int j = i;
    tasks.Add(Task.Run(() => MyMethod1(j)));
}
for (int i = 0; i < Iterations; i++)
{
    tasks.Add(Task.Run(() => { MyMethod2(); MyMethod2(); }));
}

Task.WaitAll(tasks.ToArray());
Console.WriteLine(DateTime.Now - dateTimeStart);

while (true)
{
    Thread.Sleep(100);
    if (Path.GetFiles().Length == Iterations)
    {
        Thread.Sleep(1000);
        foreach (FileInfo f in Path.GetFiles())
        {
            f.Delete();
        }
        return;
    }
}
If there are 2 web servers that talk to a database and they run on 2 machines with the same spec, the web server with async code will be able to handle more concurrent requests.
The following is from 2014's Async Programming: Introduction to Async/Await on ASP.NET
Why Not Increase the Thread Pool Size?
At this point, a question is always asked: Why not just increase the size of the thread pool? The answer is twofold: Asynchronous code scales both further and faster than blocking thread pool threads.
Asynchronous code can scale further than blocking threads because it uses much less memory; every thread pool thread on a modern OS has a 1MB stack, plus an unpageable kernel stack. That doesn’t sound like a lot until you start getting a whole lot of threads on your server. In contrast, the memory overhead for an asynchronous operation is much smaller. So, a request with an asynchronous operation has much less memory pressure than a request with a blocked thread. Asynchronous code allows you to use more of your memory for other things (caching, for example).
Asynchronous code can scale faster than blocking threads because the thread pool has a limited injection rate. As of this writing, the rate is one thread every two seconds. This injection rate limit is a good thing; it avoids constant thread construction and destruction. However, consider what happens when a sudden flood of requests comes in. Synchronous code can easily get bogged down as the requests use up all available threads and the remaining requests have to wait for the thread pool to inject new threads. On the other hand, asynchronous code doesn’t need a limit like this; it’s “always on,” so to speak. Asynchronous code is more responsive to sudden swings in request volume.
(These days threads are added every 0.5 seconds.)
WebRequest.Create("https://192.168.1.1").GetResponse()
At some point the above code will probably hit the OS method recv(). The OS will suspend your thread until data becomes available. The state of your function, in CPU registers and the thread stack, will be preserved by the OS while the thread is suspended. In the meantime, this thread can't be used for anything else.
If you start that method via Task.Run(), then your method will consume a thread from a thread pool that has been prepared for you by the runtime. Since these threads aren't used for anything else, your program can continue handling other requests on other threads. However, creating a large number of OS threads has significant overheads.
Every OS thread must have some memory reserved for its stack, and the OS must use some memory to store the full state of the CPU for any suspended thread. Switching threads can have a significant performance cost. For maximum performance, you want to keep a small number of threads busy, rather than having a large number of suspended threads which the OS must keep swapping in and out of each CPU core.
When you use async and await, the C# compiler transforms your method into a coroutine, ensuring that any state your program needs to remember is no longer stored in CPU registers or on the OS thread stack. Instead, all of that state is stored in heap memory while your task is suspended. When your task is suspended and resumed, only the data which you actually need will be loaded and stored, rather than the entire CPU state.
If you change your code to use .GetResponseAsync(), the runtime will call an OS method that supports overlapped I/O. While your task is suspended, no OS thread will be busy. When data is available, the runtime will continue to execute your task on a thread from the thread pool.
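To make that concrete, here is a minimal sketch of the two forms side by side (the placeholder URL is the one from above, and the second line assumes an enclosing async method):
using System.Net;

// Blocking: the calling thread is suspended inside the OS until data arrives.
var response = WebRequest.Create("https://192.168.1.1").GetResponse();

// Non-blocking: no thread is held while the I/O is in flight; the
// continuation resumes on a thread pool thread when data is available.
var response2 = await WebRequest.Create("https://192.168.1.1").GetResponseAsync();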
Is this going to impact the program you are writing today? Will you be able to tell the difference? Not until the CPU starts to become the bottleneck, when you are attempting to scale your program to thousands of concurrent requests.
If you are writing new code, look for the Async version of any I/O method. Sprinkle async & await around. It doesn't cost you anything.
If I understand correctly, async/await is not needed anywhere except for operations with a long wait in a thread, provided this is not a long calculation.
It's kind of recursive, but async is best used whenever there's something asynchronous. In other words, anything where the CPU would be wasted if it had to just spin (or block) while waiting for the operation to complete. Operations that are naturally asynchronous are generally I/O-based (as you mention, DB and other network calls, as well as file I/O), but they can be more arbitrary events, too (e.g., timers). Anything where there isn't actual code to run to get the response.
Many people write that async/await is also needed so as not to block the main thread.
At a higher level, there are two primary benefits to async/await, depending on what kind of code you're talking about:
On the server side (e.g., web apps), async/await provides scalability by using fewer threads per request.
On the client side (e.g., UI apps), async/await provides responsiveness by keeping the UI thread free to respond to user input.
Developers tend to emphasize one or the other depending on the kind of work they normally do. So if you see an async article talking about "not blocking the main thread", they're talking about UI apps specifically.
And here it is completely unclear to me why that thread should be blocked. Don't block without async/await, just create a task.
That works just fine for many situations. But it doesn't work well in others.
E.g., it would be a bad idea to just Task.Run onto a background thread in a web app. The primary benefit of async in a web app is to provide scalability by using fewer threads per request, so using Task.Run does not provide any benefits at all (in fact, scalability is reduced). So, the idea of "use Task.Run instead of async/await" cannot be adopted as a universal principle.
The other problem is in resource-constrained environments, such as mobile devices. You can only have so many threads there before you start running into other problems.
But if you're talking Desktop apps (e.g., WPF and friends), then sure, you can use async/await to free up the UI thread, or you can use Task.Run to free up the UI thread. They both achieve the same goal.
Question: how can I first run DownloadFile asynchronously, so that the threads do not idle uselessly but can do while (i < int.MaxValue)?
There's nothing in your code that is asynchronous at all. So really, you're dealing with multithreading/parallelism. In general, I recommend using higher-level constructs such as Parallel for parallelism rather than Task.Run.
But regardless of the API used, the underlying problem is that you're kicking off Environment.ProcessorCount * 6 threads. You'll want to ensure that your thread pool is ready for that many threads by calling ThreadPool.SetMinThreads with the workerThreads set to a high enough number.
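For illustration, here is a hedged sketch of what a truly asynchronous version of the downloads could look like, swapping WebClient for HttpClient's async API. It reuses file, Path, Iterations and MyMethod2 from the question's code; DownloadFileAsync is a hypothetical helper, not part of the original:
using System.Linq;
using System.Net.Http;

var httpClient = new HttpClient();

async Task DownloadFileAsync(int i)
{
    // No thread is blocked while the bytes travel over the network.
    byte[] bytes = await httpClient.GetByteArrayAsync(file);
    await File.WriteAllBytesAsync($"{Path}\\{i}.mp3", bytes);
}

var downloadTasks = Enumerable.Range(0, Iterations).Select(i => DownloadFileAsync(i));
// The CPU-bound loops can now run on pool threads at the same time,
// because no pool threads are tied up waiting on the network.
var computeTasks = Enumerable.Range(0, Iterations)
    .Select(_ => Task.Run(() => { MyMethod2(); MyMethod2(); }));
await Task.WhenAll(downloadTasks.Concat(computeTasks));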
It's not web requests but here's a toy example:
Test:
n: 1 await: 00:00:00.1373839 sleep: 00:00:00.1195186
n: 10 await: 00:00:00.1290465 sleep: 00:00:00.1086578
n: 100 await: 00:00:00.1101379 sleep: 00:00:00.6517959
n: 300 await: 00:00:00.1207069 sleep: 00:00:02.0564836
n: 500 await: 00:00:00.1211736 sleep: 00:00:02.2742309
n: 1000 await: 00:00:00.1571661 sleep: 00:00:05.3987737
Code:
using System.Diagnostics;

foreach (var n in new[] { 1, 10, 100, 300, 500, 1000 })
{
    var sw = Stopwatch.StartNew();
    var tasks = Enumerable.Range(0, n)
        .Select(i => Task.Run(async () =>
        {
            await Task.Delay(TimeSpan.FromMilliseconds(100));
        }));
    await Task.WhenAll(tasks);
    var tAwait = sw.Elapsed;

    sw = Stopwatch.StartNew();
    var tasks2 = Enumerable.Range(0, n)
        .Select(i => Task.Run(() =>
        {
            Thread.Sleep(TimeSpan.FromMilliseconds(100));
        }));
    await Task.WhenAll(tasks2);
    var tSleep = sw.Elapsed;

    Console.WriteLine($"n: {n,4} await: {tAwait} sleep: {tSleep}");
}

C# Producer/Consumer queue with delays

The problem is the classic one of N producers (with N possibly large) and X consumers with limited resources (X is currently 4). The producers' messages come in (say, via MQTT) and get queued to be processed by consumers on a FIFO basis. The important part of the processing is that each consumer may need to send back to the producers one or more "replies", and such replies should be at least some time apart (the exact delay is not important). The classic solution, where one starts X tasks that wait on the message queue, process, and loop, is easy to implement using, for example, System.Threading.Channels:
while (!cancellationToken.IsCancellationRequested && await queue.Reader.WaitToReadAsync())
{
    while (queue.Reader.TryRead(out IncomingMessage item))
    {
        // Do some processing.
        SendResponse(1);
        // Do some more processing.
        if (needsToSend2Response)
        {
            await Task.Delay(500);
            SendResponse(2);
        }
    }
}
This works and works well, except that while a task is delaying it can't process any more messages, and that's obviously bad.
Possible solutions I thought of:
Use an outbound queue that processes the messages and ensures there is at least a minimum delay between messages sent to the same producer (see the sketch after this list).
Don't use a queue. Just start a new Task every time a new message comes in, and arbitrate the limited resources using a semaphore. It works, but I don't see how to guarantee the FIFO requirement (sometimes messages from the same producer are processed in the wrong order).
Any other ideas?
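For what it's worth, here is a minimal sketch of what option 1 might look like, assuming a Channel-based outbound queue. OutgoingMessage, ProducerId, MinGap and the publish stub are all illustrative assumptions, and the single drain loop serializes sends, so one producer's delay can hold up sends to others:
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

record OutgoingMessage(string ProducerId, string Payload);

class OutboundSender
{
    private readonly Channel<OutgoingMessage> _outbox =
        Channel.CreateUnbounded<OutgoingMessage>();
    private readonly Dictionary<string, DateTime> _lastSent = new();
    private static readonly TimeSpan MinGap = TimeSpan.FromMilliseconds(500);

    // Consumers queue replies here and immediately go back to processing.
    public ValueTask QueueAsync(OutgoingMessage msg) =>
        _outbox.Writer.WriteAsync(msg);

    // A single loop drains the outbox and enforces the per-producer gap.
    public async Task RunAsync()
    {
        await foreach (var msg in _outbox.Reader.ReadAllAsync())
        {
            if (_lastSent.TryGetValue(msg.ProducerId, out var last))
            {
                var wait = last + MinGap - DateTime.UtcNow;
                if (wait > TimeSpan.Zero) await Task.Delay(wait);
            }
            await SendAsync(msg);
            _lastSent[msg.ProducerId] = DateTime.UtcNow;
        }
    }

    private Task SendAsync(OutgoingMessage msg) =>
        Task.CompletedTask; // the actual MQTT publish would go here
}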

Async file I/O overhead in C#

I've got a problem where I have to process a large batch of large jsonl files (read, deserialize, do some transforms, db lookups, etc., then write the transformed results) in a .NET Core console app.
I've gotten better throughput by putting the output in batches on a separate thread, and I was trying to improve the processing side by adding some parallelism, but the overhead ended up being self-defeating.
I had been doing:
using (var stream = new FileStream(_filePath, FileMode.Open))
using (var reader = new StreamReader(stream))
{
    for (;;)
    {
        var l = reader.ReadLine();
        if (l == null)
            break;
        // Deserialize
        // Do some database lookups
        // Do some transforms
        // Pass result to output thread
    }
}
And some diagnostic timings showed me that the ReadLine() call was taking more than the deserialization, etc. To put some numbers on that, a large file would have about:
11 seconds spent on ReadLine
7.8 seconds spent on deserialization
10 seconds spent on db lookups
I wanted to overlap those 11 seconds of file I/O with the other work, so I tried:
using (var stream = new FileStream(_filePath, FileMode.Open))
using (var reader = new StreamReader(stream))
{
    var nextLine = reader.ReadLineAsync();
    for (;;)
    {
        var l = nextLine.Result;
        if (l == null)
            break;
        nextLine = reader.ReadLineAsync();
        // Deserialize
        // Do some database lookups
        // Do some transforms
        // Pass result to output thread
    }
}
The idea was to get the next I/O going while I did the transform work. Only that ended up taking a lot longer than the regular sync stuff (like twice as long).
I've got a requirement for predictability in the overall result (i.e. the same set of files have to be processed in name order, and the output rows have to be predictably in the same order), so I can't just throw a file per thread and let them fight it out.
I was just trying to introduce enough parallelism to smooth the throughput over a large set of inputs, and I was surprised at how counterproductive the above turned out to be.
Am I missing something here?
The built-in asynchronous filesystem APIs are currently broken, and you are advised to avoid them. Not only are they much slower than their synchronous counterparts, but they are not even truly asynchronous. .NET 6 will come with an improved FileStream implementation, so in a few months this may no longer be an issue.
What you are trying to achieve is called task-parallelism, where two or more heterogeneous operations run concurrently and independently of each other. It's an advanced technique and it requires specialized tools. The most common type of parallelism is the so-called data-parallelism, where the same type of operation runs in parallel on a list of homogeneous data, and it's commonly implemented using the Parallel class or the PLINQ library.
To achieve task-parallelism, the most readily available tool is the TPL Dataflow library, which is built into the .NET Core / .NET 5 platforms; you only need to install a package if you are targeting the .NET Framework. This library allows you to create a pipeline consisting of linked components called "blocks" (TransformBlock, ActionBlock, BatchBlock etc.), where each block acts as an independent processor with its own input and output queues. You feed the pipeline with data, and the data flows from block to block through the pipeline while being processed along the way. You Complete the first block in the pipeline to signal that no more input data will ever be available, and then await the Completion of the last block to make your code wait until all the work has been done. Here is an example:
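For contrast, a small sketch of what the data-parallel flavor of the middle step could look like with PLINQ (assuming the Deserialize method from this question is thread-safe):
// PLINQ: the same operation over homogeneous data, with order preserved
// to satisfy the predictable-output requirement.
var objects = File.ReadLines(_filePath)
    .AsParallel().AsOrdered()
    .Select(line => Deserialize(line))
    .ToList();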
private async void Button1_Click(object sender, EventArgs e)
{
    Button1.Enabled = false;

    var fileBlock = new TransformManyBlock<string, IList<string>>(filePath =>
    {
        return File.ReadLines(filePath).Buffer(10);
    });

    var deserializeBlock = new TransformBlock<IList<string>, MyObject[]>(lines =>
    {
        return lines.Select(line => Deserialize(line)).ToArray();
    }, new ExecutionDataflowBlockOptions()
    {
        MaxDegreeOfParallelism = 2 // Let's assume that Deserialize is parallelizable
    });

    var persistBlock = new TransformBlock<MyObject[], MyObject[]>(async objects =>
    {
        foreach (MyObject obj in objects) await PersistToDbAsync(obj);
        return objects;
    });

    var displayBlock = new ActionBlock<MyObject[]>(objects =>
    {
        foreach (MyObject obj in objects) TextBox1.AppendText($"{obj}\r\n");
    }, new ExecutionDataflowBlockOptions()
    {
        // Make sure that the delegate will be invoked on the UI thread
        TaskScheduler = TaskScheduler.FromCurrentSynchronizationContext()
    });

    fileBlock.LinkTo(deserializeBlock,
        new DataflowLinkOptions { PropagateCompletion = true });
    deserializeBlock.LinkTo(persistBlock,
        new DataflowLinkOptions { PropagateCompletion = true });
    persistBlock.LinkTo(displayBlock,
        new DataflowLinkOptions { PropagateCompletion = true });

    foreach (var filePath in Directory.GetFiles(@"C:\Data"))
        await fileBlock.SendAsync(filePath);
    fileBlock.Complete();

    await displayBlock.Completion;
    MessageBox.Show("Done");
    Button1.Enabled = true;
}
The data passed through the pipeline should be chunky. If each unit of work is too lightweight, you should batch the units into arrays or lists; otherwise the overhead of moving lots of tiny data around is going to outweigh the benefits of parallelism. That's the reason for using the Buffer LINQ operator (from the System.Interactive package) in the above example. .NET 6 will come with a new Chunk LINQ operator, offering the same functionality.
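For example, under .NET 6 the Buffer(10) call in the sketch above could plausibly be replaced with the built-in operator:
// .NET 6+: batch the lines into arrays of up to 10, no extra package needed.
IEnumerable<string[]> batches = File.ReadLines(filePath).Chunk(10);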
Theodor's suggestion looks like a really powerful and useful library that's worth checking out, but if you're looking for a smaller DIY solution this is how I would approach it:
using System;
using System.IO;
using System.Threading.Tasks;
using System.Collections.Generic;

namespace Parallelism
{
    class Program
    {
        private static Queue<string> _queue = new Queue<string>();
        private static Task _lastProcessTask;

        static async Task Main(string[] args)
        {
            string path = "???";
            await ReadAndProcessAsync(path);
        }

        private static async Task ReadAndProcessAsync(string path)
        {
            using (var str = File.OpenRead(path))
            using (var sr = new StreamReader(str))
            {
                string line = null;
                while (true)
                {
                    line = await sr.ReadLineAsync();
                    if (line == null)
                        break;
                    lock (_queue)
                    {
                        _queue.Enqueue(line);
                        if (_queue.Count == 1)
                            // There was nothing in the queue before,
                            // so initiate a new processing loop. Save
                            // but DON'T await the Task yet.
                            _lastProcessTask = ProcessQueueAsync();
                    }
                }
            }
            // Now that file reading is completed, await
            // _lastProcessTask to ensure we don't return
            // before it's finished. (It's null only if the
            // file was empty.)
            if (_lastProcessTask != null)
                await _lastProcessTask;
        }

        // This will continue processing as long as lines are in the queue,
        // including new lines entering the queue while processing earlier ones.
        private static Task ProcessQueueAsync()
        {
            return Task.Run(async () =>
            {
                while (true)
                {
                    string line;
                    lock (_queue)
                    {
                        // Only peek at first, so the read loop doesn't think
                        // the queue is empty and initiate a second processing
                        // loop while we're processing this line.
                        if (!_queue.TryPeek(out line))
                            return;
                    }
                    await ProcessLineAsync(line);
                    lock (_queue)
                    {
                        // Dequeue the item we just processed. If it's the last
                        // one, this loop is done.
                        _queue.Dequeue();
                        if (_queue.Count == 0)
                            return;
                    }
                }
            });
        }

        private static async Task ProcessLineAsync(string line)
        {
            // do something
        }
    }
}
Note this approach has a processing loop that terminates when nothing is left in the queue and is re-initiated when new items arrive. Another approach would be a continuous processing loop that repeatedly re-checks and does a Task.Delay() for a small amount of time while the queue is empty. I like my approach better because it doesn't bog down the worker thread with periodic and unnecessary checks, but the performance difference would likely be unnoticeable.
Also, just to comment on Blindy's answer, I have to disagree with discouraging the use of parallelism here. First off, most CPUs these days are multi-core, so smart use of the .NET thread pool will in fact maximize your application's efficiency when run on multi-core CPUs, with pretty minimal downside in single-core scenarios.
More importantly, though, async does not equal multithreading. Asynchronous programming existed long before multithreading, I/O being the most notable example. I/O operations are in large part handled by hardware other than the CPU - the NIC, SATA controllers, etc. They use an ancient concept called the Hardware Interrupt that most coders today have probably never heard of and that predates multithreading by decades. It's basically just a way to give the CPU a callback to execute when an off-CPU operation is finished. So when you use a well-behaved asynchronous API (notwithstanding that .NET FileStream has issues, as Theodor mentioned), your CPU really shouldn't be doing that much work at all. And when you await such an API, the CPU is basically sitting idle until the other hardware in the machine has written the requested data to RAM.
I agree with Blindy that it would be better if computer science programs did a better job of teaching people how computer hardware actually works. Looking to take advantage of the fact that the CPU can be doing other things while waiting for data to be read off the disk, off a network, etc., is, in the words of Captain Kirk, "officer thinking".
11 seconds spent on ReadLine
More like, specifically, 11 seconds spent on file I/O, but you didn't measure that.
Replace your stream creation with this instead:
using var reader = new StreamReader(_filePath, Encoding.UTF8, false, 50 * 1024 * 1024);
That will cause it to read into a 50 MB buffer (play with the size as needed) to avoid repeated I/O on what seems like an ancient hard drive.
I was just trying to introduce enough parallelism to smooth the throughput
Not only did you not introduce any parallelism at all, but you used ReadLineAsync wrong -- it returns a Task<string>, not a string.
It's complete overkill; the buffer size increase will most likely fix your issue. But if you want to actually do this, you need two threads that communicate over a shared data structure, as Peter said.
Only that ended up taking a lot longer than the regular sync stuff
It baffles me that people think multi-threaded code should take less processing power than single-threaded code. There must be some really basic understanding missing from present-day education to lead to this. Multi-threading involves multiple extra context switches, mutex contention, your OS scheduler kicking in to replace one of your threads (leading to starvation or oversaturation), and gathering, serializing, and aggregating results after the work is done. None of that is free or easy to implement.

Reduce CPU usage while processing large amount of data

I am writing a real-time application which receives around 2000 messages per second, which are pushed into a queue. I have written a background thread which processes the messages in the queue.
private void ProcessSocketMessage()
{
    while (!this.shouldStopProcessing)
    {
        while (this.messageQueue.Count > 0)
        {
            string message;
            bool result = this.messageQueue.TryDequeue(out message);
            if (result)
            {
                // Process the string and do some other stuff
                // Like updating the received message in a datagrid
            }
        }
    }
}
The problem with the above code is that it uses an insane amount of processing power: around 12% of CPU on a 2.40 GHz dual-core processor.
I have 4 blocks similar to the one above, which together take up 50% of the CPU's computing power.
Is there anything which can be optimized in the above code?
Adding a Thread.Sleep of 100 ms before the end of the second while loop does seem to improve the performance by 50%. But am I doing something wrong?
This functionality is already provided in the Dataflow library's ActionBlock class. An ActionBlock has an input buffer that receives messages and processes them by calling an action for each one. By default, only one message is processed at a time. It doesn't use busy waiting.
void MyActualProcessingMethod(string it)
{
    // Process the string and do some other stuff
}

var myBlock = new ActionBlock<string>(someString => MyActualProcessingMethod(someString));

// Simulate a lot of messages
for (int i = 0; i < 100000; i++)
{
    myBlock.Post(someMessage);
}
When the messages finish, or we don't want any more messages, we command the block to complete; it then refuses any new messages and processes anything left in its input buffer:
myBlock.Complete();
Before we finish, we need to actually await the block's completion:
await myBlock.Completion;
All Dataflow blocks can accept messages from multiple clients.
Blocks can be combined as well. The output of one block can feed another. The TransformBlock accepts a function that transforms an input into an output.
Typically each block uses tasks from the thread pool. By default one block processes only one message at a time. Different blocks run on different tasks or even different TaskSchedulers. This way, you can have one block do some heavy processing and push a result to another block that updates the UI.
string MyActualProcessingMethod(string it)
{
    // Process the string and do some other stuff
    // and send a progress message downstream
    return SomeProgressMessage;
}

void UpdateTheUI(string msg)
{
    statusBar1.Text = msg;
}

var myProcessingBlock = new TransformBlock<string, string>(msg => MyActualProcessingMethod(msg));
The UI will be updated by another block that runs on the UI thread. This is expressed through the ExecutionDataflowBlockOptions:
var runOnUI = new ExecutionDataflowBlockOptions
{
    TaskScheduler = TaskScheduler.FromCurrentSynchronizationContext()
};

var myUpdater = new ActionBlock<string>(msg => UpdateTheUI(msg), runOnUI);

// Pass progress messages from the processor to the updater
myProcessingBlock.LinkTo(myUpdater, new DataflowLinkOptions { PropagateCompletion = true });
The code that posts messages to the pipeline's first block doesn't change:
// Simulate a lot of messages
for (int i = 0; i < 100000; i++)
{
    myProcessingBlock.Post(someMessage);
}

// We are finished, tell the block to process any leftover messages
myProcessingBlock.Complete();
In this case, as soon as the processor completes, it will notify the next block in the pipeline to complete. We need to wait for that final block to complete as well:
// Wait for the block to finish
await myUpdater.Completion;
How about making the first block work in parallel? We can specify that up to, e.g., 10 tasks will be used to process input messages, through its execution options:
var dopOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 };
var myProcessingBlock = new TransformBlock<string, string>(msg => MyActualProcessingMethod(msg), dopOptions);
The processor will process up to 10 messages in parallel, but the updater will still process them one by one, on the UI thread.
Your best bet is to use a profiler to monitor the running application and determine for sure where the CPU is spending its time.
However, it looks like you have a busy-wait loop whenever this.messageQueue.Count is 0. At minimum, I would suggest adding a small pause when the queue is empty to allow messages to go onto the queue. Otherwise your CPU just spends time checking the queue over and over and over.
If the time is spent dequeueing messages, you may want to consider handling multiple messages at once (if multiple messages are available), assuming your queue allows you to pop multiple messages off in a single call.
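As a minimal sketch of that fix, assuming the queue can be swapped for a BlockingCollection<string> (from System.Collections.Concurrent), whose GetConsumingEnumerable blocks idly while the queue is empty instead of spinning:
private BlockingCollection<string> messageQueue = new BlockingCollection<string>();

private void ProcessSocketMessage()
{
    // Blocks without burning CPU until a message arrives,
    // and exits cleanly once CompleteAdding() has been called.
    foreach (string message in this.messageQueue.GetConsumingEnumerable())
    {
        // Process the string and do some other stuff
        // Like updating the received message in a datagrid
    }
}

// To stop the loop gracefully (replaces shouldStopProcessing):
// this.messageQueue.CompleteAdding();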

Thread Pool of workers in a Windows service?

I'm creating a Windows service with 2 separate components:
1 component creates jobs and inserts them into the database (1 thread)
The 2nd component processes these jobs (a FIXED number of threads in a thread pool)
These 2 components will always run as long as the service is running.
What I'm stuck on is determining how to implement this thread pool. I've done some research, and there seem to be many ways of doing it, such as creating a class that exposes a "ThreadPoolCallback" method and using ThreadPool.QueueUserWorkItem to queue a work item: http://msdn.microsoft.com/en-us/library/3dasc8as.aspx
However, the example given doesn't seem to fit my scenario. I want to create a FIXED number of threads in a thread pool initially, then feed it jobs to process. How do I do this?
// Wrapper method for use with thread pool.
public void ThreadPoolCallback(Object threadContext)
{
    int threadIndex = (int)threadContext;
    Console.WriteLine("thread {0} started...", threadIndex);
    _fibOfN = Calculate(_n);
    Console.WriteLine("thread {0} result calculated...", threadIndex);
    _doneEvent.Set();
}

const int FibonacciCalculations = 10;
Fibonacci[] fibArray = new Fibonacci[FibonacciCalculations];

// f is a Fibonacci instance, as in the MSDN sample.
for (int i = 0; i < FibonacciCalculations; i++)
{
    ThreadPool.QueueUserWorkItem(f.ThreadPoolCallback, i);
}
Create a BlockingCollection of work items. The thread that creates jobs adds them to this collection.
Create a fixed number of persistent threads that read items from that BlockingCollection and process them. Something like:
BlockingCollection<WorkItem> WorkItems = new BlockingCollection<WorkItem>();

void WorkerThreadProc()
{
    foreach (var item in WorkItems.GetConsumingEnumerable())
    {
        // process item
    }
}
Multiple worker threads can be doing that concurrently. BlockingCollection supports multiple readers and writers, so there are no concurrency problems that you have to deal with.
See my blog post Simple Multithreading, part 2 for an example that uses one consumer and one producer. Adding multiple consumers is a very simple matter of spinning up a new task for each consumer.
Another way to do it is to use a semaphore that controls how many jobs are currently being processed. I show how to do that in this answer. However, I think the shared BlockingCollection is in general a better solution.
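To round that out, here is a short sketch of spinning up the fixed set of workers and shutting them down cleanly (the worker count and WorkItem type are placeholders):
const int WorkerCount = 4;

var workers = new Task[WorkerCount];
for (int i = 0; i < WorkerCount; i++)
{
    // LongRunning hints that these are persistent threads,
    // keeping them out of the regular thread pool heuristics.
    workers[i] = Task.Factory.StartNew(WorkerThreadProc,
        TaskCreationOptions.LongRunning);
}

// The producer component adds jobs as it creates them:
// WorkItems.Add(new WorkItem(...));

// At service shutdown: let GetConsumingEnumerable drain and finish.
WorkItems.CompleteAdding();
Task.WaitAll(workers);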
The .NET thread pool isn't really designed for a fixed number of threads. It's designed to use the resources of the machine in the best way possible to perform multiple relatively small jobs.
Maybe a better solution for you would be to instantiate a fixed number of BackgroundWorkers instead? There are some reasonable BW examples.
