How to Perform Interlocked.Increment/Decrement Properly in IO Bound? - c#

I have a simple I/O-bound .NET 4.0 console application that sends 1 to n requests to a web service, waits for them to complete, and then exits. Here is a sample:
static int counter = 0;
static void Main(string[] args)
{
foreach (my Loop)
{
......................
WebClientHelper.PostDataAsync(... =>
{
................................
................................
Interlocked.Decrement(ref counter);
});
Interlocked.Increment(ref counter);
}
while(counter != 0)
{
Thread.Sleep(500);
}
}
Is this a correct implementation?

You can use Tasks. Let TPL manage those things.
Task<T>[] tasks = ...;
//Started the tasks
Task.WaitAll(tasks);
Another way is to use TaskCompletionSource as mentioned here.
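To make the TaskCompletionSource suggestion concrete, here is a minimal sketch of wrapping a callback-style call in a Task and then waiting with Task.WaitAll. The PostDataAsync below is a hypothetical stand-in for WebClientHelper.PostDataAsync (its real signature is not shown in the question), and the names PostDataTask and PostAll are invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TcsSketch
{
    // Hypothetical stand-in for WebClientHelper.PostDataAsync: runs the work
    // on a pool thread and then invokes the callback with the result.
    static void PostDataAsync(int payload, Action<int> callback)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Thread.Sleep(50);          // simulated I/O latency
            callback(payload * 2);     // simulated response
        });
    }

    // Adapts the callback API to a Task that completes when the callback fires.
    static Task<int> PostDataTask(int payload)
    {
        var tcs = new TaskCompletionSource<int>();
        PostDataAsync(payload, result => tcs.SetResult(result));
        return tcs.Task;
    }

    public static int[] PostAll(int count)
    {
        var tasks = new Task<int>[count];
        for (int i = 0; i < count; i++)
            tasks[i] = PostDataTask(i);

        Task.WaitAll(tasks);           // blocks until every request has completed

        var results = new int[count];
        for (int i = 0; i < count; i++)
            results[i] = tasks[i].Result;
        return results;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", PostAll(5)));
    }
}
```

With this shape there is no shared counter to maintain at all; a real adapter would also call tcs.SetException when the request fails, so that Task.WaitAll surfaces the error.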

As suggested by Hans, here's your code implemented with CountdownEvent:
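static void Main(string[] args)
{
// Start at 1: CountdownEvent has no parameterless constructor, and AddCount
// throws once the count has already reached zero.
var counter = new CountdownEvent(1);
foreach (my Loop)
{
......................
counter.AddCount(); // register the request before starting it
WebClientHelper.PostDataAsync(... =>
{
................................
................................
counter.Signal();
});
}
counter.Signal(); // release the initial count held by the main thread
counter.Wait();
}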
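For reference, a self-contained version of the CountdownEvent approach might look like the sketch below. PostDataAsync here is a hypothetical stand-in for the web-service call, and RunRequests is an invented name. The event starts at 1 so that it cannot reach zero while requests are still being dispatched; the dispatching thread gives up that extra count once the loop ends.

```csharp
using System;
using System.Threading;

static class CountdownSketch
{
    // Hypothetical stand-in for the asynchronous call in the question.
    static void PostDataAsync(int payload, Action callback)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Thread.Sleep(50);   // simulated I/O latency
            callback();
        });
    }

    public static int RunRequests(int n)
    {
        int completed = 0;
        // Initial count of 1 belongs to the dispatching thread.
        var pending = new CountdownEvent(1);

        for (int i = 0; i < n; i++)
        {
            pending.AddCount();                     // register before starting
            PostDataAsync(i, () =>
            {
                Interlocked.Increment(ref completed);
                pending.Signal();                   // one request finished
            });
        }

        pending.Signal();                           // release the dispatcher's count
        pending.Wait();                             // blocks until all callbacks ran
        return completed;
    }

    static void Main()
    {
        Console.WriteLine(RunRequests(10));
    }
}
```

Compared with the Interlocked counter plus Thread.Sleep(500) loop, the main thread wakes as soon as the last callback signals instead of up to half a second later.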

Related

Multi Thread & Semaphore & Events

I am trying to execute commands that arrive from RabbitMQ at about 5 messages per second. Because there are so many messages, I hand each one to a thread for execution, but I can't have unlimited threads, so I set a limit of 10.
The idea was that messages would arrive at the worker and be put in a queue, and any of the 10 threads would pick one up and execute it, all coordinated with a semaphore.
After some experiments, and I don't know why, only 3 or 4 items are executed; after that, processing just stops with no error.
I think the problem is in the logic where the event handler calls the method that executes the work, but I could not think of a better way.
Why are only the first 4 messages processed?
What pattern, or better approach, should I use for this?
Here are some parts of my code:
const int MaxThreads = 10;
private static Semaphore sem = new Semaphore(MaxThreads, MaxThreads);
private static Queue<BasicDeliverEventArgs> queue = new Queue<BasicDeliverEventArgs>();
static void Main(string[] args)
{
consumer.Received += (sender, ea) =>
{
var m = JsonConvert.DeserializeObject<Mail>(ea.Body.GetString());
Console.WriteLine($"Sub-> {m.Subject}");
queue.Enqueue(ea);
RUN();
};
channel.BasicConsume(queueName, false, consumer);
Console.Read();
}
private static void RUN()
{
while (queue.Count > 0)
{
sem.WaitOne();
var item = queue.Dequeue();
ThreadPool.QueueUserWorkItem(sendmail, item);
}
}
private static void sendmail(Object item)
{
//.....some processing stuff....
//tell rabbitMq that everything was OK
channel.BasicAck(deliveryTag: x.DeliveryTag, multiple: true);
//release thread
sem.Release();
}
I think you could use a blocking collection here; it will simplify the code.
Your email sender would then look something like this:
public class ParallelEmailSender : IDisposable
{
private readonly BlockingCollection<string> blockingCollection;
public ParallelEmailSender(int threadsCount)
{
blockingCollection = new BlockingCollection<string>(new ConcurrentQueue<string>());
for (int i = 0; i < threadsCount; i++)
{
Task.Factory.StartNew(SendInternal);
}
}
public void Send(string message)
{
blockingCollection.Add(message);
}
private void SendInternal()
{
foreach (string message in blockingCollection.GetConsumingEnumerable())
{
// send method
}
}
public void Dispose()
{
blockingCollection.CompleteAdding();
}
}
Of course, you will need to add error-handling logic, and you could also improve the application's shutdown process by using cancellation tokens.
I strongly suggest reading the great e-book on multithreaded programming written by Joseph Albahari.
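As a rough illustration of the error handling and shutdown points above, the sketch below extends the blocking-collection idea so that a failing message does not kill its worker and Dispose waits for the queue to drain. It covers shutdown by draining rather than by cancellation tokens. BoundedWorkerPool, PoolDemo, and the handler delegate are hypothetical names introduced for this example, not part of the answer's class.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

sealed class BoundedWorkerPool : IDisposable
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(new ConcurrentQueue<string>());
    private readonly Task[] _workers;
    private readonly Action<string> _handler;
    private int _failures;

    public BoundedWorkerPool(int workerCount, Action<string> handler)
    {
        _handler = handler;
        _workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            // LongRunning hints that each worker should get a dedicated thread.
            _workers[i] = Task.Factory.StartNew(Consume, TaskCreationOptions.LongRunning);
        }
    }

    // Number of messages whose handler threw.
    public int Failures
    {
        get { return Interlocked.CompareExchange(ref _failures, 0, 0); }
    }

    public void Send(string message) { _queue.Add(message); }

    private void Consume()
    {
        // Blocks while the queue is empty; ends after CompleteAdding and drain.
        foreach (string message in _queue.GetConsumingEnumerable())
        {
            try { _handler(message); }
            catch (Exception)
            {
                // One bad message must not kill the worker; count it and move on.
                Interlocked.Increment(ref _failures);
            }
        }
    }

    // Stops accepting work, then waits until every queued message is handled.
    public void Dispose()
    {
        _queue.CompleteAdding();
        Task.WaitAll(_workers);
        _queue.Dispose();
    }
}

static class PoolDemo
{
    public static int Run(int messages, int workers)
    {
        int handled = 0;
        using (var pool = new BoundedWorkerPool(workers, m => Interlocked.Increment(ref handled)))
        {
            for (int i = 0; i < messages; i++) pool.Send("msg " + i);
        }   // Dispose returns only after every queued message was handled
        return handled;
    }

    static void Main() { Console.WriteLine(Run(100, 10)); }
}
```

Because the worker count is fixed at construction, no semaphore is needed: concurrency is limited by the number of consumers rather than by gating each dispatch.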

C# simple serial task queue memory leak

I needed a very basic serial execution queue.
I wrote the following based on this idea, but I needed a queue to ensure FIFO ordering, so I added an intermediate ConcurrentQueue<>.
Here is the code:
public class SimpleSerialTaskQueue
{
private SemaphoreSlim _semaphore = new SemaphoreSlim(0);
private ConcurrentQueue<Func<Task>> _taskQueue = new ConcurrentQueue<Func<Task>>();
public SimpleSerialTaskQueue()
{
Task.Run(async () =>
{
Func<Task> dequeuedTask;
while (true)
{
if (await _semaphore.WaitAsync(1000))
{
if (_taskQueue.TryDequeue(out dequeuedTask) == true)
{
await dequeuedTask();
}
}
else
{
Console.WriteLine("Nothing more to process");
//If I don't do that , memory pressure is never released
//GC.Collect();
}
}
});
}
public void Add(Func<Task> o_task)
{
_taskQueue.Enqueue(o_task);
_semaphore.Release();
}
}
When I run that in a loop, simulating heavy load, I get some kind of memory leak. Here is the code:
static void Main(string[] args)
{
SimpleSerialTaskQueue queue = new SimpleSerialTaskQueue();
for (int i = 0; i < 100000000; i++)
{
queue.Add(async () =>
{
await Task.Delay(0);
});
}
Console.ReadLine();
}
EDIT:
I don't understand why, once the tasks have been executed, about 750 MB of memory is still in use (according to the VS2015 diagnostic tools). I expected usage to be very low once everything had run. The GC doesn't seem to collect anything.
Can anyone tell me what is happening? Is this related to the async state machine?

How to handle threads that hang when using SemaphoreSlim

I have some code that runs thousands of URLs through a third party library. Occasionally the method in the library hangs which takes up a thread. After a while all threads are taken up by processes doing nothing and it grinds to a halt.
I am using a SemaphoreSlim to control adding new threads so I can have an optimal number of tasks running. I need a way to identify tasks that have been running too long and then to kill them but also release a thread from the SemaphoreSlim so a new task can be created.
I am struggling with the approach here, so I wrote some test code that imitates what I am doing. It creates tasks that each have a 10% chance of hanging, so very quickly all threads have hung.
How should I be checking for these and killing them off?
Here is the code:
class Program
{
public static SemaphoreSlim semaphore;
public static List<Task> taskList;
static void Main(string[] args)
{
List<string> urlList = new List<string>();
Console.WriteLine("Generating list");
for (int i = 0; i < 1000; i++)
{
//adding random strings to simulate a large list of URLs to process
urlList.Add(Path.GetRandomFileName());
}
Console.WriteLine("Queueing tasks");
semaphore = new SemaphoreSlim(10, 10);
Task.Run(() => QueueTasks(urlList));
Console.ReadLine();
}
static void QueueTasks(List<string> urlList)
{
taskList = new List<Task>();
foreach (var url in urlList)
{
Console.WriteLine("{0} tasks can enter the semaphore.",
semaphore.CurrentCount);
semaphore.Wait();
taskList.Add(DoTheThing(url));
}
}
static async Task DoTheThing(string url)
{
Random rand = new Random();
// simulate the IO process
await Task.Delay(rand.Next(2000, 10000));
// add a 10% chance that the thread will hang simulating what happens occasionally with http request
int chance = rand.Next(1, 100);
if (chance <= 10)
{
while (true)
{
await Task.Delay(1000000);
}
}
semaphore.Release();
Console.WriteLine(url);
}
}
As people have already pointed out, aborting threads is generally a bad idea, and there is no guaranteed way of doing it in C#. Doing the work in a separate process and then killing that process is slightly better than attempting Thread.Abort, but it is still not the best way to go. Ideally, you want cooperative threads or processes that use IPC to decide for themselves when to bail out. That way the cleanup is done properly.
With all that said, you can use code like the following to do what you intend. I have written it assuming your task will run in a thread. With slight changes, you can use the same logic to run your task in a process.
The code is by no means bulletproof and is meant to be illustrative. The concurrent parts are not well tested. Locks are held for longer than needed, and in some places I am not locking at all (for example, in the Log function).
class TaskInfo {
public Thread Task;
public DateTime StartTime;
public TaskInfo(ParameterizedThreadStart startInfo, object startArg) {
Task = new Thread(startInfo);
Task.Start(startArg);
StartTime = DateTime.Now;
}
}
class Program {
const int MAX_THREADS = 1;
const int TASK_TIMEOUT = 6; // in seconds
const int CLEANUP_INTERVAL = TASK_TIMEOUT; // in seconds
public static SemaphoreSlim semaphore;
public static List<TaskInfo> TaskList;
public static object TaskListLock = new object();
public static Timer CleanupTimer;
static void Main(string[] args) {
List<string> urlList = new List<string>();
Log("Generating list");
for (int i = 0; i < 2; i++) {
//adding random strings to simulate a large list of URLs to process
urlList.Add(Path.GetRandomFileName());
}
Log("Queueing tasks");
semaphore = new SemaphoreSlim(MAX_THREADS, MAX_THREADS);
Task.Run(() => QueueTasks(urlList));
CleanupTimer = new Timer(CleanupTasks, null, CLEANUP_INTERVAL * 1000, CLEANUP_INTERVAL * 1000);
Console.ReadLine();
}
// TODO: Guard against re-entrancy
static void CleanupTasks(object state) {
Log("CleanupTasks started");
lock (TaskListLock) {
var now = DateTime.Now;
int n = TaskList.Count;
for (int i = n - 1; i >= 0; --i) {
var task = TaskList[i];
Log($"Checking task with ID {task.Task.ManagedThreadId}");
// kill processes running for longer than anticipated
if (task.Task.IsAlive && now.Subtract(task.StartTime).TotalSeconds >= TASK_TIMEOUT) {
Log("Cleaning up hung task");
task.Task.Abort();
}
// remove task if it is not alive
if (!task.Task.IsAlive) {
Log("Removing dead task from list");
TaskList.RemoveAt(i);
continue;
}
}
if (TaskList.Count == 0) {
Log("Disposing cleanup thread");
CleanupTimer.Dispose();
}
}
Log("CleanupTasks done");
}
static void QueueTasks(List<string> urlList) {
TaskList = new List<TaskInfo>();
foreach (var url in urlList) {
Log($"Trying to schedule url = {url}");
semaphore.Wait();
Log("Semaphore acquired");
ParameterizedThreadStart taskRoutine = obj => {
try {
DoTheThing((string)obj);
} finally {
Log("Releasing semaphore");
semaphore.Release();
}
};
var task = new TaskInfo(taskRoutine, url);
lock (TaskListLock)
TaskList.Add(task);
}
Log("All tasks queued");
}
// simulate all processes get hung
static void DoTheThing(string url) {
while (true)
Thread.Sleep(5000);
}
static void Log(string msg) {
Console.WriteLine("{0:HH:mm:ss.fff} Thread {1,2} {2}", DateTime.Now, Thread.CurrentThread.ManagedThreadId.ToString(), msg);
}
}
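A cooperative alternative that avoids Thread.Abort entirely (which is not supported on .NET Core and later) is to race each task against a timeout and free the semaphore slot either way. The sketch below assumes the hanging call is awaitable, as in the question's test code; DoTheThing is a hypothetical stand-in, and RunWithTimeout and Process are invented names. Note that a timed-out task is abandoned rather than killed: if the real library blocks a thread synchronously, that thread stays occupied and only the semaphore slot is recovered.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TimeoutSketch
{
    // Hypothetical stand-in for the third-party call; "hangs" when asked to.
    static async Task DoTheThing(string url, bool hang)
    {
        await Task.Delay(hang ? Timeout.Infinite : 20);
    }

    // Returns true if the work finished in time, false if it was abandoned.
    static async Task<bool> RunWithTimeout(Task work, TimeSpan timeout, SemaphoreSlim gate)
    {
        try
        {
            Task first = await Task.WhenAny(work, Task.Delay(timeout));
            if (first != work) return false;   // timed out; work is abandoned, not killed
            await work;                        // observe any exception from the work
            return true;
        }
        finally
        {
            gate.Release();                    // always free the slot
        }
    }

    // Runs `total` items, at most 3 at a time; returns how many timed out.
    public static int Process(int total, int hangEvery)
    {
        var gate = new SemaphoreSlim(3, 3);
        var runs = new Task<bool>[total];
        for (int i = 0; i < total; i++)
        {
            gate.Wait();                       // wait for a free slot
            bool hang = hangEvery > 0 && i % hangEvery == 0;
            runs[i] = RunWithTimeout(DoTheThing("url" + i, hang),
                                     TimeSpan.FromMilliseconds(200), gate);
        }

        Task.WaitAll(runs);
        int timedOut = 0;
        foreach (var r in runs) if (!r.Result) timedOut++;
        return timedOut;
    }

    static void Main() { Console.WriteLine(Process(10, 5)); }
}
```

Releasing in a finally block is the important part: in the question's code a hung task never reaches semaphore.Release(), which is why the pipeline eventually stalls.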

Producer Consumer in C# with multiple (parallel) consumers and no TPL Dataflow

I am trying to implement producer/consumer pattern with multiple or parallel consumers.
I did an implementation but I would like to know how good it is. Can somebody do better? Can any of you spot any errors?
Unfortunately, I cannot use TPL Dataflow: we are at the end of our project, and adding an extra library to our package would take too much paperwork, for which we do not have the time.
What I am trying to do is to speed up the following portion:
anIntermediaryList = StepOne(anInputList); // I will put StepOne as Producer :-) Step one is remote call.
aResultList = StepTwo(anIntermediaryList); // I will put StepTwo as Consumer, however he also produces result. Step two is also a remote call.
// StepOne is way faster than StepTwo.
For this I came up with the idea that I will chunk the input list (anInputList)
StepOne will be inside of a Producer and will put the intermediary chunks into a queue.
There will be multiple Consumers, and they will take the intermediary results and process them with StepTwo.
Here is a simplified version of the implementation that follows later:
Task.Run(() => {
aChunkinputList = Split(anInputList)
foreach(aChunk in aChunkinputList)
{
anIntermediaryResult = StepOne(aChunk)
intermediaryQueue.Add(anIntermediaryResult)
}
})
while(intermediaryQueue.HasItems)
{
anItermediaryResult = intermediaryQueue.Dequeue()
Task.Run(() => {
aResultList = StepTwo(anItermediaryResult);
resultQueue.Add(aResultList)
})
}
I also thought that the best number for the parallel running Consumers would be: "Environment.ProcessorCount / 2". I would like to know if this also is a good idea.
Now here is my mock implementation and the question is can somebody do better or spot any error?
class Example
{
protected static readonly int ParameterCount_ = 1000;
protected static readonly int ChunkSize_ = 100;
// This might be a good number for the parallel consumers.
protected static readonly int ConsumerCount_ = Environment.ProcessorCount / 2;
protected Semaphore mySemaphore_ = new Semaphore(Example.ConsumerCount_, Example.ConsumerCount_);
protected ConcurrentQueue<List<int>> myIntermediaryQueue_ = new ConcurrentQueue<List<int>>();
protected ConcurrentQueue<List<int>> myResultQueue_ = new ConcurrentQueue<List<int>>();
public void Main()
{
List<int> aListToProcess = new List<int>(Example.ParameterCount_ + 1);
aListToProcess.AddRange(Enumerable.Range(0, Example.ParameterCount_));
Task aProducerTask = Task.Run(() => Producer(aListToProcess));
List<Task> aTaskList = new List<Task>();
while(!aProducerTask.IsCompleted || myIntermediaryQueue_.Count > 0)
{
List<int> aChunkToProcess;
if (myIntermediaryQueue_.TryDequeue(out aChunkToProcess))
{
mySemaphore_.WaitOne();
aTaskList.Add(Task.Run(() => Consumer(aChunkToProcess)));
}
}
Task.WaitAll(aTaskList.ToArray());
List<int> aResultList = new List<int>();
foreach(List<int> aChunk in myResultQueue_)
{
aResultList.AddRange(aChunk);
}
aResultList.Sort();
if (aListToProcess.SequenceEqual(aResultList))
{
Console.WriteLine("All good!");
}
else
{
Console.WriteLine("Bad, very bad!");
}
}
protected void Producer(List<int> elements_in)
{
List<List<int>> aChunkList = Example.SplitList(elements_in, Example.ChunkSize_);
foreach(List<int> aChunk in aChunkList)
{
Console.WriteLine("Thread Id: {0} Producing from: ({1}-{2})",
Thread.CurrentThread.ManagedThreadId,
aChunk.First(),
aChunk.Last());
myIntermediaryQueue_.Enqueue(ProduceItemsRemoteCall(aChunk));
}
}
protected void Consumer(List<int> elements_in)
{
Console.WriteLine("Thread Id: {0} Consuming from: ({1}-{2})",
Thread.CurrentThread.ManagedThreadId,
Convert.ToInt32(Math.Sqrt(elements_in.First())),
Convert.ToInt32(Math.Sqrt(elements_in.Last())));
myResultQueue_.Enqueue(ConsumeItemsRemoteCall(elements_in));
mySemaphore_.Release();
}
// Dummy Remote Call
protected List<int> ProduceItemsRemoteCall(List<int> elements_in)
{
return elements_in.Select(x => x * x).ToList();
}
// Dummy Remote Call
protected List<int> ConsumeItemsRemoteCall(List<int> elements_in)
{
return elements_in.Select(x => Convert.ToInt32(Math.Sqrt(x))).ToList();
}
public static List<List<int>> SplitList(List<int> masterList_in, int chunkSize_in)
{
List<List<int>> aReturnList = new List<List<int>>();
for (int i = 0; i < masterList_in.Count; i += chunkSize_in)
{
aReturnList.Add(masterList_in.GetRange(i, Math.Min(chunkSize_in, masterList_in.Count - i)));
}
return aReturnList;
}
}
Main function:
class Program
{
static void Main(string[] args)
{
Example anExample = new Example();
anExample.Main();
}
}
Bye
Laszlo
Based on the comments I've posted a second and third version:
https://codereview.stackexchange.com/questions/71182/producer-consumer-in-c-with-multiple-parallel-consumers-and-no-tpl-dataflow/71233#71233

How to use multi threading in a For loop

I want to achieve the below requirement; please suggest some solution.
string[] filenames = Directory.GetFiles(@"C:\Temp"); //10 files
for (int i = 0; i < filenames.Length; i++)
{
ProcessFile(filenames[i]); //it takes time to execute
}
I want to implement multi-threading. For example, say there are 10 files and I want to process 3 at a time (configurable, say maxthreadcount). Three files should be processed on 3 threads from the for loop, and whenever a thread finishes, it should pick up the next item from the loop. I also want to ensure that all the files have been processed before execution leaves the for loop.
Please suggest best approach.
Try
Parallel.For(0, filenames.Length, i => {
ProcessFile(filenames[i]);
});
MSDN
It's only available since .NET 4. I hope that's acceptable.
This will do the job in .net 2.0:
class Program
{
static int workingCounter = 0;
static int workingLimit = 10;
static int processedCounter = 0;
static void Main(string[] args)
{
string[] files = Directory.GetFiles("C:\\Temp");
int checkCount = files.Length;
foreach (string file in files)
{
//wait for free limit...
while (workingCounter >= workingLimit)
{
Thread.Sleep(100);
}
Interlocked.Increment(ref workingCounter); // atomic: worker threads decrement this counter concurrently
ParameterizedThreadStart pts = new ParameterizedThreadStart(ProcessFile);
Thread th = new Thread(pts);
th.Start(file);
}
//wait for all threads to complete...
while (processedCounter< checkCount)
{
Thread.Sleep(100);
}
Console.WriteLine("Work completed!");
}
static void ProcessFile(object file)
{
try
{
Console.WriteLine(DateTime.Now.ToString() + " received: " + file + " thread count is: " + workingCounter.ToString());
//make some sleep for demo...
Thread.Sleep(2000);
}
catch (Exception ex)
{
//handle your exception...
string exMsg = ex.Message;
}
finally
{
Interlocked.Decrement(ref workingCounter);
Interlocked.Increment(ref processedCounter);
}
}
}
Take a look at the Producer/Consumer Queue example by Joe Albahari. It should provide a good starting point for what you're trying to accomplish.
You could use the ThreadPool.
Example:
ThreadPool.SetMaxThreads(3, 3);
for (int i = 0; i < filenames.Length; i++)
{
ThreadPool.QueueUserWorkItem(new WaitCallback(ProcessFile), filenames[i]);
}
static void ProcessFile(object fileNameObj)
{
var fileName = (string)fileNameObj;
// do your processing here.
}
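If you are using the ThreadPool elsewhere in your application, this would not be a good solution, since the pool is shared across your whole app. Note also that SetMaxThreads returns false and changes nothing if either value is smaller than the number of processors on the machine.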
You could also grab a different thread pool implementation, for example SmartThreadPool
Rather than starting a thread for each file name, put the file names into a queue and then start up three threads to process them. Or, since the main thread is now free, start up two threads and let the main thread work on it, too:
Queue<string> MyQueue;
void MyProc()
{
string[] filenames = Directory.GetFiles(...);
MyQueue = new Queue<string>(filenames);
// start two threads
Thread t1 = new Thread((ThreadStart)ProcessQueue);
Thread t2 = new Thread((ThreadStart)ProcessQueue);
t1.Start();
t2.Start();
// main thread processes the queue, too!
ProcessQueue();
// wait for threads to complete
t1.Join();
t2.Join();
}
private object queueLock = new object();
void ProcessQueue()
{
while (true)
{
string s;
lock (queueLock)
{
if (MyQueue.Count == 0)
{
// queue is empty
return;
}
s = MyQueue.Dequeue();
}
ProcessFile(s);
}
}
Another option is to use a semaphore to control how many threads are working:
Semaphore mySem = new Semaphore(3, 3);
void MyProc()
{
string[] filenames = Directory.GetFiles(...);
foreach (string s in filenames)
{
mySem.WaitOne();
ThreadPool.QueueUserWorkItem(ProcessFile, s);
}
// wait for all threads to finish
int count = 0;
while (count < 3)
{
mySem.WaitOne();
++count;
}
}
void ProcessFile(object state)
{
string fname = (string)state;
// do whatever
mySem.Release(); // release so another thread can start
}
The first will perform somewhat better because you don't have the overhead of starting and stopping a thread for each file name processed. The second is much shorter and cleaner, though, and takes full advantage of the thread pool. Likely you won't notice the performance difference.
You can set the maximum number of threads using ParallelOptions:
Parallel.For Method (Int32, Int32, ParallelOptions, Action)
ParallelOptions.MaxDegreeOfParallelism
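A minimal sketch of that suggestion, using Parallel.ForEach with ParallelOptions.MaxDegreeOfParallelism. ProcessFile here is a hypothetical stand-in for the slow per-file work, and the peak-concurrency bookkeeping exists only to show that the limit is respected; it is not needed in real code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ParallelSketch
{
    // Hypothetical stand-in for the slow per-file work in the question.
    static void ProcessFile(string fileName) { Thread.Sleep(20); }

    // Processes every item with at most maxThreads running at once and
    // returns the highest concurrency actually observed.
    public static int ProcessAll(string[] fileNames, int maxThreads)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };
        int running = 0, peak = 0;
        object peakLock = new object();

        Parallel.ForEach(fileNames, options, fileName =>
        {
            int now = Interlocked.Increment(ref running);
            lock (peakLock) { if (now > peak) peak = now; }
            try { ProcessFile(fileName); }
            finally { Interlocked.Decrement(ref running); }
        });   // returns only after every item has been processed

        return peak;
    }

    static void Main()
    {
        var names = new string[10];
        for (int i = 0; i < names.Length; i++) names[i] = "file" + i;
        Console.WriteLine(ProcessAll(names, 3));
    }
}
```

This covers both requirements from the question: the degree of parallelism is configurable, and the call does not return until all files are done, so no counters or polling loops are required.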
var results = filenames.ToArray().AsParallel().Select(filename=>ProcessFile(filename)).ToArray();
bool ProcessFile(object fileNameObj)
{
var fileName = (string)fileNameObj;
// do your processing here.
return true;
}
