Thread won't resume after multithreaded session - c#

I have a thread, call it the "Parsing thread".
Thread parsingThread = new Thread(myMethod);
I perform some computations on this thread, of which the last involves more parallel computations.
public void ReadCityFiles(BlockingCollection<GeonamesFileInfo> files)
{
    Parallel.ForEach<GeonamesFileInfo>(
        files.GetConsumingPartitioner<GeonamesFileInfo>(),
        new ParallelOptions { MaxDegreeOfParallelism = _maxParallelism },
        (inputFile, args) =>
        {
            RaiseFileParsing(inputFile);
            using (var input = new System.IO.StreamReader(inputFile.FullName))
            {
                while (!input.EndOfStream)
                {
                    RaiseEntryParsed(ParseCity(input.ReadLine()));
                    Interlocked.Increment(ref _parsedEntries);
                }
            }
            RaiseFileParsed(inputFile);
        });
    RaiseDirectoryParsed(Directory);
}
The problem is that when these very long and computationally expensive parallel foreach operations finish (~30 mins), the "Parsing thread" doesn't resume. My GUI is still responsive, but RaiseDirectoryParsed, which is supposed to run next on the "Parsing thread", is never called. I have debugged the program up to this point and am baffled as to what to do.

The point of BlockingCollection is that when an operation cannot be performed now, but might be in the future (e.g. Take() or Add() on a collection with bounded capacity), it will block. The same applies to GetConsumingEnumerable() and thus also to GetConsumingPartitioner(): if the collection is currently empty, the enumerable will block until you add more items to the collection.
But there is also a way to tell the collection that you're not going to add new items anymore and that it shouldn't block when empty from now on: the CompleteAdding() method. If you call this when you know you won't be adding any more new items to the collection, your Parallel.ForEach() won't block anymore and your thread will continue executing.
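A minimal sketch of that behaviour, assuming a plain BlockingCollection<string> and the built-in GetConsumingEnumerable() instead of the question's GetConsumingPartitioner() extension; the class name and the file names are made up for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class CompleteAddingDemo
{
    // Returns the number of items processed once the parallel loop has ended.
    public static int Run()
    {
        var files = new BlockingCollection<string>();
        int processed = 0;

        // Producer: adds a few items, then signals that no more will come.
        var producer = Task.Run(() =>
        {
            for (int i = 0; i < 5; i++)
                files.Add("file" + i);   // hypothetical file names
            files.CompleteAdding();      // without this call the loop below never returns
        });

        // Consumer: blocks while the collection is empty but not yet completed.
        Parallel.ForEach(files.GetConsumingEnumerable(), file =>
        {
            Interlocked.Increment(ref processed);
        });

        producer.Wait();
        return processed;   // reached only because CompleteAdding() was called
    }
}
```

If you comment out the CompleteAdding() call, Run() should hang after the five items are processed, which matches the symptom in the question.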

Related

How to invoke a consumer method as soon as BlockingCollection got populated?

Background:
From reading many sources, I understand that BlockingCollection<T> is designed to remove the need to check whether new data is available in a collection shared between threads. If new data is inserted into the shared collection, the consumer thread wakes up immediately, so you do not have to poll for new data at fixed intervals, typically in a while loop.
I have a similar requirement:
I have a blocking collection of size 1.
This collection will be populated from 3 places (3 producers).
I am currently using a while loop to check whether the collection contains anything.
I want to execute the ProcessInbox() method as soon as the blocking collection receives a value, and empty the collection, without polling for new data in a while loop. How can we achieve this?
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
namespace ConsoleApp1
{
    class Program
    {
        private static BlockingCollection<int> _processingNotificationQueue = new(1);
        private static void GetDataFromQueue(CancellationToken cancellationToken)
        {
            Console.WriteLine("GDFQ called");
            int data;
            //while (!cancellationToken.IsCancellationRequested)
            while (!_processingNotificationQueue.IsCompleted)
            {
                try
                {
                    if (_processingNotificationQueue.TryTake(out data))
                    {
                        Console.WriteLine("Take");
                        ProcessInbox();
                    }
                }
                catch (Exception ex)
                {
                }
            }
        }
        private static void ProcessInbox()
        {
            Console.WriteLine("PI called");
        }
        private static void PostDataToQueue(object state)
        {
            Console.WriteLine("PDTQ called");
            _processingNotificationQueue.TryAdd(1);
        }
        private void MessageInsertedToTabale()
        {
            PostDataToQueue(new CancellationToken());
        }
        private void FewMessagesareNotProcessed()
        {
            PostDataToQueue(new CancellationToken());
        }
        static void Main(string[] args)
        {
            Console.WriteLine("Start");
            new Timer(PostDataToQueue, new CancellationToken(), TimeSpan.Zero,
                TimeSpan.FromMilliseconds(100));
            // new Thread(()=> PostDataToQueue()).Start();
            new Thread(() => GetDataFromQueue(new CancellationToken())).Start();
            Console.WriteLine("End");
            Console.ReadKey();
        }
    }
}
Just foreach over it. It's blocking. As long as the collection is not marked as completed, your foreach will block while the collection is empty and will wake up as soon as new items are added.
See the first ConsumingEnumerableDemo in https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.blockingcollection-1?view=net-6.0 and imagine that the consumer foreach (var item in bc.GetConsumingEnumerable()) is running in another thread. The producer there has a delay between new items, so you should be able to tinker with it easily and see how the consumer wakes up "relatively immediately". There's just one producer, but I don't see a problem with multiple producers: Add is thread-safe.
I can't guarantee that there's no significant delay between adding a new item and waking up a sleeping consumer, because "significant" is a totally case-dependent word. There probably is some delay, at least for switching threads, but I doubt the collection does any additional throttling. I suppose it signals sleeping consumers to wake up before Add returns in the producer. For inbox processing, which is probably a UI thing for humans, a delay on the order of 100 ms won't be noticeable, and I would expect the Add/wake-up latency to be well below that. No guarantees though.
If the "blocking foreach" part sounds evil to you for some reason, you'll probably have to switch to a different synchronization mechanism (**). This is a BlockingCollection, right? It's not an evented collection or something. The TryXXX methods are there for cases where you want to limit exceptions for some reason and can deal with scheduling updates yourself, like you do here (*).
(*) Well, almost. The code you posted is missing two important things. First, your while loop busy-spins at full speed when the collection is empty. That's usually a deadly no-no, especially for anything that runs on batteries. Consider adding some dead time, along the lines of if (hasItems) doWork; else sleep(sometime);. The other thing is the try-catch. The docs say that when the collection throws, it means the collection is "done". NO MORE ITEMS EVER. There is no point in looping over a dead collection. The try-catch should not be inside the loop; it should encompass the loop, so that looping stops when the collection is finished.
(**) I personally like Rx (Reactive Extensions). Here it'd be a simple subject and one observer. Also, async/await and IAsyncEnumerable are tempting and can help avoid synchronizing by sleeping, but you can still end up with a busy-spinning loop if it's not done carefully. There are more choices, but the question was about BlockingCollection, so this is just FYI.
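Here is a rough sketch of the "just foreach over it" approach with three producers and a queue of capacity 1. The class name and the counter are invented for the example, and the producers use the blocking Add rather than the TryAdd from the question, so nothing is dropped when the queue is full:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class InboxDemo
{
    // Returns how many notifications the consumer handled.
    public static int Run()
    {
        // Bounded capacity of 1, as in the question.
        using var queue = new BlockingCollection<int>(1);
        int handled = 0;

        // Consumer: the foreach sleeps while the queue is empty and wakes on each Add.
        var consumer = Task.Run(() =>
        {
            foreach (int item in queue.GetConsumingEnumerable())
            {
                handled++;   // stand-in for ProcessInbox(); only this task touches 'handled'
            }
        });

        // Three producers, each posting two notifications.
        var producers = Enumerable.Range(0, 3)
            .Select(id => Task.Run(() =>
            {
                queue.Add(id);   // blocks while the queue is full
                queue.Add(id);
            }))
            .ToArray();

        Task.WaitAll(producers);
        queue.CompleteAdding();   // lets the consumer's foreach finish
        consumer.Wait();
        return handled;           // 6: every added item is consumed before the loop ends
    }
}
```

There is no polling loop and no sleep anywhere in the consumer; the blocking enumerator does the waiting.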

Task.Run does not work like Thread.start

I've been developing an application in which I need to run some methods in parallel without blocking. First I used Task.Run, but in debug mode I see that the operation blocks and just waits for the result. I do not want this; I want all the methods called in the foreach loop to run asynchronously.
public async void f()
{
    foreach (var item in childrenANDparents)
    {
        await Task.Run(() => SendUpdatedSiteInfo(item.Host, site_fr));
        // the foreach loop does not continue until the task returns
    }
}
So I changed Task.Run to Thread.Start and it works great!
public async void f()
{
    foreach (var item in childrenANDparents)
    {
        Thread t = new Thread(() => SendUpdatedSiteInfo(item.Host, site_fr));
        t.Start();
        // the foreach loop continues regardless of the method; in debug mode it shows me
        // they are working in parallel
    }
}
Would you explain what the difference is, and why? I expected the same behavior from both versions, but they seem to differ.
thanks
I want all the methods called in the foreach loop to run asynchronously.
It seems that you're confusing async/sync calls with parallelization.
A quote from MSDN:
Data parallelism: A form of parallel processing where the same computation executes in parallel on different data. Data parallelism is supported in the Microsoft .NET Framework by the Parallel.For and Parallel.ForEach methods and by PLINQ. Compare to task parallelism.
Asynchronous operation: An operation that does not block the current thread of control when the operation starts.
Let's have a closer look at your code again:
foreach (var item in childrenANDparents)
{
    await Task.Run(() => SendUpdatedSiteInfo(item.Host, site_fr));
}
The await keyword causes the compiler to generate a state machine that handles the method's execution.
It's as if you said to the compiler: "Start this async operation without blocking any threads, and when it's completed, execute the rest of the code."
Because the await sits inside the loop, the next iteration does not start until the current task has finished, so the calls run one after another. While the task is running, the calling thread is released. When the task finishes, the rest of the method resumes on the captured context (for example the UI thread) if there was one, or otherwise on a thread-pool thread. With .ConfigureAwait(false) the context is not captured, so we don't care which thread runs the continuation.
When you create a separate Thread, you get parallelism by delegating some code to run on that thread, and the loop does not wait for it. Depending on the code itself, it may or may not be executed asynchronously.
It's as if you said to the compiler: "Take this piece of work, start a new thread, and do it there."
If you still want to use Tasks with parallelism you could create an array of tasks in a loop and then wait for all of them to finish execution:
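// Start every task first, without awaiting inside the loop...
var tasks = childrenANDparents
    .Select(item => Task.Run(() => SendUpdatedSiteInfo(item.Host, site_fr)))
    .ToArray();
// ...then wait for all of them together.
await Task.WhenAll(tasks);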
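To make the difference concrete, here is a small sketch that puts the two patterns side by side. Send is a hypothetical stand-in for SendUpdatedSiteInfo that just sleeps and returns a string; the class and method names are assumptions for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for SendUpdatedSiteInfo: simulates some blocking work.
    static string Send(string host)
    {
        Thread.Sleep(100);
        return "sent:" + host;
    }

    // Awaiting inside the loop: each call starts only after the previous one finished.
    public static async Task<List<string>> SendSequentiallyAsync(IEnumerable<string> hosts)
    {
        var results = new List<string>();
        foreach (var host in hosts)
            results.Add(await Task.Run(() => Send(host)));
        return results;
    }

    // Starting every task first, then awaiting them together: the calls can overlap.
    public static async Task<string[]> SendConcurrentlyAsync(IEnumerable<string> hosts)
    {
        Task<string>[] tasks = hosts.Select(h => Task.Run(() => Send(h))).ToArray();
        return await Task.WhenAll(tasks);   // results keep the order of 'tasks'
    }
}
```

Both methods return the same results; the second one simply does not wait for one call to finish before starting the next.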
P.S.
And yes, you may as well use the TPL (Task Parallel Library), specifically its parallel loops.
You could use a simple Parallel.ForEach or PLINQ:
Parallel.ForEach(childrenANDparents, (item) =>
{
    SendUpdatedSiteInfo(item.Host, site_fr);
});
To better understand async and await, it's best to start by reading the docs. It's a large topic, but it's worth your while:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/

How to detect AddingCompleted of a BlockingCollection without race condition and exception?

I'm using a BlockingCollection{T} that's filled from only one thread and consumed by only one thread. Producing and consuming items works fine. The problem is at the end of this operation. The task blocks (as expected) at GetConsumingEnumerable. After CompleteAdding is called, the task disposes the BlockingCollection and finishes without any exceptions. So far so good.
Now I have a thread that adds items to the BlockingCollection. This thread has to test IsAddingCompleted and then add the item. But there's a race condition between checking IsAddingCompleted and adding the item. There's a TryAdd method, but it also raises an exception if adding is already completed.
How can I add an item, or test whether adding is completed, without an additional lock? Why does TryAdd throw any exceptions at all? Returning false would be fine if adding is already completed.
The very simplified code looks like that:
private BlockingCollection<string> _items = new BlockingCollection<string>();
public void Start()
{
    Task.Factory.StartNew(
        () =>
        {
            foreach (var item in this._items.GetConsumingEnumerable())
            {
            }
            this._items.Dispose();
        });
    Thread.Sleep(50); // Wait for Task
    this._items.CompleteAdding(); // Complete adding
}
public void ConsumeItem(string item)
{
    if (!this._items.IsAddingCompleted)
    {
        this._items.Add(item);
    }
}
Yes, I know that this code doesn't make sense, because there's nearly no chance to add any item and the foreach loop does nothing. The consuming task doesn't matter for my problem.
The problem is shown in the ConsumeItem method. I could add an additional lock (Semaphore) around ConsumeItem and CompleteAdding+Dispose, but I'm trying to avoid the performance impact.
How can I add items without any exceptions? Losing items is fine if adding has been completed.
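For reference, one workaround that is often used in this situation, and that is not taken from this thread, is to skip the IsAddingCompleted check and treat the documented InvalidOperationException as the "adding is completed" signal. A minimal sketch, with a made-up helper name:

```csharp
using System;
using System.Collections.Concurrent;

public static class SafeAddDemo
{
    // Returns true if the item was added, false if adding had already been
    // completed (or the collection had already been disposed).
    public static bool TryAddSafe(BlockingCollection<string> items, string item)
    {
        try
        {
            return items.TryAdd(item);
        }
        catch (InvalidOperationException)
        {
            // Thrown when CompleteAdding() has already been called.
            // ObjectDisposedException derives from it, so Dispose() is covered too.
            return false;
        }
    }
}
```

This avoids the extra lock, at the cost of an exception on the rare path where an add loses the race against CompleteAdding.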

Stop Threads created by foreach

Well, I have a loop that creates a lot of threads:
foreach (DataGridViewRow dgvRow in dataGridView1.Rows)
{
    Class h = new Class(dgvRow.Cells["name"].Value.ToString());
    Thread trdmyClass = new Thread(h.SeeInfoAboutName);
    trdmyClass.IsBackground = true;
    trdmyClass.Start();
}
This works fine and creates the threads that I need, but I want to stop all these threads at once (using Thread.Abort()), for example when I click a button.
How can I do this?
I wouldn't use Thread.Abort. It can have some very nasty consequences. What you should do is keep track of the threads you create by putting them into a list. You can then use a ManualResetEvent. The threads should periodically check whether the event has been set and, if it has, clean up and exit. I use the WaitOne method with a millisecond timeout and check the return value, which lets the threads run in a loop. If true is returned, the signal is set and you can exit the loop or otherwise return from your thread. If you're using .NET 4, you can also use a CancellationToken.
http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent.aspx
http://msdn.microsoft.com/en-us/library/system.threading.cancellationtoken.aspx
Read more about the issues with Thread.Abort here: http://msdn.microsoft.com/en-us/library/ty8d3wta.aspx
EDIT: I use a ManualResetEvent because it's thread-safe and you can use it to synchronize the processing in the threads, for example if you're doing a producer/consumer pattern. A volatile boolean could be used as well. I recommend keeping the threads in a list in case you need to wait for them to complete, so you can Join each one. This may or may not be applicable to your problem though. It's usually a good idea, especially if you're exiting, to Join all your threads to allow them to finish any cleanup they may be doing.
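A rough sketch of the ManualResetEvent approach described above: the workers loop on WaitOne with a short timeout, and the caller sets the event and then joins every thread. The class name, the 10 ms timeout, and the exit counter are assumptions made for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class StopSignalDemo
{
    // Starts 'count' workers, signals them to stop, and returns how many exited cleanly.
    public static int Run(int count)
    {
        using var stop = new ManualResetEvent(false);
        var threads = new List<Thread>();
        int exited = 0;

        for (int i = 0; i < count; i++)
        {
            var t = new Thread(() =>
            {
                // WaitOne returns false on timeout (keep working), true once the event is set.
                while (!stop.WaitOne(10))
                {
                    // do one slice of work here
                }
                Interlocked.Increment(ref exited);   // cleanup would go here
            });
            t.IsBackground = true;
            t.Start();
            threads.Add(t);
        }

        stop.Set();                       // e.g. from a button click handler
        foreach (var t in threads)
            t.Join();                     // wait for every worker to finish its cleanup
        return exited;
    }
}
```

Because the event stays set until it is reset, every worker sees the signal, no matter when it next checks.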
You really shouldn't use Thread.Abort(), it can be very dangerous. Instead, you should provide some way to signal to the threads that they are canceled. Each thread would then periodically check whether it's canceled and end if it was.
One way to do this would be to use CancellationToken, which does exactly that. The framework methods that support cancellation work with this type too.
Your code could then look something like this:
// field to keep CancellationTokenSource:
CancellationTokenSource m_cts;
// in your method:
m_cts = new CancellationTokenSource();
foreach (DataGridViewRow dgvRow in dataGridView1.Rows)
{
    Class h = new Class(dgvRow.Cells["name"].Value.ToString());
    Thread trdmyClass = new Thread(() => h.SeeInfoAboutName(m_cts.Token));
    trdmyClass.IsBackground = true;
    trdmyClass.Start();
}
// somewhere else, where you want to cancel the threads:
m_cts.Cancel();
// the SeeInfoAboutName() method
public void SeeInfoAboutName(CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        // do some work
    }
}
Keep all the threads in a List, and then loop through the list and stop them.

How do I set a timeout for a method

How do I set a timeout for a busy method in C#?
Ok, here's the real answer.
...
void LongRunningMethod(object monitorSync)
{
    // do stuff
    lock (monitorSync) {
        Monitor.Pulse(monitorSync);
    }
}
void ImpatientMethod() {
    Action<object> longMethod = LongRunningMethod;
    object monitorSync = new object();
    bool timedOut;
    lock (monitorSync) {
        longMethod.BeginInvoke(monitorSync, null, null);
        timedOut = !Monitor.Wait(monitorSync, TimeSpan.FromSeconds(30)); // waiting 30 secs
    }
    if (timedOut) {
        // it timed out.
    }
}
...
This combines two of the most fun parts of using C#. First off, to call the method asynchronously, use a delegate which has the fancy-pants BeginInvoke magic.
Then, use a monitor to send a message from the LongRunningMethod back to the ImpatientMethod to let it know when it's done, or if it hasn't heard from it in a certain amount of time, just give up on it.
(p.s.- Just kidding about this being the real answer. I know there are 2^9303 ways to skin a cat. Especially in .Net)
You can not do that, unless you change the method.
There are two ways:
The method is built in such a way that it itself measures how long it has been running, and then returns prematurely if it exceeds some threshold.
The method is built in such a way that it monitors a variable/event that says "when this variable is set, please exit", and then you have another thread measure the time spent in the first method, and then set that variable when the time elapsed has exceeded some threshold.
The most obvious, but unfortunately wrong, answer you can get here is "Just run the method in a thread and use Thread.Abort when it has run for too long".
The only correct way is for the method to cooperate in such a way that it will do a clean exit when it has been running too long.
There's also a third way, where you execute the method on a separate thread, but after waiting for it to finish, and it takes too long to do that, you simply say "I am not going to wait for it to finish, but just discard it". In this case, the method will still run, and eventually finish, but that other thread that was waiting for it will simply give up.
Think of the third way as calling someone and asking them to search their house for that book you lent them, and after waiting on your end of the phone for 5 minutes you simply say "aw, chuck it" and hang up. Eventually that other person will find the book and get back to the phone, only to notice that you no longer care about the result.
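As a sketch of the cooperative approach (the second way above), the method below watches a CancellationToken, and the caller uses a CancellationTokenSource that cancels itself after the timeout. The class name, the slice count, and the sleep are invented stand-ins for real work:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CooperativeTimeoutDemo
{
    // Hypothetical long-running method that checks the token between work slices.
    static int LongRunningMethod(CancellationToken token)
    {
        int slices = 0;
        while (slices < 20)
        {
            token.ThrowIfCancellationRequested();   // clean exit point
            Thread.Sleep(10);                       // stand-in for one slice of work
            slices++;
        }
        return slices;
    }

    // Returns true if the method finished within the timeout, false if it was cancelled.
    public static bool RunWithTimeout(TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);   // cancels itself after 'timeout'
        try
        {
            LongRunningMethod(cts.Token);
            return true;
        }
        catch (OperationCanceledException)
        {
            return false;
        }
    }
}
```

The method only stops at the points where it checks the token, which is exactly what makes the exit clean.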
This is an old question but it has a simpler solution now that was not available then: Tasks!
Here is some sample code:
var task = Task.Run(() => LongRunningMethod()); // you can pass parameters to the method as well
if (task.Wait(TimeSpan.FromSeconds(30)))
    return task.Result; // the method returned in time
else
    throw new TimeoutException(); // the method timed out
While MojoFilter's answer is nice it can lead to leaks if the "LongMethod" freezes. You should ABORT the operation if you're not interested in the result anymore.
public void LongMethod()
{
    // do stuff
}
public void ImpatientMethod()
{
    Action longMethod = LongMethod; // use Func if you need a return value
    ManualResetEvent mre = new ManualResetEvent(false);
    Thread actionThread = new Thread(new ThreadStart(() =>
    {
        var iar = longMethod.BeginInvoke(null, null);
        longMethod.EndInvoke(iar); // always call EndInvoke
        mre.Set();
    }));
    actionThread.Start();
    mre.WaitOne(30000); // waiting 30 secs (or less)
    if (actionThread.IsAlive) actionThread.Abort();
}
You can run the method in a separate thread, monitor it, and force it to exit if it runs for too long. A good way, if you can call it that, would be to develop an attribute for the method in PostSharp so the watching code doesn't litter your application.
I've written the following as sample code (note the "sample code" part: it works, but it could suffer from multithreading issues, and it would break if the method in question catches the ThreadAbortException):
static void ActualMethodWrapper(Action method, Action callBackMethod)
{
    try
    {
        method.Invoke();
    }
    catch (ThreadAbortException)
    {
        Console.WriteLine("Method aborted early");
    }
    finally
    {
        callBackMethod.Invoke();
    }
}
static void CallTimedOutMethod(Action method, Action callBackMethod, int milliseconds)
{
    new Thread(new ThreadStart(() =>
    {
        Thread actionThread = new Thread(new ThreadStart(() =>
        {
            ActualMethodWrapper(method, callBackMethod);
        }));
        actionThread.Start();
        Thread.Sleep(milliseconds);
        if (actionThread.IsAlive) actionThread.Abort();
    })).Start();
}
With the following invocation:
CallTimedOutMethod(() =>
{
    Console.WriteLine("In method");
    Thread.Sleep(2000);
    Console.WriteLine("Method done");
}, () =>
{
    Console.WriteLine("In CallBackMethod");
}, 1000);
I need to work on my code readability.
Methods don't have timeouts in C#, unless you're in the debugger or the OS believes your app has 'hung'. Even then, processing continues, and as long as you don't kill the application, a response is returned and the app continues to work.
Calls to databases can have timeouts.
Could you create an Asynchronous Method so that you can continue doing other stuff whilst the "busy" method completes?
I regularly write apps where I have to synchronize time-critical tasks across platforms. If you can avoid Thread.Abort, you should. See http://blogs.msdn.com/b/ericlippert/archive/2010/02/22/should-i-specify-a-timeout.aspx and http://www.interact-sw.co.uk/iangblog/2004/11/12/cancellation for guidelines on when Thread.Abort is appropriate. Here are the concepts I implement:
Selective execution: Only run if a reasonable chance of success exists (based on ability to meet timeout or likelihood of success result relative to other queued items). If you break code into segments and know roughly the expected time between task chunks, you can predict if you should skip any further processing. Total time can be measured by wrapping an object bin tasks with a recursive function for time calculation or by having a controller class that watches workers to know expected wait times.
Selective orphaning: Only wait for return if reasonable chance of success exists. Indexed tasks are run in a managed queue. Tasks that exceed their timeout or risk causing other timeouts are orphaned and a null record is returned in their stead. Longer running tasks can be wrapped in async calls. See example async call wrapper: http://www.vbusers.com/codecsharp/codeget.asp?ThreadID=67&PostID=1
Conditional selection: Similar to selective execution, but based on a group instead of an individual task. If many of your tasks are interconnected such that one success or failure renders additional processing irrelevant, create a flag that is checked before execution begins and again before long-running sub-tasks begin. This is especially useful when you are using Parallel.For or other such queued concurrency tasks.
