I have a Windows service (written in C#) that uses the Task Parallel Library to run some tasks in parallel (5 tasks at a time).
After the tasks have executed once, I would like to repeat the same tasks on an ongoing basis (hourly) by calling the QueuePeek method again.
Should I use a timer, or a counter like the one in the code snippet below?
I am using a counter to set up the tasks; once it reaches five I exit the loop. I also use .ContinueWith to decrement the counter, so my thought was that the counter would drop below 5 and the loop would continue. But my ContinueWith seems to be executing on the main thread, and the loop then exits.
The call to DecrementTaskCounter through ContinueWith does not seem to work.
FYI : The Importer class is to load some libraries using MEF and do the work
This is my code sample:
private void QueuePeek()
{
var list = SetUpJobs();
while (taskCounter < 5)
{
int j = taskCounter;
Task task = null;
task = new Task(() =>
{
DoLoad(j);
});
taskCounter += 1;
tasks[j] = task;
task.ContinueWith((t) => DecrementTaskCounter());
task.Start();
ds.SetJobStatus(1);
}
if (taskCounter == 0)
Console.WriteLine("Completed all tasks.");
}
private void DoLoad(int i)
{
ILoader loader;
DataService.DataService ds = new DataService.DataService();
Dictionary<int, dynamic> results = ds.AssignRequest(i);
var data = results.Where(x => x.Key == 2).First();
int loaderId = (int)data.Value;
Importer imp = new Importer();
loader = imp.Run(GetLoaderType(loaderId));
LoaderProcessor lp = new LoaderProcessor(loader);
lp.ExecuteLoader();
}
private void DecrementTaskCounter()
{
Console.WriteLine(string.Format("Decrementing task counter with threadId: {0}",Thread.CurrentThread.ManagedThreadId) );
taskCounter--;
}
I see a few issues with your code that could lead to some hard-to-track-down bugs. First, if all of the tasks can read and write a counter at the same time, use Interlocked. For example:
Interlocked.Increment(ref _taskCounter); // or Interlocked.Decrement(ref _taskCounter);
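To illustrate why the atomic call matters, here is a minimal sketch in which several tasks update one shared counter. The class and method names (CounterDemo, CountWithInterlocked) are made up for this example and are not part of the original code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    // Runs `workers` tasks that each increment a shared counter `perWorker` times.
    public static int CountWithInterlocked(int workers, int perWorker)
    {
        int counter = 0;
        var tasks = new Task[workers];
        for (int w = 0; w < workers; w++)
        {
            tasks[w] = Task.Run(() =>
            {
                for (int i = 0; i < perWorker; i++)
                {
                    // Atomic read-modify-write; a plain counter++ could lose updates here.
                    Interlocked.Increment(ref counter);
                }
            });
        }
        Task.WaitAll(tasks);
        return counter;
    }
}
```

Calling CounterDemo.CountWithInterlocked(4, 100000) should always return 400000, whereas the same loop with counter++ can return less because concurrent increments overwrite each other.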
If I understand what you're trying to accomplish, I think what you want to do is to use a timer that you re-schedule after each group of tasks is finished.
public class Worker
{
private System.Threading.Timer _timer;
private int _timeUntilNextCall = 3600000;
public void Start()
{
_timer = new Timer(new TimerCallback(QueuePeek), null, 0, Timeout.Infinite);
}
private void QueuePeek(object state)
{
int numberOfTasks = 5;
Task[] tasks = new Task[numberOfTasks];
for(int i = 0; i < numberOfTasks; i++)
{
tasks[i] = new Task(() =>
{
DoLoad();
});
tasks[i].Start();
}
// When all tasks are complete, set to run this method again in x milliseconds
Task.Factory.ContinueWhenAll(tasks, (t) => { _timer.Change(_timeUntilNextCall, Timeout.Infinite); });
}
private void DoLoad() { }
}
We are running an ASP.NET 6 web application and are having strange issues with deadlocks.
The app suddenly freezes after some weeks of operations and it seems that it might be caused by our locking mechanism with the SemaphoreSlim class.
I tried to reproduce the issue with a simple test-project and found something strange.
The following code is simply starting 1000 tasks where each is doing some work (requesting semaphore-handle, waiting for 10 ms and releasing the semaphore).
I expected this code to simply execute one task after another. But it freezes because of a deadlock in the first call of the DoWork method (at await Task.Delay(10)).
Does anyone know why this causes a deadlock? I tried exactly the same code with ThreadPool.QueueUserWorkItem instead of Task.Run and Thread.Sleep instead of Task.Delay and this worked as expected. But as soon as I use the tasks it stops working.
Here is the complete code-snippet:
internal class Program
{
static int timeoutSec = 60;
static SemaphoreSlim semaphore = new SemaphoreSlim(1);
static int numPerIteration = 1000;
static int iteration = 0;
static int doneCounter = numPerIteration;
static int successCount = 0;
static int failedCount = 0;
static Stopwatch sw = new Stopwatch();
static Random rnd = new Random();
static void Main(string[] args)
{
Task.WaitAll(TestUsingTasks());
}
static async Task TestUsingTasks()
{
while (true)
{
var tasks = new List<Task>();
if (doneCounter >= numPerIteration)
{
doneCounter = 0;
if (iteration >= 1)
{
Log($"+++++ FINISHED TASK ITERATION {iteration} - SUCCESS: {successCount} - FAILURES: {failedCount} - Seconds: {sw.Elapsed.TotalSeconds:F1}", ConsoleColor.Magenta);
}
iteration++;
sw.Restart();
for (int i = 0; i < numPerIteration; i++)
{
// Start independent tasks to do some work
Task.Run(async () =>
{
if (await DoWork())
{
successCount++;
}
else
{
failedCount++;
}
doneCounter++;
});
}
}
await Task.Delay(10);
}
}
static async Task<bool> DoWork()
{
if (semaphore.Wait(timeoutSec * 1000)) // Request the semaphore to ensure that only 1 task at a time can enter
{
Log($"Got handle for {iteration} within {sw.Elapsed.TotalSeconds:F1}", ConsoleColor.Green);
var totalSec = sw.Elapsed.TotalSeconds;
await Task.Delay(10); // Wait for 10ms to simulate some work => Deadlock seems to happen here
Log($"RELEASING LOCK handle for {iteration} within {sw.Elapsed.TotalSeconds:F1}. WAIT took " + (sw.Elapsed.TotalSeconds - totalSec) + " seconds", ConsoleColor.Gray);
semaphore.Release();
return true;
}
else
{
Log($"ERROR: TASK handle failed for {iteration} within {sw.Elapsed.TotalSeconds:F1} sec", ConsoleColor.Red);
return false;
}
}
static void Log(string message, ConsoleColor color)
{
Console.ForegroundColor = color;
Console.WriteLine(message);
Console.ForegroundColor = ConsoleColor.White;
}
}
Thanks in advance!
But it freezes because of a deadlock in the first call of the DoWork method (at await Task.Delay(10)).
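I would argue that it is not a deadlock but a thread starvation issue. The blocking semaphore.Wait call ties up thread pool threads, so the continuation after await Task.Delay(10) has to wait for a free thread before it can release the semaphore. If you wait long enough, you will see that some tasks do manage to finish the simulated wait from time to time.
The quick fix here is to use the non-blocking WaitAsync call with await: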
static async Task<bool> DoWork()
{
if (await semaphore.WaitAsync(timeoutSec * 1000))
{
...
}
}
Also note:
It is recommended to wrap the code after the Wait/WaitAsync call in a try-finally block and release the semaphore in the finally block, so that it is released even if the work throws.
Counters that are incremented from multiple threads should be updated atomically, for example with Interlocked.Increment.
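Putting the notes above together, a minimal sketch could look like the following. The names SemaphoreDemo, Gate, DoWorkAsync and RunAsync are assumptions made for this example; only SemaphoreSlim.WaitAsync, Release and Interlocked.Increment come from the discussion above.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreDemo
{
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);
    private static int _successCount;

    public static int SuccessCount => Volatile.Read(ref _successCount);

    public static async Task<bool> DoWorkAsync(TimeSpan timeout)
    {
        // WaitAsync yields the thread while waiting instead of blocking it.
        if (!await Gate.WaitAsync(timeout))
            return false;
        try
        {
            await Task.Delay(10); // simulated work
            Interlocked.Increment(ref _successCount); // atomic counter update
            return true;
        }
        finally
        {
            Gate.Release(); // runs even if the work throws
        }
    }

    // Starts `taskCount` independent workers and returns how many succeeded.
    public static async Task<int> RunAsync(int taskCount)
    {
        var results = await Task.WhenAll(
            Enumerable.Range(0, taskCount)
                      .Select(_ => Task.Run(() => DoWorkAsync(TimeSpan.FromSeconds(60)))));
        return results.Count(ok => ok);
    }
}
```

Because no thread pool thread is blocked while waiting for the semaphore, the continuation after Task.Delay can always be scheduled, and the workers pass through the gate one at a time.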
I have set StaTaskScheduler threads to 1 and I expected that I would get one Debug output every 5 seconds, but I end up with 10 with the same date
private void Test() {
for (int i = 0; i < 10; i++)
Task.Factory.StartNew(() =>
{
Task.Delay(5000); //temp for long operation
Debug.WriteLine(DateTime.Now);
}, CancellationToken.None, TaskCreationOptions.None, MainWindow.MyStaThread);
}
public static StaTaskScheduler MyStaThread =
new StaTaskScheduler(numberOfThreads: 1);
What am I missing? The reason for STA is that later it will be used for Icons extraction needing STA, but this test is to check it is done in sequence.
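The timestamps are identical because Task.Delay(5000) only creates a delay task; nothing awaits or waits on it, so the lambda continues straight to Debug.WriteLine. Note also that QueueTask is a protected member of TaskScheduler and cannot be called from outside the scheduler. To run a task on a specific scheduler, pass the scheduler to Task.Start (or to Task.Factory.StartNew, as in the question) and block inside the task:
private void Test() {
for (int i = 0; i < 10; i++)
new Task(() =>
{
Thread.Sleep(5000); //temp for long operation; blocks the single STA thread
Debug.WriteLine(DateTime.Now);
}).Start(MyStaThread);
}
public static StaTaskScheduler MyStaThread =
new StaTaskScheduler(numberOfThreads: 1);
With a blocking wait, the single STA thread runs the ten tasks one after another, so the output appears roughly every 5 seconds.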
I have this function that checks proxy servers. Currently it starts a fixed number of threads and waits for all of them to finish before starting the next set. Is it possible to start a new thread as soon as one finishes, up to the maximum allowed?
for (int i = 0; i < listProxies.Count(); i+=nThreadsNum)
{
for (nCurrentThread = 0; nCurrentThread < nThreadsNum; nCurrentThread++)
{
if (nCurrentThread < nThreadsNum)
{
string strProxyIP = listProxies[i + nCurrentThread].sIPAddress;
int nPort = listProxies[i + nCurrentThread].nPort;
tasks.Add(Task.Factory.StartNew<ProxyAddress>(() => CheckProxyServer(strProxyIP, nPort, nCurrentThread)));
}
}
Task.WaitAll(tasks.ToArray());
foreach (var tsk in tasks)
{
ProxyAddress result = tsk.Result;
UpdateProxyDBRecord(result.sIPAddress, result.bOnlineStatus);
}
tasks.Clear();
}
This seems much simpler:
int numberProcessed = 0;
Parallel.ForEach(listProxies,
new ParallelOptions { MaxDegreeOfParallelism = nThreadsNum },
(p)=> {
var result = CheckProxyServer(p.sIPAddress, p.nPort, Thread.CurrentThread.ManagedThreadId);
UpdateProxyDBRecord(result.sIPAddress, result.bOnlineStatus);
Interlocked.Increment(ref numberProcessed);
});
With slots:
var obj = new Object();
var slots = new List<int>();
Parallel.ForEach(listProxies,
new ParallelOptions { MaxDegreeOfParallelism = nThreadsNum },
(p)=> {
int threadId = Thread.CurrentThread.ManagedThreadId;
int slot = slots.IndexOf(threadId);
if (slot == -1)
{
lock(obj)
{
slots.Add(threadId);
}
slot = slots.IndexOf(threadId);
}
var result = CheckProxyServer(p.sIPAddress, p.nPort, slot);
UpdateProxyDBRecord(result.sIPAddress, result.bOnlineStatus);
});
I took a few shortcuts there to guarantee thread safety. You don't have to do the normal check-lock-check dance because there will never be two threads attempting to add the same threadid to the list, so the second check will always fail and isn't needed. Secondly, for the same reason, I don't believe you need to ever lock around the outer IndexOf either. That makes this a very highly efficient concurrent routine that rarely locks (it should only lock nThreadsNum times) no matter how many items are in the enumerable.
Another solution is to use a SemaphoreSlim or the Producer-Consumer Pattern using a BlockingCollection<T>. Both solutions support cancellation.
SemaphoreSlim
private async Task CheckProxyServerAsync(IEnumerable<ProxyInfo> proxies)
{
var tasks = new List<Task>();
int currentThreadNumber = 0;
int maxNumberOfThreads = 8;
using (var semaphore = new SemaphoreSlim(maxNumberOfThreads, maxNumberOfThreads))
{
foreach (var proxy in proxies)
{
// Asynchronously wait until thread is available if thread limit reached
await semaphore.WaitAsync();
string proxyIP = proxy.IPAddress;
int port = proxy.Port;
tasks.Add(Task.Run(() => CheckProxyServer(proxyIP, port, Interlocked.Increment(ref currentThreadNumber)))
.ContinueWith(
(task) =>
{
ProxyAddress result = task.Result;
// Method call must be thread-safe!
UpdateProxyDbRecord(result.IPAddress, result.OnlineStatus);
Interlocked.Decrement(ref currentThreadNumber);
// Allow to start next thread if thread limit was reached
semaphore.Release();
},
TaskContinuationOptions.OnlyOnRanToCompletion));
}
// Asynchronously wait until all tasks are completed
// to prevent premature disposal of semaphore
await Task.WhenAll(tasks);
}
}
Producer-Consumer Pattern
// Uses a fixed number of same threads
private async Task CheckProxyServerAsync(IEnumerable<ProxyInfo> proxies)
{
var pipe = new BlockingCollection<ProxyInfo>();
int maxNumberOfThreads = 8;
var tasks = new List<Task>();
// Create all threads (count == maxNumberOfThreads)
for (int currentThreadNumber = 0; currentThreadNumber < maxNumberOfThreads; currentThreadNumber++)
{
// Copy the loop variable so each lambda captures its own value
int threadNumber = currentThreadNumber;
tasks.Add(
Task.Run(() => ConsumeProxyInfo(pipe, threadNumber)));
}
proxies.ToList().ForEach(pipe.Add);
pipe.CompleteAdding();
await Task.WhenAll(tasks);
}
private void ConsumeProxyInfo(BlockingCollection<ProxyInfo> proxiesPipe, int currentThreadNumber)
{
while (!proxiesPipe.IsCompleted)
{
if (proxiesPipe.TryTake(out ProxyInfo proxy))
{
int port = proxy.Port;
string proxyIP = proxy.IPAddress;
ProxyAddress result = CheckProxyServer(proxyIP, port, currentThreadNumber);
// Method call must be thread-safe!
UpdateProxyDbRecord(result.IPAddress, result.OnlineStatus);
}
}
}
If I'm understanding your question properly, this is actually fairly simple to do with await Task.WhenAny. Basically, you keep a collection of all of the running tasks. Once you reach a certain number of tasks running, you wait for one or more of your tasks to finish, and then you remove the tasks that were completed from your collection and continue to add more tasks.
Here's an example of what I mean below:
var tasks = new List<Task>();
for (int i = 0; i < 20; i++)
{
// I want my list of tasks to contain at most 5 tasks at once
if (tasks.Count == 5)
{
// Wait for at least one of the tasks to complete
await Task.WhenAny(tasks.ToArray());
// Remove all of the completed tasks from the list
tasks = tasks.Where(t => !t.IsCompleted).ToList();
}
// Add some task to the list
// Task.Run unwraps the async lambda, so the task completes only when the delay has finished
tasks.Add(Task.Run(async () =>
{
await Task.Delay(1000);
}));
}
I suggest changing your approach slightly. Instead of starting and stopping threads, put your proxy server data in a concurrent queue, one item for each proxy server. Then create a fixed number of threads (or async tasks) to work on the queue. This is more likely to provide smooth performance (you aren't starting and stopping threads over and over, which has overhead) and is a lot easier to code, in my opinion.
A simple example:
class ProxyChecker
{
private ConcurrentQueue<ProxyInfo> _masterQueue = new ConcurrentQueue<ProxyInfo>();
public ProxyChecker(IEnumerable<ProxyInfo> listProxies)
{
foreach (var proxy in listProxies)
{
_masterQueue.Enqueue(proxy);
}
}
public async Task RunChecks(int maximumConcurrency)
{
// Never start more workers than there are items to process
var count = Math.Min(maximumConcurrency, _masterQueue.Count);
// Task.Run moves each worker to the thread pool so the workers run concurrently
var tasks = Enumerable.Range(0, count).Select( i => Task.Run(() => WorkerTask()) ).ToList();
await Task.WhenAll(tasks);
}
private void WorkerTask()
{
ProxyInfo proxyInfo;
while ( _masterQueue.TryDequeue(out proxyInfo))
{
DoTheTest(proxyInfo.IP, proxyInfo.Port);
}
}
}
I have some code that runs thousands of URLs through a third party library. Occasionally the method in the library hangs which takes up a thread. After a while all threads are taken up by processes doing nothing and it grinds to a halt.
I am using a SemaphoreSlim to control adding new threads so I can have an optimal number of tasks running. I need a way to identify tasks that have been running too long and then to kill them but also release a thread from the SemaphoreSlim so a new task can be created.
I am struggling with the approach here, so I made some test code that imitates what I am doing. It creates tasks that have a 10% chance of hanging, so very quickly all threads have hung.
How should I be checking for these and killing them off?
Here is the code:
class Program
{
public static SemaphoreSlim semaphore;
public static List<Task> taskList;
static void Main(string[] args)
{
List<string> urlList = new List<string>();
Console.WriteLine("Generating list");
for (int i = 0; i < 1000; i++)
{
//adding random strings to simulate a large list of URLs to process
urlList.Add(Path.GetRandomFileName());
}
Console.WriteLine("Queueing tasks");
semaphore = new SemaphoreSlim(10, 10);
Task.Run(() => QueueTasks(urlList));
Console.ReadLine();
}
static void QueueTasks(List<string> urlList)
{
taskList = new List<Task>();
foreach (var url in urlList)
{
Console.WriteLine("{0} tasks can enter the semaphore.",
semaphore.CurrentCount);
semaphore.Wait();
taskList.Add(DoTheThing(url));
}
}
static async Task DoTheThing(string url)
{
Random rand = new Random();
// simulate the IO process
await Task.Delay(rand.Next(2000, 10000));
// add a 10% chance that the thread will hang simulating what happens occasionally with http request
int chance = rand.Next(1, 100);
if (chance <= 10)
{
while (true)
{
await Task.Delay(1000000);
}
}
semaphore.Release();
Console.WriteLine(url);
}
}
As people have already pointed out, aborting threads is generally a bad idea, and there is no guaranteed way of doing it in C#. (On .NET Core and .NET 5 or later, Thread.Abort is not supported at all and throws PlatformNotSupportedException, so the code below applies to .NET Framework only.) Using a separate process to do the work and then killing it is a slightly better idea than attempting Thread.Abort, but it is still not the best way to go. Ideally, you want cooperative threads/processes that use IPC to decide when to bail out themselves. This way the cleanup is done properly.
With all that said, you can use code like the one below to do what you intend to do. I have written it assuming your task will be done in a thread. With slight changes, you can use the same logic to do your task in a process.
The code is by no means bullet-proof and is meant to be illustrative. The concurrent code is not really well tested. Locks are held for longer than needed, and in some places I am not locking at all (like the Log function).
class TaskInfo {
public Thread Task;
public DateTime StartTime;
public TaskInfo(ParameterizedThreadStart startInfo, object startArg) {
Task = new Thread(startInfo);
Task.Start(startArg);
StartTime = DateTime.Now;
}
}
class Program {
const int MAX_THREADS = 1;
const int TASK_TIMEOUT = 6; // in seconds
const int CLEANUP_INTERVAL = TASK_TIMEOUT; // in seconds
public static SemaphoreSlim semaphore;
public static List<TaskInfo> TaskList;
public static object TaskListLock = new object();
public static Timer CleanupTimer;
static void Main(string[] args) {
List<string> urlList = new List<string>();
Log("Generating list");
for (int i = 0; i < 2; i++) {
//adding random strings to simulate a large list of URLs to process
urlList.Add(Path.GetRandomFileName());
}
Log("Queueing tasks");
semaphore = new SemaphoreSlim(MAX_THREADS, MAX_THREADS);
Task.Run(() => QueueTasks(urlList));
CleanupTimer = new Timer(CleanupTasks, null, CLEANUP_INTERVAL * 1000, CLEANUP_INTERVAL * 1000);
Console.ReadLine();
}
// TODO: Guard against re-entrancy
static void CleanupTasks(object state) {
Log("CleanupTasks started");
lock (TaskListLock) {
var now = DateTime.Now;
int n = TaskList.Count;
for (int i = n - 1; i >= 0; --i) {
var task = TaskList[i];
Log($"Checking task with ID {task.Task.ManagedThreadId}");
// kill processes running for longer than anticipated
if (task.Task.IsAlive && now.Subtract(task.StartTime).TotalSeconds >= TASK_TIMEOUT) {
Log("Cleaning up hung task");
task.Task.Abort();
}
// remove task if it is not alive
if (!task.Task.IsAlive) {
Log("Removing dead task from list");
TaskList.RemoveAt(i);
continue;
}
}
if (TaskList.Count == 0) {
Log("Disposing cleanup thread");
CleanupTimer.Dispose();
}
}
Log("CleanupTasks done");
}
static void QueueTasks(List<string> urlList) {
TaskList = new List<TaskInfo>();
foreach (var url in urlList) {
Log($"Trying to schedule url = {url}");
semaphore.Wait();
Log("Semaphore acquired");
ParameterizedThreadStart taskRoutine = obj => {
try {
DoTheThing((string)obj);
} finally {
Log("Releasing semaphore");
semaphore.Release();
}
};
var task = new TaskInfo(taskRoutine, url);
lock (TaskListLock)
TaskList.Add(task);
}
Log("All tasks queued");
}
// simulate all processes get hung
static void DoTheThing(string url) {
while (true)
Thread.Sleep(5000);
}
static void Log(string msg) {
Console.WriteLine("{0:HH:mm:ss.fff} Thread {1,2} {2}", DateTime.Now, Thread.CurrentThread.ManagedThreadId.ToString(), msg);
}
}
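The cooperative approach mentioned above can be sketched with a CancellationTokenSource that cancels itself after a timeout. This is only a sketch under a strong assumption: the work must observe the token. FetchAsync is a hypothetical stand-in for the library call, and a third-party method that ignores cancellation cannot be stopped this way. TimeoutDemo and RunWithTimeoutAsync are also names invented for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Hypothetical stand-in for the library call. It observes the token,
    // which real third-party code may not do.
    public static async Task<string> FetchAsync(string url, bool hang, CancellationToken token)
    {
        // Timeout.Infinite simulates a call that never returns on its own.
        await Task.Delay(hang ? Timeout.Infinite : 50, token);
        return url;
    }

    // Returns the result, or null when the call timed out.
    // The semaphore slot is released in both cases.
    public static async Task<string> RunWithTimeoutAsync(
        SemaphoreSlim gate, string url, bool hang, TimeSpan timeout)
    {
        await gate.WaitAsync();
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                return await FetchAsync(url, hang, cts.Token);
            }
            catch (OperationCanceledException)
            {
                return null; // timed out: give up on this URL
            }
            finally
            {
                gate.Release(); // free the slot so the next URL can start
            }
        }
    }
}
```

The design choice here is that the timeout and the semaphore release live in one place, so a hung item can never hold a slot for longer than the timeout.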
I need to start tasks in parallel, but I chose to use Task.Run instead of Parallel.ForEach, so I can get some feedback when all tasks have finished and enable UI controls.
private async void buttonStart_Click(object sender, EventArgs e)
{
var cells = objectListView.CheckedObjects;
if(cells != null)
{
List<Task> tasks = new List<Task>();
foreach (Cell c in cells)
{
Cell cell = c;
var progressHandler = new Progress<string>(value =>
{
cell.Status = value;
});
var progress = progressHandler as IProgress<string>;
Task t = Task.Run(() =>
{
progress.Report("Starting...");
int a = 123;
for (int i = 0; i < 200000; i++)
{
a = a + i;
Task.Delay(500).Wait();
}
progress.Report("Done");
});
tasks.Add(t);
}
await Task.WhenAll(tasks);
Console.WriteLine("Done, enabled UI controls");
}
}
So what I expect is to see "Starting..." in the UI almost instantly for all items. What I actually see is that the first 4 items show "Starting..." (I guess because all 4 CPU cores are in use, one per thread), and then a new item shows "Starting..." every second or less. I have 37 items in total, and it takes around 30 seconds for all the tasks to start.
How can I make it as parallel as possible?
How can I make it as parallel as possible?
The inner for loop simulates a long-running CPU-bound job; I would like to start as many of these as possible at the same time.
It's already as parallel as possible. Starting 37 threads that all have CPU-bound work to do will not make it go any faster, since you're apparently running it on a 4-core machine. There are 4 cores, so only 4 threads can actually run at a time. The other 33 threads are going to be waiting while 4 are running. They would only appear to run simultaneously.
That said, if you really want to start up all those thread pool threads, you can do this by calling ThreadPool.SetMinThreads.
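As a rough sketch of that call: ThreadPool.GetMinThreads and ThreadPool.SetMinThreads are the real APIs, while the PoolConfig wrapper and its method name are made up for this example.

```csharp
using System;
using System.Threading;

public static class PoolConfig
{
    // Raises the worker-thread minimum so the pool creates up to `workers`
    // threads on demand instead of adding them slowly one by one.
    public static bool EnsureMinWorkerThreads(int workers)
    {
        ThreadPool.GetMinThreads(out int currentWorkers, out int currentIo);
        if (currentWorkers >= workers)
            return true; // already high enough
        // Keep the I/O completion thread minimum unchanged.
        return ThreadPool.SetMinThreads(workers, currentIo);
    }
}
```

Calling PoolConfig.EnsureMinWorkerThreads(37) before starting the tasks should let all 37 items report "Starting..." quickly, although on a 4-core machine the CPU-bound work itself will not finish any sooner.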
I need to start tasks in parallel, but I chose to use Task.Run instead of Parallel.ForEach, so I can get some feedback when all tasks have finished and enable UI controls.
Since you have parallel work to do, you should use Parallel. If you want the nice resume-on-the-UI-thread behavior of await, then you can use a single await Task.Run, something like this:
private async void buttonStart_Click(object sender, EventArgs e)
{
var cells = objectListView.CheckedObjects;
if (cells == null)
return;
// Cast is needed if CheckedObjects is a non-generic IList (assumed here)
var workItems = cells.Cast<Cell>().Select(c => new
{
Cell = c,
Progress = new Progress<string>(value => { c.Status = value; }),
}).ToList();
await Task.Run(() => Parallel.ForEach(workItems, item =>
{
var progress = item.Progress as IProgress<string>;
progress.Report("Starting...");
int a = 123;
for (int i = 0; i < 200000; i++)
{
a = a + i;
Thread.Sleep(500);
}
progress.Report("Done");
}));
Console.WriteLine("Done, enabled UI controls");
}
I'd say, it is as parallel as possible. If you have 4 cores, you can run 4 threads in parallel.
If you can do stuff while waiting for the "delay", have a look into asynchronous programming (where one thread can run multiple tasks "at once", because most of them are waiting for something).
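A minimal sketch of that idea follows, assuming the work really is waiting (simulated here with Task.Delay) rather than CPU-bound. AsyncStartDemo, WorkAsync and StartAll are hypothetical names for this example.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncStartDemo
{
    private static int _started;
    public static int Started => Volatile.Read(ref _started);

    private static async Task WorkAsync(int id, IProgress<string> progress)
    {
        Interlocked.Increment(ref _started);
        progress.Report($"Starting {id}");
        await Task.Delay(200); // waiting does not occupy a thread
        progress.Report($"Done {id}");
    }

    public static Task StartAll(int count, IProgress<string> progress)
    {
        // Each call runs synchronously up to its first await, so every item
        // has reported "Starting" by the time StartAll returns.
        var tasks = Enumerable.Range(0, count).Select(i => WorkAsync(i, progress)).ToList();
        return Task.WhenAll(tasks);
    }
}
```

In a click handler this would be used as await AsyncStartDemo.StartAll(37, progress); all 37 items start immediately because no thread is held while an item waits.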
EDIT: you can also run Parallel.ForEach in its own task and await that:
private async void buttonStart_Click(object sender, EventArgs e)
{
var cells = objectListView.CheckedObjects;
if(cells != null)
{
await Task.Run( () => Parallel.ForEach( cells, c => ... ) );
}
}
I think it depends on your task creation options, specifically:
TaskCreationOptions.LongRunning
You can find further information here:
https://msdn.microsoft.com/en-us/library/system.threading.tasks.taskcreationoptions(v=vs.110).aspx
Keep in mind that tasks run on a thread pool with a finite maximum number of threads. You can use LongRunning to signal that a task will run for a long time and should not clog the pool. Creating a long-running task is more expensive, though, because the scheduler may create a dedicated thread for it.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
namespace TaskTest
{
internal class Program
{
private static void Main(string[] args)
{
var demo = new Program();
demo.SimulateClick();
Console.ReadLine();
}
public void SimulateClick()
{
buttonStart_Click(null, null);
}
private async void buttonStart_Click(object sender, EventArgs e)
{
var tasks = new List<Task>();
for (var i = 0; i < 36; i++)
{
var taskId = i;
var t = Task.Factory.StartNew((() =>
{
Console.WriteLine($"Starting Task ({taskId})");
for (var ii = 0; ii < 200000; ii++)
{
Task.Delay(TimeSpan.FromMilliseconds(500)).Wait();
var s1 = new string(' ', taskId);
var s2 = new string(' ', 36-taskId);
Console.WriteLine($"Updating Task {s1}X{s2} ({taskId})");
}
Console.Write($"Done ({taskId})");
}),TaskCreationOptions.LongRunning);
tasks.Add(t);
}
await Task.WhenAll(tasks);
Console.WriteLine("Done, enabled UI controls");
}
}
}