Running multiple tasks at the same time using a single source data - c#

So how can I run multiple tasks at once, using a txt file as input?
Load Source Data
var lines = File.ReadAllLines("file.txt");
Run Tasks
foreach(var line in lines)
{
//I want to execute 3 tasks, and each task needs to receive a line. When a task finishes, it should pick up another line that has not been used yet, and continue until the end of the file.
}

Have you looked at Parallel.ForEach?
Use it like this:
Parallel.ForEach(File.ReadLines("file.txt"), new ParallelOptions { MaxDegreeOfParallelism = 3 }, line => { /* do stuff */ });
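For reference, here is a minimal, self-contained sketch of the Parallel.ForEach approach. It assumes an in-memory array in place of file.txt, and the class name ParallelLinesDemo, the method ProcessAll and the upper-casing "work" are placeholders invented for illustration, not part of the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelLinesDemo
{
    // Processes every line with at most maxParallel lines in flight at once.
    // (ProcessAll is a hypothetical helper, not an existing API.)
    public static string[] ProcessAll(string[] lines, int maxParallel)
    {
        var results = new ConcurrentBag<string>(); // thread-safe collector
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(lines, options, line =>
        {
            // Placeholder work: replace with the real per-line processing.
            results.Add(line.ToUpperInvariant());
        });

        // Completion order is not deterministic, so sort before returning.
        return results.OrderBy(r => r, StringComparer.Ordinal).ToArray();
    }

    public static void Main()
    {
        // In-memory stand-in for File.ReadLines("file.txt").
        var lines = new[] { "a", "b", "c", "d", "e" };
        Console.WriteLine(string.Join(",", ProcessAll(lines, 3)));
    }
}
```

Parallel.ForEach blocks the calling thread until every line is done, so it suits CPU-bound work better than I/O-bound work that you would rather await.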

Maybe something like this:
async Task Main()
{
var lines = File.ReadAllLines("file.txt");
int i = 0;
var concurrency = 3;
while (i < lines.Length)
{
var tasks = new List<Task>(concurrency);
for (int j = 0; j < concurrency && i < lines.Length; j++)
{
tasks.Add(MyMethod(lines[i++]));
}
await Task.WhenAll(tasks);
}
}
public Task MyMethod(string s)
{
return Task.CompletedTask;
}

You can try this:
private static async Task Main(string[] args) {
const ushort concurrentWorkers = 5;
var lines = File.ReadAllLines("file.txt");
var concurrentSourceQueue = new ConcurrentQueue<string>(lines);
var worker = Enumerable.Range(0, concurrentWorkers)
.Select(_ => DoWorkAsync(concurrentSourceQueue));
await Task.WhenAll(worker);
}
private static async Task DoWorkAsync(ConcurrentQueue<string> queue) {
while (queue.TryDequeue(out var item)) {
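// Process the line here and await the real asynchronous work.
// Without an await, the first worker drains the whole queue before the others start.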
}
}
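A runnable sketch of the same queue-of-workers idea follows, assuming Task.Delay as a stand-in for the real asynchronous work. QueueWorkersDemo, WorkerAsync and RunAsync are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class QueueWorkersDemo
{
    // Each worker keeps pulling unprocessed lines until the queue is empty.
    private static async Task WorkerAsync(int id, ConcurrentQueue<string> queue,
                                          ConcurrentBag<string> processed)
    {
        while (queue.TryDequeue(out var line))
        {
            // Assumed stand-in for real asynchronous work (HTTP call, file I/O, ...).
            await Task.Delay(10);
            processed.Add($"{id}:{line}");
        }
    }

    // Returns how many lines were processed in total.
    public static async Task<int> RunAsync(IEnumerable<string> lines, int workers)
    {
        var queue = new ConcurrentQueue<string>(lines);
        var processed = new ConcurrentBag<string>();

        // ToArray() starts every worker; each one yields at its first await,
        // so the workers interleave instead of running one after another.
        var tasks = Enumerable.Range(0, workers)
                              .Select(id => WorkerAsync(id, queue, processed))
                              .ToArray();

        await Task.WhenAll(tasks);
        return processed.Count;
    }

    public static async Task Main()
    {
        var lines = Enumerable.Range(1, 20).Select(n => $"line {n}");
        Console.WriteLine(await RunAsync(lines, 3)); // prints 20
    }
}
```

Because every line is dequeued exactly once, no line is handled twice, and a worker that finishes early simply picks up the next unused line.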

Related

For loop async issue C# [duplicate]

This question already has answers here:
Captured variable in a loop in C#
(10 answers)
Closed 2 years ago.
Hi, I'm trying to write some simple code that runs my function asynchronously, but the result is quite unexpected. I want the counter function to run in parallel and produce output similar to this:
Start1
End1
Start2
End2
Start3
Start4
End 3
......
Hi
But it turns out that only the final loop value, i = 60, gets passed into the counter function. I'm quite new to async methods, and I couldn't find a suitable explanation on Google.
namespace Asycn
{
class Program
{
static async Task Main(string[] args)
{
var tasks = new List<Task>();
for (int i = 0; i < 60; i++)
{
tasks.Add(Task.Run(()=>counters(i)));
}
await Task.WhenAll(tasks);
Console.WriteLine("Hi");
}
private static void counters(int num)
{
Console.WriteLine("Start"+num.ToString());
Thread.Sleep(num*1000);
Console.WriteLine("End"+num.ToString());
}
}
}
And below is the running result:
[Image: running result]
I assume that you are just getting familiar with async here. Generally, when you want to process this number of tasks, it's better to limit parallelism with something like PLINQ or Parallel.ForEach.
The issue is that the lambda captures the variable i itself, not its value, and the loop has already incremented i (usually all the way to 60) before the tasks run.
All you need to do is capture the value within the loop:
namespace Asycn
{
class Program
{
static async Task Main(string[] args)
{
var tasks = new List<Task>();
for (int i = 0; i < 60; i++)
{
var copy = i; // capture state of i
tasks.Add(Task.Run(()=>counters(copy)));
}
await Task.WhenAll(tasks);
Console.WriteLine("Hi");
}
private static void counters(int num)
{
Console.WriteLine("Start"+num.ToString());
Thread.Sleep(num*1000);
Console.WriteLine("End"+num.ToString());
}
}
}
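Your code isn't actually using async/await to its fullest potential. You're not capturing the value of i, but you won't have to if counters is itself an async method: i is then passed by value as an argument at the moment of the call, so there is no lambda and nothing to capture. Write your code like this: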
static async Task Main(string[] args)
{
var tasks = new List<Task>();
for (int i = 0; i < 60; i++)
{
tasks.Add(counters(i));
}
await Task.WhenAll(tasks);
Console.WriteLine("Hi");
}
private static async Task counters(int num)
{
Console.WriteLine("Start"+num.ToString());
await Task.Delay(num*1000);
Console.WriteLine("End"+num.ToString());
}
The output looks like this:
Start0
End0
Start1
Start2
Start3
...
End1
End2
End3
...
Hi

Producer/consumer doesn't generate expected results

I've written the following producer/consumer code, which should generate a big file filled with random data:
class Program
{
static void Main(string[] args)
{
Random random = new Random();
String filename = @"d:\test_out";
long numlines = 1000000;
var buffer = new BlockingCollection<string[]>(10); //limit to not get OOM.
int arrSize = 100; //size of each string chunk in buffer;
String[] block = new string[arrSize];
Task producer = Task.Factory.StartNew(() =>
{
long blockNum = 0;
long lineStopped = 0;
for (long i = 0; i < numlines; i++)
{
if (blockNum == arrSize)
{
buffer.Add(block);
blockNum = 0;
lineStopped = i;
}
block[blockNum] = random.Next().ToString();
//null is sign to stop if last block is not fully filled
if (blockNum < arrSize - 1)
{
block[blockNum + 1] = null;
}
blockNum++;
};
if (lineStopped < numlines)
{
buffer.Add(block);
}
buffer.CompleteAdding();
}, TaskCreationOptions.LongRunning);
Task consumer = Task.Factory.StartNew(() =>
{
using (var outputFile = new StreamWriter(filename))
{
foreach (string[] chunk in buffer.GetConsumingEnumerable())
{
foreach (string value in chunk)
{
if (value == null) break;
outputFile.WriteLine(value);
}
}
}
}, TaskCreationOptions.LongRunning);
Task.WaitAll(producer, consumer);
}
}
It mostly does what it is intended to do, but for some unknown reason it produces only ~550000 strings, not 1000000, and I cannot understand why this is happening.
Can someone point out my mistake? I really don't get what's wrong with this code.
The buffer
String[] block = new string[arrSize];
is declared outside the lambda. That means it is captured and reused: every buffer.Add(block) enqueues the same array while the producer keeps overwriting it.
That would normally go unnoticed (you would just write out the wrong random data), but because your if (blockNum < arrSize - 1) check is placed inside the for loop, you regularly write a null into the shared buffer, and the consumer stops reading a chunk as soon as it hits that null.
As an exercise, instead of:
block[blockNum] = random.Next().ToString();
use
block[blockNum] = i.ToString();
and predict and verify the results.
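One possible fix, sketched under my own assumptions, is to allocate a fresh array for every chunk and size the last one to fit, so no null sentinel is needed. To keep the example self-contained, the consumer only counts the lines instead of writing them to a file, and BlockProducerDemo and Run are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class BlockProducerDemo
{
    // Produces numLines strings in chunks of up to blockSize and returns
    // how many lines the consumer actually received.
    public static long Run(long numLines, int blockSize)
    {
        var buffer = new BlockingCollection<string[]>(10); // bounded, as in the question

        var producer = Task.Run(() =>
        {
            for (long start = 0; start < numLines; start += blockSize)
            {
                // A fresh array per chunk: once added, the producer never touches it again.
                int size = (int)Math.Min(blockSize, numLines - start);
                var block = new string[size];
                for (int k = 0; k < size; k++)
                    block[k] = (start + k).ToString();
                buffer.Add(block);
            }
            buffer.CompleteAdding();
        });

        long received = 0;
        var consumer = Task.Run(() =>
        {
            foreach (var chunk in buffer.GetConsumingEnumerable())
                received += chunk.Length; // the question writes each value to a file here
        });

        Task.WaitAll(producer, consumer);
        return received;
    }

    public static void Main()
    {
        Console.WriteLine(Run(1000000, 100)); // prints 1000000
    }
}
```

The extra allocations are small compared with the file I/O, and the bounded collection still caps how many chunks are alive at once.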

Console.WriteLine is only writing to the console some of the time when using async await

So, I have a super simple application, but as I test it, it only writes to the console from the method DoWork(). I am unsure why that is, but I am fairly sure it has to do with the fact that it is async code. Any ideas why it only writes from DoWork()?
class Program
{
static void Main(string[] args)
{
MainAsync().Wait();
System.Threading.Thread.Sleep(50000);
}
static async Task MainAsync()
{
Console.WriteLine("Hello World!");
for (int i = 0; i < 300; i++)
{
List<Task> myWork = new List<Task>();
myWork.Add(DoWork(i));
if (myWork.Count == 50)
{
await Task.WhenAll(myWork);
Console.WriteLine("before delay");
await Task.Delay(1000);
Console.WriteLine("after delay");
myWork.Clear();
Console.WriteLine("List cleared.");
}
}
}
public static async Task DoWork(int i)
{
await Task.Delay(0);
Console.WriteLine("Run: " + i);
}
}
You're creating a new List for each iteration of the loop, so it will only ever have one task in it and myWork.Count == 50 is never true.
Declare the List outside of the loop.
Change:
for (int i = 0; i < 300; i++)
{
List<Task> myWork = new List<Task>();
To:
List<Task> myWork = new List<Task>();
for (int i = 0; i < 300; i++)
{
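Put together, a batching loop with the list declared outside might look like the sketch below. It leaves out the one-second pause and the console messages for brevity, and it also awaits any leftover tasks when the total is not a multiple of the batch size. BatchDemo and RunAsync are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchDemo
{
    // Stand-in for the question's DoWork method.
    private static async Task DoWork(int i)
    {
        await Task.Yield();
    }

    // Awaits the work in batches and returns how many batches were awaited.
    public static async Task<int> RunAsync(int total, int batchSize)
    {
        int batches = 0;
        var work = new List<Task>(); // declared once, outside the loop

        for (int i = 0; i < total; i++)
        {
            work.Add(DoWork(i));
            if (work.Count == batchSize)
            {
                await Task.WhenAll(work);
                work.Clear();
                batches++;
            }
        }

        // Await a final, partially filled batch if there is one.
        if (work.Count > 0)
        {
            await Task.WhenAll(work);
            batches++;
        }
        return batches;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync(300, 50)); // prints 6
    }
}
```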

Triggering Parallel.For in c# with sleep

I have a Parallel.For loop which I use to perform a lot of HTTP requests at a certain point, when a scheduled task occurs, like this:
Parallel.For(0, doc.GetElementsByTagName("ItemID").Count, i => {
var xmlResponse = PerformHttpRequestMethod();
});
Is there any way for me to make the loop pause after the counter value hits 2, 4, 6, 8, 10 and so on?
So after every 2 method calls it performs, it should sleep for, let's say, 2 minutes.
Is there any way I could achieve this?
I recommend using Task.Delay.
Your method then becomes asynchronous, using async/await.
public async Task DoSomething()
{
int i = 0;
while (i < doc.GetElementsByTagName("ItemID").Count)
{
Task.Run(() => PerformHttpRequestMethod()); // fire and forget: the result is not awaited
if(i%2==0){
await Task.Delay(TimeSpan.FromMinutes(2));
// or equivalently: await Task.Delay(120000); (120000 ms is 2 minutes)
}
i++;
}
}
Or, if you prefer to use a for loop:
public async Task DoSomething()
{
for (int i = 0; i < doc.GetElementsByTagName("ItemID").Count; i++)
{
Task.Run(() => PerformHttpRequestMethod());
if(i%2==0){
await Task.Delay(TimeSpan.FromMinutes(2));
}
}
}
How would this 2nd example look if I'd want to do iterations from 0 to 4 then sleep 5 to 9 and so on... ?
public async Task DoSomething()
{
for (int i = 0; i < doc.GetElementsByTagName("ItemID").Count; i=i+5)
{
if( i%10 == 0 ){
for( int j=i;j<=i+4;j++){
Task.Run(() => PerformHttpRequestMethod());
}
}
else{
for(int j=i;j<=i+4;j++){
await Task.Delay(TimeSpan.FromMinutes(2));
}
}
}
}
Let's check the correctness of the algorithm.
i=0 -> (0%10==0 -> true), so it executes Task.Run(() => PerformHttpRequestMethod()) for j = 0..4.
i=5 -> (5%10==0 -> false), so it executes await Task.Delay(TimeSpan.FromMinutes(2)); for j = 5..9.
And so on...
I don't really see the point of using a Parallel.For if you want to sleep for x number of minutes or seconds every other iteration...how about using a plain old for loop?:
for(int i = 0; i < doc.GetElementsByTagName("ItemID").Count; ++i)
{
var xmlResponse = PerformHttpRequestMethod();
if (i % 2 == 0)
{
System.Threading.Thread.Sleep(TimeSpan.FromMinutes(2));
}
}
Or maybe you want to keep track of the how many iterations that are currently in flight?:
int inFlight = 0;
Parallel.For(0, doc.GetElementsByTagName("ItemID").Count, i => {
// Test the value returned by the atomic increment rather than re-reading inFlight.
if (System.Threading.Interlocked.Increment(ref inFlight) % 2 == 0)
System.Threading.Thread.Sleep(TimeSpan.FromMinutes(2));
var xmlResponse = PerformHttpRequestMethod();
System.Threading.Interlocked.Decrement(ref inFlight);
});
You can do that by combining a Parallel.For and a normal for loop:
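for(var i = 0; i < doc.GetElementsByTagName("ItemID").Count; i = i + 2)
{
Parallel.For(0, 2, j => { // j, because a lambda parameter cannot reuse the outer name i
var xmlResponse = PerformHttpRequestMethod();
});
Thread.Sleep(TimeSpan.FromMinutes(2)); // pause 2 minutes between batches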
}
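If the request method can be made asynchronous, the same "run a small batch, then pause" pattern can be written without blocking a thread. This is only a sketch: FakeRequestAsync is a made-up stand-in for PerformHttpRequestMethod, the item count is passed in directly instead of being read from doc, and the pause is a parameter so the demo can use a short delay.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ThrottledBatchesDemo
{
    // Hypothetical stand-in for the question's PerformHttpRequestMethod.
    private static Task<string> FakeRequestAsync(int itemIndex) =>
        Task.FromResult($"response {itemIndex}");

    // Runs the requests batchSize at a time and pauses between batches.
    public static async Task<List<string>> RunAsync(int itemCount, int batchSize, TimeSpan pause)
    {
        var responses = new List<string>();
        for (int start = 0; start < itemCount; start += batchSize)
        {
            int end = Math.Min(start + batchSize, itemCount);

            // Start every request of this batch, then wait for all of them.
            var batch = Enumerable.Range(start, end - start).Select(FakeRequestAsync);
            responses.AddRange(await Task.WhenAll(batch));

            // No pause is needed after the final batch.
            if (end < itemCount)
                await Task.Delay(pause);
        }
        return responses;
    }

    public static async Task Main()
    {
        // The question asks for 2 calls per batch and a 2 minute pause;
        // a 50 ms pause keeps the demo quick.
        var responses = await RunAsync(5, 2, TimeSpan.FromMilliseconds(50));
        Console.WriteLine(responses.Count); // prints 5
    }
}
```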

Tasks in array -- only last one runs

I was experimenting with tasks. Why does this output 10 and not each value of the loop?
public static void StartTasks()
{
Task[] tasks = new Task[10];
for (int i = 0; i < 10; i++)
tasks[i] = new Task(() => Console.WriteLine(i));
foreach (Task task in tasks)
{
task.Start();
}
}
C# lambdas capture the variable itself, not its value at the time the lambda is created. All ten lambdas share the single loop variable i, which is already 10 by the time the tasks start.
If you want to capture the value, make a copy of it inside the loop first, so that each lambda captures its own locally scoped variable that never changes.
public static void StartTasks()
{
Task[] tasks = new Task[10];
for (int i = 0; i < 10; i++) {
int j = i;
tasks[i] = new Task(() => Console.WriteLine(j));
}
foreach (Task task in tasks)
{
task.Start();
}
}
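The difference is easy to see without tasks at all. The sketch below stores plain Func<int> delegates and invokes them after the loop has finished; CaptureDemo and its two methods are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // All lambdas share the single loop variable i, which is 3 once the loop ends.
    public static int[] SharedVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        return funcs.Select(f => f()).ToArray();
    }

    // Each iteration declares a fresh j, so each lambda sees its own value.
    public static int[] PerIterationCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int j = i;
            funcs.Add(() => j);
        }
        return funcs.Select(f => f()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable()));   // prints 3,3,3
        Console.WriteLine(string.Join(",", PerIterationCopy())); // prints 0,1,2
    }
}
```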
In addition to the accepted answer, you can also pass a parameter to the task. For example,
using System;
using System.Threading.Tasks;
static void StartTasks(int instances)
{
var tasks = new Task[instances];
for (int i = 0; i < instances; i++)
{
tasks[i] = new Task((object param) =>
{
var t = (int)param;
Console.Write("({0})", t);
}, i);
}
Parallel.ForEach<Task>(tasks, (t) => { t.Start(); });
Task.WaitAll(tasks);
}
