Splitting Loop Between Multiple Threads


My code finds the open ports on a given host.
This process is very time-consuming.
Code:
for(int port=0;port<11;port++)
{
string statusTCP = "Open";
using (TcpClient tcp = new TcpClient())
{
try{ tcp.Connect("127.0.0.1",port);
}catch { statusTCP="Close";}
}
Console.WriteLine("Port " + port + " : " + statusTCP);
}
This process takes 11 seconds for me!
That is far too long if I check 100 or 1000 ports...
Is there a good, fast way to do this?

You don't need multiple threads to connect to multiple ports. You can use ConnectAsync to connect asynchronously. Windows has supported asynchronous IO through completion ports since the NT days.
This means that no threads are blocked while waiting; the OS notifies the application when an IO request completes. In fact, blocking is simulated to make synchronous programming easier.
You can create a single method that connects to a single port and reports a message when finished. Writing to the console blocks though, so I'll use the IProgress<T> interface to report progress:
public async Task ConnectToPortAsync(string host,int port,IProgress<string> progress)
{
using(var client=new TcpClient())
{
try
{
await client.ConnectAsync(host,port).ConfigureAwait(false);
progress.Report($"Port: {port} Open");
//Do some more work
}
catch
{
progress.Report($"Port {port} Closed");
}
}
}
Assuming you have a list of ports:
var ports=new[]{80,8080,...};
or
var ports=Enumerable.Range(0,11);
You can call multiple ports simultaneously like this:
var host="127.0.0.1";
var progress=new Progress<string>(msg=>Console.WriteLine(msg));
var allTasks= ports.Select(port=>ConnectToPortAsync(host,port,progress));
await Task.WhenAll(allTasks);
This code will use the main thread up until the first await. This means that all TcpClient instances will be created using the main thread. After that though, the code will execute asynchronously, using a thread from the IO Completion thread pool whenever necessary.
Reporting is handled by the Progress<T> object, which posts each message to the synchronization context captured when it was created (the main thread in a desktop application; a console application has no synchronization context, so the callback runs on a thread-pool thread). Either way, the background threads don't block waiting to write to the console.
Finally, once await returns, execution continues on the original synchronization context. For a desktop application (WPF, Winforms) that would be the main thread. That's OK if you want to update the UI after an asynchronous operation, but can cause blocking if you want to perform more work in the background. By using ConfigureAwait(false) we instruct the runtime to keep working on the background thread.
Further improvements:
You can add a timeout in case a connection takes too long, by using Task.Delay() and Task.WhenAny:
var connectionTask=client.ConnectAsync(host,port);
var timeout=Task.Delay(1000);
var completed=await Task.WhenAny(connectionTask,timeout);
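if(completed==connectionTask)
{
await connectionTask; // rethrows if the connection failed, so the surrounding catch can report "Closed"
progress.Report($"Port {port} Open");
}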
else
{
progress.Report($"Port {port} Timeout");
}
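Putting the pieces of this answer together, a minimal sketch might look like the following. The class and method names (PortScanner, ProbeAsync, ScanAsync) and the status strings are made up for illustration and are not part of any library. The sketch assumes that a refused connection surfaces as a SocketException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class PortScanner
{
    // Hypothetical helper: returns "Open", "Closed" or "Timeout" for one port.
    public static async Task<string> ProbeAsync(string host, int port, int timeoutMs)
    {
        using (var client = new TcpClient())
        {
            var connectTask = client.ConnectAsync(host, port);
            var completed = await Task.WhenAny(connectTask, Task.Delay(timeoutMs)).ConfigureAwait(false);
            if (completed != connectTask)
            {
                // The connect attempt is abandoned; observe a late failure so it is not left unobserved.
                _ = connectTask.ContinueWith(t => { var ignored = t.Exception; },
                                             TaskContinuationOptions.OnlyOnFaulted);
                return "Timeout";
            }
            try
            {
                // The task has finished; awaiting it rethrows if the connection failed.
                await connectTask.ConfigureAwait(false);
                return "Open";
            }
            catch (SocketException)
            {
                return "Closed";
            }
        }
    }

    // Starts every probe at once and waits for all of them.
    public static async Task<Dictionary<int, string>> ScanAsync(string host, IEnumerable<int> ports, int timeoutMs)
    {
        // ToDictionary enumerates eagerly, so all probes are in flight before the await.
        var probes = ports.Distinct().ToDictionary(p => p, p => ProbeAsync(host, p, timeoutMs));
        await Task.WhenAll(probes.Values).ConfigureAwait(false);
        return probes.ToDictionary(kv => kv.Key, kv => kv.Value.Result);
    }
}
```

Returning the statuses instead of writing to the console keeps the probing code free of blocking calls; the caller can print the dictionary once the scan has finished.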

If what you're asking is how to run multiple lookups in parallel, take a look at the Parallel class.
Parallel.For(0, 11, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (port) =>
{
string statusTCP = "Open";
using (TcpClient tcp = new TcpClient())
{
try
{
tcp.Connect("127.0.0.1", port);
}
catch { statusTCP = "Close"; }
}
Console.WriteLine("Port " + port + " : " + statusTCP);
});
Note the first few method parameters: the loop runs from 0 to 10 (the upper bound, 11, is exclusive), and the ParallelOptions instance limits execution to 5 lookups in parallel.

Related

parallel check open port ip from file

I want to check whether a port is open on a group of IP addresses. I don't want to check the addresses one after another; I want to check them in parallel, with the user choosing the number of threads. The IP addresses are divided among the threads:
private async void button3_Click(object sender, EventArgs e)
{
await Task.Run(() =>
{
var lines = File.ReadAllLines("ip.txt");
int cc = (lines.Length) / (int.Parse(thread.Text));
if (lines.Length % int.Parse(thread.Text) == 0)
{
for (int s = 0; s < lines.Length; s = s + cc)
{
Parallel.For(0, 1, a =>
{
checkopen(s, (s + cc));
});
}
}
else
{
MessageBox.Show("enter true thread num");
}
});
}
This is the method that checks whether the port is open:
void checkopen(int first,int last)
{
int port = Convert.ToInt32(portnum.Text);
var lines = File.ReadAllLines("ip.txt");
for (int i = first; i < last; ++i)
{
var line = lines[i];
using (TcpClient tcpClient = new TcpClient())
{
try
{
tcpClient.Connect(line, port);
this.Invoke(new Action(() =>
{
listBox1.Items.Add(line); // open port
}));
}
catch (Exception)
{
this.Invoke(new Action(() =>
{
listBox2.Items.Add(line); //close port
}));
}
}
}
}

I see this problem every day.
There is no point putting IO-bound operations in Parallel.For. It's not designed for IO work or for the async pattern (and trust me, that's what you want).
Why do we want the async/await pattern?
Because it is designed to give threads back while it's awaiting an IO completion port or another awaitable workload.
When you run IO work in Parallel.For/Parallel.ForEach like this, the task scheduler just won't give you threads to block. It uses all sorts of heuristics to work out how many threads you should have, and it takes a dim view of blocked ones.
So what should we use?
The async and await pattern.
Why?
Because we can let IO be IO: the system creates an IO completion port, and .NET gives the thread back to the thread pool until the completion port calls back and the method continues.
So, there are many options for this. But first and foremost, await the awaitable async methods of the libraries you are using.
From here you can create a list of tasks and use something like an awaitable SemaphoreSlim to limit concurrency, together with a WhenAll.
Or you could use something like an ActionBlock out of TPL Dataflow, which is designed to work with both CPU-bound and IO-bound workloads.
The real-world benefits can't be overstated. Your Parallel.For approach will just run a handful of threads and block them. With an async version you'll be able to run hundreds of operations simultaneously.
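The SemaphoreSlim option mentioned above can be sketched roughly as follows. The helper name Throttled.ForEachAsync is made up for illustration; only SemaphoreSlim.WaitAsync, Release and Task.WhenAll are real framework calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttled
{
    // Hypothetical helper: runs work(item) for every item,
    // with at most maxConcurrency operations in flight at any time.
    public static async Task ForEachAsync<T>(IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                // Waits asynchronously for a free slot; no thread is blocked here.
                await gate.WaitAsync().ConfigureAwait(false);
                try
                {
                    await work(item).ConfigureAwait(false);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList(); // ToList starts every task; most of them park at WaitAsync

            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

In the port-checking case, work would be an async method that awaits ConnectAsync for one address, and maxConcurrency would be the number the user entered.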
Dataflow example
TPL Dataflow is available as the System.Threading.Tasks.Dataflow NuGet package.
public async Task DoWorkLoads(List<WorkLoad> workloads)
{
var options = new ExecutionDataflowBlockOptions
{
// add pepper and salt to taste
MaxDegreeOfParallelism = 100,
EnsureOrdered = false
};
// create an action block
var block = new ActionBlock<WorkLoad>(MyMethodAsync, options);
// Queue them up
foreach (var workLoad in workloads)
block.Post(workLoad );
// wait for them to finish
block.Complete();
await block.Completion;
}
...
// Notice we are using the async / await pattern
public async Task MyMethodAsync(WorkLoad workLoad)
{
try
{
Console.WriteLine("Doing some IO work async");
await DoIoWorkAsync();
}
catch (Exception)
{
// probably best to add some error handling here
}
}

Parallel.ForEach and blocking thread

I created a Windows Service application that uses the Quartz.NET library to schedule jobs for reporting purposes. The main part of the application fetches data from databases at different locations (~260), so I decided to use Parallel.ForEach to fetch the data in parallel and store it in a central location.
In the Quartz.NET job I run a static method from my utility class that does the parallel processing.
Utility class:
public class Helper
{
public static ConcurrentQueue<Exception> KolekcijaGresaka = new ConcurrentQueue<Exception>(); // Thread-safe
public static void Start()
{
List<KeyValuePair<string, string>> podaci = Aktivne(); // List of data for processing (260 items)
ParallelOptions opcije = new ParallelOptions { MaxDegreeOfParallelism = 50 };
Parallel.ForEach(podaci, opcije, p =>
{
UzmiPodatke(p.Key, p.Value, 2000);
});
}
public static void UzmiPodatke(string oznaka, string ipAdresa, int pingTimeout)
{
string datumTrenutneString = DateTime.Now.ToString("d.M.yyyy");
string datumPrethodneString = DatumPrethodneGodineString();
string sati = DateTime.Now.ToString("HH");
// Ping:
Ping ping = new Ping();
PingReply reply = ping.Send(ipAdresa, pingTimeout);
// If is online call method for copy data:
if (reply.Status == IPStatus.Success)
{
KopirajPodatke(oznaka, ipAdresa, datumTrenutneString, datumPrethodneString, sati, "TBL_DATA");
}
}
public static void KopirajPodatke(string oznaka, string ipAdresa, string datumTrenutneString, string datumPrethodneString, string sati, string tabelaDestinacija)
{
string lanString = "Database=" + ipAdresa + "://DBS//custdb.gdb; User=*******; Password=*******; Dialect=3;";
IDbConnection lanKonekcija = new FbConnection(lanString);
IDbCommand lanCmd = lanKonekcija.CreateCommand();
try
{
lanKonekcija.Open();
lanCmd.CommandText = "query ...";
DataTable podaciTabela = new DataTable();
// Get data from remote location:
try
{
podaciTabela.Load(lanCmd.ExecuteReader());
}
catch (Exception ex)
{
throw ex;
}
// Save data:
if (podaciTabela.Rows.Count > 0)
{
using (SqlConnection sqlKonekcija = new SqlConnection(Konekcije.DB("Podaci")))
{
sqlKonekcija.Open();
using (SqlBulkCopy bulkcopy = new SqlBulkCopy(sqlKonekcija))
{
bulkcopy.DestinationTableName = tabelaDestinacija;
bulkcopy.BulkCopyTimeout = 5; // seconds
bulkcopy.ColumnMappings.Add("A", "A");
bulkcopy.ColumnMappings.Add("B", "B");
bulkcopy.ColumnMappings.Add("C", "C");
bulkcopy.ColumnMappings.Add("D", "D");
try
{
bulkcopy.WriteToServer(podaciTabela);
}
catch (Exception ex)
{
throw ex;
}
}
}
}
}
catch (Exception ex)
{
KolekcijaGresaka.Enqueue(ex);
}
finally
{
lanCmd.Dispose();
lanKonekcija.Close();
lanKonekcija.Dispose();
}
}
The application works most of the time (the job executes 4 times per day), but sometimes it gets stuck and hangs (usually after ~200 items have been processed in parallel), blocking the main thread and never finishing. It seems that one of the threads from the parallel processing gets blocked and prevents the main thread from continuing. Can this be caused by deadlocks?
How can I ensure that no single thread blocks application execution (even if fetching the data fails)? What can go wrong with the code above?

How can I ensure that no single thread blocks application execution (even if fetching the data fails)? What can go wrong with the code above?
Parallel.ForEach is not asynchronous. It only executes the iterations in parallel, so it will wait for every operation to finish before proceeding. If you truly do not care to wait for all operations to finish before returning to the caller, then try using the Task factory to schedule them; it uses the thread pool by default.
i.e.
foreach(var p in podaci)
{
Task.Factory.StartNew(() => UzmiPodatke(p.Key, p.Value, 2000));
}
Or use ThreadPool.QueueUserWorkItem or BackgroundWorker, whatever you're familiar with and is applicable to the behavior you want.
This probably won't solve all your problems, just the unresponsive program. Most likely, if there is actually a problem with your code, one of your Tasks will eventually throw an exception which will crash your program if unhandled. Or worse yet, you will have "stuck" tasks just sitting there hogging resources if the Task(s) never finish. However, it may just be the case that occasionally one of these takes extremely long. In this case, you can handle this however you want (cancellation of long task, make sure all previously scheduled tasks complete before scheduling more, etc.), and the Task Parallel Library can support all these cases with some minor modifications.
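As a rough sketch of one such modification, the helper below starts every job on the thread pool, records failures, and stops waiting after an overall deadline. The names Batch and RunWithDeadlineAsync are made up for illustration. Note the assumption built into it: a job that is still running when the deadline passes is only abandoned, not cancelled, because a blocked synchronous call cannot be interrupted from outside.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class Batch
{
    // Hypothetical helper: returns the items whose work had not finished
    // when the deadline passed. Exceptions thrown by work are collected in errors.
    public static async Task<List<T>> RunWithDeadlineAsync<T>(
        IEnumerable<T> items, Action<T> work, TimeSpan deadline, ConcurrentQueue<Exception> errors)
    {
        // Start one thread-pool task per item and remember which item it belongs to.
        var started = items.Select(item => new
        {
            Item = item,
            Job = Task.Run(() =>
            {
                try { work(item); }
                catch (Exception ex) { errors.Enqueue(ex); }
            })
        }).ToList();

        // Wait until everything has finished or the deadline passes, whichever comes first.
        var all = Task.WhenAll(started.Select(s => s.Job));
        await Task.WhenAny(all, Task.Delay(deadline)).ConfigureAwait(false);

        return started.Where(s => !s.Job.IsCompleted).Select(s => s.Item).ToList();
    }
}
```

The returned list tells the caller which locations were still hanging, which can be logged before the job ends instead of letting one stuck connection hold up the whole run.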

Proper use of async / await and Task

I'm trying to learn those "new" keywords and tried to implement a simple async udp server.
public class UdpServerSync
{
private CancellationTokenSource _cts;
private CancellationToken _token;
private UdpClient _client;
public void Start()
{
Console.WriteLine("Start server");
_cts = new CancellationTokenSource();
_token = _cts.Token;
var ipAddress = IPAddress.Parse("192.168.0.25");
var ip = new IPEndPoint(ipAddress, 7070);
try
{
Task.Run(async () =>
{
using (_client = new UdpClient(ip))
{
while (!_token.IsCancellationRequested)
{
var receivedData = await _client.ReceiveAsync();
var msg = Encoding.ASCII.GetString(receivedData.Buffer);
// Process request e.g ProcessRequest(msg);
Console.WriteLine(msg);
}
}
}, _token).ConfigureAwait(false);
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
public void Stop()
{
Console.WriteLine("Stop server");
if (_cts != null) _cts.Cancel();
}
}
And then use it like this (for testing purpose):
var server = new UdpServerSync();
server.Start();
await Task.Delay(5000);
server.Stop();
The above code is just a proof of concept; it's not about code review. By a simple UDP server I mean a while loop with a UdpClient listening for UDP messages and writing them to the console, with no processing or error handling.
The reason for the Task.Delay is just that it's a proof of concept of calling the server's Start() and Stop() methods.
To narrow down my questions:
1) If I were going to call the Start() and Stop() methods from e.g. a WPF application's start button, should I use server.Start() or Task.Run? I don't want to await the call since there's no way to know how long the user is going to want the server started.
2) In the server code "ProcessRequest(msg)", if that were a void method in another library, should I use Task.Run() to execute it to avoid the server thread being blocked, or is there a better way?
3) When we do async/await, does the code in the await statement execute in a new thread from the thread pool?
4) Can I specify that the UdpServer is a long-running process, or does it not matter to the thread pool?
I hope my question is clearer now, thanks guys :)

1) If I were going to call the Start() and Stop() methods from e.g. a WPF application's start button, should I use server.Start() or Task.Run?
With the code you have now, you can just call Start (which calls Task.Run). IMO, the call to Task.Run is unnecessary. And the call to ConfigureAwait is definitely unnecessary since there's no await to configure.
I don't want to await the call since there's no way to know how long the user is going to want the server started.
But you probably do want to know about any exceptions. So, think about how to handle those. One solution is to save the returned Task as a property. Or, you could just await it (remember, it doesn't matter "how long" it runs, because it's an asynchronous wait).
2) In the server code "ProcessRequest(msg)", if that were a void method in another library, should I use Task.Run() to execute it to avoid the server thread being blocked, or is there a better way?
There's no "server thread". But if ProcessRequest takes a long time, then you might want to consider using Task.Run so that your code will accept the next request while processing that one.
3) When we do async/await, does the code in the await statement execute in a new thread from the thread pool ?
No. I have an async intro on my blog that goes into more detail.
4) Can I specify that the UdpServer is a long running process or it doesn't matter to the thread pool ?
It doesn't matter.
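The suggestion to save the returned Task so that exceptions can be observed could be sketched like this. BackgroundLoop, Completion and StopAsync are hypothetical names chosen for illustration; the loop delegate stands in for the UDP receive loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class BackgroundLoop
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();

    // Hypothetical property: the saved task, so callers can observe failures later.
    public Task Completion { get; private set; } = Task.CompletedTask;

    // Starts the asynchronous loop without awaiting it and keeps its task.
    public void Start(Func<CancellationToken, Task> loop)
    {
        Completion = loop(_cts.Token);
    }

    // Signals cancellation and waits asynchronously for the loop to end.
    // A cancelled loop counts as a clean stop; any other exception is rethrown here.
    public async Task StopAsync()
    {
        _cts.Cancel();
        try
        {
            await Completion.ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // expected when the loop honours the token
        }
    }
}
```

A WPF stop button handler could then await StopAsync(), which both stops the server and surfaces any error the loop hit while it was running.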

Always Running Threads on Windows Service

I'm writing a Windows Service that will kick off multiple worker threads that will listen to Amazon SQS queues and process messages. There will be about 20 threads listening to 10 queues.
The threads will have to be always running, and that's why I'm leaning towards using actual threads for the worker loops rather than thread-pool threads.
Here is a top-level implementation. The Windows service will kick off multiple worker threads, and each will listen to its queue and process messages.
protected override void OnStart(string[] args)
{
for (int i = 0; i < _workers; i++)
{
new Thread(RunWorker).Start();
}
}
Here is the implementation of the work
public async void RunWorker()
{
while(true)
{
// .. get message from amazon sqs sync.. about 20ms
var message = sqsClient.ReceiveMessage();
try
{
await PerformWebRequestAsync(message);
await InsertIntoDbAsync(message);
}
catch(SomeExeception)
{
// ... log
//continue to retry
continue;
}
sqsClient.DeleteMessage();
}
}
I know I can perform the same operation with Task.Run and execute it on the threadpool thread rather than starting individual thread, but I don't see a reason for that since each thread will always be running.
Do you see any problems with this implementation? How reliable would it be to leave threads always running in this fashion and what can I do to make sure that each thread is always running?

One problem with your existing solution is that you call your RunWorker in a fire-and-forget manner, albeit on a new thread (i.e., new Thread(RunWorker).Start()).
RunWorker is an async method, it will return to the caller when the execution point hits the first await (i.e. await PerformWebRequestAsync(message)). If PerformWebRequestAsync returns a pending task, RunWorker returns and the new thread you just started terminates.
I don't think you need a new thread here at all, just use AmazonSQSClient.ReceiveMessageAsync and await its result. Another thing is that you shouldn't be using async void methods unless you really don't care about tracking the state of the asynchronous task. Use async Task instead.
Your code might look like this:
List<Task> _workers = new List<Task>();
CancellationTokenSource _cts = new CancellationTokenSource();
protected override void OnStart(string[] args)
{
for (int i = 0; i < _MAX_WORKERS; i++)
{
_workers.Add(RunWorkerAsync(_cts.Token));
}
}
public async Task RunWorkerAsync(CancellationToken token)
{
while(true)
{
token.ThrowIfCancellationRequested();
// get the message from Amazon SQS asynchronously
var message = await sqsClient.ReceiveMessageAsync().ConfigureAwait(false);
try
{
await PerformWebRequestAsync(message);
await InsertIntoDbAsync(message);
}
catch(SomeExeception)
{
// ... log
//continue to retry
continue;
}
sqsClient.DeleteMessage();
}
}
Now, to stop all pending workers, you could simply do this (from the main "request dispatcher" thread):
_cts.Cancel();
try
{
Task.WaitAll(_workers.ToArray());
}
catch (AggregateException ex)
{
ex.Handle(inner => inner is OperationCanceledException);
}
Note, ConfigureAwait(false) is optional for Windows Service, because there's no synchronization context on the initial thread, by default. However, I'd keep it that way to make the code independent of the execution environment (for cases where there is synchronization context).
Finally, if for some reason you cannot use ReceiveMessageAsync, or you need to call another blocking API, or simply do a piece of CPU intensive work at the beginning of RunWorkerAsync, just wrap it with Task.Run (as opposed to wrapping the whole RunWorkerAsync):
var message = await Task.Run(
() => sqsClient.ReceiveMessage()).ConfigureAwait(false);

Well, for one I'd use a CancellationTokenSource instantiated in the service and passed down to the workers. Your while statement would become:
while(!cancellationTokenSource.IsCancellationRequested)
{
//rest of the code
}
This way you can cancel all your workers from the OnStop service method.
Additionally, you should watch for:
If you're playing with thread states from outside of the thread, then a ThreadStateException, or ThreadInterruptedException or one of the others might be thrown. So, you want to handle a proper thread restart.
Do the workers need to run without pause in-between iterations? I would throw in a sleep in there (even a few ms's) just so they don't keep the CPU up for nothing.
You need to handle ThreadStartException and restart the worker, if it occurs.
Other than that there's no reason why those 10 threads can't run for as long as the service runs (days, weeks, months at a time).
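A small sketch of the loop shape described in this answer, written with tasks rather than raw threads, is shown below. Worker.RunAsync and its parameters are made up for illustration; the pause uses Task.Delay with the token, so that a stop request also interrupts the wait between iterations.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Worker
{
    // Hypothetical helper: repeats iteration() until cancellation is requested,
    // pausing between iterations. Returns how many iterations completed.
    public static async Task<int> RunAsync(Func<Task> iteration, TimeSpan pause, CancellationToken token)
    {
        int completed = 0;
        while (!token.IsCancellationRequested)
        {
            await iteration().ConfigureAwait(false);
            completed++;
            try
            {
                // A cancellable pause, so the worker does not spin the CPU for nothing.
                await Task.Delay(pause, token).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                break; // stop requested during the pause
            }
        }
        return completed;
    }
}
```

In the service, OnStart would start one such task per worker and OnStop would cancel the shared CancellationTokenSource and wait for the tasks to finish.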

Network Command Processing with TPL Dataflow

I'm working on a system that involves accepting commands over a TCP network connection, then sending responses upon execution of those commands. Fairly basic stuff, but I'm looking to support a few requirements:
Multiple clients can connect at the same time and establish separate sessions. Sessions can last as long or as short as desired, with the same client IP able to establish multiple parallel sessions, if desired.
Each session can process multiple commands at the same time, as some of the requested operations can be performed in parallel.
I'd like to implement this cleanly using async/await and, based on what I've read, TPL Dataflow sounds like a good way to cleanly break up the processing into nice chunks that can run on the thread pool instead of tying up threads for different sessions/commands, blocking on wait handles.
This is what I'm starting with (some parts stripped out to simplify, such as details of exception handling; I've also omitted a wrapper that provides an efficient awaitable for the network I/O):
private readonly Task _serviceTask;
private readonly Task _commandsTask;
private readonly CancellationTokenSource _cancellation;
private readonly BufferBlock<Command> _pendingCommands;
public NetworkService(ICommandProcessor commandProcessor)
{
_commandProcessor = commandProcessor;
IsRunning = true;
_cancellation = new CancellationTokenSource();
_pendingCommands = new BufferBlock<Command>();
_serviceTask = Task.Run((Func<Task>)RunService);
_commandsTask = Task.Run((Func<Task>)RunCommands);
}
public bool IsRunning { get; private set; }
private async Task RunService()
{
_listener = new TcpListener(IPAddress.Any, ServicePort);
_listener.Start();
while (IsRunning)
{
Socket client = null;
try
{
client = await _listener.AcceptSocketAsync();
client.Blocking = false;
var session = RunSession(client);
lock (_sessions)
{
_sessions.Add(session);
}
}
catch (Exception ex)
{ //Handling here...
}
}
}
private async Task RunCommands()
{
while (IsRunning)
{
var command = await _pendingCommands.ReceiveAsync(_cancellation.Token);
var task = Task.Run(() => RunCommand(command));
}
}
private async Task RunCommand(Command command)
{
try
{
var response = await _commandProcessor.RunCommand(command.Content);
Send(command.Client, response);
}
catch (Exception ex)
{
//Deal with general command exceptions here...
}
}
private async Task RunSession(Socket client)
{
while (client.Connected)
{
var reader = new DelimitedCommandReader(client);
try
{
var content = await reader.ReceiveCommand();
_pendingCommands.Post(new Command(client, content));
}
catch (Exception ex)
{
//Exception handling here...
}
}
}
The basics seem straightforward, but one part is tripping me up: how do I make sure that when I'm shutting down the application, I wait for all pending command tasks to complete? I get the Task object when I use Task.Run to execute the command, but how do I keep track of pending commands so that I can make sure that all of them are complete before allowing the service to shut down?
I've considered using a simple List, with removal of commands from the List as they finish, but I'm wondering if I'm missing some basic tools in TPL Dataflow that would allow me to accomplish this more cleanly.
EDIT:
Reading more about TPL Dataflow, I'm wondering if what I should be using is a TransformBlock with an increased MaxDegreeOfParallelism to allow processing parallel commands? This sets an upper limit on the number of commands that can run in parallel, but that's a sensible limitation for my system, I think. I'm curious to hear from those who have experience with TPL Dataflow to know if I'm on the right track.

Yeah, so... you're kinda half using the power of TPL here. The fact that you're still manually receiving items from the BufferBlock in your own while loop in a background Task is not the "way" you want to do it if you're subscribing to the TPL DataFlow style.
What you would do is link an ActionBlock to the BufferBlock and do your command processing/sending from within that. This is also the block where you would set the MaxDegreeOfParallelism to control just how many concurrent commands you want to process. So that setup might look something like this:
// Initialization logic to build up the TPL flow
_pendingCommands = new BufferBlock<Command>();
_commandBlock = new ActionBlock<Command>(this.ProcessCommand);
// PropagateCompletion forwards Complete() from the buffer to the action block
_pendingCommands.LinkTo(_commandBlock, new DataflowLinkOptions { PropagateCompletion = true });
private async Task ProcessCommand(Command command)
{
var response = await _commandProcessor.RunCommand(command.Content);
this.Send(command.Client, response);
}
Then, in your shutdown code, you would need to signal that you're done adding items into the pipeline by calling Complete on the _pendingCommands BufferBlock and then wait on the _commandBlock ActionBlock to complete to ensure that all items have made their way through the pipeline. (The ActionBlock is named _commandBlock here so that it doesn't clash with your existing _commandProcessor field, and the link must be created with PropagateCompletion enabled, otherwise the ActionBlock never learns that the BufferBlock has completed.) You do this by grabbing the Task returned by the block's Completion property and calling Wait on it:
_pendingCommands.Complete();
_commandBlock.Completion.Wait();
If you want to go for bonus points, you can even separate the command processing from the command sending. This would allow you to configure those steps separately from one another. For example, maybe you need to limit the number of threads processing commands, but want to have more sending out the responses. You would do this by simply introducing a TransformBlock into the middle of the flow:
_pendingCommands = new BufferBlock<Command>();
_commandBlock = new TransformBlock<Command, Tuple<Client, Response>>(this.ProcessCommand);
_commandSender = new ActionBlock<Tuple<Client, Response>>(this.SendResponseToClient);
// PropagateCompletion lets Complete() on the first block flow through to the last one
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
_pendingCommands.LinkTo(_commandBlock, linkOptions);
_commandBlock.LinkTo(_commandSender, linkOptions);
private async Task<Tuple<Client, Response>> ProcessCommand(Command command)
{
var response = await _commandProcessor.RunCommand(command.Content);
return Tuple.Create(command.Client, response);
}
private void SendResponseToClient(Tuple<Client, Response> clientAndResponse)
{
this.Send(clientAndResponse.Item1, clientAndResponse.Item2);
}
You probably want to use your own data structure instead of Tuple, it was just for illustrative purposes, but the point is this is exactly the kind of structure you want to use to break up the pipeline so that you can control the various aspects of it exactly how you might need to.

Tasks are by default background, which means that when application terminates they are also immediately terminated. You should use a Thread not a Task. Then you can set:
Thread.IsBackground = false;
This will prevent your application from terminating while the worker thread is running.
Although of course this will require some changes in your above code.
What's more, when executing the shutdown method, you could also just wait for any outstanding tasks from the main thread.
I do not see a better solution to this.
