Proper understanding of Tasks - c#

At the risk of asking a stupid question (and I will voluntarily delete the question myself if my peers think it is a stupid question)..
I have a C# desktop app.
I upload data to my server using a WCF Service.
I am experimenting with using Tasks.
This code calls my web service...
Task t = Task.Run(() => { wcf.UploadMotionDynamicRaw(bytes); });
I am stress testing this line of code.
I call it as many times as possible each second over a period of time X.
Will this 'flood' my router if the internet is slow for whatever reason?
I can sort of guess that this will be the case...
So, how can I test whether the task has completed before calling it again? In doing this will the extra plumbing slow down my speed gains by using Task?
Finally, is using Tasks making usage of multiple cores?

Will this 'flood' my router if the internet is slow for whatever
reason?
This depends on the size of the file you are uploading and your connection speed. To figure that out, just run it.
So, how can I test whether the task has completed before calling it
again?
You can use the Task.ContinueWith method (any of the available overloads) to "catch" task completion and run some other method, possibly recursively.
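As a rough illustration of that idea, here is a minimal sketch that remembers the last upload task, refuses to start a new one while it is still running, and uses ContinueWith to react to completion. The UploadGate class and its members are hypothetical names made up for this example, and it assumes a single calling thread, as in the question's loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: allows only one upload in flight at a time.
public sealed class UploadGate
{
    private Task _lastUpload = Task.CompletedTask;
    private int _completed;

    // Number of uploads that finished without an exception.
    public int Completed => Volatile.Read(ref _completed);

    // Starts `upload` only if the previous one has finished; returns false otherwise.
    // Not thread-safe: assumes one caller thread.
    public bool TryStart(Action upload)
    {
        if (!_lastUpload.IsCompleted)
            return false; // previous upload still running, skip this one

        _lastUpload = Task.Run(upload).ContinueWith(t =>
        {
            if (t.IsFaulted)
                Console.WriteLine("Upload failed: " + t.Exception.GetBaseException().Message);
            else
                Interlocked.Increment(ref _completed);
        });
        return true;
    }

    // Blocks until the most recent upload and its continuation are done.
    public void WaitForLast() => _lastUpload.Wait();
}
```

In the question's scenario the call would look something like gate.TryStart(() => wcf.UploadMotionDynamicRaw(bytes)); the check itself is a single property read, so the extra plumbing is cheap compared with a network upload.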
In doing this will the extra plumbing slow down my speed gains by
using Task?
It depends on the workload, your processor and the timing you expect. In other words, run it and measure; there is no generic answer to this.
is using Tasks making usage of multiple cores?
Yes, whenever the scheduler figures out it is possible. Running single tasks one after another will not spread the work of a single function across multiple cores; for that you need Parallel.For and similar constructs. Also, .NET tasks do not let you control which core the work runs on, so you are not guaranteed that it will run on multiple cores, but it most probably will.
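For comparison, a minimal sketch of the Parallel.For approach mentioned above. The ParallelDemo class is a hypothetical name for this example; the point is only that independent iterations can be handed to the TPL, which may then spread them over the available cores.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Squares every element. Each iteration writes to its own slot,
    // so the iterations are independent and safe to run in parallel.
    public static long[] SquareAll(int[] input)
    {
        var result = new long[input.Length];
        Parallel.For(0, input.Length, i =>
        {
            result[i] = (long)input[i] * input[i];
        });
        return result;
    }
}
```

For work this small the scheduling overhead will likely outweigh any gain; the construct pays off when each iteration does a meaningful amount of CPU work.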

Related

Options Besides Using Parallel.ForEach

I have built a Windows Forms app that is used to generate approximately 70k SSRS reports and save them to a folder for distribution. This process takes about 8 hours to run so I tried using Parallel.ForEach() to speed things up.
I can run the app with MaxDegreeOfParallelism set to 3 as long as no other processes are accessing the report server. With anything higher than that, or with some other process accessing the server at the same time, the report server throws an HTTP 503 error because it's overloaded. I have no control over what other processes access the server or when, so I’m concerned that even setting MaxDegreeOfParallelism down to 2 may not prevent overloading the server.
I have almost no experience using the Parallel Library so I would appreciate any direction or suggestions on what I can do besides using Parallel.ForEach() to speed up this app.
The first thing to determine in your analysis is whether your task is processor-intensive or I/O-intensive. This will help you decide whether to use Parallel.ForEach() (for processor-intensive work) or something like Task.WhenAll (for I/O-intensive work).
From your question, I believe this is more of an I/O-intensive process, but it is hard to say without seeing your code.
Is the time in each process being spent on database queries, on file reads/writes, or on actual processor operations? These are the key questions you need to answer to find the best solution.
You can also consider newer language tools such as async streams or a parallel async foreach.
You can find some great examples here :
https://scatteredcode.net/parallel-foreach-async-in-c/
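If the work does turn out to be I/O-bound, one common way to combine Task.WhenAll with a cap on concurrent requests is a SemaphoreSlim. The sketch below is only an illustration under that assumption: ThrottledRunner, RunAsync and the work delegate are hypothetical names, and the delegate stands in for whatever call renders and saves one report.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    // Runs `work` for every item with at most `maxConcurrency` calls in flight.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        Func<TItem, Task<TResult>> work,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() forces the tasks to be created (and started) right away.
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();          // wait for a free slot
                try { return await work(item); }
                finally { gate.Release(); }      // free the slot even on failure
            }).ToList();

            return await Task.WhenAll(tasks);
        }
    }
}
```

This does not solve the problem of other processes hitting the same server; a retry with a delay on HTTP 503 would probably still be needed on top of it.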

Using ThreadPool threads in library

I'm working on a .NET Core library that will be used mostly in web apps. This library is being built with performance in mind, as this is the main design decision. There is some code that is fairly heavy and, because of this, its result will be cached so that subsequent calls are quick. As you can imagine, the first call is slower and I don't want that. I want to execute this code at the earliest possible time to warm up the cache without affecting the other operations. I was thinking of using Task.Start() without awaiting it to achieve this.
My question is: is it frowned upon to use thread pool threads in a library, i.e., what is the etiquette on this? As this will mostly be used in web apps, I feel I don't want to interfere with the client's thread pool. That being said, the library will only use one background thread, and for less than a second. Or should I just let the client take the performance hit on the first calls?
If I understand you correctly: it's perfectly legitimate to use multi-threading in a library; as a matter of fact, it happens all the time.
Basically, a lot of async Task methods do this in one way or another (although sometimes there is no thread involved at all).
If the work is so heavy that you need multiple parallel threads for a long period of time, then it's best to create an explicit initialization routine and warn the caller in the docs.
Task.Run is typically used for such processing.
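A minimal sketch of such a warm-up, assuming the expensive value can be produced by a single factory function. WarmCache is a hypothetical class for this example; it combines Lazy<T> with Task.Run so that the background warm-up and the first real caller never run the factory twice.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical cache wrapper: the value is built once, either by the
// background warm-up or by the first caller, whichever gets there first.
public sealed class WarmCache<T>
{
    private readonly Lazy<T> _value;

    public WarmCache(Func<T> expensiveFactory)
    {
        // ExecutionAndPublication makes the factory run at most once,
        // even if WarmUp and a real caller race each other.
        _value = new Lazy<T>(expensiveFactory, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // Starts building the value on a thread-pool thread and returns immediately.
    // The returned task can be ignored (fire and forget) or awaited.
    public Task WarmUp() => Task.Run(() => { var unused = _value.Value; });

    // Returns the cached value, building it first if the warm-up has not finished.
    public T Value => _value.Value;
}
```

A caller that arrives before the warm-up finishes simply blocks until the value is ready, so the worst case is the same as having no warm-up at all.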

Task in synchronous application - IO bound

We are working on an old comparator.
When a user runs a search, we call 10-30 different web services (REST, SOAP) at the same time. Pretty classic. Each web service is represented by a Client in our application.
So the code is like:
// Get the list of client requests to call
clientRqListToCall = BuildRequest(userContext);
List<Task> taskList = new List<Task>();
// Call the different clients
foreach (ClientRequest clientRq in clientRqListToCall) {
    Task task = Task.Run(() => CallClient(clientRq));
    taskList.Add(task);
}
// Wait for the clients until the timeout
Task mainTask = Task.WhenAll(taskList);
mainTask.ConfigureAwait(false);
mainTask.Wait(timeout);
Simple. (Not sure the ConfigureAwait is needed.) The response of each client is stored in a field of ClientRequest, so we don't use mainTask.Result (if a client times out, we need to be able to continue with the other ones, and they time out a lot! The client call behaviour is pretty similar to fire-and-forget).
The application is a little old, and our search engine is synchronous. The calls to the different web services sit deep in the different CallClient call trees: depending on the search context, 5 to 15 different functions are called before the web service call. Each web service call is pretty long (1 to 15 s each)! This point seems to be important: these are not simple ping requests.
Actions / Changes ?
So this is an I/O-bound problem. We know Task.Run works pretty well for CPU-bound problems but not for I/O, so the question is how to make this code better.
We read a lot of different articles on the subject, thanks to Stephen Cleary (http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html).
But we are not sure of our choice / road map, which is why I am posting this question.
We could make the code asynchronous, but we would have to rework the whole CallClient call tree (hundreds of functions). Is it the only solution? Of course, we could migrate the web services one by one using the bool argument hack (https://msdn.microsoft.com/en-us/magazine/mt238404.aspx).
=> Must we start with the most costly web service (in terms of I/O), or is only the number of web service calls important, so that we should start with the easiest?
In other words, if I have one big client with a 10 s average response time and a lot of data, must we make that one async first? Or should we start with the little ones (1-2 s) with the same amount of data? I could be wrong, but a thread is blocked synchronously until the Task.Run() work finishes, so obviously the 10 s task blocks a thread for the whole time, and in terms of I/O, freeing a thread as soon as possible could be better. Is the amount of data downloaded important, or should we only think in terms of web service duration?
Task.Run uses the application thread pool. We have to choose between Task.Run(...) and Task.Factory.StartNew(..., TaskCreationOptions.LongRunning), which (most of the time) creates a new thread and so might perform better.
=> I ran some tests on the subject using a console application: Task.Run() seems to be 25% to 33% faster than Task.Factory.StartNew in all scenarios.
Of course this is an expected result, but on a web app with around 200 users I am not sure the result would be the same. I fear the pool would be full and the tasks would keep switching between each other without finishing.
Note: if StartNew is used, WaitAll(timeout) replaces WhenAll.
Today, on average, 20 to 50 customers can run a search at the same time. The application works without big issues and we don't have deadlocks, but sometimes we see some delay in task execution on our side. Our CPU usage is pretty low (<10%), and RAM is fine too (<25%).
I know there are plenty of questions about Tasks, but it's hard to merge them together to match our problem, and we have also read contradictory advice.
I have used Parallel.ForEach to handle multiple I/O operations before, and I did not see it mentioned above. I am not sure it will handle quite what you need, seeing that the function passed into the loop is the same for each item. Maybe coupled with a strategy pattern / delegates you can achieve what you need.
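For the asynchronous route discussed in the question, here is a minimal sketch of waiting on several truly asynchronous client calls with one shared timeout, without wrapping them in Task.Run. It assumes each client can expose its call as a Func<Task<TResult>>; ClientFanOut and CallAllAsync are hypothetical names, not part of the application described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ClientFanOut
{
    // Starts every call, waits until all finish or the timeout elapses, and
    // returns the results of the calls that completed successfully in time.
    // Calls that are still running after the timeout are NOT cancelled here.
    public static async Task<List<TResult>> CallAllAsync<TResult>(
        IEnumerable<Func<Task<TResult>>> calls, TimeSpan timeout)
    {
        List<Task<TResult>> running = calls.Select(call => call()).ToList();

        // Whichever finishes first wins: all the calls, or the timer.
        await Task.WhenAny(Task.WhenAll(running), Task.Delay(timeout));

        return running
            .Where(t => t.Status == TaskStatus.RanToCompletion)
            .Select(t => t.Result)   // safe: the task has already completed
            .ToList();
    }
}
```

Because no thread is blocked while a call is waiting on the network, a 10 s call and a 1 s call cost roughly the same in thread-pool terms; a real version would probably also pass a CancellationToken so that slow calls can be abandoned.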

Task parallel library - Parallelism on single core

I am working on a WPF application.
In one screen/view I have to make 6 calls to a WCF service. None of those calls are related, in the sense that they don't share data and they don't depend on each other. I am planning to use the TPL and make these 6 WCF service calls as 6 tasks. The application might be deployed on either a single-core or a multi-core machine.
I am being told that using the TPL on a single-core machine would actually increase the time taken for the tasks to complete, because of the overhead placed on the CPU scheduler to time-slice the different tasks. Is this true? If yes, should I still continue with my design, or should I look at alternatives?
If I have to look at alternatives, what are those alternatives? :)
When doing something CPU intensive, you would be adding overhead by running parallel threads on a single core machine.
In your case the tasks are not CPU intensive, they are waiting for a service call to respond, so you can very well run parallel threads on a single core machine.
Depending on how the server handles the calls, there might not be any time increase anyway. If the calls are queued on the server, it will take about the same time to run all calls anyway. In that case it would be better to run the calls in sequence, just because it's simpler.
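To make the "waiting, not computing" point concrete, here is a small sketch in which six simulated service calls overlap their waits. FakeServiceCallAsync is a hypothetical stand-in that uses Task.Delay instead of a real WCF call, so the numbers only illustrate the principle.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ScreenLoader
{
    // Hypothetical stand-in for one WCF call: mostly waiting, almost no CPU.
    public static async Task<string> FakeServiceCallAsync(int id, int latencyMs)
    {
        await Task.Delay(latencyMs);
        return "result " + id;
    }

    // Starts all six calls, then waits for them together. The waits overlap,
    // so the total time is close to one latency rather than six, and this
    // holds on a single core because no CPU is used while waiting.
    public static Task<string[]> LoadAllAsync(int latencyMs)
    {
        var calls = Enumerable.Range(1, 6).Select(id => FakeServiceCallAsync(id, latencyMs));
        return Task.WhenAll(calls);
    }
}
```

If the server handles the six requests one at a time, the overlap disappears on the server side and the total approaches the sequential time, as described above.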
Your best bet is to profile on both multi-core and single-core. Most BIOSes let you set the number of active cores, so it shouldn't be a big problem. You can do some mock testing to find out if it will work for you.
Obviously task switching has some overhead, but as long as each task's running time is much longer than its setup time, you won't notice it.
There are many ways to implement multi-tasking behavior, and if you do not know which is best, then chances are you need to actually write some test cases and do some profiling. This is not difficult to do. If you are simply trying to use multi-core systems, it is generally quite easy with the latest version of .NET, and you can even set things up for multi-core but revert to single-core by using appropriate constructs.
The async/await pattern, for example, can easily be run synchronously, either by using #if or by removing all await keywords (with a search-and-replace tool). Parallel.For loops are easily converted to normal for loops, either directly or by setting MaxDegreeOfParallelism to 1. Tasks can easily be run synchronously.
If you would like to make it more transparent you could use some pre-processing scripting like T4.
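As an illustration of the MaxDegreeOfParallelism switch, the sketch below runs the same loop either one iteration at a time or with whatever parallelism the TPL chooses. Switchable and SumOfSquares are hypothetical names for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Switchable
{
    // maxDegree = 1 makes Parallel.For process one iteration at a time;
    // maxDegree = -1 lets the TPL decide how many workers to use.
    public static long SumOfSquares(int count, int maxDegree)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };
        long total = 0;

        Parallel.For(0, count, options,
            () => 0L,                                          // per-worker subtotal
            (i, state, subtotal) => subtotal + (long)i * i,    // add this iteration
            subtotal => Interlocked.Add(ref total, subtotal)); // merge subtotals safely

        return total;
    }
}
```

The degree could come from configuration or from Environment.ProcessorCount, so the same build can be profiled in both modes.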
In general, running multiple threads on a single core will be slower, because of the context switches between the threads.
I think the following diagram explains the difference:
The diagram shows 4 threads running on a single core, first with multi-tasking and then sequentially.
You can see that with multi-tasking, all threads finish at a later time than with sequential execution.
In your specific case it probably won't be the same, and I think @Guffa is right in his answer, since it involves WCF calls.

C# multi threading query

I am trying to write a program in C# that will connect to around 400 computers and retrieve some information, lets say it retrieves the list of web services running on each computer.
I am assuming I need a well-threaded application to be able to retrieve info from such a huge number of servers really quickly. I am pretty blank on how to start working on this; can you give me a head start on how to begin?
Thanks!
I see no reason why you should use threading in your main logic. Use asynchronous APIs and schedule their callback to the main thread. That way you get the benefits of asynchrony, but without most of the difficulty related to threading.
You'll only need multithreading in your logic code if the work you need to do on the data is that expensive. And even then you can usually get away with parallelizing using side-effect-free functions.
Take a look at the Task Parallel Library.
Specifically, Data Parallelism.
You could also use PLINQ if you wanted.
You should also run the threads in parallel on a multi-core CPU to improve performance.
My favourite references on the topic are given below -
http://www.albahari.com/threading/
http://www.codeproject.com/KB/Parallel_Programming/NET4ParallelIntro.aspx
Where and how do you get the list of those 400 servers to query?
How often do you need to do this?
You could use a Windows service or a scheduled task that invokes your software. In it, you could loop over the server list and start a call to each server on a different thread using a thread queue/pool; there is a maximum, though, so you won't start 400 threads all at once anyway.
Describe your solution a bit better and we'll see what you can do :)
Take a look at this library: Task Parallel Library. You can make efficient use of your system resources and manage your work easier than managing your threads directly.
There might be a considerable impact on the server side when you start querying all 400 computers, but you can take a look at Parallel LINQ (PLINQ), where you can limit the degree of parallelism.
You can also use thread pooling for this, e.g. via the Task class.
Creating threads manually may not be a good idea, as they are not highly reusable and take quite a lot of memory/CPU to create.
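A minimal sketch of the PLINQ option with a capped degree of parallelism. FleetQuery and the probe delegate are hypothetical: probe stands in for whatever blocking call retrieves the information from one machine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FleetQuery
{
    // Calls `probe` once per host, with at most `maxParallel` probes at a time,
    // and returns the answers keyed by host name (host names must be unique).
    public static Dictionary<string, string> QueryAll(
        IEnumerable<string> hosts, Func<string, string> probe, int maxParallel)
    {
        return hosts
            .AsParallel()
            .WithDegreeOfParallelism(maxParallel)   // upper bound on concurrent probes
            .Select(host => new KeyValuePair<string, string>(host, probe(host)))
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

Each probe blocks a thread-pool thread while it waits on the network, so for 400 machines an asynchronous API would likely scale better; PLINQ is simply the smallest change from a plain loop.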
