I am looking at ways to implement coroutines (user-scheduled threads) in C#. When using C++ I used fibers, but I see from searching around that fibers do not exist in C#. I would like to get similar functionality.
Is there a "right" way to implement coroutines in C#?
I have thought of implementing this using threads that acquire a single execution mutex, plus one scheduler thread that releases this mutex to each coroutine in turn. But this seems very costly (it forces a context switch between every coroutine).
I have also seen the yield iterator functionality, but as I understand it, you can't yield within a nested function (only in the original IEnumerator method). So this does me little good.
I believe that with .NET 4.5 / C# 5, the async/await pattern should meet your needs.
async Task<string> DownloadDocument(Uri uri)
{
    var webClient = new WebClient();
    var doc = await webClient.DownloadStringTaskAsync(uri);
    // do some more async work
    return doc;
}
I suggest looking at http://channel9.msdn.com/Events/TechEd/Australia/Tech-Ed-Australia-2011/DEV411 for more info. It is a great presentation.
Also http://msdn.microsoft.com/en-us/vstudio/gg316360 has some great information.
If you are using an older version of .NET, there is an Async CTP available for older .NET versions with a go-live license, so you can use it in production environments. Here is a link to the CTP: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=9983
If you don't like either of the above options, I believe you could follow the async iterator pattern outlined here: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=9983
Edit: You can now use these: Is there a fiber api in .net?
I believe that you should look at the Reactive Extensions for .NET. For example, coroutines can be simulated using iterators and the yield statement.
However, you may want to read this SO question too.
Here is an example of using threads to implement coroutines:
So I cheat. I use threads, but I only let one of them run at a time. When I create a coroutine, I create a thread, and then do some handshaking that ends with a call to Monitor.Wait(), which blocks the coroutine thread — it won’t run anymore until it’s unblocked. When it’s time to call into the coroutine, I do a handoff that ends with the calling thread blocked, and the coroutine thread runnable. Same kind of handoff on the way back.
Those handoffs are kind of expensive, compared with other implementations. If you need speed, you’ll want to write your own state machine, and avoid all this context switching. (Or you’ll want to use a fiber-aware runtime — switching fibers is pretty cheap.) But if you want expressive code, I think coroutines hold some promise.
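A rough sketch of that handshake using Monitor.Wait/Monitor.Pulse (illustrative only; all names here are invented, and this is not the quoted author's actual code):

```csharp
using System;
using System.Threading;

// One coroutine backed by one thread; only one of the two threads
// (caller or coroutine) is ever runnable at a time.
class Coroutine
{
    private readonly object _gate = new object();
    private bool _coroutineTurn;   // whose turn it is to run
    private bool _finished;

    public Coroutine(Action<Coroutine> body)
    {
        var thread = new Thread(() =>
        {
            lock (_gate)
            {
                // Block until the caller first resumes us.
                while (!_coroutineTurn) Monitor.Wait(_gate);
            }
            body(this);
            lock (_gate)
            {
                _finished = true;
                _coroutineTurn = false;
                Monitor.Pulse(_gate);  // final handoff back to the caller
            }
        });
        thread.IsBackground = true;
        thread.Start();
    }

    // Called from inside the coroutine body to yield back to the caller.
    public void Yield()
    {
        lock (_gate)
        {
            _coroutineTurn = false;
            Monitor.Pulse(_gate);
            while (!_coroutineTurn) Monitor.Wait(_gate);
        }
    }

    // Called by the scheduling thread; returns false once the coroutine is done.
    public bool Resume()
    {
        lock (_gate)
        {
            if (_finished) return false;
            _coroutineTurn = true;
            Monitor.Pulse(_gate);
            while (_coroutineTurn && !_finished) Monitor.Wait(_gate);
            return !_finished;
        }
    }
}
```

The caller drives it with a plain loop such as `while (co.Resume()) { ... }`; each `Resume`/`Yield` pair is exactly the pair of expensive handoffs the quote describes.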
It's 2020, and a lot has evolved in C#. I've published an article on this topic, Asynchronous coroutines with C# 8.0 and IAsyncEnumerable:
In the C# world, they (coroutines) have been popularized by the Unity game development platform, and Unity uses IEnumerator-style methods and yield return for that.
Prior to C# 8, it wasn't possible to combine await and yield return within the same method, making it difficult to use asynchrony inside coroutines. Now, with the compiler's support for IAsyncEnumerable, it can be done naturally.
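A minimal sketch of such an async iterator (the delay stands in for real asynchronous work):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Program
{
    // C# 8.0: await and yield return in the same method.
    public static async IAsyncEnumerable<int> CountSlowlyAsync(int n)
    {
        for (int i = 0; i < n; i++)
        {
            await Task.Delay(10);  // some asynchronous work
            yield return i;        // suspend and hand a value to the consumer
        }
    }

    static async Task Main()
    {
        await foreach (var i in CountSlowlyAsync(3))
            Console.WriteLine(i);  // prints 0, 1, 2
    }
}
```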
Channels: the missing piece
Channels are the piece C# has been missing relative to golang. Channels are actually what make golang tick: they are its core concurrency tool. If you're using something like a coroutine in C# but coordinating it with thread synchronisation primitives (semaphore, monitor, interlocked, etc.), then it's not the same thing.
Almost the same - Pipelines, but baked in
8 years later, and .NET Standard (.NET Framework / .NET Core) has support for Pipelines [https://blogs.msdn.microsoft.com/dotnet/2018/07/09/system-io-pipelines-high-performance-io-in-net/]. Pipelines are preferred for network processing. ASP.NET Core now ranks in the top 11 for plaintext request throughput [https://www.techempower.com/benchmarks/#section=data-r16&hw=ph&test=plaintext].
Microsoft advises the following best practice for interfacing with network traffic: the awaited network bytes (from completion-port IO) should be written into a pipeline, and another thread should read data from the pipeline asynchronously. Many pipelines can be chained in series for the various stages of processing on the byte stream. The pipeline has a reader cursor and a writer cursor, and the virtual buffer size causes backpressure on the writer to reduce unnecessary use of memory for buffering, typically slowing down network traffic to match the consumer.
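A minimal sketch of that writer/reader split with System.IO.Pipelines (the string here stands in for bytes arriving from a socket):

```csharp
using System;
using System.IO.Pipelines;   // NuGet package: System.IO.Pipelines
using System.Text;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        var pipe = new Pipe();

        // Writer side: in a real server this would be fed from socket reads.
        Task writing = Task.Run(async () =>
        {
            byte[] bytes = Encoding.ASCII.GetBytes("hello");
            await pipe.Writer.WriteAsync(bytes);
            pipe.Writer.Complete();            // no more data
        });

        // Reader side: another thread consumes the bytes asynchronously.
        ReadResult result = await pipe.Reader.ReadAsync();
        Console.WriteLine(result.Buffer.Length);   // bytes available (5 here)
        pipe.Reader.AdvanceTo(result.Buffer.End);  // advance the reader cursor
        pipe.Reader.Complete();
        await writing;
    }
}
```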
There are some critical differences between Pipelines and Go channels; they aren't the same. Pipelines are about passing mutable bytes, whereas golang channels are for signalling with memory references (including pointers). Finally, there is no equivalent of select for Pipelines.
(Pipelines use Spans [https://adamsitnik.com/Span/], which have been around for a little while but are now deeply optimised in .NET Core. Spans improve performance significantly; the .NET Core support improves it further, but only incrementally, so using the .NET Framework is perfectly fine.)
So Pipelines are a built-in standard that should help replace golang channels in .NET, but they are not the same, and there will be plenty of cases where Pipelines are not the answer.
Direct Implementations of Golang Channel
https://codereview.stackexchange.com/questions/32500/golang-channel-in-c - this is some custom code, and it is not complete.
You would need to be careful (as with golang) that messages passed via a .NET channel indicate a change of ownership over an object. This is something only the programmer can track and check, and if you get it wrong, you'll have two or more threads accessing data without synchronisation.
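For what it's worth, .NET now also ships System.Threading.Channels, which is much closer in spirit to a golang channel than Pipelines are. A minimal sketch:

```csharp
using System;
using System.Threading.Channels;  // NuGet package: System.Threading.Channels
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // A bounded channel applies backpressure, much like a buffered Go channel.
        var channel = Channel.CreateBounded<string>(capacity: 10);

        Task producer = Task.Run(async () =>
        {
            await channel.Writer.WriteAsync("work item");
            channel.Writer.Complete();  // no more messages
        });

        // Consumer: once a message is read, treat ownership as transferred.
        await foreach (string item in channel.Reader.ReadAllAsync())
            Console.WriteLine(item);

        await producer;
    }
}
```

The ownership caveat above still applies: nothing stops both sides from touching the object after it has been sent.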
You may be interested in this library, which hides the usage of coroutines. For example, to read a file:
// Prepare the file stream
FileStream sourceStream = File.Open("myFile.bin", FileMode.OpenOrCreate);
sourceStream.Seek(0, SeekOrigin.End);
// Invoke the task ('result' is a byte[] produced earlier)
yield return InvokeTaskAndWait(sourceStream.WriteAsync(result, 0, result.Length));
// Close the stream
sourceStream.Close();
This library uses one thread to run all coroutines and allows calling into tasks for the truly asynchronous operations. For example, to call another method as a coroutine (i.e. yielding on its return):
// Given the signature
// IEnumerable<string> ReadText(string path);
var container = new Container();
yield return InvokeLocalAndWait(() => _globalPathProvider.ReadText(path), container);
var data = container.RawData as string;
Related
Is it possible to define and call a method asynchronously in the same thread as the caller?
Suppose I have just one core and I don't want the management overhead of something like 100 threads.
Edit
The reason I ask is the node.js model of doing things: everything on one thread, never blocking on anything, which has proved to be very efficient, and which made me wonder whether the same thing is possible in C# (I couldn't achieve it myself).
Edit 2: Well, as noted in the comments, node isn't single-threaded after all (although a simple load test shows that it uses just one core...), but I think what makes it so efficient is the implicit requirement to write only non-blocking code. That is possible in C#, just not required. :) Anyway, thanks everyone...
More info in this SO post, and even more in this one.
It's not really clear exactly what context you're talking about, but the async/await feature of C# 5 already helps to support this sort of thing. The difference is that whereas in node.js everything is forced to be single threaded by default (as I understand it; quite possibly incorrectly given the links in the comments), server-side applications in .NET using asynchrony will use very few threads without limiting themselves to that. If everything really can be served by a single thread, it may well be - if you never have more than one thing to do in terms of physical processing, then that's fine.
But what if one request comes in while another is doing a bit of work? Maybe it's doing just a small amount of encryption, or something like that. Do you really want to make the second request wait for the first one to complete? If you do, you could model that in .NET fairly easily with a TaskScheduler associated with a single-thread thread-pool... but more commonly, you'd use the thread-pool built into .NET, which will work efficiently while still allowing concurrency.
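For what it's worth, one built-in way to get that single-threaded scheduling (a sketch using ConcurrentExclusiveSchedulerPair, which ships with .NET 4.5, rather than a hand-rolled single-thread pool):

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // The exclusive scheduler runs at most one task at a time,
        // so queued work items never execute concurrently.
        var pair = new ConcurrentExclusiveSchedulerPair();
        var factory = new TaskFactory(pair.ExclusiveScheduler);

        Task a = factory.StartNew(() => Console.WriteLine("request 1"));
        Task b = factory.StartNew(() => Console.WriteLine("request 2"));
        Task.WaitAll(a, b);
    }
}
```

With this, the second "request" simply queues behind the first instead of running in parallel, which is exactly the trade-off described above.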
First off, you should make sure you're using .NET 4.5 - it has far more asynchronous APIs (e.g. for database and file access) than earlier versions of .NET. You want to use APIs which conform to the Task-based Asynchronous Pattern (TAP). Then, using async/await, you can write your server-side code so that it reads a bit like synchronous code, but actually executes asynchronously. So for example, you might have:
Guid userId = await FetchUserIdAsync();
IEnumerable<Message> messages = await FetchMessagesAsync(userId);
Even though you need to "wait" while each of these operations takes place, you do so without blocking a thread. The C# compiler takes care of building a state machine for you. There's no need to explicitly write the callbacks which frequently turn into spaghetti code - the compiler does it all.
You need to make sure that whatever web/web-service framework you use supports asynchrony, and it should just manage the rest, using as few threads as it can get away with.
Follow the two links above to find out more about the asynchrony support in C# 5 - it's a huge topic, and I'm sure you'll have more questions afterwards, but when you've read a bit more you should be able to ask very specific questions instead of the current broad one.
As part of trying to learn C#, I'm writing a small app that goes through a list of proxies. For each proxy it creates an HttpWebRequest to a proxytest.php, which prints generic data about the given proxy (or doesn't, in which case the proxy is discarded).
Clearly the web-request code needs to run on a separate thread - especially since I'm planning on going through rather large lists. But even on a separate thread, going through 5,000 proxies will take forever, so I think this means I should create multiple threads (correct me if I'm wrong).
I looked through MSDN and random threading tutorials, and there are several different classes available. What's the difference between Dispatcher, BackgroundWorker, and Parallel? I was given this snippet:
Parallel.ForEach(URLsList, new ParallelOptions() { MaxDegreeOfParallelism = 50 }, (m, i, j) =>
{
    string[] UP = m.Split('|');
    string User = UP[0];
    string Pass = UP[1];
    // make call here
});
I'm not really sure how it's different from something like starting 5 separate background workers.
So what are the differences between those three and what would be a good (easy) approach to this problem?
Thanks
The Dispatcher is an object that models the message loop of WPF applications. If that doesn't mean anything to you then forget you ever heard of it.
BackgroundWorker is a convenience class over a thread that is part of the managed thread pool. It exists to provide some commonly requested functionality over manually assigning work to the thread pool with ThreadPool.QueueUserWorkItem.
The Thread class is very much like using the managed thread pool, with the difference being that you are in absolute control of the thread's lifetime (on the flip side, it's worse than using the thread pool if you intend to launch lots of short-lived tasks).
The Task Parallel Library (TPL) (i.e. using Parallel.ForEach) would indeed be the best approach, since it not only takes care of assigning work units to a number of threads (from the managed thread pool) but it will also automatically divide the work units among those threads.
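For example, the proxy-checking loop from the question could be written with a bounded degree of parallelism like this (a sketch; `TestProxy` and the proxy list are placeholders, not a real API):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var proxies = new List<string> { "1.2.3.4:8080", "5.6.7.8:3128" };
        var working = new List<string>();
        object gate = new object();

        Parallel.ForEach(
            proxies,
            new ParallelOptions { MaxDegreeOfParallelism = 50 },
            proxy =>
            {
                if (TestProxy(proxy))                // placeholder check
                    lock (gate) working.Add(proxy);  // List<T> isn't thread-safe
            });

        Console.WriteLine(working.Count);
    }

    // Hypothetical: real code would issue an HttpWebRequest through the proxy.
    static bool TestProxy(string proxy)
    {
        return true;
    }
}
```

The TPL decides how many threads to actually use (up to the cap), which is the work-division the paragraph above describes.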
I would say use the Task Parallel Library. It is a new library that wraps all the manual threading code you would otherwise have to write.
The Task Parallel Library (TPL) is a collection of new classes specifically designed to make it easier and more efficient to execute very fine-grained parallel workloads on modern hardware. TPL has been available separately as a CTP for some time now, and was included in the Visual Studio 2010 CTP, but in those releases it was built on its own dedicated work scheduler. For Beta 1 of CLR 4.0, the default scheduler for TPL will be the CLR thread pool, which allows TPL-style workloads to “play nice” with existing, QUWI-based code, and allows us to reuse much of the underlying technology in the thread pool - in particular, the thread-injection algorithm, which we will discuss in a future post.
from
http://blogs.msdn.com/b/ericeil/archive/2009/04/23/clr-4-0-threadpool-improvements-part-1.aspx
I found working with this new .NET 4 library really easy. This blog post shows the old BackgroundWorker way of doing things and the new Task way:
http://nitoprograms.blogspot.com/2010/06/reporting-progress-from-tasks.html
I have a configuration XML file which is used by a batch module in my .NET 3.5 Windows application.
Each node in the XML is mapped to a .NET class. Each class does processing such as mathematical calculations, making DB calls, etc.
The batch module loads the XML, identifies the class associated with each node, and then processes it.
Now, we have the following requirements:
1. Let's say there are 3 classes [3 nodes in the XML]: A, B, and C.
Class A can be dependent on class B, i.e. we need to execute class B before processing class A. Class C's processing should be done on a separate thread.
2. If a thread is running, we should be able to cancel that thread in the middle of its processing.
We need to implement this whole module using .NET multi-threading.
My questions are:
1. Is it possible to implement requirement #1 above? If yes, how?
2. Given these requirements, is .NET 3.5 a good idea, or would .NET 4.0 be a better choice? I would like to know the advantages and disadvantages, please.
Thanks for reading.
You'd be better off using the Task Parallel Library (TPL) in .NET 4.0. It'll give you lots of nice features for abstracting the actual business of creating threads in the thread pool. You could use the parallel tasks pattern to create a Task for each of the jobs defined in the XML and the TPL will handle the scheduling of those tasks regardless of the hardware. In other words if you move to a machine with more cores the TPL will schedule more threads.
1) The TPL supports the notion of continuation tasks. You can use these to enforce task ordering and pass the result of one Task or future from the antecedent to the continuation. This is the futures pattern.
// The antecedent task. Can also be created with Task.Factory.StartNew.
Task<DayOfWeek> taskA = new Task<DayOfWeek>(() => DateTime.Today.DayOfWeek);

// The continuation. Its delegate takes the antecedent task
// as an argument and can return a different type.
Task<string> continuation = taskA.ContinueWith((antecedent) =>
{
    return String.Format("Today is {0}.", antecedent.Result);
});

// Start the antecedent.
taskA.Start();

// Use the continuation's result.
Console.WriteLine(continuation.Result);
2) Thread cancellation is supported by the TPL but it is cooperative cancellation. In other words the code running in the Task must periodically check to see if it has been cancelled and shut down cleanly. TPL has good support for cancellation. Note that if you were to use threads directly you run into the same limitations. Thread.Abort is not a viable solution in almost all cases.
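A minimal sketch of that cooperative cancellation (the loop body is a stand-in for real work):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task worker = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                // Cooperative: the task periodically checks the token
                // and shuts down cleanly by throwing OperationCanceledException.
                token.ThrowIfCancellationRequested();
                Thread.Sleep(10);  // simulated unit of work
            }
        }, token);

        cts.Cancel();
        try { worker.Wait(); }
        catch (AggregateException) { Console.WriteLine("cancelled"); }
    }
}
```

Because the same token was passed to StartNew, the task ends up in the Canceled state rather than Faulted, which is the clean shutdown the TPL expects.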
While you're at it you might want to look at a dependency injection container like Unity for generating configured objects from your XML configuration.
Answer to comment (below)
Jimmy: I'm not sure I understand holtavolt's comment. What is true is that using parallelism only pays off if the amount of work being done is significant; otherwise your program may spend more time managing parallelism than doing useful work. The actual datasets don't have to be large, but the work needs to be significant.
For example, if your inputs were large numbers and we were checking whether they were prime, the dataset would be very small but parallelism would still pay off, because the computation is costly for each number or block of numbers. Conversely, you might have a very large dataset of numbers that you were searching for evenness. This involves a very large set of data, but the calculation is still very cheap, and a parallel implementation might still not be more efficient.
The canonical example is using Parallel.For instead of for to iterate over a dataset (large or small) but only perform a simple numerical operation like addition. In this case the expected performance improvement from utilizing multiple cores is outweighed by the overhead of creating the parallel tasks and scheduling and managing them.
Of course it can be done.
Assuming you're new to multithreading and you want one thread per class, I would look into the BackgroundWorker class, and basically use it in the different classes to do the processing.
Which version of .NET to use also depends on whether this is going to run on client machines as well. But I would go for .NET 4, simply because it's newest, and if you want to split up a single task into multiple threads it has built-in classes for this.
Given your use case, the Thread and BackgroundWorker classes should be sufficient. As you'll discover in reading the MSDN information regarding these classes, you will want to support cancellation as your means of shutting down a running thread before it's complete. ("Killing" a thread is something to be avoided if at all possible.)
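For instance, BackgroundWorker's cooperative cancellation looks roughly like this (a sketch; the sleep stands in for real work):

```csharp
using System;
using System.ComponentModel;
using System.Threading;

class Program
{
    static void Main()
    {
        var worker = new BackgroundWorker { WorkerSupportsCancellation = true };

        worker.DoWork += (sender, e) =>
        {
            var bw = (BackgroundWorker)sender;
            for (int i = 0; i < 1000; i++)
            {
                if (bw.CancellationPending)  // poll for the cancel request
                {
                    e.Cancel = true;         // report a clean, cooperative stop
                    return;
                }
                Thread.Sleep(10);            // simulated unit of work
            }
        };

        worker.RunWorkerCompleted += (sender, e) =>
            Console.WriteLine(e.Cancelled ? "cancelled" : "completed");

        worker.RunWorkerAsync();
        worker.CancelAsync();  // request cancellation from the caller
        Thread.Sleep(500);     // crude wait for the completion event in this demo
    }
}
```

The key point is that the worker exits on its own when it notices CancellationPending; nothing ever forcibly kills the thread.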
.NET 4.0 has added some advanced items in the Task Parallel Library (TPL), where Tasks are defined and managed with some smarter affinity for their most recently used core (to provide better cache behavior, etc.). However, this seems like overkill for your use case, unless you expect to be processing very large datasets. See these sites for more information:
http://msdn.microsoft.com/en-us/library/dd460717.aspx
http://archive.msdn.microsoft.com/ParExtSamples
I'm in the process of writing a library that deals with long-running tasks like file downloading and processing large amounts of text. I want to multi-thread this library so that these tasks won't freeze up the applications that use them.
Do you have any advice for doing so in a structured manner, or specific classes I should use/avoid? I was thinking of using the IAsyncResult interface: http://msdn.microsoft.com/en-us/library/system.iasyncresult.aspx, or perhaps some BackgroundWorkers.
so that these tasks won't freeze up the applications that use them.
If this is your goal, you should look into the standard asynchronous programming patterns in the framework.
If your library is targeting .NET 4, have it return Task and Task<T>, as this will ease transition into the async support coming in the next release of C# and VB.NET. This also has the very nice addition of allowing synchronous usage with no extra work on your part, since the user can always just do:
var result = foo.BarAsync().Result; // Getting Task<T>.Result blocks, effectively making this synchronous
If you're targeting .NET 3.5 or earlier, you should consider using the Event-based asynchronous pattern, as it is used in more of the current APIs than the APM.
I have created a renderer in Silverlight/C#. Currently I'm using System.Threading.ThreadPool to schedule rendering of tiles in parallel. This works well right now, but I would like to limit the number of threads used.
Since this runs on Silverlight there are a couple of restrictions:
If I call ThreadPool.SetMaxThreads the application crashes as documented.
There is no Task Parallel Library
I see a few options:
Find an OSS/third party Thread Pool
Implement my own Thread Pool (I'd rather not)
Use Rx (which I do in other places)
Are there any tested alternative Thread Pools that work with Silverlight out there?
Or can anyone come up with an Rx expression that spawns a limited number of threads and queues work on them?
If you're using Rx, check out:
https://github.com/xpaulbettsx/ReactiveUI/blob/master/ReactiveUI/ObservableAsyncMRUCache.cs
(Copying this one file into your app should be pretty easy, just nuke the this.Log() lines and the IEnableLogger interface)
Using it is pretty easy, just change your SelectMany to CachedSelectMany:
someArray.ToObservable()
.CachedSelectMany(webService)
.Subscribe(x => /* do stuff */);
If you use Rx, then it seems like you could quite easily write your own implementation of IScheduler. This could just apply a simple semaphore and then pass the work on to the ThreadPool. With this approach you get to leverage the ThreadPool, and because you are coding against an interface you will also have good seams for testing.
Furthermore, as you have written this yourself, you could actually use a small-ish (<10) set of threads that you manage yourself (instead of the thread pool), so you can avoid ThreadPool starvation.
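The core of that idea, sketched without the full IScheduler surface (the class and member names here are invented; a real Rx scheduler would implement the remaining Schedule overloads on top of this):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Limits the ThreadPool to at most N concurrently running work items
// by draining a queue from a bounded number of pool threads.
class LimitedConcurrencyScheduler
{
    private readonly Queue<Action> _queue = new Queue<Action>();
    private readonly int _max;
    private int _running;

    public LimitedConcurrencyScheduler(int maxConcurrency)
    {
        _max = maxConcurrency;
    }

    public void Schedule(Action work)
    {
        lock (_queue)
        {
            _queue.Enqueue(work);
            if (_running < _max)
            {
                _running++;
                ThreadPool.QueueUserWorkItem(_ => Drain());
            }
        }
    }

    private void Drain()
    {
        while (true)
        {
            Action work;
            lock (_queue)
            {
                if (_queue.Count == 0) { _running--; return; }
                work = _queue.Dequeue();
            }
            work();  // run outside the lock
        }
    }
}
```

Wrapping `Schedule` behind IScheduler then gives Rx a drop-in scheduler with a hard concurrency cap, without ever blocking a pool thread on a semaphore.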
Check out Ami Bar's SmartThreadPool. It's got a ton of features missing from the default .NET thread pool, allows you to set a MaxThreads property per thread-pool instance, and supports Silverlight.