This might be quite complex, so apologies for the wordy question.
1) I'm going to redesign my application to work with multiple threads (BackgroundWorkers, to be precise). I will probably have 5 or 6 BGWs for a particular GUI. My first issue: I have one method call that a GUI needs to get its "core" data. Various parts of this data are then used in other calls, which in turn produce data displayed on the same page as the core data. How can I process this with several background workers so that BackgroundWorker1 fetches the core data, BackgroundWorker2 uses a particular item of the core data to get more data, BackgroundWorker3 uses some other core data, and so on, leaving my GUI and main thread unblocked?
2) As I said, the GUI has to get a set of core data first and then make a fair few other database calls to get the rest of the important data. From what I have read, I should fetch this data outside the GUI constructor so there aren't such big demands when the GUI is created. From a design standpoint, how should I structure my GUI so that it has access to data that just needs to be displayed on creation, as opposed to data that must be fetched and then displayed?
I hope these aren't too wordy questions. I can see already that a lot of this comes down to program design, which as a novice I find quite difficult (in my opinion, of course). Hopefully someone can advise me on what they would do in this situation.
Thanks
This sounds like a good task for a work queue. The main idea is that you add a work item to the queue, and the work item has an associated function that does work on its data. The work is typically distributed across however many threads you specify.
Several implementations of this exist; just search for one.
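A minimal sketch of the idea, using the BCL's BlockingCollection<T>; the worker count and the squaring "work" are invented for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class WorkQueueSketch
{
    static void Main()
    {
        // Work items are plain delegates; real implementations usually
        // carry a data payload and a completion callback as well.
        var queue = new BlockingCollection<Action>();
        var results = new ConcurrentBag<int>();

        // Spin up a fixed number of consumer threads.
        Task[] workers = new Task[3];
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = Task.Factory.StartNew(() =>
            {
                // Blocks until an item is available; exits when the
                // queue is marked complete and drained.
                foreach (Action work in queue.GetConsumingEnumerable())
                    work();
            }, TaskCreationOptions.LongRunning);
        }

        // Producer side: enqueue work items.
        for (int n = 1; n <= 5; n++)
        {
            int local = n;                       // capture a copy, not the loop variable
            queue.Add(() => results.Add(local * local));
        }

        queue.CompleteAdding();                  // no more work; consumers drain and exit
        Task.WaitAll(workers);

        Console.WriteLine(results.Count);        // 5
    }
}
```

Because the GUI thread only ever adds items to the queue, it never blocks on the work itself.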
Have you had a look at the .NET 4 Task Parallel Library?
Check out the area titled "Creating Task Continuations", about halfway down the page.
This is an example from the linked site:
Task<byte[]> getData = new Task<byte[]>(() => GetFileData());
Task<double[]> analyzeData = getData.ContinueWith(x => Analyze(x.Result));
Task<string> reportData = analyzeData.ContinueWith(y => Summarize(y.Result));
getData.Start();
System.IO.File.WriteAllText(@"C:\reportFolder\report.txt", reportData.Result);
//or...
Task<string> reportData2 = Task.Factory.StartNew(() => GetFileData())
    .ContinueWith((x) => Analyze(x.Result))
    .ContinueWith((y) => Summarize(y.Result));
System.IO.File.WriteAllText(@"C:\reportFolder\report.txt", reportData2.Result);
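One wrinkle the example above does not show: continuations that touch controls must run on the UI thread. A hedged WinForms sketch of how TaskScheduler.FromCurrentSynchronizationContext can be used for that (CoreData, GetCoreData, GetDetails, DisplayCore and detailsGrid are placeholders for your own types, calls and controls):

```csharp
// Hypothetical WinForms sketch - runs the database calls on worker
// threads and marshals only the display steps back to the UI thread.
private void MainForm_Load(object sender, EventArgs e)
{
    var ui = TaskScheduler.FromCurrentSynchronizationContext();

    Task<CoreData> coreTask = Task.Factory.StartNew(() => GetCoreData());

    // Each dependent call starts as soon as the core data is available,
    // leaving the UI thread free the whole time.
    coreTask.ContinueWith(t => DisplayCore(t.Result), ui);

    coreTask.ContinueWith(t => GetDetails(t.Result.CustomerId))
            .ContinueWith(t => detailsGrid.DataSource = t.Result, ui);
}
```

This answers the "backgroundworker1 gets core data, backgroundworker2 uses part of it" chain from the question directly, with one continuation per dependent call.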
I've read myself blue and am hoping there's a simple answer.
I have a web API that handles telemetry from various apps "in the wild". In one of my controllers, I want to receive a request to log an error to my central monitoring database and return a response as near immediately as possible (I have no real way of knowing how critical performance might be on the caller's end, and there's already a significant hit for making the initial web service request).
Essentially, what I'm looking for is something like this:
public IHttpActionResult Submit() {
    try {
        var model = MyModel.Parse(Request.Content.ReadAsStringAsync().Result);
        // ok, I've got content, now log it but don't wait
        // around to see the results of the logging, just return
        // an Accepted result and begone
        repository.SaveSubmission(model); // <-- fire and forget, don't wait
        return Accepted();
    } catch (Exception) {
        return InternalServerError();
    }
}
It seems like it ought to be straightforward, but apparently not. I've read any number of various posts indicating everything from yup, just use Task.Run() to this is a terrible mistake and you can never achieve what you want!
The problem in my scenario appears to be that this work could be terminated midway because it runs in the ASP.NET worker process, regardless of the mire of different ways to invoke async methods (I've spent the last two hours or so reading various SO questions and Stephen Cleary blogs... whew).
If the underlying issue in this case is that the method I'd 'fire and forget' is bound to the http context and subject to early termination by the ASP.NET worker process, then my question becomes...
Is there some way to remove this method/task/process from that ASP.NET context? Once that request is parsed into the model, I myself have no more specific need to be operating within the http context. If there's an easy way I can move it out of there (and thus letting the thing run barring a website/apppool restart), that'd be great.
For the sake of due diligence, let's say I get rid of the repository context in the controller and delegate it to some other context:
public IHttpActionResult Submit() {
    try {
        var model = MyModel.Parse(Request.Content.ReadAsStringAsync().Result);
        SomeStaticClass.SaveSubmission(model); // <-- fire and forget, don't wait
        return Accepted();
    } catch (Exception) {
        return InternalServerError();
    }
}
... then the only thing that has to "cross lines" is the model itself - no other code logic dependencies.
Granted, I'm probably making a mountain out of a molehill - the insert into the database won't take more than a fraction of a second anyway... it seems like it should be easy, though, and I'm apparently too stubborn to settle for "good enough" tonight.
Ok, found a few more that were actually helpful to my scenario. The basic gist of it seems to be don't do it.
In order to do this correctly, one needs to submit this to a separate component in a distributed architecture (e.g., message or service queue of some sort where it can be picked up separately for processing). This appears to be the only way to break out of the ASP.NET worker process entirely.
One SO comment (on another SO post) led me to two articles I hadn't seen before posting: one by Stephen Cleary and another by Phil Haack.
SO post of interest: How to queue background tasks in ASP.NET Web API
Stephen's Fire and Forget on ASP.NET blog post (excellent, wish I had found this first): http://blog.stephencleary.com/2014/06/fire-and-forget-on-asp-net.html
And Phil's article: http://haacked.com/archive/2011/10/16/the-dangers-of-implementing-recurring-background-tasks-in-asp-net.aspx/
The following project by Stephen may be of interest as well: https://github.com/StephenCleary/AspNetBackgroundTasks
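For completeness: .NET 4.5.2 later added HostingEnvironment.QueueBackgroundWorkItem, which registers the work with the ASP.NET runtime so that it tries to delay app-domain shutdown until the work finishes. It is still not a hard guarantee (so the queue-based advice above stands), but for low-stakes logging it may be enough. A sketch of how the controller from the question might use it, assuming the same repository and MyModel:

```csharp
public IHttpActionResult Submit() {
    try {
        var model = MyModel.Parse(Request.Content.ReadAsStringAsync().Result);

        // Registered with the runtime, so ASP.NET will try to delay
        // recycling until this finishes (best-effort, not guaranteed).
        HostingEnvironment.QueueBackgroundWorkItem(
            ct => repository.SaveSubmission(model));

        return Accepted();
    } catch (Exception) {
        return InternalServerError();
    }
}
```

Note the work item receives only the parsed model and a CancellationToken, so nothing from the HTTP context has to "cross lines".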
I thought I'd delete my question but then figured it took me so long digging around to find my answer that maybe another question floating around SO wouldn't hurt...
(in this particular case, submitting to another service is going to take near as long as writing to the database anyway, so I'll probably forego the async processing for this api method, but at least now I know for when I actually do need to do it)
A database insert shouldn't take so long that you have to offload the processing to a background task. For starters, just writing the task to a queue (or, as you suggested, handing it off to a service) is going to take just as long, but either approach should be sub-second.
However, if time is critical for you one way to speed up your response time is to make the database write as fast as possible using some form of in-memory cache so that the slower write to physical database storage is a queued background task. High-volume sites frequently use in-memory databases that implement this kind of behaviour (I've never needed one so can't help you choose a product) but you could also code this yourself just using a per-application instance list of objects and a background loop of some form.
This is where those articles you've linked apply and it gets complicated so a pre-built implementation is almost always the best approach - check out HangFire if you want a pre-built fire-and-forget implementation.
I have made a rather complex .NET 4.0 (C#) Windows Forms application using Visual Studio 2013. The question is quite general though, and should be applicable for other versions of .NET and VS as well.
On startup the system reads a config file, parses file folders and reads file contents, reads data from a database, performs a web request, and populates a lot of controls on the main startup form.
I want to avoid having a splash screen with "waiting-hourglass", the goal is to make the application startup fast and show the main form immediately.
My solution has been to use a BackgroundWorker for some of the startup tasks, making the application visible and responsive while the data is fetched. The user can then navigate away from the startup form and start other tasks without having to wait for all the startup procedures to complete.
Is use of BackgroundWorker suitable for this?
What other methods should be considered instead of, or in addition to, BackgroundWorker to enable fast startup for an application with many startup procedures?
In my applications I use a splash screen. However, I do not show a waiting hourglass; instead, the splash screen shows a status line describing the current action, e.g. "Reading config file", "Connecting to database", "Performing web request", etc.
Of course, the application does not actually start faster, but the user does not get the feeling of a hanging program, so it appears faster.
In any case, it depends whether early access makes sense for the user. A good approach would also be to preload just the first page / form / tab before the user can see the interface (splash screen or loading bar before that).
When the first bits are loaded, you could asynchronously cache more data and only allow the user to switch pages / tabs once the caching of those components is complete (you will have to display a "still loading" message or grey out the other tabs while doing this, so as not to confuse the user).
You can also load additional data only when the user chooses to use the page / tab / feature, to avoid loading unnecessary information, but this will lead to waiting while using the application - it's up to you.
Technically, since BackgroundWorker has been largely superseded as of .NET 4.5, you should see whether the newly introduced await/async would be a more elegant solution for you (see MSDN's Asynchronous Programming with Async and Await introduction).
MSDN says:
The async-based approach to asynchronous programming is preferable to
existing approaches in almost every case. In particular, this approach
is better than BackgroundWorker for IO-bound operations because the
code is simpler and you don't have to guard against race conditions.
See the comparison thread Background Worker vs Await/Async.
See a well-commented example of BackgroundWorker code for loading GUI data if you choose to use that technique.
More advice than an answer:
Is use of BackgroundWorker suitable for this? - yes.
What other methods should be considered instead of, or in addition to, BackgroundWorker to enable fast startup for an application with many startup procedures? - consider on-demand, a.k.a. lazy, loading of data. That is, load data only when it is actually needed, rather than querying everything at once when much of it may never be used or looked at. If your current UI setup makes that impossible, consider refining the UI and rethinking whether everything should be displayed as-is. For example, use separate windows or expanders to display details, and query the data when they become visible. This not only saves time on app startup but also ensures the data you display is up to date.
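The lazy-loading idea can be sketched with the BCL's Lazy<T>; the CustomerView class and the order data below are made up for illustration:

```csharp
using System;

class CustomerView
{
    // The expensive query runs only on first access, not at startup.
    private readonly Lazy<string[]> _orderHistory =
        new Lazy<string[]>(() => LoadOrderHistory());

    static string[] LoadOrderHistory()
    {
        Console.WriteLine("querying database...");   // happens exactly once
        return new[] { "Order 1", "Order 2" };
    }

    public void ShowDetails()
    {
        // e.g. called from an expander's Expanded event
        foreach (var order in _orderHistory.Value)
            Console.WriteLine(order);
    }

    static void Main()
    {
        var view = new CustomerView();
        Console.WriteLine("form shown");  // startup cost is near zero
        view.ShowDetails();               // first expand triggers the query
        view.ShowDetails();               // second expand reuses the cached data
    }
}
```

The first access pays the query cost; every later access reuses the cached result, which is exactly the on-demand behaviour described above.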
I have a child form in my application. This form has more than 50 comboboxes, and every one of them gets its data from the database. All comboboxes are loaded in the form's Load event. The data is large; retrieval takes about two minutes. When I open this form, my whole application becomes unresponsive. It hangs and only comes back to life after about two minutes :/
As I have read, we can use different threads to avoid such situations. Can someone advise whether it is possible, and safe, to implement multithreading to make my application responsive?
Please guide me, and write a sample if possible of how multithreading works in C# - for example, a form with a grid view that takes a DataTable as its data source on a separate thread, so the GUI stays responsive even when the database takes a long time. Any help is appreciated. Thanks in advance!
Take a look at the BackgroundWorker class. This can do exactly what you want. You can also include something like a progress bar to show users the data is still being loaded before they go ahead and do stuff in your child form.
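A minimal BackgroundWorker sketch (written as a console program here so it is self-contained; the Thread.Sleep stands in for your database calls). In WinForms, the ProgressChanged and RunWorkerCompleted handlers run on the UI thread, so they can update a ProgressBar or DataGridView directly:

```csharp
using System;
using System.ComponentModel;
using System.Threading;

class Program
{
    static void Main()
    {
        var done = new ManualResetEvent(false);
        var worker = new BackgroundWorker { WorkerReportsProgress = true };

        // DoWork runs on a thread-pool thread - no UI access in here.
        worker.DoWork += (s, e) =>
        {
            for (int i = 1; i <= 4; i++)
            {
                Thread.Sleep(50);                 // stand-in for a database call
                worker.ReportProgress(i * 25);
            }
            e.Result = "rows loaded";
        };

        // In WinForms these handlers run on the UI thread, so updating
        // controls (progress bar, grid data source) from them is safe.
        worker.ProgressChanged += (s, e) =>
            Console.WriteLine("progress: {0}%", e.ProgressPercentage);

        worker.RunWorkerCompleted += (s, e) =>
        {
            Console.WriteLine(e.Result);
            done.Set();
        };

        worker.RunWorkerAsync();
        done.WaitOne();                           // in a real form you would not block
    }
}
```

For your 50 comboboxes, one worker could fetch all the lookup tables in DoWork and hand them back via e.Result, so the form binds them only once the data has arrived.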
Use the Task Parallel Library, which is included in .NET starting from version 4, or the Parallel Extensions. Samples of using the TPL can be found here.
Read more about it here:
http://bradwilson.typepad.com/blog/2012/04/tpl-and-servers-pt1.html
http://www.codeproject.com/Articles/30975/Parallel-Extensions-for-the-NET-Framework-Part-I-I
http://blogs.msdn.com/b/csharpfaq/archive/2010/06/01/parallel-programming-in-net-framework-4-getting-started.aspx
http://thenewtechie.wordpress.com/2012/01/03/introduction-to-tpl-part-1/
Reactive Extensions are a little bit harder, but good nonetheless. Samples here.
Some introductions to them here:
http://www.codeproject.com/Articles/47498/A-quick-look-at-the-Reactive-Extensions-Rx-for-Net
http://mtaulty.com/CommunityServer/blogs/mike_taultys_blog/archive/2010/08/18/reactive-extensions-for-net-stuff-happens.aspx
In any case, you can easily find more information.
Speaking of BackgroundWorker: it's a good solution for WinForms, but the approach is dated. TPL and Rx are newer approaches - more performant and more comfortable, especially when you have many controls. Having this many controls with async operations is also a UI-design question; maybe the design should change. That said, BackgroundWorker is also a good choice, and it's up to you which to select.
I have a C# service application which interacts with a database. It was recently migrated from .NET 2.0 to .NET 4.0 so there are plenty of new tools we could use.
I'm looking for pointers to programming approaches or tools/libraries to handle defining tasks, configuring which tasks they depend on, queueing, prioritizing, cancelling, etc.
There are various types of services:
Data (for retrieving and updating)
Calculation (populate some table with the results of a calculation on the data)
Reporting
These services often depend on one another and are triggered on demand; a Reporting task, for example, will probably contain code such as
if (IsSomeDependentCalculationRequired())
    PerformDependentCalculation(); // which may trigger further calculations

GenerateRequestedReport();
Also, any Data modification is likely to set the Required flag on some of the Calculation or Reporting services, (so the report could be out of date before it's finished generating). The tasks vary in length from a few seconds to a couple of minutes and are performed within transactions.
This has worked OK up until now, but it is not scaling well. There are fundamental design problems, and I am looking to rewrite this part of the code. For instance, if two users request the same report at similar times, the dependent tasks will be executed twice. Also, there's currently no way to cancel a task in progress, it's hard to maintain the dependent tasks, etc.
I'm NOT looking for suggestions on how to implement a fix. Rather I'm looking for pointers to what tools/libraries I would be using for this sort of requirement if I were starting in .NET 4 from scratch. Would this be a good candidate for Windows Workflow? Is this what Futures are for? Are there any other libraries I should look at or books or blog posts I should read?
Edit: What about Rx Reactive Extensions?
I don't think your requirements fit into any of the built-in stuff. Your requirements are too specific for that.
I'd recommend that you build a task queueing infrastructure around a SQL database. Your tasks are pretty long-running (seconds) so you don't need particularly high throughput in the task scheduler. This means you won't encounter performance hurdles. It will actually be a pretty manageable task from a programming perspective.
You should probably build a Windows service or some other process that continuously polls the database for new tasks or requests. This service can then enforce arbitrary rules on the requested tasks. For example, it can detect that a reporting task is already running and not schedule a new computation.
My main point is that your requirements are so specific that you need to encode them in C# code. You cannot make an existing tool fit your needs; you need the Turing completeness of a programming language to do this yourself.
Edit: You should probably separate a task-request from a task-execution. This allows multiple parties to request a refresh of some reports while at the same time only one actual computation is running. Once this single computation is completed all task-requests are marked as completed. When a request is cancelled the execution does not need to be cancelled. Only when the last request is cancelled the task-execution is cancelled as well.
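That request/execution separation can be sketched with the ConcurrentDictionary + Lazy<Task> dedup pattern (the "sales" report key and the simulated work below are invented; a real scheduler would also persist requests and handle cancellation):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class ReportScheduler
{
    static int _runs;   // counts actual report computations

    // One in-flight computation per report key; concurrent requests share it.
    static readonly ConcurrentDictionary<string, Lazy<Task<string>>> _inFlight =
        new ConcurrentDictionary<string, Lazy<Task<string>>>();

    static Task<string> RequestReport(string key)
    {
        // Lazy<T> guarantees the factory runs at most once per key, even
        // if two threads call GetOrAdd for the same key simultaneously.
        var lazy = _inFlight.GetOrAdd(key, k =>
            new Lazy<Task<string>>(() => Task.Factory.StartNew(() =>
            {
                Interlocked.Increment(ref _runs);
                Thread.Sleep(100);               // simulated expensive work
                return "report:" + k;
            })));
        return lazy.Value;
    }

    static void Main()
    {
        // Two users request the same report at nearly the same time...
        Task<string> a = RequestReport("sales");
        Task<string> b = RequestReport("sales");

        Task.WaitAll(a, b);

        // ...but the computation ran only once.
        Console.WriteLine(a.Result);
        Console.WriteLine(_runs);
    }
}
```

Both requests complete with the same result, while the expensive computation executes exactly once, which is the behaviour the edit above describes.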
Edit 2: I don't think workflows are the solution. Workflows usually operate separately from each other. But you don't want that. You want to have rules which span multiple tasks/workflows. You would be working against the system with a workflow based model.
Edit 3: A few words about the TPL (Task Parallel Library). You mentioned it ("Futures"). If you want some inspiration on how tasks can work together, how dependencies can be created and how tasks can be composed, look at the Task Parallel Library (in particular the Task and TaskFactory classes). You will find some nice design patterns there because it is very well designed. Here is how you model a sequence of tasks: you call Task.ContinueWith, which registers a continuation function as a new task. And here is how you model dependencies: TaskFactory.ContinueWhenAll(Task[], ...) starts a task that only runs when all of its input tasks have completed.
BUT: The TPL itself is probably not well suited for you because its task cannot be saved to disk. When you reboot your server or deploy new code, all existing tasks are being cancelled and the process aborted. This is likely to be unacceptable. Please just use the TPL as inspiration. Learn from it what a "task/future" is and how they can be composed. Then implement your own form of tasks.
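For illustration, here is the .NET 4 composition pattern the edit describes, with made-up "sales" and "costs" tasks feeding a dependent "report" task:

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var factory = Task.Factory;

        // Two independent "data" tasks...
        Task<int> sales = factory.StartNew(() => 40);
        Task<int> costs = factory.StartNew(() => 15);

        // ...and a "report" task that depends on both (.NET 4 syntax;
        // in .NET 4.5+ Task.WhenAll is the more direct equivalent).
        Task<string> report = factory.ContinueWhenAll(
            new[] { sales, costs },
            done => string.Format("profit: {0}", done[0].Result - done[1].Result));

        Console.WriteLine(report.Result);
    }
}
```

The report task is never scheduled until both antecedents finish, which is exactly the dependency model worth imitating in a custom, persistable task system.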
Does this help?
I would try to use the state machine package Stateless to model the workflow. Using a package will provide a consistent way to advance the state of the workflow across the various services. Each of your services would hold an internal state machine implementation and expose methods for advancing it. Stateless will be responsible for triggering actions based on the state of the workflow, and it forces you to explicitly set up the various states the workflow can be in - this will be particularly useful for maintenance, and it will probably help you understand the domain better.
If you want to solve this fundamental problem properly and in a scalable way, you should probably look at the SOA architectural style.
Your services will receive commands and generate events that you can handle in order to react to things happening in your system.
And, yes, there are tools for it. For example NServiceBus is a wonderful tool to build SOA systems.
You could use a SQL Server agent job to run SQL queries at a timed interval. It looks like you will have to write the application yourself: a long-running program that checks the time and does something. I don't think there are clear-cut tools out there to do what you are trying to do - a C# application or a WCF service, with the data automation done in SQL itself.
If I understand you correctly, you want to cache the generated reports and not do the work again. As other commenters have pointed out, this can be solved elegantly with multiple producer/consumer queues and some caches.
First you enqueue your report request. Based on the report generation parameters, you can check the cache first, and if a previously generated report is already available, simply return that one. If the report becomes obsolete due to changes in the database, you need to make sure the cache is invalidated reliably.
Now, if the report has not been generated yet, you need to schedule it for generation. The report scheduler needs to check whether the same report is already being generated. If so, register an event to be notified when it completes, and return the report once it is finished. Make sure that you do not access the data via the caching layer, since that could produce races (the report is generated, the data changes, and the finished report is immediately discarded by the cache, leaving nothing for you to return).
Alternatively, if you do want to prevent returning outdated reports, you can let the caching layer become your main data provider, generating reports until one is produced in time that is not outdated. But be aware that if your database changes constantly, you might enter an endless loop of generating invalid reports, if report generation takes longer than the average time between two changes to your DB.
As you can see, you have plenty of options here without even talking about .NET, TPL, or SQL Server. First you need to set your goals for how fast, scalable and reliable your system should be; then you need to choose the appropriate architecture for your particular problem domain, as described above. I cannot do it for you because I do not have your full domain knowledge of what is acceptable and what is not.
The tricky part is the handover between the different queues with the proper reliability and correctness guarantees. Depending on your specific report-generation needs, you can put this logic into the cloud, or use a single thread that puts all work into the proper queues and works on them concurrently, or one by one, or something in between.
TPL and SQL Server can certainly help there, but they are only tools. If used wrongly, due to insufficient experience with one or the other, it might turn out that a different approach (such as using only in-memory queues and persisting reports in the file system) is better suited to your problem.
From my current understanding, I would not misuse SQL Server as a cache, but if you want a database I would use something like RavenDB or RaportDB, which look stable and much more lightweight compared to a full-blown SQL Server.
But if you already have a SQL Server running, then go ahead and use it.
I am not sure I understood you correctly, but you might want to have a look at the JAMS Scheduler: http://www.jamsscheduler.com/. It's non-free, but a very good system for scheduling dependent tasks and reporting. I used it with success at my previous company. It's written in .NET, and there is a .NET API for it, so you can write your own apps that communicate with JAMS. They also have very good support and are eager to implement new features.
I have a program that we'd like to multi-thread at a certain point. We're using CSLA for our business rules. At a one location of our program we are iterating over a BusinessList object and running some sanity checks against the data one row at a time. When we up the row count to about 10k rows it takes some time to run the process (about a minute). Naturally this sounds like a perfect place to use a bit of TPL and make this multi-threaded.
I've done a fair amount of multithreaded work through the years, so I understand the pitfalls of switching from single to multithreaded code. I was surprised to find that the code bombed within the CSLA routines themselves. It seems to be related to the code behind the CSLA PropertyInfo classes.
All of our business object properties are defined like this:
public static readonly PropertyInfo<string> MyTextProperty = RegisterProperty<string>(c => c.MyText);
public string MyText {
get { return GetProperty(MyTextProperty); }
set { SetProperty(MyTextProperty, value); }
}
Is there something I need to know about multithreading and CSLA? Are there any caveats that aren't found in any written documentation? (I haven't found anything as of yet.)
--EDIT---
BTW: the way I implemented my multithreading was to throw all the rows into a ConcurrentBag and then spawn 5 or so tasks that just grab objects from the bag until the bag is empty. So I don't think the problem is in my code.
As you've discovered, the CSLA.NET framework is not thread-safe.
To solve your particular problem, I would make use of the Wintellect Power Threading library: either the AsyncEnumerator/SyncGate combo, or the ReaderWriterGate on its own.
The Power Threading library will allow you to queue 'read' and 'write' requests to a shared resource (your CSLA.NET collection). At any moment in time, only a single 'write' request is allowed access to the shared resource, all without thread-blocking the queued 'read' or 'write' requests. It's very clever and super handy for safely accessing shared resources from multiple threads. You can spin up as many threads as you wish, and the Power Threading library will synchronise access to your CSLA.NET collection.
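If you would rather stay within the BCL, ReaderWriterLockSlim enforces a similar many-readers/one-writer discipline (blocking, rather than the Power Threading library's queued approach). A minimal sketch with an invented shared list standing in for the CSLA.NET collection:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class SharedList
{
    static readonly ReaderWriterLockSlim _gate = new ReaderWriterLockSlim();
    static readonly List<int> _rows = new List<int>();

    static void Add(int row)
    {
        _gate.EnterWriteLock();          // exclusive: one writer at a time
        try { _rows.Add(row); }
        finally { _gate.ExitWriteLock(); }
    }

    static int Count()
    {
        _gate.EnterReadLock();           // shared: many readers at once
        try { return _rows.Count; }
        finally { _gate.ExitReadLock(); }
    }

    static void Main()
    {
        // Four concurrent writers hammer the shared list...
        var writers = new Task[4];
        for (int t = 0; t < writers.Length; t++)
            writers[t] = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 250; i++) Add(i);
            });

        Task.WaitAll(writers);

        // ...and no updates are lost thanks to the write lock.
        Console.WriteLine(Count());
    }
}
```

The same enter/exit-in-finally pattern would wrap any code that touches the shared CSLA.NET collection from your sanity-check tasks.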