My current project is a WPF application with an SQL Server back end.
In WPF, the UI can only be modified by the UI thread. If a UI modification needs to be done on another thread, then the dispatcher object can be called and given an action. Effectively, this is mapping my Delegate to a WM_ message.
Since the LINQ to SQL DataContexts are also single threaded, how could I copy this "Dispatcher" idea from WPF and create a similar object that marshals requests to my public DataContext so that they always run on the "public SQL thread"?
I'm guessing I'd need to create a thread at start up which initialises the data contexts and then sleeps until woken by the SqlThread.Invoke() method.
Does anyone know of anything similar to this idea or any materials that may help me do this?
If you mean a LINQ-to-SQL DataContext, I would advise against this; use a DataContext as a short-lived unit of work, then Dispose() it; don't keep it around for lots of different purposes (there are issues with stale data, cache growth, threading, concurrency, plus (importantly) how to handle failure / rollback).
Re the bigger picture:
Essentially you are describing a work queue, such as a producer/consumer queue. There are plenty of such around, or they are relatively easy to write (for example, see here or here; just add a loop to dequeue+process items). IIRC .NET 4.0 also includes (in the parallel extensions) such constructs pre-canned.
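As a rough illustration of the "queue plus a loop that dequeues and processes items" idea, here is a minimal sketch built on BlockingCollection<T> from .NET 4.0. The class name SingleThreadDispatcher and its members are made up for this example; it only shows the marshalling mechanism and does not change the advice above about keeping DataContexts short-lived.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs every submitted delegate on one dedicated thread.
public sealed class SingleThreadDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SingleThreadDispatcher(string name)
    {
        _thread = new Thread(Run) { IsBackground = true, Name = name };
        _thread.Start();
    }

    public int ThreadId => _thread.ManagedThreadId;

    private void Run()
    {
        // Blocks while the queue is empty; ends once CompleteAdding() has been
        // called and the remaining items have been processed.
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            work();
        }
    }

    // Queues func and returns a Task that completes with its result or exception.
    public Task<T> InvokeAsync<T>(Func<T> func)
    {
        var tcs = new TaskCompletionSource<T>();
        _queue.Add(() =>
        {
            try { tcs.SetResult(func()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    // Blocking variant, loosely analogous to Dispatcher.Invoke.
    // Calling it from the dispatcher thread itself would deadlock.
    public T Invoke<T>(Func<T> func) => InvokeAsync(func).GetAwaiter().GetResult();

    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
        _queue.Dispose();
    }
}
```

With this sketch, something like sql.Invoke(() => LoadCustomer(id)) (LoadCustomer being a placeholder for your own data access method) would always execute on the one thread owned by the dispatcher, whichever thread made the call.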
Related
I have done a lot of development in C#. Now I'm working on a 3D web project, and the language I use most at the moment is JavaScript.
In C#, before the async/await keywords arrived in the newer language specification, there were these ways to make asynchronous calls:
delegates
Begin/End functions, like: BeginInvoke, EndInvoke
the IAsyncResult interface
As for JS... Right now I need to do some parallel computation, and I really need something similar to asynchronous work with locking primitives like Semaphore, Mutex or others.
As for async/await... I have tried the promises concept, which jQuery implements with its deferred promise:
http://api.jquery.com/deferred.promise/
http://api.jquery.com/category/deferred-object/
It reminds me of the async/await concept in C#:
http://msdn.microsoft.com/en-us/library/hh191443.aspx
But I have also found the concept of WebWorkers:
https://developer.mozilla.org/en-US/docs/Web/Guide/Performance/Using_web_workers
When I first read about them, I thought they could be an alternative to the promises pattern. But looking at how they are implemented, I understand that WebWorkers run on threads other than the main page's execution thread, and that their functions aren't really asynchronous: they are just callbacks attached to the Worker() instance, which can be used from the main page thread. Am I right?
So... I wonder, how can I implement something similar to a Semaphore in JavaScript?
Thanks!
UPDATE #1 (more a reply to Doge):
Let me describe a simple example from an application I'm developing.
I'm already using the jQuery deferred object for one thing: waiting until all the texture images I requested have arrived.
Here is the link: http://bit.ly/O2dZmQ
It's a WebGL application built on the three.js library that constructs houses from real data (PS: don't look at the code, I know it's not good :) I only recently came to understand the prototype style of programming in JS :) at first I was too used to C# and its paradigms).
So... I have this task: I must wait until all the textures have been loaded via AJAX, and only then set them as textures for the meshes.
When I wrote this question, I was thinking about reworking the source code to use WebWorkers.
My first thought was to do what I used to do when developing WPF/Silverlight applications in C#.
I created a separate Worker instance that asynchronously checks the task described above.
I wrote a tiny, simple example of what I want to do, but it fails.
As far as I can see, WebWorkers don't accept objects when I try to send them to the worker. Alright...
Passing objects to a web worker
There is the JSON.stringify() method... But then I ran into another thing: JSON.stringify() can't serialize an object to a string when there are circular references.
Chrome sendrequest error: TypeError: Converting circular structure to JSON
Truly... It's rather disappointing... In C# or even C++ it's NOT a problem to exchange objects between instances... Some things can be done with reinterpret casts or the like... And it's not a problem to exchange objects between different threads, even in asynchronous work...
So... for my purposes, what is the best solution? Keep the deferred/promises pattern and not use WebWorkers?
Here is a tiny sample (not the full application) that shows what I want to do:
http://pastebin.com/5ernFNme ( need HTML for JS, which is below )
http://pastebin.com/ELcw7SuE ( JS main logic )
http://pastebin.com/PuHrhW8n ( WebWorker, which I suppose to use as for separate checker )
Textures for a tiny sample:
http://s14.postimg.org/wqm0xb2ep/box.jpg
http://s27.postimg.org/nc77umytv/box2.jpg
Minified three.js can be found here:
https://github.com/mrdoob/three.js/blob/master/build/three.min.js
It's important to understand that JavaScript has no threads. It has an event loop that executes events, one-by-one, as they come in.
The consequence of this is that if you have a process that takes a while, you're blocking all execution. If JavaScript is also needed to do UI stuff like responding to user events or animations, that's a bit of a snag. You could try to split up your process into multiple events to keep the event loop running smoothly, but that's not always easy to do.
This is where Workers come in. Workers run in their own thread. To avoid issues related to threads, the workers don't share memory.
You can communicate with a worker by sending and receiving messages. The messages in turn come in the event loop, so you don't need semaphores or anything that synchronizes threads. If your worker controller is in JavaScript there are never any atomicity issues.
If your workers are simple input->output workers then you can totally slap a Promise layer on top of that. Just keep in mind that Promises themselves don't add threads or asynchronousness.
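You can only communicate with Workers through messages. Message data is copied, never shared: current browsers copy it with the structured clone algorithm, so plain data objects can be sent (even ones with circular references), but functions and DOM nodes cannot. Older browsers accepted only strings, which is why JSON.stringify() is often suggested. Either way the Worker gets its own copy and never a reference into the page's objects, because again: memory issues.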
If I look at your use case I guess the only reason why you might want the Workers is to take advantage of multiple cores that most CPUs have nowadays.
I think what you're doing is loading images as textures into your canvas, and that's what takes up a lot of time? There is no good way to use Workers here since the Worker would need a reference to the canvas, and that's not happening.
Now if instead you needed to do processing on the textures to transform them in some way, you could use Workers. Send the image data as a binary string, possibly Base64-encoded (or as an ArrayBuffer where the browser supports it), do the conversion and send it back. If your image is large, the serialization will also take up a fair chunk of CPU time, so your mileage may vary.
From what I can tell your resources load pretty quick and there doesn't seem to be a CPU bottleneck. So I don't know if you really need the Workers.
I'll have a database object that can be accessed from multiple threads as well as from the main thread. I don't want them to access the underlying database object concurrently, so I'll write a set of thread safe public methods that can be accessed from multiple threads.
My first idea was to use a lock around my connection, such as lock(oleDbConnection), but the problem is that I would have to lock it for the main thread too, since that is one more thread that can access it. That would mean rewriting lots of code.
But, since these threads and the main thread won't access the database very often, how about just using one of my controls' (maybe the main form's) Invoke method every time I call any of the database methods from another thread? This way, as far as I understand, these methods would never be called concurrently, and I wouldn't need to worry about the main thread. I guess the only problem would be degrading performance a little bit, but as I said, the database is not accessed that often; the reason why I use threads is not so that they can access the database concurrently but so that they can perform other operations concurrently.
So does this sound like a good idea? Am I missing something? Sounds a bit too easy so I'm suspicious.
It sounds like it would work AFAIK, but it also sounds like a really bad idea.
The problem is that when writing lock you are saying "I want this code to be a critical section", whereas when writing Invoke you are saying "I want this to be executed on the UI thread". These two things are certainly not equivalent, which can lead to lots of problems. For example:
Invoke is normally used to access UI controls. What if a developer sees Invoke and nothing UI-related, and goes "gee, that's an unneeded Invoke; let's get rid of it"?
What if more than one UI thread ends up existing?
What if the database operation takes a long time (or times out)? Your UI would stop responding.
I would definitely go for the lock. You typically want the UI thread responsive when performing operations that may take time, which includes any sort of DB access; you don't know whether it's alive or not for instance.
Also, the typical way to handle connections is to create, use and dispose the connection for each request, rather than reusing the same connection. This might perhaps solve some of your concurrency problems.
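To make the lock suggestion and the "create, use and dispose per request" suggestion concrete, here is a minimal sketch. The Database class, its connection factory parameter and the ExecuteScalar wrapper are assumptions made for illustration; only the System.Data interfaces are real.

```csharp
using System;
using System.Data;

// Hypothetical wrapper: serializes database access with a private lock and
// opens a fresh connection per call instead of sharing one.
public sealed class Database
{
    // Private lock object, so no outside code can take the same lock by accident.
    private readonly object _gate = new object();
    private readonly Func<IDbConnection> _connectionFactory;

    public Database(Func<IDbConnection> connectionFactory)
    {
        _connectionFactory = connectionFactory;
    }

    // Safe to call from any thread, including the UI thread.
    public object ExecuteScalar(string sql)
    {
        lock (_gate) // critical section: one database call at a time
        {
            using (IDbConnection connection = _connectionFactory())
            {
                connection.Open();
                using (IDbCommand command = connection.CreateCommand())
                {
                    command.CommandText = sql;
                    return command.ExecuteScalar();
                }
            }
        }
    }
}
```

It could be constructed as new Database(() => new OleDbConnection(connectionString)), where connectionString is whatever your application already uses. Once every call has its own connection, the lock may turn out to be unnecessary; it is kept here because the question asks for strictly non-concurrent access. Long-running calls should still be made from a background thread so the UI stays responsive.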
Why don't you try using a connection pool? Every thread can do its work with a different DB connection and send the result to the main thread with Invoke. Connection pooling is a very common approach on servers.
See Using Connection Pooling with SQL Server
My equities trading application uses one class library for getting callbacks on stock quote updates and another class library for getting callbacks on order executions or cancelations. I currently have the callbacks execute in the thread pool. I start one background thread for each callback. The threads are very short lived, and the work involved includes fetching the data and notifying the observers. Once the observers are notified, the background thread dies. When I have strategies subscribing to over 1000 actively traded symbols, I get OutOfMemory exceptions.
How can I improve this design? I was thinking of starting two threads at the start, one for quotes, the other for executions, and creating each object on its respective threads. Then just have a shared object which allows adding and removing observers to the threads. But 1) how would you keep the thread alive to receive the callbacks? 2) How can you even have a callback object which is initialized on a thread with no reference on the main thread? Is this even possible?
Any help would be appreciated.
Use a producer / consumer model with a simple queue. Then you have a set number of worker threads running and you won't have this problem.
As for how to call the callback function, you could possibly use a struct like this:
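    // Data is a placeholder for whatever payload type the worker processes.
    struct WorkerData
    {
        public Data data;              // the item to process
        public Delegate someCallback;  // called by the worker when it is done
    }

When the worker is finished with the data, it can invoke the callback itself.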
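The producer/consumer idea with a fixed set of worker threads could look roughly like the sketch below. WorkItem, QuoteWorkers and their members are names invented for this example, and the quote payload is reduced to a symbol and a price; the real library callbacks would be the producers calling Enqueue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical work item: payload plus the callback to run once it is processed.
public sealed class WorkItem
{
    public string Symbol;
    public decimal Price;
    public Action<string, decimal> Callback;
}

public sealed class QuoteWorkers : IDisposable
{
    private readonly BlockingCollection<WorkItem> _queue = new BlockingCollection<WorkItem>();
    private readonly Thread[] _workers;

    // Starts a fixed number of long-lived threads instead of one thread per callback.
    public QuoteWorkers(int workerCount)
    {
        _workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = new Thread(Consume) { IsBackground = true };
            _workers[i].Start();
        }
    }

    // Producer side: called from the library callback; returns immediately.
    public void Enqueue(WorkItem item) => _queue.Add(item);

    // Consumer side: each worker blocks here until an item is available.
    private void Consume()
    {
        foreach (WorkItem item in _queue.GetConsumingEnumerable())
        {
            item.Callback(item.Symbol, item.Price); // notify the observer
        }
    }

    // Stops accepting work, lets the workers drain the queue, then waits for them.
    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (Thread worker in _workers) worker.Join();
        _queue.Dispose();
    }
}
```

The worker threads stay alive because GetConsumingEnumerable() blocks while the queue is empty, which also answers the "how would you keep the thread alive" part of the question. A callback that throws would take its worker thread down in this sketch, so real code would want a try/catch around the invocation.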
What you've described is a general picture of your application. In order to redesign your application we need concrete requirements and at least a simplified model of how the participants interact with each other. Your informal description is not precise enough to suggest a specific data structure and algorithm, because without knowing enough of the details we might omit something crucial and not meet your needs.
You are saying all the right words and you have a specific problem, out of memory, and you need to fix something. Go back to prototyping. Write a very small but brutally exercised program to demonstrate what you want to do. Then scale it back up to your application. It's much easier to design in the prototype size.
Edit:
Because you are running out of memory, the most likely reasons are that you have a memory leak or you simply have a near-real-time system with insufficient capacity to process the load you are experiencing. A leak might be due to the usual suspects, e.g. not detaching event handlers which you can diagnose with memory profilers, but we'll rule that out for now.
If you must keep up with quotes as they are updated, they have to go somewhere such as a queue or be dispatched to a thread, and unless you can keep up, this can grow unbounded.
The only ways to solve this problem are to:
throw some quotes on the floor
get beefier hardware
process quotes more efficiently
I think you are hoping that there is a clear alternative to process quotes more efficiently with a new data structure or algorithm that could make a big difference. But even if you do make it more efficient, the problem could still come back and you may be forced to consider gracefully degrading under overload conditions rather than failing with out of memory.
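One way to degrade gracefully instead of running out of memory is a bounded queue that refuses new items when it is full. The sketch below is only an illustration of "throwing quotes on the floor": BoundedQuoteQueue and its members are invented names, and a quote is reduced to a single decimal price.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical bounded buffer: holds at most `capacity` quotes and counts the rest as dropped.
public sealed class BoundedQuoteQueue
{
    private readonly BlockingCollection<decimal> _queue;
    private long _dropped;

    public BoundedQuoteQueue(int capacity)
    {
        _queue = new BlockingCollection<decimal>(capacity);
    }

    public long Dropped => Interlocked.Read(ref _dropped);
    public int Count => _queue.Count;

    // Returns false (and counts a drop) instead of blocking or growing when the queue is full.
    public bool TryPublish(decimal price)
    {
        if (_queue.TryAdd(price)) return true;
        Interlocked.Increment(ref _dropped);
        return false;
    }

    // Consumer side: non-blocking take; returns false when nothing is queued.
    public bool TryTake(out decimal price) => _queue.TryTake(out price);
}
```

Dropping the newest quote is the simplest policy; for market data you may prefer to keep only the latest quote per symbol, which needs a different structure. Watching the Dropped counter also gives you the measurement the rest of this answer asks for.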
But in general terms, for high performance simpler is often better, and fewer thread switches are better. For example, if the work done in the update is small, making it synchronous could be a big win, even though it seems counterintuitive. You have to know what the update handler does, and most of all, for a near-real-time system you have to measure, measure, measure to know empirically which is fastest.
To me
I currently have the callbacks execute in the thread pool
and
Once observers are notified the background thread dies
are mildly contradictory. I suspect you might be intending to use threads from a pool, but accidentally using new 'free' (unpooled) threads each time.
You might want to look at the documentation for WeakReference.
However, I suggest you use a profiler/perfmon to find the resource leak first and foremost. Replacing the whole shebang with a queuing approach sounds reasonable, but it's pretty close to what you'd have anyway with a proper threadpool.
I know that if I am modifying a control from a different thread, I should take care because WinForms and WPF don't allow modifying control's state from other threads.
Why is this restriction in place?
If I can write thread-safe code, I should be able to modify control state safely. Then why is this restriction present?
Several GUI frameworks have this limitation. According to the book Java Concurrency in Practice the reason for this is to avoid complex locking. The problem is that GUI controls may have to react to both events from the UI, data binding and so forth, which leads to locking from several different sources and thus a risk of deadlocks. To avoid this .NET WinForms (and other UIs) restricts access to components to a single thread and thus avoids locking.
In the case of Windows, when a control is created, UI updates are performed via messages from a message pump. The programmer does not have direct control of the thread the pump is running on, so the arrival of a message for a control could change the state of the control at any time. If another thread (one that the programmer did directly control) were allowed to change the state of the control, then some sort of synchronization logic would have to be put in place to prevent corruption of the control state. The controls in .NET are not thread safe; this is, I suspect, by design. Putting synchronization logic in all controls would be expensive in terms of designing, developing, testing and supporting the code that provides this feature. The programmer could of course provide thread safety for the control in his own code, but not for the code in .NET that is running concurrently with his code. One solution to this issue is to restrict these types of actions to one thread and one thread only, which makes the control code in .NET simpler to maintain.
.NET reserves the right to access your control in the thread where you created it at any time. Therefore accesses that come from another thread can never be thread safe.
You might be able to make your own code thread-safe, but there is no way for you to inject the necessary synchronization primitives into the built-in WinForms and WPF code so that they match up with the ones in your code. Remember, there are a lot of messages getting passed around behind the scenes that eventually cause the UI thread to access the control without you ever really realizing it.
Another interesting aspect of a control's thread affinity is that it could (though I suspect they never would) use the Thread Local Storage pattern. Obviously, if you accessed a control on a thread other than the one it was created on, it would not be able to access the correct TLS data, no matter how carefully you structured the code to guard against all of the normal problems of multithreaded code.
Windows supports many operations which, especially used in combination, are inherently not thread-safe. What should happen, for example, if one thread is trying to insert some text into a text field starting at the 50th character while another thread tries to delete the first 40 characters from that field? It would be possible for Windows to use locks to ensure that the second operation couldn't begin until the first one completed, but using locks would add overhead to every operation, and would also raise the possibility of deadlock if actions on one entity require manipulation of another. Requiring that actions involving a particular window happen on a particular thread is a more stringent requirement than would be necessary to prevent unsafe combinations of operations from being performed simultaneously, but it's relatively easy to analyze. Using controls from multiple threads and avoiding clashes via some other means would generally be more difficult.
Actually, as far as I know, that WAS the plan from the beginning! Every control could be accessed from any thread! And just because thread locking was needed when another thread required access to the control --and because locking is expensive-- a new threading model was crafted called "thread rental". In that model, related controls would be aggregated into "contexts" using only one thread, thus reducing the amount of locking needed.
Pretty cool, huh?
Unfortunately, that attempt was too bold to succeed (and a bit more complex because locking was still required), so the good old Windows Forms threading model --with the single UI thread and with the creating thread claiming ownership of the control-- is used once again in WPF to make our lives ...easier?
I've been working on the same project now since Christmas 2008. I've been asked to take it from a Console Application (which just prints out trace statements), to a full Windows App. Sure, that's fine. The only thing is there are parts of the App that can take several minutes to almost an hour to run. I need to multithread it to show the user status, or errors. But I have no idea where to begin.
I've already built a little UI in WPF. It's very basic, but I'd like to expand it as I need to. The app works by selecting a source, choosing a destination, and clicking start. I would like a listbox to update as the process goes along, much in the same way as SQL Server installs do: each step gets a green check mark by its name as it completes.
How does a newbie start multithreading? What libraries should I check out? Any advice would be greatly appreciated.
p.s. I'm currently reading about this library, http://www.codeplex.com/smartthreadpool
@Martin: Here is how my app is constructed:
Engine: Runs all major components in pre-defined order
Excel: Library I wrote to wrap COM to open/read/close/save Workbooks
Library: Library which understands different types of workbook formats (5 total)
Business Classes: Classes I've written to translate Excel data and prep it for Access
Db Library: A Library I've written which uses ADO.NET to read in Access data
AppSettings: you get the idea
Serializer: Save data in case of app crash
I use everything from LINQ to ADO.NET to get data, transform it, and then output it.
My main requirement is that I want to update my UI to indicate progress
@Frank: What happens if something in the BackgroundWorker throws an exception (handled or otherwise)? How does my application receive notice?
@Eric Lippert: Yes, I'm investigating that right now. Before I complicate things.
Let me know if you need more info. Currently I'm running this application from a unit test, so I guess calling it a Console Application isn't accurate. I use ReSharper to do this. I'm the only person right now who uses the app, but I'd like a more attractive interface.
I don't think you specified the version of the .NET Framework you are using, but you might check out the BackgroundWorker component. It is a simple way to run work on another thread.
The best part is that it has been part of the .NET Framework since version 2.0.
Update in response to your update: If you want to be able to update the progress in the UI -- for example in a progress bar -- the background worker is perfect. It uses an event called ProgressChanged to report the status. It is very elegant. Also, keep in mind that you can have as many instances as you need and can run all of them at the same time (if needed).
In response to your question: You could easily setup an example project and test for your question. I did find the following, here (under remarks, 2nd paragraph from the caution):
"If the operation raises an exception that your code does not handle, the BackgroundWorker catches the exception and passes it into the RunWorkerCompleted event handler, where it is exposed as the Error property of System.ComponentModel.RunWorkerCompletedEventArgs."
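The quoted behaviour is easy to check with a small test program. The sketch below is written so that it also runs outside a UI application; the class name WorkerErrorDemo, the method name and the exception message are made up for the example.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class WorkerErrorDemo
{
    // Runs a DoWork handler that throws, and returns what RunWorkerCompleted saw in e.Error.
    public static Exception RunFailingWorker()
    {
        Exception observed = null;
        using (var done = new ManualResetEventSlim())
        using (var worker = new BackgroundWorker())
        {
            worker.DoWork += (sender, e) =>
            {
                // Stands in for any unhandled failure inside the long-running task.
                throw new InvalidOperationException("workbook could not be opened");
            };
            worker.RunWorkerCompleted += (sender, e) =>
            {
                observed = e.Error; // non-null because DoWork threw
                done.Set();
            };
            worker.RunWorkerAsync();
            done.Wait(); // test-only: a UI application would not block here
        }
        return observed;
    }
}
```

In a real handler, check e.Error before reading e.Result, because reading Result after a failure throws. When running under the Visual Studio debugger, the debugger may still pause at the throw site before the exception is handed to RunWorkerCompleted.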
Threading in C# from Joseph Albahari is quite good.
This page is quite a good summary of threading.
By the sound of it you probably don't need anything very complex - if you just start the task and then want to know when it has finished, you only need a few lines of code to create a new thread and get it to run your task. Then your UI thread can bumble along and check periodically if the task has completed.
Concurrent Programming on Windows is THE best book in existence on the subject. It is written by Joe Duffy, the famous Microsoft guru of multithreading. Everything you ever need to know and more, from the way the Windows thread scheduler works to the .NET Parallel Extensions library.
Remember to use delegates to update the UI so that you don't get cross-threading issues and the UI doesn't appear to freeze or lock up.
Also, if you need a lot of notes, PowerPoint slides, etc., might I suggest the lecture notes from my undergrad course:
http://ist.psu.edu/courses/SP04/ist411/lectures.html
The best way in for a total newcomer to threading is probably the thread pool. We'll probably need to know a little more about these parts to make more in-depth recommendations.
EDIT::
Since we now have a little more info, I'm going to stick with my previous answer. It looks like you have a load of tasks which need doing, and the best way to do a load of tasks is to add them to the thread pool and then just keep checking whether they're done. If tasks need to be done in a specific order, you can simply add the next one as the previous one finishes. The thread pool really is rather good for this kind of thing and I see no reason not to use it in this case.
Jason's link is a good article. Things you need to be aware of are that the UI can only be updated by the main UI thread; you will get cross-threading exceptions if you try to do it from the worker thread. The BackgroundWorker control can help you there with its events, but you should also know about Control.Invoke (or Control.Begin/EndInvoke). This can be used to execute delegates in the context of the UI thread.
Also you should read up on the gotchas of accessing the same code/variables from different threads, some of these issues can lead to bugs that are intermittent and tricky to track down.
One point to note is that the volatile keyword only guarantees 'freshness' of variable access; that is, it guarantees that each read and write of the variable goes to main memory and not to a thread or processor cache or some other 'feature' of the memory model. It doesn't stop issues like a thread being interrupted by another thread in the middle of its read-update-write sequence (e.g. while changing the variable's value). This causes errors such as lost updates, or two threads ending up with the same value for the variable when they should have different values. To prevent these problems, use a lock/Monitor (or another thread synchronization mechanism: wait handles, Interlocked.Increment/Decrement, etc.), which guarantees that only one thread can access the variable at a time. (Monitor also has the advantage that it implicitly performs volatile reads and writes.)
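The difference is easiest to see with a counter incremented from several threads. The following sketch is only an illustration; CounterDemo and its members are invented names. The volatile counter can lose updates, while the Interlocked and lock versions cannot.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    private static volatile int _volatileCounter;   // fresh reads and writes, but ++ is still not atomic
    private static int _interlockedCounter;
    private static int _lockedCounter;
    private static readonly object Gate = new object();

    public static int VolatileCounter => _volatileCounter;
    public static int InterlockedCounter => _interlockedCounter;
    public static int LockedCounter => _lockedCounter;

    public static void Run(int iterations)
    {
        _volatileCounter = 0;
        _interlockedCounter = 0;
        _lockedCounter = 0;

        Parallel.For(0, iterations, i =>
        {
            _volatileCounter++;                              // read-update-write: updates can be lost
            Interlocked.Increment(ref _interlockedCounter);  // atomic increment
            lock (Gate) { _lockedCounter++; }                // one thread at a time
        });
    }
}
```

After CounterDemo.Run(1000000), InterlockedCounter and LockedCounter are exactly 1000000, while VolatileCounter is usually lower on a multi-core machine; how much lower varies from run to run.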
And as someone else has noted, you also should try to avoid blocking your UI thread whilst waiting for background threads to complete, otherwise your UI will become unresponsive. You can do this by having your worker threads raise events that your UI subscribes to that indicate progress or completion.
Matt
Typemock have a new tool called Racer for helping with Multithreading issues. It’s a bit advanced but you can get help on their forum and in other online forums (one that strangely comes to mind is stackoverflow :-) )
I'm a newbie to multithreading as well, but I agree with Frank that a background worker is probably your best option. It works through event subscriptions. Here are the basics of how you use it.
First, instantiate a new BackgroundWorker.
Then subscribe methods in your code to the BackgroundWorker's major events:
DoWork: This should contain whatever code that takes a long time to process
ProgressChanged: This is invoked whenever you call ReportProgress() from inside the method subscribed to DoWork (WorkerReportsProgress must be set to true)
RunWorkerCompleted: Invoked when the DoWork method has completed
When you are ready to run your time-consuming process, you call the RunWorkerAsync() method of the background worker. This starts the DoWork method on a separate thread, which can then report its progress back through the ProgressChanged event. Once it has completed, RunWorkerCompleted will be invoked.
The DoWork event method can also check whether the user has requested that the process be canceled (i.e. CancelAsync() was called) by checking the value of the CancellationPending property. For this, WorkerSupportsCancellation must be set to true.
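Put together, the steps above might look like the following sketch. It is written as a plain method so that it can be tried outside a UI project; WorkerProgressDemo, the five fake steps and the Thread.Sleep call are placeholders for the real work.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class WorkerProgressDemo
{
    // Returns "completed" or "cancelled". Pass cancel = true to request cancellation right after starting.
    public static string Run(bool cancel)
    {
        string outcome = null;
        using (var done = new ManualResetEventSlim())
        using (var worker = new BackgroundWorker
        {
            WorkerReportsProgress = true,      // required for ReportProgress()
            WorkerSupportsCancellation = true  // required for CancelAsync()
        })
        {
            worker.DoWork += (sender, e) =>
            {
                for (int step = 1; step <= 5; step++)
                {
                    if (worker.CancellationPending) { e.Cancel = true; return; }
                    Thread.Sleep(50);                 // stands in for one long-running step
                    worker.ReportProgress(step * 20); // raises ProgressChanged
                }
            };

            worker.ProgressChanged += (sender, e) =>
                Console.WriteLine("Progress: {0}%", e.ProgressPercentage);

            worker.RunWorkerCompleted += (sender, e) =>
            {
                outcome = e.Cancelled ? "cancelled" : "completed";
                done.Set();
            };

            worker.RunWorkerAsync();
            if (cancel) worker.CancelAsync();
            done.Wait(); // demo-only: a WPF window would simply return to its message loop
        }
        return outcome;
    }
}
```

When the worker is created and started on a WPF or WinForms UI thread, ProgressChanged and RunWorkerCompleted are raised on that UI thread, so the handlers can update a list box or progress bar directly. In a console or unit test host, as here, they run on thread pool threads and progress messages may arrive out of order.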