I'm presently working on a side-by-side application (C#, WinForms) that injects messages into an application via COM.
This application uses multiple foreach statements, polling entity metrics from the application that accepts COM. A ListBox is used to list each entity, and when a user selects one from this list, a thread is created and executed, calling a method that retrieves the required data.
When a user selects a different entity from the list, the running thread is aborted and a new thread is created for the newly selected entity.
I've spent a day looking into my threading and memory usage and have concluded that everything is fine. There are never more than 6 threads running concurrently (each executing a different member), and according to Windows Task Manager my application never exceeds 10% CPU or 29 MB of memory.
The only thing coming to mind is that the COM object you are using is designed to run in a single threaded apartment (STA). If that is the case then it will not matter how many threads you start; they will all eventually get serialized when calling into this COM object. And if your machine has multiple cores then you will definitely see less than 100% usage. 10% seems awfully low though. I would not be surprised to see something around 25% which would basically represent one pegged core of a quad core system, but the 10% figure might require another explanation. If your code or the COM object itself is waiting for IO operations to complete that might explain more of the low throughput.
In WinForms you can call SuspendLayout() and ResumeLayout(). If you are inserting a lot of items (or, in general, doing a lot of screen updates), first call SuspendLayout(), then do all of your updates, and then call ResumeLayout().
You don't mention what's slow, so it's very difficult to say anything with certainty. However, since you say that you insert items into a listbox, I'll make a complete guess and ask how many items is that each time? It can be very slow to insert a lot of items into a list box.
If that's the case, you could speed it up: instead of listing every entity in one list box, list only a set of categories there, and when the user selects a category, populate a second list box with the entities in that category.
I'm trying to improve upon this program that I wrote for work. Initially I was rushed, and they don't care about performance or anything. So, I made a horrible decision to query an entire database (a SQLite database) and then store the results in lists for use in my functions. However, I'm now considering running each of my functions on its own thread and having each function query only the parts of the database that it needs. There are ~25 functions. My question is, is this safe to do? Also, is it possible to have that many concurrent connections? I will only be PULLING information from the database, never inserting or updating.
The way I've had it described to me[*] is to have each concurrent thread open its own connection to the database, as each connection can only process one query or modification at a time. The group of threads with their connections can then perform concurrent reads easily. If you've got a significant problem with many concurrent writes causing excessive blocking or failure to acquire locks, you're getting to the point where you're exceeding what SQLite does for you (and should consider a server-based DB like PostgreSQL).
Note that you can also have a master thread open the connections for the worker threads if that's more convenient, but it's advised (for your sanity's sake if nothing else!) to only actually use each connection from one thread.
[* For a normal build of SQLite. It's possible to switch things off at build time, of course.]
SQLite has no write concurrency, but it supports arbitrarily many connections that read at the same time.
Just ensure that every thread has its own connection.
25 simultaneous connections is not a smart idea. That's a huge number.
I usually create a multi-layered design for this problem. I send all requests to the database through a kind of ObjectFactory class that has an internal cache. The ObjectFactory will forward the request to a ConnectionPoolHandler and will store the results in its cache. This connection pool handler uses X simultaneous connections but dispatches them to several threads.
However, some remarks must be made before applying this design. You first have to ask yourself the following 2 questions:
Is your application the only application that has access to this database?
Is your application the only application that modifies data in this database?
If the first question is answered negatively, then you could encounter locking issues. If the second question is answered negatively, then it will be extremely difficult to apply caching. You may even prefer not to implement any caching at all.
Caching is especially interesting if you often request objects by a unique reference, such as the primary key. In that case you can store the most often used objects in a Map. A popular collection for caching is an "LRUMap" ("Least-Recently-Used" map). The benefit of this collection is that it keeps the most recently used objects at the front. At the same time it has a maximum size and automatically evicts the least recently used items once that size is reached.
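To make the LRU idea concrete, here is a minimal sketch in C#. The class name `LruCache` and its `TryGet`/`Put` methods are made up for illustration (they are not from any library), and the class is not thread-safe, so a real ObjectFactory would need to guard it with a lock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical LRU cache: a dictionary gives O(1) lookup, and a linked list
// keeps the entries ordered from most recently used (front) to least (back).
public sealed class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    public LruCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count { get { return _map.Count; } }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (_map.TryGetValue(key, out node))
        {
            // A hit moves the entry to the front of the list.
            _order.Remove(node);
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public void Put(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> existing;
        if (_map.TryGetValue(key, out existing))
        {
            // Replacing a value: drop the old node, the new one goes to the front.
            _order.Remove(existing);
        }
        else if (_map.Count >= _capacity)
        {
            // Cache is full: evict the least recently used entry (back of the list).
            LinkedListNode<KeyValuePair<TKey, TValue>> oldest = _order.Last;
            _order.RemoveLast();
            _map.Remove(oldest.Value.Key);
        }
        _map[key] = _order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
    }
}
```

With a capacity of 2, putting keys 1 and 2, reading key 1, and then putting key 3 evicts key 2, because key 1 was touched more recently.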
A second advantage of caching is that each object exists only once. For example:
An Employee is fetched from the database.
The ObjectFactory converts the resultset to an actual object instance.
The ObjectFactory immediately stores it in the cache.
A bit later, a bunch of employees are fetched using an SQL "... where name like 'John%'" statement.
Before converting the resultset to objects, the ObjectFactory first checks whether the IDs of these records are already stored in the cache.
Found a match! Aha, this object does not need to be recreated.
There are several advantages to having a certain object only once in memory.
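The steps above could look roughly like this in C#. This is only a sketch: `Employee`, `ObjectFactory.Materialize` and the (id, name) row shape are hypothetical names chosen for illustration, and the actual database access is left out.

```csharp
using System;
using System.Collections.Generic;

public sealed class Employee
{
    public Employee(int id, string name) { Id = id; Name = name; }
    public int Id { get; private set; }
    public string Name { get; private set; }
}

// Hypothetical factory that guarantees one Employee instance per primary key.
public sealed class ObjectFactory
{
    private readonly Dictionary<int, Employee> _cache = new Dictionary<int, Employee>();

    // Converts one raw row (id, name) into an Employee. If an object with the
    // same ID is already cached, that instance is returned instead of a new one.
    public Employee Materialize(int id, string name)
    {
        Employee cached;
        if (_cache.TryGetValue(id, out cached))
        {
            return cached;
        }
        Employee created = new Employee(id, name);
        _cache[id] = created;
        return created;
    }
}
```

Fetching employee 7 by primary key and later hitting the same row through the "John%" query would hand back the same instance both times, so a change made through one reference is visible through the other.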
Last but not least, Java has something called "Weak References". These are references whose targets can still be cleaned up by the garbage collector. I am not sure if this exists in C# or what it's called. By using them, you don't even have to care about the maximum number of cached objects; your garbage collector will take care of it.
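For what it's worth, .NET does have an equivalent: `System.WeakReference<T>` (and the older non-generic `System.WeakReference`). A minimal sketch of a cache built on it follows; `WeakCache` is a made-up name, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache that holds its values only through weak references,
// so the garbage collector is free to reclaim them when nothing else uses them.
public sealed class WeakCache<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, WeakReference<TValue>> _entries =
        new Dictionary<TKey, WeakReference<TValue>>();

    public void Put(TKey key, TValue value)
    {
        _entries[key] = new WeakReference<TValue>(value);
    }

    public bool TryGet(TKey key, out TValue value)
    {
        WeakReference<TValue> weak;
        if (_entries.TryGetValue(key, out weak) && weak.TryGetTarget(out value))
        {
            return true;
        }
        // Either the key was never cached or its target has been collected;
        // remove the (possibly dead) entry so it does not linger.
        _entries.Remove(key);
        value = null;
        return false;
    }
}
```

One caveat with this sketch: the dictionary keys and the small `WeakReference` wrappers are only removed when a lookup finds them dead, so a long-lived cache would still want an occasional sweep.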
I currently have a C# console app where multiple instances run at the same time. The app accesses values in a database and processes them. While a row is being processed it is flagged so that no other instance attempts to process it at the same time. My question is: what is an efficient and graceful way to unflag those values in the event an instance of the program crashes? If an instance crashed, I would only want to unflag the values that were being processed by that instance of the program.
Thanks
The potential solution will depend heavily on how you start the console applications.
In our case, the applications are started based on configuration records in the database. When one of these applications performs a lock, it uses the primary key from the database configuration record to perform the lock.
When the application starts up, the first thing it does is release all locks on the records that it previously locked.
To control all of the child processes, we have a service that uses the information from the configuration tables to start the processes and then keeps an eye on them, restarting them when they fail.
Each of the processes is also responsible for updating a status table in the database with the last time it was available with a maximum allowed delay of 2 minutes (for heavy processing). This status table is used by sysadmins to watch for problems, but it could also be used to manually release locks in case of a repeating failure in a given process.
If you don't have a structured approach like this, it could be very difficult to automatically unlock records unless you have a solid profile of your application performance that would allow you to know that any lock over 5 minutes old is invalid because it should only take, on average, 15 seconds to process a record with a maximum of 2 minutes.
To handle any kind of crash, even a power failure, I would suggest additionally timestamping the records and, after some reasonable timeout, treating them as unlocked even if they are still flagged.
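The timeout check itself is tiny. Here is a sketch, assuming each row carries a flag plus the UTC time at which it was flagged; the name `RowLocks.IsAvailable` and its parameters are invented for illustration, and the timeout value is something you would have to pick from your own processing times.

```csharp
using System;

public static class RowLocks
{
    // A row may be picked up when it is not flagged at all, or when its flag is
    // older than the timeout (the instance that set it is presumed to have died).
    public static bool IsAvailable(bool flagged, DateTime flaggedAtUtc,
                                   DateTime nowUtc, TimeSpan timeout)
    {
        if (!flagged)
        {
            return true;
        }
        return nowUtc - flaggedAtUtc > timeout;
    }
}
```

In practice the same comparison would normally live in the WHERE clause of the query that claims the next row, so that checking and re-flagging happen in one atomic statement.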
I have an existing application written in C++ that currently performs a number of tasks: reading transactions from a database for all customers, processing them, and writing the results back.
What I want to do is have multiple versions of this running in parallel on separate machines to increase transaction capacity, by assigning a certain subset of customers to each version of the app so that there is no contention or data sharing required, hence no locking or synchronisation.
What I also want, though, is to have multiple versions running on the same machine as well as distributed across other machines, so on a quad-core box there would be four instances of the application running, each utilising one of the CPUs.
I will be wrapping the C++ code in a .NET C# interface and managing all these processes, local and distributed, from a parent C# management service responsible for creating, starting and stopping the processes, as well as all communication and management between them.
What I want to know is if I create four instances each on a separate background thread on a quad core box, whether or not the CLR and .NET will automatically take care of spreading the load across the four CPUs on each box or whether I need to do something to make use of the parallel processing capability?
If you mean that you will be running your application in four processes on the same box, then it is the operating system (Windows) which controls how these processes are allocated CPU time. If the processes are doing similar work, then generally they will get roughly equal processor time.
But, have you considered using four threads within a single process? Threads are much more lightweight than processes, and you wouldn't then need a separate management service, i.e., you would have one process (with four threads) instead of 5 processes. Do you come from a unix background by any chance?
You can set the process affinity when launching the process via the Process object (or ProcessThread depending on how you are launching the app).
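A rough sketch of setting the affinity through `Process.ProcessorAffinity`, which takes a bitmask with one bit per logical processor. `AffinityHelper`, `MaskForCore` and `PinToCore` are hypothetical helper names, and the property is not supported on every platform.

```csharp
using System;
using System.Diagnostics;

public static class AffinityHelper
{
    // Builds a bitmask in which only the bit for the given zero-based core is set.
    public static IntPtr MaskForCore(int coreIndex)
    {
        if (coreIndex < 0 || coreIndex >= Environment.ProcessorCount)
        {
            throw new ArgumentOutOfRangeException(nameof(coreIndex));
        }
        return new IntPtr(1L << coreIndex);
    }

    // Restricts an already started process to a single core.
    public static void PinToCore(Process process, int coreIndex)
    {
        process.ProcessorAffinity = MaskForCore(coreIndex);
    }
}
```

A management service could start each worker with `Process.Start(...)` and then call `AffinityHelper.PinToCore(worker, i)` for i = 0..3. Whether pinning actually helps is worth measuring, since the operating system already spreads runnable processes across cores on its own.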
Here is an SO post which covers the subject (I didn't vote to close as a duplicate (yet) because I'm not 100% sure if this is exactly what you are after).
When a user visits an .aspx page, I need to start some background calculations in a new thread. The results of the calculations need to be stored in the user's Session, so that on a callback, the results can be retrieved. Additionally, on the callback, I need to be able to see what the status of the background calculation is. (E.g. I need to check if the calculation is finished and completed successfully, or if it is still running) How can I accomplish this?
Questions
How would I check on the status of the thread? Multiple users could have background calculations running at the same time, so I'm unsure how knowing which thread belongs to which user would work (though in my scenario, the only thread that matters is the one originally started by user A, and user A does a callback to retrieve/check on the status of that thread).
Am I correct in my assumption that passing an HttpSessionState "Session" variable for the user to the new thread will work as I expect (e.g. I can then add stuff to their Session later)?
Thanks. Also I have to say, I might be confused about something but it seems like the SO login system is different now, so I don't have access to my old account.
Edit
I'm now thinking about using the approach described in this article which basically uses a class and a Singleton to manage a list of threads. Instead of storing my data in the database (and incurring the performance penalty associated with retrieving the data, as well as the extra table, maintenance, etc in the database), I'll probably store the data in my class as well.
Edit 2
The approach mentioned in my first edit worked well. Additionally I had timers to ensure the threads, and their associated data, were both cleaned up after the corresponding timers called their cleanup methods. The Objects containing my data and the threads were stored in the Singleton class. For some applications it might be appropriate to use the database for storage but it seemed like overkill for mine, since my data is tied to a specific instance of a page, and is useless outside of that page context.
I would not expect session-state to continue working in this scenario; the worker may have no idea who the user is, and even if it does (or more likely: you capture this data into the worker), no reason to store anything (updating session is a step towards the end of the request pipeline; but if you aren't in the pipeline...?).
I suspect you might need to store this data separately using some unique property of the user (their id or cn), or invent a GUID otherwise. On a single machine it may suffice to store this in a synchronised dictionary (or similar), but on a farm/cluster you may need to push the data down a layer to your database or state server. And fetch manually.
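For the single-machine case, the synchronised dictionary could be sketched like this. All names here (`BackgroundJobs`, `JobStatus`, `JobState`) are invented for illustration. The idea is that the page stores only the returned GUID in Session while still inside the request, and the callback uses that GUID to look up the status.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum JobState { Running, Completed, Failed }

public sealed class JobStatus
{
    public JobStatus(JobState state, object result) { State = state; Result = result; }
    public JobState State { get; private set; }
    public object Result { get; private set; }
}

// Hypothetical process-wide registry of background calculations, keyed by a GUID.
public static class BackgroundJobs
{
    private static readonly ConcurrentDictionary<Guid, JobStatus> Jobs =
        new ConcurrentDictionary<Guid, JobStatus>();

    // Starts the work on a thread-pool thread and returns the key for later lookups.
    public static Guid Start(Func<object> work)
    {
        Guid id = Guid.NewGuid();
        Jobs[id] = new JobStatus(JobState.Running, null);
        Task.Run(() =>
        {
            try { Jobs[id] = new JobStatus(JobState.Completed, work()); }
            catch (Exception) { Jobs[id] = new JobStatus(JobState.Failed, null); }
        });
        return id;
    }

    public static bool TryGetStatus(Guid id, out JobStatus status)
    {
        return Jobs.TryGetValue(id, out status);
    }

    // Call once the result has been consumed, so finished entries do not pile up.
    public static bool Remove(Guid id)
    {
        JobStatus ignored;
        return Jobs.TryRemove(id, out ignored);
    }
}
```

Each status is replaced as a whole rather than mutated in place, so a reader always sees either the Running entry or a finished one, never a half-written result.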
I have a program that we'd like to multi-thread at a certain point. We're using CSLA for our business rules. At one location in our program we are iterating over a BusinessList object and running some sanity checks against the data one row at a time. When we up the row count to about 10k rows, the process takes some time to run (about a minute). Naturally this sounds like a perfect place to use a bit of TPL and make this multi-threaded.
I've done a fair amount of multithreaded work through the years, so I understand the pitfalls of switching from single to multithreaded code. I was surprised to find that the code bombed within the CSLA routines themselves. It seems to be related to the code behind the CSLA PropertyInfo classes.
All of our business object properties are defined like this:
    public static readonly PropertyInfo<string> MyTextProperty = RegisterProperty<string>(c => c.MyText);
    public string MyText {
        get { return GetProperty(MyTextProperty); }
        set { SetProperty(MyTextProperty, value); }
    }
Is there something I need to know about multithreading and CSLA? Are there any caveats that aren't found in any written documentation (I haven't found anything as of yet).
--EDIT---
BTW: I implemented my multithreading by throwing all the rows into a ConcurrentBag and then spawning 5 or so tasks that just grab objects from the bag until it is empty. So I don't think the problem is in my code.
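For readers following along, the pattern described in the edit probably looks something like the sketch below. This is a reconstruction, not the asker's actual code; `BagWorkers.ProcessAll` and the `check` callback are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BagWorkers
{
    // Copies the rows into a ConcurrentBag, then starts workerCount tasks that
    // each keep taking rows and running the check until the bag is empty.
    public static void ProcessAll<T>(IEnumerable<T> rows, int workerCount, Action<T> check)
    {
        ConcurrentBag<T> bag = new ConcurrentBag<T>(rows);
        Task[] workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = Task.Run(() =>
            {
                T row;
                while (bag.TryTake(out row))
                {
                    check(row);
                }
            });
        }
        // Blocks until every worker has drained its share; exceptions thrown by
        // check surface here wrapped in an AggregateException.
        Task.WaitAll(workers);
    }
}
```

The bag guarantees that each row is handed to exactly one task, but it does nothing for whatever `check` touches internally. If the rows share state behind the scenes, the checks can still race even though the distribution code is correct.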
As you've discovered, the CSLA.NET framework is not thread-safe.
To solve your particular problem, I would make use of the Wintellect Power Threading library; either the AsyncEnumerator/SyncGate combo or the ReaderWriterGate on its own.
The Power Threading library will allow you to queue 'read' and 'write' requests to a shared resource (your CSLA.NET collection). At any one moment, only a single 'write' request will be allowed access to the shared resource, all without thread-blocking the queued 'read' or 'write' requests. It's very clever and super handy for safely accessing shared resources from multiple threads. You can spin up as many threads as you wish and the Power Threading library will synchronise access to your CSLA.NET collection.