I need to put some ListView populate code in a thread. For the simple case, this thread should not run twice. When I want to issue a new thread, the previous one needs to stop, whatever it was doing.
EDIT 1
The scenario is the following.
I have a textual filter above the ListView. On textchange I call a populateList() method.
The problem is that the code can take a long time, as it uses SQL LIKE syntax on a large database.
Until it finishes, the user cannot type anything. So there are severe hangs while you type in "abc": you only get to type "c" after 10 seconds.
So I have in mind to run the populateList() method in a Thread and allow the user to quickly type in something more. The longer the text, the slower the SQL query.
In the "abc" situation, if you type "a" the code runs the query in the background, but if the user presses "b" in the meantime I want to stop the execution of the "a" thread and issue a new one, now using "ab"... and so on with the "c".
EDIT 2
Still looking for more answers.
I don't know whether I got the question right... If you're dealing with multiple threads and for some reason you want to stop the previous thread, you can check from the new thread whether the previous thread is still alive and, if so, make it abandon its work... If this is not what you expect, can you please detail your question?
Update:
Now I understand your question. I'm working on a similar issue myself, but with an in-memory db. In your case, if your db supports LIMIT, then you can limit the number of results returned from the query (the exact syntax depends on the database; MySQL, for example, takes an offset followed by a row count) using
LIMIT offset, batchSize
Now in your thread's method, use a 'for' to loop through the batches of results and in each iteration check whether the textbox text has changed. If it has changed, then break the loop, clear the list, go to the start of the thread's method and do the same process again on the same thread, but with the new text filter.
PSEUDO CODE:
do
    isLoop = false
    clear the collection that is bound to the listbox
    from db, get the record count
    for: loop 0 to recordcount increment batchsize
        get chunks of data from db of size batchsize
        add the data to collection [using dispatcher.invoke]
        if textbox text changed
            isLoop = true
            break for-loop
        endif
    endfor
while(isLoop)
Hope this helps.
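A variation on the pseudo code above can be sketched with a CancellationTokenSource instead of re-checking the textbox. This is only a minimal sketch: FilterRunner, fetchBatch and publish are hypothetical names, fetchBatch stands in for the LIMIT query (filter, offset, count), and publish is assumed to marshal the rows to the UI thread (for example via the dispatcher).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: each new filter text cancels the query started for the previous one.
public sealed class FilterRunner
{
    private readonly Func<string, int, int, IReadOnlyList<string>> _fetchBatch;
    private readonly Action<IReadOnlyList<string>> _publish;
    private readonly int _batchSize;
    private readonly object _gate = new object();
    private CancellationTokenSource _current;

    public FilterRunner(
        Func<string, int, int, IReadOnlyList<string>> fetchBatch, // (filter, offset, count)
        Action<IReadOnlyList<string>> publish,                    // assumed to update the list
        int batchSize)
    {
        _fetchBatch = fetchBatch;
        _publish = publish;
        _batchSize = batchSize;
    }

    // Call this from the text-changed handler.
    public Task OnFilterChanged(string filter)
    {
        var cts = new CancellationTokenSource();
        lock (_gate)
        {
            _current?.Cancel();   // ask the previous query to stop
            _current = cts;
        }
        CancellationToken token = cts.Token;
        return Task.Run(() => Run(filter, token));
    }

    private void Run(string filter, CancellationToken token)
    {
        var results = new List<string>();
        for (int offset = 0; ; offset += _batchSize)
        {
            // A newer filter superseded this one: drop the partial results.
            if (token.IsCancellationRequested) return;
            IReadOnlyList<string> batch = _fetchBatch(filter, offset, _batchSize);
            results.AddRange(batch);
            if (batch.Count < _batchSize) break;   // last (possibly empty) batch
        }
        if (!token.IsCancellationRequested) _publish(results);
    }
}
```

The query is only interrupted between batches, so a smaller batch size makes typing feel more responsive at the cost of more round trips to the database.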
Related
I am trying to make a C# WinForms application that fetches data from URLs saved in a table named "Links". Each link has a "Last Checked" and a "Next Check" datetime, and there is an "interval" which determines "Next Check" based on "Last Checked".
Right now, I fetch the ID with a query BEFORE doing the web scraping, and then I set Last Checked to DateTime.Now and Next Check to null until everything is completed. Both are then updated after the web scraping is done.
The problem with this is that if an ongoing process is aborted, lastcheck will be a date, but nextcheck will be null.
So I need a better way to keep two processes from working on the same row of the same table, but I am not sure how.
For a multithreaded solution, the standard engineering approach is to use a pool of workers and a pool of work.
This is just a conceptual sketch - you should adapt it to your circumstances:
A worker (i.e. a thread) looks at the pool of work. If there is some work available, it marks it as in_progress. This has to be done so that no two threads can take the same work. For example, you could use a lock in C# to do the query in a database, and to mark a row before returning it.
You need to have a way of un-marking it after the thread finishes. Successful or not, in_progress must be re-set. Typically, you could use a finally block so that you don't miss it in the event of any exception.
If there is no work available, the thread goes to sleep.
Whenever new work arrives (i.e. an INSERT, or a nextcheck is due), one of the sleeping threads is awakened.
When your program starts, it should clear any in_progress flags in the event of a previous crash.
You should take advantage of DBMS transactions so that any changes a worker makes after completing its work are atomic - i.e. other threads perceive them as if they had happened all at once.
By changing the size of worker pool, you can set the maximum number of simultaneously active workers.
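The claim/release steps above could look roughly like this. It is an in-memory sketch only: Link, WorkPool, TryClaim and ProcessOne are hypothetical names, and in the real application the body of TryClaim would be the database query plus the update that sets in_progress.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a row of the "Links" table.
public sealed class Link
{
    public int Id;
    public DateTime NextCheck;
    public bool InProgress;
}

public sealed class WorkPool
{
    private readonly object _gate = new object();
    private readonly List<Link> _links;

    public WorkPool(List<Link> links) { _links = links; }

    // Finds a due link and marks it under the lock, so no two workers get the same one.
    public Link TryClaim(DateTime now)
    {
        lock (_gate)
        {
            foreach (Link link in _links)
            {
                if (!link.InProgress && link.NextCheck <= now)
                {
                    link.InProgress = true;
                    return link;
                }
            }
            return null;   // no work available
        }
    }

    public void Release(Link link)
    {
        lock (_gate) { link.InProgress = false; }
    }

    // Claims one link, processes it, and un-marks it whether or not scrape throws.
    public bool ProcessOne(DateTime now, Action<Link> scrape)
    {
        Link link = TryClaim(now);
        if (link == null) return false;
        try { scrape(link); }
        finally { Release(link); }
        return true;
    }
}
```

A lock like this only protects threads inside one process; if several processes share the table, the marking step would have to be made atomic in the database itself.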
First thing, the separation of controller/workers might be a better pattern, as mentioned in the other answer. This will work better if the number of threads gets large and the number of links to check is large.
But if your problem is this:
But problem with it is, if for any reason that scraping gets
aborted/finishes halfway/doesn't work properly, LastCheck becomes
DateTime.Now but NextCheck is left NULL, and previous
LastCheck/NextCheck values are gone, and LastCheck/NextCheck values
are updated for a link that is not actually checked
You just need to handle errors better.
The failure will result in an exception. Catch the exception and handle it by resetting the state in the database. For example:
void DoScraping(.....)
{
    try
    {
        // ....
    }
    catch (Exception err)
    {
        // oh dear, it went wrong, reset lastcheck/nextcheck
    }
}
What you reset last/nextcheck to is up to you. You could reset them to what they were at the start: when you determine 'the next thing to do', also read the values of last/nextcheck and store them in variables. Then, in the event of failure, just set them back to what they were before.
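Remembering the old values and restoring them on failure might look like the sketch below. LinkState, Scraper.CheckLink and the scrape delegate are hypothetical names, and the object fields stand in for the columns that would really be updated in the database.

```csharp
using System;

// Hypothetical in-memory stand-in for the LastCheck/NextCheck columns of one link.
public sealed class LinkState
{
    public DateTime? LastCheck;
    public DateTime? NextCheck;
}

public static class Scraper
{
    // Returns true when scraping succeeded; on failure the old values are restored.
    public static bool CheckLink(LinkState link, TimeSpan interval, Action scrape)
    {
        // Remember what the values were before touching them.
        DateTime? previousLast = link.LastCheck;
        DateTime? previousNext = link.NextCheck;
        try
        {
            link.LastCheck = DateTime.Now;
            link.NextCheck = null;            // marks the link as "being checked"
            scrape();
            link.NextCheck = link.LastCheck.Value + interval;
            return true;
        }
        catch (Exception)
        {
            // It went wrong: put lastcheck/nextcheck back as they were.
            link.LastCheck = previousLast;
            link.NextCheck = previousNext;
            return false;
        }
    }
}
```

A catch block cannot run if the whole process is killed, so a startup cleanup pass for rows left with a null NextCheck would probably still be needed.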
I should first mention I am new to programming. I will explain my problem and my question to the best of my ability.
I was wondering if it is possible to update a single column of an ObjectListView (from here on referred to as OLV). What I would like is to have a column that would display the "Elapsed Time" of each row. This Elapsed Time column would be refreshed every 15 to 30 seconds.
How I set my OLV
myOLV.SetObjects(GetMyList());
GetMyList method returns a list populated from a simple database query.
Within the GetMyList method, it would convert the DateTime column into a string showing the elapsed time. This is then shown by the OLV as a string, not a datetime...
elapsed_time = ((TimeSpan)(DateTime.Now - x.time_arrived)).ToString("hh\\:mm\\:ss");
How I was trying to make this work was by using a timer that would every 30 seconds recall the GetMyList method. Because the method would re-query the database and return the records it retrieved, they were more up to date. This worked fine until I had more than 20 rows in the OLV. Each "refresh" for 200 rows takes 4 seconds to complete. That is 4 seconds that the UI is unresponsive...
As I am new to programming, I have never done anything with multi-threading, but I did a little reading up on it and attempted to create a solution. Even when I create a new thread to "refresh" the OLV object, it still causes the entire form to become unresponsive.
My question is, how can I have a column within my ObjectListView refresh every 15-30 seconds without causing the entire UI to become unresponsive?
Is it even possible/ a good idea to have the ObjectListView refresh in the background and then display it when it's ready?
The problem is the run time of the database query, not the updating of ObjectListView (updating 200 rows in ObjectListView should take about 200ms). You need to get the database work off the UI thread and onto a background thread. That background thread can then periodically fetch the updated data from the database. Once you have the updated data, updating ObjectListView is very fast.
However, multi-threading is a deep-dive topic that is likely to bite you. There are many things that can go wrong. You would be better off having a button on your UI that the user can click to Refresh the grid, and running everything synchronously. Once the synchronous version is working perfectly, then start working on the much more error prone multi-threaded version.
To strictly answer your question, you would do something like this:
var thread = new Thread(_ => {
    while (!_cancelled) {
        Thread.Sleep(15 * 1000);
        if (!_cancelled) {
            var results = QueryDatabase();
            this.objectListView1.SetObjects(results);
        }
    }
});
thread.Start();
_cancelled is a boolean variable in your class that allows you to cancel the querying process. QueryDatabase() is your method that fetches the new data.
This example doesn't deal with errors, which are a significant issue from background threads.
Another gotcha is that most UI components cannot be updated from a background thread. However, ObjectListView has a few methods that are thread-safe, and SetObjects() is one of them, so you can call it like this.
Just to repeat: you really should start with the synchronous version, and only start thinking about async once you are happy with that one.
I am trying to use Thread for the first time in my Windows service application. I have to read data from the database and, if a row matches a condition, execute a function in a new thread. The function that runs in the new thread is lengthy and will take time, so my question is: will my program get back to the data reader code and read the next value from the database while the function keeps executing in the background thread? My application's execution logic is time specific.
Here is the code..
while (dr.Read())
{
    time = dr["SendingTime"].ToString();
    if ((str = DateTime.Now.ToString("HH:mm")).Equals(time))
    {
        //Execute Function and send reports based on the data from the database.
        Thread thread = new Thread(sendReports);
        thread.Start();
    }
}
Please help me..
Yep, as the comments said, you will have one thread per row. If you have 4-5 rows and you run that code, you'll get 4-5 threads working happily in the background.
You might be happy with it, and leave it, and in half a year, someone else will play with the DB, and you'll get 10K rows, and this will create 10K threads, and you'll be on a holiday and people will call you panicking because the program is broken ...
In other words, you don't want to do it, because it's a bad practice.
You should either use a queue of work units and have a fixed number of threads reading from that queue (in which case you might have 10K units there, but, let's say, 10 threads that will pick them up and process them until they are done), or some other mechanism to make sure you don't create a thread per row.
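One way to sketch the queue-plus-fixed-threads idea is with BlockingCollection. ReportQueue and ProcessAll are hypothetical names, and the sendReport delegate stands in for the lengthy function from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class ReportQueue
{
    // Processes every item using a fixed number of threads and returns when all are done.
    public static void ProcessAll(string[] items, int workerCount, Action<string> sendReport)
    {
        using (var queue = new BlockingCollection<string>())
        {
            var workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    // Blocks until an item is available; the loop ends after CompleteAdding().
                    foreach (string item in queue.GetConsumingEnumerable())
                        sendReport(item);
                });
                workers[i].Start();
            }

            foreach (string item in items) queue.Add(item);   // one work unit per row
            queue.CompleteAdding();                           // no more work will arrive

            foreach (Thread worker in workers) worker.Join();
        }
    }
}
```

With this shape, 10K rows still mean 10K work units, but never more than workerCount threads.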
Unless of course, you don't care ...
I am on MSDN reading about the BackgroundWorker class and I have a question about how it works.
The following code has a for loop in it. And inside the for loop, in the else clause, you're supposed to: Perform a time consuming operation and report progress.
But, why is there a for loop, and why is its maximum value only 10?
private void bw_DoWork(object sender, DoWorkEventArgs e)
{
    BackgroundWorker worker = sender as BackgroundWorker;
    for (int i = 1; (i <= 10); i++)
    {
        if ((worker.CancellationPending == true))
        {
            e.Cancel = true;
            break;
        }
        else
        {
            // Perform a time consuming operation and report progress.
            System.Threading.Thread.Sleep(500);
            worker.ReportProgress((i * 10));
        }
    }
}
I have a really massive database, and it takes sometimes up to a minute to check for new orders based on certain criteria. I don't want to guess how long it may take to complete a query, I want actual progress. How can I make this background worker report progress based on a MySQL SELECT query?
How can I make this background worker report progress based on a MySQL SELECT query?
You can't. That's one of the problems with a synchronous method call: you cannot predict ahead of time how long it is going to take. You have two points in time to work with: before you call the method, and after it returns. You do not get anything in between. Either the method has returned, or it has not.
You can use statistics to your advantage though. You can record how long it takes each time it executes, store that, and use that to calculate a prediction, but it's never going to be accurate. With such a prediction, you could space out progress reporting accordingly so that you end up at 100% at or around the statistical prediction you've calculated.
However, if the database is slower or faster than usual, it'll be off.
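The statistical approach could be sketched as follows. DurationEstimator is a hypothetical helper, the plain average is just one possible predictor, and the cap at 99% is an assumption made here so the bar never claims completion before the query actually returns.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: predicts query duration from earlier runs.
public sealed class DurationEstimator
{
    private readonly List<double> _pastSeconds = new List<double>();

    // Call after each query with the time it actually took.
    public void Record(TimeSpan duration) { _pastSeconds.Add(duration.TotalSeconds); }

    // Average of the recorded runs, or the fallback when nothing was recorded yet.
    public double PredictSeconds(double fallbackSeconds)
    {
        if (_pastSeconds.Count == 0) return fallbackSeconds;
        double sum = 0;
        foreach (double s in _pastSeconds) sum += s;
        return sum / _pastSeconds.Count;
    }

    // Estimated percentage for a running query, capped at 99.
    public int EstimatePercent(TimeSpan elapsed, double fallbackSeconds)
    {
        double predicted = PredictSeconds(fallbackSeconds);
        if (predicted <= 0) return 99;
        double percent = 100.0 * elapsed.TotalSeconds / predicted;
        return (int)Math.Max(0, Math.Min(99, percent));
    }
}
```

A timer on the reporting side would call EstimatePercent with the time elapsed since the query started and jump to 100 only when the query returns.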
Also note that whichever thread that is calling into MySQL to retrieve data can not be the same thread that is reporting progress, since it will be "waiting" for the MySQL database and the .NET code that talks to it to return with the data, all in one piece. You need to spin up yet another thread that reports the progress.
But, why is there a for loop, and why
is its maximum value only 10?
In the example, the worker is reporting progress between 10 and 100, purely out of simplicity. The values 10 to 100 come from i (1-10), and the * 10 in ReportProgress.
The documentation says that ReportProgress takes:
The percentage, from 0 to 100, of the
background operation that is complete.
When you write it for your really massive database, you must report progress as a percentage, between 0 and 100.
Given that your database may take "up to a minute", 1% is slightly more than 1/2 second, so you should see any associated progress bar move every 1/2 second or so. That sounds like pretty smooth reporting to me.
(Other answers describe why it's difficult to attach the progress to a SQL query)
You'll need to figure out a way to measure the progress of your query. Instead of one long query, you might be able to do it in batches (say of 10, then the progress increments by 10% each time).
The example is showing how to batch long processes so they can be reported. The 'sleep' instruction in the example would be replaced by a call to a method that did a time-consuming, batchable job.
In your case, unless you can split up your query into multiple parts, you can't really use ReportProgress to give feedback - you won't have any progress to report. A SQL query is a one-shot run, and ReportProgress is used for batchable things.
You may want to look into optimizing your database - it's possible that an index on a heavily-used table or something similar could be a big help. If this isn't possible, you'll have to find a way to do batched queries against the data (or get back the whole thing and go through it in code - ugh) if you want to be able to report meaningful progress.
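If the query can be split, the batching idea might be sketched like this. BatchedQuery.FetchAll is a hypothetical helper; fetchBatch(offset, count) stands in for one limited query, and reportProgress would be worker.ReportProgress inside a DoWork handler. The total row count is assumed to be known up front, for example from a separate count query.

```csharp
using System;
using System.Collections.Generic;

public static class BatchedQuery
{
    // Fetches totalRows rows in batches and reports a percentage after each batch.
    public static List<T> FetchAll<T>(
        int totalRows,
        int batchSize,
        Func<int, int, IList<T>> fetchBatch,   // (offset, count) -> rows of one batch
        Action<int> reportProgress)            // receives 0..100
    {
        var rows = new List<T>();
        for (int offset = 0; offset < totalRows; offset += batchSize)
        {
            rows.AddRange(fetchBatch(offset, batchSize));
            int done = Math.Min(offset + batchSize, totalRows);
            reportProgress((int)(done * 100L / totalRows));
        }
        return rows;
    }
}
```

Whether batching helps depends on the query: if every batch has to repeat the expensive part of the work, the total time can get much worse.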
The example code is just that: An example. So the 10 is arbitrary. It simply shows an example of estimating progress. In this case there are 10 discrete steps, so it can estimate progress easily.
I don't want to guess how long it may
take to complete a query, I want
actual progress.
A database query provides no means to report progress. You cannot do anything but guess.
What I do is this:
Assume that the longest time it will take is the timeout period for the connection. This way if the query fails because the connection died, the user will get a perfectly accurate progress bar. Most queries take far, far less time than the timeout value, so the user sees a little progress and then suddenly it completes. This gives the user the illusion that things happened better than expected!
To accomplish it I perform the Db query asynchronously and run the progress bar off a UI timer rather than using a BackgroundWorker.
I am new to threads and in need of help. I have a data entry app that takes an exorbitant amount of time to insert a new record (i.e. 50-75 seconds). So my solution was to send an insert statement out via a ThreadPool and allow the user to begin entering the data for the record while that insert, which returns a new record ID, is running. My problem is that a user can hit save before the new ID is returned from that insert.
I tried putting in a Boolean variable which gets set to true via an event from that thread when it is safe to save. I then put in
while (safeToSave == false)
{
    Thread.Sleep(200);
}
I think that is a bad idea. If I run the save method before that thread returns, it gets stuck.
So my questions are:
Is there a better way of doing this?
What am I doing wrong here?
Thanks for any help.
Doug
Edit for more information:
It is doing an insert into a very large (approaching max size) FoxPro database. The file has about 200 fields and almost as many indexes on it.
And before you ask, no, I cannot change the structure of it, as it was here before I was and there is a ton of legacy code hitting it. The first problem is that, in order to get a new ID, I must first find the max(id) in the table, then increment and checksum it. That takes about 45 seconds. Then the first insert is simply an insert of that new id and an enterdate field. This table is not/cannot be put into a DBC, so that rules out auto-generating ids and the like.
#joshua.ewer
You have the process correct, and I think for the short term I will just disable the save button, but I will be looking into your idea of passing it into a queue. Do you have any references to MSMQ that I should take a look at?
1) Many :), for example you could disable the "save" button while the thread is inserting the object, or you could set up a worker thread which handles a queue of "save requests" (but I think the problem here is that the user wants to modify the newly created record, so disabling the button is probably better)
2) I think we need some more code to be able to understand... (or maybe it is a synchronization issue; I am not a big fan of threads either)
btw, I just don't understand why an insert should take so long... I think that you should check that code first! <- just as charles stated before (sorry, didn't read the post) :)
Everyone else, including you, addressed the core problems (insert time, why you're doing an insert, then update), so I'll stick with just the technical concerns with your proposed solution. So, if I get the flow right:
Thread 1: Start data entry for record
Thread 2: Background calls to DB to retrieve new Id
The save button is always enabled, if user tries to save before Thread 2 completes, you put #1 to sleep for 200 ms?
The simplest, not best, answer is to just have the button disabled, and have that thread make a callback to a delegate that enables the button. They can't start the update operation until you're sure things are set up appropriately.
Though, I think a much better solution (though it might be overblown if you're just building a Q&D front end to FoxPro), would be to throw those save operations into a queue. The user can key as quickly as possible, then the requests are put into something like MSMQ and they can complete in their own time asynchronously.
Use a future rather than a raw ThreadPool action. Execute the future, allow the user to do whatever they want, when they hit Save on the 2nd record, request the value from the future. If the 1st insert finished already, you'll get the ID right away and the 2nd insert will be allowed to kick off. If you are still waiting on the 1st operation, the future will block until it is available, and then the 2nd operation can execute.
You're not saving any time unless the user is slower than the operation.
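In .NET, a Task<int> can play the role of the future. The sketch below uses hypothetical names (RecordEntry, allocateId, saveWithId); allocateId stands in for the slow insert that returns the new ID.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper around one data entry session.
public sealed class RecordEntry
{
    private readonly Task<int> _newId;

    // Starts the slow ID allocation right away on a thread-pool thread.
    public RecordEntry(Func<int> allocateId)
    {
        _newId = Task.Run(allocateId);
    }

    // Blocks only if the ID is not available yet, then saves with it.
    public void Save(Action<int> saveWithId)
    {
        int id = _newId.Result;   // returns immediately if the insert already finished
        saveWithId(id);
    }
}
```

Calling Result on a UI thread freezes the form while it waits, so in a WinForms app it would likely be better to await the task or keep the Save button disabled until the task completes. If allocateId throws, Result rethrows the error wrapped in an AggregateException.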
First, you should probably find out, and fix, the reason why an insert is taking so long... 50-75 seconds is unreasonable for any modern database for a single row insert, and indicates that something else needs to be addressed, like indices, or blocking...
Secondly, why are you inserting the record before you have the data? Normally, data entry apps are coded so that the insert is not attempted until all the necessary data for the insert has been gathered from the user. Are you doing this because you are trying to get the new Id back from the database first, and then "update" the new empty record with the user-entered data later? If so, almost every database vendor has a mechanism where you can do the insert only once, without knowing the new ID, and have the database return the new ID as well... What vendor database are you using?
Is a solution like this possible:
Pre-calculate the unique IDs before a user even starts to add. Keep a list of unique Id's that are already in the table but are effectively place holders. When a user is trying to insert, reserve them one of the unique IDs, when the user presses save, they now replace the place-holder with their data.
PS: It's difficult to confirm this, but be aware of the following concurrency issue with what you are proposing (with or without threads): User A, starts to add, user B starts to add, user A calculates ID 1234 as the max free ID, user B calculates ID 1234 as the max free ID. User A inserts ID 1234, User B inserts ID 1234 = Boom!
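The usual way around this race is to make "read the max and increment it" one indivisible step. The sketch below shows the idea in memory only; IdAllocator is a hypothetical name, and a lock like this protects threads inside a single process, so with several users on separate machines the same step would need a lock that all of them share (for example on the table itself).

```csharp
using System;

// Hypothetical in-process ID allocator.
public sealed class IdAllocator
{
    private readonly object _gate = new object();
    private int _maxId;

    public IdAllocator(int currentMaxId) { _maxId = currentMaxId; }

    // Reads the maximum and increments it as one atomic step,
    // so two callers can never be handed the same ID.
    public int Next()
    {
        lock (_gate)
        {
            _maxId = _maxId + 1;
            return _maxId;
        }
    }
}
```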