BackgroundWorker Question (C# Windows Forms)

I am on MSDN reading about the BackgroundWorker class and I have a question about how it works.
The following code has a for loop in it. And inside the for loop, in the else clause, you're supposed to: Perform a time consuming operation and report progress.
But, why is there a for loop, and why is its maximum value only 10?
private void bw_DoWork(object sender, DoWorkEventArgs e)
{
    BackgroundWorker worker = sender as BackgroundWorker;
    for (int i = 1; (i <= 10); i++)
    {
        if ((worker.CancellationPending == true))
        {
            e.Cancel = true;
            break;
        }
        else
        {
            // Perform a time consuming operation and report progress.
            System.Threading.Thread.Sleep(500);
            worker.ReportProgress((i * 10));
        }
    }
}
I have a really massive database, and it takes sometimes up to a minute to check for new orders based on certain criteria. I don't want to guess how long it may take to complete a query, I want actual progress. How can I make this background worker report progress based on a MySQL SELECT query?

How can I make this background worker report progress based on a MySQL SELECT query?
You can't. That's one of the problems with a synchronous method call: you cannot predict ahead of time how long it is going to take. You have only two points in time to work with: before you call the method, and after it returns. You do not get anything in between. Either the method has returned, or it has not.
You can use statistics to your advantage though. You can record how long it takes each time it executes, store that, and use that to calculate a prediction, but it's never going to be accurate. With such a prediction, you could space out progress reporting accordingly so that you end up at 100% at or around the statistical prediction you've calculated.
However, if the database is slower or faster than usual, it'll be off.
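As a rough sketch of the statistical approach described above: record how long each run takes, then map the elapsed time of the current run onto the average of the earlier ones. The ProgressEstimator class and its method names are hypothetical, and the estimate is only as good as the history behind it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: estimates progress from how long earlier runs took.
public sealed class ProgressEstimator
{
    private readonly List<TimeSpan> _history = new List<TimeSpan>();

    // Call once after each completed query with its measured duration.
    public void Record(TimeSpan duration)
    {
        _history.Add(duration);
    }

    // Returns an estimated percentage (0-99) for a query that has been
    // running for `elapsed`. It never returns 100, because only the query
    // actually returning tells us that the work is done.
    public int EstimatePercent(TimeSpan elapsed)
    {
        if (_history.Count == 0)
        {
            return 0; // no data yet, so no basis for a prediction
        }

        double averageMs = _history.Average(d => d.TotalMilliseconds);
        if (averageMs <= 0)
        {
            return 99;
        }

        double fraction = elapsed.TotalMilliseconds / averageMs;
        return (int)Math.Min(99, Math.Max(0, fraction * 100));
    }
}
```

A timer on the reporting thread could call EstimatePercent with the time elapsed so far and jump to 100 once the query returns.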
Also note that the thread calling into MySQL to retrieve data cannot be the same thread that reports progress, since it will be blocked waiting for the MySQL database and the .NET code that talks to it to return the data, all in one piece. You need to spin up yet another thread that reports the progress.

But, why is there a for loop, and why
is its maximum value only 10?
In the example, the worker is reporting progress between 10 and 100, purely out of simplicity. The values 10 to 100 come from i (1-10), and the * 10 in ReportProgress.
The documentation says that ReportProgress takes:
The percentage, from 0 to 100, of the
background operation that is complete.
When you write it for your really massive database, you must report progress as a percentage, between 0 and 100.
Given that your database may take "up to a minute", 1% is slightly more than 1/2 second, so you should see any associated progress bar move every 1/2 second or so. That sounds like pretty smooth reporting to me.
(Other answers describe why it's difficult to tie the progress to a SQL query.)

You'll need to figure out a way to measure the progress of your query. Instead of one long query, you might be able to do it in batches (say of 10, then the progress increments by 10% each time).
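A minimal sketch of the batching idea, assuming the query can be split into a known number of batches. BatchRunner and its parameters are hypothetical names; the two callbacks stand in for "run batch i" and "report a percentage".

```csharp
using System;

// Hypothetical helper: runs a job as a fixed number of batches and
// reports a percentage after each one.
public static class BatchRunner
{
    public static void Run(int batchCount, Action<int> runBatch, Action<int> reportPercent)
    {
        if (batchCount <= 0)
        {
            throw new ArgumentOutOfRangeException("batchCount");
        }

        for (int i = 0; i < batchCount; i++)
        {
            runBatch(i);                               // e.g. query one slice of the data
            reportPercent((i + 1) * 100 / batchCount); // 10 batches -> 10, 20, ... 100
        }
    }
}
```

Inside a DoWork handler, reportPercent could simply be worker.ReportProgress, provided WorkerReportsProgress is set to true on the BackgroundWorker.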

The example is showing how to batch long processes so they can be reported. The 'sleep' instruction in the example would be replaced by a call to a method that did a time-consuming, batchable job.
In your case, unless you can split up your query into multiple parts, you can't really use ReportProgress to give feedback - you won't have any progress to report. A SQL query is a one-shot run, and ReportProgress is used for batchable things.
You may want to look into optimizing your database - it's possible that an index on a heavily-used table or something similar could be a big help. If this isn't possible, you'll have to find a way to do batched queries against the data (or get back the whole thing and go through it in code - ugh) if you want to be able to report meaningful progress.

The example code is just that: An example. So the 10 is arbitrary. It simply shows an example of estimating progress. In this case there are 10 discrete steps, so it can estimate progress easily.
I don't want to guess how long it may
take to complete a query, I want
actual progress.
A database query provides no means to report progress. You cannot do anything but guess.
What I do is this:
Assume that the longest time it will take is the timeout period for the connection. This way if the query fails because the connection died, the user will get a perfectly accurate progress bar. Most queries take far, far less time than the timeout value, so the user sees a little progress and then suddenly it completes. This gives the user the illusion that things happened better than expected!
To accomplish it I perform the DB query asynchronously and run the progress bar off a UI timer rather than using a BackgroundWorker.

Related

How do I avoid two (or more) threads that work on a table at the same time to not work on same row?

I am trying to make a C# WinForms application that fetches data from a url that is saved in a table named "Links". And each link has a "Last Checked" and "Next Check" datetime and there is "interval" which decides "next check" based on last check.
Right now, I fetch the ID with a query BEFORE doing the web scraping, then set Last Checked to DateTime.Now and Next Check to null until everything is completed. Both are then updated after the web scraping is done.
The problem with this is that if an ongoing process is aborted, lastcheck will be a date but nextcheck will be null.
So I need a better way to keep two processes from working on the same row of the same table, but I'm not sure how.

For a multithreaded solution, the standard engineering approach is to use a pool of workers and a pool of work.
This is just a conceptual sketch - you should adapt it to your circumstances:
A worker (i.e. a thread) looks at the pool of work. If there is some work available, it marks it as in_progress. This has to be done so that no two threads can take the same work. For example, you could use a lock in C# to do the query in a database, and to mark a row before returning it.
You need to have a way of un-marking it after the thread finishes. Successful or not, in_progress must be re-set. Typically, you could use a finally block so that you don't miss it in the event of any exception.
If there is no work available, the thread goes to sleep.
Whenever new work arrives (i.e. an INSERT, or a nextcheck comes due), one of the sleeping threads is awakened.
When your program starts, it should clear any in_progress flags in the event of a previous crash.
You should take advantage of DBMS transactions so that any changes a worker makes after completing its work are atomic - i.e. other threads perceive them as if they had happened all at once.
By changing the size of worker pool, you can set the maximum number of simultaneously active workers.
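The claim, work and release steps above might look roughly like this. It is a sketch only: an in-memory list stands in for the "Links" table, and LinkRow, WorkPool and their members are hypothetical names. A C# lock only protects threads inside one process; separate processes would need the database itself to make the claim atomic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row of the "Links" table.
public sealed class LinkRow
{
    public int Id;
    public bool InProgress;
    public DateTime NextCheck;
}

public sealed class WorkPool
{
    private readonly object _gate = new object();
    private readonly List<LinkRow> _rows;

    public WorkPool(IEnumerable<LinkRow> rows)
    {
        _rows = rows.ToList();
        // On start-up, clear flags left behind by a previous crash.
        foreach (var row in _rows)
        {
            row.InProgress = false;
        }
    }

    // Finds a due row and marks it in one locked step, so that no two
    // threads can claim the same row. Returns null when nothing is due.
    public LinkRow TryClaim(DateTime now)
    {
        lock (_gate)
        {
            var row = _rows.FirstOrDefault(r => !r.InProgress && r.NextCheck <= now);
            if (row != null)
            {
                row.InProgress = true;
            }
            return row;
        }
    }

    public void Release(LinkRow row)
    {
        lock (_gate)
        {
            row.InProgress = false;
        }
    }

    // Claims one row, runs the work, and always un-marks the row,
    // whether the work succeeded or threw.
    public bool ProcessOne(DateTime now, Action<LinkRow> work)
    {
        var row = TryClaim(now);
        if (row == null)
        {
            return false;
        }
        try
        {
            work(row);
        }
        finally
        {
            Release(row);
        }
        return true;
    }
}
```

Each worker thread would call ProcessOne in a loop and go to sleep when it returns false.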

First, the separation of controller and workers mentioned in the other answer might be a better pattern. It will work better if the number of threads gets large and the number of links to check is large.
But if your problem is this:
But problem with it is, if for any reason that scraping gets
aborted/finishes halfway/doesn't work properly, LastCheck becomes
DateTime.Now but NextCheck is left NULL, and previous
LastCheck/NextCheck values are gone, and LastCheck/NextCheck values
are updated for a link that is not actually checked
You just need to handle errors better.
The failure will result in exception. Catch the exception and handle it by resetting the state in the database. For example:
void DoScraping(.....)
{
    try
    {
        // ....
    }
    catch (Exception err)
    {
        // oh dear, it went wrong, reset lastcheck/nextcheck
    }
}
What you reset last/nextcheck to is up to you. You could reset them to what they were at the start: when you determine 'the next thing to do', also read the values of last/nextcheck and store them in variables. Then, in the event of failure, just set them back to what they were before.

ObjectListView elapsed time column Potentially needs Multi-Threading

I should first mention I am new to programming. I will explain my problem and my question to the best of my ability.
I was wondering if it is possible to update a single column of an ObjectListView (from here on referred to as OLV). What I would like is to have a column that would display the "Elapsed Time" of each row. This Elapsed Time column would be refreshed every 15 to 30 seconds.
How I set my OLV
myOLV.SetObjects(GetMyList());
GetMyList method returns a list populated from a simple database query.
Within the GetMyList method, it would convert the DateTime column into a string showing the elapsed time. This is then shown by the OLV as a string, not a datetime...
elapsed_time = ((TimeSpan)(DateTime.Now - x.time_arrived)).ToString("hh\\:mm\\:ss");
How I was trying to make this work was by using a timer that would every 30 seconds recall the GetMyList method. Because the method would re-query the database and return the records it retrieved, they were more up to date. This worked fine until I had more than 20 rows in the OLV. Each "refresh" for 200 rows takes 4 seconds to complete. That is 4 seconds that the UI is unresponsive...
As I am new to programming, I have never done anything with multi-threading, but I did a little reading up on it and attempted to create a solution. Even when I create a new thread to "refresh" the OLV object, it still causes the entire form to become unresponsive.
My question is, how can I have a column within my ObjectListView refresh every 15-30 seconds without causing the entire UI to become unresponsive?
Is it even possible/ a good idea to have the ObjectListView refresh in the background and then display it when it's ready?

The problem is the run time of the database query, not the updating of ObjectListView (updating 200 rows in ObjectListView should take about 200ms). You need to get the database work off the UI thread and onto a background thread. That background thread can then periodically fetch the updated data from the database. Once you have the updated data, updating ObjectListView is very fast.
However, multi-threading is a deep-dive topic that is likely to bite you. There are many things that can go wrong. You would be better off having a button on your UI that the user can click to Refresh the grid, and running everything synchronously. Once the synchronous version is working perfectly, then start working on the much more error prone multi-threaded version.
To strictly answer your question, you would do something like this:
var thread = new Thread(_ => {
    while (!_cancelled) {
        Thread.Sleep(15 * 1000);
        if (!_cancelled) {
            var results = QueryDatabase();
            this.objectListView1.SetObjects(results);
        }
    }
});
thread.Start();
_cancelled is a boolean variable in your class that allows you to cancel the querying process. QueryDatabase() is your method that fetches the new data.
This example doesn't deal with errors, which are a significant issue from background threads.
Another gotcha is that most UI components cannot be updated from a background thread. However, ObjectListView has a few methods that are thread-safe, and SetObjects() is one of them, so you can call it like this.
Just to repeat: you really should start with the synchronous version, and only start thinking about async once you are happy with that one.

Reading database values and executing function in new Thread

I am trying to use Thread for the first time, in my Windows service application. I have to read data from the database and, if a row matches a condition, execute a function in a new thread. My main concern is that the function run in the new thread is lengthy and will take time. So my question is: will my program go on to the data reader code and read the next value from the database while my function keeps executing in the background thread? My application's execution logic is time specific.
Here is the code..
while (dr.Read())
{
    time = dr["SendingTime"].ToString();
    if ((str = DateTime.Now.ToString("HH:mm")).Equals(time))
    {
        //Execute Function and send reports based on the data from the database.
        Thread thread = new Thread(sendReports);
        thread.Start();
    }
}
Please help me..

Yep, as the comments said, you will have one thread per row. If you have 4-5 rows and you run that code, you'll get 4-5 threads working happily in the background.
You might be happy with it, and leave it, and in half a year, someone else will play with the DB, and you'll get 10K rows, and this will create 10K threads, and you'll be on a holiday and people will call you panicking because the program is broken ...
In other words, you don't want to do it, because it's a bad practice.
You should either use a queue of work units and have a fixed number of threads reading from that queue (in which case you might have 10K units there, but let's say 10 threads that will pick them up and process them until they are done), or use some other mechanism to make sure you don't create a thread per row.
Unless of course, you don't care ...
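One way to sketch the "queue plus a fixed number of threads" approach is with BlockingCollection from System.Collections.Concurrent (available since .NET 4). FixedWorkerPool and ProcessAll are hypothetical names, and the handler stands in for something like sendReports. Error handling is left out: an exception thrown by the handler is not caught here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper: processes any number of work items on a fixed
// number of threads instead of starting one thread per item.
public static class FixedWorkerPool
{
    public static void ProcessAll<T>(IEnumerable<T> items, int workerCount, Action<T> handler)
    {
        using (var queue = new BlockingCollection<T>())
        {
            var workers = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var thread = new Thread(() =>
                {
                    // Blocks until an item is available; the loop ends once
                    // the queue is marked complete and has been drained.
                    foreach (var item in queue.GetConsumingEnumerable())
                    {
                        handler(item);
                    }
                });
                thread.IsBackground = true;
                thread.Start();
                workers.Add(thread);
            }

            foreach (var item in items)
            {
                queue.Add(item);
            }
            queue.CompleteAdding(); // no more work will arrive

            foreach (var worker in workers)
            {
                worker.Join(); // wait until every item has been handled
            }
        }
    }
}
```

With this shape, 10K rows still mean only workerCount threads; the rows simply wait in the queue until a thread is free.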

User Interface become much slower and even crashed for long running time

I am using .Net 4.0 and VS2010.
My program is simply a multi-threaded get request sender which update a bindingList and display the list via DataGridView. The datagridview is in virtual mode. Moreover, I make a textbox and status bar to display the status of requests, one request normally adds 4-5 lines to the textbox and changes the number in the status bar.
The workload remains the same, one request per two seconds. The request is fast and only one request is standing out most of the time. New request thread is called by the last old request thread. The UI is updated a few times per thread using begininvoke and delegate.
MyInvoke mi = new MyInvoke(change);
this.BeginInvoke(mi, new Object[] { true, "Row " + pos + " standing by...", (pos + 1),0 });
I display the whole 3000 requests on the datagridview at the beginning with memory usage 30MB. When the 2XXX request is reached with memory usage 4XMB, I can see the status bar number and textbox updated slower and slower. For example, 2000->2001->2002->2003 becomes 2000->2003.
If I select the application window, sometimes the whole UI would even freeze. My datagridview is fixed on a few rows with virtual mode. I believe it is the problems of the UI thread. When it freezes, I can wait until all requests are done and everything becomes smooth again.
Any thoughts on what is happening?

Finally, I figured out why it was so slow. A noob like me might easily make this kind of error.
The large number of BeginInvoke calls I made from the threads does have an effect, but it is not significant enough to freeze the UI thread.
The heavy workload is mainly caused by the code below, which reassigns the whole text to the textbox every time:
textbox1.Text += "string";
The following code solves it:
textbox1.AppendText("string");

How to have a database trigger code

I need some help with this..
This table I have has a date column in it, and when any of the dates in that column equal the servers date I need to tell my website/program to send out an email or perform some certain notification action to let the user know something.
I was thinking of having a program running on the server polling the database at certain intervals, but the problem with this is that if the date is 01/31/11 10:30 AM and my interval is every 5 mins, there is potential for the polling to be inaccurate, i.e. the poll happening at 10:35 AM. In other words, I need the database to somehow notify something when "x" date has been hit, exactly at that date.
I'd like to avoid having a 1sec interval checking the database as I think that would be a huge performance hit.
I'm using ASP.NET MVC 3 with MSSQL and LINQ Entity framework.
Any creative ideas?

You could use Quartz.net to setup those events. Quartz is pretty flexible and powerful - and it was meant for this sort of thing.

Do not have the database trigger the code. Have a trigger create a row in another table with information about what just happened.
Have a separate program periodically read from the second table to email users or whatever you need to do. Have that program delete the row from the table once it's done with the email.

I don't have any personal experience, but Sql Server CLR Integration might be the answer you are looking for. From the description it sounds like you can write almost anything that will compile against the .NET framework and deploy it to a sql server instance and Sql Server will be able to execute it. http://msdn.microsoft.com/en-us/library/ms254498.aspx

You either need to make use of a scheduler (e.g. DBMS_SCHEDULER in Oracle, SQL Server Jobs, etc.) or find some third-party tool like Quartz.net, as mentioned by another responder. Or you could code something like the following into a polling app:
select all jobs due in next 5 minutes, order by due date
while there are jobs
    if the next job is due action it
    else sleep for duration of interval till job due
loop
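The "sleep until the next job is due" step in the pseudocode above comes down to one small calculation, sketched here. PollingSchedule and DelayUntilNextJob are hypothetical names. Capping the delay at the polling interval is an added assumption, so that rows inserted in the meantime are noticed within one interval.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for a polling app.
public static class PollingSchedule
{
    // Returns how long the poller may sleep before it has to look again.
    public static TimeSpan DelayUntilNextJob(IEnumerable<DateTime> dueTimes, DateTime now, TimeSpan pollInterval)
    {
        var pending = dueTimes.ToList();
        if (pending.Count == 0)
        {
            return pollInterval; // nothing scheduled: check again after a full interval
        }

        DateTime next = pending.Min();
        if (next <= now)
        {
            return TimeSpan.Zero; // the next job is already due: act on it now
        }

        TimeSpan wait = next - now;
        return wait < pollInterval ? wait : pollInterval;
    }
}
```

The polling loop would act on any due jobs, call this method, and pass the result to Thread.Sleep.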

This is a bit dirty, but I think it will give you the functionality you're looking for.
In Global.asax.cs
public DateTime LastMaxDateTime;
protected void Application_Start(object sender, EventArgs e)
{
    LastMaxDateTime = GetMaxDateTime();
    Thread bgThread = new Thread(BackgroundThread_CheckDatabase);
    bgThread.IsBackground = true;
    bgThread.Start();
}
private void BackgroundThread_CheckDatabase()
{
    while (true)
    {
        DateTime dtMaxDateTime = GetMaxDateTime();
        if (dtMaxDateTime > this.LastMaxDateTime)
        {
            //Send Notifications
            this.LastMaxDateTime = dtMaxDateTime;
        }
        Thread.Sleep(5000); //5 seconds
    }
}
private DateTime GetMaxDateTime()
{
    //function that returns DateTime from something like "SELECT MAX(DateTimeColumn) FROM [MyTable]"
}
Basically, the code keeps track of the newest DateTime in your table and on each poll, checks to see if there's a newer DateTime in the database since the last time it checked. If so, you can send out your notifications. If you're not expecting many records in your table that could cause a race condition, then I don't see a problem with this as a quick solution.

The most efficient way to do it is to have an application that is event-driven instead of polling.
For example, have a thread query the database for the earliest scheduled event and sleep until then. Then have another thread synchronously wait for a table change (e.g. in PostgreSQL this would be the NOTIFY/LISTEN statements) and signal the first thread to check if the earliest event has changed.

The easiest way is to keep track of the date of your last check. When you check again, pull all rows greater than the last check date and less than or equal to the new check date. To make sure you execute them, you could add a column for when the action was performed and update that. With an index on that new column there shouldn't be any performance problem with checking it every second for rows with a NULL DateExecuted.
You could also read ahead, sort the upcoming items by trigger date, and do a Thread.Sleep() until the next one comes up, to be precise.
