I'm trying to debug a WPF application that consistently crashes on one particular machine. We never encountered the crash in our testing but we're worried this could be happening to customers as well.
The crash occurs while a BackgroundWorker is running.
_backgroundWorker = new BackgroundWorker();
_backgroundWorker.DoWork += BackgroundWorker_DoWork;
_backgroundWorker.WorkerSupportsCancellation = true;
_backgroundWorker.RunWorkerCompleted += BackgroundWorker_RunWorkerCompleted;
_backgroundWorker.RunWorkerAsync(argument);
I'd know if an Exception was thrown because the RunWorkerCompleted event handler would log the Exception in a log file.
private void BackgroundWorker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
{
    if (e.Error != null)
    {
        Logger.LogMessage(e.Error.ToString());
    }
}
I know approximately where in the DoWork handler the crash is occurring because it's in the middle of a set of file operations, and I can see which files have already been moved after the application is run. My first thought was that it was a usual file operation error such as a permissions issue, but all my testing indicates that an Exception would be thrown and I'd be able to see it. I also checked to see if there's a try/catch block that could be catching the Exception and there isn't.
The failure seems to be at some point during the execution of this loop:
if (Directory.Exists(sourceDirectory))
{
    var relativeFiles = HelperService.GetRelativeFiles(sourceDirectory);
    foreach (string relativeFile in relativeFiles)
    {
        string destPath = Path.Combine(basePath, relativeFile);
        string sourcePath = Path.Combine(sourceDirectory, relativeFile);
        HelperService.MoveAndOverwriteFile(sourcePath, destPath);
    }
}
And here is HelperService.MoveAndOverwriteFile:
public static void MoveAndOverwriteFile(string sourcePath, string destPath)
{
    Directory.CreateDirectory(Path.GetDirectoryName(destPath));
    File.Delete(destPath);
    File.Move(sourcePath, destPath);
}
I thought maybe something was canceling the BackgroundWorker, but I understand the CancelAsync() method doesn't abort the thread but rather sets the CancellationPending property, meaning CancelAsync() does nothing unless the DoWork handler is designed to respond to it. Is there any .NET functionality that could cancel or abort a BackgroundWorker without me knowing about it?
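For reference, here is a minimal sketch of what cooperative cancellation looks like. It assumes a console host with no SynchronizationContext (so RunWorkerCompleted runs on a thread-pool thread), and the class and method names are made up for illustration. The blocking wait at the end is only there to make the sketch self-checking; it must not be done on a UI thread.

```csharp
using System;
using System.ComponentModel;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Starts a worker, requests cancellation, and returns the value of
    // e.Cancelled that the RunWorkerCompleted handler observed.
    public static bool RunAndCancel()
    {
        var completed = new TaskCompletionSource<bool>();
        using (var worker = new BackgroundWorker { WorkerSupportsCancellation = true })
        {
            worker.DoWork += (s, e) =>
            {
                var self = (BackgroundWorker)s;
                // CancelAsync() only sets CancellationPending; the handler
                // has to poll the flag and stop on its own.
                while (!self.CancellationPending)
                {
                    Thread.Sleep(10);
                }
                e.Cancel = true; // this is what makes e.Cancelled true later
            };
            worker.RunWorkerCompleted += (s, e) => completed.SetResult(e.Cancelled);

            worker.RunWorkerAsync();
            worker.CancelAsync();

            // Console-only: block until the completion handler has run.
            return completed.Task.Result;
        }
    }
}
```

If the DoWork handler never looks at CancellationPending, as in my case, CancelAsync() has no visible effect and the loop simply keeps running.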
It's also as though something is forcing the application to shut down, but I haven't found anything that would do that. Could there be some .NET error that's so drastic that it aborts the whole application rather than throwing an Exception? I checked the Windows Event Viewer and there aren't any errors associated with this.
Should I suspect a hardware failure perhaps?
I know BackgroundWorker isn't really used anymore, but I inherited some old code that uses a lot of BackgroundWorkers. I'll probably use async/await in future versions. But I still want to figure this out.
I'm performing an async operation for an upload using Starksoft.Net.Ftp.
Looks like that:
public void UploadFile(string filePath, string packageVersion)
{
_uploadFtpClient= new FtpClient(Host, Port, FtpSecurityProtocol.None)
{
DataTransferMode = UsePassiveMode ? TransferMode.Passive : TransferMode.Active,
FileTransferType = TransferType.Binary,
};
_uploadFtpClient.TransferProgress += TransferProgressChangedEventHandler;
_uploadFtpClient.PutFileAsyncCompleted += UploadFinished;
_uploadFtpClient.Open(Username, Password);
_uploadFtpClient.ChangeDirectoryMultiPath(Directory);
_uploadFtpClient.MakeDirectory(newDirectory);
_uploadFtpClient.ChangeDirectory(newDirectory);
_uploadFtpClient.PutFileAsync(filePath, FileAction.Create);
_uploadResetEvent.WaitOne();
_uploadFtpClient.Close();
}
private void UploadFinished(object sender, PutFileAsyncCompletedEventArgs e)
{
if (e.Error != null)
{
if (e.Error.InnerException != null)
UploadException = e.Error.InnerException;
}
_uploadResetEvent.Set();
}
As you can see, there is a ManualResetEvent in there, which is declared as private variable on top of the class:
private ManualResetEvent _uploadResetEvent = new ManualResetEvent(false);
The point is just that the method should wait for the upload to complete, but the upload itself must be async so that progress can be reported.
Now, this just works fine.
I have a second method that should cancel the upload, if wished.
public void Cancel()
{
_uploadFtpClient.CancelAsync();
}
When the upload is cancelled a directory on the server also must be deleted.
I have a method for this, too:
public void DeleteDirectory(string directoryName)
{
_uploadResetEvent.Set(); // As the finished event of the upload is not called when cancelling, I need to set the ResetEvent manually here.
if (!_hasAlreadyFixedStrings)
FixProperties();
var directoryEmptyingClient = new FtpClient(Host, Port, FtpSecurityProtocol.None)
{
DataTransferMode = UsePassiveMode ? TransferMode.Passive : TransferMode.Active,
FileTransferType = TransferType.Binary
};
directoryEmptyingClient.Open(Username, Password);
directoryEmptyingClient.ChangeDirectoryMultiPath(String.Format("/{0}/{1}", Directory, directoryName));
directoryEmptyingClient.GetDirListAsyncCompleted += DirectoryListingFinished;
directoryEmptyingClient.GetDirListAsync();
_directoryFilesListingResetEvent.WaitOne(); // Deadlock appears here
if (_directoryCollection != null)
{
foreach (FtpItem directoryItem in _directoryCollection)
{
directoryEmptyingClient.DeleteFile(directoryItem.Name);
}
}
directoryEmptyingClient.Close();
var directoryDeletingClient = new FtpClient(Host, Port, FtpSecurityProtocol.None)
{
DataTransferMode = UsePassiveMode ? TransferMode.Passive : TransferMode.Active,
FileTransferType = TransferType.Binary
};
directoryDeletingClient.Open(Username, Password);
directoryDeletingClient.ChangeDirectoryMultiPath(Directory);
directoryDeletingClient.DeleteDirectory(directoryName);
directoryDeletingClient.Close();
}
private void DirectoryListingFinished(object sender, GetDirListAsyncCompletedEventArgs e)
{
_directoryCollection = e.DirectoryListingResult;
_directoryFilesListingResetEvent.Set();
}
As the finished event of the upload is not called when cancelling, I need to set the ResetEvent manually in the DeleteDirectory-method.
Now, what am I doing here: I first list all files in the directory in order to delete them, as a filled folder can't be deleted.
This method GetDirListAsync is also async which means I need another ManualResetEvent as I don't want the form to freeze.
This ResetEvent is _directoryFilesListingResetEvent. It is declared like the _uploadResetEvent above.
Now, the problem is that execution reaches the WaitOne call on _directoryFilesListingResetEvent and then gets stuck. A deadlock occurs and the form freezes. (I've also marked it in the code.)
Why is that?
I tried to move the call of _uploadResetEvent.Set(), but it doesn't change.
Does anyone see the problem?
When I try to call the DeleteDirectory-method alone without any upload, it works as well.
I think the problem is that both ResetEvents use the same resource or something and overlap themselves, I don't know.
Thanks for your help.
You are not using this library correctly. The MREs you added cause deadlock. That started with _uploadResetEvent.WaitOne(), blocking the UI thread. This is normally illegal, the CLR ensures that your UI does not go completely dead by pumping a message loop itself. That makes it look like it is still alive, it still repaints for example. A rough equivalent of DoEvents(), although not nearly as dangerous.
But the biggest problem with it is that it will not allow your PutFileAsyncCompleted event handler to run, the underlying async worker is a plain BackgroundWorker. It fires its events on the same thread that started it, which is very nice. But it cannot call its RunWorkerCompleted event handler until the UI thread goes idle. Which is not nice, the thread is stuck in the WaitOne() call. Exact same story for what you are debugging now, your GetDirListAsyncCompleted event handler cannot run for the same reason. So it just freezes there without being able to make progress.
So eliminate _uploadResetEvent completely, rely on your UploadFinished() method instead. You can find out if it was canceled from the e.Cancelled property. Only then do you start the code to delete the directory. Follow the same pattern, using the corresponding XxxAsyncCompleted event to decide what to do next. No need for MREs at all.
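The chaining described above can be sketched roughly as follows. This is not the Starksoft API: two plain BackgroundWorkers stand in for the upload and the directory cleanup, and ChainedStepsSketch, Start and onAllDone are hypothetical names. The point is only the shape: each completion handler decides what to start next, and no thread ever waits on a reset event.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class ChainedStepsSketch
{
    // Starts a simulated upload. When it completes, the completion handler
    // either reports the outcome or starts the simulated cleanup step.
    // onAllDone is called from the last completion handler in the chain.
    public static void Start(bool cancelUpload, Action<string> onAllDone)
    {
        var upload = new BackgroundWorker { WorkerSupportsCancellation = true };
        upload.DoWork += (s, e) =>
        {
            var self = (BackgroundWorker)s;
            // Simulated transfer: about half a second unless cancelled.
            for (int i = 0; i < 50 && !self.CancellationPending; i++)
            {
                Thread.Sleep(10);
            }
            if (self.CancellationPending)
            {
                e.Cancel = true;
            }
        };
        upload.RunWorkerCompleted += (s, e) =>
        {
            if (e.Error != null) { onAllDone("upload failed"); return; }
            if (!e.Cancelled) { onAllDone("uploaded"); return; }

            // Only now, after the upload has really stopped, start the cleanup.
            var cleanup = new BackgroundWorker();
            cleanup.DoWork += (s2, e2) => Thread.Sleep(10); // stands in for list + delete
            cleanup.RunWorkerCompleted += (s2, e2) => onAllDone("cancelled, then cleaned up");
            cleanup.RunWorkerAsync();
        };

        upload.RunWorkerAsync();
        if (cancelUpload)
        {
            upload.CancelAsync();
        }
    }
}
```

On a UI thread the callback would typically update the form; the caller returns immediately after Start, so the message loop stays free to deliver the completion events.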
Looking at the source, it appears FtpClient uses a BackgroundWorker to perform asynchronous operations. That means its completion event will be posted to whatever SynchronizationContext was set at the time the worker was created. I'll bet the completion of CancelAsync pushes you back onto the UI thread, which blocks when you call WaitOne on the directory list reset event. The GetDirListAsyncCompleted event gets posted to UI message loop, but since the UI thread is blocked, it will never run, and the reset event will never be set.
BOOM! Deadlock.
I'm using a background worker to handle the loading of a file to stop my ui from freezing however it seems that the RunWorkerCompleted is finishing before my DoWork event has completed (Causes errors when exiting dialog)... is there anything I'm doing wrong? Am I better off doing this over a task?
public static <T> LoadDesign(string xmlPath)
{
PleaseWait pw = new PleaseWait(xmlPath);
pw.ShowDialog();
return pw.design;
}
private PleaseWait(string xmlFile)
{
InitializeComponent();
bw = new BackgroundWorker();
bw.WorkerSupportsCancellation = true;
bw.DoWork += (s, e) =>
{
design = (Cast)DllCall containing XmlSerializer.Deserialize(...,xmlFile);
};
bw.RunWorkerCompleted += (s, e) => {
//Exit please wait dialog
this.Close();
};
if (!bw.IsBusy)
bw.RunWorkerAsync();
}
I believe the issue may be down to the fact that my background worker is calling a dll and not waiting for the response. I've tried to add checks such as while(design == null) to no avail..
Edit2
The error is NRE as the design hasn't been loaded, I can easily fix this but would rather get the threading working instead.
There are lots of little mistakes. Given that we are probably not looking at the real code and that we don't have a debugger with a Call Stack window to see where it actually crashes, any one of them might be a factor.
Testing bw.IsBusy and not starting the worker when it is true is a grave mistake. It can't ever be busy in the code as posted but if it actually is possible for it to be true then you've got a nasty bug in your code. Since you actually did subscribe the events on a busy worker. Now the RunWorkerCompleted event handler will run twice.
Using the Close() method to close a dialog is not correct. A dialog should be closed by assigning its DialogResult property. Not the gravest kind of mistake but wrong nonetheless.
There's a race in the code, the worker can complete before the dialog is ever displayed. A dialog can only be closed when its native window was created. In other words, the IsHandleCreated must be true. You must interlock this to ensure this can never happen. Subscribe the dialog's Load event to get the worker started.
You blindly assume that the worker will finish the job and produce a result. That won't be the case when its DoWork method died from an exception. Which is caught by BackgroundWorker and passed to the RunWorkerCompleted event handler as the e.Error property. You must check this property and do something reasonable if it isn't null.
Judging from the comments, I'd guess at the latter bullet being the cause. You debug this by using Debug + Exceptions, tick the Thrown checkbox for CLR exceptions. The debugger will now stop when the exception is thrown, allowing you to find out what went wrong.
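Putting the points above together, a dialog along these lines might look like the following sketch. It assumes Windows Forms; PleaseWaitSketch, Design and LoadError are illustrative names, and the load delegate stands in for the deserialization call from the question.

```csharp
using System;
using System.ComponentModel;
using System.Windows.Forms;

public sealed class PleaseWaitSketch : Form
{
    private readonly BackgroundWorker _worker = new BackgroundWorker();

    // Result of the load, valid only when ShowDialog() returned DialogResult.OK.
    public object Design { get; private set; }

    // Exception thrown by the load, if any.
    public Exception LoadError { get; private set; }

    public PleaseWaitSketch(Func<object> load)
    {
        _worker.DoWork += (s, e) => e.Result = load();

        _worker.RunWorkerCompleted += (s, e) =>
        {
            if (e.Error != null)
            {
                // DoWork died from an exception: record it and report failure.
                LoadError = e.Error;
                DialogResult = DialogResult.Abort;
            }
            else
            {
                Design = e.Result;
                // Assigning DialogResult closes a modal dialog.
                DialogResult = DialogResult.OK;
            }
        };

        // Start the worker only once the window exists, so the worker
        // cannot complete before the dialog is displayed.
        Load += (s, e) => _worker.RunWorkerAsync();
    }
}
```

The caller would then check the value returned by ShowDialog() before touching Design, and look at LoadError when the result is not DialogResult.OK.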
It may be possible that your background worker does not take much time and completes before the dialog is shown. I'd suggest moving the background worker initialization and start-up code to PleaseWait's Form_Load or Form_Shown handler.
If you call another async function in your BackgroundWorker's DoWork event handler, like this:
private void BackgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
    somethingsToDoAsync();
    // somethingsToDoAsync() is asynchronous, so this call returns immediately
}
then RunWorkerCompleted fires before the work started in DoWork has actually completed.
Change somethingsToDoAsync() so that it is not async.
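A small sketch of the effect, assuming a console host and using Task.Delay as a stand-in for the asynchronous call (the class and method names are invented for the example). The method reports whether the asynchronous work had finished by the time RunWorkerCompleted ran.

```csharp
using System;
using System.ComponentModel;
using System.Threading.Tasks;

public static class AsyncInsideDoWorkSketch
{
    // Returns true if the task started in DoWork had already finished
    // when the RunWorkerCompleted handler ran.
    public static bool TaskFinishedWhenCompletedFired(bool blockInDoWork)
    {
        Task pending = null;
        var observed = new TaskCompletionSource<bool>();
        using (var worker = new BackgroundWorker())
        {
            worker.DoWork += (s, e) =>
            {
                pending = Task.Delay(200); // stands in for the async call
                if (blockInDoWork)
                {
                    // Keep DoWork alive until the asynchronous work is done.
                    pending.Wait();
                }
            };
            worker.RunWorkerCompleted += (s, e) => observed.SetResult(pending.IsCompleted);
            worker.RunWorkerAsync();

            // Console-only: block until the completion handler has run.
            return observed.Task.Result;
        }
    }
}
```

With blockInDoWork set to false, DoWork returns at once and the completion handler sees an unfinished task. Waiting inside DoWork (or making the called method synchronous) restores the expected ordering.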
This code works most of the time, so I suspect a race condition. The Result class is immutable, but I don't think the issue is with that class.
public Result GetResult()
{
using (var waitHandle = new ManualResetEvent(false))
{
Result result = null;
var completedHandler = new WorkCompletedEventHandler((o, e) =>
{
result = e.Result;
// somehow waitHandle is closed, thus exception occurs here
waitHandle.Set();
});
try
{
this.worker.Completed += completedHandler;
// starts working on separate thread
// when done, this.worker invokes its Completed event
this.worker.RunWork();
waitHandle.WaitOne();
return new WorkResult(result);
}
finally
{
this.worker.Completed -= completedHandler;
}
}
}
Edit: Apologies, I've missed a call to this.worker.RunWork() right before calling GetResult() method. This apparently resulted (sometimes) in doing same job twice, though I'm not sure why waitHandle got closed before waitHandle.Set(), despite having Completed event firing twice. This hasn't compromised the IO work at all (results were correct; after I've changed the code to manually close the waitHandle).
Therefore, Iridium's answer should be closest answer (if not the right one), even though the question wasn't complete.
There doesn't seem to be anything particularly problematic in the code you've given, which suggests that something in the code you haven't shown is causing the problem. I'm assuming that the worker you're using is part of your codebase (rather than part of the .NET BCL, like BackgroundWorker). It may be worth posting the code for that, in case there is an issue there that's causing the problem.
If for example, the same worker is used repeatedly from multiple threads (or has a bug in which Completed can be raised more than once for the same piece of work), then if the worker uses the "usual" means for invoking an event handler, i.e.:
var handler = Completed;
if (handler != null)
{
    handler(...);
}
You could have an instance where var handler = Completed; is executed before the finally clause (and so before the completedHandler has been detached from the Completed event), but handler(...) is called after the using(...) block is exited (and so after the ManualResetEvent has been disposed). Your event handler will then be executed after waitHandle is disposed, and the exception you are seeing will be thrown.
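The end state of that interleaving can be reproduced deterministically, without any threads, with a sketch like the one below. It only illustrates the last step: a delegate that was captured earlier gets invoked after the using block has disposed the handle. The names are invented for the example.

```csharp
using System;
using System.Threading;

public static class DisposedHandleSketch
{
    // Returns true if calling Set() through a previously captured delegate
    // throws ObjectDisposedException once the using block has exited.
    public static bool SetAfterDisposeThrows()
    {
        Action handler;
        using (var waitHandle = new ManualResetEvent(false))
        {
            // Equivalent of a late caller having executed
            // "var handler = Completed;" while the handler was still attached.
            handler = () => waitHandle.Set();
        } // waitHandle is disposed here, as at the end of GetResult()

        try
        {
            handler(); // the late invocation of the completion handler
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }
}
```

Unsubscribing in the finally block does not help in this situation, because the late caller already holds its own copy of the delegate.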
There is no obvious reason why this would fail from the posted code. But we can't see a stack trace and we can't see the logic that gets the Completed event fired so there are few opportunities to debug this for you. Arbitrarily, if the event fires more than once then you'll certainly have this kind of race problem.
Vexing threading problems are hard to debug, threading races are problems that occur at microsecond scale. Trying to debug it can be enough to make the race disappear. Or it happens so infrequently that having any hope of catching the problem is too rare to justify an attempt.
Such problems often require logging to diagnose the race. Be sure to select a light-weight logging method, logging in itself can alter the timing enough to prevent the race from ever occurring.
Last but certainly not least: do note that there is no point in using a thread here. You get the exact same outcome by directly calling the code that's executed by whatever thread is started by RunWork(). Minus the overhead and the headaches.
If you get rid of the using block, your code will no longer throw an exception at the line you marked.
You will have to find a suitable place to dispose the handle, if you really need to.
public Result GetResult()
{
var waitHandle = new ManualResetEvent(false);
Result result = null;
var completedHandler = new WorkCompletedEventHandler((o, e) =>
{
result = e.Result;
// somehow waitHandle is closed, thus exception occurs here
waitHandle.Set();
waitHandle.Dispose();
});
try
{
this.worker.Completed += completedHandler;
// starts working on separate thread
// when done, this.worker invokes its Completed event
this.worker.RunWork();
waitHandle.WaitOne();
return new WorkResult(result);
}
finally
{
this.worker.Completed -= completedHandler;
}
}
I have the following code:
public static Emgu.CV.Capture _capture;
public static DispatcherTimer _timer;
_timer = new DispatcherTimer();
_timer.Interval = _settings.camera_interval;
_timer.Tick += ProcessFrame;
BackgroundWorker _bw = new BackgroundWorker
{
WorkerReportsProgress = true,
WorkerSupportsCancellation = true
};
_bw.DoWork += (s, e) =>
{
// Initialize the device in background
_capture = new Capture();
};
_bw.RunWorkerCompleted += (s, e) =>
{
_capture.SetCaptureProperty(CAP_PROP.CV_CAP_PROP_FRAME_HEIGHT,
_settings.camera_height);
_capture.SetCaptureProperty(CAP_PROP.CV_CAP_PROP_FRAME_WIDTH,
_settings.camera_width);
Brightness = _capture
.GetCaptureProperty(CAP_PROP.CV_CAP_PROP_BRIGHTNESS);
Contrast = _capture
.GetCaptureProperty(CAP_PROP.CV_CAP_PROP_CONTRAST);
// Get images from camera
_timer.Start();
};
_bw.RunWorkerAsync();
public override void CleanUp()
{
_timer.Stop();
_bw.Dispose();
if (_capture != null) _capture.Dispose();
}
The app works fine, but when I close it, it throws: Message: Context0x23754b0' Disconnected. ... How can I fix this problem?
This is a COM related error, it no doubt happens because you create the Capture object on the background thread. A COM object has thread affinity, once the thread that creates it stops running, the COM object is dead and cannot be used anymore. Trying to use it anyway produces the warning.
That this doesn't occur in the RunWorkerCompleted event handler is quite remarkable, this must be buried inside the OpenCV or Emgu plumbing in a non-obvious way. That certainly doesn't mean it couldn't occur some day. You'll need to re-think this, it doesn't make much sense to only create the object on the worker and have everything else run on the UI thread. Do everything on the worker, including the disposing. Or none of it.
I assume this would have something to do with your camera capture library and how it potentially uses unmanaged resources.
I'd start by commenting all the code out of your RunWorkerCompleted to see if the message still happens. If it doesn't, then it's caused by one or more of the GetCaptureProperty calls. I suspect it won't though.
I see in the documentation of Emgu.CV.Capture that there is a Capture.DisposeObject() method that talks about releasing the captured object. My guess is that after you instantiate _capture and you do what you need to do, you have to do a clean-up. I'd suggest that after your ProcessFrame finishes (or on exit of your application) that you try calling _capture.DisposeObject() to see if that cleans up and exits gracefully.
Edit:
If all else fails, the approach I would suggest is comment out as much of your code as you can to get to the point where you can exit the program without it throwing an Exception. Then, comment in parts of code until you can locate exactly what gets created or run that will eventually cause your exception on exit. Once you can localize that, you'll have a better idea how to fix it.
I am observing a strange bug in some of my code which I suspect is tied to the way closing a form and background workers interact.
Here is the code potentially at fault:
var worker = new BackgroundWorker();
worker.DoWork += (sender, args) => {
    command();
};
worker.RunWorkerCompleted += (sender, args) => {
    cleanup();
    if (args.Error != null)
        MessageBox.Show("...", "...", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
};
worker.RunWorkerAsync();
This code is executed in a method in a form, when a button is pressed.
command() is slow, it may take a few seconds to run.
The user presses a button, which causes the code above to be executed. Before it is done, the form is closed.
The problem is that calling cleanup() sometimes raises ObjectDisposedException. I say "sometimes", because this never happens on my computer. If the form is closed before command() is done, the handler I registered for RunWorkerCompleted is not executed. On another computer, the handler is called once out of a hundred times. On a coworker's computer, it's almost always called. Apparently, the probability of execution of the handler rises with the age/slowness of the computer.
First question:
Is this the expected behaviour of BackgroundWorker? I would not expect it to know anything about the form, as there is nothing I can see that ties the form "this" with "worker".
Second question:
How should I go about fixing that problem?
Possible solutions I'm considering:
Test if (!this.IsDisposed) before calling cleanup(). Is that enough, or can the form be disposed while cleanup is being executed?
Wrap the call to cleanup() in a try {} catch (ObjectDisposedException). I don't like that kind of approach too much, as I may be catching exceptions that were raised due to some other unrelated bug in cleanup() or one of the methods it calls.
Register a handler for IsClosing and delay or cancel closing until the handler for RunWorkerCompleted has run.
Additional information that may be relevant: code from command() will cause updates to be done to GUI objects in "this". Such updates are performed via calls to this F# function:
/// Run a delegate on a ISynchronizeInvoke (typically a Windows.Form).
let runOnInvoker (notification_receiver : ISynchronizeInvoke) excHandler (dlg : Delegate) args =
try
let args : System.Object[] = args |> Seq.cast |> Array.ofSeq
notification_receiver.Invoke (dlg, args) |> ignore
with
| :? System.InvalidOperationException as op ->
excHandler(op)
The exceptions you mentioned do not have any connection to BackgroundWorker, other than the fact that one thread (the worker) tries to access controls which have been disposed by another thread (the UI).
The solution I would use is to attach an event handler to the Form.FormClosed event to set a flag that tells you the UI has been torn down. Then the RunWorkerCompleted handler will check whether the UI has been torn down before trying to do anything with the form.
While this approach will probably work more reliably than checking IsDisposed if you are not disposing the form explicitly, it does not provide a 100% guarantee that the form will not be closed and/or disposed just after the cleanup code has checked the flag and found that it is still there. This is the race condition you yourself mention.
To eliminate this race condition, you will need to synchronize, for example like this:
// set this to new object() in the constructor
public object CloseMonitor { get; private set; }
public bool HasBeenClosed { get; private set; }
private void Form1_FormClosed(object sender, FormClosedEventArgs e) {
    lock (this.CloseMonitor) {
        this.HasBeenClosed = true;
        // other code
    }
}
and for the worker:
worker.RunWorkerCompleted += (sender, args) => {
    lock (form.CloseMonitor) {
        if (form.HasBeenClosed) {
            // maybe special code for this case
        }
        else {
            cleanup();
            // and other code
        }
    }
};
The Form.FormClosing event will also work fine for this purpose, you can use whichever of the two is more convenient if it makes a difference.
Note that, the way this code is written, both event handlers will be scheduled for execution on the UI thread (this is because WinForms components use a single-threaded apartment model) so you would actually not be affected by a race condition. However, if you decide to spawn more threads in the future you might expose the race condition unless you do use locking. In practice I have seen this happen quite often, so I suggest synchronizing anyway to be future-proof. Performance will not be affected as the sync only happens once.