Parallel.ForEach exits unexpectedly without exceptions

I am running a .NET 4.7 C# console application in which I iterate through a collection of files (a list of strings with file paths).
I want to run an operation on each file in parallel.
private void LaunchComparators()
{
    //1) Get Trade Files
    var files = GetTradeFiles();
    //2) Run comparisons
    try
    {
        Parallel.ForEach(files, file => LaunchComparator(file));
    }
    catch (Exception ex)
    {
        Log.Error(ex.Message);
        throw ex;
    }
    //3) Write Results
    WriteResults();
}
private void LaunchComparator(string file)
{
    var comparator = new TradeComparator();
    var strategyComparisonOutput = comparator.ComparePerStrategy(file);
}
While running the comparison, the first comparison completes and then the program abruptly stops without any exceptions.
I am not sure what I should do differently here, so that the files are all processed individually.
I am new to parallel programming/threading. Any help appreciated.

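Parallel.For and Parallel.ForEach do not rethrow an exception at the point where it occurs inside the loop body. Exceptions that escape the body are collected and rethrown as a single AggregateException once the loop has stopped, so on .NET Framework ex.Message in your outer catch only says that one or more errors occurred; the real causes are in InnerExceptions. If you don't add exception handling inside the loop body, the first failure also stops the loop from scheduling the remaining files.
Here is a link to an article on how to handle exceptions inside parallel loops: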
https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/how-to-handle-exceptions-in-parallel-loops
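A minimal sketch of the pattern that article describes, assuming a hypothetical Compare method that stands in for the real per-file work (the file names are made up as well). Each failure is caught inside the loop body and stored in a ConcurrentQueue, so the remaining files are still processed and every error can be inspected afterwards. The last part of Main shows what happens without the inner try/catch: the exception surfaces as an AggregateException at the Parallel.ForEach call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelErrorDemo
{
    // Hypothetical stand-in for the per-file work; fails for names containing "bad".
    public static void Compare(string file)
    {
        if (file.Contains("bad"))
            throw new InvalidOperationException("Comparison failed for " + file);
    }

    // Processes every file and returns the exceptions collected along the way.
    public static List<Exception> ProcessAll(IEnumerable<string> files)
    {
        var errors = new ConcurrentQueue<Exception>();

        Parallel.ForEach(files, file =>
        {
            try
            {
                Compare(file);
            }
            catch (Exception ex)
            {
                // Record the failure and let the remaining files continue.
                errors.Enqueue(ex);
            }
        });

        return new List<Exception>(errors);
    }

    public static void Main()
    {
        var files = new[] { "a.csv", "bad.csv", "c.csv" };

        // Variant 1: handle errors inside the loop body.
        foreach (var error in ProcessAll(files))
            Console.WriteLine("Collected: " + error.Message);

        // Variant 2: no handling in the body; the loop throws AggregateException.
        try
        {
            Parallel.ForEach(files, file => Compare(file));
        }
        catch (AggregateException ae)
        {
            foreach (var inner in ae.Flatten().InnerExceptions)
                Console.WriteLine("Rethrown: " + inner.Message);
        }
    }
}
```

Logging the inner exceptions rather than only ex.Message is what makes the actual cause visible in the log.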

Related

ASP.NET Core application exits unexpectedly during execution. The program '[] XYZ.exe' has exited with code 3 (0x3)

The application exits unexpectedly while iterating over an IAsyncEnumerable and inserting records into a database using an EF context. No error or exception is thrown. The console window shows only: The program '[5372] XYZ.exe' has exited with code 3 (0x3).
The code is something like:
public async Task LoadDataAsync()
{
    try
    {
        _context.ChangeTracker.AutoDetectChangesEnabled = false;
        int bufferSize = 50;
        var myEntityBuffer = new List<MyEntity>(bufferSize);
        //GetDataAsync returns IAsyncEnumerable
        await foreach (var element in loader.GetDataAsync(cancellationToken).WithCancellation(cancellationToken))
        {
            myEntityBuffer.Add(new MyEntity(element));
            if (myEntityBuffer.Count == bufferSize)
            {
                _context.MyEntities.AddRange(myEntityBuffer);
                await _context.SaveChangesAsync(cancellationToken);
                myEntityBuffer = new List<MyEntity>(bufferSize);
                _context.ChangeTracker.Clear();
            }
        }
        await _context.SaveChangesAsync(cancellationToken);
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, ex.Message);
    }
    finally
    {
        _context.ChangeTracker.AutoDetectChangesEnabled = true;
    }
}
The same pattern is used by other commands and they work fine. I could not spot the difference, except that the element structure and the entity are different. The number of records is large, but the application exits after inserting approx. 80 000 records (the number differs from run to run).
I cannot trace the source of the problem or what makes the application exit. I run this in the Development environment.
I would appreciate it if you could suggest how to trace the issue.
So far I have tried the following:
Checking the application logs: the running code is wrapped in try-catch, but no errors were thrown or logged
Placing breakpoints in the catch and finally blocks (these lines are not reached)
Wrapping the Main method code in try-catch (no exception was caught)
Adding handlers for the IHostApplicationLifetime Stopped and OnStopping events (these events are not raised upon exit)
Increasing the logging level for Microsoft.EntityFramework to Debug (I can see only SQL commands but no issues)
If I comment out the lines doing database operations (AddRange, SaveChangesAsync), the method completes without issues and the application keeps running.
I use the following stack
.NET Runtime 5.0.11
ASP.NET Core Runtime 5.0.11
.NET Desktop Runtime 5.0.11
EntityFramework Core + NetTopologySuite
SQLite + Spatialite
UPDATE
I have stopped using SQLite. I had plans to move to PostgreSQL anyway, and on PostgreSQL it does not happen. It would still be nice to know how to diagnose such issues...

Reactive Extensions error handling with Observable SelectMany

I'm trying to write a file watcher for a certain folder using the Reactive Extensions library.
The idea is to monitor a folder on the hard drive for new files, wait until a file is written completely, and push an event to the subscriber. I do not want to use FileSystemWatcher since it raises the Changed event twice for the same file.
So I've written it in the "reactive way" (I hope) like below:
var provider = new MessageProviderFake();
var source = Observable.Interval(TimeSpan.FromSeconds(2), NewThreadScheduler.Default).SelectMany(_ => provider.GetFiles());
using (source.Subscribe(_ => Console.WriteLine(_.Name), () => Console.WriteLine("completed to Console")))
{
    Console.WriteLine("press Enter to stop");
    Console.ReadLine();
}
However, I can't find a "reactive way" to handle errors. For example, the file directory can be located on an external drive and become unavailable because of a connection problem.
So I've added GetFilesSafe, which catches the exceptions before they reach the Reactive Extensions:
static IEnumerable<MessageArg> GetFilesSafe(IMessageProvider provider)
{
    try
    {
        return provider.GetFiles();
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
        return new MessageArg[0];
    }
}
and used it like
var source = Observable.Interval(TimeSpan.FromSeconds(2), NewThreadScheduler.Default).SelectMany(_ => GetFilesSafe(provider));
Is there a better way to make SelectMany keep calling provider.GetFiles() even after an exception has been raised? In such cases I use an error counter to repeat the read operation N times and then fail (terminate the process).
Is there a "try N times and wait Q seconds between attempts" operator in the Reactive Extensions?
There is also a problem with GetFilesSafe: it returns IEnumerable<MessageArg> for lazy reading, so the exception can be raised during iteration and will then be thrown somewhere inside SelectMany.

There's a Retry extension, that just subscribes to the observable again if the current one errors, but it sounds like that won't offer the flexibility you want.
You could build something using Catch, which subscribes to the observable you give it if an error occurs on the outer one. Something like the following (untested):
IObservable<Thing> GetFilesObs(int times, bool delay) {
    // Stop retrying once the attempts are used up; this completes without an error.
    if (times <= 0)
        return Observable.Empty<Thing>();
    return Observable
        .Return(0)
        .Delay(TimeSpan.FromSeconds(delay ? <delay_time> : 0))
        .SelectMany(_ => Observable.Defer(() => GetFilesErroringObservable()))
        .Catch(Observable.Defer(() => GetFilesObs(times - 1, true)));
}
// call with:
GetFilesObs(<number_of_tries>, false);
As written, this doesn't do anything with the errors other than trigger a retry. In particular, when enough errors have happened, it will just complete without an error, which might not be what you want.
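On the lazy-reading problem raised at the end of the question: a try/catch around a method that returns a lazy IEnumerable only covers the call itself, not the iteration. The sketch below uses no Rx at all. It assumes a hypothetical GetFiles iterator that fails partway through, and shows one possible way around it, which is to materialize the sequence with ToList inside the try block so that iteration errors are caught in the same place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyEnumerationDemo
{
    // Hypothetical lazy source: the exception is thrown during iteration,
    // not when GetFiles is called.
    public static IEnumerable<string> GetFiles(bool fail)
    {
        yield return "first.msg";
        if (fail)
            throw new InvalidOperationException("drive unavailable");
        yield return "second.msg";
    }

    // Enumerates inside the try block so that iteration errors are caught here.
    public static IReadOnlyList<string> GetFilesSafe(bool fail)
    {
        try
        {
            return GetFiles(fail).ToList();
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
            return Array.Empty<string>();
        }
    }

    public static void Main()
    {
        Console.WriteLine(GetFilesSafe(false).Count);
        Console.WriteLine(GetFilesSafe(true).Count);
    }
}
```

The trade-off is that the result is no longer lazy: all items are read before any of them is pushed downstream, and a failure partway through discards the items already read.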

How to put a timeout on Transform

The XslCompiledTransform.Transform will hang under certain conditions (stack overflow, infinite loop, etc). This is a data (input) dependent error, so I don't have complete control in preventing it. If this happens, I'd like to be notified gracefully, but I don't want it to destroy my application process and hence the GUI where the user is inputting the input, which may be "valid" but "incomplete".
If I run the xslt file manually, I get
Process is terminated due to StackOverflowException
But XslCompiledTransform.Transform() will hang my application forever.
So, I want to wrap that call in a timeout, but nothing I've tried seems to work. It still hangs the application.
I want the thread that has the try block to not be hung. I want to create two tasks, one for Transform and the other timeout. Then start both at the same time. I don't know but I think the Run is running before the outer statement gets a chance to wire up the timeout and use the WhenAny.
How can this be fixed?
Update
I updated the code to reflect my current attempt. I can get into the if block if it times out, but whether I abort the thread or not, the application still hangs. I don't understand what it is about XslCompiledTransform.Transform that insists on taking the whole application down if it goes down.
public static Object Load(string mathML)
{
    if (mathML == Notebooks.InputCell.EMPTY_MATH)
        return null;
    XmlDocument input = new XmlDocument();
    input.LoadXml(mathML);
    XmlDocument target = new XmlDocument(input.CreateNavigator().NameTable);
    using (XmlWriter writer = target.CreateNavigator().AppendChild())
    {
        try
        {
            Thread thread = null;
            var task = Task.Run(() =>
            {
                thread = Thread.CurrentThread;
                XmlTransform.Transform(input, writer);
            });
            if (!task.Wait(TimeSpan.FromSeconds(5)))
            {
                thread.Abort();
                throw new TimeoutException();
            }
        }
        catch (XsltException xex)
        {
            if (xex.Message == "An item of type 'Attribute' cannot be constructed within a node of type 'Root'.")
                return null;
            else
                throw;
        }
    }
    return Load(target);
}

Here's how I solved the issue
I took my xsl and compiled it into an assembly and referenced that assembly from my project (which is called Library)
Advantages:
Fixed the hang
An XSLT compiled into an assembly is supposedly much faster
Disadvantages:
You tell me! I don't know :)
Library Properties / Build Events / Pre-build Event
"C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.7 Tools\xsltc.exe" /settings:script+ /class:Transform "myStylesheet.xslt"
Library / References
+ myStylesheet.dll
Loading the compiled transform
private static XslCompiledTransform xslTransform;
private static XslCompiledTransform XslTransform
{
    get
    {
        if (xslTransform == null)
        {
            xslTransform = new XslCompiledTransform();
            xslTransform.Load(typeof(Transform));
        }
        return xslTransform;
    }
}
Calling the transform
Same as updated code in the Question

C# Block code till processes release handle on files

I have a foreach loop that starts a process within a try/catch. In the finally section of my try/catch/finally I am trying to ensure that the process does not have a handle on any files. I have to delete the files that were being processed.
Nothing I have tried seems to be working. I continue to get System.IO exceptions: "The file is currently in use by another process."
You can see that in the finally I am using WaitForExit() before returning from this method. The very next method call is one to delete the files. Why would the process still be open or have a handle on any of these files after this?
Thanks!
try
{
    foreach (var fileInfo in jsFiles)
    {
        //removed for clarity
        _process.StartInfo.FileName = "\"C:\\Program Files\\Java\\jre6\\bin\\java\"";
        _process.StartInfo.Arguments = stringBuilder.ToString();
        _process.StartInfo.UseShellExecute = false;
        _process.StartInfo.RedirectStandardOutput = true;
        _process.Start();
    }
}
catch (Exception e)
{
    BuildMessageEventArgs args = new BuildMessageEventArgs("Compression Error: " + e.Message,
        string.Empty, "JSMin", MessageImportance.High);
    BuildEngine.LogMessageEvent(args);
}
finally
{
    _process.WaitForExit();
    _process.Close();
}

There's something seriously wrong here. You're starting a bunch of processes, but only waiting for the last spawned one to exit.
Are you sure you don't want the foreach outside the try block?
If you tell us more about what exactly you're trying to do, we could provide better suggestions.

I think you need to restructure your code. As it stands a failure for any of the processes in the foreach will cause an exit from the loop. Even if everything does succeed then your WaitForExit and Close calls in the finally block will only address the last process from the loop above.
You need to deal with each process and its success and/or failure individually. Create a method that accepts a fileInfo parameter and spawns and waits on each process. Move your loop into the client code that will be calling the suggested method.
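A rough sketch of such a method, under a few assumptions: the name RunAndWait is made up, and the command in Main is only a placeholder that echoes a word (in the question it would be the java invocation for one file). Because the question sets RedirectStandardOutput to true, the sketch reads the output before waiting; if the redirected pipe fills up and nobody reads it, the child process can block and never exit.

```csharp
using System;
using System.Diagnostics;

public static class ProcessRunner
{
    // Starts one process, drains its output, and returns only after it has exited.
    public static int RunAndWait(string fileName, string arguments, out string output)
    {
        using (var process = new Process())
        {
            process.StartInfo.FileName = fileName;
            process.StartInfo.Arguments = arguments;
            process.StartInfo.UseShellExecute = false;
            process.StartInfo.RedirectStandardOutput = true;

            process.Start();
            // Read redirected output before waiting; a full pipe can block the child.
            output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    public static void Main()
    {
        // Placeholder command so the sketch runs anywhere; not the real compressor.
        bool isWindows = Environment.OSVersion.Platform == PlatformID.Win32NT;
        string shell = isWindows ? "cmd.exe" : "/bin/sh";
        string args = isWindows ? "/c echo done" : "-c \"echo done\"";

        try
        {
            string output;
            int exitCode = RunAndWait(shell, args, out output);
            Console.WriteLine(exitCode + ": " + output.Trim());
        }
        catch (Exception e)
        {
            // One failing process is reported without affecting the others.
            Console.WriteLine("Process failed: " + e.Message);
        }
    }
}
```

The calling code would loop over the files and wrap each RunAndWait call in its own try/catch, so every process has exited, and has been disposed, before any file is deleted.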

Is the process a Console application or a GUI application?
For a GUI application, you will have to do Process.CloseMainWindow.

foreach (var fileInfo in jsFiles)
{
    using (Process process = new Process())
    {
        try
        {
            //Other stuff
            process.Start();
        }
        catch (Exception)
        {
            //Exception Handling goes here...
        }
        finally
        {
            try
            {
                process.WaitForExit();
            }
            catch (Exception)
            {
            }
        }
    }
}
Process.WaitForExit() might throw an exception, so it needs a try/catch of its own.
If you create the process in the using statement, you don't have to worry about closing it, .NET will dispose of it properly.
It's usually better to not precede local variables with an underscore character. Most people just use that for their fields.

Unable to move file because it's being used by another process -- my program?

My program is unable to File.Move or File.Delete a file because it is being used "by another process", but it's actually my own program that is using it.
I use Directory.GetFiles to initially get the file paths, and from there, I process the files by simply looking at their names and processing information that way. Consequently all I'm doing is working with the strings themselves, right? Afterwards, I try to move the files to a "Handled" directory. Nearly all of them will usually move, but from time to time, they simply won't because they're being used by my program.
Why is it that most of them move but one or two stick around? Is there anything I can do to try freeing up the file? There are no streams to close.
Edit: Here's some code:
public object[] UnzipFiles(string[] zipFiles)
{
    ArrayList al = new ArrayList(); //not sure of proper array size, so using arraylist
    string[] files = null;
    for (int a = 0; a < zipFiles.Length; a++)
    {
        string destination = settings.GetTorrentSaveFolder() + @"\[CSL]--Temp\" + Path.GetFileNameWithoutExtension(zipFiles[a]) + @"\";
        try
        {
            fz.ExtractZip(zipFiles[a], destination, ".torrent");
            files = Directory.GetFiles(destination,
                "*.torrent", SearchOption.AllDirectories);
            for (int b = 0; b < files.Length; b++)
                al.Add(files[b]);
        }
        catch(Exception e)
        {}
    }
    try
    {
        return al.ToArray(); //return all files of all zips
    }
    catch (Exception e)
    {
        return null;
    }
}
This is called from:
try
{
    object[] rawFiles = directory.UnzipFiles(zipFiles);
    string[] files = Array.ConvertAll<object, string>(rawFiles, Convert.ToString);
    if (files != null)
    {
        torrents = builder.Build(files);
        xml.AddTorrents(torrents);
        directory.MoveProcessedFiles(xml);
        directory.MoveProcessedZipFiles();
    }
}
catch (Exception e)
{ }
Therefore, the builder builds objects of class Torrent. Then I add the objects of class Torrent into a xml file, which stores information about it, and then I try to move the processed files which uses the xml file as reference about where each file is.
Despite it all working fine for most of the files, I'll get an IOException thrown about it being used by another process eventually here:
public void MoveProcessedZipFiles()
{
    string[] zipFiles = Directory.GetFiles(settings.GetTorrentSaveFolder(), "*.zip", SearchOption.TopDirectoryOnly);
    if (!Directory.Exists(settings.GetTorrentSaveFolder() + @"\[CSL] -- Processed Zips"))
        Directory.CreateDirectory(settings.GetTorrentSaveFolder() + @"\[CSL] -- Processed Zips");
    for (int a = 0; a < zipFiles.Length; a++)
    {
        try
        {
            File.Move(zipFiles[a], settings.GetTorrentSaveFolder() + @"\[CSL] -- Processed Zips\" + zipFiles[a].Substring(zipFiles[a].LastIndexOf('\\') + 1));
        }
        catch (Exception e)
        {
        }
    }
}

Based on your comments, this really smells like a handle leak. Then, looking at your code, the fz.ExtractZip(...) looks like the best candidate to be using file handles, and hence be leaking them.
Is the type of fz part of your code, or a third party library? If it's within your code, make sure it closes all its handles (the safest way is via using or try-finally blocks). If it's part of a third party library, check the documentation and see if it requires any kind of cleanup. It's quite possible that it implements IDisposable; in such case put its usage within a using block or ensure it's properly disposed.
The line catch(Exception e) {} is horribly bad practice. You should only get rid of exceptions this way when you know exactly what exception may be thrown and why do you want to ignore it. If an exception your program can't handle happens, it's better for it to crash with a descriptive error message and valuable debug information (eg: exception type, stack trace, etc), than to ignore the issue and continue as if nothing had gone wrong, because an exception means that something has definitely gone wrong.
Long story short, the quickest approach to debug your program would be to:
1. Replace your generic catchers with finally blocks.
2. Add/move any relevant cleanup code to the finally blocks.
3. Pay attention to any exception you get: where was it thrown from? What kind of exception is it? What do the documentation or code comments say about the method throwing it? And so on.
4. Either:
4.1. If the type of fz is part of your code, look for leaks there.
4.2. If it's part of a third party library, review the documentation (and consider getting support from the author).
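To illustrate the using advice, here is a small sketch that does not use the zip library from the question at all. The names HandleDemo and ReadFirstByte are hypothetical, and the temp-file names are made up. It opens a file inside a using block, which closes the handle even when reading throws, and then moves the file, which works because no handle is left open.

```csharp
using System;
using System.IO;

public static class HandleDemo
{
    // Reads the first byte of a file; the using block closes the handle
    // even if reading throws.
    public static int ReadFirstByte(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            return stream.ReadByte();
        }
    }

    public static void Main()
    {
        string source = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".torrent");
        string target = source + ".handled";
        File.WriteAllBytes(source, new byte[] { 42 });

        Console.WriteLine(ReadFirstByte(source));

        // The move succeeds because the stream was disposed before this line.
        File.Move(source, target);
        Console.WriteLine(File.Exists(target));

        File.Delete(target);
    }
}
```

If the stream were left undisposed, the handle would stay open until the garbage collector finalized it, which would match the intermittent "used by another process" errors described in the question.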
Hope this helps

What does this mean: "there are no streams to close"? Do you mean that you do not use streams, or that you close them?
I believe that you nevertheless have some open stream.
Do you have any static classes that use these files?
1. Try to write a simple application that only parses, moves and deletes the files, and see if that works.
2. Post some pieces of the code that works with your files here.
3. Try using Unlocker to double-check that nothing else is using those files: http://www.emptyloop.com/unlocker/ (don't forget to check the files for viruses :))

The Path class was handling multiple files to get me their filenames. Despite being unsuccessful in reproducing the same issue, forcing a garbage collection with GC.Collect at the end of the "processing" phase of my program has fixed the issue.
Thanks again all who helped. I learned a lot.
