Running Async Method in Loop without Stackoverflow Exception - c#

This piece of code keeps throwing a stackoverflow exception and I have a feeling it's either because of the await keyword causing the stack to fill up, or a thread availability issue. However, I'm not sure what the best way of remedying this would be.
The results variable is just a collection of StorageFiles and if it's above 1020 or so, the exception is thrown; otherwise it's usually fine.
private async void GetMusicTest()
{
    var sfolder = await StorageFolder.GetFolderFromPathAsync(dir);
    var query = sfolder.CreateFileQueryWithOptions(queryOptions);
    var results = await query.GetFilesAsync();
    for (int i = 0; i < results.Count; i++)
    {
        MusicProperties mp = await results[i].Properties.GetMusicPropertiesAsync();
        Debug.WriteLine(mp.Title);
    }
}
This code works fine in a console application, but the error is thrown when used in a desktop WinForm app.
Interestingly, if results.Count() is used instead, the error is thrown after three iterations, whereas results.Count throws it after iterating through at least half of the collection, if not all (it seems to vary). They both return the same values. What's the best way of looping through the collection without causing a stack overflow exception or using up all available threads?

I think this is a bug that should be addressed.
If I'm right, you can work around it by occasionally doing an await Task.Yield() within your loop.
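A minimal sketch of that workaround, assuming the loop is otherwise unchanged. The class name, the ProcessAllAsync helper, the yieldEvery parameter and the list of delegates are hypothetical stand-ins for the StorageFile collection; the only real API relied on is Task.Yield(). Whether it cures the overflow in the WinForms case is the answer's guess, not something this sketch proves.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class YieldLoopSketch
{
    // Hypothetical helper: awaits each item in turn and, every `yieldEvery`
    // iterations, awaits Task.Yield() so that the rest of the loop is
    // scheduled as a fresh continuation instead of running on the
    // current call stack.
    public static async Task<int> ProcessAllAsync(
        IReadOnlyList<Func<Task<string>>> items, int yieldEvery = 100)
    {
        int processed = 0;
        for (int i = 0; i < items.Count; i++)
        {
            // Stands in for results[i].Properties.GetMusicPropertiesAsync().
            string title = await items[i]();
            Console.WriteLine(title);
            processed++;

            if ((i + 1) % yieldEvery == 0)
            {
                await Task.Yield();
            }
        }
        return processed;
    }
}
```

The sketch returns a Task instead of using async void, so a caller can await it and observe any exception.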

Related

EF core query loop asynchronous

For the past day I have been looking for a way to make my code fully asynchronous, so that when it is called by a REST API, I'll get an immediate response while the process keeps running in the background.
To do that I simply used
tasks.Add(Task<bool>.Run( () => WholeProcessFunc(parameter) ))
where WholeProcessFunc is the function that makes all the calculations (it may be computationally intensive).
It works as expected however I read that it is not optimal to wrap the whole process in a Task.Run.
My code needs to run several Entity Framework queries, each of which depends on the result of the previous one, and it also contains a foreach loop.
For instance, I can't work out the best practice for making a function like this async:
public async Task<List<float>> func()
{
    List<float> acsi = new List<float>();
    using (var db = new EFContext())
    {
        long[] ids = await db.table1.Join(db.table2 /*,...*/)
            .Where(/*...*/)
            .Select(/*...*/).ToArrayAsync();
        foreach (long id in ids)
        {
            var all = db.table1.Join(/*...*/)
                .Where(/*...*/);
            float acsi_temp = await all.OrderByDescending(/*...*/)
                .Select(/*...*/).FirstAsync();
            if (acsi_temp < 0) { break; }
            acsi.Add(acsi_temp);
        }
    }
    return acsi;
}
In particular, I have difficulties with the foreach loop and the fact that the result of one query is used in the next.
Finally, there is the break statement, which I don't know how to translate. I read about cancellation tokens; could that be the way?
Is wrapping this whole function in a Task.Run a solid solution?
For the past day I have been looking for a way to make my code fully asynchronous, so that when it is called by a REST API, I'll get an immediate response while the process keeps running in the background.
Well, that's one meaning of the word "asynchronous". Unfortunately, it's completely different than the kind of "asynchronous" that async/await does. async yields to the thread pool, not the client (browser).
It works as expected however I read that it is not optimal to wrap the whole process in a Task.Run.
It only seems to work as expected. It's likely that once your web site gets higher load, it will start to fail. It's definite that once your web site gets busier and you do things like rolling upgrades, it will start to fail.
Is wrapping this whole function in a Task.Run a solid solution?
Not at all. Fire-and-forget is inherently dangerous.
A proper solution should be a basic distributed architecture:
A durable queue, such as an Azure Queue or Rabbit (if properly configured to be durable).
An independent processor, such as an Azure Function or Win32 Service.
Then the ASP.NET app will encode the work to be done into a queue message, enqueue that to the durable queue, and then return. Some time later, the processor will retrieve the message from that queue and do the actual work.
You can translate your code to return an IAsyncEnumerable<...>, so that the caller can process the results as they are obtained. In an ASP.NET 5 MVC endpoint, this includes writing serialised JSON to the browser:
public async IAsyncEnumerable<float> func()
{
    using (var db = new EFContext())
    {
        //...
        foreach (long id in ids)
        {
            //...
            if (acsi_temp < 0) { yield break; }
            yield return acsi_temp;
        }
    }
}
public async Task<IActionResult> ControllerAction()
{
    if (...)
        return NotFound();
    return Ok(func());
}
Note that if the endpoint itself is an async IAsyncEnumerable coroutine, then in ASP.NET 5 your headers would be flushed before your action even started, giving you no way to return any HTTP error codes.
For performance, though, you should try to rework your queries so that you can fetch all the data up front.
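For illustration, here is a self-contained sketch of how yield break replaces break and how a caller consumes the stream with await foreach (C# 8 or later). The ScoresAsync and CollectAsync names are hypothetical, and an in-memory sequence stands in for the EF queries.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncStreamSketch
{
    // Hypothetical stand-in for func(): yields values one at a time and
    // stops at the first negative one.
    public static async IAsyncEnumerable<float> ScoresAsync(IEnumerable<float> source)
    {
        foreach (float value in source)
        {
            await Task.Yield();               // stands in for the awaited EF query
            if (value < 0) { yield break; }   // replaces the original `break`
            yield return value;
        }
    }

    // The caller receives each value as soon as it has been produced.
    public static async Task<List<float>> CollectAsync(IEnumerable<float> source)
    {
        var results = new List<float>();
        await foreach (float score in ScoresAsync(source))
        {
            results.Add(score);
        }
        return results;
    }
}
```

No cancellation token is needed just to stop early: the iterator ends itself with yield break, and a consumer can also leave the await foreach loop at any point.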

Parallel Azure CloudBlockBlob-operations crashing (access violation)

The code sample below results once in a while in an access violation (1 out of 5,000 to 10,000 messages). Using a serial foreach instead of Parallel.ForEach seems to circumvent the problem.
public void DequeBatch<T>(int count)
{
    var messages = this.queueListen.ReceiveBatch(count);
    var received = new ConcurrentBag<KeyValuePair<Guid, T>>();
    Action<BrokeredMessage> UnwrapMessage = message =>
    {
        blobName = message.GetBody<string>();
        obj = Download<T>(blobName);
        received.Add(new KeyValuePair<Guid, T>(new Guid(blobName), obj));
    };
    // offending operation
    Parallel.ForEach(messages, new ParallelOptions { MaxDegreeOfParallelism = count }, UnwrapMessage);
}
public override T Download<T>(string blobName)
{
    CloudBlockBlob blob;
    lock (this.containerDownloadLock)
    {
        blob = this.containerDownload.GetBlockBlobReference(blobName);
    }
    T result;
    using (var stream = new MemoryStream())
    {
        blob.DownloadToStream(stream);
        stream.Position = 0;
        result = Decompress<T>(stream); // dehydrate an object of type T from a GZipStream
    }
    return result;
}
Q1: What is the offending part which makes the code above thread-unsafe?
Q2: What is the correct and safe approach to up- and download CloudBlockBlobs in parallel?
Edit
Today, the code outlined above ran into a dead-lock. After hitting break-all in the debugger I observed that all of the worker-threads executing blob.DownloadToStream(stream); were trapped in
System.Net.AutoWebProxyScriptEngine.EnterLock
except for one which was blocked (no exception or anything else) in
System.Net.WinHttpProxyFinder.WinHttpGetProxyForUrl
An exception System.AccessViolationException can only originate from unmanaged code or from unsafe managed code. What you have above is normal (i.e. safe) managed code, so you should not be scrutinizing that code at the moment, but instead focus on other possibilities:
1. Do you have any unmanaged or unsafe code in your app? If so, that might be a reason for memory corruption, which in turn would cause an Access Violation. Test your app under paged heap and GFlags.
2. Execute your app under a debugger and collect a crash dump. Look at the crash dump and check whether you have familiar code in the call stack. WinDbg's !analyze command would do the analysis for you automatically. You will have to know how to fix up symbols for your own and third-party libraries. Example is here.
3. It might be a bug in Microsoft's implementation of Blob.
If you reasonably excluded #1 and #2, and suspect #3 might be the issue, you should collect a crash dump and send it over to Microsoft, only they would be able to help in that case.

Out of Memory Exception when using File Stream Write Byte to Output Progress Through the Console

I have the following code that throws an out of memory exception when writing large files. Is there something I'm missing?
I am not sure why it is throwing an out of memory error, as I thought the FileStream would only use a maximum of 4096 bytes for the buffer. I am not entirely sure what is meant by the buffer, to be honest, and any advice would be appreciated.
public static async Task CreateRandomFile(string pathway, int size, IProgress<int> prog)
{
    byte[] fileSize = new byte[size];
    new Random().NextBytes(fileSize);
    await Task.Run(() =>
    {
        using (FileStream fs = File.Create(pathway, 4096))
        {
            for (int i = 0; i < size; i++)
            {
                fs.WriteByte(fileSize[i]);
                prog.Report(i);
            }
        }
    });
}
public static void p_ProgressChanged(object sender, int e)
{
    int pos = Console.CursorTop;
    Console.WriteLine("Progress Copied: " + e);
    Console.SetCursorPosition(0, pos);
}
public static void Main()
{
    Console.WriteLine("Testing CopyLearning");
    //CopyFile()
    Progress<int> p = new Progress<int>();
    p.ProgressChanged += p_ProgressChanged;
    Task ta = CreateRandomFile(@"D:\Programming\Testing\RandomFile.asd", 99999999, p);
    ta.Wait();
}
Edit: the 99,999,999 was just created to make a 99MB file
Note: I have commented out prog.Report(i) and it will work fine.
It seems for some reason, the error occurs at the line
Console.WriteLine("Progress Copied: " + e);
I am not entirely sure why this causes an error? So the error might have been caused because of the progressEvent?
Edit 2: I have followed advice to change the code such that it reports progress every 4000 Bytes by using the following:
if (i%4000==0)
prog.Report(i);
For some reason, I am now able to write files of up to 900 MB just fine.
I guess the question is, why would the "Edit 2"'s code allow it to write up to 900MB just fine? Is it because it's reporting progress and writing to the console up to 4000x less than before? I didn't realize the Console would take up so much memory especially because I'm assuming all it's doing is outputting "Progress Copied"?
Edit 3:
For some reason when I change the following line as follows:
for (int i = 0; i < size; i++)
{
    fs.WriteByte(fileSize[i]);
    Console.WriteLine(i);
    prog.Report(i);
}
where there is a "Console.Writeline()" before the prog.Report(i), it would work fine and copy the file, albeit take a very long time to do so. This leads me to believe that this is a Console related issue for some reason but I am not sure as to what.
fs.WriteByte(fileSize[i]);
prog.Report(i);
You created a fire-hose problem. After deadlocks and threading races, probably the 3rd most likely problem caused by threads. And just as hard to diagnose.
This is easiest to see by using the debugger's Debug + Windows + Threads window and looking at the thread that is executing CreateRandomFile(). With some luck, you'll see that it has completed and has written all 99 MB. But the progress reported on the console is far behind this, having only reported about 125 KB written, give or take.
Core issue is the way Progress<>.Report() works. It uses SynchronizationContext.Post() to invoke the ProgressChanged event handler. In a console mode app that will call ThreadPool.QueueUserWorkItem(). That's quite fast, your CreateRandomFile() method won't be bogged down much by it.
But the event handler itself is quite a lot slower, console output is not very fast. So in effect, you are adding threadpool work requests at an enormous rate, 99 million of them in a handful of seconds. No way for the threadpool scheduler to keep up, you'll have roughly 4 of them executing at the same time. All competing to write to the console as well, only one of them can acquire the underlying lock.
So it is the threadpool scheduler that causes OOM, forced to store so many work requests.
And sure, when you call Report() less frequently, the fire-hose problem is a lot less severe. It is not actually that simple to ensure it never causes a problem, although directly calling Console.Write() is an obvious fix. Ultimately the answer is simple: create a UI that is useful to a human. Nobody likes a crazily scrolling window or a blur of text. Reporting progress no more frequently than 20 times per second is plenty good enough for the user's eyes, and the console has no trouble keeping up with that.
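One way to cap the report rate, sketched under a few assumptions: the ThrottledProgress class and its maxReportsPerSecond parameter are hypothetical, it invokes the callback directly on the reporting thread (no SynchronizationContext.Post, so nothing queues up), and it is meant for a single reporting thread such as the one running CreateRandomFile().

```csharp
using System;
using System.Diagnostics;

// Hypothetical IProgress<int> wrapper that forwards at most
// `maxReportsPerSecond` reports and silently drops the rest.
public sealed class ThrottledProgress : IProgress<int>
{
    private readonly Action<int> onReport;
    private readonly long minTicksBetweenReports;
    private readonly Stopwatch clock = Stopwatch.StartNew();
    private long lastReportTicks = -1; // -1 means nothing has been forwarded yet

    public ThrottledProgress(Action<int> onReport, int maxReportsPerSecond = 20)
    {
        this.onReport = onReport;
        // Stopwatch.Frequency is the number of stopwatch ticks per second.
        this.minTicksBetweenReports = Stopwatch.Frequency / maxReportsPerSecond;
    }

    // Not thread-safe: assumes a single thread calls Report().
    public void Report(int value)
    {
        long now = clock.ElapsedTicks;
        if (lastReportTicks >= 0 && now - lastReportTicks < minTicksBetweenReports)
        {
            return; // too soon after the previous forwarded report: drop it
        }
        lastReportTicks = now;
        onReport(value); // runs synchronously on the calling thread
    }
}
```

It could be passed to CreateRandomFile in place of the Progress<int> instance, for example new ThrottledProgress(i => Console.WriteLine("Progress Copied: " + i)). Because intermediate values are dropped, the final value may never be shown, so the caller would print a completion message itself once the task finishes.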

Unable to step through foreach loop

private async void Clicked(object sender, RoutedEventArgs e)
{
    StorageFolder s;
    s = KnownFolders.DocumentsLibrary;
    IReadOnlyList<StorageFile> l = await s.GetFilesAsync();
    bool exists = false;
    foreach (StorageFile sf in l)
    {
        if (string.Equals(sf.Name, "encrypted.txt", StringComparison.CurrentCultureIgnoreCase))
            exists = true;
    }
    MessageDialog d = new MessageDialog(exists.ToString());
    await d.ShowAsync();
}
When debugging the code, I'm not able to step through the code inside the foreach loop.
The value of exists at the end of the loop is correct, and I can see the loop executing if I put the MessageDialog code inside the loop. Any idea how to step into the loop properly?
Even when I put the MessageDialog code in the loop, I'm unable to step into the if condition, so I suspect the issue lies somewhere in there.
EDIT: Putting a breakpoint on the if condition works (currently I'm putting it on the first line of the function), but shouldn't I be able to step into the loop normally using F11 if I'm debugging line by line anyway? (At least that's how it worked in TurboC.)
EDIT2: Easiest way I could think of to show the issue clearly: http://www.youtube.com/watch?v=10GgXCqLlVo&feature=youtu.be (skip to 25 second mark to see the actual issue)
I don't know the details of how debugging works with async methods (I don't tend to use debuggers much), but you need to understand that the system will basically call back into the method, having returned from it (as far as the caller is concerned) when it first reaches an await expression which hasn't already completed.
I would put a break point on the bool exists=false; line, i.e. after the first await expression. You should then be able to use F10 (step over) to iterate through the loop.
Alternatively, you could get rid of the loop entirely using LINQ:
bool exists = l.Any(sf => string.Equals(sf.Name, "encrypted.txt",
StringComparison.CurrentCultureIgnoreCase));

Explain Strange Synchronization in WCF Source Code

While looking at the source code of System.ServiceModel.Channels.BufferManager, I noticed this method:
void TuneQuotas()
{
    if (areQuotasBeingTuned)
        return;
    bool lockHeld = false;
    try
    {
        try { }
        finally
        {
            lockHeld = Monitor.TryEnter(tuningLock);
        }
        // Don't bother if another thread already has the lock
        if (!lockHeld || areQuotasBeingTuned)
            return;
        areQuotasBeingTuned = true;
    }
    finally
    {
        if (lockHeld)
        {
            Monitor.Exit(tuningLock);
        }
    }
    //
    // DO WORK... (code removed for brevity)
    //
    areQuotasBeingTuned = false;
}
Obviously, they want only one thread to run TuneQuotas(), and other threads to not wait if it is already being run by another thread. I should note that the code removed was not try protected.
I'm trying to understand the advantages of this method above over just doing this:
void TuneQuotas()
{
    if (!Monitor.TryEnter(tuningLock)) return;
    //
    // DO WORK...
    //
    Monitor.Exit(tuningLock);
}
Any ideas why they might have bothered with all that? I suspect the way they use the finally blocks is to guard against a thread abort scenario, but I still don't see the point because, even with all this code, TuneQuotas() would be locked for good if that one thread doesn't make it all the way to the end to set areQuotasBeingTuned = false, for one reason or another. So is there something cool about this pattern that I'm missing?
EDIT:
As a side note, it seems the method exists in .NET 4.0, which I confirmed using this code running on framework 4 (although I cannot confirm that the content of the method hasn't changed from what I found on the web):
var buffMgr = BufferManager.CreateBufferManager(1, 1);
var pooledBuffMgrType = buffMgr.GetType()
    .GetProperty("InternalBufferManager")
    .GetValue(buffMgr, null)
    .GetType();
Debug.WriteLine(pooledBuffMgrType.Module.FullyQualifiedName);
foreach (var methodInfo in pooledBuffMgrType
    .GetMethods(BindingFlags.Instance | BindingFlags.NonPublic))
{
    Debug.WriteLine(methodInfo.Name);
}
which outputs:
C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Runtime.DurableInstancing\v4.0_4.0.0.0__31bf3856ad364e35\System.Runtime.DurableInstancing.dll
ChangeQuota
DecreaseQuota
FindMostExcessivePool
FindMostStarvedPool
FindPool
IncreaseQuota
TuneQuotas
Finalize
MemberwiseClone
I'll add some comments:
void TuneQuotas()
{
    if (areQuotasBeingTuned)
        return; // fast-path, does not require locking
    bool lockHeld = false;
    try
    {
        try { }
        finally
        {
            // finally-blocks cannot be aborted by Thread.Abort
            // The thread could be aborted after getting the lock and before setting lockHeld
            lockHeld = Monitor.TryEnter(tuningLock);
        }
        // Don't bother if another thread already has the lock
        if (!lockHeld || areQuotasBeingTuned)
            return; // areQuotasBeingTuned could have switched to true in the mean-time
        areQuotasBeingTuned = true; // prevent others from needlessly trying to lock (trigger fast-path)
    }
    finally // ensure the lock being released
    {
        if (lockHeld)
        {
            Monitor.Exit(tuningLock);
        }
    }
    //
    // DO WORK... (code removed for brevity)
    //
    // this might be a bug. There should be a call to Thread.MemoryBarrier,
    // or areQuotasBeingTuned should be volatile
    // if not, the write might never reach other processor cores
    // maybe this doesn't matter for x86
    areQuotasBeingTuned = false;
}
The simple version you gave does not protect against some problems. At the very least it is not exception-safe (lock won't be released). Interestingly, the "sophisticated" version, doesn't either.
This method has been removed from .NET 4.
Until .NET 4.0 there was essentially a bug in the code that was generated by a lock statement. It would generate something similar to the following:
Monitor.Enter(lockObject);
// see next paragraph
try
{
    // code that was in the lock block
}
finally
{
    Monitor.Exit(lockObject);
}
This means that if an exception occurred between Enter and try, the Exit would never be called. As usr alluded to, this could happen due to Thread.Abort.
Your example:
if(!Monitor.TryEnter(tuningLock)) return;
//
// DO WORK...
//
Monitor.Exit(tuningLock);
Suffers from this problem and more. The window in which this code can be interrupted so that Exit is never called is basically the whole block of code, and any exception will do it (not just one from Thread.Abort).
I have no idea why most code was written in .NET. But, I surmise that this code was written to avoid the problem of an exception between Enter and try. Let's look at some of the details:
try { }
finally
{
    lockHeld = Monitor.TryEnter(tuningLock);
}
Finally blocks basically generate a constrained execution region in IL. Constrained execution regions cannot be interrupted by anything. So, putting the TryEnter in the finally block above ensures that lockHeld reliably holds the state of the lock.
That block of code is contained in a try/finally block whose finally block calls Monitor.Exit if lockHeld is true. This means that there is no point between the TryEnter and the try block at which the code can be interrupted.
FWIW, this method was still in .NET 3.5 and is visible in the WCF 3.5 source code (not the .NET source code). I don't know yet what's in 4.0; but I would imagine it would be the same; there's no reason to change working code even if the impetus for part of its structure no longer exists.
For more details on what lock used to generate see http://blogs.msdn.com/b/ericlippert/archive/2007/08/17/subtleties-of-c-il-codegen.aspx
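For comparison, since .NET 4 Monitor has overloads that take a ref bool lockTaken argument, which closes the same gap without the empty try block. The sketch below is not the WCF code: the QuotaTuner class, the TryTune method and the RunCount counter are hypothetical, and unlike TuneQuotas() it keeps the lock for the whole duration of the work instead of releasing it and relying on a flag.

```csharp
using System.Threading;

public sealed class QuotaTuner
{
    private readonly object tuningLock = new object();

    // Counts how many times the placeholder work has run.
    public int RunCount { get; private set; }

    // Returns true if this call did the work, false if another thread
    // already held the lock (in which case it returns without waiting).
    public bool TryTune()
    {
        bool lockTaken = false;
        try
        {
            // lockTaken is set as part of acquiring the lock, so the
            // finally block always knows whether it has to release it.
            Monitor.TryEnter(tuningLock, ref lockTaken);
            if (!lockTaken)
            {
                return false;
            }

            RunCount++; // DO WORK... (placeholder)
            return true;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(tuningLock);
            }
        }
    }
}
```

Because the work sits inside the try block, an exception thrown by it still releases the lock, which neither version in the question guarantees for the areQuotasBeingTuned flag.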
Any ideas why they might have bothered with all that?
After running some tests, I think I see one reason (if not THE reason): they probably bothered with all that because it is MUCH faster!
It turns out Monitor.TryEnter is an expensive call IF the object is already locked (if it's not locked, TryEnter is still very fast, so no problems there). So all threads except the first one are going to experience the slowness.
I didn't think this would matter that much since, after all, each thread is going to try taking the lock just once and then move on (it's not as if they'd be sitting there, trying in a loop). However, I wrote some code for comparison, and it showed that the cost of TryEnter (when already locked) is significant. In fact, on my system each call took about 0.3 ms without the debugger attached, which is several orders of magnitude slower than a simple boolean check.
So I suspect this showed up in Microsoft's test results, and they optimized the code as above by adding the fast-path boolean check. But that's just my guess.
