Parallel Azure CloudBlockBlob-operations crashing (access violation) - c#

The code sample below results once in a while in an access violation (1 out of 5,000 to 10,000 messages). Using a serial foreach instead of Parallel.ForEach seems to circumvent the problem.
public void DequeBatch<T>(int count)
{
var messages = this.queueListen.ReceiveBatch(count);
var received = new ConcurrentBag<KeyValuePair<Guid, T>>();
Action<BrokeredMessage> UnwrapMessage = message =>
{
blobName = message.GetBody<string>();
obj = Download<T>(blobName);
received.Add(new KeyValuePair<Guid, T>(new Guid(blobName), obj));
};
// offending operation
Parallel.ForEach(messages, new ParallelOptions { MaxDegreeOfParallelism = count }, UnwrapMessage);
}
public override T Download<T>(string blobName)
{
CloudBlockBlob blob;
lock (this.containerDownloadLock)
{
blob = this.containerDownload.GetBlockBlobReference(blobName);
}
T result;
using (var stream = new MemoryStream())
{
blob.DownloadToStream(stream);
stream.Position = 0;
result = Decompress<T>(stream); // dehydrate an object of type T from a GZipStream
}
return result;
}
Q1: What is the offending part which makes the code above thread-unsafe?
Q2: What is the correct and safe approach to up- and download CloudBlockBlobs in parallel?
Edit
Today, the code outlined above ran into a dead-lock. After hitting break-all in the debugger I observed that all of the worker-threads executing blob.DownloadToStream(stream); were trapped in
System.Net.AutoWebProxyScriptEngine.EnterLock
except for one which was blocked (no exception or anything else) in
System.Net.WinHttpProxyFinder.WinHttpGetProxyForUrl

An exception System.AccessViolationException can only originate from unmanaged code or from unsafe managed code. What you have above is normal (i.e. safe) managed code, so you should not be scrutinizing that code at the moment, but instead focus on other possibilities:
Do you have any unmanaged or unsafe code in your app? If so, that might be a reason for memory corruption, which in turn would cause an Access Violation. Test your app under paged heap and GFlags.
Execute your app under debugger and collect a crash dump. Look at the crash dump and check if you have familiar code in the call stack. A Windgb's !analyze command would get the analysis for you automatically. You will have to have to know how to fix up symbols for your and 3-rd party libraries. Example is here.
It might be a bug in Microsoft's implementation of Blob.
If you reasonably excluded #1 and #2, and suspect #3 might be the issue, you should collect a crash dump and send it over to Microsoft, only they would be able to help in that case.

Related

C# StreamReader causing +200ms overhead to read data from source

There is an issue where we are seeing some periodic +200ms overhead on reading the Input Stream from a Stream Reader when there is load on the system. I am wondering has anyone else seen this and if they have done anything to fix it?
The following is the code:
string requestBody;
var streamReaderTime = Stopwatch.StartNew();
using (var streamReader = new StreamReader(context.Request.InputStream, context.Request.ContentEncoding))
{
var allLines = streamReader.ReadLines();
var request = new StringBuilder();
allLines.ForEach(line => request.Append(line));
requestBody = request.ToString();
}
streamReaderTime.Stop();
ReadLine is just as follows:
public static IEnumerable<string> ReadLines(this StreamReader reader)
{
while (!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
Note: Using ReadLines() or ReadToEnd() makes very little difference if any.
We run performance tests overnight and we are seeing the following behavior just from graphing streamReaderTime.
A single request takes between 45ms and 70ms to execute but it can be seen from the screenshot that it is adding on a fixed value and sometimes an even bigger spike. I saw it before being at around 1.5 seconds.
If anyone has any solutions/suggestions it would be greatly appreciated.
Edit : I did have ReadToEnd() instead of ReadLines() and that got rid of the StringBuilder but it was still the same overhead. Is there an alternative to StreamReader, just to test out even? It does seem like GC cost since having a request ever ten seconds does not effect it, but the exact same request per second will cause this overhead to happen. Also I am not able to reproduce it locally either, it is only in the virtual environment that this is happening.
This issue is not with the above code at all. The issue is from the caller. The service calling is using a library that is cutting the connections to early and that overhead is the connection being re-established again.

Await doesn't give a result

Ok, I have some code to present. Here is extension method for NetworkStream object.
public async static Task<byte[]> ReadDataAsync(this NetworkStream clientStream)
{
byte[] data = {};
var buffer = new byte[1024];
if (clientStream.CanRead)
{
using (var ms = new MemoryStream())
{
try
{
int bytesRead;
while (clientStream.DataAvailable &&
(bytesRead = await clientStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
await ms.WriteAsync(buffer, 0, bytesRead);
}
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
return data;
}
data = ms.ToArray();
}
}
else
{
Console.WriteLine("Closing clientStream.");
clientStream.Close();
}
return data;
}
And the code where I am trying to call this method.
public async static Task Preform(Socket client)
{
var stream = new NetworkStream(client);
var data = await stream.ReadDataAsync();
var message = await MessageFabrique.DeserializeMessage(data);
ServerCollections.Instance.ServerIssueQueue.Add(new ServerIssue
{
Message = message,
ClientStream = stream
});
}
ReadDataAsync method always returns me to an empty array. And at the moment when i'm trying to deserialize data there is an exception - because data[0]. Please help me. Why is this happening, if await guarantees me the result, when it needed?
clientStream.DataAvailable does not mean data might show up in the future. It means data is available right now for reading. Get rid of it and just read, the read will block till data shows up or will return 0 when the stream hits it's end.
Scott's answer is right, but .Net already takes care of you...
You might consider Stream.CopyToAsync
await clientStream.CopyToAsync(ms)
for code with considerably less places to go wrong.
In addition to the other answers, you might also want to create a synchronization context. See this article for details.
The summary is that async/await works differently in console applications than it does in a UI application. WPF and WebForms applications have a synchronization context by default but console applications don't. The result (which is actually remarkably "poorly advertised" in the documentation) is that the behavior of async/await is much less predictable in a console application than it is in a UI application, and that this might make it not work "as advertised" under certain circumstances.
For example, in a UI application "async" doesn't necessarily mean that the code runs on a background thread. It's the equivalent of "come back to me later when I'm ready." As an analogy, consider going out to eat with 10 people: when the waiter comes by, the first person he asks to order isn't ready. Two bad solutions here would be to a) bring in a second waiter to either wait for the first guy to become ready or take the other 9 people's orders) or b) wait until the first guy's ready to start taking orders. The optimal thing is to take the other 9 people's orders and then come back to the first guy hoping he'll be ready by that time. At risk of oversimplifying this is basically how async works in a UI (unless you're explicitly putting the code on a background thread with something like Task.Run). However, in a console application when you use async there's no guarantee as to where the code will actually run.
If, however, you add a synchronization context as described in the the article I link to it'll behave in a much more predictable manner.

Out of Memory Exception when using File Stream Write Byte to Output Progress Through the Console

I have the following code that throws an out of memory exception when writing large files. Is there something I'm missing?
I am not sure why it is throwing an out of memory error as I thought the Filestream would only use a maximum of 4096 bytes for the buffer? I am not entirely sure what it means by the Buffer to be honest and any advice would be appreciated.
public static async Task CreateRandomFile(string pathway, int size, IProgress<int> prog)
{
byte[] fileSize = new byte[size];
new Random().NextBytes(fileSize);
await Task.Run(() =>
{
using (FileStream fs = File.Create(pathway,4096))
{
for (int i = 0; i < size; i++)
{
fs.WriteByte(fileSize[i]);
prog.Report(i);
}
}
}
);
}
public static void p_ProgressChanged(object sender, int e)
{
int pos = Console.CursorTop;
Console.WriteLine("Progress Copied: " + e);
Console.SetCursorPosition (0, pos);
}
public static void Main()
{
Console.WriteLine("Testing CopyLearning");
//CopyFile()
Progress<int> p = new Progress<int>();
p.ProgressChanged += p_ProgressChanged;
Task ta = CreateRandomFile(#"D:\Programming\Testing\RandomFile.asd", 99999999, p);
ta.Wait();
}
Edit: the 99,999,999 was just created to make a 99MB file
Note: I have commented out prog.Report(i) and it will work fine.
It seems for some reason, the error occurs at the line
Console.writeline("Progress Copied: " + e);
I am not entirely sure why this causes an error? So the error might have been caused because of the progressEvent?
Edit 2: I have followed advice to change the code such that it reports progress every 4000 Bytes by using the following:
if (i%4000==0)
prog.Report(i);
For some reason. I am now able to write files up to 900MBs fine.
I guess the question is, why would the "Edit 2"'s code allow it to write up to 900MB just fine? Is it because it's reporting progress and writing to the console up to 4000x less than before? I didn't realize the Console would take up so much memory especially because I'm assuming all it's doing is outputting "Progress Copied"?
Edit 3:
For some reason when I change the following line as follows:
for (int i = 0; i < size; i++)
{
fs.WriteByte(fileSize[i]);
Console.Writeline(i)
prog.Report(i);
}
where there is a "Console.Writeline()" before the prog.Report(i), it would work fine and copy the file, albeit take a very long time to do so. This leads me to believe that this is a Console related issue for some reason but I am not sure as to what.
fs.WriteByte(fileSize[i]);
prog.Report(i);
You created a fire-hose problem. After deadlocks and threading races, probably the 3rd most likely problem caused by threads. And just as hard to diagnose.
Easiest to see by using the debugger's Debug + Windows + Threads window and look at thread that is executing CreateRandomFile(). With some luck, you'll see it is completed and has written all 99MB bytes. But the progress reported on the console is far behind this, having only reported 125KB bytes written, give or take.
Core issue is the way Progress<>.Report() works. It uses SynchronizationContext.Post() to invoke the ProgressChanged event handler. In a console mode app that will call ThreadPool.QueueUserWorkItem(). That's quite fast, your CreateRandomFile() method won't be bogged down much by it.
But the event handler itself is quite a lot slower, console output is not very fast. So in effect, you are adding threadpool work requests at an enormous rate, 99 million of them in a handful of seconds. No way for the threadpool scheduler to keep up, you'll have roughly 4 of them executing at the same time. All competing to write to the console as well, only one of them can acquire the underlying lock.
So it is the threadpool scheduler that causes OOM, forced to store so many work requests.
And sure, when you call Report() less frequently then the fire-hose problem is a lot less worse. Not actually that simple to ensure it never causes a problem, although directly calling Console.Write() is an obvious fix. Ultimately simple, create a usable UI that is useful to a human. Nobody likes a crazily scrolling window or a blur of text. Reporting progress no more frequently than 20 times per second is plenty good enough for the user's eyes, the console has no trouble keeping up with that.

foreach mysteriously hangs on first item of a ResourceSet

Occasionally our site slows down and the RAM usage goes up massively high. Then the app pool stops and I have to restart it. Then it's ok for a few days before the RAM suddenly spikes again and the app pool soon stops. The CPU isn't high.
Before the app pool stops I've noticed that one of our pages always hangs. The line it hangs on is a foreach on a ResourceSet :
var englishLocations = Lang.Countries.ResourceManager.GetResourceSet(new CultureInfo("en-GB"),true,true);
foreach(DictionaryEntry entry2 in englishLocations) // THIS LINE HANGS
We have the same code deployed on a different box and this doesn't happen. The main differences between the two boxes are :
Bad box
Window Server 2008 R2 Standard SP 1
IIS 7.5.7600.16385
.NET 4.5
24GB RAM
Good box
Window Server 2008 Server SP 2
IIS 7.0.6000.16386 SP 2
.NET 4.0
24GB RAM
I've tried adding uploadReadAheadSize="0" to the web.config as described here :
http://rionscode.wordpress.com/2013/03/11/resolving-controller-blocking-within-net-4-5-and-asp-net-mvc/
Which didn't work.
Why would foreach hang? It's hanging on the very first item, actually on the foreach.
Thanks.
I know it is an old post, but nevertheless... There is the potential of a deadlock when iterating over a ResourceSet and at the same time retrieving some other object through from the same Resources.
The problem is that when using a ResourceSet the iterator takes out locks on the internal resource cache of the ResourceReader http://referencesource.microsoft.com/#mscorlib/system/resources/resourcereader.cs,1389 and then in the method AllocateStringNameForIndex takes out a lock on the reader itself: http://referencesource.microsoft.com/#mscorlib/system/resources/resourcereader.cs,447
lock (_reader._resCache) {
key = _reader.AllocateStringForNameIndex(_currentName, out _dataPosition); // locks the reader
Getting an object takes out the same locks int the opposite order:
http://referencesource.microsoft.com/#mscorlib/system/resources/runtimeresourceset.cs,300
and http://referencesource.microsoft.com/#mscorlib/system/resources/runtimeresourceset.cs,335
lock(Reader) {
....
lock(_resCache) {
_resCache[key] = resLocation;
}
}
This can lead to a deadlock. We had this exact issue recently..
I experienced very similar problem.
Every once in a while IIS would hang, and I would see number of requests just sitting there. They were all in state ExecuteRequestHandler and with ManagedPipelineHandler module name.
After investigating with process explorer, I could see that all of them were sitting at mscorlib.dll!ResourceEnumerator.get_Entry, additional stack trace suggested some NGen action and then ntdll.dll!WaitForMultipleObjects.
My working hypothesis is that when multiple threads start enumerating those resources, we can run into a deadlock (possibly on some native code file generation), and alll subsequent threads then just keep on piling up.
To resolve it, I just created a critical section around this code block, to ensure that it is executed sequentially - I haven't experienced the issue since.
private static readonly object ResourceLock = new object();
public static MvcHtmlString SerializeGlobalResources(this HtmlHelper helper)
{
lock (ResourceLock)
{
// Existing code goes here ....
}
}
Based upon another answer to give you some idea how about using a try catch model ?
Perhaps it hangs because that resource isnt available / locked /..permissions etc.
var englishLocations = Lang.Countries.ResourceManager.GetResourceSet(new CultureInfo("en-GB"),true,true);
foreach(DictionaryEntry entry2 in englishLocations) // THIS LINE HANGS
ResourceManager CultureResourceManager = new ResourceManager("My.Language.Assembly", System.Reflection.Assembly.GetExecutingAssembly());
ResourceSet resourceSet = CultureResourceManager.GetResourceSet("sv-SE", true, true);
try { resourceSet.GetString("my_language_resource");}
catch (exception ex) { // from here log your error ex to wherever you like with some code }

OutOfMemoryException handling and Windows Media Player SDK

How it must work:
WindowsMediaPlayer windowsMediaPlayer = new WindowsMediaPlayer();
IWMPMediaCollection collection = windowsMediaPlayer.mediaCollection;
IWMPMedia newMedia = collection.add(path); //causes OutOfMemoryException after some thousands method's iterations
I've tried to avoid it this way:
try
{
newMedia = collection.add(path);
return newMedia;
}
catch (Exception)
{
collection = null;
GC.Collect();
GC.WaitForPendingFinalizers();
WindowsMediaPlayer windowsMediaPlayer = new WindowsMediaPlayer();
IWMPMediaCollection collectionNew = windowsMediaPlayer.mediaCollection;
return CreateNewMedia(collectionNew, path);
}
But this does not work – I still get infinite exception loop inside catch.
You can not handle OutOfMemoryException like ordinary one. The reason you may have handling that, is just, in some way, to save the state of application, in order to provide to the consumer of your application a way to recover lost data.
What I mean is that there is no meaning calling GC.Collect or whatever, cause application is going to dead anyway, but CLR kindly give you notification about that before.
So to resolve this issue, check the memory consumption of your application, that on 32bit machine has to be something about 1.2GB of RAM, or control the quantity of the elements in collections you have, that can not exceed, for ordinary list, 2GB of memory on 32bit and on 64bit too.

Categories

Resources