Threads hang until I attach debugger - C#

I have a service using WCF. Internally it has a dictionary of lists; different endpoints can add to a list or get a subset of one.
The code is something like this:
List<Data> list = null;
try
{
    locker.EnterReadLock();
    list = internalData[Something].Where(x => x.hassomething()).ToList();
}
finally
{
    locker.ExitReadLock();
}
foreach (var y in list)
{
    result[y.property1].Add(y.property2); // <-- here it hangs
}
return result;
So internalData is locked with a ReaderWriterLockSlim for all operations: a read lock for reading and a write lock for adding. I make a copy of the items inside the lock and work on that copy afterwards.
The issue is that after a while more and more CPU cores go to 100%, until finally all cores are in use. It can run perfectly for days and millions of calls before it stops.
Attaching the debugger and pausing shows that one thread hangs on adding to the result dictionary. But as soon as I resume, all threads continue and a lot of memory is released.
Is there something special about attaching a debugger and pausing and resuming that would release something like this?
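For reference, the usual ReaderWriterLockSlim pattern acquires the lock before the try block, so the finally can never call ExitReadLock without a matching successful Enter. A minimal sketch using the names from the question (assuming locker is a ReaderWriterLockSlim and internalData is the shared dictionary):
locker.EnterReadLock();
try
{
    // Snapshot under the read lock, then work on the copy outside it.
    list = internalData[Something].Where(x => x.hassomething()).ToList();
}
finally
{
    locker.ExitReadLock();
}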

I changed my locks to lock(something) { code... } and the problem with the hangs went away. So it looks like I ran into the issue Steffen Winkler pointed out in his comment: http://joeduffyblog.com/2007/02/07/introducing-the-new-readerwriterlockslim-in-orcas/
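For illustration, a minimal sketch of the read path rewritten with a plain monitor lock (same names as above; syncRoot is a hypothetical lock object, and the write path would take the same lock around its Add):
private readonly object syncRoot = new object(); // hypothetical lock object guarding internalData

List<Data> list;
lock (syncRoot)
{
    list = internalData[Something].Where(x => x.hassomething()).ToList();
}
foreach (var y in list)
{
    result[y.property1].Add(y.property2);
}
return result;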

Related

CPU is 100% at multithreading

First, I've read all the posts here regarding this issue and I managed to progress a bit. However, it seems I do need your help :)
I have a program with several threads; sometimes (not always) the CPU usage of the program increases up to 100% and never drops until I shut the program down.
As I read in other similar posts, I ran the app using Visual Studio (2012 Ultimate).
I paused the app and opened the Threads window.
There I paused the threads one by one until I found the 4 threads that were stuck.
They all refer to the same line of code (a call to a constructor).
I checked the constructor inside and out and couldn't find any loop that could cause this.
To be more thorough, I added breakpoints to almost every line of code and resumed the app. None of them were hit.
This is the code:
public static void GenerateDefacementSensors(ICrawlerManager cm)
{
    m_SensorsMap = new Dictionary<DefacementSensorType, DefacementSensor>();
    // Create instance of all sensors
    // For any new defacement sensor, don't forget to add an appropriate line here
    // m_SensorsMap.Add(DefacementSensorType.[Type], new [Type]Sensor())
    try
    {
        if (m_SensorsMap.Count <= 0)
        {
            m_SensorsMap.Add(DefacementSensorType.BackgroundSensor, new BackgroundSensor());
            m_SensorsMap.Add(DefacementSensorType.TaglinesSensor, new TaglinesSensor(cm.Database));
            m_SensorsMap.Add(DefacementSensorType.SingleImageSensor, new SingleImageSensor());
        }
    }
    catch (Exception)
    {
        Console.WriteLine("There was a problem initializing defacement sensors");
    }
}
The second "m_SensorsMap.Add" is marked with green arrow, as I understand it, it means it's still waiting to the first line to finish.
By the way, the m_SensorsMap.Count value is 3.
How can I find the problem?
Is it a loop?
Or maybe a deadlock (not make sense because it shouldn't be 100% cpu, right?)
It's pointless to upload a code because this is a huge project.
I need more general help like how to debug?
Is it could something else than a loop?
Because it's a bug that returns every while and than I'm not closing the app until I found the problem :)
Thanks in advance!!
Edit:
The constructors:
public TaglinesSensor(IDatabase db)
{
    m_DB = db;
}
I couldn't find the problem, so I changed the design in order not to call those constructors anymore.
Thanks to the guys who tried to help.
Shaul
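One pattern consistent with the symptoms above (several threads stuck inside Dictionary.Add at 100% CPU, and a count of 3 from a block that should only run once) is unsynchronized concurrent initialization: Dictionary<TKey, TValue> is not thread-safe, and racing writers can corrupt its internal buckets so that later operations loop forever. A hedged sketch of one way to serialize the initialization (illustrative only, not the poster's actual fix; s_mapLock is a hypothetical lock object):
private static readonly object s_mapLock = new object(); // hypothetical lock object

public static void GenerateDefacementSensors(ICrawlerManager cm)
{
    lock (s_mapLock) // only one thread may build the map
    {
        if (m_SensorsMap != null)
            return; // already initialized

        var map = new Dictionary<DefacementSensorType, DefacementSensor>
        {
            { DefacementSensorType.BackgroundSensor, new BackgroundSensor() },
            { DefacementSensorType.TaglinesSensor, new TaglinesSensor(cm.Database) },
            { DefacementSensorType.SingleImageSensor, new SingleImageSensor() },
        };
        m_SensorsMap = map; // publish the fully built map
    }
}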

foreach mysteriously hangs on first item of a ResourceSet

Occasionally our site slows down and the RAM usage goes up massively. Then the app pool stops and I have to restart it. It's then OK for a few days before the RAM suddenly spikes again and the app pool soon stops. The CPU isn't high.
Before the app pool stops, I've noticed that one of our pages always hangs. The line it hangs on is a foreach over a ResourceSet:
var englishLocations = Lang.Countries.ResourceManager.GetResourceSet(new CultureInfo("en-GB"), true, true);
foreach (DictionaryEntry entry2 in englishLocations) // THIS LINE HANGS
We have the same code deployed on a different box where this doesn't happen. The main differences between the two boxes are:
Bad box
Windows Server 2008 R2 Standard SP1
IIS 7.5.7600.16385
.NET 4.5
24GB RAM
Good box
Windows Server 2008 SP2
IIS 7.0.6000.16386 SP2
.NET 4.0
24GB RAM
I've tried adding uploadReadAheadSize="0" to the web.config as described here:
http://rionscode.wordpress.com/2013/03/11/resolving-controller-blocking-within-net-4-5-and-asp-net-mvc/
That didn't work.
Why would foreach hang? It's hanging on the very first item; actually on the foreach itself.
Thanks.
I know this is an old post, but nevertheless... There is the potential for a deadlock when iterating over a ResourceSet and at the same time retrieving some other object from the same resources.
The problem is that when using a ResourceSet, the iterator takes a lock on the internal resource cache of the ResourceReader (http://referencesource.microsoft.com/#mscorlib/system/resources/resourcereader.cs,1389) and then, in the method AllocateStringForNameIndex, takes a lock on the reader itself: http://referencesource.microsoft.com/#mscorlib/system/resources/resourcereader.cs,447
lock (_reader._resCache) {
    key = _reader.AllocateStringForNameIndex(_currentName, out _dataPosition); // locks the reader
}
Getting an object takes the same locks in the opposite order:
http://referencesource.microsoft.com/#mscorlib/system/resources/runtimeresourceset.cs,300
and http://referencesource.microsoft.com/#mscorlib/system/resources/runtimeresourceset.cs,335
lock (Reader) {
    ....
    lock (_resCache) {
        _resCache[key] = resLocation;
    }
}
This can lead to a deadlock. We had this exact issue recently.
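To make the failure mode concrete, here is a self-contained sketch (a generic illustration, not the ResourceReader code) of two threads taking the same pair of locks in opposite orders; run it a few times and it will hang:
using System;
using System.Threading;

class LockOrderDeadlock
{
    static readonly object cacheLock = new object();  // stands in for _resCache
    static readonly object readerLock = new object(); // stands in for the reader

    static void Main()
    {
        var t1 = new Thread(() =>
        {
            lock (cacheLock)          // enumerator: cache first...
            {
                Thread.Sleep(10);
                lock (readerLock) { } // ...then reader
            }
        });
        var t2 = new Thread(() =>
        {
            lock (readerLock)         // GetObject: reader first...
            {
                Thread.Sleep(10);
                lock (cacheLock) { }  // ...then cache
            }
        });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join(); // deadlocks once each thread holds one lock and waits for the other
        Console.WriteLine("No deadlock this run");
    }
}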
I experienced a very similar problem.
Every once in a while IIS would hang, and I would see a number of requests just sitting there. They were all in state ExecuteRequestHandler with module name ManagedPipelineHandler.
After investigating with Process Explorer, I could see that all of them were sitting at mscorlib.dll!ResourceEnumerator.get_Entry; the rest of the stack trace suggested some NGen action and then ntdll.dll!WaitForMultipleObjects.
My working hypothesis is that when multiple threads start enumerating those resources, we can run into a deadlock (possibly around some native image generation), and all subsequent threads then just keep piling up.
To resolve it, I created a critical section around this code block to ensure it executes sequentially, and I haven't experienced the issue since.
private static readonly object ResourceLock = new object();

public static MvcHtmlString SerializeGlobalResources(this HtmlHelper helper)
{
    lock (ResourceLock)
    {
        // Existing code goes here ....
    }
}
Building on another answer, to give you some ideas: how about using a try/catch model? Perhaps it hangs because the resource isn't available, is locked, lacks permissions, etc.
var englishLocations = Lang.Countries.ResourceManager.GetResourceSet(new CultureInfo("en-GB"), true, true);
foreach (DictionaryEntry entry2 in englishLocations) // THIS LINE HANGS

ResourceManager CultureResourceManager = new ResourceManager("My.Language.Assembly", System.Reflection.Assembly.GetExecutingAssembly());
ResourceSet resourceSet = CultureResourceManager.GetResourceSet(new CultureInfo("sv-SE"), true, true);
try
{
    resourceSet.GetString("my_language_resource");
}
catch (Exception ex)
{
    // log your error ex to wherever you like
}

Why do I have a lock here?

See the following concurrency profiler trace of the work done by a parallel foreach (the screenshot is not reproduced here):
Inside the loop each thread reads data from the DB and processes it. There are no locks shared between threads, as each one processes different data.
There appear to be periodic locks across all the threads of the foreach for unknown reasons (the black vertical rectangles in the trace). Looking at the selected locked segment (the dark red one), the stack shows the thread blocked in the StockModel.Quotation constructor. The code there just constructs two empty lists!
I've read somewhere that this could be caused by the GC, so I switched garbage collection to server mode with:
<runtime>
    <gcServer enabled="true"/>
</runtime>
I got a small improvement (about 10%-15% faster), but I still have the vertical locks everywhere.
I've also added WITH (NOLOCK) to all the DB queries, since I'm only reading data, without any difference.
Any hint on what's happening here?
The computer where the analysis has been done has 8 cores.
EDIT: After enabling the Microsoft symbol servers, it turns out that all threads are blocked in calls like wait_for_gc_done or WaitUntilGCComplete. I thought that by enabling GCServer I would have one GC for each thread, so I would avoid the "vertical" locks, but it seems that's not the case. Am I wrong?
Second question: as the machine is not under memory pressure (5 of 8 GB used), is there a way to delay the GC execution or to pause it until the parallel foreach ends (or to configure it to run less often)?
If your StockModel.Quotation class allows for it, you could create a pool to limit the number of new objects created. This is a technique sometimes used in games to prevent the garbage collector from stalling in the middle of a render.
Here's a basic pool implementation:
class StockQuotationPool
{
    private List<StockQuotation> poolItems;
    private volatile int itemsInPool;

    public StockQuotationPool(int poolSize)
    {
        this.poolItems = new List<StockQuotation>(poolSize);
        this.itemsInPool = poolSize;
    }

    public StockQuotation Create(string name, decimal value)
    {
        if (this.itemsInPool == 0)
        {
            // Block until a pooled item is ready - maybe use a semaphore.
            throw new NotImplementedException();
        }

        // Items are available in the pool, but none have been created yet.
        if (this.poolItems.Count == 0)
        {
            this.itemsInPool--;
            return new StockQuotation(name, value);
        }

        // Otherwise, return one from the pool.
        this.itemsInPool--;
        var item = this.poolItems[0];
        this.poolItems.Remove(item);
        item.Name = name;
        item.Value = value;
        return item;
    }

    public void Release(StockQuotation quote)
    {
        if (!this.poolItems.Contains(quote))
        {
            this.poolItems.Add(quote);
            this.itemsInPool++;
        }
    }
}
That's assuming that the StockQuotation looks something like this:
class StockQuotation
{
    internal StockQuotation(string name, decimal value)
    {
        this.Name = name;
        this.Value = value;
    }

    public string Name { get; set; }
    public decimal Value { get; set; }
}
Then instead of calling the new StockQuotation() constructor, you ask the pool for an instance. The pool returns an existing instance (you can pre-create them if you want) and sets all the properties, so it looks like a new instance. You may need to experiment until you find a pool size large enough to accommodate all the threads at the same time.
Here's how you'd call it from the thread.
// Get the pool, maybe from a singleton.
var pool = new StockQuotationPool(100);

var quote = pool.Create("test", 1.00m);
try
{
    // Work with quote
}
finally
{
    pool.Release(quote);
}
Lastly, this class isn't thread-safe at the moment. Let me know if you need any help making it so.
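A minimal sketch of one way to make it thread-safe, assuming the blocking case stays unimplemented: take a single private lock in both methods so the list and the counter always change together. With every access serialized by the lock, the volatile modifier on itemsInPool is no longer needed.
private readonly object sync = new object();

public StockQuotation Create(string name, decimal value)
{
    lock (this.sync) // one lock guards both the list and the counter
    {
        if (this.itemsInPool == 0)
            throw new NotImplementedException(); // block until an item is released

        this.itemsInPool--;
        if (this.poolItems.Count == 0)
            return new StockQuotation(name, value);

        var item = this.poolItems[0];
        this.poolItems.RemoveAt(0);
        item.Name = name;
        item.Value = value;
        return item;
    }
}

public void Release(StockQuotation quote)
{
    lock (this.sync)
    {
        if (!this.poolItems.Contains(quote))
        {
            this.poolItems.Add(quote);
            this.itemsInPool++;
        }
    }
}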
You could try using GCLatencyMode.LowLatency; see this related question: Prevent .NET Garbage collection for short period of time
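A minimal sketch of how GCSettings.LatencyMode is typically used around a critical region (restore the previous mode in a finally; note that LowLatency applies to workstation GC and is not available under server GC, i.e. with gcServer enabled):
using System.Runtime;

GCLatencyMode oldMode = GCSettings.LatencyMode;
try
{
    GCSettings.LatencyMode = GCLatencyMode.LowLatency; // discourage blocking GCs during the loop
    // ... run the parallel foreach here ...
}
finally
{
    GCSettings.LatencyMode = oldMode; // always restore the previous mode
}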
I recently attempted this with no luck: garbage collection was still being triggered when caching icon-sized bitmap images on a form I was displaying. What worked for me was using ANTS Performance Profiler and Reflector to find the exact calls that were causing the GC.Collect and working around them.

Force loop containing asynchronous task to maintain sequence

Something tells me this might be a stupid question and that I have in fact approached my problem from the wrong direction, but here goes.
I have some code that loops through all the documents in a folder. The alphabetical order of these documents within each folder is important, and that order is also reflected in the order the documents are printed. Here is a simplified version:
var wordApp = new Microsoft.Office.Interop.Word.Application();
foreach (var file in Directory.EnumerateFiles(folder))
{
    fileCounter++;
    // Print the file, referencing the previously instantiated Word application object
    wordApp.Documents.Open(...);
    wordApp.PrintOut(...);
    wordApp.ActiveDocument.Close(...);
}
It seems (and I could be wrong) that the PrintOut call is asynchronous, and the application sometimes prints the documents out of order. This is supported by the fact that if I step through, or insert a long enough Sleep() call, the order of the files is correct.
How should I prevent the next print task from starting before the previous one has finished?
I initially thought I could use a lock(someObject){}, until I remembered that locks are only useful for preventing multiple threads from accessing the same code block; this is all on the same thread.
There are some events I can wire into on the Microsoft.Office.Interop.Word.Application object: DocumentOpen, DocumentBeforeClose and DocumentBeforePrint.
I've also wondered whether this might actually be a problem with the print queue being unable to reliably order lots of documents added within the same second. That can't be the problem, can it?
As a side note, this loop is within the code called from the DoWork event of a BackgroundWorker object. I'm using this to prevent UI blocking and to feed back the progress of the process.
Your event-handling approach seems like a good one. Instead of using a loop, you could add a handler to the DocumentBeforeClose event, in which you would get the next file to print, send it to Word, and continue. Something like this:
List<string> m_files = Directory.EnumerateFiles(folder).ToList();
wordApp.DocumentBeforeClose += ProcessNextDocument;
...

void ProcessNextDocument(...)
{
    string file = null;
    lock (m_files)
    {
        if (m_files.Count > 0)
        {
            file = m_files[m_files.Count - 1];
            m_files.RemoveAt(m_files.Count - 1);
        }
        else
        {
            // Done!
        }
    }
    if (file != null)
    {
        PrintDocument(file);
    }
}

void PrintDocument(string file)
{
    wordApp.Documents.Open(...);
    wordApp.PrintOut(...);
    wordApp.ActiveDocument.Close(...);
}
The first parameter of Application.PrintOut specifies whether printing should take place in the background. Setting it to false makes the call synchronous.
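For illustration, a minimal sketch of the loop from the question made synchronous this way (using C# 4 named arguments so the remaining optional COM parameters keep their defaults):
foreach (var file in Directory.EnumerateFiles(folder))
{
    wordApp.Documents.Open(file);
    wordApp.PrintOut(Background: false); // returns only once the print job has been spooled
    wordApp.ActiveDocument.Close(SaveChanges: false);
}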

Explain Strange Synchronization in WCF Source Code

While looking at the source code of System.ServiceModel.Channels.BufferManager, I noticed this method:
void TuneQuotas()
{
    if (areQuotasBeingTuned)
        return;

    bool lockHeld = false;
    try
    {
        try { }
        finally
        {
            lockHeld = Monitor.TryEnter(tuningLock);
        }

        // Don't bother if another thread already has the lock
        if (!lockHeld || areQuotasBeingTuned)
            return;

        areQuotasBeingTuned = true;
    }
    finally
    {
        if (lockHeld)
        {
            Monitor.Exit(tuningLock);
        }
    }

    //
    // DO WORK... (code removed for brevity)
    //

    areQuotasBeingTuned = false;
}
Obviously, they want only one thread to run TuneQuotas() at a time, and threads that don't get the lock to skip the work rather than wait. I should note that the removed code was not try-protected.
I'm trying to understand the advantages of this method above over just doing this:
void TuneQuotas()
{
    if (!Monitor.TryEnter(tuningLock)) return;
    //
    // DO WORK...
    //
    Monitor.Exit(tuningLock);
}
Any ideas why they might have bothered with all that? I suspect the finally blocks are there to guard against a thread-abort scenario, but I still don't see the point, because even with all this code TuneQuotas() would be locked out for good if that one thread doesn't make it all the way to the end and set areQuotasBeingTuned = false, for one reason or another. So is there something cool about this pattern that I'm missing?
EDIT:
As a side note, the method seems to exist in .NET 4.0, which I confirmed using this code running on framework 4 (although I cannot confirm that the content of the method hasn't changed from what I found on the web):
var buffMgr = BufferManager.CreateBufferManager(1, 1);
var pooledBuffMgrType = buffMgr.GetType()
    .GetProperty("InternalBufferManager")
    .GetValue(buffMgr, null)
    .GetType();

Debug.WriteLine(pooledBuffMgrType.Module.FullyQualifiedName);
foreach (var methodInfo in pooledBuffMgrType
    .GetMethods(BindingFlags.Instance | BindingFlags.NonPublic))
{
    Debug.WriteLine(methodInfo.Name);
}
which outputs:
C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Runtime.DurableInstancing\v4.0_4.0.0.0__31bf3856ad364e35\System.Runtime.DurableInstancing.dll
ChangeQuota
DecreaseQuota
FindMostExcessivePool
FindMostStarvedPool
FindPool
IncreaseQuota
TuneQuotas
Finalize
MemberwiseClone
I'll add some comments:
void TuneQuotas()
{
    if (areQuotasBeingTuned)
        return; // fast path, does not require locking

    bool lockHeld = false;
    try
    {
        try { }
        finally
        {
            // finally blocks cannot be aborted by Thread.Abort;
            // otherwise the thread could be aborted after getting
            // the lock but before setting lockHeld
            lockHeld = Monitor.TryEnter(tuningLock);
        }

        // Don't bother if another thread already has the lock
        if (!lockHeld || areQuotasBeingTuned)
            return; // areQuotasBeingTuned could have switched to true in the meantime

        areQuotasBeingTuned = true; // prevent others from needlessly trying to lock (trigger the fast path)
    }
    finally // ensure the lock is released
    {
        if (lockHeld)
        {
            Monitor.Exit(tuningLock);
        }
    }

    //
    // DO WORK... (code removed for brevity)
    //

    // This might be a bug: there should be a call to Thread.MemoryBarrier,
    // or areQuotasBeingTuned should be volatile; if not, the write might
    // never reach other processor cores. Maybe this doesn't matter on x86.
    areQuotasBeingTuned = false;
}
The simple version you gave does not protect against some problems. At the very least it is not exception-safe (the lock won't be released). Interestingly, the "sophisticated" version doesn't either.
This method has been removed from .NET 4.
Until .NET 4.0, there was essentially a bug in the code generated by a lock statement. It would generate something similar to the following:
Monitor.Enter(lockObject);
// see next paragraph
try
{
    // code that was in the lock block
}
finally
{
    Monitor.Exit(lockObject);
}
This means that if an exception occurred between the Enter and the try (the point marked by the comment), the Exit would never be called. As usr alluded to, this could happen due to Thread.Abort.
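In .NET 4.0 the compiler closed that window by switching to the Monitor.Enter(object, ref bool) overload, which sets the flag inside the call. The generated pattern is roughly this (a sketch of the newer codegen, not decompiled output):
bool lockTaken = false;
try
{
    Monitor.Enter(lockObject, ref lockTaken); // lockTaken is set atomically with acquiring the lock
    // code that was in the lock block
}
finally
{
    if (lockTaken)
    {
        Monitor.Exit(lockObject);
    }
}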
Your example:
if(!Monitor.TryEnter(tuningLock)) return;
//
// DO WORK...
//
Monitor.Exit(tuningLock);
Suffers from this problem and more: the window in which this code can be interrupted, leaving Exit uncalled, is basically the whole block of code, and it can be triggered by any exception (not just one from Thread.Abort).
I have no idea why most of this code was written the way it was. But I surmise that it was written to avoid the problem of an exception between Enter and try. Let's look at some of the details:
try { }
finally
{
    lockHeld = Monitor.TryEnter(tuningLock);
}
Finally blocks basically generate a constrained execution region in IL. Constrained execution regions cannot be interrupted by anything. So, putting the TryEnter in the finally block above ensures that lockHeld reliably holds the state of the lock.
That block of code is contained in a try/finally block whose finally statement calls Monitor.Exit if lockHeld is true. This means that there is no point between the Enter and the try block at which the thread can be interrupted without the lock later being released.
FWIW, this method was still there in .NET 3.5 and is visible in the WCF 3.5 source code (not the .NET Framework source). I don't know yet what's in 4.0, but I would imagine it's the same; there's no reason to change working code even if the impetus for part of its structure no longer exists.
For more details on what lock used to generate see http://blogs.msdn.com/b/ericlippert/archive/2007/08/17/subtleties-of-c-il-codegen.aspx
Any ideas why they might have bothered with all that?
After running some tests, I think I see one reason (if not THE reason): they probably bothered with all that because it is MUCH faster!
It turns out Monitor.TryEnter is an expensive call if the object is already locked (if it's not locked, TryEnter is still very fast; no problems there). So all threads except the first are going to experience the slowness.
I didn't think this would matter that much, since each thread tries to take the lock just once and then moves on (it's not like they'd sit there retrying in a loop). However, I wrote some code for comparison, and it showed that the cost of TryEnter (when the lock is already held) is significant. In fact, on my system each call took about 0.3 ms without the debugger attached, which is several orders of magnitude slower than a simple boolean check.
So I suspect the cost of the contended TryEnter showed up in Microsoft's test results, and they optimized the code as above by adding the fast-path boolean check. But that's just my guess.
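A rough sketch of the kind of comparison described (a hypothetical harness; absolute numbers will vary with machine, contention, and runtime version):
using System;
using System.Diagnostics;
using System.Threading;

class TryEnterCost
{
    static readonly object gate = new object();
    static volatile bool busy = true;

    static void Main()
    {
        // Hold the lock on another thread so every TryEnter below is contended.
        var held = new ManualResetEventSlim();
        new Thread(() => { lock (gate) { held.Set(); Thread.Sleep(Timeout.Infinite); } })
            { IsBackground = true }.Start();
        held.Wait();

        const int N = 10000;
        int skipped = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
            if (busy) skipped++;                            // the fast-path boolean check
        Console.WriteLine("bool check: {0:F6} ms/call", sw.Elapsed.TotalMilliseconds / N);

        sw.Restart();
        for (int i = 0; i < N; i++)
            if (Monitor.TryEnter(gate)) Monitor.Exit(gate); // always fails: lock held elsewhere
        Console.WriteLine("TryEnter:   {0:F6} ms/call (skipped={1})", sw.Elapsed.TotalMilliseconds / N, skipped);
    }
}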
