Understanding the purpose of CERs in this example - C#

I'm reading through Constrained Execution Regions and other errata [Brian Grunkemeyer] in an attempt to understand constrained execution regions; however, I'm having some problems understanding the following sample:
RuntimeHelpers.PrepareConstrainedRegions();
try {
    // Prepare my backout code
    MethodInfo m = _list.GetType().GetMethod("RemoveAt", new Type[] { typeof(int) });
    RuntimeHelpers.PrepareMethod(m.MethodHandle);
    IEnumerator en = c.GetEnumerator();
    while (en.MoveNext()) {
        _list.Insert(index++, en.Current);
        // Assuming that these lines aren't reordered.
        numAdded++;
    }
    _version++;
}
catch (Exception) {
    // Reliable backout code
    while (numAdded > 0) {
        _list.RemoveAt(index--);
        numAdded--;
    }
    throw;
}
My understanding is that the try block is not constrained, only the finally and catch blocks are constrained. This means that during the try block an asynchronous exception (e.g. ThreadAbortException) can be thrown at any time, in particular it could be thrown before numAdded++ but after _list.Insert. In this case the backout code would remove one item too few from _list.
Given this I'm struggling to understand the purpose of the constrained execution region in this example.
Is my understanding of this correct or have I missed something?

Based on what I observe, the documentation and the actual behavior of CERs do not match exactly. The issue you describe, where a ThreadAbortException gets injected between Insert and numAdded++, is not possible with any of the .NET Framework versions I have tested. There are two possible reasons for this.
1. PrepareConstrainedRegions does, despite what the documentation says, have an observable effect on the try block. It will delay certain abort injections; specifically, those that do not come while the thread is in an alertable state.
2. Even in the absence of the PrepareConstrainedRegions call, the abort still will not get injected into that location. Based on the SSCLI code, the abort will be injected at the backward jump that spins the while loop.
I figured some of this out while answering my own related question here and then attempting to answer a question about how Thread.Abort actually works here.
Point #2 is not legit. It is an implementation detail of the SSCLI that may not carry over to the official distributions (though I suspect it actually does). Furthermore, it ignores the possibility of the abort being injected at some point during the execution of Insert. I suppose it is possible that the crucial bits of Insert use a CER internally, though.
Point #1 may be the one that matters, but that raises the questions of why Microsoft did not document it, and why the article you cited does not mention it either. Surely the author of the article knew of this fact. Otherwise, I too do not understand how the code presented could possibly be safe. In other words, it seems safe only by accident right now.
If I had to take a guess as to what PrepareConstrainedRegions is doing behind the scenes I would say that it sets a flag in the JIT engine that tells it not to inject the GC poll hook that gets placed strategically at backward branch jumps for code inside a CER try block. This GC poll hook is where the asynchronous abort would typically be injected (in addition to its main purpose related to garbage collection).
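The hypothesis about delayed abort injection can be probed with a small program along these lines. This is only a sketch, and it assumes .NET Framework: Thread.Abort throws PlatformNotSupportedException on .NET Core/5+, and CER support is gone there entirely.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

class CerProbe
{
    static void Main()
    {
        var worker = new Thread(() =>
        {
            int step = 0;
            try
            {
                RuntimeHelpers.PrepareConstrainedRegions();
                try
                {
                    // If the delayed-injection hypothesis is right, an abort
                    // requested while this loop spins is not injected at the
                    // backward branch inside the CER try block.
                    for (int i = 0; i < 100_000_000; i++) step = i;
                }
                finally
                {
                    // Backout code: runs to completion inside the CER.
                    Console.WriteLine("finally ran, step = " + step);
                }
            }
            catch (ThreadAbortException)
            {
                // Note: the runtime re-raises the abort at the end of this
                // catch block unless Thread.ResetAbort() is called.
                Console.WriteLine("abort observed outside the CER");
            }
        });
        worker.Start();
        Thread.Sleep(10);
        worker.Abort();   // .NET Framework only
        worker.Join();
    }
}
```

Timing of the abort request relative to the loop is nondeterministic, so multiple runs are needed to draw any conclusion from the output.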


Why does my code not speed up with a multithreaded Parallel.For loop?

I tried to transform a simple sequential loop into a parallel loop with the System.Threading.Tasks library.
The code compiles and returns correct results, but it does not save any computational cost; on the contrary, it takes longer.
EDIT: Sorry guys, I have probably oversimplified the question and made some errors doing that.
To append additional information, I am running the code on an i7-4700QM, and it is referenced in a Grasshopper script.
Here is the actual code. I also switched to non-thread-local variables:
public static class LineNet
{
    public static List<Ray> SolveCpu(List<Speaker> sources, List<Receiver> targets, List<Panel> surfaces)
    {
        ConcurrentBag<Ray> rays = new ConcurrentBag<Ray>();
        for (int i = 0; i < sources.Count; i++)
        {
            Parallel.For(
                0,
                targets.Count,
                j =>
                {
                    Line path = new Line(sources[i].Position, targets[j].Position);
                    Ray ray = new Ray(path, i, j);
                    if (Utils.CheckObstacles(ray, surfaces))
                    {
                        rays.Add(ray);
                    }
                }
            );
        }
    }
}
The Grasshopper implementation just collects sources, targets, and surfaces, calls the method Solve, and returns rays.
I understand that dispatching workload to threads is expensive, but is it so expensive?
Or is the ConcurrentBag just preventing parallel calculation?
Plus, my classes are immutable (?), but if I use a common List the kernel aborts the operation and throws an exception; can someone tell me why?
Without a good Minimal, Complete, and Verifiable code example that reliably reproduces the problem, it is not possible to provide a definitive answer. The code you posted does not even appear to be an excerpt of real code, because the type declared as the return type of the method isn't the same as the value actually returned by the return statement.
However, certainly the code you posted does not seem like a good use of Parallel.For(). Your Line constructor would have to be fairly expensive to justify parallelizing the task of creating the items. And to be clear, that's the only possible win here.
At the end, you still need to aggregate all of the Line instances that you created into a single list, so all those intermediate lists created for the Parallel.For() tasks are just pure overhead. And the aggregation is necessarily serialized (i.e. only one thread at a time can be adding an item to the result collection), and in the worst way (each thread only gets to add a single item before it gives up the lock and another thread has a chance to take it).
Frankly, you'd be better off storing each local List<T> in a collection, and then aggregating them all at once in the main thread after Parallel.For() returns. Not that that would be likely to make the code perform better than a straight-up non-parallelized implementation. But at least it would be less likely to be worse. :)
The bottom line is that you don't seem to have a workload that could benefit from parallelization. If you think otherwise, you'll need to explain the basis for that thought in a clearer, more detailed way.
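For reference, the thread-local aggregation idiom alluded to above can be sketched with the Parallel.For overload that takes localInit and localFinally delegates. This is a generic sketch with a trivial stand-in predicate, not the question's Ray/Line types, so the per-item work here is far too cheap to actually benefit from parallelism; it only illustrates the shape of the pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class LocalAggregation
{
    static void Main()
    {
        var results = new List<int>();
        object gate = new object();

        Parallel.For(
            0, 1000,
            // localInit: one fresh list per worker task; no sharing,
            // so the loop body needs no locking at all.
            () => new List<int>(),
            // body: do the per-item work against the task-local list.
            (i, state, local) =>
            {
                if (i % 3 == 0)          // stand-in for the real filter
                    local.Add(i);
                return local;
            },
            // localFinally: runs once per task; only this short merge
            // is serialized by the lock.
            local =>
            {
                lock (gate) results.AddRange(local);
            });

        Console.WriteLine(results.Count);   // 334 multiples of 3 in [0, 1000)
    }
}
```

This keeps the contention to one lock acquisition per worker task instead of one per item, which is the best a ConcurrentBag-style result collection can hope to approximate.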
if I use a common List the kernel aborts the operation and throws an exception; can someone tell me why?
You're already using (it appears) List<T> as the local data for each task, and indeed that should be fine, as tasks don't share their local data.
But if you are asking why you get an exception if you try to use List<T> instead of ConcurrentBag<T> for the result variable, well that's entirely to be expected. The List<T> class is not thread safe, but Parallel.For() will allow each task it runs to execute the localFinally delegate concurrently with all the others. So you have multiple threads all trying to modify the same not-thread-safe collection concurrently. This is a recipe for disaster. You're fortunate you get the exception; the actual behavior is undefined, and it's just as likely you'll simply corrupt the data structure as cause a run-time exception.

Is it okay to perform async operations without anybody "knowing" about it?

I'm using a COM library with RCWs. I've always found it a best practice to manually release any unmanaged resources and garbage collect before they go out of scope. I don't want this to "slow down" my application, and I don't care when this actually finishes, if at all, before the application exits. So I have this method:
// best effort:
internal static void CleanupComObjects(params object[] toRelease)
{
    Task.Run(() =>
    {
        var t = Task.WhenAll(toRelease.Select((x) =>
            Task.Run(() =>
                System.Runtime.InteropServices.Marshal.ReleaseComObject(x))));
        t.Wait();
        System.GC.Collect();
    });
}
Is it OK to do this without any clients or users of my API/application knowing or caring that async code is running, as long as it doesn't produce any side effects other than what is expected by the caller, or is there something I haven't thought about which could result in unexpected problems? (<-- To "primarily opinion based" close voters, note this last line.)
I don't care when this actually finishes, if at all, before the application exits
That statement seems in conflict with your other one:
I've always found it's a best practice to manually release any unmanaged resources and garbage collect before they go out of scope
If you really didn't care, it seems to me you could just let .NET manage the COM objects' lifetimes, as the RCW is intended to accomplish.
That said, I don't see anything wrong with your approach per se. The implementation seems heavy-handed to me, but this sort of "fire-and-forget" approach occurs in many other scenarios. You could even argue that .NET's basic garbage collection algorithm is an example (after all, you don't typically know or have control over when or how it happens…it "just does").
Personally, I would write it in a more streamlined fashion:
internal static async Task CleanupComObjects(params object[] toRelease)
{
    await Task.WhenAll(toRelease.Select((x) =>
        Task.Run(() =>
            System.Runtime.InteropServices.Marshal.ReleaseComObject(x))));
    System.GC.Collect();
}
Or even:
internal static Task CleanupComObjects(params object[] toRelease)
{
    return Task.Run(() =>
    {
        foreach (object o in toRelease)
        {
            System.Runtime.InteropServices.Marshal.ReleaseComObject(o);
        }
        System.GC.Collect();
    });
}
(Returning Task from the method gives the caller the opportunity to observe the completion of the operation. It's not required to, but you may find going forward there's a good reason to, e.g. so that you can detect exceptions that might occur.)
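For instance, a caller that does want to observe failures could look roughly like this (a hypothetical call site; the ShutdownAsync name and the logging are my own illustration, assuming the async Task version above is in scope):

```csharp
// Hypothetical call site. A caller can await the Task to observe
// completion and exceptions, or simply drop the returned Task to
// keep the original fire-and-forget behavior.
async Task ShutdownAsync(object[] rcws)
{
    try
    {
        await CleanupComObjects(rcws);
    }
    catch (Exception ex)
    {
        // e.g. Marshal.ReleaseComObject throws ArgumentException
        // if an argument is not actually a COM RCW.
        Console.Error.WriteLine("COM cleanup failed: " + ex.Message);
    }
}
```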
In other words, it's not clear to me why you want the calls to ReleaseComObject() to occur concurrently. Depending on where these objects come from and the apartment model, you could just be asking for trouble trying to do it that way.
Speaking of which, the other detail missing from your question is what apartment model you're dealing with here. If these are STA objects, your operations are going to get marshalled by .NET back to the owning thread, meaning any attempt to release them concurrently with any other code running on that thread (whether that's other code unrelated to the objects or the other calls to ReleaseComObject()) will be pointless. You might initiate the release concurrency, but the release operations and your other code in that thread will all get serialized anyway.
All that said…
In general, it is my preference to write code that solves a problem. Is there a specific problem that has occurred here that you are trying to address? The code is already a bit irregular in that you aren't relying on the normal GC management of .NET to deal with your objects. Then you add to that irregularity by shifting your explicit management of them to other threads, something that may or may not even be successful depending on the type of COM objects you're dealing with.
I would hope that you're doing all that work for a good reason. But what is that reason? What specific problem is it that you're trying to solve? Have you been able to confirm that this type of code does in fact solve that problem?
It seems to me that those are important questions, but there's not enough information in the post above to address them. So I encourage you to investigate that on your own. You might want to consider a separate Stack Overflow question in which you provide a good Minimal, Complete, and Verifiable code example that reliably reproduces the problem you're trying to solve, to solicit advice for alternative means to solving that problem.

.net 2.0 lock and exceptions before the try-finally. Are there any other exceptions besides thread abort?

Today I've run into this:
https://blogs.msdn.microsoft.com/ericlippert/2009/03/06/locks-and-exceptions-do-not-mix/
I am using .NET 2.0, so, basically, this code
lock (syncRootVar) {
    DoStuff();
}
will unfold into this:
Monitor.Enter(syncRootVar);
try {
    DoStuff();
} finally {
    Monitor.Exit(syncRootVar);
}
As Lippert wrote on the blog, there might be a nop operation between the Enter call and the try-finally block, which is a potential spot for a thread-abort exception to be raised, thereby interfering with the lock.
I have two questions about this:
Is there a common way of handling this troublesome situation and still clean up the lock object in order to not affect other threads?
Are there other situations that might result in the lock being acquired, but exceptions raising before the try-finally block?
As the article points out, the issue you seem to be concerned with is no longer an issue. The C# compiler has been changed (and presumably Roslyn will retain the change) so that the lock is taken inside the try/finally. It's no longer possible to take the lock but fail to execute the finally clause.
Now (also as the article points out) you have a different problem: if the code in the protected block of code is mutating state, an exception could result in other code seeing partially-mutated state. This may or may not be a problem; usually it would be, but of course each specific scenario is different. It's possible some code would be safe in such a case.
• Is there a common way of handling this troublesome situation and still clean up the lock object in order to not affect other threads?
For the specific situation you've asked about, the two biggest things you can do are:
Don't abort threads. This is always good advice and should always be followed. If you don't abort a thread, you won't have that problem.
Use the latest version of the compiler. The newer versions of the compiler don't generate code that would be susceptible to the problem.
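The usual alternative to aborting a thread is cooperative cancellation, where the worker polls a token at points it knows are safe. A minimal sketch:

```csharp
using System;
using System.Threading;

class CooperativeCancel
{
    static void Main()
    {
        var cts = new CancellationTokenSource();

        var worker = new Thread(() =>
        {
            // The worker checks the token at safe points instead of being
            // torn down at an arbitrary instruction by Thread.Abort, so
            // shared state is never left partially mutated.
            while (!cts.Token.IsCancellationRequested)
            {
                // ... do one unit of work, leaving state consistent ...
                Thread.Sleep(10);
            }
            Console.WriteLine("worker exited cleanly");
        });

        worker.Start();
        Thread.Sleep(50);
        cts.Cancel();     // request cancellation; don't force it
        worker.Join();
    }
}
```

Because the worker only ever stops between units of work, the lock-plus-partially-mutated-state problem discussed above never arises.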
• Are there other situations that might result in the lock being acquired, but exceptions raising before the try-finally block?
No, not with the latest version of the compiler. Not even the original situation exists anymore.
Now what about that pesky "partially-mutated" issue? Well, you'll have to address each case individually. But if an exception might be thrown, and leaving the lock with partially-mutated state is possible, then you'll have to add your own clean-up code. E.g.:
lock (syncRootVar) {
    try {
        DoStuff();
    } catch {
        UndoStuff();
        throw;
    }
}

Why do we need the direct call in a thread-safe call block?

Referring to the thread-safe call tutorial on MSDN, have a look at the following statements:
// InvokeRequired required compares the thread ID of the
// calling thread to the thread ID of the creating thread.
// If these threads are different, it returns true.
if (this.textBox1.InvokeRequired) {
    SetTextCallback d = new SetTextCallback(SetText);
    this.Invoke(d, new object[] { text });
} else {
    this.textBox1.Text = text;
}
Of course, I've used it many times in my code, and I understand a little about why to use it.
But I still have some unclear questions about those statements, so I'd appreciate help in answering them.
The questions are:
Will the code run correctly with the statements in the if body only? I tried it, and it seems to cause a problem only if the control is not completely initialized. Are there any other problems?
What is the advantage of calling the method directly (the else body) instead of via the invoker? Does it save resources (CPU, RAM) or something?
Thanks!
You can of course always call using the Invoker, but:
It usually makes the code more verbose and difficult to read.
It is less efficient as there are several extra layers to contend with (setting up delegates, calling the dispatcher and so on).
If you are sure you'll always be on the GUI thread, you can just ignore the above checks and call directly.
If you always run just the first part of the if statement, it will always be fine, as Invoke already checks if you're on the UI thread.
The reason you don't want to do this is that Invoke has to do a lot of work to run your method, even if you're already on the right thread. Here's what it has to do (extracted from the source of Control.cs):
Find the marshaling control via an upward traversal of the parent control chain
Check if the control is an ActiveX control and, if so, demand unmanaged code permissions
Work out if the call needs to be invoked asynchronously to avoid potential deadlock
Take a copy of the calling thread's execution context so the same security permissions will be used when the delegate is finally called
Enqueue the method call, then post a message to invoke the method, then wait (if synchronous) until it completes
None of the steps in the second branch are required during a direct call from the UI thread, as all the preconditions are already guaranteed, so it's definitely going to be faster, although to be fair, unless you're updating controls very frequently, you're very unlikely to notice any difference.
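If you do end up needing the check in many places, a common way to avoid repeating it is a small extension method along these lines (a sketch; the name InvokeIfRequired is my own, not part of Windows Forms):

```csharp
using System;
using System.Windows.Forms;

static class ControlExtensions
{
    // Runs the action directly when already on the UI thread;
    // otherwise marshals it to the UI thread via Invoke.
    public static void InvokeIfRequired(this Control control, Action action)
    {
        if (control.InvokeRequired)
            control.Invoke(action);
        else
            action();
    }
}
```

Usage would then collapse to something like: this.textBox1.InvokeIfRequired(() => this.textBox1.Text = text);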

Is C#'s using statement abort-safe?

I've just finished reading "C# 4.0 in a Nutshell" (O'Reilly) and I think it's a great book for a programmer willing to switch to C#, but it left me wondering. My problem is the definition of the using statement. According to the book (p. 138),
using (StreamReader reader = File.OpenText("file.txt")) {
    ...
}
is precisely equivalent to:
StreamReader reader = File.OpenText("file.txt");
try {
    ...
} finally {
    if (reader != null)
        ((IDisposable)reader).Dispose();
}
Suppose, however, that this is true and that this code is executed in a separate thread. This thread is now aborted with thread.Abort(), so a ThreadAbortException is thrown; suppose the thread is exactly after initializing the reader and before entering the try..finally clause. This would mean that the reader is not disposed!
A possible solution would be to code this way:
StreamReader reader = null;
try {
    reader = File.OpenText("file.txt");
    ...
} finally {
    if (reader != null)
        ((IDisposable)reader).Dispose();
}
This would be abort-safe.
Now for my questions:
Are authors of the book right and the using statement is not abort-safe or are they wrong and it behaves like in my second solution?
If using is equivalent to the first variant (not abort-safe), why does it check for null in finally?
According to the book (p. 856), ThreadAbortException can be thrown anywhere in managed code. But maybe there are exceptions and the first variant is abort-safe after all?
EDIT: I know that using thread.Abort() is not considered good practice. My interest is purely theoretical: how does the using statement behave exactly?
The book's companion web site has more info on aborting threads here.
In short, the first translation is correct (you can tell by looking at the IL).
The answer to your second question is that there may be scenarios where the variable can legitimately be null. For instance, GetFoo() may return null here, in which case you wouldn't want a NullReferenceException thrown in the implicit finally block:
using (var x = GetFoo())
{
    ...
}
To answer your third question, the only way to make Abort safe (if you're calling Framework code) is to tear down the AppDomain afterward. This is actually a practical solution in many cases (it's exactly what LINQPad does whenever you cancel a running query).
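The tear-down-the-AppDomain approach can be sketched roughly as follows. This is .NET Framework only (AppDomains are gone in .NET Core/5+), and the type and names here are my own illustration:

```csharp
using System;
using System.Threading;

// Must be MarshalByRefObject so calls cross the domain boundary by proxy.
class AbortableWork : MarshalByRefObject
{
    public void Run()
    {
        while (true) { /* the long-running, cancellable work */ }
    }
}

class Program
{
    static void Main()
    {
        AppDomain domain = AppDomain.CreateDomain("worker");
        var work = (AbortableWork)domain.CreateInstanceAndUnwrap(
            typeof(AbortableWork).Assembly.FullName,
            typeof(AbortableWork).FullName);

        var thread = new Thread(work.Run);
        thread.Start();

        Thread.Sleep(100);
        thread.Abort();              // may leave state inside the domain corrupted...
        AppDomain.Unload(domain);    // ...so discard the whole domain afterward
        Console.WriteLine("worker domain unloaded");
    }
}
```

Since every object the aborted code could have corrupted lives inside the unloaded domain, the rest of the process never observes the damage.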
There's really no difference between your two scenarios -- in the second, the ThreadAbort could still happen after the call to OpenText, but before the result is assigned to the reader.
Basically, all bets are off when you get a ThreadAbortException. That's why you should never purposely abort threads rather than using some other method of gracefully bringing the thread to a close.
In response to your edit -- I would point out again that your two scenarios are actually identical. The 'reader' variable will be null unless the File.OpenText call successfully completes and returns a value, so there's no difference between writing the code out the first way vs. the second.
Thread.Abort is very, very bad juju; if people are calling that, you're already in a lot of trouble (unrecoverable locks, etc.). Thread.Abort should really be limited to the scenario of inhuming a sickly process.
Exceptions are generally unwound cleanly, but in extreme cases there is no guarantee that every bit of code can execute. A more pressing example is "what happens if the power fails?".
Re the null check: what if File.OpenText returned null? OK, it won't, but the compiler doesn't know that.
A bit off-topic, but the behaviour of the lock statement during thread abortion is interesting too. While lock is equivalent to:
object obj = x;
System.Threading.Monitor.Enter(obj);
try {
    …
}
finally {
    System.Threading.Monitor.Exit(obj);
}
it is guaranteed (by the x86 JITter) that a thread abort doesn't occur between Monitor.Enter and the try statement.
http://blogs.msdn.com/b/ericlippert/archive/2007/08/17/subtleties-of-c-il-codegen.aspx
The generated IL code seems to be different in .NET 4:
http://blogs.msdn.com/b/ericlippert/archive/2009/03/06/locks-and-exceptions-do-not-mix.aspx
The language spec clearly states that the first one is correct.
http://msdn.microsoft.com/en-us/vcsharp/aa336809.aspx MS Spec (Word document)
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf ECMA Spec
In the case of a thread abort, both code variants can fail. The second one can fail if the abort occurs after the expression has been evaluated but before the assignment to the local variable has occurred.
But you shouldn't use thread abortion anyway, since it can easily corrupt the state of the AppDomain. So only abort threads when you force-unload an AppDomain.
You are focusing on the wrong problem. The ThreadAbortException is just as likely to abort the OpenText() method itself. You might hope that it is resilient to that, but it isn't. The framework methods do not have try/catch clauses that try to deal with a thread abort.
Do note that the file doesn't remain open forever. The FileStream finalizer will, eventually, close the file handle. This of course can still cause exceptions in your program if you keep running and try to open the file again before the finalizer runs. Although this is something you always have to be defensive about when you run on a multi-tasking operating system.
Are authors of the book right and the using statement is not abort-safe or are they wrong and it behaves like in my second solution?
According to the book (p. 856), ThreadAbortException can be thrown anywhere in managed code. But maybe there are exceptions and the first variant is abort-safe after all?
The authors are right. The using block is not abort-safe. Your second solution is also not abort-safe; the thread could be aborted in the middle of the resource acquisition.
Although it's not abort-safe, any disposable that holds unmanaged resources should also implement a finalizer, which will eventually run and clean up the resource. The finalizer should be robust enough to also handle not-completely-initialized objects, in case the thread aborts in the middle of the resource acquisition.
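The standard dispose pattern with a finalizer as the safety net looks roughly like this (a sketch using a hypothetical unmanaged allocation as the resource):

```csharp
using System;
using System.Runtime.InteropServices;

class NativeResource : IDisposable
{
    private IntPtr handle;   // hypothetical unmanaged handle

    public NativeResource()
    {
        handle = Marshal.AllocHGlobal(256);   // stand-in for a real acquisition
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // finalizer no longer needed
    }

    // Finalizer: the safety net that eventually runs even if Dispose
    // never did (e.g. the owning thread was aborted mid-using).
    ~NativeResource() => Dispose(false);

    protected virtual void Dispose(bool disposing)
    {
        // disposing == true would also be the place to release managed
        // resources; only the unmanaged handle matters here.
        // Must tolerate a partially constructed object: check before freeing.
        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);
            handle = IntPtr.Zero;
        }
    }
}
```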
A Thread.Abort will only wait for code running inside Constrained Execution Regions (CERs), finally blocks, catch blocks, static constructors, and unmanaged code. So this is an abort-safe solution (only regarding the acquisition and disposal of the resource):
StreamReader reader = null;
try {
    try { }
    finally { reader = File.OpenText("file.txt"); }
    // ...
}
finally {
    if (reader != null) reader.Dispose();
}
But be careful, abort-safe code should run fast and not block. It could hang a whole app domain unload operation.
If using is equivalent to the first variant (not abort-safe), why does it check for null in finally?
Checking for null makes the using pattern safe in the presence of null references.
The former is indeed exactly equivalent to the latter.
As already pointed out, ThreadAbort is indeed a bad thing, but it's not quite the same as killing the task with Task Manager or switching off your PC.
ThreadAbort is a managed exception, which the runtime will raise when it is possible, and only then.
That said, once you're into ThreadAbort, why bother trying to cleanup? You're in death throes anyway.
The finally statement is always executed; MSDN says "finally is used to guarantee a statement block of code executes regardless of how the preceding try block is exited."
So you don't have to worry about not cleaning up resources, etc. (unless Windows, the Framework runtime, or something else bad beyond your control intervenes, but then there are bigger problems than cleaning up resources ;-))
