In Visual Studio, is there a way to profile the time spent in the inner methods of the .NET Framework?
An example - consider a good old-fashioned ArrayList, and adding some random numbers to it:
static void Main(string[] args)
{
    const int noNumbers = 10000; // 10k

    ArrayList numbers = new ArrayList();
    Random random = new Random(1); // use the same seed to make benchmarking consistent

    for (int i = 0; i < noNumbers; i++)
    {
        int currentNumber = random.Next(10); // generate a non-negative random number less than 10
        numbers.Add(currentNumber);          // BOXING occurs here
    }
}
I can step into the .NET Framework source code just fine while debugging. One can use the default Microsoft symbols and the source code for .NET (as described in this answer) or go the dotPeek route (detailed here). As for the cleanest option of just using the Reference Source symbols - as Hans Passant said in his answer almost 5 years ago - the framework version (down to the security updates installed) for which the symbols were created has to match your version exactly; you'd have to be really lucky to get that working (I wasn't). Bottom line: there are two ways I can successfully use to step into the .NET source code.
For the sample at hand, there aren't big differences between the Reference Source code and the dotPeek reverse-engineered code - that is for the methods invoked by ArrayList's Add - namely the Capacity setter and ArrayList's EnsureCapacity, of which the latter can be seen below (ReferenceSource code on the left, dotPeek source code on the right):
Running an "Instrumentation" profiling session will return a breakdown of the time spent in each method, but as far as the .NET types go, it appears that one only gets to see the methods the respective code called "directly" - in this case the function that adds elements to the ArrayList, the one that generates a random int, and the respective types' constructors. But there's no trace of EnsureCapacity or Capacity's setter, both of which are invoked heavily by ArrayList's Add.
Drilling down on a specific .NET method doesn't show any of the methods it called in turn, nor any source code (despite being able to see that very code earlier, while stepping into with the debugger):
How can one get to see those additional, "inner" .NET methods? If Visual Studio can't do it, perhaps another tool can?
PS There is a very similar question here; however, it's almost 10 years old and there's not much there that sheds light on the problem.
Later Update: As KolA very well points out, JetBrains dotTrace can show this. A line-by-line profiling session below:
perhaps another tool can?
dotTrace can profile performance down to individual properties, if that's what you're looking for. This example is for the generic List&lt;T&gt;, not ArrayList, but I think it shouldn't matter.
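If you just want to see the boxing cost without a profiler, here is a small sketch (the names are mine, and GC.GetAllocatedBytesForCurrentThread requires .NET Core / .NET 5+) comparing allocations for ArrayList versus List&lt;int&gt;:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Returns the bytes allocated while adding 'count' ints to an ArrayList
    // (each Add boxes the int) versus a pre-sized List<int> (no boxing).
    public static (long boxed, long unboxed) Measure(int count)
    {
        var arrayList = new ArrayList();
        var genericList = new List<int>(count); // pre-sized so resizes don't skew the result

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < count; i++) arrayList.Add(i);   // BOXING on every Add
        long boxed = GC.GetAllocatedBytesForCurrentThread() - before;

        before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < count; i++) genericList.Add(i); // stored inline, no boxing
        long unboxed = GC.GetAllocatedBytesForCurrentThread() - before;

        return (boxed, unboxed);
    }
}
```

Calling `BoxingDemo.Measure(10000)` should show the boxed figure dwarfing the unboxed one, which mirrors what the line-by-line profile shows.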
Related
I've modified the template WinUI 3 project, which includes a .Core project. I'm using this project the same way the template did, except I've converted the method that returns data into an asynchronous one, so that I can update the UI as data is parsed rather than all at once.
However, this is preventing me from viewing local variable values in the debugger. I instead get
Cannot obtain value of the local variable or argument because it is not available at this instruction pointer, possibly because it has been optimized away.
I'm setting breakpoints on lines where the values couldn't possibly be garbage collected, as they are used on the very next line. Stepping through the method from top to bottom does not help to reveal values.
Essentially code like this:
int i = 1;
int j = 2;
i = j + 1;
where I've put a breakpoint on the second line but am unable to see the value of i.
I've disabled every optimization I can. I've also verified in the Modules window that the .dll produced by the class library is not optimized. I've been trying to solve this for days and it's severely affecting my productivity.
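One workaround worth trying, as a sketch under the assumption that the JIT (e.g. tiered compilation or ReadyToRun) is still optimizing the method despite the build settings - the class and method names here are hypothetical:

```csharp
using System.Runtime.CompilerServices;

public static class DataLoader
{
    // Hypothetical stand-in for the async data-parsing method.
    // NoOptimization asks the JIT to keep locals alive and inspectable;
    // NoInlining keeps the method from disappearing into its caller.
    [MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
    public static int Demo()
    {
        int i = 1;
        int j = 2;
        i = j + 1;   // with optimization suppressed, i and j stay visible in the debugger
        return i;
    }
}
```

It's also worth checking that "Suppress JIT optimization on module load" is enabled under Tools &gt; Options &gt; Debugging &gt; General, and that the Debug configuration of the .Core library really builds with `<Optimize>false</Optimize>`.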
I wrote some code recently where I unintentionally reused a variable name as a parameter of an action declared within a function that already has a variable of the same name. For example:
var x = 1;
Action<int> myAction = (x) => { Console.WriteLine(x); };
When I spotted the duplication, I was surprised to see that the code compiled and ran perfectly, which is not behavior I would expect based on what I know about scope in C#. Some quick Googling turned up SO questions that complain that similar code does produce an error, such as Lambda Scope Clarification. (I pasted that sample code into my IDE to see if it would run, just to make sure; it runs perfectly.) Additionally, when I enter the Rename dialog in Visual Studio, the first x is highlighted as a name conflict.
Why does this code work? I'm using C# 8 with Visual Studio 2019.
Why does this code work? I'm using C# 8 with Visual Studio 2019.
You've answered your own question! It's because you're using C# 8.
The rule from C# 1 through 7 was: a simple name cannot be used to mean two different things in the same local scope. (The actual rule was slightly more complex than that but describing how is tedious; see the C# specification for details.)
The intention of this rule was to prevent the sort of situation that you're talking about in your example, where it becomes very easy to be confused about the meaning of the local. In particular, this rule was designed to prevent confusions like:
class C
{
    int x;
    void M()
    {
        x = 123;
        if (whatever)
        {
            int x = 356;
            ...
And now we have a situation where inside the body of M, x means both this.x and the local x.
Though well-intentioned, there were a number of problems with this rule:
It was not implemented to spec. There were situations where a simple name could be used as, say, both a type and a property, but these were not always flagged as errors because the error detection logic was flawed. (See below)
The error messages were confusingly worded, and inconsistently reported. There were multiple different error messages for this situation. They inconsistently identified the offender; that is, sometimes the inner usage would be called out, sometimes the outer, and sometimes it was just confusing.
I made an effort in the Roslyn rewrite to sort this out; I added some new error messages, and made the old ones consistent regarding where the error was reported. However, this effort was too little, too late.
The C# team decided for C# 8 that the whole rule was causing more confusion than it was preventing, and the rule was retired from the language. (Thanks Jonathon Chase for determining when the retirement happened.)
If you are interested to learn the history of this problem and how I attempted to fix it, see these articles I wrote about it:
https://ericlippert.com/2009/11/02/simple-names-are-not-so-simple/
https://ericlippert.com/2009/11/05/simple-names-are-not-so-simple-part-two/
https://ericlippert.com/2014/09/25/confusing-errors-for-a-confusing-feature-part-one/
https://ericlippert.com/2014/09/29/confusing-errors-for-a-confusing-feature-part-two/
https://ericlippert.com/2014/10/03/confusing-errors-for-a-confusing-feature-part-three/
At the end of part three I noted that there was also an interaction between this feature and the "Color Color" feature -- that is, the feature that allows:
class C
{
    Color Color { get; set; }
    void M()
    {
        Color = Color.Red;
    }
}
Here we have used the simple name Color to refer to both this.Color and the enumerated type Color; according to a strict reading of the specification this should be an error, but in this case the spec was wrong and the intention was to allow it, as this code is unambiguous and it would be vexing to make the developer change it.
I never did write that article describing all the weird interactions between these two rules, and it would be a bit pointless to do so now!
For a really simple code snippet, I'm trying to see how much of the time is spent actually allocating objects on the small object heap (SOH).
static void Main(string[] args)
{
    const int noNumbers = 10000000; // 10 mil

    ArrayList numbers = new ArrayList();
    Random random = new Random(1); // use the same seed to make benchmarking consistent

    for (int i = 0; i < noNumbers; i++)
    {
        int currentNumber = random.Next(10); // generate a non-negative random number less than 10
        object o = currentNumber;            // BOXING occurs here
        numbers.Add(o);
    }
}
In particular, I want to know how much time is spent allocating space for all the boxed int instances on the heap (I know, this is an ArrayList and there's horrible boxing going on as well - but it's just for educational purposes).
The CLR has 2 ways of performing memory allocations on the SOH: either calling the JIT_TrialAllocSFastMP allocation helper (for multi-processor systems; ...SFastSP for single-processor ones) - which is really fast, since it consists of a few assembly instructions - or falling back to the slower JIT_New allocation helper.
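Conceptually, the fast helper is just a pointer bump within a per-thread allocation context. The following is not the real CLR code (which is hand-written assembly), just a simplified C# model of the idea:

```csharp
public class AllocationContext
{
    // Simplified model of a thread's allocation context:
    // a chunk of the SOH handed to this thread by the GC.
    public int Pointer; // next free offset
    public int Limit;   // end of the chunk
}

public static class AllocatorModel
{
    // Fast path: bump the pointer. Fall back to the "slow" helper
    // (standing in for JIT_New) when the chunk is exhausted.
    public static int Allocate(AllocationContext ctx, int size)
    {
        int result = ctx.Pointer;
        if (result + size <= ctx.Limit)
        {
            ctx.Pointer = result + size; // the whole fast path: a few instructions
            return result;
        }
        return AllocateSlow(ctx, size);
    }

    static int AllocateSlow(AllocationContext ctx, int size)
    {
        // In the real CLR this can trigger a GC and acquire a fresh
        // allocation context; here we just refill the model's chunk.
        ctx.Limit = 1 << 16;
        ctx.Pointer = size;
        return 0;
    }
}
```

The point of the model: as long as the thread's chunk has room, no call into the runtime proper is needed, which is why the fast path is so hard to catch in sampled stacks.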
PerfView sees the JIT_New invocations just fine:
However, I can't figure out which - if any - is the native function involved in the "quick way" of allocating. I certainly don't see any JIT_TrialAllocSFastMP. I've already tried raising the count of the loop (from 10 to 500 mil), in the hope of increasing my chances of getting a glimpse of a few stacks containing the elusive function, but to no avail.
Another approach was to use the JetBrains dotTrace (line-by-line) performance viewer, but it falls short of what I want: I do get to see the approximate time the boxing operation takes for each int, but 1) it's just a bar and 2) it lumps together the allocation itself and the copying of the value (and the latter is not what I'm after).
Using the JetBrains dotTrace Timeline viewer won't work either, since they currently don't (quite) support native callstacks.
At this point it's unclear to me whether there's a method being dynamically generated and called when JIT_TrialAllocSFastMP is invoked - and by some miracle none of the PerfView-collected stack frames (one every 1 ms) ever captures it - or whether Main's method body somehow gets patched, with those few assembly instructions mentioned above injected directly into the code. It's also hard to believe that the fast way of allocating memory is never called.
You could ask "But you already have the .NET Core CLR code, why can't you figure it out yourself?". Since the .NET Framework CLR code is not publicly available, I've looked into its sibling, the .NET Core version of the CLR (as Matt Warren recommends in his step 6 here). The \src\vm\amd64\JitHelpers_InlineGetThread.asm file contains a JIT_TrialAllocSFastMP_InlineGetThread function. The issue is that parsing/understanding the code there is above my pay grade, and I also can't really think of a way to "Step Into" and see how the JIT-ed code is generated, since this is way lower-level than your usual Press-F11-in-Visual-Studio.
Update 1: Let's simplify the code, and only consider individual boxed int values:
const int noNumbers = 10000000; // 10 mil
object o = null;

for (int i = 0; i < noNumbers; i++)
{
    o = i;
}
Since this is a Release build, and dead code elimination could kick in, WinDbg is used to check the final machine code.
The resulting JITed code - whose main loop, which simply does the repeated boxing, is highlighted in blue below - shows that the method that handles the memory allocation is not inlined (note the call to hex address 00af30f4):
This method in turn tries to allocate via the "fast" way and, if that fails, falls back to the "slow" way of calling JIT_New itself:
It's interesting how the call stack obtained in PerfView from the code above doesn't show any intermediary method between the level of Main and the JIT_New entry itself, even though Main doesn't directly call JIT_New:
I have a simple class intended to store scaled integral values, using member variables scaled_value (long) with a scale_factor. I have a constructor that fills a new class instance with a decimal value (although I think the value type is irrelevant).
Assignment to the scaled_value slot appears... to not happen. I've inserted an explicit assignment of the constant 1 to it. The Debug.Assert below fails... and scaled_value is zero.
On the assertion break, in the Immediate window I can inspect scale_factor, set it with an assignment, and inspect it again; it changes as I set it. I can inspect scaled_value: it is always zero. I can type an assignment to it, which the Immediate window executes, but its value doesn't change.
I'm using Visual Studio 2017 with C# 2017.
What is magic about this slot?
public class ScaledLong : Base // handles scaled-by-power-of-ten long numbers
// intended to support the equivalent of fast decimal arithmetic while hiding scale factors from the user
{
    public long scaled_value;  // up to log10_MaxLong digits of decimal precision
    public sbyte scale_factor; // power of ten representing location of decimal point, range -21..+21. Set by constructor AND NEVER CHANGED.
    public byte byte_size;     // holds size of value in underlying memory array
    string format_string;

    <other constructors with same arguments except last value type>

    public ScaledLong(sbyte sf, byte size, string format, decimal initial_value)
    {
        scale_factor = sf;
        byte_size = size;
        format_string = format;

        decimal temp;
        sbyte exponent;
        { // rip exponent out of decimal value, leaving behind an integer
            _decimal_structure.value = initial_value;
            exponent = (sbyte)_decimal_structure.Exponent;
            _decimal_structure.Exponent = 0; // now decimal value is integral
            temp = _decimal_structure.value;
        }

        sbyte sfDelta = (sbyte)(sf - exponent);
        if (sfDelta >= 0)
        {
            this.scaled_value = 1;
            Debug.Assert(scaled_value == 1);
            scaled_value = (long)Math.Truncate(temp * DecimalTenToPower[sfDelta]);
        }
        else
        {
            temp = Math.Truncate(temp / DecimalHalfTenToPower[-sfDelta]);
            temp += (temp % 2); // this can overflow for values at the very top of the range, not worth fixing; note: this works for both + and - numbers (?)
            scaled_value = (long)(temp / 2); // final result
        }
    }
}
The biggest puzzles often have the stupidest foundations. This one is a lesson in unintended side effects.
I found this by thinking it through, wondering how on earth a member can get modified in unexpected ways. I found the solution before I read #mjwills' comment, but he was definitely sniffing at the right thing.
What I left out (of course!) was that I had just coded a ToString() method for the class... that wasn't debugged. Why did I leave it out? Because it obviously can't affect anything so it can't be part of the problem.
Bzzzzt! It used the member variable as a scratchpad and zeroed it (there's the side effect); that was obviously unintended.
What this means is that when the code just runs, ToString() isn't called and the member variable DOES get modified correctly. (I even had unit tests for the "Set" routine that checked all that, and they were working.)
But when you are debugging... the debugger can (and did in this case) show local variables. To do that, it will apparently call ToString() to get a nice displayable value. So the act of single-stepping caused ToString() to get called, and its buggy scratch-variable assignment zeroed out the slot after each step.
So it wasn't a setter that bit me. It was arguably a getter. (Where is FORTRAN's PURE keyword when you need it?)
Einstein hated spooky actions at a distance. Programmers hate spooky side effects at a distance.
One wonders a bit at the idea of the debugger calling ToString() on a class whose constructor hasn't finished. What assertions about the state of the class can ToString trust, given that the constructor isn't done? I think the MS debugger should be fixed. With that, I would have spent my time debugging ToString instead of chasing this.
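For anyone who wants to see the effect in isolation, here is a minimal repro of the bug pattern (hypothetical names, not my original class):

```csharp
public class Accumulator
{
    public long scaled_value;

    public Accumulator(long v) => scaled_value = v;

    // Buggy ToString: uses the member as a scratchpad and zeroes it.
    // The debugger calls ToString() to render the object in its variable
    // windows, so every single-step silently resets scaled_value to 0.
    public override string ToString()
    {
        string s = scaled_value.ToString();
        scaled_value = 0; // unintended side effect
        return s;
    }
}
```

Run without a debugger, scaled_value keeps its value; under the debugger, every display of the object wipes it - exactly the spooky action at a distance described above.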
Thanks for putting up with my question. It got me to the answer.
If you still have a copy of that old/buggy code, it would be interesting to try to build it under VS 2019 and Rider (ideally the latest, 2022.1.1 at this point) with ReSharper (built in) allowed to do its picky scan, and with a .ruleset allowed to complain about just about anything (just for the first build - you'll turn a lot of it off, but you need it to scream in order to see what to turn off). And with .NET 5.0 or 6.0.
The reason I mention it is that I remember some MS bragging about doing dataflow analysis to some degree in 2019, and I did see Rider complaining about some "unsafe assignments". If the old code is long lost - never mind.
CTOR-wise: if the CTOR hasn't finished yet, we all know that the object "doesn't exist" yet and has invalid state, but to mitigate that, C# uses default values for everything. When you see code with constant assignments at the point of definition of data members that look trivial and pointless, the reason is that a lot of people remember C++ and don't trust implicit defaults - just in case :-)
There is a 2-phase/round initialization sequence with 2 CTOR-s and implicit initializations in between. It's not widely documented (so that people with weak hearts don't use it :-) but it's completely deterministic and thread-safe (hidden fuses everywhere). Just for the sake of its stability, you never-ever want to have a call to any method before the 2nd round is done (a plain CTOR being done still doesn't mean a fully constructed object, and any method invocation from the outside may trigger the 2nd round prematurely).
The 1st (plain) CTOR can be used in implicit initializations before the 2nd runs => you can control the (implicit) ordering; you just have to be careful and step through it in the debugger.
Oh, and ToString normally shouldn't be defined at all - on purpose :-) It's de facto intrinsic => the compiler can take its liberties with it. Plus, if you define it, pretty soon you'll be obliged to support (and process) format specifiers.
I used to define ToJson (before the big libs came to the fore) to provide, let's say, a controllable printable (which can also go over the wire and is 10-100 times faster than deserialization). These days the VS debugger has a collection of "visualizers" and an option to tell the debugger whether to use them or not (when it's off, it will jerk ToString's chain if it sees it).
Also, it's good to have dotPeek (or the actual Reflector, owned by Redgate these days) with "find source code" turned off. Then you see the real generated code, which is sometimes glorious (String is intrinsic, and the compiler goes a few extra miles to optimize its operations) and sometimes ugly (async/await - a total faker, inefficient and flat-out dangerous - how do you say "deadlock" in C#? :-) - not kidding), but you need to be able to see the final code or you are driving blind.
I am experiencing a problem in my application with an OutOfMemoryException. My application can search for words within texts. When I start a long-running search across about 2000 different texts for about 2175 different words, the application terminates at about 50% through with an OutOfMemoryException (after about 6 hours of processing).
I have been trying to find the memory leak. I have an object graph like: (--> are references)
a static global application object (controller) --> an algorithm starter object --> text mining starter object --> text mining algorithm object (this object performs the searching).
The text mining starter object will start the text mining algorithm object's Run() method in a separate thread.
To try to fix the issue I have edited the code so that the text mining starter object will split the texts to search into several groups and initialize one text mining algorithm object for each group of texts sequentially (so when one text mining algorithm object is finished a new will be created to search the next group of texts). Here I set the previous text mining algorithm object to null. But this does not solve the issue.
When I create a new text mining algorithm object I have to give it some parameters. These are taken from properties of the previous text mining algorithm object before I set that object to null. Will this prevent garbage collection of the text mining algorithm object?
Here is the code for the creation of new text mining algorithm objects by the text mining algorithm starter:
private void RunSeveralAlgorithmObjects()
{
    IEnumerable<ILexiconEntry> currentEntries = allLexiconEntries.GetGroup(intCurrentAlgorithmObject, intNumberOfAlgorithmObjectsToUse);
    algorithm.LexiconEntries = currentEntries;
    algorithm.Run();
    intCurrentAlgorithmObject++;

    for (int i = 0; i < intNumberOfAlgorithmObjectsToUse - 1; i++)
    {
        algorithm = CreateNewAlgorithmObject();
        AddAlgorithmListeners();
        algorithm.Run();
        intCurrentAlgorithmObject++;
    }
}
private TextMiningAlgorithm CreateNewAlgorithmObject()
{
    TextMiningAlgorithm newAlg = new TextMiningAlgorithm();
    newAlg.SortedTermStruct = algorithm.SortedTermStruct;
    newAlg.PreprocessedSynonyms = algorithm.PreprocessedSynonyms;
    newAlg.DistanceMeasure = algorithm.DistanceMeasure;
    newAlg.HitComparerMethod = algorithm.HitComparerMethod;
    newAlg.LexiconEntries = allLexiconEntries.GetGroup(intCurrentAlgorithmObject, intNumberOfAlgorithmObjectsToUse);
    newAlg.MaxTermPercentageDeviation = algorithm.MaxTermPercentageDeviation;
    newAlg.MaxWordPercentageDeviation = algorithm.MaxWordPercentageDeviation;
    newAlg.MinWordsPercentageHit = algorithm.MinWordsPercentageHit;
    newAlg.NumberOfThreads = algorithm.NumberOfThreads;
    newAlg.PermutationType = algorithm.PermutationType;
    newAlg.RemoveStopWords = algorithm.RemoveStopWords;
    newAlg.RestrictPartialTextMatches = algorithm.RestrictPartialTextMatches;
    newAlg.Soundex = algorithm.Soundex;
    newAlg.Stemming = algorithm.Stemming;
    newAlg.StopWords = algorithm.StopWords;
    newAlg.Synonyms = algorithm.Synonyms;
    newAlg.Terms = algorithm.Terms;
    newAlg.UseSynonyms = algorithm.UseSynonyms;
    algorithm = null;
    return newAlg;
}
Here is the start of the thread that is running the whole search process:
// Run the algorithm in its own thread
Thread algorithmThread = new Thread(new ThreadStart(RunSeveralAlgorithmObjects));
algorithmThread.Start();
Can something here prevent the previous text mining algorithm object from being garbage collected?
I recommend first identifying what exactly is leaking. Then postulate a cause (such as references in event handlers).
To identify what is leaking:
Enable native debugging for the project. Properties -> Debug -> check Enable unmanaged code debugging.
Run the program. Since the memory leak is probably gradual, you probably don't need to let it run the whole 6 hours; just let it run for a while and then Debug -> Break All.
Bring up the Immediate window. Debug -> Windows -> Immediate
Type one of the following into the immediate window, depending on whether you're running 32 or 64 bit, .NET 2.0/3.0/3.5 or .NET 4.0:
.load C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll for 32-bit .NET 2.0-3.5
.load C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\sos.dll for 32-bit .NET 4.0
.load C:\WINDOWS\Microsoft.NET\Framework64\v2.0.50727\sos.dll for 64-bit .NET 2.0-3.5
.load C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319\sos.dll for 64-bit .NET 4.0
You can now run SoS commands in the Immediate window. I recommend checking the output of !dumpheap -stat, and if that doesn't pinpoint the problem, check !finalizequeue.
Notes:
Running the program the first time after enabling native debugging may take a long time (minutes) if you have VS set up to load symbols.
The debugger commands that I recommended both start with ! (exclamation point).
These instructions are courtesy of the incredible Tess from Microsoft, and Mario Hewardt, author of Advanced .NET Debugging.
Once you've identified the leak in terms of which object is leaking, then postulate a cause and implement a fix. Then you can do these steps again to determine for sure whether or not the fix worked.
1) As I said in a comment, if you use events in your code (the AddAlgorithmListeners makes me suspect this), subscribing to an event can create a "hidden" dependency between objects which is easily forgotten. This dependency can mean that an object is not freed, because someone is still listening to one of its events. Make sure you unsubscribe from all events when you no longer need to listen to them.
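To make the subscription direction concrete - it's the object that subscribes to a long-lived publisher's event that gets kept alive - here is a sketch with hypothetical names modeled on the question:

```csharp
using System;

public class GlobalController // hypothetical long-lived publisher (like the static controller)
{
    public event EventHandler StatusRequested;
    public void RaiseStatus() => StatusRequested?.Invoke(this, EventArgs.Empty);
}

public class MiningAlgorithm  // hypothetical stand-in for TextMiningAlgorithm
{
    readonly GlobalController controller;
    public int ProgressReports; // visible effect of the handler, for illustration

    public MiningAlgorithm(GlobalController c)
    {
        controller = c;
        // The controller's event now holds a delegate whose Target is this
        // object, so the long-lived controller keeps the algorithm reachable.
        controller.StatusRequested += OnStatus;
    }

    void OnStatus(object sender, EventArgs e) => ProgressReports++;

    // Without this, setting the 'algorithm' field to null elsewhere does
    // nothing: the controller's event still references the instance.
    public void Detach() => controller.StatusRequested -= OnStatus;
}
```

So if AddAlgorithmListeners subscribes each new algorithm to events on long-lived objects (or vice versa), every finished algorithm and everything it references can stay reachable for the whole 6-hour run.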
2) Also, I'd like to point you to one (probably not-so-off-topic) issue with your code:
private void RunSeveralAlgorithmObjects()
{
    ...
    algorithm.LexiconEntries = currentEntries;
    // ^ when/where is algorithm initialized?
    for (...)
    {
        algorithm = CreateNewAlgorithmObject();
        ....
    }
}
Is algorithm already initialized when this method is invoked? Otherwise, setting algorithm.LexiconEntries wouldn't seem like a valid thing to do. This means your method is dependent on some external state, which to me looks like a potential place for bugs creeping into your program logic.
If I understand it correctly, this object contains some state describing the algorithm, and CreateNewAlgorithmObject derives a new state for algorithm from the current state. If this was my code, I would make algorithm an explicit parameter to all your functions, as a signal that the method depends on this object. It would then no longer be hidden "external" state upon which your functions depend.
P.S.: If you don't want to go down that route, the other thing you could consider to make your code more consistent is to turn CreateNewAlgorithmObject into a void method and re-assign algorithm directly inside that method.
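A sketch of that refactoring, using a minimal hypothetical stand-in for TextMiningAlgorithm with just two of the settings from the question:

```csharp
public class TextMiningAlgorithm // minimal stand-in with two of the many settings
{
    public int NumberOfThreads;
    public bool UseSynonyms;
    public void Run() { /* search the current group of texts */ }
}

public static class AlgorithmFactory
{
    // The previous algorithm is an explicit parameter, not a hidden field:
    // the dependency is visible in the signature, and the caller decides
    // when to drop its reference to 'previous'.
    public static TextMiningAlgorithm CreateNext(TextMiningAlgorithm previous)
    {
        return new TextMiningAlgorithm
        {
            NumberOfThreads = previous.NumberOfThreads,
            UseSynonyms = previous.UseSynonyms,
            // ... copy the remaining settings the same way ...
        };
    }
}
```

The call site then reads `algorithm = AlgorithmFactory.CreateNext(algorithm);`, which makes the hand-off between old and new instances explicit.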
Is AddAlgorithmListeners attaching event handlers to events exposed by the algorithm object? Are the listening objects living longer than the algorithm object? In that case they can continue to keep the algorithm object from being collected.
If yes, try unsubscribing events before you let the object go out of scope.
for (int i = 0; i < intNumberOfAlgorithmObjectsToUse - 1; i++)
{
    algorithm = CreateNewAlgorithmObject();
    AddAlgorithmListeners();
    algorithm.Run();
    RemoveAlgorithmListeners(); // See if this fixes your issue.
    intCurrentAlgorithmObject++;
}
My suspect is AddAlgorithmListeners(); are you sure you remove the listeners after execution completes?
Is the IEnumerable returned by GetGroup() throw-away, or cached? That is, does it hold onto the objects it has emitted? If it does, memory usage would obviously grow linearly with each iteration.
Memory profiling is useful - have you tried examining the application with a profiler? I found Red Gate's memory profiler useful in the past (it's not free, but it does have an evaluation version, IIRC).