What can I do to programmatically prevent or limit Resource contentions? - c#

I have created an app that, given enough data, fails to complete, with "The transaction log for database 'tempdb' is full due to 'ACTIVE_TRANSACTION'." and "Cannot find table 0."
The Stored Procedure that the report uses does not explicitly reference "tempdb" so it must be something that SQL Server manages on its own.
Anyway, I ran the "Resource Contention" analysis in Visual Studio 2013 via Analyze > Performance and Diagnostics.
When it finished, it told me in the "Concurrency Profiling Report" that there were 30,790 total contentions, with "Handle2" and "Multiple Handles 1" making up over 99% of the "Most Contended Resources", and "_CorExeMain" (Thread ID 4936) as the "Most Contended Thread".
This is all interesting enough, I guess, but now that I know this, what can I do about it?
Is 30,790 an excessive amount of total contentions? It sounds like it, but I don't know. But again, assuming it is, this information doesn't really seem to tell me anything of value, that is: what can I do to improve the situation? How can I programmatically prevent or limit Resource and/or Thread contention?
There were no errors in the report generated, and the six messages were of the "for information only" variety. There was one Warning:
Warning 1 DA0022: # Gen 1 Collections / # Gen 2 Collections = 2.52; There is a relatively high rate of Gen 2 garbage collections occurring. If, by design, most of your program's data structures are allocated and persisted for a long time, this is not ordinarily a problem. However, if this behavior is unintended, your app may be pinning objects. If you are not certain, you can gather .NET memory allocation data and object lifetime information to understand the pattern of memory allocation your application uses.
...but the persisting of data structures for "a long time" is, indeed, by design.
UPDATE
I then ran the ".NET Memory Allocation" report, and here too I'm not sure what to make of it:
Is 124 million bytes excessive? Is there anything untoward about the functions allocating the most memory, or which types have the most memory allocated?
I also don't understand why the red vertical line moves about after the report has been generated; it was first around "616", then it moved to 0 (as seen in the screenshot above), and now it's around 120.
UPDATE 2
I see now (after running the final performance check (Instrumentation)) that the vertical red line is just a lemming - it follows the cursor wherever you drag it. This has some purpose, I guess...

What ended up working was a two-pronged attack:
0) The database guru refactored the Stored Procedure to be more efficient.
1) I reworked the data-reading code by breaking it into two parts: I determined the first half of the data to be retrieved and got that, paused briefly, and then made a second pass to retrieve the rest of the data. In short, what used to be this:
. . .
ReadData(_unit, _monthBegin, _monthEnd, _beginYearStr, _endYearStr);
. . .
...is now this:
. . .
if (...) // big honkin' amount of data
{
    string monthBegin1 = _monthBegin;
    string monthEnd1 = GetEndMonthOfFirstHalf(_monthBegin, _monthEnd, _beginYearStr, _endYearStr);
    string monthBegin2 = GetBeginMonthOfSecondHalf(_monthBegin, _monthEnd, _beginYearStr, _endYearStr);
    string monthEnd2 = _monthEnd;
    string yearBegin1 = _beginYearStr;
    string yearEnd1 = GetEndYearOfFirstHalf(_monthBegin, _monthEnd, _beginYearStr, _endYearStr);
    string yearBegin2 = GetBeginYearOfSecondHalf(_monthBegin, _monthEnd, _beginYearStr, _endYearStr);
    string yearEnd2 = _endYearStr;

    ReadData(_unit, monthBegin1, monthEnd1, yearBegin1, yearEnd1);
    Thread.Sleep(10000);
    ReadData(_unit, monthBegin2, monthEnd2, yearBegin2, yearEnd2);
}
else // a "normal" Unit (not an overly large amount of data to be processed)
{
    ReadData(_unit, _monthBegin, _monthEnd, _beginYearStr, _endYearStr);
}
. . .
It now successfully runs to completion without any exceptions. I don't know if the problem was entirely database-related, or if all Excel Interop activity was also problematic, but I can now generate Excel files of 1,341KB with this operation.

Related

Data Structures & Techniques for operating on large data volumes (1 mln. recs and more)

A WPF .NET 4.5 app that I have been developing, initially to work on small data volumes, now works on much larger data volumes, in the region of 1 million records and more, and of course I started running out of memory. The data comes from an MS SQL DB and needs to be loaded into a local data structure, because it is then transformed / processed / referenced by CLR code, so continuous and uninterrupted data access is required; however, not all of the data has to be loaded into memory straight away, only when it is actually accessed. As a small example, an inverse distance interpolator uses this data to produce interpolated maps, and all of the data needs to be passed to it for continuous grid generation.
I have re-written some parts of the app for processing data, such as loading only x rows at any given time and implementing a sliding-window approach to data processing, which works. However, doing this for the rest of the app will require some time investment, and I wonder if there is a more robust and standard way of approaching this design problem (there has to be; I am not the first one)?
tldr; Does C# provide any data structures or techniques for accessing large amounts of data in an uninterrupted manner, so that it behaves like an IEnumerable but the data is not in memory until it is actually accessed or required, or is it completely up to me to manage memory usage? My ideal would be a structure that automatically implements a buffer-like mechanism, loading in more data as that data is accessed and freeing memory from data that has been accessed and is no longer of interest. Like some DataTable with an internal buffer, maybe?
As far as iterating through a very large data set that is too large to fit in memory goes, you can use a producer-consumer model. I used something like this when I was working with a custom data set that contained billions of records--about 2 terabytes of data total.
The idea is to have a single class that contains both producer and consumer. When you create a new instance of the class, it spins up a producer thread that fills a constrained concurrent queue. And that thread keeps the queue full. The consumer part is the API that lets you get the next record.
You start with a shared concurrent queue. I like the .NET BlockingCollection for this.
Here's an example that reads a text file and maintains a queue of 10,000 text lines.
public class TextFileLineBuffer
{
    private const int QueueSize = 10000;
    private BlockingCollection<string> _buffer = new BlockingCollection<string>(QueueSize);
    private CancellationTokenSource _cancelToken;
    private StreamReader _reader;

    public TextFileLineBuffer(string filename)
    {
        // File is opened here so that any exception is thrown on the calling thread.
        _reader = new StreamReader(filename);
        _cancelToken = new CancellationTokenSource();
        // Start the task that reads the file.
        Task.Factory.StartNew(ProcessFile, TaskCreationOptions.LongRunning);
    }

    public string GetNextLine()
    {
        if (_buffer.IsCompleted)
        {
            // The buffer is empty because the file has been read
            // and all lines returned.
            // You can either call this an error and throw an exception,
            // or you can return null.
            return null;
        }

        // If there is a record in the buffer, it is returned immediately.
        // Otherwise, Take does a non-busy wait.
        // You might want to catch the OperationCanceledException here and return null
        // rather than letting the exception escape.
        return _buffer.Take(_cancelToken.Token);
    }

    private void ProcessFile()
    {
        while (!_reader.EndOfStream && !_cancelToken.Token.IsCancellationRequested)
        {
            var line = _reader.ReadLine();
            try
            {
                // This will block if the buffer already contains QueueSize records.
                // As soon as a space becomes available, this will add the record
                // to the buffer.
                _buffer.Add(line, _cancelToken.Token);
            }
            catch (OperationCanceledException)
            {
                // Cancellation was requested; the loop condition will end the loop.
            }
        }
        _buffer.CompleteAdding();
    }

    public void Cancel()
    {
        _cancelToken.Cancel();
    }
}
That's the bare bones of it. You'll want to add a Dispose method that will make sure that the thread is terminated and that the file is closed.
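A minimal sketch of what that Dispose might look like, assuming the class is declared IDisposable and the task returned by Task.Factory.StartNew in the constructor is kept in a _producerTask field (neither is shown above):
public void Dispose()
{
    // Stop the producer loop, then wait for it so the reader isn't disposed
    // while ProcessFile() is still using it.
    _cancelToken.Cancel();
    if (_producerTask != null)   // _producerTask: assumed field holding the StartNew result
        _producerTask.Wait();

    _reader.Dispose();
    _buffer.Dispose();
    _cancelToken.Dispose();
}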
I've used this basic approach to good effect in many different programs. You'll have to do some analysis and testing to determine the optimum buffer size for your application. You want something large enough to keep up with the normal data flow and also handle bursts of activity, but not so large that it exceeds your memory budget.
IEnumerable modifications
If you want to support IEnumerable<T>, you have to make some minor modifications. I'll extend my example to support IEnumerable<String>.
First, you have to change the class declaration:
public class TextFileLineBuffer: IEnumerable<string>
Then, you have to implement GetEnumerator:
public IEnumerator<String> GetEnumerator()
{
    foreach (var s in _buffer.GetConsumingEnumerable())
    {
        yield return s;
    }
}

IEnumerator IEnumerable.GetEnumerator()
{
    return GetEnumerator();
}
With that, you can initialize the thing and then pass it to any code that expects an IEnumerable<string>. So it becomes:
var items = new TextFileLineBuffer(filename);
DoSomething(items);

void DoSomething(IEnumerable<string> list)
{
    foreach (var s in list)
        Console.WriteLine(s);
}
@Sergey The producer-consumer model (proposed by Jim Mischel) is probably your safest solution for complete scalability.
However, if you were to increase the room for the elephant (using your visual metaphor, which fits very well), then compression on the fly is a viable option: decompress when used and discard after use, leaving the core data structure compressed in memory. Obviously it depends on how much the data lends itself to compression, but there is a lot of room in most data structures. If you have ON and OFF flags for some metadata, they can be buried in the unused bits of 16/32-bit numbers, or at least held in bits rather than bytes; use 16-bit integers for lat/longs with a constant scaling factor to convert each to real numbers before use; strings can be compressed using winzip-type libraries, or indexed so that only ONE copy is held and no duplicates exist in memory, etc....
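As a rough illustration of the scaled-integer idea (not code from any particular project; the scale factor is a made-up choice):
// Latitude always lies in [-90, +90], so it fits in a 16-bit short at a fixed resolution.
const double LatScale = 32767.0 / 90.0;

short CompressLatitude(double latitude)      // store this in the in-memory structure
{
    return (short)Math.Round(latitude * LatScale);
}

double DecompressLatitude(short stored)      // expand on the fly just before use
{
    return stored / LatScale;
}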
Decompression (albeit custom made) on the fly can be lightning fast.
This whole process can be very laborious I admit, but can definitely keep the room large enough as the elephant grows - in some instances. (Of course, it may never be good enough if the data is simply growing indefinitely)
EDIT: Re any sources...
Hi @Sergey, I wish I could!! Truly! I have used this technique for data compression, and really the whole thing was custom designed on a whiteboard with one or two coders involved.
It's certainly not (all) rocket science, but it's good to fully scope out the nature of all the data; then you know (for example) that a certain figure will never exceed 9999, so you can choose how to store it in the minimum number of bits, and then allocate the leftover bits (assuming 32-bit storage) to other values. (A real-world example is the number of fingers a person has... loosely speaking you could set an upper limit at 8 or 10, although 12 is possible, and even 20 is remotely feasible if they have extra fingers. You can see what I mean.) Lat/longs are the PERFECT example of numbers that will never cross logical boundaries (unless you use wrap-around values...). That is, they are always between -90 and +90 (just guessing which type of lat/longs), which is very easy to reduce/convert since the range of values is so neat.
So we did not rely 'directly' on any third party literature. Only upon algorithms designed for specific types of data.
In other projects, for fast real-time DSP (processing), the smarter coders (experienced game programmers) would convert floats to 16-bit ints and have a global scaling factor calculated to give maximum precision for the particular data stream (accelerometers, LVDTs, pressure gauges, etc.) you are collecting.
This reduced the transmitted AND stored data without losing ANY information. Similarly, for real-time wave/signal data you could use a (Fast) Fourier Transform to turn your noisy wave into its amplitude, phase and spectrum components - literally half of the data values, without actually losing any (significant) data. (Within these algorithms, the data 'loss' is completely measurable, so you can decide if you are in fact losing data.)
Similarly, there are algorithms like rainflow analysis (nothing to do with rain; more about cycles and frequency) which reduce your data a lot. Peak detection and vector analysis can be enough for some other signals, basically throwing out about 99% of the data... The list is endless, but the technique MUST be intimately suited to your data. And you may have many different types of data, each lending itself to a different 'reduction' technique. I'm sure you can google 'lossless data reduction' (although I think the term lossless was coined by music processing and is a little misleading, since digital music has already lost the upper and lower frequency ranges... I digress)... Please post what you find (if of course you have the time/inclination to research this further).
I would be interested to discuss your meta data, perhaps a large chunk can be 'reduced' quite elegantly...

Process very large XML file

I need to process an XML file with the following structure:
<FolderSizes>
    <Version></Version>
    <DateTime Un=""></DateTime>
    <Summary>
        <TotalSize Bytes=""></TotalSize>
        <TotalAllocated Bytes=""></TotalAllocated>
        <TotalAvgFileSize Bytes=""></TotalAvgFileSize>
        <TotalFolders Un=""></TotalFolders>
        <TotalFiles Un=""></TotalFiles>
    </Summary>
    <DiskSpaceInfo>
        <Drive Type="" Total="" TotalBytes="" Free="" FreeBytes="" Used=""
               UsedBytes=""><![CDATA[ ]]></Drive>
    </DiskSpaceInfo>
    <Folder ScanState="">
        <FullPath Name=""><![CDATA[ ]]></FullPath>
        <Attribs Int=""></Attribs>
        <Size Bytes=""></Size>
        <Allocated Bytes=""></Allocated>
        <AvgFileSz Bytes=""></AvgFileSz>
        <Folders Un=""></Folders>
        <Files Un=""></Files>
        <Depth Un=""></Depth>
        <Created Un=""></Created>
        <Accessed Un=""></Accessed>
        <LastMod Un=""></LastMod>
        <CreatedCalc Un=""></CreatedCalc>
        <AccessedCalc Un=""></AccessedCalc>
        <LastModCalc Un=""></LastModCalc>
        <Perc><![CDATA[ ]]></Perc>
        <Owner><![CDATA[ ]]></Owner>
        <!-- Special element; see paragraph below -->
        <Folder></Folder>
    </Folder>
</FolderSizes>
The <Folder> element is special in that it repeats within the <FolderSizes> element but can also appear within itself; I reckon up to about 5 levels.
The problem is that the file is really big at a whopping 11GB so I'm having difficulty processing it - I have experience with XML documents, but nothing on this scale.
What I would like to do is to import the information into a SQL database because then I will be able to process the information in any way necessary without having to concern myself with this immense, impractical file.
Here are the things I have tried:
Simply load the file and attempt to process it with a simple C# program using an XmlDocument or XDocument object
Before I even started I knew this would not work, as I'm sure everyone would agree, but I tried it anyway, and ran the application on a VM (since my notebook only has 4GB RAM) with 30GB memory. The application ended up using 24GB memory, and taking very, very long, so I just cancelled it.
Attempt to process the file using an XmlReader object
This approach worked better in that it didn't use as much memory, but I still had a few problems:
It was taking a really long time because I was reading the file one line at a time.
Processing the file one line at a time makes it difficult to really work with the data contained in the XML, because now you have to detect the start of a tag, then the end of that tag (hopefully), then create a document from that information, read the info, and attempt to determine which parent tag it belongs to because we have multiple levels... It sounds prone to problems and errors.
Did I mention it takes a really long time to read the file one line at a time - and that is still without actually processing the line; literally just reading it?
Import the information using SQL Server
I created a stored procedure using XQuery and ran it recursively within itself to process the <Folder> elements. This went quite well - I think better than the other two approaches - until one of the <Folder> elements ended up being rather big, producing an "An XML operation resulted an XML data type exceeding 2GB in size. Operation aborted." error. I read up about it and I don't think it's an adjustable limit.
Here are more things I think I should try:
Re-write my C# application to use unmanaged code
I don't have much experience with unmanaged code, so I'm not sure how well it will work and how to make it as unmanaged as possible.
I once wrote a little application that works with my webcam, receiving the image, inverting the colours, and painting it to a panel. Using normal managed code didn't work - the result was about 2 frames per second. Re-writing the colour inversion method to use unmanaged code solved the problem. That's why I thought that unmanaged might be a solution.
Rather go for C++ instead of C#
Not sure if this is really a solution. Would it necessarily be better than C#? Better than unmanaged code in C#?
The problem here is that I haven't actually worked with C++ before, so I'll need to get to know a few things about C++ before I can really start working with it, and then probably not very efficiently yet.
I thought I'd ask for some advice before I go any further, possibly wasting my time.
Thanks in advance for your time and assistance.
EDIT
So before I start processing the file, I run through it and check the size in an attempt to provide the user with feedback as to how long the processing might take; I made a screenshot of the calculation:
That's about 1,500 lines per second; if the average line length is about 50 characters, that's 50 bytes per line, which is 75 kilobytes per second, so an 11GB file should take about 40 hours, if my maths is correct. But this is only stepping through each line; it's not actually processing the line or doing anything with it, so when that starts, the processing rate drops significantly.
This is the method that runs during the size calculation:
private int _totalLines = 0;
private bool _cancel = false; // set to true when the cancel button is clicked
private void CalculateFileSize()
{
    // xmlStream, xmlReader and timer are fields declared elsewhere in the form.
    xmlStream = new StreamReader(_filePath);
    xmlReader = new XmlTextReader(xmlStream);

    while (xmlReader.Read())
    {
        if (_cancel)
            return;

        if (xmlReader.LineNumber > _totalLines)
            _totalLines = xmlReader.LineNumber;

        InterThreadHelper.ChangeText(
            lblLinesRemaining,
            string.Format("{0} lines", _totalLines));

        string elapsed = string.Format(
            "{0}:{1}:{2}:{3}",
            timer.Elapsed.Days.ToString().PadLeft(2, '0'),
            timer.Elapsed.Hours.ToString().PadLeft(2, '0'),
            timer.Elapsed.Minutes.ToString().PadLeft(2, '0'),
            timer.Elapsed.Seconds.ToString().PadLeft(2, '0'));
        InterThreadHelper.ChangeText(lblElapsed, elapsed);

        if (_cancel)
            return;
    }

    xmlStream.Dispose();
}
Still running, 27 minutes in :(
You can read XML as a logical stream of elements instead of trying to read it line by line and piece it back together yourself; see the code sample at the end of this article.
Also, your question has already been asked here.
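For illustration (this is not the linked article's code), a bare-bones sketch of streaming the file with XmlReader and XNode.ReadFrom, so only one <Folder> subtree is in memory at a time; the file name is a placeholder and the element names follow the structure shown in the question:
using System;
using System.Xml;
using System.Xml.Linq;

class FolderStreamer
{
    static void Main()
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };
        using (var reader = XmlReader.Create("FolderSizes.xml", settings))  // placeholder path
        {
            while (reader.ReadToFollowing("Folder"))
            {
                // Materialize just this subtree (including nested <Folder> elements)
                // and process it, e.g. bulk-insert its values into SQL here.
                var folder = (XElement)XNode.ReadFrom(reader);
                var fullPath = folder.Element("FullPath");
                if (fullPath != null)
                    Console.WriteLine(fullPath.Attribute("Name").Value);
            }
        }
    }
}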

Why does appending to TextBox.Text during a loop take up more memory with each iteration?

Short Question
I have a loop that runs 180,000 times. At the end of each iteration it is supposed to append the results to a TextBox, which is updated real-time.
Using MyTextBox.Text += someValue is causing the application to eat huge amounts of memory, and it runs out of available memory after a few thousand records.
Is there a more efficient way of appending text to a TextBox.Text 180,000 times?
Edit I really don't care about the result of this specific case, however I want to know why this seems to be a memory hog, and if there is a more efficient way to append text to a TextBox.
Long (Original) Question
I have a small app which reads a list of ID numbers from a CSV file and generates a PDF report for each one. After each PDF file is generated, ResultsTextBox.Text gets appended with the ID number of the report that was processed and a note that it was successfully processed. The process runs on a background thread, so the ResultsTextBox gets updated in real time as items get processed.
I am currently running the app against 180,000 ID numbers, however the memory the application is taking up is growing exponentially as time goes by. It starts at around 90 KB, but by about 3,000 records it is taking up roughly 250 MB, and by 4,000 records the application is taking up about 500 MB of memory.
If I comment out the update to the Results TextBox, the memory stays relatively stationary at roughly 90K, so I can assume that writing ResultsText.Text += someValue is what is causing it to eat memory.
My question is, why is this? What is a better way of appending data to a TextBox.Text that doesn't eat memory?
My code looks like this:
try
{
    report.SetParameterValue("Id", id);
    report.ExportToDisk(ExportFormatType.PortableDocFormat,
        string.Format(@"{0}\{1}.pdf", new object[] { outputLocation, id }));

    // ResultsText.Text += string.Format("Exported {0}\r\n", id);
}
catch (Exception ex)
{
    ErrorsText.Text += string.Format("Failed to export {0}: {1}\r\n",
        new object[] { id, ex.Message });
}
It is also worth mentioning that the app is a one-time thing and it doesn't matter that it is going to take a few hours (or days :)) to generate all the reports. My main concern is that if it hits the system memory limit, it will stop running.
I'm fine with leaving the line updating the Results TextBox commented out to run this thing, but I would like to know if there is a more memory efficient way of appending data to a TextBox.Text for future projects.
I suspect the reason the memory usage is so large is because textboxes maintain a stack so that the user can undo/redo text. That feature doesn't seem to be required in your case, so try setting IsUndoEnabled to false.
Use TextBox.AppendText(someValue) instead of TextBox.Text += someValue. It's easy to miss since it's on TextBox, not TextBox.Text. Like StringBuilder, this will avoid creating copies of the entire text each time you add something.
It would be interesting to see how this compares to the IsUndoEnabled flag from keyboardP's answer.
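In the question's code, that would look something like this (replacing the commented-out line):
ResultsText.AppendText(string.Format("Exported {0}\r\n", id));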
Don't append directly to the Text property. Use a StringBuilder for the appending; then, when done, set the .Text to the finished string from the StringBuilder.
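A minimal sketch of that suggestion, reusing the question's ResultsText textbox; ids and ExportReport are placeholder names:
// using System.Text;
var sb = new StringBuilder();
foreach (var id in ids)                        // ids: the 180,000 ID numbers
{
    ExportReport(id);                          // placeholder for the PDF export call
    sb.AppendFormat("Exported {0}\r\n", id);
}
ResultsText.Text = sb.ToString();              // one assignment at the end, no per-append copies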
Instead of using a text box I would do the following:
Open up a text file and stream the errors to a log file just in case.
Use a list box control to represent the errors to avoid copying potentially massive strings.
Personally, I always use string.Concat*. I remember reading a question here on Stack Overflow years ago that had profiling statistics comparing the commonly used methods, and I seem to recall that string.Concat won out.
Nonetheless, the best I can find is this reference question and this specific String.Format vs. StringBuilder question, which mentions that String.Format uses a StringBuilder internally. This makes me wonder if your memory hog lies elsewhere.
*Based on James' comment, I should mention that I never do heavy string formatting, as I focus on web-based development.
Maybe reconsider the TextBox? A ListBox holding string items will probably perform better.
But the main problem seems to be the requirements: showing 180,000 items cannot be aimed at a (human) user, and neither is changing them in "real time".
The preferable way would be to show a sample of the data or a progress indicator.
When you do want to dump it all on the poor user, batch the string updates. No user could discern more than 2 or 3 changes per second, so if you produce 100 per second, make groups of 50.
Some responses have alluded to it, but nobody has outright stated it which is surprising.
Strings are immutable which means a String cannot be modified after it is created. Therefore, every time you concatenate to an existing String, a new String Object needs to be created. The memory associated with that String Object also obviously needs to be created, which can get expensive as your Strings become larger and larger. In college, I once made the amateur mistake of concatenating Strings in a Java program that did Huffman coding compression. When you're concatenating extremely large amounts of text, String concatenation can really hurt you when you could have simply used StringBuilder, as some in here have mentioned.
Use the StringBuilder as suggested.
Try to estimate the final string size, then use that number when instantiating the StringBuilder: StringBuilder sb = new StringBuilder(estSize);
When updating the TextBox, just use assignment, e.g.: textbox.Text = sb.ToString();
Watch for cross-thread operations as above. However, use BeginInvoke; there is no need to block the background thread while the UI updates.
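A rough WinForms-flavoured sketch of those points; sb, estSize, id and textbox are placeholder names from this discussion:
// using System; using System.Text;
var sb = new StringBuilder(estSize);           // estSize: rough estimate of the final length

// ... inside the worker loop, after each item is processed:
sb.AppendFormat("Exported {0}\r\n", id);
string snapshot = sb.ToString();               // snapshot taken on the worker thread

// BeginInvoke queues the update on the UI thread and returns immediately,
// so the background thread is never blocked.
textbox.BeginInvoke((Action)(() => textbox.Text = snapshot));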
A) Intro: already mentioned, use StringBuilder
B) Point: don't update too frequently, i.e.
DateTime dtLastUpdate = DateTime.MinValue;
while (condition)
{
    DoSomeWork();
    if (DateTime.Now - dtLastUpdate > TimeSpan.FromSeconds(2))
    {
        // Control.Invoke expects a delegate, so cast the lambda to Action.
        _form.Invoke((Action)(() => textBox.Text = myStringBuilder.ToString()));
        dtLastUpdate = DateTime.Now;
    }
}
C) If it's a one-time job, target the x64 architecture so you are not constrained by the 32-bit 2 GB limit.
Keep the StringBuilder in a ViewModel and bind it to MyTextBox.Text; this avoids the mess of rebinding ever-growing strings. This scenario will increase performance many times over and decrease memory usage.
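A hypothetical sketch of that idea (names are made up; the view would bind MyTextBox.Text one-way to LogText):
// using System.ComponentModel; using System.Text;
public class ReportLogViewModel : INotifyPropertyChanged
{
    private readonly StringBuilder _log = new StringBuilder();

    public event PropertyChangedEventHandler PropertyChanged;

    // MyTextBox.Text is bound to this property in XAML.
    public string LogText
    {
        get { return _log.ToString(); }
    }

    public void Append(string line)
    {
        _log.AppendLine(line);
        var handler = PropertyChanged;
        if (handler != null)
            handler(this, new PropertyChangedEventArgs("LogText"));
    }
}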
Something that has not been mentioned is that even if you're performing the operation on a background thread, the update of the UI element itself HAS to happen on the main thread (in WinForms, anyway).
When updating your textbox, do you have any code that looks like
if (textBox.Dispatcher.CheckAccess())
{
    textBox.Text += "whatever";
}
else
{
    textBox.Dispatcher.Invoke(...);
}
If so, then your background op is definitely being bottlenecked by the UI Update.
I would suggest that your background op use StringBuilder as noted above, but instead of updating the textbox every cycle, try updating it at regular intervals to see if it increases performance for you.
EDIT NOTE: I have not used WPF.
You say memory grows exponentially. No, it is quadratic growth, i.e. polynomial growth, which is not as dramatic as exponential growth.
You are creating strings holding the following number of items:
1 + 2 + 3 + 4 + 5 ... + n = (n^2 + n) /2.
With n = 180,000 you get total memory allocation for 16,200,090,000 items, i.e. 16.2 billion items! This memory will not be allocated at once, but it is a lot of cleanup work for the GC (garbage collector)!
Also, bear in mind, that the previous string (which is growing) must be copied into the new string 179,999 times. The total number of copied bytes goes with n^2 as well!
As others have suggested, use a ListBox instead. There you can append new strings without creating one huge string. A StringBuilder does not help, since you want to display the intermediate results as well.

Function profiling woes - Visual Studio 2010 Ultimate

I am trying to profile my application to monitor the effects of a function, both before and after refactoring. I have performed an analysis of my application and, having looked at the Summary, I've noticed that the Hot Path list does not mention any of my own functions; it only mentions functions up to Application.Run().
I'm fairly new to profiling and would like to know how I could get more information about the Hot Path as demonstrated via the MSDN documentation;
MSDN Example:
My Results:
I've noticed in the Output window that there are a lot of messages relating to a failure when loading symbols; a few of them are below:
Failed to load symbols for C:\Windows\system32\USP10.dll.
Failed to load symbols for C:\Windows\system32\CRYPTSP.dll.
Failed to load symbols for (Omitted)\WindowsFormsApplication1\bin\Debug\System.Data.SQLite.dll.
Failed to load symbols for C:\Windows\system32\GDI32.dll.
Failed to load symbols for C:\Windows\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.7601.17514_none_41e6975e2bd6f2b2\comctl32.dll.
Failed to load symbols for C:\Windows\system32\msvcrt.dll.
Failed to load symbols for C:\Windows\Microsoft.NET\Framework\v4.0.30319\nlssorting.dll.
Failed to load symbols for C:\Windows\Microsoft.Net\assembly\GAC_32\System.Data\v4.0_4.0.0.0__b77a5c561934e089\System.Data.dll.
Failed to load symbols for C:\Windows\Microsoft.Net\assembly\GAC_32\System.Transactions\v4.0_4.0.0.0__b77a5c561934e089\System.Transactions.dll.
Unable to open file to serialize symbols: Error VSP1737: File could not be opened due to sharing violation: - D:\(Omitted)\WindowsFormsApplication1110402.vsp
(Formatted using code tool so it's readable)
Thanks for any pointers.
The "Hot Path" shown on the summary view is the most expensive call path based on the number of inclusive samples (samples from the function and also samples from functions called by the function) and exclusive samples (samples only from the function). A "sample" is just the fact the function was at the top of the stack when the profiler's driver captured the stack (this occurs at very small timed intervals). Thus, the more samples a function has, the more it was executing.
By default for sampling analysis, a feature called "Just My Code" is enabled that hides functions on the stack coming from non-user modules (it will show a depth of 1 non-user functions if called by a user function; in your case Application.Run). Functions coming from modules without symbols loaded or from modules known to be from Microsoft would be excluded. Your "Hot Path" on the summary view indicates that the most expensive stack didn't have anything from what the profiler considers to be your code (other than Main). The example from MSDN shows more functions because the PeopleTrax.* and PeopleNS.* functions are coming from "user code". "Just My Code" can be turned off by clicking the "Show All Code" link on the summary view, but I would not recommend doing so here.
Take a look at the "Functions Doing The Most Individual Work" on the summary view. This displays functions that have the highest exclusive sample counts and are therefore, based on the profiling scenario, the most expensive functions to call. You should see more of your functions (or functions called by your functions) here. Additionally, the "Functions" and "Call Tree" view might show you more details (there's a drop-down at the top of the report to select the current view).
As for your symbol warnings, most of those are expected because they are Microsoft modules (not including System.Data.SQLite.dll). While you don't need the symbols for these modules to properly analyze your report, if you checked "Microsoft Symbol Servers" in "Tools -> Options -> Debugging -> Symbols" and reopened the report, the symbols for these modules should load. Note that it'll take much longer to open the report the first time because the symbols need to be downloaded and cached.
The other warning about the failure to serialize symbols into the report file is the result of the file not being able to be written to because it is open by something else that prevents writing. Symbol serialization is an optimization that allows the profiler to load symbol information directly from the report file on the next analysis. Without symbol serialization, analysis simply needs to perform the same amount of work as when the report was opened for the first time.
And finally, you may also want to try instrumentation instead of sampling in your profiling session settings. Instrumentation modifies modules that you specify to capture data on each and every function call (be aware that this can result in a much, much larger .vsp file). Instrumentation is ideal for focusing in on the timing of specific pieces of code, whereas sampling is ideal for general low-overhead profiling data collection.
Do you mind too much if I talk a bit about profiling, what works and what doesn't?
Let's make up an artificial program, some of whose statements are doing work that can be optimized away - i.e. they are not really necessary.
They are "bottlenecks".
Subroutine foo runs a CPU-bound loop that takes one second.
Also assume subroutine CALL and RETURN instructions take insignificant or zero time, compared to everything else.
Subroutine bar calls foo 10 times, but 9 of those times are unnecessary, which you don't know in advance and can't tell until your attention is directed there.
Subroutines A, B, C, ..., J are 10 subroutines, and they each call bar once.
The top-level routine main calls each of A through J once.
So the total call tree looks like this:
main
A
bar
foo
foo
... total 10 times for 10 seconds
B
bar
foo
foo
...
...
J
...
(finished)
How long does it all take? 100 seconds, obviously.
Now let's look at profiling strategies.
Stack samples (like say 1000 samples) are taken at uniform intervals.
Is there any self time? Yes. foo takes 100% of the self time.
It's a genuine "hot spot".
Does that help you find the bottleneck? No. Because it is not in foo.
What is the hot path? Well, the stack samples look like this:
main -> A -> bar -> foo (100 samples, or 10%)
main -> B -> bar -> foo (100 samples, or 10%)
...
main -> J -> bar -> foo (100 samples, or 10%)
There are 10 hot paths, and none of them look big enough to gain you much speedup.
IF YOU HAPPEN TO GUESS, and IF THE PROFILER ALLOWS, you could make bar the "root" of your call tree. Then you would see this:
bar -> foo (1000 samples, or 100%)
Then you would know that foo and bar were each independently responsible for 100% of the time and therefore are places to look for optimization.
You look at foo, but of course you know the problem isn't there.
Then you look at bar and you see the 10 calls to foo, and you see that 9 of them are unnecessary. Problem solved.
IF YOU DIDN'T HAPPEN TO GUESS, and instead the profiler simply showed you the percent of samples containing each routine, you would see this:
main 100%
bar 100%
foo 100%
A 10%
B 10%
...
J 10%
That tells you to look at main, bar, and foo. You see that main and foo are innocent. You look at where bar calls foo and you see the problem, so it's solved.
It's even clearer if in addition to showing you the functions, you can be shown the lines where the functions are called. That way, you can find the problem no matter how large the functions are in terms of source text.
NOW, let's change foo so that it does sleep(oneSecond) rather than be CPU bound. How does that change things?
What it means is it still takes 100 seconds by the wall clock, but the CPU time is zero. Sampling in a CPU-only sampler will show nothing.
So now you are told to try instrumentation instead of sampling. Contained among all the things it tells you, it also tells you the percentages shown above, so in this case you could find the problem, assuming bar was not very big. (There may be reasons to write small functions, but should satisfying the profiler be one of them?)
Actually, the main thing wrong with the sampler was that it can't sample during sleep (or I/O or other blocking), and it doesn't show you code line percents, only function percents.
By the way, 1000 samples gives you nice precise-looking percents. Suppose you took fewer samples. How many do you actually need to find the bottleneck? Well, since the bottleneck is on the stack 90% of the time, if you took only 10 samples, it would be on about 9 of them, so you'd still see it.
If you even took as few as 3 samples, the probability it would appear on two or more of them is 97.2%.**
High sample rates are way overrated, when your goal is to find bottlenecks.
Anyway, that's why I rely on random-pausing.
** How did I get 97.2 percent? Think of it as tossing a coin 3 times, a very unfair coin, where "1" means seeing the bottleneck. There are 8 possibilities:
samples   #1s   probability
0 0 0      0    0.1^3 * 0.9^0 = 0.001
0 0 1      1    0.1^2 * 0.9^1 = 0.009
0 1 0      1    0.1^2 * 0.9^1 = 0.009
0 1 1      2    0.1^1 * 0.9^2 = 0.081
1 0 0      1    0.1^2 * 0.9^1 = 0.009
1 0 1      2    0.1^1 * 0.9^2 = 0.081
1 1 0      2    0.1^1 * 0.9^2 = 0.081
1 1 1      3    0.1^0 * 0.9^3 = 0.729
so the probability of seeing it 2 or 3 times is .081*3 + .729 = .972
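The same arithmetic as a quick C# check, where p = 0.9 is the chance that a single sample lands on the bottleneck:
double p = 0.9;
// P(at least 2 of 3 samples hit) = C(3,2) * p^2 * (1 - p) + p^3
double atLeastTwo = 3 * p * p * (1 - p) + p * p * p;
Console.WriteLine(atLeastTwo);   // prints 0.972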

Methodology for saving values over time

I have a task, which I know how to code (in C#),
but I know a simple implementation will not meet ALL my needs.
So, I am looking for tricks which might meet ALL my needs.
I am writing a simulation involving N number of entities interacting over time.
N will start at around 30 and move in to many thousands.
a. The number of entities will change during the course of the simulation.
b. I expect this will require each entity to have its own trace file.
Each entity has a minimum of 20 parameters, and up to millions, that I want to track over time.
a. This will most likely require that we can't keep all values in memory at all times; some subset should be fine.
b. The number of parameters per entity will initially be fixed, but I can think of some tests which would have the number of parameters slowly changing over time.
Simulation will last for millions of time steps and I need to keep every value for every parameter.
What I will be using these traces for:
a. Plotting a subset (configurable) of the parameters for a fixed amount of time from the current time step to the past.
i. Normally on the order of 300 time steps.
ii. These plots are in real time while the simulation is running.
b. I will be using these traces to re-play the simulation, so I need to quickly access all the parameters at a given time step so I can quickly move to different times in the simulation.
i. This requires the values be stored in a file(s) which can be inspected/loaded after restarting the software.
ii. Using a database is NOT an option.
c. I will be using the parameters for follow up analysis which I can’t define up front so a more flexible system is desirable.
My initial thought:
One class per entity which holds all the parameters.
Backed by a memory mapped file.
Only a fixed, but moving, amount of the file is mapped to main memory
A second memory mapped file which holds time indexes to main file for quicker access during re-playing of simulation. This may be very important because each entity file will represent a different time slice of the full simulation.
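For what it's worth, a minimal sketch of that memory-mapped idea using System.IO.MemoryMappedFiles; the file name, capacity and window bounds are placeholders:
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class TraceWindow
{
    static void Main()
    {
        const long capacity = 1L << 30;         // placeholder: 1 GB trace file for one entity
        const long windowOffset = 0;            // placeholder: start of the currently mapped slice
        const long windowLength = 64 * 1024;    // placeholder: map only 64 KB at a time

        using (var mmf = MemoryMappedFile.CreateFromFile(
            "entity42.trace", FileMode.OpenOrCreate, "entity42", capacity))
        using (var view = mmf.CreateViewAccessor(windowOffset, windowLength))
        {
            view.Write(0, 3.14);                 // write a parameter value at slot 0
            double first = view.ReadDouble(0);   // read it back from the mapped window
            Console.WriteLine(first);
        }
    }
}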
I would start with SQLite. SQLite is like a binary file format with a library that lets you query it conveniently and quickly. It is not really like a database server, in that you can run it on any machine, with no installation whatsoever.
I strongly recommend against XML, given the requirement of millions of steps, potentially with millions parameters.
EDIT: Given the sheer amount of data involved, SQLite may well end up being too slow for you. Don't get me wrong, SQLite is very fast, but it won't beat seeks & reads, and it looks like your use case is such that basic binary IO is rather appropriate.
If you go with the binary IO method you should expect some moderately involved coding, and the absence of such niceties as your file staying in a consistent state if the application dies halfway through (unless you code this specifically that is).
KISS -- just write a logfile for each entity, and at each time slice write out every parameter in a specified order (so you don't double the size of the logfile by adding parameter names); a sketch follows at the end of this answer. You can have a header in each logfile if you want to specify the parameter names of each column and the identity of the entity.
If there are many parameter values that will remain fixed or slowly changing during the course of the simulation, you can write these out to another file that encodes only changes to parameter values rather than every value at each time slice.
You should probably synchronize the logging so that each log entry is written out with the same time value. Rather than coordinate through a central file, just make the first value in each line of the file the time value.
Forget about database - too slow and too much overhead for simulation replay purposes. For replaying of a simulation, you will simply need sequential access to each time slice, which is most efficiently and fastest implemented by simply reading in the lines of the files one by one.
For the same reason - speed and space efficiency - forget XML.
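For illustration only, a bare-bones version of the per-entity log line described above (file and variable names are made up):
// using System.IO; using System.Globalization;
// timeStep is the current simulation time; parameters holds this entity's values at that step.
using (var writer = new StreamWriter("entity42.log", append: true))
{
    writer.Write(timeStep.ToString(CultureInfo.InvariantCulture));  // time value first on each line
    foreach (double value in parameters)                            // fixed parameter order, no names
        writer.Write("," + value.ToString(CultureInfo.InvariantCulture));
    writer.WriteLine();
}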
Just for the memory part...
1. You can save the data as an XElement (sorry for not knowing much about LINQ), which holds the XML logic.
2. Hold a record counter; after n records, save the XElement to an XML file (data1.xml, ... dataN.xml).
It can be a perfect log for any parameter you have, with any logic you like:
<run>
    <step id="1">
        <param1 />
        <param2 />
        <param3 />
    </step>
    .
    .
    .
    <step id="N">
        <param1 />
        <param2 />
        <param3 />
    </step>
</run>
This way your memory stays free and the data is relatively free.
You don't have to think too much about DB issues, and it's pretty amazing what LINQ can do for you... just open the current XML log file...
Here is what I am doing now:
int bw = 0;

private void timer1_Tick(object sender, EventArgs e)
{
    bw = Convert.ToInt32(lblBytesReceived.Text) - bw;
    SqlCommand comnd = new SqlCommand("insert into tablee (bandwidthh, timee) values (" + bw.ToString() + ", @timee)", conn);
    conn.Open();
    comnd.Parameters.Add("@timee", System.Data.SqlDbType.Time).Value = DateTime.Now.TimeOfDay;
    comnd.ExecuteNonQuery();
    conn.Close();
}
