C# Updating object references and multithreading

After reading so much about how to do it, I'm quite confused.
So here is what I want to do:
I have a data structure/object that holds all kinds of information. I treat the data structure as if it were immutable. Whenever I need to update information, I make a deep copy and apply the changes to it. Then I swap the old and the newly created object.
Now I don't know how to do everything right.
Let's look at it from the side of the reader/consumer threads.
MyObj temp = dataSource;
var a = temp.a;
... // many instructions
var b = temp.b;
....
As I understand it, reading references is atomic. So I don't need any volatile or locking to assign the current reference of dataSource to temp. But what about garbage collection? As I understand it, the GC has some kind of reference counter to know when to free memory. So when another thread updates dataSource at exactly the moment dataSource is assigned to temp, does the GC increase the counter on the right memory block?
The other thing is compiler/CLR optimization. I assign dataSource to temp and use temp to access data members. What does the CLR do? Does it really make a copy of the reference in dataSource, or does the optimizer just use dataSource to access .a and .b? Let's assume that between temp.a and temp.b there are lots of instructions, so that the reference in temp/dataSource cannot be held in a CPU register. Is temp.b then really temp.b, or is it optimized to dataSource.b because the copy to temp could be optimized away? This is especially important if another thread updates dataSource to point to a new object.
Do I really need volatile, lock, ReaderWriterLockSlim, Thread.MemoryBarrier or something else?
The important thing to me is that temp.a and temp.b access the old data structure even when another thread updates dataSource to point to another, newly created data structure. I never change data inside an existing structure. Updates are always done by creating a copy, updating the data, and then updating the reference to point to the new copy of the data structure.
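To make that concrete, here is a minimal sketch of the pattern I mean (the Snapshot and DataSourceHolder names are invented for illustration; Volatile.Read/Volatile.Write are one way to publish the reference across threads):
using System.Threading;

class Snapshot // treated as immutable once published
{
    public readonly int a;
    public readonly int b;
    public Snapshot(int a, int b) { this.a = a; this.b = b; }
}

class DataSourceHolder
{
    private Snapshot dataSource = new Snapshot(0, 0);

    // Reader: capture the reference once, then use only the local copy.
    public void Consume()
    {
        Snapshot temp = Volatile.Read(ref dataSource);
        int a = temp.a;
        // ... many instructions ...
        int b = temp.b; // guaranteed to come from the same object as a
    }

    // Writer: build a new object, then publish it with a single reference write.
    public void Update(int a, int b)
    {
        Volatile.Write(ref dataSource, new Snapshot(a, b));
    }
}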
Maybe one more question. If I don't use volatile, how long does it take until all cores on all CPUs see the updated reference?
When it comes to volatile please have a look here: When should the volatile keyword be used in C#?
I have done a little test program:
namespace test1 {
    public partial class Form1 : Form {
        public Form1() { InitializeComponent(); }

        Something sharedObj = new Something();

        private void button1_Click(object sender, EventArgs e) {
            Thread t = new Thread(Do); // Kick off a new thread
            t.Start();                 // running Do()
            for (int i = 0; i < 1000; i++) {
                Something reference = sharedObj;
                int x = reference.x; // sharedObj.x;
                System.Threading.Thread.Sleep(1);
                int y = reference.y; // sharedObj.y;
                if (x != y) {
                    button1.Text = x.ToString() + "/" + y.ToString();
                    Update();
                }
            }
        }

        private void Do() {
            for (int i = 0; i < 1000000; i++) {
                Something someNewObject = sharedObj.Clone(); // clone from immutable
                someNewObject.Do();
                sharedObj = someNewObject; // atomic reference write
            }
        }
    }

    class Something {
        public Something Clone() { return (Something)MemberwiseClone(); }
        public void Do() { x++; System.Threading.Thread.Sleep(0); y++; }
        public int x = 0;
        public int y = 0;
    }
}
In button1_Click there is a for-loop, and inside the for-loop I access the data structure/object once directly through sharedObj and once through the temporarily created local reference. Using the local reference is enough to make sure that x and y are initialized with values from the same object.
The only thing I don't understand is: why is "Something reference = sharedObj;" not optimized away, and "int x = reference.x;" not replaced by "int x = sharedObj.x;"?
How do the compiler and the jitter know not to optimize this? Or are temporary locals never optimized away in C#?
But most importantly: is my example running as intended because it is correct, or is it running as intended only by accident?

As I understand reading references is atomic.
Correct. This is a very limited property though. It means reading a reference will work; you'll never get the bits of half an old reference mixed with the bits of half a new reference resulting in a reference that doesn't work. If there's a concurrent change it promises nothing about whether you get the old or the new reference (what would such a promise even mean?)
So I don't need any volatile or locking to assign the current reference of dataSource to temp.
Maybe, though there are cases where this can have problems.
But what about the Garbage Collection. As I understand the GC has some kind of reference counter to know when to free memory.
Incorrect. There is no reference counting in .NET garbage collection.
If there is a static reference to an object, then it is not eligible for reclamation.
If there is an active local reference to an object, then it is not eligible for reclamation.
If there is a reference to an object in a field of an object that is not eligible for reclamation, then it too is not eligible for reclamation, recursively.
There's no counting here. Either there is an active strong reference prohibiting reclamation, or there isn't.
This has a great many very important implications. Of relevance here is that there can never be any incorrect reference counting, since there is no reference counting. Strong references will not disappear under your feet.
The other thing is compiler/CLR optimization. I assign dataSource to temp and use temp to access data members. What does the CLR do? Does it really make a copy of dataSource or does the optimizer just use dataSource to access .a and .b?
That depends on what dataSource and temp are as far as whether they are local or not, and how they are used.
If dataSource and temp are both local, then it is perfectly possible that either the compiler or the jitter would optimise the assignment away. If they are both local though, they are both local to the same thread, so this isn't going to impact multi-threaded use.
If dataSource is a field (static or instance), then since temp is definitely a local in the code shown (because it's initialised in the code fragment shown), the assignment cannot be optimised away. For one thing, grabbing a local copy of a field is itself a possible optimisation: it is faster to do several operations on a local reference than to continually access a field or static. There's not much point having a compiler or jitter "optimisation" that just makes things slower.
Consider what actually happens if you were to not use temp:
var a = dataSource.a;
... // many instructions
var b = dataSource.b;
To access dataSource.a the code must first obtain a reference to dataSource and then access a. Afterwards it obtains a reference to dataSource again and then accesses b.
Optimising by not using a local makes no sense, since there's going to be an implicit local anyway.
And there is the simple fact that the scenario you fear has been considered: after temp = dataSource there is no assumption that temp == dataSource, because there could be other threads changing dataSource, so it's not valid to make optimisations predicated on temp == dataSource.*
Really the optimisations you are concerned about are either not relevant or not valid and hence not going to happen.
There is a case that could cause you problems, though. It is just about possible for a thread running on one core to not see a change to dataSource made by a thread running on another core. As such, if you have:
/* Thread A */
dataSource = null;
/* Some time has passed */
/* Thread B */
var isNull = dataSource == null;
Then there's no guarantee that, just because Thread A has finished setting dataSource to null, Thread B will see the change. Ever.
The memory models in use in .NET itself and in the processors .NET generally runs on (x86 and x86-64) would prevent that happening, but in terms of possible future optimisations, this is something that's possible. You need memory barriers to ensure that Thread A's publishing definitely affects Thread B's reading. lock and volatile are both ways to ensure that.
*One doesn't even need to be multi-threaded for this to not follow, though it is possible to prove in particular cases that there are no single-thread changes that would break that assumption. That doesn't really matter though, because the multi-threaded case still applies.
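For illustration, a hedged sketch of the volatile approach to the visibility problem above (the surrounding class is invented; only the dataSource field matches the fragments in the answer):
class MyObj { }

class Holder
{
    // volatile ensures a write by one thread becomes visible
    // to subsequent reads by other threads.
    private volatile MyObj dataSource = new MyObj();

    public void Clear() { dataSource = null; }              // Thread A
    public bool IsCleared() { return dataSource == null; }  // Thread B
}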

Related

Why isn't the compiler optimizing away this code?

I have code using a third-party tool that iterates over a collection of points.
for (int i = 0; i < pcoll.PointCount; i++) { /* ... */ }
When profiling via dotTrace I noticed that the PointCount property is accessed on every iteration.
I expected the value of this property to be cached by the compiler, but obviously that doesn't happen. Maybe this is actually a problem within the COM-based third-party lib, or within dotTrace itself when collecting the information.
I'm not sure whether this topic would fit better on GIS.StackExchange. However, maybe someone has an idea under which circumstances the optimization won't take place, or how it might happen.
Simply put, how is the compiler to know whether pcoll.PointCount will change between invocations? It can't safely make the assumption that the value will remain unchanged, so it can't optimise this code by caching the value of the first call to pcoll.PointCount.
It may have changed in the meantime.
Indeed, one of the reasons to test i < pcoll.PointCount every iteration rather than just using foreach(var point in pcoll) is precisely because you think the collection might change in the meantime, and enumerators don't guarantee to cope with changes to the collection they enumerate.
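For example, List<T>'s enumerator actively rejects such changes (a small illustration, not code from the question):
var points = new List<int> { 1, 2, 3 };
foreach (var p in points)
{
    points.RemoveAt(0); // throws InvalidOperationException on the next MoveNext():
                        // the collection was modified during enumeration
}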
This differs from, for example, an array accessed through a local variable, because the only way the Length of an array accessed through a local variable can change, is if the change is made locally.
Even there though, it's worth remembering that the compiler often skips some obvious optimisations because it's known that the jitter makes the same optimisation too.
The expected optimization works for fields. But a property has a setter/getter (accessing a property is in fact calling those methods), so the compiler will have a hard time trying to optimize it away.
To fix it, make it a field, or read it once:
var max = pcoll.PointCount;
for (int i = 0; i < max; i++) { /* ... */ }

Performance of List<struct> vs List<class>

Out of curiosity, I was trying to test the performance of List<T> using both value and reference types.
The results were not as I expected, leading me to believe my understanding of how these objects are laid out in memory might not be correct.
This was my experiment:
Create a basic class containing just two members, an int and a bool
Create 2 List<T> objects to hold my test classes (List1 and List2)
Randomly generate test objects and add them to List1 and List2 alternately
Time how long it takes to iterate through List1 (doing some arbitrary work such as incrementing a counter and then accessing the element)
I then repeated with a struct in place of a class
My assumptions were that when using a class, the references held in the List<T> would be contiguous, but because of how I created them (switching between adding to List1 and List2), the objects they point to probably wouldn't be.
I thought that when using a struct, because it is a value type, the objects themselves would be held contiguously in memory (since the List<T> holds the actual items rather than a collection of references).
Because of this, I expected struct to perform better (due to prefetchers etc..)
In actual fact, both were very similar.
What's going on here?
Edit - Added code to actually access the element in the iterator, code sample included
Test class (or struct)
public class /* or struct */ TestClass
{
    public int TestInt;
    public bool TestBool;
}
Creating random Lists:
var list1 = new List<TestClass>();
var list2 = new List<TestClass>();
var toggle = false;
for (var i = 0; i < 4000000; i++)
{
    // Random object generation removed for simplicity
    if (toggle)
        list1.Add(randomObject);
    else
        list2.Add(randomObject);
    toggle = !toggle;
}
Testing:
var stopwatch = new Stopwatch();
var counter = 0;
var testBool = false;
stopwatch.Start();
foreach (var item in list1)
{
    // Access the element
    testBool = item.TestBool;
    counter++;
}
stopwatch.Stop();
Repeat with TestClass as both a class and a struct.
I realise there isn't much difference between the two, but I expected struct to perform significantly better than class.
// Access the element
testBool = item.TestBool;
That has no effect; the optimizer will remove the statement since it has no useful side effects. You are not actually measuring the difference between a struct and a class, since you never really use the element.
counter++;
Same story: pretty likely to be optimized away, unless you actually use the counter value after the loop completes. Having the optimizer remove too much code and make the test meaningless is a common micro-benchmark hazard. A workaround would be:
foreach (var item in list1)
{
    // Access the element
    counter += item.TestInt;
}
Console.WriteLine(counter);
Benchmark guidelines are:
Only profile code produced by the Release configuration. The Debug build produces too much extra code and suppresses optimization
Tools + Options, Debugging, General, untick the "Suppress JIT optimization on module load". This ensures that you get optimized code even if you run with the debugger
Debug + Windows + Disassembly is a very important debugger window to show you what code really runs. Having some understanding of machine code is required to interpret that window correctly
Very important to put an outer loop around the test code to ensure that you run the test at least 10 times. This removes cold start effects, like the processor having to fill the L1 instruction cache and the jitter having to load the IL from the assembly and compile it the first time it executes. And removes random outliers you'll get from having to compete with other processes that run on the machine and also compete for the processor.
Differences of 15% are not statistically significant.
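Putting those guidelines together, a minimal benchmark shape might look like this (a sketch reusing the question's list1 and TestInt; the result is printed so the loop cannot be removed as dead code):
var stopwatch = new Stopwatch();
for (int run = 0; run < 10; run++) // outer loop removes cold-start effects
{
    long counter = 0;
    stopwatch.Restart();
    foreach (var item in list1)
    {
        counter += item.TestInt; // the result feeds the sum, so it survives optimization
    }
    stopwatch.Stop();
    Console.WriteLine("run {0}: {1} ms, counter = {2}",
        run, stopwatch.ElapsedMilliseconds, counter);
}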
If you aren't actually accessing members of the class objects stored in your list, then the following two types should provide equivalent performance for iteration.
List<IntPtr>
List<object>
Even though the reference type instances aren't filling a contiguous section of memory, the references themselves are.
The exception to the above case would be if the CLR compressed pointers when executing 64-bit applications with less than 32 GiB of memory. This strategy is documented as Compressed Oops in the JVM. However, the x86-64 instruction set includes instructions that allow this compression/decompression to be performed extremely efficiently, so even in this case you should see performance similar to List<int>.
Things get interesting when your value types exceed the size of a pointer (IntPtr.Size). After that point, the performance of a List<T> containing references should quickly surpass the performance of a List<T> of value types. This is due to the fact that regardless of how big your reference type instance is, the reference to that instance is at most IntPtr.Size.
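For instance, with the illustrative types below (not from the question), every element read from a List<BigValue> copies 32 bytes of data, while a List<BigRef> element read copies only an 8-byte reference:
struct BigValue // 32 bytes of data: copied in full on each element access
{
    public long A, B, C, D;
}

class BigRef    // same data, but the list stores only 8-byte references to it
{
    public long A, B, C, D;
}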

C# foreach statement with an Add is super slow for me?

I wonder if anyone could explain what is going on with this weird little optimisation in our draw code. We replaced the first bit of code with the second and got a huge speed increase (4400 ticks -> 15 ticks using the Stopwatch class).
// Add all the visible sprites to the list
m_renderOrder.Clear();
foreach (CSpriteInternalData sprite in m_InternalData)
{
    if (!sprite.m_bRender) continue;
    m_renderOrder.Add(sprite);
}
Replaced with...
// Add all the visible sprites to the list
m_renderOrder.Clear();
renderOrderCount = 0;
for (int i = 0; i < m_numSprites; i++)
{
    if (m_InternalData[i].m_bRender)
        m_renderOrder[renderOrderCount++] = m_InternalData[i];
}
It looks to be the simplest little change for such a huge increase in speed. Can anyone help?
If CSpriteInternalData is a struct, i.e. a value type, then each time you assign a value of that type to a variable, a copy is made.
MyStruct a = new MyStruct(50);
MyStruct b = a;             // a is copied to b
a.Value = 10;
Console.WriteLine(b.Value); // still 50; b has a separate copy of the value
If structs are small and cheap to copy, that is not much of a problem, but if the structs are large, the copies get slow. foreach declares a variable that is repeatedly assigned a value from the collection, so if CSpriteInternalData is a struct, each element is in turn copied into the sprite variable, and that can take time.
Also, the line where you Add the item to the m_renderOrder collection makes another copy of the structure, but I guess only a few of the elements have the m_bRender flag set, so that one does not take too much time.
If that is the cause of the slowdown/speedup, I would wholeheartedly recommend that you change CSpriteInternalData to a class. That would give it reference behaviour, copying just references around instead of whole structures.
foreach always creates an instance of an enumerator, returned by the GetEnumerator method, and that enumerator keeps state throughout the lifecycle of the foreach loop.
It then repeatedly calls MoveNext() on the enumerator, reads Current, and runs your code for each element returned.
You can write your own enumerator if you need to, but it is usually better to use a for loop when execution time matters.
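Roughly, the compiler expands the foreach over m_InternalData into something like the following (a sketch, assuming m_InternalData is a List<CSpriteInternalData>; the generated code uses the struct List<T>.Enumerator):
var e = m_InternalData.GetEnumerator(); // List<T>.Enumerator, a struct
try
{
    while (e.MoveNext())
    {
        CSpriteInternalData sprite = e.Current; // for a struct element, this is a full copy
        // ... loop body ...
    }
}
finally
{
    e.Dispose();
}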
The foreach loop has a slightly different purpose. It is meant for iterating through a collection that implements IEnumerable. Its performance is often slower than an indexed for loop.
For reference, see http://www.codeproject.com/Articles/6759/FOREACH-Vs-FOR-C

Use if statements to micromanage stack

Should if statements be used to assist in the stack's memory de-allocation?
Example A:
var objectHolder = new ObjectHolder();
if (true)
{
    List<DefinedObject> objectList;
    using (var sr = new GenericStreamReader<DefinedObject>())
    {
        objectList = sr.Get().ToList();
    }
    if (true)
    {
        var DOF = new DefinedObjectFactory();
        objectHolder.DefinedObjects = DOF.DefineObjects(objectList);
    }
}
//example endpoint
Example B:
var objectHolder = new ObjectHolder();
List<DefinedObject> objectList;
using (var sr = new GenericStreamReader<DefinedObject>())
{
    objectList = sr.Get().ToList();
}
var DOF = new DefinedObjectFactory();
objectHolder.DefinedObjects = DOF.DefineObjects(objectList);
//example endpoint
Will Example A have a lighter footprint on the stack when the example endpoint is reached than Example B does?
First off, the whole point of a stack-based allocation system is precisely that you do not need to optimize it in any way. Don't worry about it. The jitter is perfectly capable of realizing that a local will never be read or written again, and re-using its storage if it feels that's the best thing to do. Let the jitter do its job; it doesn't need your help. (*)
Rather, write your program so that local variables make sense to the reader. That's what you should be optimizing for.
Finally, there is never a need to say "if (true) { }" to introduce a new scope. Just introduce a new scope. It is perfectly legal to say:
void M()
{
    { // new scope
    }
    { // another one
    }
}
(*) There is a situation where the jitter needs your help, and that is the situation where a local refers to an object on the heap that contains a resource that is going to be used by unmanaged code. The jitter does not know that unmanaged code is going to use the object's resources, and might decide that no one is using this object any more and clean it up early. The finalizer of the object might then release the resource on the finalizer thread while the unmanaged code is using the resource! An object is not guaranteed to stay alive just because a local variable is holding onto it. If the local variable is never read from again then the jitter can re-use its storage and tell the garbage collector that it is safe to collect the referred-to object, which will then potentially crash the unmanaged code. You can use KeepAlive to hint to the jitter that a particular object needs to remain alive and not be optimized away.
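A hedged sketch of that hint (ResourceWrapper, its Handle property, and the native call are all invented for illustration; GC.KeepAlive is the real API):
var wrapper = new ResourceWrapper();  // finalizable object owning a native resource
IntPtr handle = wrapper.Handle;       // hypothetical: raw handle used by native code
NativeMethods.UseResource(handle);    // unmanaged code uses the resource
GC.KeepAlive(wrapper);                // wrapper stays alive at least until here,
                                      // so its finalizer can't release the handle mid-call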
if(true) will be compiled out in an optimized build (the only kind of build where a variable's lifespan is shorter than the whole method), so there is absolutely no difference between the two versions you've suggested.
Based on the usage, I'm assuming DefinedObjectFactory is a class, not a struct. Therefore, the only thing that's on the stack is a reference to DefinedObjectFactory. The actual object is on the heap, and is controlled by the garbage collector.
The only stack space you're potentially saving is the space for a single pointer, so it's not worth it.
Even if this does make some difference, I think it's very likely that you're worrying about the wrong things. Is stack space allocation really an issue for your app?
In general, doing "clever" things in code for the sake of micro-optimizations is usually not worth it. It's usually a much better idea to write your code in the most clean and straightforward way possible. After doing that, if you find you actually have some perf/scalability problems (based on doing actual measurements), you can choose to rewrite/optimize the parts that are bottlenecks.
Most times you'll find that the clean/straightforward/readable version of the code performs just fine. And if it doesn't, the problems are probably not in the places you thought they were.
There's only one thing you can guarantee about de-allocation and garbage collection - it will happen at some stage.
As other people have said, the only thing your if(true) will achieve is being optimised out by the Jitter.
You're already using a using(..) { } pattern, so instead of the if(true) block I'd refactor your code from this:
if (true)
{
    var DOF = new DefinedObjectFactory();
    objectHolder.DefinedObjects = DOF.DefineObjects(objectList);
}
To:
using (var DOF = new DefinedObjectFactory()) // requires DefinedObjectFactory : IDisposable
{
    objectHolder.DefinedObjects = DOF.DefineObjects(objectList);
}
And see if that helps.
There is one other thing you can try, though it's absolutely not recommended for production code, as you shouldn't pre-empt the memory manager: simply add a call to
GC.Collect();
when you exit your blocks.
I don't think it'll help but it might demonstrate to you why it's generally not worth worrying about scope and de-allocation.

Avoiding array duplication

According to [MSDN: Array usage guidelines](http://msdn.microsoft.com/en-us/library/k2604h5s(VS.71).aspx):
Array Valued Properties
You should use collections to avoid code inefficiencies. In the following code example, each call to the myObj property creates a copy of the array. As a result, 2n+1 copies of the array will be created in the following loop.
[Visual Basic]
Dim i As Integer
For i = 0 To obj.myObj.Count - 1
    DoSomething(obj.myObj(i))
Next i
[C#]
for (int i = 0; i < obj.myObj.Count; i++)
    DoSomething(obj.myObj[i]);
Other than changing from myObj[] to ICollection myObj, what else would you recommend? I just realized that my current app is leaking memory :(
Thanks;
EDIT: Would forcing C# to pass references w/ ref (safety aside) improve performance and/or memory usage?
No, it isn't leaking memory - it is just making the garbage collector work harder than it might. Actually, the MSDN article is slightly misleading: if the property created a new collection every time it was called, it would be just as bad (memory wise) as with an array. Perhaps worse, due to the usual over-sizing of most collection implementations.
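For example, a property getter like this does work on every access, whether it returns an array copy or builds a fresh collection (an illustrative sketch, not code from the question):
class Owner
{
    private readonly int[] values = new int[100];

    public int[] MyObj
    {
        get { return (int[])values.Clone(); } // allocates and copies on every access
    }
}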
If you know a method/property does work, you can always minimise the number of calls:
var arr = obj.myObj; // var since I don't know the type!
for (int i = 0; i < arr.Length; i++) {
    DoSomething(arr[i]);
}
or even easier, use foreach:
foreach (var value in obj.myObj) {
    DoSomething(value);
}
Both approaches only call the property once. The second is clearer IMO.
Other thoughts: name it as a method, i.e. obj.SomeMethod(). This sets the expectation that it does work, and avoids the undesirable obj.Foo != obj.Foo (which would be the case for arrays).
Finally, Eric Lippert has a good article on this subject.
Just as a hint for those who haven't used the ReadOnlyCollection mentioned in some of the answers:
[C#]
class XY
{
    private X[] array;

    public ReadOnlyCollection<X> myObj
    {
        get
        {
            // Wraps the array without copying it; note that the wrapper
            // itself is allocated on each call.
            return Array.AsReadOnly(array);
        }
    }
}
Hope this might help.
Whenever I have properties that are costly (like recreating a collection on each call), I either document the property, stating that each call incurs a cost, or cache the value in a private field. Costly property getters should be written as methods.
Generally, I try to expose collections as IEnumerable rather than arrays, forcing the consumer to use foreach (or an enumerator).
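For example (a sketch along the lines of the XY class above; using an iterator means callers cannot cast the result back to the underlying array):
class XY
{
    private X[] array;

    public IEnumerable<X> Items
    {
        get
        {
            // Yielding each element hands out values one at a time
            // without exposing the array itself.
            foreach (var item in array)
                yield return item;
        }
    }
}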
It will not make copies of the array unless you make it do so. However, simply handing out the reference to an array privately owned by an object has some nasty side effects. Whoever receives the reference is basically free to do whatever they like with the array, including altering its contents in ways that cannot be controlled by its owner.
One way of preventing unauthorized meddling with the array is to return a copy of the contents. Another (slightly better) is to return a read-only collection.
Still, before doing any of these things, you should ask yourself whether you are about to give away too much information. In some cases (actually, quite often) it is even better to keep the array private and instead provide methods that operate on the object owning it.
myObj will not create a new item unless you explicitly create one. So, for better memory usage, I recommend using a private collection (List or similar) and exposing an indexer that returns the specified value from the private collection.
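A small sketch of that suggestion (the ItemStore and MyItem names are invented):
class ItemStore
{
    private readonly List<MyItem> items = new List<MyItem>();

    // Hand out elements one at a time instead of exposing the collection.
    public MyItem this[int index]
    {
        get { return items[index]; }
    }

    public int Count
    {
        get { return items.Count; }
    }
}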
