Everyone, recently I was debugging a program to improve its performance, and I noticed an interesting thing about assignment performance. The code below is my test code.
CODE A
public class Word { .... }

public class Chunk
{
    private Word[] _items;
    private int _size;

    public Chunk()
    {
        _items = new Word[3];
    }

    public void Add(Word word)
    {
        _items[_size++] = word;
    }
}
Main:

Chunk chunk = new Chunk();
for (int i = 0; i < 3; i++)
{
    chunk.Add(new Word());
}
CODE B
public class Chunk
{
    private Word[] _items;
    private int _size;

    public Chunk()
    {
        _items = new Word[3];
    }

    public Word[] Words
    {
        get { return _items; }
    }

    public int Size
    {
        get { return _size; }
        set { _size = value; }
    }
}
Main:

Chunk chunk = new Chunk();
for (int i = 0; i < 3; i++)
{
    chunk.Words[i] = new Word();
    chunk.Size += 1;
}
In my test with Visual Studio's profiling tool, calling the Main method 32,000 times, the profile shows that CODE B is FASTER than CODE A. Why is CODE B faster than CODE A? Can anyone give me a suggestion?
Thanks.
Update: sorry, I forgot the code that increments _size in CODE B; I have updated CODE B.
Update: @Shiv Kuma Yes, Code A performs similarly to Code B at around 30,000 calls. I tested with a 700K file, and the code gets called 29,000 times or so.
Meanwhile, Code B is about 100 milliseconds faster than Code A, and Code B is actually much better in the real workload.
One more thing I'm wondering: why is Code B faster than Code A even though both perform the same assignment?
Anyway, thanks for your reply.
Three reasons I can think of:
Chunk.Add() is a method call, and a method call is always more expensive than the same code running inline.
There are two increments in the first code sample (_size++ and i++).
The chunk.Words array might be cached locally (second example), so chunk._items (first example) need not be re-evaluated every time Add is called; see the sketch below.
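To illustrate that third point, here is a minimal sketch (my own illustration, not the OP's code) of hoisting the property access out of the loop so the getter runs only once:

// Sketch: evaluate chunk.Words once, mirroring what CODE B effectively does.
Chunk chunk = new Chunk();
Word[] words = chunk.Words;   // property getter runs once, outside the loop
for (int i = 0; i < 3; i++)
{
    words[i] = new Word();    // plain array store, no method call per element
}
chunk.Size = 3;               // one write instead of three increments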
In CODE A you are incrementing twice. Once in your for loop:
for (int i = 0; i < 3; i++)
And once in your method:
_items[_size++] = word;
In CODE B you are only incrementing once in the for loop.
It isn't much, but it would certainly contribute to the performance difference.
Yes, the method call would also add a small amount of overhead.
Related
I'm creating a C# console app. I have some critical paths and thought that creating structs would be faster than creating classes, since structs don't need garbage collection. In my test, however, I found the opposite.
In the test below, I create 1000 structs and 1000 classes.
using System;
using System.Collections.Generic;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        int iterations = 1000;

        Stopwatch sw = new Stopwatch();
        sw.Start();
        List<Struct22> structures = new List<Struct22>();
        for (int i = 0; i < iterations; ++i)
        {
            structures.Add(new Struct22());
        }
        sw.Stop();
        Console.WriteLine($"Struct creation consumed {sw.ElapsedTicks} ticks");

        Stopwatch sw2 = new Stopwatch();
        sw2.Start();
        List<Class33> classes = new List<Class33>();
        for (int i = 0; i < iterations; ++i)
        {
            classes.Add(new Class33());
        }
        sw2.Stop();
        Console.WriteLine($"Class creation consumed {sw2.ElapsedTicks} ticks");

        Console.ReadLine();
    }
}
My class / struct are simple:
class Class33
{
    public int Property { get; set; }
    public int Field;
    public void Method() { }
}

struct Struct22
{
    public int Property { get; set; }
    public int Field;
    public void Method() { }
}
Results (drum roll please...)
Struct creation consumed 3038 ticks
Class creation consumed 404 ticks
So the question is: why does it take close to 10x as long for a Struct as it does for a Class?
EDIT: I made the program "do something" by assigning integers to the properties.
static void Main(string[] args)
{
    int iterations = 10000000;

    Stopwatch sw = new Stopwatch();
    sw.Start();
    List<Struct22> structures = new List<Struct22>();
    for (int i = 0; i < iterations; ++i)
    {
        Struct22 s = new Struct22()
        {
            Property = 2,
            Field = 3
        };
        structures.Add(s);
    }
    sw.Stop();
    Console.WriteLine($"Struct creation consumed {sw.ElapsedTicks} ticks");

    Stopwatch sw2 = new Stopwatch();
    sw2.Start();
    List<Class33> classes = new List<Class33>();
    for (int i = 0; i < iterations; ++i)
    {
        Class33 c = new Class33()
        {
            Property = 2,
            Field = 3
        };
        classes.Add(c);
    }
    sw2.Stop();
    Console.WriteLine($"Class creation consumed {sw2.ElapsedTicks} ticks");

    Console.ReadLine();
}
and the result is astounding to me. Classes are now at least 2x slower than structs, and the simple fact of assigning integers had a 20x impact!
Struct creation consumed 903456 ticks
Class creation consumed 4345929 ticks
EDIT: I removed the methods so there is nothing but value-type members in my Class and Struct:
class Class33
{
    public int Property { get; set; }
    public int Field;
}

struct Struct22
{
    public int Property { get; set; }
    public int Field;
}
The performance difference can probably (or at least in part) be explained by a simple example.
For structures.Add(new Struct22()); this is what really happens:
A Struct22 is created and initialized.
The Add method is called, but it receives a copy because the item is a value type.
So calling Add in this case has overhead, incurred by making a new Struct22 and copying all fields and properties into it from the original.
To demonstrate, not focusing on speed but on the fact that copying takes place:
private static void StructDemo()
{
    List<Struct22> list = new List<Struct22>();
    Struct22 s1 = new Struct22() { Property = 2, Field = 3 }; // #1
    list.Add(s1);            // This creates copy #2
    Struct22 s3 = list[0];   // This creates copy #3

    // Change properties:
    s1.Property = 777;
    // list[0].Property = 888; <-- Compile error, NOT possible
    s3.Property = 999;

    Console.WriteLine("s1.Property = " + s1.Property);
    Console.WriteLine("list[0].Property = " + list[0].Property);
    Console.WriteLine("s3.Property = " + s3.Property);
}
This will be the output, proving that both Add() and the use of list[0] caused copies to be made:
s1.Property = 777
list[0].Property = 2
s3.Property = 999
Let this be a reminder that the behaviour of structs can be substantially different compared to objects, and that performance should be just one aspect when deciding what to use.
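As a follow-up to the demo above, here is a minimal sketch (my addition, using the same Struct22 type) of the usual read-modify-write workaround: since list[0] hands you a copy, you mutate the copy and assign it back.

// Sketch: mutating a struct stored in a List<T> requires copying back.
List<Struct22> list = new List<Struct22>();
list.Add(new Struct22() { Property = 2, Field = 3 });

Struct22 tmp = list[0];   // copy the element out
tmp.Property = 888;       // mutate the copy
list[0] = tmp;            // write the copy back into the list

Console.WriteLine(list[0].Property); // now prints 888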
As commented, deciding between struct and class involves many considerations. I have not seen many people concerned with instantiation, as it is usually a very small part of the performance impact of this decision.
I ran a few tests with your code and found it interesting that as the number of instances increases, the struct gets faster.
I can't answer your question, as it appears that your assertion is not true: classes do not always instantiate faster than structs. Everything I have read states the opposite, but your test produces the interesting results you mentioned.
There are tools you can use to really dig in and try to find out why you get the results you do.
10,000 iterations:
Struct creation consumed 2333 ticks
Class creation consumed 1616 ticks

100,000 iterations:
Struct creation consumed 5672 ticks
Class creation consumed 8459 ticks

1,000,000 iterations:
Struct creation consumed 73462 ticks
Class creation consumed 221704 ticks
List<T> stores its T items in an internal array.
Each time the capacity limit is reached, a new internal array of double the size is created, and all values from the old array are copied over.
When you create an empty List and populate it 1000 times, the internal array is recreated and copied about 10 times.
So in your example the classes may well be slower to create, but each time a new internal array is created, the List only has to copy references in the case of a List of classes, whereas it has to copy the entire structure data in the case of a List of structs.
Try to create the List with its initial capacity set; for your code it should be:
new List<Struct22>(1000)
In this case the internal array won't be recreated, and the struct case will work much faster; see the sketch below.
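A minimal sketch of that suggestion (my illustration, reusing the Struct22 type and iteration count from the question):

// Sketch: pre-sizing the list avoids the grow-and-copy cycle described above.
int iterations = 1000;

// Capacity set up front: one internal array allocation, no copying while adding.
List<Struct22> presized = new List<Struct22>(iterations);
for (int i = 0; i < iterations; ++i)
{
    presized.Add(new Struct22());
}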
I have an app that executes some code (call it code A) for a number of iterations, usually up to 1M iterations.
After code A has executed, I collect info such as the time taken, error messages if an exception was thrown, etc. in a specialized object, let's say ExecutionInfo. I then add that instance of ExecutionInfo to a ConcurrentBag (never mind the ConcurrentBag; it might as well be a List, but it needs to be thread-safe).
After 1M iterations I have a collection of 1M instances of ExecutionInfo. The next step is to summarize everything into, let's say, an ExecutionInfoAggregation, using LINQ extensions such as Average, Min, Max, and Count for various interesting data. The following code runs after the 1M iterations and consumes 92% of the CPU (says the profiler):
private void Summarize(IEnumerable<MethodExecutionResult> methodExecutions)
{
    List<MethodExecutionResult> items = methodExecutions.ToList();
    if (!items.Any())
    {
        return;
    }
    AvgMethodExecutionTime = Math.Round(items.Average(x => x.ExecutionTime.TotalMilliseconds), 3);
    MinMethodExecutionTime = Math.Round(items.Min(x => x.ExecutionTime.TotalMilliseconds), 3);
    MaxMethodExecutionTime = Math.Round(items.Max(x => x.ExecutionTime.TotalMilliseconds), 3);
    FailedExecutionsCount = items.Count(x => !x.Success);
}
By the way, the memory usage of the app is skyrocketing.
This is obviously not performant at all. My solution would be the following:
Replace the collection type with a more suitable one that allows fast insertion and fast querying. What could that be, if any?
Don't query the collection after the 1M iterations, but aggregate after each execution of code A.
Try to find a more compact way to store the collected data.
Any ideas how to optimize the queries? Is there a better approach?
EDIT: Just saw that the call to ToList() is not necessary.
Rather than saving info about each execution, I would aggregate after each method execution:
public class MethodExecutions
{
    private int _excCount = 0;
    private Int64 _totalExcTime = 0;
    private int _excMaxTimeTotalMilliseconds = 0;
    private int _excMinTimeTotalMilliseconds = int.MaxValue;
    private int _failCount = 0;

    public void Add(int excTime, bool isFail)
    {
        _excCount += 1;
        _totalExcTime += excTime;
        if (excTime > _excMaxTimeTotalMilliseconds)
            _excMaxTimeTotalMilliseconds = excTime;
        if (excTime < _excMinTimeTotalMilliseconds)
            _excMinTimeTotalMilliseconds = excTime;
        if (isFail)
            _failCount++;
    }

    public void Summarize(out int avgTime, out int minTime, out int maxTime, out int failCount)
    {
        avgTime = (int)Math.Round((double)_totalExcTime / _excCount);
        minTime = _excMinTimeTotalMilliseconds;
        maxTime = _excMaxTimeTotalMilliseconds;
        failCount = _failCount;
    }
}
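A hypothetical usage sketch (my addition; the lock and the placeholder names elapsedMs and success are assumptions, added only because the question says the aggregation must be thread-safe):

var stats = new MethodExecutions();
var gate = new object();

// Inside each of the ~1M iterations, instead of storing an ExecutionInfo:
void Record(int elapsedMs, bool success)
{
    lock (gate) // serialize writers; MethodExecutions itself is not thread-safe
    {
        stats.Add(elapsedMs, !success);
    }
}

// After all iterations:
stats.Summarize(out int avg, out int min, out int max, out int failures);
Console.WriteLine($"avg={avg}ms min={min}ms max={max}ms failures={failures}");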
This question seems to be nonsense. The behaviour cannot be reproduced reliably.
Comparing the following test programs, I observed a huge performance difference between the first and the second of the following examples (the first example is a factor of ten slower than the second):
First example (slow):
interface IWrappedDict {
    int Number { get; }
    void AddSomething (string k, string v);
}

class WrappedDict : IWrappedDict {
    private Dictionary<string, string> dict = new Dictionary<string, string> ();
    public void AddSomething (string k, string v) {
        dict.Add (k, v);
    }
    public int Number { get { return dict.Count; } }
}

class TestClass {
    private IWrappedDict wrappedDict;
    public TestClass (IWrappedDict theWrappedDict) {
        wrappedDict = theWrappedDict;
    }
    public void DoSomething () {
        // this function does the performance test
        for (int i = 0; i < 1000000; ++i) {
            var c = wrappedDict.Number; wrappedDict.AddSomething (...);
        }
    }
}
Second example (fast):
// IWrappedDict as above
class WrappedDict : IWrappedDict {
    private Dictionary<string, string> dict = new Dictionary<string, string> ();
    private int c = 0;
    public void AddSomething (string k, string v) {
        dict.Add (k, v); ++c;
    }
    public int Number { get { return c; } }
}
// rest as above
Funnily enough, the difference vanishes (the first example gets fast as well) if I change the type of the member variable TestClass.wrappedDict from IWrappedDict to WrappedDict. My interpretation is that Dictionary.Count re-counts the elements every time it is accessed, and that any caching of the number of elements is done by compiler optimization only.
Can anybody confirm this? Is there any way to get the number of elements in a Dictionary in a performant way?
No, Dictionary.Count does not recount the elements every time it's used. The dictionary maintains a count, and should be as fast as your second version.
I suspect that in your test of the second example, you already had WrappedDict instead of IWrappedDict, and this is actually about interface member access (which is always virtual) and the JIT compiler inlining calls to the property when it knows the concrete type.
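To make that concrete, here is a minimal sketch (my illustration, reusing the types above) of the single change being described; only the declared type of the field differs:

// Interface-typed field: every access to Number is a virtual interface call,
// which the JIT will not inline.
class TestClassViaInterface {
    private IWrappedDict wrappedDict;
    public TestClassViaInterface (IWrappedDict d) { wrappedDict = d; }
    public int Peek () { return wrappedDict.Number; }
}

// Concrete-typed field: the JIT knows the exact target and can inline the
// trivial Number getter, making the loop much faster.
class TestClassViaConcrete {
    private WrappedDict wrappedDict;
    public TestClassViaConcrete (WrappedDict d) { wrappedDict = d; }
    public int Peek () { return wrappedDict.Number; }
}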
If you still believe Count is the problem, you should be able to edit your question to show a short but complete program which demonstrates both the fast and slow versions, including how you're timing it all.
Sounds like your timing is off; I get:
#1: 330ms
#2: 335ms
when running the following in release mode, outside of the IDE:
public void DoSomething(int count) {
    // this function does the performance test
    for (int i = 0; i < count; ++i) {
        var c = wrappedDict.Number; wrappedDict.AddSomething(i.ToString(), "a");
    }
}

static void Execute(int count, bool show)
{
    var obj1 = new TestClass(new WrappedDict1());
    var obj2 = new TestClass(new WrappedDict2());

    GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
    GC.WaitForPendingFinalizers();
    var watch = Stopwatch.StartNew();
    obj1.DoSomething(count);
    watch.Stop();
    if (show) Console.WriteLine("#1: {0}ms", watch.ElapsedMilliseconds);

    GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
    GC.WaitForPendingFinalizers();
    watch = Stopwatch.StartNew();
    obj2.DoSomething(count);
    watch.Stop();
    if (show) Console.WriteLine("#2: {0}ms", watch.ElapsedMilliseconds);
}

static void Main()
{
    Execute(1, false);       // for JIT
    Execute(1000000, true);  // for measuring
}
Basically: "cannot reproduce". Also: for completeness, no: .Count does not count all the items (it already knows the count), nor does the compiler add any magic automatic caching code (note: there are a few limited examples of things like that; for example, the JIT can remove bounds-checking on a for loop over a vector).
No, a dictionary or hashtable never iterates the entries to determine the length.
It will (or should) always keep track of the number of entries.
Thus, the time complexity is O(1).
typedef struct {
    int e1;
    int e2;
    int e3;
    int e4;
    int e5;
} abc;

void Hello(abc * a, int index)
{
    int * post = (&(a->e1) + index);
    int i;
    for (i = 0; i < 5; i++)
    {
        *(post + i) = i;
    }
}
The problem I face here is how they are able to access the next element in the struct via
*(post + i)
I'm not sure how all this would be done in C#; moreover, I don't want to use unsafe pointers in C#, but rather an alternative to them.
Thanks!
You should replace the struct with an array of 5 elements.
If you want to, you can wrap the array in a class with five properties.
Edit:
When you say 'wrap', it generally means writing properties in a class that set or get the value of either a single variable, an array element, or a member of another class whose instance lives inside your class (the usual usage here: 'wrap an object'). Very useful for separating concerns and joining the functionality of multiple objects. Technically, all simple properties just 'wrap' their private member variables.
Sample per comment:
class test
{
    int[] e = new int[5];

    public void Hello(int index)
    {
        for (int i = 0; i <= 4; i++)
        {
            // will always happen if index != 0
            if (i + index > 4)
            {
                Console.WriteLine("Original code would have overwritten memory. .NET will now throw instead.");
            }
            e[i + index] = i;
        }
    }

    public int e1
    {
        get { return e[0]; }
        set { e[0] = value; }
    }

    public int e2
    {
        get { return e[1]; }
        set { e[1] = value; }
    }

    // etc. for e3 through e5 ...
}
The problem with the C code is that if index is greater than 0 it runs off the end of the abc struct, thus overwriting random memory. This is exactly why C#, a safer language, does not allow these sorts of things. The way I'd implement your code in C# would be:
struct abc
{
    public int[] e;
}

void Hello(ref abc a, int index)
{
    a.e = new int[5];
    for (int i = 0; i < 5; ++i)
        a.e[index + i] = i;
}
Note that if index > 0, you'll get an out-of-bounds exception instead of the possibly silent memory overwriting you would get in the C snippet.
The thinking behind the C code is an ill fit for C#. The C code is based on the assumption that the fields of the struct will be placed sequentially in memory, in the order the fields are defined.
The above looks like either homework or a contrived example. Without knowing the real intent it's hard to give a concrete example in C#.
Other answers here suggest changing the data structure, but if you can't or don't want to do that, you can use reflection over the fields of the struct type to accomplish the same result as above.
// Because abc is a struct, SetValue must target a boxed copy (otherwise the
// writes would land on a temporary); note GetFields() order is not guaranteed.
abc Hello(abc currentObj)
{
    object boxed = currentObj;
    var fields = typeof(abc).GetFields();
    for (var i = 0; i < fields.Length; i++)
    {
        fields[i].SetValue(boxed, i);
    }
    return (abc)boxed;
}
for (var keyValue = 0; keyValue < dwhSessionDto.KeyValues.Count; keyValue++)
{ ... }

versus:

var count = dwhSessionDto.KeyValues.Count;
for (var keyValue = 0; keyValue < count; keyValue++)
{ ... }
I know there's a difference between the two, but is one of them faster than the other? I would think the second is faster.
Yes, the first version is much slower. After all, I'm assuming you're dealing with types like this:
public class SlowCountProvider
{
    public int Count
    {
        get
        {
            Thread.Sleep(1000);
            return 10;
        }
    }
}

public class KeyValuesWithSlowCountProvider
{
    public SlowCountProvider KeyValues
    {
        get { return new SlowCountProvider(); }
    }
}
Here, your first loop will take ~10 seconds, whereas your second loop will take ~1 second.
Of course, you might argue that the assumption that you're using this code is unjustified - but my point is that the right answer will depend on the types involved, and the question doesn't state what those types are.
Now if you're actually dealing with a type where accessing KeyValues and Count is cheap (which is quite likely) I wouldn't expect there to be much difference. Mind you, I'd almost always prefer to use foreach where possible:
foreach (var pair in dwhSessionDto.KeyValues)
{
// Use pair here
}
That way you never need the count. But then, you haven't said what you're trying to do inside the loop either. (Hint: to get more useful answers, provide more information.)
It depends on how difficult it is to compute dwhSessionDto.KeyValues.Count. If it is just a field or a trivial getter returning an int, then the speed of each version will be the same. However, if the Count value needs to be calculated, then it will be calculated every time, and will therefore impede performance.
EDIT: here's some code to demonstrate that the condition is always re-evaluated:
public class Temp
{
    public int Count { get; set; }
}

static void Main(string[] args)
{
    var t = new Temp() { Count = 5 };
    for (int i = 0; i < t.Count; i++)
    {
        Console.WriteLine(i);
        t.Count--;
    }
    Console.ReadLine();
}
The output is 0, 1, 2 only!
See comments for reasons why this answer is wrong.
If there is a difference, it's the other way round: indeed, the first one might be faster. That's because the compiler recognizes that you are iterating from 0 to the end of the array, and it can therefore elide bounds checks within the loop (i.e. when you access dwhSessionDto.KeyValues[i]).
However, I believe the compiler only applies this optimization to arrays so there probably will be no difference here.
It is impossible to say without knowing the implementation of dwhSessionDto.KeyValues.Count and the loop body.
Assume a global variable bool foo = false; and then the following implementations:
/* Loop body... */
{
    if (foo) Thread.Sleep(1000);
}

/* ... */
public int Count
{
    get
    {
        foo = !foo;
        return 10;
    }
}
/* ... */
Now, the first loop will perform approximately twice as fast as the second ;D
However, assuming a non-moronic implementation, the second one is indeed more likely to be faster.
No, there is no performance difference between these two loops. With the JIT and code optimization, it does not make any difference.
There is no difference, but why do you think there is a difference? Can you please post your findings?
If you look at the implementation of item insertion in Dictionary using Reflector:
private void Insert(TKey key, TValue value, bool add)
{
    int freeList;
    if (key == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }
    if (this.buckets == null)
    {
        this.Initialize(0);
    }
    int num = this.comparer.GetHashCode(key) & 0x7fffffff;
    int index = num % this.buckets.Length;
    for (int i = this.buckets[index]; i >= 0; i = this.entries[i].next)
    {
        if ((this.entries[i].hashCode == num) && this.comparer.Equals(this.entries[i].key, key))
        {
            if (add)
            {
                ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
            }
            this.entries[i].value = value;
            this.version++;
            return;
        }
    }
    if (this.freeCount > 0)
    {
        freeList = this.freeList;
        this.freeList = this.entries[freeList].next;
        this.freeCount--;
    }
    else
    {
        if (this.count == this.entries.Length)
        {
            this.Resize();
            index = num % this.buckets.Length;
        }
        freeList = this.count;
        this.count++;
    }
    this.entries[freeList].hashCode = num;
    this.entries[freeList].next = this.buckets[index];
    this.entries[freeList].key = key;
    this.entries[freeList].value = value;
    this.buckets[index] = freeList;
    this.version++;
}
count is an internal member of this class which is incremented each time you insert an item into the dictionary, so I believe there is no difference at all.
The second version can be faster, sometimes. The point is that the condition is re-evaluated before every iteration, so if e.g. the getter of Count actually counts the elements of an IEnumerable, or interrogates a database, etc., this will slow things down.
So I'd say that if you don't affect the value of Count inside the for loop, the second version is safer.