I have an app that executes some code, call it A, for a number of iterations, usually up to 1M.
After code A has executed, I collect info such as time taken, error messages if an exception was thrown, etc. in a specialized object, let's say ExecutionInfo. I then add that instance of ExecutionInfo to a ConcurrentBag (never mind the ConcurrentBag; it might as well be a List, but it needs to be thread-safe).
After 1M iterations I have a collection of 1M instances of ExecutionInfo. The next step is to summarize everything into, let's say, an ExecutionInfoAggregation, using LINQ extensions such as Average, Min, Max and Count for various interesting data. The following code runs after the 1M iterations and consumes 92% of the CPU (says the profiler):
private void Summarize(IEnumerable<MethodExecutionResult> methodExecutions)
{
    List<MethodExecutionResult> items = methodExecutions.ToList();
    if (!items.Any())
    {
        return;
    }

    AvgMethodExecutionTime = Math.Round(items.Average(x => x.ExecutionTime.TotalMilliseconds), 3);
    MinMethodExecutionTime = Math.Round(items.Min(x => x.ExecutionTime.TotalMilliseconds), 3);
    MaxMethodExecutionTime = Math.Round(items.Max(x => x.ExecutionTime.TotalMilliseconds), 3);
    FailedExecutionsCount = items.Count(x => !x.Success);
}
Btw, memory usage of the app is skyrocketing.
This is obviously not performant at all. My solutions would be the following:
Replace the collection type with a better-suited one that allows fast insertion and fast querying. What could that be, if any?
Don't query the collection after the 1M iterations, but aggregate after each execution of code A.
Try to find a more compact way to store the collected data.
Any ideas how to optimize the queries? Is there a better approach?
EDIT: Just saw that the call to ToList() is not necessary.
Rather than saving info about each execution, I would aggregate it after each method execution:
public class MethodExecutions
{
    private int _excCount = 0;
    private long _totalExcTime = 0;
    private int _excMaxTimeTotalMilliseconds = 0;
    private int _excMinTimeTotalMilliseconds = int.MaxValue;
    private int _failCount = 0;

    public void Add(int excTime, bool isFail)
    {
        _excCount += 1;
        _totalExcTime += excTime;

        if (excTime > _excMaxTimeTotalMilliseconds)
            _excMaxTimeTotalMilliseconds = excTime;

        if (excTime < _excMinTimeTotalMilliseconds)
            _excMinTimeTotalMilliseconds = excTime;

        if (isFail)
            _failCount++;
    }

    public void Summarize(out int avgTime, out int minTime, out int maxTime, out int failCount)
    {
        avgTime = (int)Math.Round((double)_totalExcTime / _excCount);
        minTime = _excMinTimeTotalMilliseconds;
        maxTime = _excMaxTimeTotalMilliseconds;
        failCount = _failCount;
    }
}
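If executions are recorded from multiple threads (the original ConcurrentBag suggests they are), the aggregator itself must be thread-safe. Here is a minimal sketch (my own, not from the original post) of a lock-free variant using Interlocked; it assumes times are whole milliseconds stored in long fields:

using System.Threading;

public class ConcurrentMethodExecutions
{
    private long _count;
    private long _totalMs;
    private long _failCount;
    private long _maxMs = long.MinValue;
    private long _minMs = long.MaxValue;

    public void Add(long elapsedMs, bool isFail)
    {
        Interlocked.Increment(ref _count);
        Interlocked.Add(ref _totalMs, elapsedMs);
        if (isFail)
            Interlocked.Increment(ref _failCount);

        // Lock-free max: retry until our value is stored or another
        // thread has already published a larger one.
        long seen;
        while (elapsedMs > (seen = Interlocked.Read(ref _maxMs)) &&
               Interlocked.CompareExchange(ref _maxMs, elapsedMs, seen) != seen) { }

        // Same pattern for min.
        while (elapsedMs < (seen = Interlocked.Read(ref _minMs)) &&
               Interlocked.CompareExchange(ref _minMs, elapsedMs, seen) != seen) { }
    }
}

With this, memory stays flat no matter how many iterations run, which also addresses the skyrocketing memory usage.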
I'm creating a C# console app. I have some critical paths and thought that creating structs would be faster than creating classes, since I would not need garbage collection for structs. In my test, however, I found the opposite.
In the test below, I create 1000 structs and 1000 classes.
class Program
{
    static void Main(string[] args)
    {
        int iterations = 1000;

        Stopwatch sw = new Stopwatch();
        sw.Start();
        List<Struct22> structures = new List<Struct22>();
        for (int i = 0; i < iterations; ++i)
        {
            structures.Add(new Struct22());
        }
        sw.Stop();
        Console.WriteLine($"Struct creation consumed {sw.ElapsedTicks} ticks");

        Stopwatch sw2 = new Stopwatch();
        sw2.Start();
        List<Class33> classes = new List<Class33>();
        for (int i = 0; i < iterations; ++i)
        {
            classes.Add(new Class33());
        }
        sw2.Stop();
        Console.WriteLine($"Class creation consumed {sw2.ElapsedTicks} ticks");

        Console.ReadLine();
    }
}
My class / struct are simple:
class Class33
{
    public int Property { get; set; }
    public int Field;
    public void Method() { }
}

struct Struct22
{
    public int Property { get; set; }
    public int Field;
    public void Method() { }
}
Results (drum roll please...)
Struct creation consumed 3038 ticks
Class creation consumed 404 ticks
So the question is: why would it take close to 10x the amount of time to create a struct as it does to create a class?
EDIT: I made the program "do something" by just assigning integers to the properties.
static void Main(string[] args)
{
    int iterations = 10000000;

    Stopwatch sw = new Stopwatch();
    sw.Start();
    List<Struct22> structures = new List<Struct22>();
    for (int i = 0; i < iterations; ++i)
    {
        Struct22 s = new Struct22()
        {
            Property = 2,
            Field = 3
        };
        structures.Add(s);
    }
    sw.Stop();
    Console.WriteLine($"Struct creating consuming {sw.ElapsedTicks} ticks");

    Stopwatch sw2 = new Stopwatch();
    sw2.Start();
    List<Class33> classes = new List<Class33>();
    for (int i = 0; i < iterations; ++i)
    {
        Class33 c = new Class33()
        {
            Property = 2,
            Field = 3
        };
        classes.Add(c);
    }
    sw2.Stop();
    Console.WriteLine($"Class creating consuming {sw2.ElapsedTicks} ticks");

    Console.ReadLine();
}
and the result is astounding to me. Classes are still at least 2x slower, but the simple fact of assigning integers had a 20x impact!
Struct creating consuming 903456 ticks
Class creating consuming 4345929 ticks
EDIT: I removed references to Methods so there are no reference types in my Class or Struct:
class Class33
{
    public int Property { get; set; }
    public int Field;
}

struct Struct22
{
    public int Property { get; set; }
    public int Field;
}
The performance difference can probably (at least in part) be explained by a simple example.
For structures.Add(new Struct22()), this is what really happens:
A Struct22 is created and initialized.
The Add method is called, but it receives a copy because the item is a value type.
So calling Add in this case has overhead, incurred by making a new Struct22 and copying all fields and properties into it from the original.
To demonstrate, not focusing on speed but on the fact that copying takes place:
private static void StructDemo()
{
    List<Struct22> list = new List<Struct22>();
    Struct22 s1 = new Struct22() { Property = 2, Field = 3 }; // #1
    list.Add(s1);           // This creates copy #2
    Struct22 s3 = list[0];  // This creates copy #3

    // Change properties:
    s1.Property = 777;
    // list[0].Property = 888; <-- Compile error, NOT possible
    s3.Property = 999;

    Console.WriteLine("s1.Property = " + s1.Property);
    Console.WriteLine("list[0].Property = " + list[0].Property);
    Console.WriteLine("s3.Property = " + s3.Property);
}
This will be the output, proving that both Add() and the use of list[0] caused copies to be made:
s1.Property = 777
list[0].Property = 2
s3.Property = 999
Let this be a reminder that the behaviour of structs can be substantially different compared to objects, and that performance should be just one aspect when deciding what to use.
As commented, deciding between struct and class has many considerations. I have not seen many people concerned with instantiation, as it is usually a very small part of the performance impact of this decision.
I ran a few tests with your code and found it interesting that the struct becomes faster as the number of instances increases.
I can't answer your question, as it appears that your assertion is not true: classes do not always instantiate faster than structs. Everything I have read states the opposite, but your test produces the interesting results you mentioned.
There are tools you can use to really dig in and try to find out why you get the results you do.
10000 iterations:
Struct creation consumed 2333 ticks
Class creation consumed 1616 ticks

100000 iterations:
Struct creation consumed 5672 ticks
Class creation consumed 8459 ticks

1000000 iterations:
Struct creation consumed 73462 ticks
Class creation consumed 221704 ticks
List<T> stores its T items in an internal array.
Each time the capacity limit is reached, a new internal array of double the size is created and all values from the old array are copied over.
When you create an empty List and populate it 1000 times, the internal array is recreated and copied about 10 times.
So in your example classes could well create slower, but each time a new internal array is created, the List only has to copy references in the case of a List of classes, while it has to copy all the structure data in the case of a List of structs.
Try creating the List with its initial capacity set; for your code it should be:
new List<Struct22>(1000)
In this case the internal array won't be recreated, and the struct case will work much faster.
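In the question's test, that change looks like this (a sketch; only the two list constructions change):

// Pre-sizing avoids the ~10 grow-and-copy cycles per list, so the timing
// measures construction cost rather than List<T> resizing:
List<Struct22> structures = new List<Struct22>(iterations);
List<Class33> classes = new List<Class33>(iterations);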
I am doing some heavy computations in C# .NET, and when doing these computations in a Parallel.For loop I must collect some data in a collection. Because of limited memory I can't collect all results, so I only store the best ones.
Those computations must be as fast as possible because they are already taking too much time. So after optimizing a lot, I found that the slowest thing was my ConcurrentDictionary collection. I am wondering if I should switch to something with faster add, remove and find-the-highest operations (perhaps a sorted collection) and just use locks for my main operation, or whether I can do something good with a ConcurrentCollection and speed it up a little.
Here is my actual code. I know it's bad because of this huge lock, but without it I seem to lose consistency and a lot of my remove attempts fail.
public class SignalsMultiValueConcurrentDictionary : ConcurrentDictionary<double, ConcurrentBag<Signal>>
{
    public int Limit { get; set; }
    public double WorstError { get; private set; }

    public SignalsDictionaryState TryAddSignal(double key, Signal signal, out Signal removed)
    {
        SignalsDictionaryState state;
        removed = null;

        if (this.Count >= Limit && signal.AbsoluteError > WorstError)
            return SignalsDictionaryState.NoAddedNoRemoved;

        lock (this)
        {
            if (this.Count >= Limit)
            {
                ConcurrentBag<Signal> signals;
                if (TryRemove(WorstError, out signals))
                {
                    removed = signals.FirstOrDefault();
                    state = SignalsDictionaryState.AddedAndRemoved;
                }
                else
                    state = SignalsDictionaryState.AddedFailedRemoved;
            }
            else
                state = SignalsDictionaryState.AddedNoRemoved;

            this.Add(key, signal);
            WorstError = Keys.Max();
        }

        return state;
    }

    private void Add(double key, Signal value)
    {
        ConcurrentBag<Signal> values;
        if (!TryGetValue(key, out values))
        {
            values = new ConcurrentBag<Signal>();
            this[key] = values;
        }
        values.Add(value);
    }
}
Note also that because I use the absolute error of the signal, I sometimes (it should be very rare) store more than one value under one key.
The only operation used in my computations is TryAddSignal, because it does what I want: if I have more signals than the limit, it removes the signal with the highest error and adds the new signal.
Because I set the Limit property at the start of the computations, I don't need a resizable collection.
The main problem here is that even without that huge lock, Keys.Max is a little too slow. So maybe I need another collection?
Keys.Max() is the killer. That's O(N). No need for a dictionary if you do this.
You can't incrementally compute the max value because you are both adding and removing. So you'd better use a data structure made for this; trees usually are. The BCL has SortedList, SortedSet and SortedDictionary. SortedSet and SortedDictionary are based on a fast tree, and SortedSet has Min and Max operations.
Or, use a .NET collection library with a priority queue.
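To make the suggestion concrete, here is a rough sketch (mine, not the answerer's code) of the bounded keep-the-best-N idea on top of SortedSet<T>; Signal is reduced to the single field this sketch needs:

using System;
using System.Collections.Generic;

class Signal
{
    public double AbsoluteError;
}

class BestSignals
{
    private readonly int _limit;
    private readonly SortedSet<Signal> _set;
    private readonly object _gate = new object();

    public BestSignals(int limit)
    {
        _limit = limit;
        // Tie-break on hash code so distinct signals with equal errors
        // don't collapse into a single set entry.
        _set = new SortedSet<Signal>(Comparer<Signal>.Create((a, b) =>
        {
            int c = a.AbsoluteError.CompareTo(b.AbsoluteError);
            return c != 0 ? c : a.GetHashCode().CompareTo(b.GetHashCode());
        }));
    }

    // Returns the evicted signal, or null if nothing was removed.
    public Signal TryAdd(Signal candidate)
    {
        lock (_gate)
        {
            if (_set.Count < _limit)
            {
                _set.Add(candidate);
                return null;
            }
            Signal worst = _set.Max;
            if (candidate.AbsoluteError >= worst.AbsoluteError)
                return null;        // not better than the current worst
            _set.Remove(worst);     // O(log N) eviction instead of Keys.Max()
            _set.Add(candidate);
            return worst;
        }
    }
}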
Bug: Add is racy. You might overwrite a non-empty collection.
The large lock statement is at least dubious. An easier improvement, if you say that Keys.Max() is slow, is to calculate the maximum value incrementally. You'll need to refresh it only after removing a key:
//...
if (TryRemove(WorstError, out signals))
{
WorstError = Keys.Max();
//...
WorstError = Math.Max(WorstError, key);
What I did in the end was to implement a heap based on a binary tree, as suggested by @usr. My final collection was not concurrent but synchronized (I used locks). I checked the performance, though, and it does the job fast enough.
Here is pseudocode:
public class SynchronizedCollectionWithMaxOnTop
{
    private readonly List<Item> _items = new List<Item>();
    public int Limit { get; set; }

    double Max => _items[0].AbsoluteError;

    public ItemChangeState TryAdd(Item item, out Item removed)
    {
        ItemChangeState state;
        removed = null;

        if (_items.Count >= Limit && item.AbsoluteError > Max)
            return ItemChangeState.NoAddedNoRemoved;

        lock (this)
        {
            if (_items.Count >= Limit)
            {
                removed = Remove();
                state = ItemChangeState.AddedAndRemoved;
            }
            else
                state = ItemChangeState.AddedNoRemoved;

            Insert(item);
        }

        return state;
    }

    private void Insert(Item item)
    {
        _items.Add(item);
        HeapifyUp(_items.Count - 1);
    }

    private Item Remove()
    {
        var result = new Item(_items[0]);
        var lastIndex = _items.Count - 1;
        _items[0] = _items[lastIndex];
        _items.RemoveAt(lastIndex);
        HeapifyDown(0);
        return result;
    }
}
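For completeness, the two heap-maintenance helpers the pseudocode calls might look like this: a standard array-backed max-heap on AbsoluteError, assuming _items is the List<Item> above (these helpers are my sketch, not part of the original answer):

private void HeapifyUp(int index)
{
    while (index > 0)
    {
        int parent = (index - 1) / 2;  // standard array-heap layout
        if (_items[index].AbsoluteError <= _items[parent].AbsoluteError)
            break;
        Item tmp = _items[index];
        _items[index] = _items[parent];
        _items[parent] = tmp;
        index = parent;
    }
}

private void HeapifyDown(int index)
{
    while (true)
    {
        int left = 2 * index + 1;
        int right = left + 1;
        int largest = index;
        if (left < _items.Count && _items[left].AbsoluteError > _items[largest].AbsoluteError)
            largest = left;
        if (right < _items.Count && _items[right].AbsoluteError > _items[largest].AbsoluteError)
            largest = right;
        if (largest == index)
            break;
        Item tmp = _items[index];
        _items[index] = _items[largest];
        _items[largest] = tmp;
        index = largest;
    }
}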
I have a list of generated Vector2s that I have to check against a dictionary to see if they exist, and this function gets executed every tick.
Which would run faster / be the better approach? Doing it this way:
public static bool exists(Vector2 Position, Dictionary<Vector2, object> ToCheck)
{
    try
    {
        object Test = ToCheck[Position];
        return true;
    }
    catch
    {
        return false;
    }
}
Or should I stick with the norm?
public static bool exists(Vector2 Position, Dictionary<Vector2, object> ToCheck)
{
    if (ToCheck.ContainsKey(Position))
    {
        return true;
    }
    return false;
}
Thanks for the input :)
Side Note: (The Value for the key doesn't matter at this point or i would use TryGetValue instead of ContainsKey)
I know it's an old question, but just to add a bit of empirical data...
Running 50,000,000 look-ups on a dictionary with 10,000 entries and comparing relative times to complete:
..if every look-up is successful:
a straight (unchecked) run takes 1.2 seconds
a guarded (ContainsKey) run takes 2 seconds
a handled (try-catch) run takes 1.21 seconds
..if 1 out of every 10,000 look-ups fail:
a guarded (ContainsKey) run takes 2 seconds
a handled (try-catch) run takes 1.37 seconds
..if 16 out of every 10,000 look-ups fail:
a guarded (ContainsKey) run takes 2 seconds
a handled (try-catch) run takes 3.27 seconds
..if 250 out of every 10,000 look-ups fail:
a guarded (ContainsKey) run takes 2 seconds
a handled (try-catch) run takes 32 seconds
..so a guarded test adds a constant overhead and nothing more, while a try-catch test operates almost as fast as no test if it never fails, but kills performance proportionally to the number of failures.
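As a side note (my addition, not part of the measurements above): TryGetValue gives you the single hash lookup of the raw run together with the safety of the guarded one, e.g.:

bool value;
// One hash lookup, no exception on a miss:
if (items.TryGetValue(pick, out value) && value)
    found++;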
Code I used to run tests:
using System;
using System.Collections.Generic;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Test(0);
            Test(1);
            Test(16);
            Test(250);
        }

        private static void Test(int failsPerSet)
        {
            Dictionary<int, bool> items = new Dictionary<int, bool>();
            for (int i = 0; i < 10000; i++)
                if (i >= failsPerSet)
                    items[i] = true;

            if (failsPerSet == 0)
                RawLookup(items, failsPerSet);
            GuardedLookup(items, failsPerSet);
            CaughtLookup(items, failsPerSet);
        }

        private static void RawLookup(Dictionary<int, bool> items, int failsPerSet)
        {
            int found = 0;
            Console.Write("Raw (");
            Console.Write(failsPerSet);
            Console.Write("): ");
            DateTime start = DateTime.Now;
            for (int i = 0; i < 50000000; i++)
            {
                int pick = i % 10000;
                if (items[pick])
                    found++;
            }
            Console.WriteLine(DateTime.Now - start);
        }

        private static void GuardedLookup(Dictionary<int, bool> items, int failsPerSet)
        {
            int found = 0;
            Console.Write("Guarded (");
            Console.Write(failsPerSet);
            Console.Write("): ");
            DateTime start = DateTime.Now;
            for (int i = 0; i < 50000000; i++)
            {
                int pick = i % 10000;
                if (items.ContainsKey(pick))
                    if (items[pick])
                        found++;
            }
            Console.WriteLine(DateTime.Now - start);
        }

        private static void CaughtLookup(Dictionary<int, bool> items, int failsPerSet)
        {
            int found = 0;
            Console.Write("Caught (");
            Console.Write(failsPerSet);
            Console.Write("): ");
            DateTime start = DateTime.Now;
            for (int i = 0; i < 50000000; i++)
            {
                int pick = i % 10000;
                try
                {
                    if (items[pick])
                        found++;
                }
                catch
                {
                }
            }
            Console.WriteLine(DateTime.Now - start);
        }
    }
}
Definitely use the ContainsKey check; exception handling can add a large overhead.
Throwing exceptions can negatively impact performance. For code that routinely fails, you can use design patterns to minimize performance issues.
Exceptions are not meant to be used for conditions you can check for.
I recommend reading the MSDN documentation on exceptions generally, and on exception handling in particular.
Never use try/catch as part of your regular program path. It is really expensive and should only catch errors that you cannot prevent. ContainsKey is the way to go here.
Side note: no, you would not. If the value mattered, you would check with ContainsKey whether it exists and retrieve it if it does, not try/catch.
Side Note: (The Value for the key doesn't matter at this point or i would use TryGetValue instead of ContainsKey)
The answer you accepted is correct, but just to add, if you only care about the key and not the value, maybe you're looking for a HashSet rather than a Dictionary?
In addition, your second code snippet is a method which literally adds zero value. Just use ToCheck.ContainsKey(Position), don't make a method which just calls that method and returns its value but does nothing else.
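For illustration, a minimal sketch of the HashSet approach (using System.Numerics.Vector2 as a stand-in for whatever Vector2 type the game actually uses; any key type with value equality works the same way):

using System.Collections.Generic;
using System.Numerics;

class Demo
{
    static void Main()
    {
        var occupied = new HashSet<Vector2> { new Vector2(3, 4) };

        // O(1) membership test: no dummy value, no exception machinery.
        bool exists = occupied.Contains(new Vector2(3, 4));
        System.Console.WriteLine(exists); // True
    }
}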
for (var keyValue = 0; keyValue < dwhSessionDto.KeyValues.Count; keyValue++)
{...}
var count = dwhSessionDto.KeyValues.Count;
for (var keyValue = 0; keyValue < count; keyValue++)
{...}
I know there's a difference between the two, but is one of them faster than the other? I would think the second is faster.
Yes, the first version is much slower. After all, I'm assuming you're dealing with types like this:
public class SlowCountProvider
{
    public int Count
    {
        get
        {
            Thread.Sleep(1000);
            return 10;
        }
    }
}

public class KeyValuesWithSlowCountProvider
{
    public SlowCountProvider KeyValues
    {
        get { return new SlowCountProvider(); }
    }
}
Here, your first loop will take ~10 seconds, whereas your second loop will take ~1 second.
Of course, you might argue that the assumption that you're using this code is unjustified - but my point is that the right answer will depend on the types involved, and the question doesn't state what those types are.
Now if you're actually dealing with a type where accessing KeyValues and Count is cheap (which is quite likely) I wouldn't expect there to be much difference. Mind you, I'd almost always prefer to use foreach where possible:
foreach (var pair in dwhSessionDto.KeyValues)
{
// Use pair here
}
That way you never need the count. But then, you haven't said what you're trying to do inside the loop either. (Hint: to get more useful answers, provide more information.)
It depends how difficult it is to compute dwhSessionDto.KeyValues.Count. If it's just a pointer to an int, then the speed of each version will be the same. However, if the Count value needs to be calculated, it will be calculated every time, and therefore impede performance.
EDIT -- here's some code to demonstrate that the condition is always re-evaluated:
public class Temp
{
    public int Count { get; set; }
}

static void Main(string[] args)
{
    var t = new Temp() { Count = 5 };
    for (int i = 0; i < t.Count; i++)
    {
        Console.WriteLine(i);
        t.Count--;
    }
    Console.ReadLine();
}
The output is 0, 1, 2 only!
See comments for reasons why this answer is wrong.
If there is a difference, it's the other way round: indeed, the first one might be faster. That's because the compiler recognizes that you are iterating from 0 to the end of the array, and it can therefore elide bounds checks within the loop (i.e. when you access dwhSessionDto.KeyValues[i]).
However, I believe the compiler only applies this optimization to arrays so there probably will be no difference here.
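For reference, this is the shape the JIT recognizes (a generic illustration with an int array, not the poster's types):

int[] arr = { 1, 2, 3, 4 };
int sum = 0;

// Because the loop bound is provably arr.Length, the JIT can prove every
// arr[i] is in range and skip the per-access bounds check:
for (int i = 0; i < arr.Length; i++)
{
    sum += arr[i];
}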
It is impossible to say without knowing the implementation of dwhSessionDto.KeyValues.Count and the loop body.
Assume a global variable bool foo = false; and then following implementations:
/* Loop body... */
{
    if (foo) Thread.Sleep(1000);
}

/* ... */

public int Count
{
    get
    {
        foo = !foo;
        return 10;
    }
}

/* ... */
Now, the first loop will perform approximately twice as fast as the second ;D
However, assuming a non-moronic implementation, the second one is indeed more likely to be faster.
No, there is no performance difference between these two loops. With JIT and code optimization, it does not make any difference.
There is no difference, but why do you think there is a difference? Can you please post your findings?
If you look at the implementation of inserting an item into a Dictionary using Reflector:
private void Insert(TKey key, TValue value, bool add)
{
    int freeList;
    if (key == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }
    if (this.buckets == null)
    {
        this.Initialize(0);
    }
    int num = this.comparer.GetHashCode(key) & 0x7fffffff;
    int index = num % this.buckets.Length;
    for (int i = this.buckets[index]; i >= 0; i = this.entries[i].next)
    {
        if ((this.entries[i].hashCode == num) && this.comparer.Equals(this.entries[i].key, key))
        {
            if (add)
            {
                ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
            }
            this.entries[i].value = value;
            this.version++;
            return;
        }
    }
    if (this.freeCount > 0)
    {
        freeList = this.freeList;
        this.freeList = this.entries[freeList].next;
        this.freeCount--;
    }
    else
    {
        if (this.count == this.entries.Length)
        {
            this.Resize();
            index = num % this.buckets.Length;
        }
        freeList = this.count;
        this.count++;
    }
    this.entries[freeList].hashCode = num;
    this.entries[freeList].next = this.buckets[index];
    this.entries[freeList].key = key;
    this.entries[freeList].value = value;
    this.buckets[index] = freeList;
    this.version++;
}
count is an internal member of this class, incremented each time you insert an item into the dictionary,
so I believe that there is no difference at all.
The second version can be faster, sometimes. The point is that the condition is re-evaluated after every iteration, so if e.g. the getter of Count actually counts the elements in an IEnumerable, or interrogates a database, etc., this will slow things down.
So I'd say that if you don't affect the value of Count in the for loop, the second version is safer.
Everyone, recently I was debugging a program to improve performance. I noticed an interesting thing about the performance of assignments. The code below is my test code.
CODE A
public class Word{....}
public class Chunk
{
    private Word[] _items;
    private int _size;

    public Chunk()
    {
        _items = new Word[3];
    }

    public void Add(Word word)
    {
        _items[_size++] = word;
    }
}
main
Chunk chunk = new Chunk();
for (int i = 0; i < 3; i++)
{
    chunk.Add(new Word() { });
}
CODE B
public class Chunk
{
    private Word[] _items;
    private int _size;

    public Chunk()
    {
        _items = new Word[3];
    }

    public Word[] Words
    {
        get
        {
            return _items;
        }
    }

    public int Size
    {
        get { return _size; }
        set { _size = value; }
    }
}
main
Chunk chunk = new Chunk();
for (int i = 0; i < 3; i++)
{
    chunk.Words[i] = new Word() { };
    chunk.Size += 1;
}
In my test with Visual Studio's profiling tool, calling the main method 32000 times, CODE B shows as faster than CODE A. Why is CODE B faster than CODE A? Can anyone give me a suggestion?
Thanks.
Update: sorry, I forgot the _size increment in CODE B; I have updated it.
Update: @Shiv Kuma Yes, code A is similar to code B at around 30,000 call times. I tested with the 700K file, and the code gets called about 29,000 times.
Meanwhile, code B is 100 milliseconds faster than code A, and code B is actually much better in the real segmentation run.
One more thing I'm wondering is why code B is faster than code A even for the same assignments.
Anyway, thanks for your reply.
Three reasons I can think of:
Chunk.Add() is a method call, and a method call is always expensive compared to the same code running inline.
There are two increments in the first code sample (_size++ and i++).
The chunk.Words array might be cached locally (2nd example), so there is no need to evaluate chunk._items (1st example) every time Add is called; see the sketch below.
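To make the first and third points concrete, here is a hypothetical, further-tightened variant of the CODE B loop (mine, not from the question) that makes the caching explicit:

Chunk chunk = new Chunk();
Word[] words = chunk.Words;     // property read once; the array reference is cached
for (int i = 0; i < 3; i++)
{
    words[i] = new Word() { };  // plain array store, no Add() method call
}
chunk.Size = 3;                 // one write instead of three increments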
In CODE A you are incrementing twice. Once in your for loop:
for (int i = 0; i < 3; i++)
And once in your method:
_items[_size++] = word;
In CODE B you are only incrementing once in the for loop.
It isn't much but it would definitely cause the performance difference.
Yes, the method call would also add a small amount of overhead.