Problem calculating the time taken to execute a function - C#

I am trying to find the time taken to run a function. I am doing it this way:
void SomeFunc(InputData input)   // "InputData" is a placeholder; the real signature is under NDA
{
    Stopwatch stopWatch = new Stopwatch();
    stopWatch.Start();
    // some operation on input
    stopWatch.Stop();
    long timeTaken = stopWatch.ElapsedMilliseconds;
}
Now the "some operation on input" as mentioned in the comments takes significant time based on the input to SomeFunc.
The problem is when I call SomeFunc multiple times from the main, I get timeTaken correctly only for the first time, and the rest of the time it is being assigned to 0. Is there a problem with the above code?
EDIT:
There is a UI with multiple text fields, and when a button is clicked, the click is delegated to SomeFunc. SomeFunc performs some calculations based on the input (from the text fields) and displays the result on the UI. I am not allowed to share the code in "some operation on input" since I have signed an NDA. I can however answer your questions as to what I am trying to achieve there. Please help.
EDIT 2:
As it seems that I am getting a weird value when the function is called the first time, and as @Mike Bantegui mentioned, there must be JIT optimization going on, the only solution I can think of now (to not get zero as the execution time) is to display the time in nanoseconds. How is it possible to display the time in nanoseconds in C#?

Well, you aren't outputting that data anywhere. Ideally you would do something more like this.
void SomeFunc(InputData input)   // placeholder signature, as above
{
    // do stuff
}

static void Main()
{
    List<long> results = new List<long>();
    Stopwatch sw = new Stopwatch();
    for (int i = 0; i < MAX_TRIES; i++)
    {
        sw.Start();
        SomeFunc(arg);
        sw.Stop();
        results.Add(sw.ElapsedMilliseconds);
        sw.Reset();
    }
    // Perform analysis on the results
}
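For the analysis step, a minimal sketch (assuming the results list built above and System.Linq) could be:

// Minimal analysis sketch over the timings gathered above.
results.Sort();
Console.WriteLine("min: {0} ms", results[0]);
Console.WriteLine("max: {0} ms", results[results.Count - 1]);
Console.WriteLine("avg: {0:0.##} ms", results.Average());   // Average() needs System.Linq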

In fact you are getting the wrong time on the first call and correct times on the remaining ones. You can't rely on just the first call to measure the time. However, it seems that the operation is too fast, so you get 0 as the result. To measure it correctly, call the function 1000 times, for example, and look at the average cost:
Stopwatch watch = Stopwatch.StartNew();
for (int index = 0; index < 1000; index++)
{
SomeFunc(input);
}
watch.Stop();
Console.WriteLine(watch.ElapsedMilliseconds);
Edit:
How is it possible to display the time in nanoseconds
You can get watch.ElapsedTicks and then convert it to nanoseconds. Stopwatch.Frequency is the number of ticks per second, so do the scaling in floating point (or multiply before dividing) to avoid integer truncation: watch.ElapsedTicks * (1_000_000_000.0 / Stopwatch.Frequency)
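A small sketch of the conversion, guarding against integer division:

// Sketch: elapsed time in nanoseconds.
// Stopwatch.Frequency is ticks per second; scale in floating point so a
// small tick count isn't truncated to zero.
Stopwatch watch = Stopwatch.StartNew();
SomeFunc(input);
watch.Stop();
double nanoseconds = watch.ElapsedTicks * (1_000_000_000.0 / Stopwatch.Frequency);
Console.WriteLine("{0:0} ns", nanoseconds);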

As a simple example, consider the following (contrived) example:
double Mean(List<double> items)
{
    double mu = 0;
    foreach (double val in items)
        mu += val;
    return mu / items.Count;   // List<T> has Count, not Length
}
We can time it like so:
void DoTimings(int n)
{
    Stopwatch sw = new Stopwatch();
    long time = 0;
    double dummy = 0;
    for (int i = 0; i < n; i++)
    {
        List<double> items = new List<double>();
        // populate items with random numbers, excluded for brevity
        sw.Restart();   // restart so each call is timed in isolation
        dummy += Mean(items);
        sw.Stop();
        time += sw.ElapsedMilliseconds;
    }
    Console.WriteLine(dummy);            // keeps the calls from being optimized away
    Console.WriteLine((double)time / n); // average per call
}
This works if the list of items is actually very large. But if it's too small, we'll have to do multiple runs under one timing:
void DoTimings(int n)
{
    Stopwatch sw = new Stopwatch();
    double dummy = 0;
    List<double> items = new List<double>(); // Reuse same list
    // populate items with random numbers, excluded for brevity
    sw.Start();
    for (int i = 0; i < n; i++)
    {
        dummy += Mean(items);
    }
    sw.Stop();                                // one timing around all n runs
    long time = sw.ElapsedMilliseconds;
    Console.WriteLine(dummy);
    Console.WriteLine((double)time / n);      // average per call
}
In the second example, if the size of the list is too small, we can still get an accurate idea of how long a call takes by simply running it for a large enough n. Each approach has its advantages and flaws, though.
However, before doing either of these I would do a "warm up" calculation beforehand:

double dummy = 0;
// 10000 iterations, or something smaller - just enough to let the JIT compile the code
for (int i = 0; i < 10000; i++)
    dummy += Mean(items);   // "items" populated as above
Console.WriteLine(dummy);
// Now do the actual timing
An alternative to both would be to do what @Rig did in his answer and build up a list of results to do statistics on. In the first case, you'd simply build up a list of each individual time. In the second case, you would build up a list of the average timings of multiple runs, since the time for a single calculation could be smaller than the finest-grained time your Stopwatch can measure.
With all that said, I would say there is one very large caveat in all of this: Calculating the time it takes for something to run is very hard to do properly. It's admirable to want to do profiling, but you should do some research on SO and see what other people have done to do this properly. It's very easy to write a routine that times something badly, but very hard to do it right.

Related

Is parallel code supposed to run slower than sequential code, after a certain dataset size?

I'm fairly new to C# and programming in general and I was trying out parallel programming.
I have written this example code that computes the sum of an array, first using multiple threads and then using one thread (the main thread).
I've timed both cases.
static long Sum(int[] numbers, int start, int end)
{
long sum = 0;
for (int i = start; i < end; i++)
{
sum += numbers[i];
}
return sum;
}
static async Task Main()
{
// Arrange data.
const int COUNT = 100_000_000;
int[] numbers = new int[COUNT];
Random random = new();
for (int i = 0; i < numbers.Length; i++)
{
numbers[i] = random.Next(100);
}
// Split task into multiple parts.
int threadCount = Environment.ProcessorCount;
int taskCount = threadCount - 1;
int taskSize = numbers.Length / taskCount;
var start = DateTime.Now;
// Run individual parts in separate threads.
List<Task<long>> tasks = new();
for (int i = 0; i < taskCount; i++)
{
int begin = i * taskSize;
int end = (i == taskCount - 1) ? numbers.Length : (i + 1) * taskSize;
tasks.Add(Task.Run(() => Sum(numbers, begin, end)));
}
// Wait for all threads to finish, as we need the result.
var partialSums = await Task.WhenAll(tasks);
long sumAsync = partialSums.Sum();
var durationAsync = (DateTime.Now - start).TotalMilliseconds;
Console.WriteLine($"Async sum: {sumAsync}");
Console.WriteLine($"Async duration: {durationAsync} miliseconds");
// Sequential
start = DateTime.Now;
long sumSync = Sum(numbers, 0, numbers.Length);
var durationSync = (DateTime.Now - start).TotalMilliseconds;
Console.WriteLine($"Sync sum: {sumSync}");
Console.WriteLine($"Sync duration: {durationSync} miliseconds");
var factor = durationSync / durationAsync;
Console.WriteLine($"Factor: {factor:0.00}x");
}
When the array size is 100 million, the parallel sum is computed about 2x faster (on average).
But when the array size is 1 billion, it's significantly slower than the sequential sum.
Why is it running slower?
Hardware Information
Environment.ProcessorCount = 4
GC.GetGCMemoryInfo().TotalAvailableMemoryBytes = 8468377600
Timing (screenshots in the original post) for array sizes of 100,000,000 and 1,000,000,000.
New Test:
This time, instead of separate tasks (3 in my case) working on different parts of a single array of 1,000,000,000 integers, I physically divided the dataset into 3 separate arrays of 333,333,333 integers (one third the size) each. Although I'm still adding up a billion integers on the same machine, my parallel code now runs faster (as expected).
private static void InitArray(int[] numbers)
{
Random random = new();
for (int i = 0; i < numbers.Length; i++)
{
numbers[i] = random.Next(100);
}
}
public static async Task Main()
{
Stopwatch stopwatch = new();
const int SIZE = 333_333_333; // one third of a billion
List<int[]> listOfArrays = new();
for (int i = 0; i < Environment.ProcessorCount - 1; i++)
{
int[] numbers = new int[SIZE];
InitArray(numbers);
listOfArrays.Add(numbers);
}
// Sequential.
stopwatch.Start();
long syncSum = 0;
foreach (var array in listOfArrays)
{
syncSum += Sum(array, 0, array.Length);   // Sum(int[], int, int) from the first snippet
}
stopwatch.Stop();
var sequentialDuration = stopwatch.Elapsed.TotalMilliseconds;
Console.WriteLine($"Sequential sum: {syncSum}");
Console.WriteLine($"Sequential duration: {sequentialDuration} ms");
// Parallel.
stopwatch.Restart();
List<Task<long>> tasks = new();
foreach (var array in listOfArrays)
{
tasks.Add(Task.Run(() => Sum(array, 0, array.Length)));
}
var partialSums = await Task.WhenAll(tasks);
long parallelSum = partialSums.Sum();
stopwatch.Stop();
var parallelDuration = stopwatch.Elapsed.TotalMilliseconds;
Console.WriteLine($"Parallel sum: {parallelSum}");
Console.WriteLine($"Parallel duration: {parallelDuration} ms");
Console.WriteLine($"Factor: {sequentialDuration / parallelDuration:0.00}x");
}
Timing (screenshot in the original post). I don't know if it helps figure out what went wrong in the first approach.
The asynchronous pattern is not the same as running code in parallel. The main reason for asynchronous code is better resource utilization while the computer is waiting for some kind of I/O device. Your code would be better described as parallel computing or concurrent computing.
While your example should work fine, it may not be the easiest or most optimal way to do it. The easiest option would probably be to use Parallel LINQ: numbers.AsParallel().Sum();. There is also a Parallel.For method that should be better suited, including an overload that maintains thread-local state. Note that while Parallel.For will attempt to optimize its partitioning, you probably want to process chunks of data in each iteration to reduce overhead; I would try around 1-10k values or so, as sketched below.
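A minimal sketch of that chunked Parallel.For overload with thread-local state (the chunk size and names here are illustrative, not from the original answer):

// Sketch: chunked parallel sum with per-thread partial sums.
// CHUNK is an assumed tuning knob, not a library constant.
const int CHUNK = 4096;
int chunkCount = (numbers.Length + CHUNK - 1) / CHUNK;
long total = 0;
Parallel.For(0, chunkCount,
    () => 0L,                                    // thread-local partial sum
    (chunk, state, local) =>
    {
        int start = chunk * CHUNK;
        int end = Math.Min(start + CHUNK, numbers.Length);
        for (int i = start; i < end; i++)
            local += numbers[i];
        return local;
    },
    local => Interlocked.Add(ref total, local)); // combine once per thread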
We can only guess at the reason your parallel method is slower. Summing numbers is a really fast operation, so the computation may be limited by memory bandwidth or cache usage. And while you want your work partitions to be fairly large, using too-large partitions may result in less overall parallelism if a thread gets suspended for any reason. You may also want partitions of certain sizes that work well with the caching system; see cache associativity. It is also possible you are including things you did not intend to measure, like compilation times or GCs; see BenchmarkDotNet, which takes care of many of the edge cases when measuring performance.
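For reference, a minimal BenchmarkDotNet harness (the class and method names here are illustrative) looks roughly like this:

// Minimal BenchmarkDotNet sketch; the benchmark names are illustrative.
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class SumBenchmarks
{
    private int[] numbers;

    [GlobalSetup]
    public void Setup() => numbers = Enumerable.Range(0, 100_000_000).ToArray();

    [Benchmark(Baseline = true)]
    public long Sequential() { long s = 0; foreach (int n in numbers) s += n; return s; }

    [Benchmark]
    public long ParallelLinq() => numbers.AsParallel().Sum(x => (long)x);
}

// In Main: BenchmarkRunner.Run<SumBenchmarks>();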
Also, never use DateTime for measuring performance; Stopwatch is both much easier to use and much more accurate.
My machine has 4GB RAM, so initializing an int[1_000_000_000] results in memory paging. Going from int[100_000_000] to int[1_000_000_000] results in non-linear performance degradation (100x instead of 10x). Essentially a CPU-bound operation becomes I/O-bound. Instead of adding numbers, the program spends most of its time reading segments of the array from the disk. In these conditions using multiple threads can be detrimental for the overall performance, because the pattern of accessing the storage device becomes more erratic and less streamlined.
Maybe something similar happens on your 8GB RAM machine too, but I can't say for sure.

Convert List<double> to double[n,1]

I need to convert a large List<double> of length n into a double[n,1] array. What is the fastest way to make the conversion?
For further background, this is to be assigned to an Excel object's Range.Value, which requires a two-dimensional array.
I'm writing this on the assumption that you really want the most efficient way to do this. Extreme performance almost always comes with a trade-off, usually code readability.
I can still substantially optimize one part of this, as the comments note, but I didn't want to go overboard with dynamic methods on the first pass.
const int TEST_SIZE = 100 * 1000;
//Test data setup
var list = new List<double>();
for (int i = 0; i < TEST_SIZE; i++)
list.Add(i);
//Grab the list's underlying array, which is not public
//This can be made MUCH faster with dynamic methods if you want me to optimize
var underlying = (double[])typeof(List<double>)
.GetField("_items", BindingFlags.NonPublic | BindingFlags.Instance)
.GetValue(list);
//We need the actual length of the list because there can be extra space in the array
//Do NOT use "underlying.Length"
int underlyingLength = list.Count;
//Benchmark it
var sw = Stopwatch.StartNew();
var twodarray = new double[underlyingLength, 1];
Buffer.BlockCopy(underlying, 0, twodarray, 0, underlyingLength * sizeof(double));
var elapsed = sw.Elapsed;
Console.WriteLine($"Elapsed: {elapsed}");
Output:
Elapsed: 00:00:00.0001998
Hardware used:
AMD Ryzen 7 3800X @ 3.9 GHz
32 GB DDR4 3200 RAM
I think this is what you want.
This operation will take no more than a few milliseconds even on a slow core. So why bother? How many times will you do this conversion? If millions of times, then try to find a better approach. But if you do this when the end user presses a button...
Criticize the answer, but please provide metrics if it's about efficiency.
// Populate a List with 100,000 doubles
Random r = new Random();
List<double> dList = new List<double>();
int i = 0;
while (i++ < 100000) dList.Add(r.NextDouble());
// Convert to double[100000,1]
Stopwatch chrono = Stopwatch.StartNew();
// Conversion:
double[,] ddArray = new double[dList.Count, 1];
int dIndex = 0;
dList.ForEach((x) => ddArray[dIndex++, 0] = x);
Console.WriteLine("Completed in: {0}ms", chrono.Elapsed);
Outputs: (10 repetitions) - Maximum: 2.6 ms
Completed in: 00:00:00.0020677ms
Completed in: 00:00:00.0026287ms
Completed in: 00:00:00.0013854ms
Completed in: 00:00:00.0010382ms
Completed in: 00:00:00.0019168ms
Completed in: 00:00:00.0011480ms
Completed in: 00:00:00.0011172ms
Completed in: 00:00:00.0013586ms
Completed in: 00:00:00.0017165ms
Completed in: 00:00:00.0010508ms
Edit 1.
int dIndex = 0;   // declaration restored from the first snippet
double[,] ddArray = new double[dList.Count, 1];
foreach (double x in dList) ddArray[dIndex++, 0] = x;
seems just a little bit faster, but needs more testing:
Completed in: 00:00:00.0020318ms
Completed in: 00:00:00.0019077ms
Completed in: 00:00:00.0023162ms
Completed in: 00:00:00.0015881ms
Completed in: 00:00:00.0013692ms
Completed in: 00:00:00.0022482ms
Completed in: 00:00:00.0015960ms
Completed in: 00:00:00.0012306ms
Completed in: 00:00:00.0015039ms
Completed in: 00:00:00.0016553ms

How can I adjust Stopwatch to get the same values every time?

How can I adjust Stopwatch to get the same values every time?
For this code, for example:
Stopwatch w = new Stopwatch();
for (int i = 0; i < 40; i++)
{
    w.Start();
    test();
    w.Stop();
    Console.WriteLine(w.ElapsedMilliseconds);
    w.Reset();   // reset after reading; resetting first would always print 0
}
I get a different value each time.
That's because of interruptions and how many resources your process/thread is allocated during execution. You can't do anything about it.
You should run your measurement multiple times and do some statistical analysis on the results: the average, the median, or e.g. the 75th percentile, as in the sketch below.
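A minimal sketch of that analysis (reusing test() from the question):

// Sketch: collect samples, then report average, median and 75th percentile.
var samples = new List<long>();
var w = new Stopwatch();
for (int i = 0; i < 40; i++)
{
    w.Restart();
    test();
    w.Stop();
    samples.Add(w.ElapsedMilliseconds);
}
samples.Sort();
Console.WriteLine("average: {0:0.##} ms", samples.Average());        // System.Linq
Console.WriteLine("median:  {0} ms", samples[samples.Count / 2]);
Console.WriteLine("p75:     {0} ms", samples[(int)(samples.Count * 0.75)]);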

Time elapsed between two functions

I need to find the time elapsed between two functions doing the same operation but implemented with different algorithms. I need to find the faster of the two.
Here is my code snippet
Stopwatch sw = new Stopwatch();
sw.Start();
Console.WriteLine(sample.palindrome()); // algorithm 1
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);//tried sw.elapsed and sw.elapsedticks
sw.Reset(); //tried with and without reset
sw.Start();
Console.WriteLine(sample.isPalindrome()); //algorithm 2
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Technically this should give the time taken by each algorithm. It says that algorithm 2 is faster. But it gives a different result if I interchange the calls: if I call algorithm 2 first and algorithm 1 second, it says algorithm 1 is faster.
I don't know what I am doing wrong.
I assume your palindrome methods run extremely fast in this example, and therefore in order to get a real result you will need to run them many times and then decide which is faster.
Something like this:
int numberOfIterations = 1000; // you decide on a reasonable threshold.
sample.palindrome(); // Call this the first time and avoid measuring the JIT compile time
Stopwatch sw = new Stopwatch();
sw.Start();
for(int i = 0 ; i < numberOfIterations ; i++)
{
    sample.palindrome(); // note: no Console.WriteLine here - the I/O would dominate the timing
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds); // or sw.ElapsedMilliseconds / numberOfIterations for the average
Now do the same for the second method and you will get more realistic results.
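To avoid duplicating the harness for each method, a small helper (a sketch, not from the original answer) can wrap the pattern:

// Sketch: reusable timing helper wrapping the warm-up + loop pattern.
static double MeasureAverageMs(Action action, int iterations)
{
    action();                                // warm-up call so JIT cost isn't measured
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
        action();
    sw.Stop();
    return (double)sw.ElapsedMilliseconds / iterations;
}

// Usage:
// Console.WriteLine(MeasureAverageMs(() => sample.palindrome(), 1000));
// Console.WriteLine(MeasureAverageMs(() => sample.isPalindrome(), 1000));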
What you must do is execute both methods before the actual timed tests so the compiled code gets JIT'd. Then test with multiple tries. Here is a code mockup.
The compiled code, in CIL format, is JIT'd upon first execution: it is translated into machine code. So the first measurement is inaccurate, and you should let the code be JIT'd before actually testing it.
sample.palindrome();
sample.isPalindrome();
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
sample.palindrome();
Console.WriteLine("palindrome test #{0} result: {1}", i, sw.ElapsedMilliseconds);
}
sw.Stop();
Console.WriteLine("palindrome test Final result: {0}", sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0; i < 1000; i++)
{
sample.isPalindrome();
Console.WriteLine("isPalindrome test #{0} result: {1}", i, sw.ElapsedMilliseconds);
}
sw.Stop();
Console.WriteLine("isPalindrome test Final result: {0}", sw.ElapsedMilliseconds);
Read more about CIL and JIT
Unless you provide the code of the palindrome and isPalindrome functions along with the sample class, I can't do much except speculate.
My best guess is that both your functions use the same class variables and other data. So when you call the first function, it has to allocate memory for the variables, whereas by the time you call the other function, those one-time expenses have already been paid. If it's not the variables, it could be something else along the same lines.
I suggest that you call both functions twice and note the duration only for the second call of each, so that any resources they need have already been allocated and there's less chance of something behind the scenes skewing the result. A sketch of this follows.
Let me know if this works. This is mere speculation on my part, and I may be wrong.
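A sketch of that suggestion, timing only the second call of each (ElapsedTicks because single calls may be sub-millisecond):

// Sketch: discard the first (cold) call, time only the second (warm) call.
Stopwatch sw = new Stopwatch();

sample.palindrome();                  // cold call: allocations, JIT, etc.
sw.Restart();
sample.palindrome();                  // warm call: the one we time
sw.Stop();
Console.WriteLine("palindrome (2nd call): {0} ticks", sw.ElapsedTicks);

sample.isPalindrome();                // cold call
sw.Restart();
sample.isPalindrome();                // warm call
sw.Stop();
Console.WriteLine("isPalindrome (2nd call): {0} ticks", sw.ElapsedTicks);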

Strange speed difference when adding a new item on List (C#)

I've made some speed tests concerning Lists in C#. Here is a result that I cannot explain. I hope someone can figure out what is happening.
Milliseconds for 1000 iterations if cloneList.RemoveAt(cloneList.Count - 1) is called before cloneList.Add(next): x milliseconds.
Milliseconds for 1000 iterations if cloneList.RemoveAt(cloneList.Count - 1) is NOT called before cloneList.Add(next): at least 20x milliseconds.
It seems that with one more statement my code gets 20 times faster (see the code below):
Stopwatch stopWatch = new Stopwatch();
Random random = new Random(100);
TimeSpan caseOneTimeSpan = new TimeSpan();
TimeSpan caseTwoTimeSpan = new TimeSpan();
int len = 1000;
List<int> myList = new List<int>();
myList.Capacity = len + 1;
// filling the list
for (int i = 0; i < len; i++)
myList.Add(random.Next(1000));
// number of tests (1000)
for (int i = 0; i < 1000; i++)
{
List<int> cloneList = myList.ToList();
int next = random.Next();
// case 1 - remove last item before adding the new item
    stopWatch.Start();
    cloneList.RemoveAt(cloneList.Count - 1);
    cloneList.Add(next);
    stopWatch.Stop();                      // stop before reading the elapsed time
    caseOneTimeSpan += stopWatch.Elapsed;
    // reset stopwatch and clone list
    stopWatch.Reset();
    cloneList = myList.ToList();
    // case 2 - add without removing
    stopWatch.Start();
    cloneList.Add(next);
    stopWatch.Stop();
    caseTwoTimeSpan += stopWatch.Elapsed;
    stopWatch.Reset();
}
Console.WriteLine("Case 1: " + caseOneTimeSpan.TotalMilliseconds);
Console.WriteLine("Case 2: " + caseTwoTimeSpan.TotalMilliseconds);
Console.WriteLine("Case 2 / Case 1: " + caseTwoTimeSpan.TotalMilliseconds / caseOneTimeSpan.TotalMilliseconds);
When you add an item to a list there are two possibilities:
1. The internal buffer is large enough to hold another item. The item is placed in the next free location. Speed: O(1). (This is the most common case.)
2. The internal buffer is not large enough. Create a new, larger buffer. Copy all items from the old buffer to the new one. Add the new item to the new buffer. Speed: O(n). (This shouldn't occur often.)
While most Add calls will be O(1), some are O(n).
Removing the last item is always O(1).
Since Add sometimes depends on the size of the list, larger lists take longer (when a call requires a new buffer). If you always remove an item before adding a new one, you ensure that the internal buffer always has enough space.
You can look at the Capacity property of List to see the current size of the internal buffer and compare it to Count, which is the number of items the list actually holds. (Therefore Capacity - Count is the number of free slots in the buffer.) While not often useful in real programs, looking at these properties when debugging or developing an application can help you see what's going on underneath.
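A quick way to see this in action (a sketch using the myList from the question; the capacity values shown are indicative):

// Sketch: Capacity vs Count explains the difference measured above.
var clone = myList.ToList();                  // ToList sizes Capacity to Count
Console.WriteLine("Count={0}, Capacity={1}", clone.Count, clone.Capacity);  // e.g. 1000, 1000

clone.Add(42);                                // buffer full -> reallocate + copy, O(n)
Console.WriteLine("Count={0}, Capacity={1}", clone.Count, clone.Capacity);  // e.g. 1001, 2000

clone = myList.ToList();
clone.RemoveAt(clone.Count - 1);              // frees one slot, O(1)
clone.Add(42);                                // fits in the existing buffer, O(1)
Console.WriteLine("Count={0}, Capacity={1}", clone.Count, clone.Capacity);  // e.g. 1000, 1000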
