Cost of common operations for C#?

In Code Complete 2 (pages 601 and 602) there is a table of "Cost of Common Operations".
The baseline operation, integer assignment, is given a value of 1, and the relative times of common operations are then listed for Java and C++. For example:
Operation                    C++   Java
Integer assignment           1     1
Integer division             5     1.5
Floating point square root   15    4
Has anyone got this data for C#? I know these numbers won't help me solve any specific problem; I'm just curious.

I implemented some of the tests from the book. Some raw data from my computer:
Test Run #1:
TestIntegerAssignment 00:00:00.6680000
TestCallRoutineWithNoParameters 00:00:00.9780000
TestCallRoutineWithOneParameter 00:00:00.6580000
TestCallRoutineWithTwoParameters 00:00:00.9650000
TestIntegerAddition 00:00:00.6410000
TestIntegerSubtraction 00:00:00.9630000
TestIntegerMultiplication 00:00:00.6490000
TestIntegerDivision 00:00:00.9720000
TestFloatingPointDivision 00:00:00.6500000
TestFloatingPointSquareRoot 00:00:00.9790000
TestFloatingPointSine 00:00:00.6410000
TestFloatingPointLogarithm 00:00:41.1410000
TestFloatingPointExp 00:00:34.6310000
Test Run #2:
TestIntegerAssignment 00:00:00.6750000
TestCallRoutineWithNoParameters 00:00:00.9720000
TestCallRoutineWithOneParameter 00:00:00.6490000
TestCallRoutineWithTwoParameters 00:00:00.9750000
TestIntegerAddition 00:00:00.6730000
TestIntegerSubtraction 00:00:01.0300000
TestIntegerMultiplication 00:00:00.7000000
TestIntegerDivision 00:00:01.1120000
TestFloatingPointDivision 00:00:00.6630000
TestFloatingPointSquareRoot 00:00:00.9860000
TestFloatingPointSine 00:00:00.6530000
TestFloatingPointLogarithm 00:00:39.1150000
TestFloatingPointExp 00:00:33.8730000
Test Run #3:
TestIntegerAssignment 00:00:00.6590000
TestCallRoutineWithNoParameters 00:00:00.9700000
TestCallRoutineWithOneParameter 00:00:00.6680000
TestCallRoutineWithTwoParameters 00:00:00.9900000
TestIntegerAddition 00:00:00.6720000
TestIntegerSubtraction 00:00:00.9770000
TestIntegerMultiplication 00:00:00.6580000
TestIntegerDivision 00:00:00.9930000
TestFloatingPointDivision 00:00:00.6740000
TestFloatingPointSquareRoot 00:00:01.0120000
TestFloatingPointSine 00:00:00.6700000
TestFloatingPointLogarithm 00:00:39.1020000
TestFloatingPointExp 00:00:35.3560000
(1 billion iterations per benchmark, compiled with optimizations enabled, AMD Athlon X2 3.0 GHz, using Jon Skeet's microbenchmarking framework, available at http://www.yoda.arachsys.com/csharp/benchmark.html)
Source:
class TestBenchmark
{
    [Benchmark]
    public static void TestIntegerAssignment()
    {
        int i = 1;
        int j = 2;
        for (int x = 0; x < 1000000000; x++)
        {
            i = j;
        }
    }
    [Benchmark]
    public static void TestCallRoutineWithNoParameters()
    {
        for (int x = 0; x < 1000000000; x++)
        {
            TestStaticRoutine();
        }
    }
    [Benchmark]
    public static void TestCallRoutineWithOneParameter()
    {
        for (int x = 0; x < 1000000000; x++)
        {
            TestStaticRoutine2(5);
        }
    }
    [Benchmark]
    public static void TestCallRoutineWithTwoParameters()
    {
        for (int x = 0; x < 1000000000; x++)
        {
            TestStaticRoutine3(5, 7);
        }
    }
    [Benchmark]
    public static void TestIntegerAddition()
    {
        int i = 1;
        int j = 2;
        int k = 3;
        for (int x = 0; x < 1000000000; x++)
        {
            i = j + k;
        }
    }
    [Benchmark]
    public static void TestIntegerSubtraction()
    {
        int i = 1;
        int j = 6;
        int k = 3;
        for (int x = 0; x < 1000000000; x++)
        {
            i = j - k;
        }
    }
    [Benchmark]
    public static void TestIntegerMultiplication()
    {
        int i = 1;
        int j = 2;
        int k = 3;
        for (int x = 0; x < 1000000000; x++)
        {
            i = j * k;
        }
    }
    [Benchmark]
    public static void TestIntegerDivision()
    {
        int i = 1;
        int j = 6;
        int k = 3;
        for (int x = 0; x < 1000000000; x++)
        {
            i = j / k;
        }
    }
    [Benchmark]
    public static void TestFloatingPointDivision()
    {
        float i = 1;
        float j = 6;
        float k = 3;
        for (int x = 0; x < 1000000000; x++)
        {
            i = j / k;
        }
    }
    [Benchmark]
    public static void TestFloatingPointSquareRoot()
    {
        double x = 1;
        float y = 6;
        for (int x2 = 0; x2 < 1000000000; x2++)
        {
            x = Math.Sqrt(6);
        }
    }
    [Benchmark]
    public static void TestFloatingPointSine()
    {
        double x = 1;
        float y = 6;
        for (int x2 = 0; x2 < 1000000000; x2++)
        {
            x = Math.Sin(y);
        }
    }
    [Benchmark]
    public static void TestFloatingPointLogarithm()
    {
        double x = 1;
        float y = 6;
        for (int x2 = 0; x2 < 1000000000; x2++)
        {
            x = Math.Log(y);
        }
    }
    [Benchmark]
    public static void TestFloatingPointExp()
    {
        double x = 1;
        float y = 6;
        for (int x2 = 0; x2 < 1000000000; x2++)
        {
            x = Math.Exp(6);
        }
    }
    private static void TestStaticRoutine()
    {
    }
    private static void TestStaticRoutine2(int i)
    {
    }
    private static void TestStaticRoutine3(int i, int j)
    {
    }
    private static class TestStaticClass
    {
    }
}
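To turn the raw timings above into the kind of relative table the question asks about, each elapsed time can be divided by the integer-assignment time from the same run. The following is only a sketch of that arithmetic: the class and method names are made up for illustration, and only a few rows from Test Run #1 are included. The ratios merely restate the measurements, which include loop overhead, so they should be read with caution (for run #1 the logarithm test comes out at roughly 60 times the assignment test).

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the benchmark framework used above.
static class RelativeCost
{
    // Ratio of a measured time to the baseline time (the baseline itself maps to 1.0).
    public static double Ratio(string elapsed, string baseline)
    {
        TimeSpan e = TimeSpan.Parse(elapsed, CultureInfo.InvariantCulture);
        TimeSpan b = TimeSpan.Parse(baseline, CultureInfo.InvariantCulture);
        return (double)e.Ticks / b.Ticks;
    }

    static void Main()
    {
        // Timings copied from Test Run #1 above; the baseline is TestIntegerAssignment.
        const string baseline = "00:00:00.6680000";
        var run1 = new (string Name, string Elapsed)[]
        {
            ("TestIntegerAssignment", "00:00:00.6680000"),
            ("TestIntegerDivision", "00:00:00.9720000"),
            ("TestFloatingPointSquareRoot", "00:00:00.9790000"),
            ("TestFloatingPointLogarithm", "00:00:41.1410000"),
            ("TestFloatingPointExp", "00:00:34.6310000"),
        };

        foreach (var (name, elapsed) in run1)
        {
            // Left-align the name, print the ratio with one decimal place.
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
                "{0,-28} {1,6:F1}", name, Ratio(elapsed, baseline)));
        }
    }
}
```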

Straight from the source: Know what things cost.
IIRC Rico Mariani had relative measures like the ones you asked for on his blog. I can't find it anymore, though (I know it's in one of those two hundred "dev" bookmarks...).

It's a reasonable question, but nearly every performance problem I've seen, especially in Java and C#, boiled down to:
too many layers of abstraction, and
reliance on event-based, notification-style coding,
which have little or nothing to do with basic operations.
The problem with abstraction is that it is fine until the workload gets heavy. Each layer usually exacts a small performance penalty, and these penalties compound. At that point you start needing workarounds. (I think StringBuilder is an example of such a workaround.)
The problem with event-based, notification-style coding (as opposed to simpler data structures kept consistent by a periodic process) is that seemingly simple actions, such as setting a property to a value, can trigger a ripple of further actions throughout the data structure, doing far more work than one might expect.

Related

What is the best way to process output of segmentation network in Microsoft.ML?

The network produces a 1 x N x K tensor, where N is the number of pixel positions and K is the number of classes; each value is the score for a class at a given position.
The current code that retrieves the best class for each position works, but it is terribly slow: it takes four times longer than the network run itself.
private int[,] GetClasses(List<DisposableNamedOnnxValue> output)
{
    Tensor<float> outTensor = output.First().AsTensor<float>();
    int[,] classes = new int[frameWidth, frameHeight];
    for (int i = 0; i < frameWidth; ++i)
    {
        for (int j = 0; j < frameHeight; ++j)
        {
            int finalClass = 0;
            float finalClassScore = 0;
            for (int k = 0; k < nClasses; ++k)
            {
                float score = outTensor[0, i * frameHeight + j, k];
                if (score > finalClassScore)
                {
                    finalClassScore = score;
                    finalClass = k;
                }
            }
            classes[i, j] = finalClass;
        }
    }
    return classes;
}
Is there a better, faster way of doing this in Microsoft.ML?

The solution I went with was to add an argmax layer to the original Keras model, so that the network outputs a single class index per position instead of K scores.
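If changing the model is not an option, the per-position argmax can also be computed over a flat buffer rather than through the tensor indexer. The sketch below assumes the scores are already available as a row-major float array of length N * K; how that buffer is obtained from the ONNX Runtime output is not shown, and the class and method names are hypothetical. Unlike the loop in the question, it starts from the first score instead of 0, so positions where every score is negative are still handled.

```csharp
using System;

// Hypothetical helper for illustration; not part of Microsoft.ML or ONNX Runtime.
static class ArgMaxSketch
{
    // scores holds positions * classes values, row-major:
    // the score of class k at position p is scores[p * classes + k].
    public static int[] ArgMaxPerPosition(ReadOnlySpan<float> scores, int positions, int classes)
    {
        if (classes < 1)
            throw new ArgumentOutOfRangeException(nameof(classes));
        if (scores.Length != positions * classes)
            throw new ArgumentException("scores length must equal positions * classes");

        var best = new int[positions];
        for (int p = 0; p < positions; p++)
        {
            // Slice out the K scores for this position; no copy is made.
            ReadOnlySpan<float> row = scores.Slice(p * classes, classes);
            int bestClass = 0;
            float bestScore = row[0];
            for (int k = 1; k < classes; k++)
            {
                if (row[k] > bestScore)
                {
                    bestScore = row[k];
                    bestClass = k;
                }
            }
            best[p] = bestClass;
        }
        return best;
    }

    static void Main()
    {
        // Two positions, three classes; the second position has only negative scores.
        float[] scores = { 0.1f, 0.7f, 0.2f, -3f, -1f, -2f };
        int[] result = ArgMaxPerPosition(scores, 2, 3);
        Console.WriteLine(string.Join(",", result)); // prints 1,1
    }
}
```

Whether this is actually faster than the original loop depends on the tensor implementation, so it is worth measuring before relying on it.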

How can I access multi dimension in one dimension array?

I have the following code:
Boo[,,] boos = new Boo[8, 8, 8];
Boo GetMe(int i, int j, int k)
{
    return boos[i, j, k];
}
The code above is inefficient, so I converted it to a one-dimensional array:
Boo[] boosone;
Boo[,,] boos = new Boo[8, 8, 8];
Boo GetMe(int i, int j, int k)
{
    if (boosone == null)
    {
        boosone = new Boo[8 * 8 * 8];
        int num = 0;
        for (int x = 0; x < 8; x++)
        {
            for (int y = 0; y < 8; y++)
            {
                for (int z = 0; z < 8; z++)
                {
                    boosone[num] = boos[x, y, z];
                    num++;
                }
            }
        }
    }
    return boosone[?];
}
How can I get the Boo at the same position (i, j, k in the multidimensional array) from the one-dimensional array boosone?

int index = (8 * 8 * i) + (8 * j) + k;
return boosone[index];
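The formula above is the usual row-major mapping, and it generalizes to dimensions other than 8. As a small sketch (the helper name and the 2 x 3 x 4 dimensions are chosen only for illustration), the following fills a 3D array and a flat array in the same order as the question's loops and checks that the computed index finds the same element:

```csharp
using System;

// Hypothetical helper for illustration.
static class FlatIndex
{
    // Row-major index of (i, j, k) in an array whose last two dimensions are dimY and dimZ.
    public static int ToFlat(int i, int j, int k, int dimY, int dimZ)
        => (i * dimY + j) * dimZ + k;

    static void Main()
    {
        const int dimX = 2, dimY = 3, dimZ = 4;
        var cube = new int[dimX, dimY, dimZ];
        var flat = new int[dimX * dimY * dimZ];

        // Fill both arrays in the same x, y, z order as the copy loop in the question.
        int n = 0;
        for (int x = 0; x < dimX; x++)
            for (int y = 0; y < dimY; y++)
                for (int z = 0; z < dimZ; z++)
                {
                    cube[x, y, z] = n;
                    flat[n] = n;
                    n++;
                }

        // Every 3D position must map to the matching flat element.
        bool allMatch = true;
        for (int x = 0; x < dimX; x++)
            for (int y = 0; y < dimY; y++)
                for (int z = 0; z < dimZ; z++)
                    if (flat[ToFlat(x, y, z, dimY, dimZ)] != cube[x, y, z])
                        allMatch = false;

        Console.WriteLine(allMatch); // prints True
    }
}
```

With dimY = dimZ = 8 this reduces to (8 * 8 * i) + (8 * j) + k, the expression given in the answer.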

I'm not really sure why you say that the first 3D array is not efficient (have you actually noticed a particularly heavy slowdown when using it?), but you can do this with some simple offset calculations.
First of all, if you target a recent .NET version, you can replace the whole copy loop with just two lines (MemoryMarshal.CreateSpan is a runtime API rather than a C# language feature), and your code would then look like this:
using System;
using System.Runtime.InteropServices;
Boo[] boosone;
Boo[,,] boos = new Boo[8, 8, 8];
Boo GetMe(int i, int j, int k)
{
    if (boosone == null)
    {
        boosone = new Boo[boos.Length];
        // View the 3D array's contiguous storage as a span and copy it in one call.
        MemoryMarshal.CreateSpan(ref boos[0, 0, 0], boosone.Length).CopyTo(boosone);
    }
    return boosone[boos.GetLength(1) * boos.GetLength(2) * i + boos.GetLength(2) * j + k];
}
If you don't want to use the MemoryMarshal class for some reason, you could also use LINQ (System.Linq) to flatten your 3D array, although this approach is much less efficient:
boosone = boos.Cast<Boo>().ToArray();

Accessing a multidimensional array isn't any slower than accessing a single-dimensional array; in fact, both are stored in memory in exactly the same way. It's not what you are doing, it's how you are doing it.
If you want to wrap either array in a trivial method, give the compiler a hint that it can be inlined (MethodImpl lives in System.Runtime.CompilerServices):
[MethodImpl(MethodImplOptions.AggressiveInlining)]
Boo GetMe(int i, int j, int k)
{
    return boos[i, j, k];
}
That said, this method does absolutely nothing and has no advantage over just using the array indexer.
If you want to work with segments of an array without the overhead of reallocation, consider using Span<T>, Memory<T>, or ArraySegment<T>.
At about this point I would write example code; however, as I have no idea what you are doing, it's hard to guess what you need.
What I suggest is that you download BenchmarkDotNet and start profiling your code to work out the most efficient way to do what you want. Don't guess.
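As one possible illustration of the Span<T> suggestion, a span can expose one "row" of a flattened array as a view without allocating or copying. This is only a sketch under assumed dimensions (2 x 3 x 4) and with hypothetical names, since the original question does not say how the data is used.

```csharp
using System;

// Hypothetical helper for illustration.
static class SpanSliceSketch
{
    // Returns a view over the dimZ elements at (i, j, *) of a row-major flat array.
    public static Span<int> Row(int[] flat, int i, int j, int dimY, int dimZ)
        => flat.AsSpan((i * dimY + j) * dimZ, dimZ);

    static void Main()
    {
        const int dimX = 2, dimY = 3, dimZ = 4;
        int[] flat = new int[dimX * dimY * dimZ];
        for (int n = 0; n < flat.Length; n++)
            flat[n] = n;

        // The span starts at flat index (1 * 3 + 2) * 4 = 20 and covers 4 elements.
        Span<int> row = Row(flat, 1, 2, dimY, dimZ);
        row[0] = -1; // writes through to the underlying array

        Console.WriteLine(flat[20]);   // prints -1
        Console.WriteLine(row.Length); // prints 4
    }
}
```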

Why don't you look at jagged arrays, which provide better performance? I ran a test (under the Release configuration) which showed that your wrapper is about twice as fast as the 3D array, while the jagged array is about three times as fast as the 3D array.
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
namespace ArrayWrapper
{
    class ArrayPerformanceTest
    {
        int xSize = 2;
        int ySize = 3;
        int zSize = 4;
        int count = 100000000;
        int delay = 500;
        static void Main(string[] args)
        {
            new ArrayPerformanceTest().Run();
        }
        private void Run()
        {
            var d3Array = CreateD3Array();
            var wrapped = GetD1Adapter(d3Array);
            var jagged = GetJaggedArray(d3Array);
            Thread.Sleep(delay);
            TestD3Array(d3Array);
            Thread.Sleep(delay);
            TestWrappedArray(wrapped);
            Thread.Sleep(delay);
            TestJaggeddArray(jagged);
            Thread.Sleep(delay);
        }
        private int[,,] CreateD3Array()
        {
            var rectangular = new int[xSize, ySize, zSize];
            int i = 7;
            for (var x = 0; x < xSize; x++)
                for (var y = 0; y < ySize; y++)
                    for (var z = 0; z < zSize; z++)
                        rectangular[x, y, z] = ++i;
            return rectangular;
        }
        private int[] GetD1Adapter(int[,,] d3Array)
        {
            return d3Array.Cast<int>().ToArray();
        }
        private int[][][] GetJaggedArray(int[,,] d3Array)
        {
            var xSize = d3Array.GetUpperBound(0) + 1;
            var ySize = d3Array.GetUpperBound(1) + 1;
            var zSize = d3Array.GetUpperBound(2) + 1;
            var jagged = new int[xSize].Select(j => new int[ySize].Select(k => new int[zSize].ToArray()).ToArray()).ToArray();
            for (var x = 0; x < xSize; x++)
                for (var y = 0; y < ySize; y++)
                    for (var z = 0; z < zSize; z++)
                        jagged[x][y][z] = d3Array[x, y, z];
            return jagged;
        }
        private void TestD3Array(int[,,] d3Array)
        {
            int i;
            var sw = new Stopwatch();
            sw.Start();
            for (var c = 0; c < count; c++)
                for (var x = 0; x < xSize; x++)
                    for (var y = 0; y < ySize; y++)
                        for (var z = 0; z < zSize; z++)
                            i = d3Array[x, y, z];
            sw.Stop();
            Console.WriteLine($"{nameof(d3Array),7} {sw.ElapsedTicks,10}");
        }
        private void TestWrappedArray(int[] wrapped)
        {
            int i;
            var sw = new Stopwatch();
            sw.Start();
            for (var c = 0; c < count; c++)
                for (var x = 0; x < xSize; x++)
                    for (var y = 0; y < ySize; y++)
                        for (var z = 0; z < zSize; z++)
                            i = wrapped[x * ySize * zSize + y * zSize + z];
            sw.Stop();
            Console.WriteLine($"{nameof(wrapped),7} {sw.ElapsedTicks,10}");
        }
        private void TestJaggeddArray(int[][][] jagged)
        {
            int i;
            var sw = new Stopwatch();
            sw.Start();
            for (var c = 0; c < count; c++)
                for (var x = 0; x < xSize; x++)
                    for (var y = 0; y < ySize; y++)
                        for (var z = 0; z < zSize; z++)
                            i = jagged[x][y][z];
            sw.Stop();
            Console.WriteLine($"{nameof(jagged),7} {sw.ElapsedTicks,10}");
        }
    }
}
Output:
d3Array 15541709
wrapped 8213316
jagged 5322008
I also analysed CPU usage; it was about the same for all three approaches.

C# Multithreading - Performance Issue

I have a console application that performs a numerical calculation, which I am trying to parallelize using the ThreadPool.
The state object is a simple data-passing class:
public class DataContainer
{
    public double[,] Exa;
    public double[] EQ;
    public int iStart;
    public int iEnd;
}
The definition of the WaitCallback:
private void Calculate(object state)
{
    DataContainer data = state as DataContainer;
    for (int m = data.iStart; m < data.iEnd; m++)
    {
        double temp = 0.0;
        for (int i = 0; i < 500000; i++)
        {
            for (int j = i + 1; j < 500000; j++)
            {
                // Some long calculation based on data.Exa (around 200 math operations) that produces a double, EQC
                if (EQC > temp) { temp = EQC; } // line with the performance issue; temp is declared in the first for-loop block
            }
        }
    }
}
Execution:
WaitCallback waitCallback = Calculate;
const int numberOfThreads = 100;
ThreadPool.SetMaxThreads(30, 100);
for (int i = 0; i < numberOfThreads; i++)
{
    DataContainer tempContainer = new DataContainer();
    tempContainer.Exa = Exa;
    tempContainer.EQ = EQ;
    tempContainer.iStart = CalculateStart(i);
    tempContainer.iEnd = CalculateEnd(i);
    ThreadPool.QueueUserWorkItem(waitCallback, tempContainer);
}
int numberOfTotalThreads = 0;
int numberOfMaxThreads = 0;
int numberOfWorkingThreads = 0;
int temp = 0;
// do-while: wait for all calculations to finish
do
{
    ThreadPool.GetAvailableThreads(out numberOfTotalThreads, out temp);
    ThreadPool.GetMaxThreads(out numberOfMaxThreads, out temp);
    numberOfWorkingThreads = numberOfMaxThreads - numberOfTotalThreads;
    Console.WriteLine("Number of working threads {0}", numberOfWorkingThreads);
    Thread.Sleep(1000);
} while (numberOfWorkingThreads > 0);
With the one marked line present:
if (EQC > temp) { temp = EQC; }
the execution time of the program goes from 40 s to 600 s.
Could you advise how this line should be written to avoid that problem?

Arithmetic operations using console application

I am trying to create an application that records the time each machine takes to perform simple arithmetic operations.
It is a console application that takes the number of loops and the number of threads as parameters, with the code below:
public static Int64 IterationCount { get; set; }
static void Main(string[] args)
{
    int iterations = int.Parse(args[0]);
    int threads = int.Parse(args[1]);
    IterationCount = iterations * 1000000000;
    Stopwatch sw = new Stopwatch();
    sw.Start();
    for (int i = 0; i < threads; i++)
    {
        Task.Factory.StartNew(() => Calculate());
        Task.WaitAll();
    }
    sw.Stop();
    Console.WriteLine("Elapsed={0}", sw.Elapsed);
}
And my Calculate method:
private static void Calculate()
{
    for (int i = 0; i < IterationCount; i++)
    {
        a = 1 + 2;
        b = 1 - 2;
        c = 1 * 2;
        a = 1 / 2;
    }
}
I think this is not working, because when I enter 10 iterations (the first parameter is multiplied by 1 billion: 10 * 1,000,000,000) and 4 threads, the elapsed time is:
00:00:00:0119747
Is there anything I missed?

Your call to Task.WaitAll() has no effect, because the signature of the function is
public static void WaitAll(params Task[] tasks).
It accepts a variable number of tasks to wait for, and you call it with no tasks at all, so it does not wait for anything.
If you replace your code with the following, you will see the effect:
Task[] tasks = new Task[threads];
for (int i = 0; i < threads; i++)
{
    tasks[i] = Task.Factory.StartNew(() => Calculate());
}
Task.WaitAll(tasks);
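Separately from the WaitAll issue, the question's line IterationCount = iterations * 1000000000 multiplies two int values, so the product is computed in 32-bit arithmetic before it is widened to Int64. The sketch below (with hypothetical method names) shows the difference that a long literal makes; unchecked is written out explicitly so the behavior does not depend on project settings.

```csharp
using System;

// Hypothetical helper for illustration.
static class IterationCountSketch
{
    // int * int is evaluated as int and wraps around before the conversion to long.
    public static long Wrong(int iterations) => unchecked(iterations * 1000000000);

    // The L suffix makes the literal a long, so the multiplication is done in 64 bits.
    public static long Right(int iterations) => iterations * 1000000000L;

    static void Main()
    {
        Console.WriteLine(Wrong(10)); // prints 1410065408
        Console.WriteLine(Right(10)); // prints 10000000000
    }
}
```

Note that once the count really is 10 billion, the loop counter in Calculate would also need to be a long: an int counter can never reach that value, so the loop would not terminate.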

Turns out my comment is accurate. If I copy the contents of your Calculate method into Visual Studio:
private static void Calculate()
{
    for (int i = 0; i < IterationCount; i++)
    {
        a = 1 + 2;
        b = 1 - 2;
        c = 1 * 2;
        d = 1 / 2;
    }
}
then after compilation, the decompiled C# code looks like this:
private static void Calculate()
{
    for (int i = 0; i < Program.IterationCount; i++)
    {
        Program.a = 3;
        Program.b = -1;
        Program.c = 2;
        Program.d = 0;
    }
}
Instead, you're going to have to make one of the constants into a variable:
private static void Calculate()
{
    int x = 1;
    for (int i = 0; i < IterationCount; i++)
    {
        a = x + 2;
        b = x - 2;
        c = x * 2;
        d = x / 2;
    }
}
This code becomes:
private static void Calculate()
{
    int x = 1;
    for (int i = 0; i < Program.IterationCount; i++)
    {
        Program.a = x + 2;
        Program.b = x - 2;
        Program.c = x * 2;
        Program.d = x / 2;
    }
}

C# multidimensional array value changes unexpectedly

Hello, I am seeing really strange behavior in this piece of code:
public class IGraphics
{
    public int[,] screen;
    private int[,] world;
    private int[,] entitys;
    private int[,] buffer;
    private int screenW;
    private int screenH;
    public IGraphics(int screenW, int screenH)
    {
        this.screenH = screenH;
        this.screenW = screenW;
        screen = new int[screenW + 1, screenH];
        buffer = new int[screenW + 1, screenH];
    }
    public void loadWorld(int[,] world)
    {
        this.world = world;
    }
    public void clear()
    {
        screen = new int[screenW + 1, screenH];
        world = new int[screenW, screenH];
        for (int y = 0; y < world.GetLength(1); y++)
        {
            for (int x = 0; x < world.GetLength(0); x++)
            {
                world[x, y] = 0;
            }
        }
    }
    private void loadScreen()
    {
    }
    private void updateEntitys()
    {
        entitys = new int[screenW, screenH];
        List<GameObject> EntRow = Common.world.getEntitys();
        for (int i = 0; i < EntRow.Count(); i++)
        {
            entitys[EntRow[i].x, EntRow[i].y] = EntRow[i].Icon;
        }
    }
    public void draw()
    {
        updateEntitys();
        for (int y = 0; y < screen.GetLength(1); y++)
        {
            for (int x = 0; x < screen.GetLength(0) - 1; x++)
            {
                if (entitys[x, y] == 0)
                {
                    screen[x, y] = world[x, y];
                }
                else
                {
                    screen[x, y] = entitys[x, y];
                }
            }
            screen[screen.GetLength(0) - 1, y] = 123;
        }
        if (buffer.Cast<int>().SequenceEqual(screen.Cast<int>()))
        {
            return;
        }
        Console.Clear();
        buffer = screen;
        for (int y = 0; y < screen.GetLength(1); y++)
        {
            for (int x = 0; x < screen.GetLength(0); x++)
            {
                if (screen[x, y] == 123)
                {
                    Console.WriteLine();
                }
                else
                {
                    Console.Write(objectStore.getIcon(screen[x, y]));
                }
            }
        }
    }
}
The problem is in the draw() function: when I set values in the screen[,] array, for some reason the values in the buffer[,] array change as well, before the comparison check runs. I also tried moving buffer[,] into a separate class, but I had the same problem.
Does anyone have an explanation?

When you assign one reference-type variable to another, you copy the reference rather than the contents, so you end up with two variables that refer to the same array.
Instead, copy the array's contents. Array.CopyTo only supports one-dimensional arrays, so for a two-dimensional array use Array.Copy (or Clone):
Array.Copy(screen, buffer, screen.Length);
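The difference between sharing a reference and copying the contents can be seen in a few lines. This is a minimal sketch with made-up names (Snapshot is not part of the question's code); it uses Clone, which returns a new array of the same shape, so later writes to the original do not show up in the copy.

```csharp
using System;

// Hypothetical helper for illustration.
static class ArrayCopySketch
{
    // Returns an independent copy of a two-dimensional int array.
    public static int[,] Snapshot(int[,] source) => (int[,])source.Clone();

    static void Main()
    {
        int[,] screen = new int[2, 2];
        int[,] alias = screen;          // same array object, like buffer = screen
        int[,] copy = Snapshot(screen); // separate array with the same contents

        screen[0, 0] = 42;

        Console.WriteLine(alias[0, 0]);                    // prints 42
        Console.WriteLine(copy[0, 0]);                     // prints 0
        Console.WriteLine(ReferenceEquals(alias, screen)); // prints True
        Console.WriteLine(ReferenceEquals(copy, screen));  // prints False
    }
}
```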

If you look at the body of the method called draw, you will notice this assignment:
buffer = screen;
This might be the cause of the change you noticed.

See the other answers for possible reasons for your problem. However, the following code shows that changing screen[] does not change buffer[]. This kind of effort on your part would allow you to focus your investigation elsewhere. The principle is to simplify first.
int counter = 1;
public void draw()
{
    for (int y = 0; y < screen.GetLength(1); y++)
    {
        for (int x = 0; x < screen.GetLength(0) - 1; x++)
        {
            screen[x, y] = counter++;
        }
        screen[screen.GetLength(0) - 1, y] = 123;
    }
    if (buffer.Cast<int>().SequenceEqual(screen.Cast<int>()))
    {
        MessageBox.Show("Help!");
        return;
    }
    // check again
    for (int y = 0; y < screen.GetLength(1); y++)
    {
        for (int x = 0; x < screen.GetLength(0) - 1; x++)
        {
            if (screen[x, y] == buffer[x, y])
            {
                MessageBox.Show("Help two!");
                return;
            }
        }
    }
}
