Double multidimensional array in C++: the best way

This SO thread is about multidimensional arrays in C++.
I have to port some code from C# to C++. I have code like this:
private double[,] B;
...
this.B = new double[states, symbols];
double[][, ,] epsilon = new double[N][, ,];
double[][,] gamma = new double[N][,];
...
s += gamma[i][t, k] = ...
I have thought of using a plain double array of arrays, but it's quite a pain. Another solution could be a vector of vectors of double, or custom Matrix2D and Matrix3D classes.
What is the best way for each of those cases?
WHAT I LEARNED:
Multidimensional arrays in C++ are a great topic, and the internet is full of resources. They can be handled in various ways, some of them really tricky, others quicker to write.
I think the best way to deal with them is to use a library that takes care of this topic. There are a lot of them: Armadillo (nice MATLAB-like syntax); Eigen, which I think is one of the better ones: easy to install, easy to use, powerful; and Boost::multi_array. Boost is a really famous library, and it is worth a look just to see how it handles the topic. As in Konrad Rudolph's answer, the standard library with nested vectors, or this, could be another solution but, after a little searching, I think it is the least elegant one, even if it is the easiest and fastest to code without external libraries.
Write a custom class. Maybe a good exercise. peter's answer or this or this are good starting points, and this post is also interesting, but especially this great blog post from Martin Moene (one of the best essays on this topic I've read today). I also mention this answer for sparse arrays.
Here is a nice tutorial direct from Stroustrup.
Have a nice time with multidimensional arrays :-)

C++ has no direct equivalent of T[,] (although you could of course implement one by encapsulating the following code in a class; this is left as an exercise to the reader).
All C++ supports is nesting arrays/vectors (the equivalent of [][] in C#). So your first code would correspond to
vector<vector<double> > B(states, vector<double>(symbols));
… which initialises a vector of vectors, initialising the outer vector with states copies of an appropriately initialised inner vector.
Of course this can be taken to arbitrary complexity but at this point a few typedefs are in order to make the code more understandable.
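A minimal sketch of what such typedefs might look like for the question's gamma array. The names Row, Matrix2D, makeMatrix2D and makeGamma are hypothetical and not from the answer, and the sketch assumes every inner matrix has the same size, which the C# code does not require.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical aliases for illustration; the names are not from the answer.
typedef std::vector<double> Row;
typedef std::vector<Row> Matrix2D;   // stands in for C#'s double[,]

// Builds a rows x cols matrix filled with zeros.
inline Matrix2D makeMatrix2D(std::size_t rows, std::size_t cols)
{
    return Matrix2D(rows, Row(cols, 0.0));
}

// Roughly "double[][,] gamma = new double[N][,];" followed by allocating
// every gamma[i] as a t x k matrix (an assumption: all inner sizes equal).
inline std::vector<Matrix2D> makeGamma(std::size_t n, std::size_t t, std::size_t k)
{
    assert(n > 0);
    return std::vector<Matrix2D>(n, makeMatrix2D(t, k));
}
```

With this, the C# expression gamma[i][t, k] becomes gamma[i][t][k].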

#include <vector>

// Stores a states x symbols table in one contiguous vector (column-major).
class StateSymbols
{
public:
    StateSymbols(unsigned int states, unsigned int symbols) :
        m_states(states),
        m_stateSymbols(states * symbols)
    {
    }
    // Element (state, symbol) lives at offset m_states * symbol + state.
    double get(unsigned int state, unsigned int symbol) const
    {
        return m_stateSymbols[(m_states * symbol) + state];
    }
private:
    const unsigned int m_states;
    std::vector<double> m_stateSymbols;
};
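The class above only exposes a getter. A possible generalisation, sketched here under assumptions, adds read/write access through operator(). FlatMatrix2D and its members are hypothetical names, and the sketch uses a row-major layout rather than the column-major one above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D matrix over one contiguous buffer, row-major layout.
class FlatMatrix2D
{
public:
    FlatMatrix2D(std::size_t rows, std::size_t cols)
        : m_rows(rows), m_cols(cols), m_data(rows * cols, 0.0) {}

    // Writable access: B(r, c) = value;
    double& operator()(std::size_t r, std::size_t c)
    {
        assert(r < m_rows && c < m_cols);
        return m_data[r * m_cols + c];
    }
    // Read-only access for const objects.
    double operator()(std::size_t r, std::size_t c) const
    {
        assert(r < m_rows && c < m_cols);
        return m_data[r * m_cols + c];
    }
    std::size_t rows() const { return m_rows; }
    std::size_t cols() const { return m_cols; }
private:
    std::size_t m_rows, m_cols;
    std::vector<double> m_data;
};
```

Usage would look like FlatMatrix2D B(states, symbols); B(i, j) = 0.25; which stays close to the C# B[i, j] syntax.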

Check out my answer on:
C++ Multi-dimensional Arrays on the Heap
It defines a basic function Create3D to allocate a 3D (generalizes to other dimensions) array in contiguous memory on the heap in a way that allows A[i][j][k] access operator syntax.
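The linked Create3D function is not reproduced here. As a rough illustration of the contiguous-memory idea only, the following sketch stores a 3D array in one vector and computes the offset by hand. FlatArray3D is a hypothetical name, and it uses A(i, j, k) call syntax rather than the A[i][j][k] syntax the answer describes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3D array in one contiguous buffer (not the answer's Create3D).
class FlatArray3D
{
public:
    FlatArray3D(std::size_t d1, std::size_t d2, std::size_t d3)
        : m_d1(d1), m_d2(d2), m_d3(d3), m_data(d1 * d2 * d3, 0.0) {}

    // Offset of element (i, j, k); the last index varies fastest.
    std::size_t index(std::size_t i, std::size_t j, std::size_t k) const
    {
        assert(i < m_d1 && j < m_d2 && k < m_d3);
        return (i * m_d2 + j) * m_d3 + k;
    }
    double& operator()(std::size_t i, std::size_t j, std::size_t k)
    {
        return m_data[index(i, j, k)];
    }
    double operator()(std::size_t i, std::size_t j, std::size_t k) const
    {
        return m_data[index(i, j, k)];
    }
private:
    std::size_t m_d1, m_d2, m_d3;
    std::vector<double> m_data;
};
```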

I would say a dynamic array:
double** list;
list = new double*[3]; // dimension1 = 3 = rows
for (int i = 0; i < 3; i++)
    list[i] = new double[2]; // dimension2 = 2 = cols
list[0][0] = 1;
//...
for (int i = 0; i < 3; i++)
    delete [] list[i];
delete [] list;

Related

Array.ConvertAll() and ToArray() method, which is better?

Which of the following methods is better for reading from the console and storing the input in an int array? Is there any difference between the two?
int[][] s = new int[3][];
for(int i=0; i<elements.Length;i++)
{
s[i] = Console.ReadLine().Split(' ').Select(x => int.Parse(x)).ToArray();
}
OR
int[][] s = new int[3][];
for (int i = 0; i < 3; i++) {
s[i] = Array.ConvertAll(Console.ReadLine().Split(' '), sTemp => Convert.ToInt32(sTemp));
}
Thanks in Advance!!!
Firstly, these aren't equivalent: in one version you are using int.Parse(x), in the other Convert.ToInt32(sTemp).
That aside, you have found a perfect example of how to do something in more than one way... In programming you will find this a lot.
ConvertAll()
Converts an array of one type to an array of another type.
Select()
Projects each element of a sequence into a new form.
ToArray()
Creates an array from an IEnumerable.
Technically, in combination they produce the same thing, yet go about it in slightly different ways because they belong to slightly different areas of the BCL with concerns in slightly different domains.
Personally I don't see ConvertAll used all that much these days, as people are very familiar with LINQ and like to chain methods.
As to which is better for performance, we would have to write a lot of tests to figure this out, and it would come down to allocations versus speed per array size per platform. However, I feel the difference would be relatively indistinguishable day to day, and my guess is you would struggle to find much of a performance difference at any significant sigma.
In short, use what you like.

Translating C pointer statements to their C# equivalent

I have some C code I am trying to translate into C# code and I'm running into pointers, which I am not familiar with, so I don't know the C# equivalent. Can I get some help?
Case 1: Given these three lines in C, how do I declare p in C#?
double snorm[169];
double *p = snorm;
*p = 1.0;
Case 2: I have no idea what the pointers are actually doing, so I don't know how to change this line to C#.
*(snorm+n) = *(snorm+n-1) * (double)(2*n-1) / (double)n;
First:
double[] snorm = new double[169];
snorm[0] = 1.0;
Then just use snorm instead of p.
Second:
snorm[n] = snorm[n-1] * (double)(2*n-1)/(double)n;
Basically *p means that you take the value at the memory address referenced by p. Incrementing or adding to the pointer moves it in memory, so p++, as well as (p+1), just refers to the next item in memory (how far it really moves depends on the data type the pointer points to). And *(p+n) is just the value of the n-th item in the array (if p points to an array).
Anyway, you should familiarise yourself with pointers.
That code is basically using pointers as an alternative to array access. So your first snippet is equivalent to:
double[] snorm = new double[169];
snorm[0] = 1.0;
The next bit is equivalent to:
snorm[n] = snorm[n-1] * (double)(2*n-1) / (double)n;
(I'd use more spaces, but obviously that's a matter of taste.)
The only tricky bit is going to be if something increments a pointer - at that point you'll need to remember that you've basically got an extra offset to add to any future array indexes.
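To make the equivalence concrete, here is a small sketch in C++ (which accepts the question's C pointer syntax). The function names and the loop over n are assumptions added for illustration; the question only shows the single statements.

```cpp
#include <cassert>
#include <cstddef>

// Fills snorm[0..count-1] using pointer syntax, as in the question's C code.
// The surrounding loop is an assumption; the question shows one statement.
inline void fillWithPointers(double* snorm, std::size_t count)
{
    assert(count > 0);
    double* p = snorm;
    *p = 1.0;                                   // same as snorm[0] = 1.0
    for (std::size_t n = 1; n < count; ++n)
        *(snorm + n) = *(snorm + n - 1) * (double)(2 * n - 1) / (double)n;
}

// The same computation with array indexing, as it would be written in C#.
inline void fillWithIndexes(double* snorm, std::size_t count)
{
    assert(count > 0);
    snorm[0] = 1.0;
    for (std::size_t n = 1; n < count; ++n)
        snorm[n] = snorm[n - 1] * (double)(2 * n - 1) / (double)n;
}

// After a pointer has been moved by `offset`, p[k] refers to
// base[offset + k]: the "extra offset" to carry into array indexes.
inline double readThroughMovedPointer(const double* base,
                                      std::size_t offset, std::size_t k)
{
    const double* p = base;
    p += offset;
    return p[k];                                // identical to base[offset + k]
}
```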

Best practice with Math.Pow

I'm working on an image processing library which extends OpenCV, HALCON, ... . The library must work with .NET Framework 3.5, and since my experience with .NET is limited I would like to ask some questions regarding performance.
I have encountered a few specific things which I cannot properly explain to myself, and I would like to ask you a) why and b) what the best practice is for dealing with these cases.
My first question is about Math.Pow. I already found some answers here on StackOverflow which explain it quite well (a) but not what to do about it (b). My benchmark program looks like this:
Stopwatch watch = new Stopwatch(); // from System.Diagnostics
watch.Start();
for (int i = 0; i < 1000000; i++)
{
    double result = Math.Pow(4, 7); // the function call
}
watch.Stop();
The result was not very nice (~300 ms on my computer; I ran the test 10 times and calculated the average value).
My first idea was to check whether this is because it is a static function. So I implemented my own class:
class MyMath
{
    public static double Pow(double x, double y) // Using some expensive functions to calculate the power
    {
        return Math.Exp(Math.Log(x) * y);
    }
    public static double PowLoop(double x, int y) // Using a loop
    {
        double res = x;
        for (int i = 1; i < y; i++)
            res *= x;
        return res;
    }
    public static double Pow7(double x) // Using inline calls
    {
        return x * x * x * x * x * x * x;
    }
}
The third thing I checked was replacing Math.Pow(4,7) directly with 4*4*4*4*4*4*4.
The results are (the average of 10 test runs):
300 ms Math.Pow(4,7)
356 ms MyMath.Pow(4,7) //gives wrong rounded results
264 ms MyMath.PowLoop(4,7)
92 ms MyMath.Pow7(4)
16 ms 4*4*4*4*4*4*4
My situation now is basically this: don't use Math for Pow. My only problem is... do I really have to implement my own Math class now? It seems somehow inefficient to implement a class of my own just for the power function. (Btw., PowLoop and Pow7 are even faster in the Release build, by ~25%, while Math.Pow is not.)
So my final questions are
a) Am I wrong not to use Math.Pow at all (except maybe for fractional exponents)? (That makes me somewhat sad.)
b) If you have code to optimize, do you really write all such mathematical operations out directly?
c) Is there maybe already a faster (open-source^^) library for mathematical operations?
d) The source of my question is basically this: I had assumed that the .NET Framework itself already provides very optimized code / compile results for such basic operations, be it the Math class or array handling, and I was a little surprised how much I would gain by writing my own code. Are there other general "fields" or anything else to look out for in C# where I cannot trust C# directly?
A few things to bear in mind:
You probably don't need to optimise this bit of code. You've just done a million calls to the function in less than a second. Is this really going to cause big problems in your program?
Math.Pow is probably fairly optimal anyway. At a guess, it will be calling a proper numerics library written in a lower level language, which means you shouldn't expect orders of magnitude increases.
Numerical programming is harder than you think. Even the algorithms that you think you know how to calculate, aren't calculated that way. For example, when you calculate the mean, you shouldn't just add up the numbers and divide by how many numbers you have. (Modern numerics libraries use a two pass routine to correct for floating point errors.)
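As an illustration of the mean example, here is a sketch of one common form of a two-pass mean, written in C++ like the other sketches on this page. It is not taken from any particular numerics library, and twoPassMean is a hypothetical name: the first pass computes the naive mean, the second adds back the average of the residuals to reduce accumulated rounding error.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-pass mean; assumes xs is non-empty.
inline double twoPassMean(const std::vector<double>& xs)
{
    assert(!xs.empty());
    const double n = static_cast<double>(xs.size());

    // Pass 1: naive estimate, sum divided by count.
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i)
        sum += xs[i];
    const double estimate = sum / n;

    // Pass 2: the residuals sum to zero in exact arithmetic; whatever is
    // left over is rounding error, which is added back as a correction.
    double residual = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i)
        residual += xs[i] - estimate;

    return estimate + residual / n;
}
```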
That said, if you decide that you definitely do need to optimise, then consider using integers rather than floating point values, or outsourcing this to another numerics library.
Firstly, integer operations are much faster than floating point. If you don't need floating point values, don't use the floating point data type. This is generally true for any programming language.
Secondly, as you have stated yourself, Math.Pow can handle reals. It makes use of a much more intricate algorithm than a simple loop. No wonder it is slower than simply looping. If you get rid of the loop and just do n multiplications, you also cut out the overhead of setting up the loop, thus making it faster. But if you don't use a loop, you have to know the value of the exponent beforehand; it can't be supplied at runtime.
I am not really sure why Math.Exp and Math.Log are faster. But if you use Math.Log, you can't find the power of negative values.
Basically, ints are faster and avoiding loops avoids extra overhead, but you are trading off some flexibility when you go for those. It is generally a good idea to avoid reals when all you need are integers, but in this case coding up a custom function when one already exists seems a little too much.
The question you have to ask yourself is whether this is worth it. Is Math.Pow actually slowing your program down? And in any case, the Math.Pow already bundled with your language is often the fastest or very close to that. If you really wanted to make an alternate implementation that is really general purpose (i.e. not limited to only integers, positive values, etc.), you will probably end up using the same algorithm used in the default implementation anyway.
When you are talking about making a million iterations of a line of code then obviously every little detail will make a difference.
Math.Pow() is a function call which will be substantially slower than your manual 4*4...*4 example.
Don't write your own class, as it's doubtful you'll be able to write anything more optimised than the standard Math class.

Is it correct to use Array.CopyTo to copy elements or should a for-loop always be used?

It's easier to write
intArray1.CopyTo( intArray2, 0 )
than the for-loop equivalent, but System.Array does not provide any generic Copy/CopyTo methods.
Is it better to write the for-loop? Or is using Copy/CopyTo compiled or JIT'd efficiently enough?
Array.Copy/CopyTo will perform faster than a manual loop in most cases as it can do direct memory copying.
If you don't have huge arrays or speed is not an issue, use whatever would look best in your code where you need to copy the items.
If you are copying an array of primitive types, as your sample would imply, you can use the memory copy technique yourself with the Buffer class's BlockCopy method.
int[] CopyArray(int[] A, int index)
{
    const int INT_SIZE = 4;
    int length = A.Length - index;
    int[] B = new int[A.Length - index];
    Buffer.BlockCopy(A, index * INT_SIZE, B,
        0 * INT_SIZE, length * INT_SIZE);
    return B;
}
This method is the most efficient manner in which to copy an array of primitives. (It only works with primitives)
I say if you know that you want to copy the entirety of the first array to the second array without changing the values or doing any specific processing on the copy, then use Array.CopyTo.
There are some limitations to this. The array must only have a single dimension, as I remember it. Also, if the arrays are quite large you might have some speed-related issues with CopyTo, but I would imagine that would only come into play with very large arrays. So I would try it and test it, but your mileage may vary.

Vectorising operators in C#

I spend much of my time programming in R or MATLAB. These languages are typically used for manipulating arrays and matrices, and consequently, they have vectorised operators for addition, equality, etc.
For example, in MATLAB, adding two arrays
[1.2 3.4 5.6] + [9.87 6.54 3.21]
returns an array of the same size
ans =
11.07 9.94 8.81
Switching over to C#, we need a loop, and it feels like a lot of code.
double[] a = { 1.2, 3.4, 5.6 };
double[] b = { 9.87, 6.54, 3.21 };
double[] sum = new double[a.Length];
for (int i = 0; i < a.Length; ++i)
{
sum[i] = a[i] + b[i];
}
How should I implement vectorised operators using C#? These should preferably work for all numeric array types (and bool[]). Working for multidimensional arrays is a bonus.
The first idea I had was to overload the operators for System.Double[], etc. directly. This has a number of problems though. Firstly, it could cause confusion and maintainability issues if built-in classes do not behave as expected. Secondly, I'm not sure if it is even possible to change the behaviour of these built-in classes.
So my next idea was to derive a class from each numerical type and overload the operators there. This creates the hassle of converting from double[] to MyDoubleArray and back, which reduces the benefit of me doing less typing.
Also, I don't really want to have to repeat a load of almost identical functionality for every numeric type. This led to my next idea of a generic operator class. In fact, someone else had also had this idea: there's a generic operator class in Jon Skeet's MiscUtil library.
This gives you a method-like prefix syntax for operations, e.g.
double sum = Operator<double>.Add(3.5, -2.44); // 1.06
The trouble is, since the array types don't support addition, you can't just do something like
double[] sum = Operator<double[]>.Add(a, b); // Throws InvalidOperationException
I've run out of ideas. Can you think of anything that will work?
Create a Vector class (actually I'd make it a struct) and overload the arithmetic operators for that class... This has probably been done already; if you do a Google search, there are numerous hits... Here's one that looks promising: Vector class...
To handle vectors of arbitrary dimension, I'd:
design the internal array, which would persist the individual floats for each of the vector's dimension values, as an array list of arbitrary size,
make the Vector constructor take the dimension as a constructor parameter,
in the arithmetic operator overloads, add a validation that the two vectors being added or subtracted have the same dimension.
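The same design can be sketched compactly. The sketch below is in C++, to match the other sketches on this page, rather than in C#, and the Vec type and its members are hypothetical: an internal array sized by the constructor, plus an operator+ that validates dimensions before adding element by element.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical vector type of arbitrary dimension.
struct Vec
{
    std::vector<double> values;   // one value per dimension

    // The dimension is fixed by the constructor argument.
    explicit Vec(std::size_t dimension) : values(dimension, 0.0) {}

    double& operator[](std::size_t i)
    {
        assert(i < values.size());
        return values[i];
    }
};

// Element-wise addition; rejects operands of different dimension.
inline Vec operator+(const Vec& a, const Vec& b)
{
    if (a.values.size() != b.values.size())
        throw std::invalid_argument("Vec dimensions differ");
    Vec sum(a.values.size());
    for (std::size_t i = 0; i < sum.values.size(); ++i)
        sum.values[i] = a.values[i] + b.values[i];
    return sum;
}
```

A C# version would follow the same shape, with "public static Vector operator +(Vector a, Vector b)" in place of the free function.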
You should probably create a Vector class that internally wraps an array and overloads the arithmetic operators. There's a decent matrix/vector code library here.
But if you really need to operate on naked arrays for some reason, you can use LINQ:
var a1 = new double[] { 0, 1, 2, 3 };
var a2 = new double[] { 10, 20, 30, 40 };
var sum = a1.Zip( a2, (x,y) => Operator<double>.Add( x, y ) ).ToArray();
Take a look at CSML. It's a fairly complete matrix library for c#. I've used it for a few things and it works well.
The XNA Framework has the classes you may be able to use. You can use it in your application like any other part of .NET. Just grab the XNA redistributable and code away.
BTW, you don't need to do anything special (like getting the game studio or joining the creator's club) to use it in your application.
