Vectorising operators in C#


I spend much of my time programming in R or MATLAB. These languages are typically used for manipulating arrays and matrices, and consequently, they have vectorised operators for addition, equality, etc.
For example, in MATLAB, adding two arrays
[1.2 3.4 5.6] + [9.87 6.54 3.21]
returns an array of the same size
ans =
11.07 9.94 8.81
Switching over to C#, we need a loop, and it feels like a lot of code.
double[] a = { 1.2, 3.4, 5.6 };
double[] b = { 9.87, 6.54, 3.21 };
double[] sum = new double[a.Length];
for (int i = 0; i < a.Length; ++i)
{
sum[i] = a[i] + b[i];
}
How should I implement vectorised operators using C#? These should preferably work for all numeric array types (and bool[]). Working for multidimensional arrays is a bonus.
The first idea I had was to overload the operators for System.Double[], etc. directly. This has a number of problems though. Firstly, it could cause confusion and maintainability issues if built-in classes do not behave as expected. Secondly, I'm not sure if it is even possible to change the behaviour of these built-in classes.
So my next idea was to derive a class from each numerical type and overload the operators there. This creates the hassle of converting from double[] to MyDoubleArray and back, which reduces the benefit of me doing less typing.
Also, I don't really want to have to repeat a load of almost identical functionality for every numeric type. This led to my next idea of a generic operator class. In fact, someone else had also had this idea: there's a generic operator class in Jon Skeet's MiscUtil library.
This gives you a method-like prefix syntax for operations, e.g.
double sum = Operator<double>.Add(3.5, -2.44); // 1.06
The trouble is, since the array types don't support addition, you can't just do something like
double[] sum = Operator<double[]>.Add(a, b); // Throws InvalidOperationException
I've run out of ideas. Can you think of anything that will work?

Create a Vector class (actually I'd make it a struct) and overload the arithmetic operators for it. This has probably been done already; a Google search turns up numerous hits. Here's one that looks promising: Vector class...
To handle vectors of arbitrary dimension, I'd:
store the individual values for each of the vector's dimensions in an internal array of arbitrary size,
make the Vector constructor take the dimension as a parameter,
add validation in the arithmetic operator overloads that the two vectors being added or subtracted have the same dimension.
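A minimal sketch of that idea, assuming C# 7.2 or later. The type name Vector and all of its members are hypothetical, and for brevity the constructor takes the values themselves rather than a dimension. The implicit conversion from double[] is one way to cut down the conversion hassle mentioned in the question.

```csharp
using System;

// Hypothetical wrapper type; the name and members are illustrative only.
public readonly struct Vector
{
    private readonly double[] _values;

    public Vector(params double[] values)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));
        // Copy so later changes to the caller's array do not affect this vector.
        _values = (double[])values.Clone();
    }

    // default(Vector) has no backing array, so treat it as dimension 0.
    public int Dimension => _values == null ? 0 : _values.Length;

    public double this[int index] => _values[index];

    public static Vector operator +(Vector left, Vector right)
    {
        // Validate that both operands have the same dimension before adding.
        if (left.Dimension != right.Dimension)
            throw new ArgumentException("Vectors must have the same dimension.");

        var sum = new double[left.Dimension];
        for (int i = 0; i < sum.Length; i++)
            sum[i] = left[i] + right[i];
        return new Vector(sum);
    }

    // Lets a double[] be used wherever a Vector is expected.
    public static implicit operator Vector(double[] values) => new Vector(values);

    // Returns a copy of the underlying values.
    public double[] ToArray() =>
        _values == null ? new double[0] : (double[])_values.Clone();
}
```

With this in place, Vector a = new double[] { 1.2, 3.4, 5.6 }; followed by Vector sum = a + b; reads much like the MATLAB version. The other operators (-, *, ==) would follow the same pattern.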

You should probably create a Vector class that internally wraps an array and overloads the arithmetic operators. There's a decent matrix/vector code library here.
But if you really need to operate on naked arrays for some reason, you can use LINQ:
var a1 = new double[] { 0, 1, 2, 3 };
var a2 = new double[] { 10, 20, 30, 40 };
var sum = a1.Zip( a2, (x,y) => Operator<double>.Add( x, y ) ).ToArray();

Take a look at CSML. It's a fairly complete matrix library for c#. I've used it for a few things and it works well.

The XNA Framework has the classes you may be able to use. You can use it in your application like any other part of .NET. Just grab the XNA redistributable and code away.
BTW, you don't need to do anything special (like getting the game studio or joining the creator's club) to use it in your application.

Related

Array.ConvertAll() and ToArray() method, which is better?

Which of the following methods is better for reading lines from the console and storing them in an int array? Is there any difference between the two?
int[][] s = new int[3][];
for(int i=0; i<s.Length;i++)
{
s[i] = Console.ReadLine().Split(' ').Select(x => int.Parse(x)).ToArray();
}
OR
int[][] s = new int[3][];
for (int i = 0; i < 3; i++) {
s[i] = Array.ConvertAll(Console.ReadLine().Split(' '), sTemp => Convert.ToInt32(sTemp));
}
Thanks in Advance!!!

Firstly, these aren't equivalent: one version uses int.Parse(x), the other Convert.ToInt32(sTemp).
That aside, you have found a perfect example of how to do something more than one way... In programming you will find this a lot.
ConvertAll()
Converts an array of one type to an array of another type.
Select()
Projects each element of a sequence into a new form.
ToArray()
Creates an array from a IEnumerable.
Technically, in combination they produce the same thing, yet go about it in slightly different ways because they belong to slightly different areas of the BCL with slightly different concerns.
Personally, I don't see ConvertAll used all that much these days, as people are very familiar with LINQ and like to chain methods.
As to which is better for performance, we would have to write a lot of tests to figure this out, and it would come down to allocations versus speed per array size per platform. However, I feel the difference would be practically indistinguishable day to day, and my guess is that you would struggle to find a statistically significant performance difference at all.
In short, use what you like.
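To make the comparison concrete, here is a small sketch of both routes applied to one input line. The class and method names are hypothetical; both use int.Parse so that the two are actually equivalent.

```csharp
using System;
using System.Linq;

public static class ParseDemo
{
    // LINQ route: project each token, then materialise the sequence with ToArray().
    public static int[] ParseWithLinq(string line)
    {
        return line.Split(' ').Select(x => int.Parse(x)).ToArray();
    }

    // Array route: ConvertAll maps a string[] straight to an int[] of the same length.
    public static int[] ParseWithConvertAll(string line)
    {
        return Array.ConvertAll(line.Split(' '), x => int.Parse(x));
    }
}
```

For an input such as "1 2 3" both methods return an array holding 1, 2 and 3. Which is faster is something to measure for your own input sizes rather than assume.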

double multidimensional array,in c++, the best way

This thread on SO is about multidimensional arrays in C++.
I have to port some code from C# to C++. I have code like this:
private double[,] B;
...
this.B = new double[states, symbols];
double[][, ,] epsilon = new double[N][, ,];
double[][,] gamma = new double[N][,];
...
s += gamma[i][t, k] = ...
I have thought about using a plain array of arrays of double, but it's quite a pain. Another solution could be a vector of vectors of double, or custom Matrix2D and Matrix3D classes.
What is the best way for each of those cases?
WHAT I LEARNED:
Multidimensional arrays in C++ are a big topic, and the internet is full of resources. They can be handled in various ways, some of them really tricky, some others faster to write.
I think the best way to deal with them is to use a library that takes care of the topic. There are a lot of them: Armadillo (nice MATLAB syntax conversion); Eigen, which I think is one of the better ones: easy to install, easy to use, powerful. Boost::multi_array is another one, and Boost is such a well-known library that it is worth looking at how it handles the topic. As in Konrad Rudolph's answer, the standard library with nested vectors (or this) could be another solution, but after a little searching I think it is the least elegant, even if it is the easiest and fastest to code without external libraries.
Write a custom class. Maybe it is a good exercise. Peter's answer, or this, or this, are good starting points, and this post is also interesting, but especially this great blog post from Martin Moene (one of the best essays on this topic I've read today). I'll also mention this answer for sparse arrays.
Here is a nice tutorial direct from Stroustrup.
Have a nice time with multidimensional arrays :-)

C++ has no direct equivalent of T[,] (although you could of course implement one by encapsulating the following code in a class; this is left as an exercise to the reader).
All C++ supports is nesting arrays/vectors (the equivalent of [][] in C#). So your first code would correspond to
vector<vector<double> > B(states, vector<double>(symbols));
… which initialises a vector of vectors, initialising the outer vector with states copies of an appropriately initialised inner vector.
Of course this can be taken to arbitrary complexity but at this point a few typedefs are in order to make the code more understandable.

#include <vector>

// Stores a states x symbols table in one contiguous vector.
class StateSymbols
{
public:
    StateSymbols(unsigned int states, unsigned int symbols) :
        m_states(states),
        m_stateSymbols(states * symbols)
    {
    }
    double get(unsigned int state, unsigned int symbol) const
    {
        // Column-major layout: all states for symbol 0, then symbol 1, and so on.
        return m_stateSymbols[(m_states * symbol) + state];
    }
private:
    const unsigned int m_states;
    std::vector<double> m_stateSymbols;
};

Check out my answer on:
C++ Multi-dimensional Arrays on the Heap
It defines a basic function Create3D to allocate a 3D (generalizes to other dimensions) array in contiguous memory on the heap in a way that allows A[i][j][k] access operator syntax.

I would say dynamic array:
double* *list;
list = new double*[3]; //dimension1=3=row
for(int i=0;i<3;i++)
list[i] = new double[2]; //dimension2 =2 =col
list[0][0] = 1;
//...
for(int i=0;i<3;i++)
delete [] list[i];
delete [] list;

Best practice with Math.Pow

I'm working on an image processing library which extends OpenCV, HALCON, ... . The library must work with .NET Framework 3.5, and since my experience with .NET is limited I would like to ask some questions regarding performance.
I have encountered a few specific things which I cannot explain to myself properly, and I would like to ask you a) why they happen and b) what the best practice is for dealing with them.
My first question is about Math.Pow. I already found some answers here on StackOverflow which explain it quite well (a), but not what to do about it (b). My benchmark program looks like this:
Stopwatch watch = new Stopwatch(); // from System.Diagnostics
watch.Start();
for (int i = 0; i < 1000000; i++)
{
    double result = Math.Pow(4, 7); // the function call
}
watch.Stop();
The result was not very nice (~300 ms on my computer; I ran the test 10 times and calculated the average value).
My first idea was to check whether this is because it is a static function. So I implemented my own class
class MyMath
{
public static double Pow (double x, double y) //Using some expensive functions to calculate the power
{
return Math.Exp(Math.Log(x) * y);
}
public static double PowLoop (double x, int y) // Using Loop
{
double res = x;
for(int i = 1; i < y; i++)
res *= x;
return res;
}
public static double Pow7 (double x) // Using inline calls
{
return x * x * x * x * x * x * x;
}
}
The third thing I checked was replacing Math.Pow(4,7) directly with 4*4*4*4*4*4*4.
The results are (the average out of 10 test runs)
300 ms Math.Pow(4,7)
356 ms MyMath.Pow(4,7) //gives wrong rounded results
264 ms MyMath.PowLoop(4,7)
92 ms MyMath.Pow7(4)
16 ms 4*4*4*4*4*4*4
My situation now is basically this: don't use Math for Pow. My only problem is... do I really have to implement my own Math class now? It seems somehow inefficient to implement a class of my own just for the power function. (By the way, PowLoop and Pow7 are even faster in the Release build by ~25%, while Math.Pow is not.)
So my final questions are
a) am I wrong if I wouldn't use Math.Pow at all (but for fractions maybe) (which makes me somehow sad).
b) if you have code to optimize, are you really writing all such mathematical operations directly?
c) is there maybe already a faster (open-source^^) library for mathematical operations
d) the source of my question is basically: I have assumed that the .NET Framework itself already provides very optimized code / compile results for such basic operations - be it the Math-Class or handling arrays and I was a little surprised how much benefit I would gain by writing my own code. Are there some other, general "fields" or something else to look out in C# where I cannot trust C# directly.

Three things to bear in mind:
You probably don't need to optimise this bit of code. You've just done a million calls to the function in less than a second. Is this really going to cause big problems in your program?
Math.Pow is probably fairly optimal anyway. At a guess, it will be calling a proper numerics library written in a lower level language, which means you shouldn't expect orders of magnitude increases.
Numerical programming is harder than you think. Even the algorithms that you think you know how to calculate, aren't calculated that way. For example, when you calculate the mean, you shouldn't just add up the numbers and divide by how many numbers you have. (Modern numerics libraries use a two pass routine to correct for floating point errors.)
That said, if you decide that you definitely do need to optimise, then consider using integers rather than floating point values, or outsourcing this to another numerics library.
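As an illustration of the mean example above, here is a sketch of one common two-pass form: compute a first estimate, then correct it with the average of the residuals. The class and method names are hypothetical, and this is not claimed to be what any particular library does internally.

```csharp
using System;

public static class Stats
{
    // Two-pass mean: a plain first estimate, then a correction from the residuals.
    public static double TwoPassMean(double[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        // First pass: the ordinary sum-and-divide estimate.
        double sum = 0.0;
        foreach (double v in values) sum += v;
        double mean = sum / values.Length;

        // Second pass: the residuals are small, so their sum loses less precision.
        double residual = 0.0;
        foreach (double v in values) residual += v - mean;
        return mean + residual / values.Length;
    }
}
```

For well-behaved inputs the correction term is zero or close to it; it only matters when the first pass has accumulated rounding error.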

Firstly, integer operations are much faster than floating point. If you don't need floating point values, don't use the floating point data type. This is generally true for any programming language.
Secondly, as you have stated yourself, Math.Pow can handle reals. It makes use of a much more intricate algorithm than a simple loop, so it is no wonder it is slower than simply looping. If you get rid of the loop and just do n multiplications, you also cut out the overhead of setting up the loop, which makes it faster. But if you don't use a loop, you have to know the value of the exponent beforehand; it can't be supplied at runtime.
I am not really sure why Math.Exp and Math.Log are faster. But if you use Math.Log, you can't find the power of negative values.
Basically, ints are faster and avoiding loops avoids extra overhead, but you trade off some flexibility when you go for those. It is generally a good idea to avoid reals when all you need are integers, but in this case coding up a custom function when one already exists seems a little too much.
The question you have to ask yourself is whether this is worth it. Is Math.Pow actually slowing your program down? And in any case, the Math.Pow already bundled with your language is often the fastest or very close to that. If you really wanted to make an alternate implementation that is really general purpose (i.e. not limited to only integers, positive values, etc.), you will probably end up using the same algorithm used in the default implementation anyway.
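If profiling does show that integer exponents are the hot spot, a middle ground between the fixed Pow7 and the general Math.Pow is exponentiation by squaring, which accepts the exponent at runtime and needs about log2(n) loop iterations instead of n. The sketch below uses hypothetical names and is restricted to non-negative integer exponents; whether it beats Math.Pow on your machine is something to measure, not assume.

```csharp
using System;

public static class IntPower
{
    // Raises x to a non-negative integer power by repeated squaring.
    public static double Pow(double x, int exponent)
    {
        if (exponent < 0)
            throw new ArgumentOutOfRangeException(nameof(exponent), "Exponent must be non-negative.");

        double result = 1.0;
        double factor = x;
        while (exponent > 0)
        {
            if ((exponent & 1) == 1)
                result *= factor;   // the lowest bit of the exponent is set
            factor *= factor;       // square for the next bit
            exponent >>= 1;
        }
        return result;
    }
}
```

IntPower.Pow(4, 7) gives 16384, the same value as 4*4*4*4*4*4*4, and an exponent of 0 gives 1.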

When you are talking about making a million iterations of a line of code then obviously every little detail will make a difference.
Math.Pow() is a function call which will be substantially slower than your manual 4*4...*4 example.
Don't write your own class, as it's doubtful you'll be able to write anything more optimised than the standard Math class.

Converting C# code to F# (if statement)

I'd like to know how to convert this code line by line from C# to F#. I am not looking to use any kind of F#'s idioms or something of the like. I am trying to understand how to map directly C#'s constructs to F#.
Here is the C# code:
//requires l.Count > 0
int GetMinimumValue(List<int> l) {
    int minVal = l[0];
    for (int i = 0; i < l.Count; ++i) {
        if (l[i] > minVal) {
            minVal = l[i];
        }
    }
    return minVal;
}
And here is my F# attempt:
let getMinValue (l : int list) =
    let minVal = l.Head
    for i = 0 to (l.Length-1) do
        if (l.Item(i) > minVal) then
            minVal = col.Item(i)
    minVal
Now, this ain't working. The problem seems to be related with the minVal = col.Item(i) line:
This expression was expected to have type unit but here has type bool
What is the problem, really?

If you want to convert it line by line then try the following
let getMinValue (l:System.Collections.Generic.List<int>) =
    let mutable min = l.Item(0)
    for i = 0 to (l.Count-1) do
        if l.Item(i) < min then min <- l.Item(i)
    min
Now as to why you're getting that particular error. Take a look at the following line
minVal = col.Item(i)
In F# this is not an assignment but a comparison. So this is an expression which produces a bool value but inside the for loop all expressions must be void/unit returning. Hence you receive an error.
Assignment in F# has at least 2 forms that I am aware of.
// Assigning to a mutable value
let mutable v1 = 42
v1 <- 13
// Assigning to a ref cell
let v1 = ref 0
v1 := 42
And of course, you should absolutely read Brian's article on this subject. It's very detailed and goes over many of the finer points on translating between the two languages
http://lorgonblog.spaces.live.com/Blog/cns!701679AD17B6D310!725.entry

There are a few problems with your literal translation. First of all, there's the immediate problem which causes the compiler error: as others have noted, let bindings are immutable by default. However, there's at least one other big problem: System.Collections.Generic.List<T> is very different from F#'s 't list. The BCL type is a mutable list backed by an array, which provides constant time random access to elements; the F# type is an immutable singly linked list, so accessing the nth element takes O(n) time. If you insist on doing expression-by-expression translation, you may find this blog post by Brian valuable.
I'd strongly recommend that you follow others' advice and try to acclimate yourself to thinking in idiomatic F# rather than literally translating C#. Here are some ways to write some related functions in F#:
// Given an F# list, find the minimum element:
let rec getMinList l =
    match l with
    | [] -> failwith "Can't take the minimum of an empty list"
    | [x] -> x
    | x::xs ->
        let minRest = getMinList xs
        min x minRest
Note that this works on lists of any element type (with the caveat that the element type needs to be comparable from F#'s perspective or the application of the function will cause a compile-time error). If you want a version which will work on any type of sequence instead of just on lists, you could base it on the Seq.reduce function, which applies the function supplied as its first argument to each pair of elements in a sequence until a single value remains.
let getMin s = Seq.reduce min s
Or best of all, you can use the built-in Seq.min function, which is equivalent.

Short answer: = is not (mutable) assignment in F#.
Question: Do you really mean col?
Suggestions: Try to write this with NO assignments. There is recursion and built-in functions at your disposal :-)

You should read
What does this C# code look like in F#? (part one: expressions and statements)
I am disappointed that none of the other answers already linked it, because people ask the 'how to convert C# to F#' question a lot, and I have posted this answer link a lot, and by now some of the other answerers should know this :)

This is the most literal translation possible:
let getMinimumValue (l: List<int>) =
    let mutable minVal = l.[0]
    for i=0 to l.Length-1 do
        if l.[i] > minVal then
            minVal <- l.[i]
    minVal

working with incredibly large numbers in .NET

I'm trying to work through the problems on projecteuler.net but I keep running into a couple of problems.
The first is a question of storing large quantities of elements in a List<T>. I keep getting OutOfMemoryExceptions when storing large quantities in the list.
Now I admit I might not be doing these things in the best way, but is there some way of defining how much memory the app can consume?
It usually crashes when I get to about 100,000,000 elements :S
Secondly, some of the questions require the addition of massive numbers. I use ulong data type where I think the number is going to get super big, but I still manage to wrap past the largest supported int and get into negative numbers.
Do you have any tips for working with incredibly large numbers?

Consider System.Numerics.BigInteger.
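A short sketch of what that looks like in practice, assuming .NET 4.0 or later with a reference to System.Numerics. The class and method names are hypothetical; factorial is used only because it outgrows ulong quickly (21! no longer fits).

```csharp
using System;
using System.Numerics;

public static class BigDemo
{
    // Computes n! exactly; BigInteger grows as needed instead of wrapping around.
    public static BigInteger Factorial(int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));

        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
            result *= i;   // int converts implicitly to BigInteger
        return result;
    }
}
```

BigDemo.Factorial(25) returns 15511210043330985984000000. The usual arithmetic operators work on BigInteger, so existing ulong code often needs little more than a change of type.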

You need to use a large number class that uses some basic math principles to split these operations up. This implementation of a C# BigInteger library on CodeProject seems to be the most promising. The article has some good explanations of how operations with massive numbers work, as well.
Also see:
Big integers in C#

As far as Project Euler goes, you might be barking up the wrong tree if you are hitting OutOfMemory exceptions. From their website:
Each problem has been designed according to a "one-minute rule", which means that although it may take several hours to design a successful algorithm with more difficult problems, an efficient implementation will allow a solution to be obtained on a modestly powered computer in less than one minute.

As user Jakers said, if you're using Big Numbers, probably you're doing it wrong.
Of the ProjectEuler problems I've done, none have required big-number math so far.
It's more about finding the proper algorithm to avoid big numbers.
Want hints? Post here, and we might have an interesting Euler-thread started.

I assume this is C#? F# has built in ways of handling both these problems (BigInt type and lazy sequences).
You can use both F# techniques from C#, if you like. The BigInt type is reasonably usable from other languages if you add a reference to the core F# assembly.
Lazy sequences are basically just syntax friendly enumerators. Putting 100,000,000 elements in a list isn't a great plan, so you should rethink your solutions to get around that. If you don't need to keep information around, throw it away! If it's cheaper to recompute it than store it, throw it away!
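In C# the closest equivalent is an iterator method using yield return, which produces values on demand instead of storing them. The sketch below uses hypothetical names and the Fibonacci sequence purely as an example; the point is that the consumer keeps only a running total, never a list.

```csharp
using System.Collections.Generic;

public static class LazySequences
{
    // Yields Fibonacci numbers one at a time; nothing is stored in a list.
    public static IEnumerable<long> Fibonacci()
    {
        long previous = 0, current = 1;
        while (true)
        {
            yield return current;
            // checked: throw on overflow rather than silently wrapping to a negative value.
            long next = checked(previous + current);
            previous = current;
            current = next;
        }
    }

    // Consumes the sequence lazily and stops as soon as the limit is reached.
    public static long SumEvenBelow(long limit)
    {
        long total = 0;
        foreach (long value in Fibonacci())
        {
            if (value >= limit) break;
            if (value % 2 == 0) total += value;
        }
        return total;
    }
}
```

LazySequences.SumEvenBelow(100) adds 2, 8 and 34 and returns 44, using constant memory however large the limit is.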

See the answers in this thread. You probably need to use one of the third-party big integer libraries/classes available or wait for C# 4.0 which will include a native BigInteger datatype.

As far as defining how much memory an app will use, you can check the available memory before performing an operation by using the MemoryFailPoint class.
This lets you check that enough memory is available before starting the operation, so you can fail early instead of part-way through.
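A minimal sketch of how that check might be wrapped. The class and method names are hypothetical; MemoryFailPoint itself lives in System.Runtime and throws InsufficientMemoryException from its constructor when the requested amount does not appear to be available. The size you pass is your own estimate of what the operation needs.

```csharp
using System;
using System.Runtime;

public static class MemoryGate
{
    // Runs work only if roughly sizeInMegabytes of memory appears to be available.
    // Returns false instead of starting an operation that is likely to fail.
    public static bool TryRun(int sizeInMegabytes, Action work)
    {
        try
        {
            // The gate is released when the using block ends.
            using (new MemoryFailPoint(sizeInMegabytes))
            {
                work();
                return true;
            }
        }
        catch (InsufficientMemoryException)
        {
            return false;
        }
    }
}
```

This is a check, not a guarantee: other processes can still consume memory while the work is running.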

string Add(string s1, string s2)
{
bool carry = false;
string result = string.Empty;
if (s1.Length < s2.Length)
s1 = s1.PadLeft(s2.Length, '0');
if(s2.Length < s1.Length)
s2 = s2.PadLeft(s1.Length, '0');
for(int i = s1.Length-1; i >= 0; i--)
{
var augend = Convert.ToInt64(s1.Substring(i,1));
var addend = Convert.ToInt64(s2.Substring(i,1));
var sum = augend + addend;
sum += (carry ? 1 : 0);
carry = false;
if(sum > 9)
{
carry = true;
sum -= 10;
}
result = sum.ToString() + result;
}
if(carry)
{
result = "1" + result;
}
return result;
}

I am not sure if it is a good way of handling it, but I use the following in my project.
I have a "double theRelevantNumber" variable and an "int PowerOfTen" for each item and in my relevant class I have a "int relevantDecimals" variable.
So... when large numbers are encountered they are handled like this:
First they are changed to x,yyy form. So if the number 123456,789 was input and the "PowerOfTen" was 10, it would start like this:
theRelevantNumber = 123456,789
PowerOfTen = 10
The number was then: 123456,789*10^10
It is then changed to:
1,23456789*10^15
It is then rounded by the number of relevant decimals (for example 5) to 1,23456 and then saved along with "PowerOfTen = 15"
When adding or subtracting numbers, any digits outside the relevant decimals are ignored. Meaning if you take:
1*10^15 + 1*10^10 it will change to 1,00001 if "relevantDecimals" is 5 but will not change at all if "relevantDecimals" is 4.
This method lets you deal with numbers up to doubleLimit*10^intLimit without any problem, and at least for OOP it is not that hard to keep track of.

You don't need to use BigInteger. You can do this even with string array of numbers.
class Solution
{
static void Main(String[] args)
{
int n = 5;
string[] unsorted = new string[6] { "3141592653589793238","1", "3", "5737362592653589793238", "3", "5" };
string[] result = SortStrings(n, unsorted);
foreach (string s in result)
Console.WriteLine(s);
Console.ReadLine();
}
static string[] SortStrings(int size, string[] arr)
{
Array.Sort(arr, (left, right) =>
{
if (left.Length != right.Length)
return left.Length - right.Length;
return left.CompareTo(right);
});
return arr;
}
}

If you want to work with incredibly large numbers look here...
MIKI Calculator
I am not a professional programmer; I sometimes write code for myself, so sorry for the unprofessional use of C#, but the program works. I will be grateful for any advice and corrections.
I use this calculator to generate 32-character passwords from numbers that are around 58 digits long.
Since the program adds numbers in the string format, you can perform calculations on numbers with the maximum length of the string variable. The program uses long lists for the calculation, so it is possible to calculate on larger numbers, possibly 18x the maximum capacity of the list.
