I am writing an image comparer for learning purposes.
I have done almost everything already and am now improving it. To check for similarity, I have two jagged multidimensional arrays (byte[][,]); I access each element of both arrays with a triple for loop and store their absolute difference, like this:
for (int dimension = 0; dimension < 8; dimension++)
{
Parallel.For(0, 16, mycolumn =>
{
Parallel.For(0, 16, myrow =>
{
Diffs[dimension][mycolumn, myrow] =
(byte)Math.Abs(Image1Bytes[dimension][mycolumn, myrow]
- Image2Bytes[dimension][mycolumn, myrow]);
});
});
}
Now I would like to check how similar each dimension is to the corresponding dimension in the other collection.
How can I compare the entire 2D arrays inside each jagged array (something like array1[i][,] == array2[j][,])?
I think there are better ways to do these operations, but I have managed to do them pretty quickly.
Here is an older thread on comparing two images that should be simple to adapt to your needs.
Compare Bitmaps
Since Array supports the IStructuralEquatable interface, you can use structural comparison:
using System.Collections;
. . .
var areEqual = StructuralComparisons.StructuralEqualityComparer.Equals(array1[i], array2[j]);
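One caveat worth checking: Array's IStructuralEquatable implementation walks elements with GetValue(int), which only supports one-dimensional arrays, so with the byte[,] slices here it can throw at runtime. A hedged fallback is to flatten each slice with LINQ and compare the sequences (note this treats same-length slices of different shapes as equal when their row-major contents match):
using System.Linq;
. . .
// Flattens both byte[,] slices in row-major order and compares element by element;
// SequenceEqual stops at the first mismatch.
var areEqual = array1[i].Cast<byte>().SequenceEqual(array2[j].Cast<byte>());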
(Edit - preface: I am implementing an enumerable over all subsets of a given size.
To get the next combination, I am using Gosper's hack to quickly compute the 0/1 vector of the lexicographically next combination. Now I need to quickly map that combination vector to the array of elements from my set. Luckily, the elements are the very same as the powers of the individual bits, and I am wondering whether C# has a fast shortcut for that.)
If I take the K-th subset (in lexicographical order) of the numbers 0 to N-1, the bits in the binary representation of K tell me which elements to choose. What is the most elegant way of checking which bits are set and building a subset (array) from them?
Something like:
var BitPowers = new List<int>();
for (int i = 0; i < N; ++i)
{
    if ((K & (1 << i)) != 0)
    {
        BitPowers.Add(i);
    }
}
return BitPowers.ToArray();
would probably suffice, but is it the best way? I guess bit operations are quick, but since the number of possible sets is exponential, I would like to optimize this function as much as I can.
As far as I know, there is no built-in .NET API for this.
The LINQ magic
You can write this as one compact assignment, though it is less optimized for speed:
int value = 0b10110010;
var BitPowers = Convert.ToString(value, 2)
.Reverse()
.Select((bit, index) => new { index, bit })
.Where(v => v.bit == '1').Select(v => v.index);
foreach ( int index in BitPowers )
Console.WriteLine(index);
It converts the integer to its binary string representation, reverses it so the indexes run from the least significant bit upward, pairs each bit with its index, filters the bits that are set, and finally selects the indexes to build the enumerable.
Output
1
4
5
7
Compromise between elegance and speed
You can simplify your loop using a BitArray instance.
Perhaps it is the closest "built-in" approach to what you are asking for:
var BitPowers = new List<int>();
var bits = new BitArray(BitConverter.GetBytes(value));
for ( int index = 0; index < bits.Length; index++ )
    if ( bits[index] )
        BitPowers.Add(index);
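If you are on .NET Core 3.0 or later, a further option (a sketch I have not benchmarked, not a built-in shortcut for exactly this) is to walk only the set bits with System.Numerics.BitOperations, clearing the lowest set bit on each pass instead of testing all N positions:
using System.Numerics;
. . .
var BitPowers = new List<int>();
int k = K; // work on a copy so K is preserved
while (k != 0)
{
    BitPowers.Add(BitOperations.TrailingZeroCount(k)); // index of the lowest set bit
    k &= k - 1; // clear the lowest set bit
}
This runs once per set bit rather than once per candidate position, which helps when K is sparse.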
I've run into an issue with lists: I need to create a 2-dimensional list where I can read data by giving the column and row, so that I can read from my list using my_List[col][row]. Is it possible to make a 2D list that way?
How much performance impact can this have, and is there anything I should be aware of that could affect performance? I might need to read from my 2D list a few hundred times per second.
Is it possible to have a more grid-like 2D list? Say I have data at 3, 4 and 5 but nothing at 0, 1 and 2 (think of it like coordinates): can I read from the list using myList[3][5] and get the data from there, with 0, 1 and 2 holding nothing? Or do I need to loop through and add something like null in the empty spots?
Thanks in advance!
Yes, you can indeed use multidimensional arrays or jagged arrays for storing "2D data".
As for creating a data structure that doesn't use any memory for unused indexes, an option could be a dictionary whose keys are tuples of two numbers, like this (assuming your data are strings):
var items = new Dictionary<(int, int), string>();
items.Add((0,1), "0-1"); //this throws an error if the key already exists
items[(2,3)] = "2-3"; //this silently replaces the value if the key already exists
Console.WriteLine(items.Keys.Contains((0,1))); //true
Console.WriteLine(items.Keys.Contains((0,2))); //false
Console.WriteLine(items[(2,3)]); //"2-3"
Of course you probably want to encapsulate this functionality in its own class (a sketch follows below), but you get the idea.
Note however that this dictionary approach will probably be worse than a plain array in terms of performance, but it's up to you to experiment and collect some metrics.
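For illustration, here is a minimal sketch of such a wrapper class (the SparseGrid name and its API are my own invention, not an existing type):
public class SparseGrid<T>
{
    private readonly Dictionary<(int Col, int Row), T> _cells =
        new Dictionary<(int Col, int Row), T>();

    // Empty cells read back as default(T) instead of throwing.
    public T this[int col, int row]
    {
        get { return _cells.TryGetValue((col, row), out var value) ? value : default(T); }
        set { _cells[(col, row)] = value; }
    }

    public bool HasValue(int col, int row) => _cells.ContainsKey((col, row));
}
// Usage: var grid = new SparseGrid<string>(); grid[3, 5] = "data"; // grid[0, 0] is just null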
You can create 2D arrays like this:
string[,] twoDArray = new string[2,2];
Then you can loop through it like this:
for (int i = 0; i < twoDArray.GetLength(0); i++)
{
    for (int j = 0; j < twoDArray.GetLength(1); j++)
    {
        // work with twoDArray[i, j] here
    }
}
You can also create 2D Lists like this:
List<List<string>> grid = new List<List<string>>();
and iterate through them using an enumerator and, for example, a for loop:
var enu = grid.GetEnumerator();
while (enu.MoveNext())
{
    for (int i = 0; i < enu.Current.Count; i++)
    {
        // read or modify enu.Current[i] here
    }
}
You are basically iterating over all the lists and then through each list for as long as its size allows. Inside the for loop you can alter the encapsulated lists however you like; just be careful when removing elements while indexing forward, since that skips items.
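For the my_List[col][row] access from the question, note that each inner list has to exist before you can index into it; a minimal sketch:
var grid = new List<List<string>>();
grid.Add(new List<string> { "a", "b" }); // column 0 with two rows
string value = grid[0][1]; // "b": outer index = column, inner index = row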
Trying to find a solution to my ranking problem.
Basically I have two multi-dimensional double[,] arrays. Both containing rankings for certain scenarios, so [rank number, scenario number]. More than one scenario can have the same rank.
I want to generate a third multi-dimensional array, taking the intersections of the previous two multi-dimensional arrays to provide a joint ranking.
Does anyone have an idea how I can do this in C#?
Many thanks for any advice or help you can provide!
Edit:
Thank you for all the responses, sorry I should have included an example.
Here it is:
Array One:
[{0,4},{1,0},{1,2},{2,1},{3,5},{4,3}]
Array Two:
[{0,1},{0,4},{1,0},{1,2},{3,5},{4,3}]
Required Result:
[{0,4},{1,0},{1,2},{1,1},{2,5},{3,3}]
Here's some sample code that makes a bunch of assumptions but might be something like what you are looking for. I've added a few comments as well:
static double[,] Intersect(double[,] a1, double[,] a2)
{
// Assumptions:
// a1 and a2 are two-dimensional arrays of the same size
// An element in the array matches if and only if its value is found in the same location in both arrays
// result will contain not-a-number (NaN) for non-matches
double[,] result = new double[a1.GetLength(0), a1.GetLength(1)];
for (int i = 0; i < a1.GetLength(0); i++)
{
for (int j = 0; j < a1.GetLength(1); j++)
{
if (a1[i, j] == a2[i, j])
{
result[i, j] = a1[i, j];
}
else
{
result[i, j] = double.NaN;
}
}
}
return result;
}
For the most part, finding the intersection of multidimensional arrays involves iterating over the elements in each of the dimensions of the arrays. If the indices of the array are not part of the match criteria (i.e. the second assumption in my code is removed), you would have to walk each dimension in each array, which increases the run-time of the algorithm (in this case, from O(n^2) to O(n^4)).
If you care enough about run-time, I believe array matching is one of the typical examples of dynamic programming (DP) optimization, which you can read up on at your leisure.
I'm not sure how you wanted your results. You could probably return a flat collection of results that can be indexed by a pair, which would potentially save a lot of space if the expected result set is typically small. I went with a third fixed-size array because it was the easiest thing to do.
Lastly, I'll mention that I don't see a keen C# way of doing this using IEnumerable, LINQ, or something like that. Someone more C# knowledgeable than I can chime in anytime now....
Given the additional information, I'd argue that you aren't actually working with multidimensional arrays, but instead are working with a collection of pairs. The pair is a pair of doubles. I think the following should work nicely:
public class Pair : IEquatable<Pair>
{
public double Rank;
public double Scenario;
public bool Equals(Pair p)
{
return Rank == p.Rank && Scenario == p.Scenario;
}
public override int GetHashCode()
{
int hashRank= Rank.GetHashCode();
int hashScenario = Scenario.GetHashCode();
return hashRank ^ hashScenario;
}
}
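As a side note: if you can use C# 9 or later, a positional record gives you the same value-based Equals and GetHashCode for free, which is enough for Intersect (a sketch, equivalent in behavior to the class above):
// Value equality and a combined hash code over both fields are generated automatically.
public record Pair(double Rank, double Scenario);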
You can then use the Intersect operator on IEnumerable:
List<Pair> one = new List<Pair>();
List<Pair> two = new List<Pair>();
// ... populate the lists
List<Pair> result = one.Intersect(two).ToList();
Check out the following msdn article on Enumerable.Intersect() for more information:
http://msdn.microsoft.com/en-us/library/bb910215%28v=vs.90%29.aspx
I found that dictionary lookup can be very slow compared to flat array access. Any idea why? I'm using ANTS Profiler for performance testing. Here's a sample function that reproduces the problem:
private static void NodeDisplace()
{
var nodeDisplacement = new Dictionary<double, double[]>();
var times = new List<double>();
for (int i = 0; i < 6000; i++)
{
times.Add(i * 0.02);
}
foreach (var time in times)
{
nodeDisplacement.Add(time, new double[6]);
}
var five = 5;
var six = 6;
int modes = 10;
var arrayList = new double[times.Count*6];
for (int i = 0; i < modes; i++)
{
int k=0;
foreach (var time in times)
{
for (int j = 0; j < 6; j++)
{
var simpelCompute = five * six; // 0.027 sec
nodeDisplacement[time][j] = simpelCompute; //0.403 sec
arrayList[6*k+j] = simpelCompute; //0.0278 sec
}
k++;
}
}
}
Notice the relative magnitude between flat array access and dictionary access? Flat array access is roughly 15 times faster than dictionary access (0.403/0.0278 ≈ 14.5), after taking into account the array index manipulation (6*k+j).
As weird as it sounds, dictionary lookup is taking a major portion of my time, and I have to optimize it.
Yes, I'm not surprised. The point of dictionaries is that they're used to look up arbitrary keys. Consider what has to happen for a single array dereference:
Check bounds
Multiply index by element size
Add index to pointer
Very, very fast. Now for a dictionary lookup (very rough; depends on implementation):
Potentially check key for nullity
Take hash code of key
Find the right slot for that hash code (probably a "mod prime" operation)
Probably dereference an array element to find the information for that slot
Compare hash codes
If the hash codes match, compare for equality (and potentially go on to the next hash code match)
If you've got "keys" which can very easily be used as array indexes instead (e.g. contiguous integers, or something which can easily be mapped to contiguous integers) then that will be very, very fast. That's not the primary use case for hash tables. They're good for situations which can't easily be mapped that way - for example looking up by string, or by arbitrary double value (rather than doubles which are evenly spaced, and can thus be mapped to integers easily).
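To make that concrete with this question's numbers (the times are generated as i * 0.02 in the sample code), a sketch of mapping each key straight to an array index instead of hashing it:
const double step = 0.02;
// time was generated as i * step, so dividing recovers i exactly;
// Math.Round guards against tiny floating-point drift.
int index = (int)Math.Round(time / step);
arrayList[6 * index + j] = simpelCompute; // flat-array write, no hashing involved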
I would say that your title is misleading - it's not that dictionary lookup is slow, it's that when arrays are a more suitable approach, they're ludicrously fast.
In addition to Jon's answer, I would like to add that your inner loop does not do very much; normally you do at least some more work in the inner loop, and then the relative performance loss of the dictionary is somewhat lower.
If you look at the code for Double.GetHashCode() in Reflector, you'll find that it executes 4 lines of code (assuming your double is not 0), and just that is more than the body of your inner loop. Dictionary<TKey, TValue>.Insert() (called by the set indexer) is even more code, almost a screen full.
The advantage of a Dictionary over a flat array is that you don't waste too much memory when your keys are not dense (which they are in your case), and reads and writes are ~O(1) like arrays (but with a higher constant).
As a side note, you can use a multidimensional array instead of the 6*k+j trick.
Declare it this way:
var arrayList = new double[times.Count, 6];
and use it this way:
arrayList[k, j] = simpelCompute;
It won't be faster, but it is easier to read.
I have a two dimensional array that I need to load data into. I know the width of the data (22 values) but I do not know the height (estimated around 4000 records, but variable).
I have it declared as follows:
float[,] _calibrationSet;
....
int calibrationRow = 0;
while (recordsToRead)
{
for (int i = 0; i < SensorCount; i++)
{
_calibrationSet[calibrationRow, i] = calibrationArrayView.ReadFloat();
}
calibrationRow++;
}
This causes a NullReferenceException, so when I try to initialize it like this:
_calibrationSet = new float[,];
I get an "Array creation must have array size or array initializer."
Thank you,
Keith
You can't use an array.
Or rather, you would need to pick a size, and if you ended up needing more then you would have to allocate a new, larger, array, copy the data from the old one into the new one, and continue on as before (until you exceed the size of the new one...)
Generally, you would go with one of the collection classes - ArrayList, List<>, LinkedList<>, etc. - which one depends a lot on what you're looking for; List<> will give you the closest thing to what I described initially, while LinkedList<> will avoid the problem of frequent re-allocations (at the cost of slower access and greater memory usage).
Example:
List<float[]> _calibrationSet = new List<float[]>();
// ...
while (recordsToRead)
{
float[] record = new float[SensorCount];
for (int i = 0; i < SensorCount; i++)
{
record[i] = calibrationArrayView.ReadFloat();
}
_calibrationSet.Add(record);
}
// access later: _calibrationSet[record][sensor]
Oh, and it's worth noting (as Grauenwolf did) that what I'm doing here doesn't give you the same memory structure as a single multi-dimensional array would - under the hood, it's an array of references to other arrays that actually hold the data. This speeds up building the array a good deal by making reallocation cheaper, but can have an impact on access speed (and, of course, memory usage). Whether this is an issue for you depends a lot on what you'll be doing with the data after it's loaded - and whether there are two hundred records or two million records.
You can't create an array in .NET (as opposed to declaring a reference to it, which is what you did in your example) without specifying its dimensions, either explicitly, or implicitly by specifying a set of literal values when you initialize it. (e.g. int[,] array4 = { { 1, 2 }, { 3, 4 }, { 5, 6 }, { 7, 8 } };)
You need to use a variable-size data structure first (a generic list of 22-element 1-d arrays would be the simplest) and then allocate your array and copy your data into it after your read is finished and you know how many rows you need.
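A sketch of that approach, reusing the recordsToRead and calibrationArrayView names from the question (assumed to behave as they do there):
const int SensorCount = 22;
var rows = new List<float[]>();
while (recordsToRead)
{
    var row = new float[SensorCount];
    for (int i = 0; i < SensorCount; i++)
    {
        row[i] = calibrationArrayView.ReadFloat();
    }
    rows.Add(row);
}

// The height is now known, so the rectangular array can be allocated and filled.
var _calibrationSet = new float[rows.Count, SensorCount];
for (int r = 0; r < rows.Count; r++)
{
    for (int i = 0; i < SensorCount; i++)
    {
        _calibrationSet[r, i] = rows[r][i];
    }
}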
I would just use a list, then convert that list into an array.
You will notice here that I used a jagged array (float[][]) instead of a square array (float [,]). Besides being the "standard" way of doing things, it should be much faster. When converting the data from a list to an array you only have to copy [calibrationRow] pointers. Using a square array, you would have to copy [calibrationRow] x [SensorCount] floats.
var tempCalibrationSet = new List<float[]>();
const int SensorCount = 22;
while (recordsToRead())
{
    var row = new float[SensorCount];
    for (int i = 0; i < SensorCount; i++)
    {
        row[i] = calibrationArrayView.ReadFloat();
    }
    tempCalibrationSet.Add(row); // Add to the list; indexing into an empty list would throw
}
float[][] _calibrationSet = tempCalibrationSet.ToArray();
I generally use the nicer collections for this sort of work (List, ArrayList, etc.) and then (if really necessary) convert to T[,] when I'm done.
You would either need to preallocate the array to a maximum size (e.g. float[999, 22]), or use a different data structure.
I guess you could copy/resize on the fly, but I don't think you'd want to.
I think the List sounds reasonable.
You could also use a two-dimensional ArrayList (from System.Collections) - you create an ArrayList, then put other ArrayLists inside it. This will give you the dynamic resizing you need, but at the expense of a bit of overhead.
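A small sketch of that nested-ArrayList idea (note the casts, since ArrayList stores everything as object):
using System.Collections;
. . .
var outer = new ArrayList();
var inner = new ArrayList { 1.0f, 2.0f };
outer.Add(inner);
// Reading back requires casts because ArrayList is untyped.
float value = (float)((ArrayList)outer[0])[1]; // 2.0f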