3D sparse matrix implementation? - c#

I've found a fairly good sparse matrix implementation for C# at http://www.blackbeltcoder.com/Articles/algorithms/creating-a-sparse-matrix-in-net.
But as I work in a 3D coordinate system, I need a sparse matrix implementation that I can use to map that 3D coordinate system.
Details: I'm storing large amounts of primitive shape data, such as cubes, in memory. I have a lot of them (around 30 million), and there are many null (zero) entries among them. Given that each entry costs 1 byte, I'd like to implement a sparse matrix so that I can save memory.
Note: Fast access to matrix cells is a fairly important factor for me, so I would favor speed over memory consumption.

A very simple solution which I just made is this:
public class Sparse3DMatrix<T>
{
Dictionary<Tuple<int,int,int>, T> values = new Dictionary<Tuple<int, int, int>, T>();
public T this[int x, int y, int z]
{
get { return values[new Tuple<int, int, int>(x, y, z)]; }
set { values[new Tuple<int, int, int>(x, y, z)] = value; }
}
public bool ContainsKey(int x, int y, int z)
{
return values.ContainsKey(new Tuple<int, int, int>(x, y, z));
}
}
usage:
var test = new Sparse3DMatrix<float>();
test[1, 1, 1] = 1f;
Console.WriteLine(test[1, 1, 1]);
It could be extended with methods like those in the linked version, and with range checks for the x, y and z values.
I'm sure someone has something to say about its performance. It is a decent implementation unless you really need high performance. It depends on the hash-code implementation of Tuple and on your specific usage. If we assume the hashes are good, lookups take O(1) time on average. If you know you will have a lot of elements, you can use new Dictionary<...>(initial capacity) to avoid unnecessary resizing as items are added.
Unlike his, this has only a single Dictionary holding all the items. His version has dictionaries of dictionaries. The benefit of his is that if you have to scan over an entire row, you can just iterate the second-level dictionary (this will not help you if you want to scan over columns), which is faster than looking up the items individually. But having a single dictionary means smaller memory usage, especially when you have few items per row.

Lasse Espeholt's solution is practical, but it can be improved by removing elements when they are "zeroed" or nulled. If you don't do this, the matrix or array can lose its sparsity. Here is an alternative solution that assumes that an element which has not been inserted has the default value of its type. For example, for double that means 0.0 and for string that means null.
public class Sparse3DArray<T>
{
private Dictionary<Tuple<int, int, int>, T> data = new Dictionary<Tuple<int, int, int>, T>();
public int Nnz { get { return data.Count; } }
public T this[int x, int y, int z]
{
get
{
var key = new Tuple<int, int, int>(x, y, z);
T value;
data.TryGetValue(key, out value);
return value;
}
set
{
var key = new Tuple<int, int, int>(x, y, z);
if (null == value)
data.Remove(key);
else if (value.Equals(default(T)))
data.Remove(key);
else
data[key] = value;
}
}
}
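Both classes above build a Tuple<int, int, int> for every access, and Tuple is a reference type, so each lookup allocates a small object. A minimal sketch of the same idea with a value-tuple key follows. The name SparseGrid3D is hypothetical, and the code assumes C# 7 or later with System.ValueTuple available; it is an illustration, not a measured improvement.

```csharp
using System.Collections.Generic;

// Hypothetical class name; a sketch of the dictionary approach with a struct key.
public class SparseGrid3D<T>
{
    // (int, int, int) is a ValueTuple, a struct, so creating a key does not allocate on the heap.
    private readonly Dictionary<(int X, int Y, int Z), T> _cells;

    // An initial capacity can be passed to avoid repeated resizing while filling the grid.
    public SparseGrid3D(int capacity = 0)
    {
        _cells = new Dictionary<(int X, int Y, int Z), T>(capacity);
    }

    // Number of stored (non-default) cells.
    public int Count => _cells.Count;

    public T this[int x, int y, int z]
    {
        // Cells that were never set read as default(T) instead of throwing.
        get => _cells.TryGetValue((x, y, z), out T value) ? value : default(T);
        set
        {
            // Writing the default value removes the cell, so the grid stays sparse.
            if (EqualityComparer<T>.Default.Equals(value, default(T)))
                _cells.Remove((x, y, z));
            else
                _cells[(x, y, z)] = value;
        }
    }
}
```

With byte cells, writing 0 to a cell removes it again, so Count only reflects occupied cells.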

The fact that you're working in a 3D coordinate system doesn't change whether or not you can use this data structure. A matrix for a 3D space can be contained using a sparse matrix the same as a 2D matrix; it's just the entries that change.
You'd use a sparse matrix for large matrices with lots of zero entries. This is typical in discrete representations of problems in physics that come from finite difference and finite element methods. They have bands of non-zero entries clustered around the diagonal; entries outside the diagonal band are usually zero. A sparse matrix won't store these; decompositions like LU and QR have to be written to know how to deal with the sparsity.
These matrices can describe problems in either 2D or 3D spaces.
I believe you're incorrect if you think you need another data structure.

Why not use a k-d tree or a similar data structure (such as an octree)?
There are great C++ implementations, for instance: FLANN

I would use a Dictionary, but rather than use a Tuple<int, int, int> for the key, you can use a single long as the key and use it to store the coordinates (provided they are shorts). This will reduce your memory footprint and might even improve performance.
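private Dictionary<long, T> data = new Dictionary<long, T>();
private long GetKey(short x, short y, short z)
{
// Give each coordinate its own 16 bits of the key so that distinct (x, y, z) triples never collide.
return ((long)(ushort)x << 32) | ((long)(ushort)y << 16) | (ushort)z;
}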
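For a packed key to be safe, every distinct coordinate triple must map to a distinct long, and it helps if the key can be decoded again. A minimal sketch under the stated assumption that all three coordinates fit in a short: the helper name CoordKey and its Pack and Unpack methods are hypothetical.

```csharp
// Hypothetical helper; packs three 16-bit coordinates into the low 48 bits of a long.
public static class CoordKey
{
    public static long Pack(short x, short y, short z)
    {
        unchecked
        {
            // Reinterpret each short as ushort so negative values do not sign-extend into other fields.
            return ((long)(ushort)x << 32) | ((long)(ushort)y << 16) | (ushort)z;
        }
    }

    public static void Unpack(long key, out short x, out short y, out short z)
    {
        unchecked
        {
            // Truncating to 16 bits recovers each field, including negative coordinates.
            x = (short)(key >> 32);
            y = (short)(key >> 16);
            z = (short)key;
        }
    }
}
```

Because each coordinate occupies its own 16-bit field, two different triples cannot produce the same key, which a multiply-and-add scheme only guarantees when the multiplier exceeds the coordinate range and the arithmetic is done in long.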

Related

how to create multidimensional arrays in C# without knowing the size

I need to understand on a practical level how to create a matrix[][] in C# without knowing the size.
And consequently also how to modify it (delete elements depending on a search key).
I have an example loop. Two random string variables. Then I am no longer able to continue....
private static Random random = new Random();
for (int i=0; i<unKnown; i++){
var firstVar = RandomString(5);
var secondVar = RandomString(20);
//Matrix[][]
}
public static string RandomString(int length){
const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
return new string(Enumerable.Repeat(chars, length)
.Select(s => s[random.Next(s.Length)]).ToArray());
}
Thank you
Arrays are fixed size. They do not adjust their size automatically. E.g. the size is defined when creating the array with
string[] array = new string[10];
If your array is 2 dimensional (10x10) and you delete the value at (1:1) the Array still remains 10x10 but the field at 1:1 is null now.
If you need a solution that adjusts its size you might want to look into Lists.
Otherwise, I advise you to read the documentation.
It really depends on what you want to do.
If you want a 2D array of values you can use multidimensional arrays. This supports arbitrary dimensions, but for more dimensions data sizes tend to go up and other solutions might be preferable:
var matrix = new double[4, 2];
If you want to do math you might want to use a library like Math.Net with specialized matrix types:
var matrix = Matrix<double>.Build.Dense(4, 2);
If you want to do computer graphics you likely want to use a specialized type, like System.Numerics.Matrix4x4:
var matrix = new Matrix4x4();
It is also not particularly difficult to create your own matrix class that wraps a regular array. This has the benefit that interoperability is often easier, since most frameworks and tools accept pointers or 1D arrays, while few can handle a multidimensional array. Indexing can be done like:
public class MyMatrix<T>
{
public int Width { get; }
public int Height { get; }
public T[] Data { get; }
public MyMatrix(int width, int height)
{
Width = width;
Height = height;
Data = new T[width * height];
}
// Row-major layout: element (x, y) lives at index y * Width + x.
public T this[int x, int y]
{
get => Data[y * Width + x];
set => Data[y * Width + x] = value;
}
}
There are also jagged arrays, but there is no guarantee that these will be rectangular, so they are probably not appropriate if you want a "matrix".
In all cases you will need to loop over the matrix and check each element if you want to do any kind of replacement. Some alternatives require separate loops for width/height, while some allow for a single loop.
I'm not sure if you want a matrix or an array.
Matrix would be like
string[,] matrix = new string[10, 10];
and array would be like
string[] array = new string[10];
You can access the array with array[i] and matrix with matrix[i, j]
You could also use
List<List<string>> matrix = new List<List<string>>(); which may be more convenient to work with and can also be accessed with indexers. For example
matrix[i][j] = "bob";
matrix[i].RemoveAt(j);
Given the problem you have submitted maybe just a List<string> would work for you.
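The list-of-lists idea can be sketched for the case in the question: rows are added without knowing the final count, and rows are later deleted by a search key. The class and method names below are hypothetical, and treating the first column as the key is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for a table that grows row by row.
public static class GrowableTable
{
    // Builds one two-column row per input pair; the number of pairs need not be known up front.
    public static List<List<string>> Build(IEnumerable<(string First, string Second)> items)
    {
        var rows = new List<List<string>>();
        foreach (var item in items)
            rows.Add(new List<string> { item.First, item.Second });
        return rows;
    }

    // Removes every row whose first column equals the key and returns how many rows were removed.
    public static int RemoveByKey(List<List<string>> rows, string key)
    {
        return rows.RemoveAll(row => row.Count > 0 && row[0] == key);
    }
}
```

In the loop from the question, each iteration would add one row built from firstVar and secondVar. If every row only ever holds the same two values, a List of a small class or tuple would likely be simpler than a nested list.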

Decreasing all values of a dictionary without loop. Or alternative storage possibility

I'm storing some data in a Math.net vector, as I have to do some calculations with it as a whole. This data comes with a time information when it was collected. So for example:
Initial = 5, Time 2 = 7, Time 3 = 8, Time 4 = 10
So when I store the data in a Vector it looks like this.
stateVectorData = [5,7,8,10]
Now sometimes I need to extract a single entry of the vector. But I don't have the index itself, but a time Information. So what I try is a dictionary with the information of the time and the index of the data in my stateVector.
Dictionary<int, int> stateDictionary = new Dictionary<int, int>(); //Dict(Time, index)
Every time I get new data I add an entry to the dictionary (and of course to the stateVector). So at Time 2 I did:
stateDictionary.Add(2,1);
Now this works as long as I don't change my vector. Unfortunately I have to delete an entry in the vector when it gets too old. Assume time 2 is too old I delete the second entry and have a resulting vector of:
stateVector = [5,8,10]
Now my dictionary has the wrong index values stored.
I can think of two possible solutions how to solve this.
To loop through the dictionary and decrease every value (with key > 2) by 1.
What I think would be more elegant is storing a reference to a vector entry in the dictionary instead of the index.
So something like
Dictionary<int, ref int> stateDictionary =
new Dictionary<int, ref int>(); //Dict(Time, reference to vectorentry)
stateDictionary.Add(2, ref stateVector[1]);
Using something like this, I wouldn't care about deleting some entries in the vector, as I would still have the references to the rest of the vector entries. Now I know it's not possible to store a reference like this in C#.
So my question is, is there any alternative to looping through the whole dictionary? Or is there another solution without a dictionary I don't see at the moment?
Edit to answer juharr:
Time information doesn't always increase by one. Depends on some parallel running process and how long it takes. Probably increasing between 1 to 3. But also could be more.
There are some values in the vector which never get deleted. I tried to show this with the initial value of 5 which stays in the vector.
Edit 2:
The vector stores at least 5000 to 6000 elements. The maximum is not defined at the moment, as it is restricted by the number of elements I can handle in real time; in my case I have about 0.01s to do my further calculations. This is why I am looking for an efficient way, so I can increase the number of elements in the vector (or increase the maximum "age" of my vector entries).
I need the whole vector for calculations about three times as often as I add a value.
Deleting an entry is the least frequent operation, and finding a single value by its time key is the most frequent one, maybe 30 to 100 times a second.
I know this all sounds very undefined, but the frequency of the finding and deleting parts depends on another process, which can vary a lot.
I hope you can help me anyway. Thanks so far.
Edit 3:
@Robinson
The exact number of times I need the whole vector also depends on the parallel process. Minimum would be two times every iteration (so twice in 0.01s), maximum at least 4 to 6 times every iteration.
Again, the size of the vector is what I want to maximize. So assumed to be very big.
Edit Solution:
First thanks to all, who helped me.
After experimenting a bit, I'm using the following construction.
I'm using a List, where I save the indexes in my state vector.
Additionally I use a Dictionary to assign my Time-key to the List Entry.
So when I delete something in the state vector, I loop only over the List, which seems to be much faster than looping the dictionary.
So it is:
stateVectorData = [5,7,8,10]
IndexList = [1,2,3];
stateDictionary = { Time 2, indexInList = 0; Time 3, indexInList = 1; Time 4, indexInList = 2 }
TimeKey->stateDictionary->indexInList -> IndexList -> indexInStateVector -> data
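The chain above can be sketched as follows. This is an assumption-laden illustration, not the asker's actual code: a List<double> stands in for the Math.NET vector, the class name TimeIndexedState is hypothetical, and removed slots are marked with -1 rather than deleted so that the slot numbers stored in the dictionary stay valid.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of the time -> slot -> vector index chain.
public class TimeIndexedState
{
    private readonly List<double> _state = new List<double>();   // stands in for the state vector
    private readonly List<int> _indexList = new List<int>();     // slot -> index in _state, -1 once removed
    private readonly Dictionary<int, int> _slotByTime = new Dictionary<int, int>(); // time -> slot

    // Adds a value that has no time key and is never removed (like the initial value).
    public void AddUntimed(double value)
    {
        _state.Add(value);
    }

    public void Add(int time, double value)
    {
        _state.Add(value);
        _indexList.Add(_state.Count - 1);
        _slotByTime.Add(time, _indexList.Count - 1);
    }

    public bool TryGet(int time, out double value)
    {
        value = 0.0;
        if (!_slotByTime.TryGetValue(time, out int slot)) return false;
        value = _state[_indexList[slot]];
        return true;
    }

    public bool Remove(int time)
    {
        if (!_slotByTime.TryGetValue(time, out int slot)) return false;
        int removedIndex = _indexList[slot];
        _state.RemoveAt(removedIndex);
        _indexList[slot] = -1;          // keep the slot so later slot numbers do not shift
        _slotByTime.Remove(time);
        // Only the flat list is scanned; the dictionary itself is not touched again.
        for (int i = 0; i < _indexList.Count; i++)
            if (_indexList[i] > removedIndex) _indexList[i]--;
        return true;
    }
}
```

The tombstoned slots are never reclaimed in this sketch, so a long-running process would need to compact the list occasionally.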
You can try this:
public class Vector
{
private List<int> _timeElements = new List<int>();
public Vector(int[] times)
{
Add(times);
}
public void Add(int time)
{
_timeElements.Add(time);
}
public void Add(int[] times)
{
_timeElements.AddRange(times);
}
public void Remove(int time)
{
_timeElements.Remove(time);
if (OnRemove != null)
OnRemove(this, time);
}
public List<int> Elements { get { return _timeElements; } }
public event Action<Vector, int> OnRemove;
}
public class Vectors
{
private Dictionary<int, List<Vector>> _timeIndex;
public Vectors(int maxTimeSize)
{
_timeIndex = new Dictionary<int, List<Vector>>(maxTimeSize);
for (var i = 0; i < maxTimeSize; i++)
_timeIndex.Add(i, new List<Vector>());
List = new List<Vector>();
}
public List<Vector> FindVectorsByTime(int time)
{
return _timeIndex[time];
}
public List<Vector> List { get; private set; }
public void Add(Vector vector)
{
List.Add(vector);
vector.Elements.ForEach(element => _timeIndex[element].Add(vector));
vector.OnRemove += OnRemove;
}
private void OnRemove(Vector vector, int time)
{
_timeIndex[time].Remove(vector);
}
}
To use:
var vectors = new Vectors(maxTimeSize: 6000);
var vector1 = new Vector(new[] { 5, 30, 8, 20 });
var vector2 = new Vector(new[] { 25, 5, 23, 11 });
vectors.Add(vector1);
vectors.Add(vector2);
var findsTwo = vectors.FindVectorsByTime(time: 5);
vector1.Remove(time: 5);
var findsOne = vectors.FindVectorsByTime(time: 5);
The same can be done for adding times, also the code is just for illustration purposes.

List or Dictionary or something else?

I want to save some coordinates in a dictionary, BUT the same xPos has to appear two or more times in the dictionary.
The problem is that the following exception appears:
ArgumentException: An element with the same key already exists in the dictionary.
How can I solve the problem ?
I already thought that I could use a List or an Array, but I want a Key and a Value.
After I have saved the coordinates in a Dict (or something else), I want to check whether a new coordinate is within a certain distance of the existing ones.
The xPos is always the same:
There is a "chart" where I place some blocks in a row with different yPos.
1. Block: xPos = 0, yPos = random
2. Block: xPos = 1, yPos = random
...
n. Block: xPos = 80, yPos = random
n+1. Block: xPos = 0, yPos = 20 + random
I have three iterations, for each 80 Blocks are placed.
SORRY for my bad english :|
I hope you could understand.
Or you can use a List of Tuple to store a list of int-int pairs without creating a new class and without worrying about duplicate values:
.....
List<Tuple<int, int>> blocks = new List<Tuple<int, int>>();
blocks.Add(Tuple.Create(0, random));
blocks.Add(Tuple.Create(1, random));
.....
You can create a class (or a struct) to keep and use your coordinates, instead of a dictionary.
The class can have Key and Value properties and also additional fields if needed.
You should save the values in something like a List<Position>, where Position contains:
public int X { get; set; }
public int Y { get; set; }
Or you can use another class/struct from C# or from a third-party library (2D vector, point, etc.), depending on where you want to use it. :)
Since you're storing points on a 2D surface and want to do nearest-point detection, I'd recommend using a space-partitioning tree, such as a quadtree. Here's a link to a quadtree implementation in C#.
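For the roughly 240 blocks described in the question, a plain list scan is probably enough before reaching for a tree. A minimal sketch of the distance check, assuming coordinates are stored as value tuples in a list; the names BlockPlacement and IsFarEnough are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical helper: a linear scan over already placed blocks.
public static class BlockPlacement
{
    // Returns true when the candidate is at least minDistance away from every placed block.
    public static bool IsFarEnough(
        IReadOnlyList<(int X, int Y)> placed, (int X, int Y) candidate, double minDistance)
    {
        double minSquared = minDistance * minDistance;
        foreach (var p in placed)
        {
            // Compare squared distances to avoid a square root per block.
            long dx = (long)p.X - candidate.X;
            long dy = (long)p.Y - candidate.Y;
            if (dx * dx + dy * dy < minSquared)
                return false;
        }
        return true;
    }
}
```

Since a list allows the same xPos any number of times, the duplicate-key exception from the dictionary does not arise here.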

function interpolation c#

I know there have been several questions asked relating to interpolation, but I have not found any answers that would be helpful enough for me so I have the following question.
I have two arrays of points. One tracks time (x axis) and one tracks expenses (y axis) and
I want to get something like this:
InterpolatingPolynomial[{{0, 10}, {1, 122}, {2, 3.65}, {3, 56.3}, {4, 12.4}, {5, 0}}, x]
(In Mathematica that returns a constructed polynomial that fits the points.) Is it possible to return a Func<double,double> constructed from two double arrays in C#?
Thanks in advance.
This paper describes exactly what you want. The Vandermonde determinant method is quite simple to implement, as it only requires computing the determinant of a matrix in order to obtain the coefficients of the interpolating polynomial.
I'd suggest building a class with an appropriate interface though, as building Funcs on the fly is pretty complicated (see here for an example). You could do something like:
public class CosineInterpolation {
public CosineInterpolation(double[] x, double[] y) { ... }
public double Interpolate(double x) { ... }
}
I think I found the solution myself after a long day of search. I interpolate a function using Lagrange Interpolation.
A Func<double,double> can then easily be constructed with a lambda expression.
e.g.
public Func<double,double> GetFunction()
{
LagrangeInterpolation lagInter = new LagrangeInterpolation(xVals, yVals);
return val => lagInter.GetValue(val);
}
This returns the Func<double,double> object. (I know that creating a new object each time is not a good solution but this is just for demonstrational purposes)
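The LagrangeInterpolation class itself is not shown in the answer. A minimal sketch of how a Func<double,double> could be built directly from the two arrays with the Lagrange form is given below; the names Interpolation and Lagrange are hypothetical, and the x values are assumed to be distinct.

```csharp
using System;

// Hypothetical helper; evaluates the Lagrange form of the interpolating polynomial.
public static class Interpolation
{
    public static Func<double, double> Lagrange(double[] xs, double[] ys)
    {
        if (xs.Length != ys.Length)
            throw new ArgumentException("xs and ys must have the same length.");

        // Copy the inputs so later changes to the caller's arrays do not affect the function.
        var x = (double[])xs.Clone();
        var y = (double[])ys.Clone();

        return t =>
        {
            double sum = 0.0;
            for (int i = 0; i < x.Length; i++)
            {
                // Basis polynomial L_i(t): equals 1 at x[i] and 0 at every other node.
                double term = y[i];
                for (int j = 0; j < x.Length; j++)
                    if (j != i)
                        term *= (t - x[j]) / (x[i] - x[j]);
                sum += term;
            }
            return sum;
        };
    }
}
```

Each evaluation costs O(n²) operations for n points, and high-degree polynomial interpolation can oscillate strongly between the nodes, so this suits small data sets like the six points in the question.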

make limitation for random class in c#

I want to restrict the Random class in C# so that it generates random values from two ranges without repeating any of them. How can I do that?
example :
Xpoints[i] = random.Next(0, 25);
Ypoints[i] = random.Next(0, 12);
where 25 and 12 are the image dimensions. I need every pixel in this image, but in random order. Any suggestions? If I use this code, some pixels are never produced and some pixels are repeated.
Update: Simplified by not requiring any specific hashing [1]
Update: Generalized into a generic SimpleShuffle extension method
public static IEnumerable<T> SimpleShuffle<T>(this IEnumerable<T> sequence)
{
var rand = new Random();
return sequence.Select(i => new {i, k=rand.Next()})
.OrderBy(p => p.k)
.Select(p => p.i);
}
In addition to downvoting (shouting? sorry :)) Anx's answer, I thought it'd be nicer to also show what my code would look like:
using System;
using System.Linq;
using System.Collections.Generic;
namespace NS
{
static class Program
{
public static IEnumerable<T> SimpleShuffle<T>(this IEnumerable<T> sequence)
{
var rand = new Random();
return sequence.Select(i => new {i, k=rand.Next()}).OrderBy(p => p.k).Select(p => p.i);
}
public static void Main(string[] args)
{
var pts = from x in Enumerable.Range(0, 25) // the second argument is a count: x runs over 0..24
from y in Enumerable.Range(0, 12) // y runs over 0..11
select new { x, y };
foreach (var pt in pts.SimpleShuffle())
Console.WriteLine("{0},{1}", pt.x, pt.y);
}
}
}
I totally fixed my earlier problem of how to generate a good hash by realizing that we don't need a hash unless:
a. the source contains (logical) duplicates
b. and we need those to have equivalent sort order
c. and we want to have the same 'random' sort order (deterministic hashing) each time round
a. and b. are false in this case and c. was even going to be a problem (depending on what the OP was requiring). So now, without any strings attached, no more worries about performance (even the irrational worries),
Good luck!
[1] Incidentally this makes the whole thing more flexible, because I no longer require the coords to be expressed as a byte[]; you can now shuffle any structure you want.
Have a look at the Fisher-Yates Algorithm:
http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
It's easy to implement, and works really well.
It shuffles an array of values; you can then read them sequentially, which ensures there are no repeats.
You might want to use a shuffle algorithm on a list of the indexes (e.g. 25 elements with the values 0..24 for the X axis) instead of random.
By design random doesn't guarantee that no value is repeated; repetitions are very likely.
See also: Optimal LINQ query to get a random sub collection - Shuffle (look for the Fisher-Yates-Durstenfeld solution)
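A minimal sketch of the Fisher-Yates (Durstenfeld) shuffle applied to the pixel problem from the question: every coordinate is generated once and the list is then shuffled in place. The names PixelOrder and Shuffled are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; returns every pixel of a width x height image exactly once, in random order.
public static class PixelOrder
{
    public static List<(int X, int Y)> Shuffled(int width, int height, Random random)
    {
        var pixels = new List<(int X, int Y)>(width * height);
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                pixels.Add((x, y));

        // Fisher-Yates: walk from the end and swap each element with a random one at or before it.
        for (int i = pixels.Count - 1; i > 0; i--)
        {
            int j = random.Next(i + 1); // 0 <= j <= i
            var tmp = pixels[i];
            pixels[i] = pixels[j];
            pixels[j] = tmp;
        }
        return pixels;
    }
}
```

For the 25 by 12 image this yields 300 distinct coordinates in O(n) time, with no sorting step and no retry loop.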
I also believe that Random should not be predictable, and that we shouldn't even expect its values not to repeat.
But sometimes it can be necessary to get non-repeating random ints. For that we need to maintain state, i.e. remember which values a particular instance of the Random class has already returned.
Here is a small quick-and-dirty implementation of an algorithm I thought of just now; I am not sure whether it is the same as the Fisher-Yates solution. I wrote this class so that you can use it instead of the System.Random class.
It may help with your requirement; use the NonRepeatingRandom class below as you need...
class NonRepeatingRandom : Random
{
private HashSet<int> _usedValues = new HashSet<int>();
public NonRepeatingRandom()
{
}
public NonRepeatingRandom(int seed):base(seed)
{
}
public override int Next(int minValue, int maxValue)
{
int rndVal = base.Next(minValue, maxValue);
if (_usedValues.Contains(rndVal))
{
int oldRndVal = rndVal;
do
{
rndVal++;
} while (rndVal < maxValue && _usedValues.Contains(rndVal)); // maxValue is exclusive, as in Random.Next
if (rndVal >= maxValue)
{
rndVal = oldRndVal;
do
{
rndVal--;
} while (_usedValues.Contains(rndVal) && rndVal >= minValue);
if (rndVal == minValue - 1)
{
throw new ApplicationException("Cannot get non repeating random for provided range.");
}
}
}
_usedValues.Add(rndVal);
return rndVal;
}
}
Please note that only the Next(int, int) overload is overridden; if you want, you can override the other methods of the "Random" class too.
Ps. Just before clicking "Post Your Answer" I saw sehe's answer, I liked his overall idea, but to hash 2 bytes, creating a 16 byte hash? or am I missing something? In my code I am using HashSet which uses int's implementation of GetHashCode method, which is nothing but that value of int itself so no overhead of hashing. But I could be missing some point as it is 3:59 AM here in India :)
Hope it helps salamonti...
The whole point of random numbers is that you do get repeats.
However, if you want to make sure you don't then remove the last chosen value from your array before picking the next number. So if you have a list of numbers:
index = random.Next(0, originalList.Count);
randomisedList.Add(originalList[index]);
originalList.RemoveAt(index);
The list will be randomised yet contain no repeats.
Instead of creating image through two one-dimensional arrays you should create an image through one two-dimensional matrix. Each time you get new random coordinate check if that pixel is already set. If it is then repeat the procedure for that pixel.
