From MSDN:
Combining collection types (having collections of collections) is
allowed. Jagged arrays are treated as collections of collections.
Multidimensional arrays are not supported.
So, if you can't normally serialize a multidimensional array, how does one get around this efficiently? My thought is to have a property that flattens the array and serialize that collection and unflatten it during deserialization, but I'm not sure if that's efficient?
Anyone find a solution this before?
I should note that the reason I think flattening might work is because my dimensions are a fixed value (they are hard-coded).
Yes, you could flatten it or Jagged-ize it, whatever is more convenient.
Any time you rip through a 2-D array you're expending O(n) effort (or one "processing" unit per item in the input). It's not much trouble to convert between 2-D and 1-D and back as you said. Unless you're dealing in really high volumes (array size of call frequency of the web service), or on a highly constrained system (.Net Compact or .Net Micro Frameworks), I doubt this would really be a big issue. Things like sorting are what get expensive.
string[,] input = new string[5, 3];
string[] output = new string[15];
for (int i = 0; i < input.GetUpperBound(0); i++)
{
for (int j = 0; j < input.GetUpperBound(1); j++)
{
output[j * input.GetUpperBound(j) + i] = input[i, j];
}
}
Maybe with an extension method:
public static string[][] Jaggedize(this string[,] input) {
string[][] output = new string[input.GetLength(0)][];
for (int i = 0; i < input.GetLength(0); i++) {
output[i] = new string[input.GetLength(1)];
for (int j = 0; j < input.GetLength(1); j++) {
output[i][j] = input[i, j];
}
}
return output;
}
Related
I have this Python code that does it (I need a C#):
def transpose(self):
return [[i[j] for i in grid_map] for j in range(min([len(i) for i in grid_map]))]
The dimensions can be any, including non-square, but x and y are bound by ulong's maximum value.
Lastly, I need to use ArrayList<ArrayList<dynamic, dynamic>> because I'll have different datatypes stored there.
Note that the ArrayList class is a legacy collection type that has been replaced by the more modern and efficient List class in the .NET Framework. It is generally recommended to use the List class instead of ArrayList, unless you have a specific reason to use the older collection type.
Part of official Microsoft Doc to Array Lists:
We don't recommend that you use the ArrayList class for new
development. Instead, we recommend that you use the generic List
class. The ArrayList class is designed to hold heterogeneous
collections of objects. However, it does not always offer the best
performance. Instead, we recommend the following: For a heterogeneous
collection of objects, use the List (in C#) or List(Of Object)
(in Visual Basic) type. For a homogeneous collection of objects, use
the List class. See Performance Considerations in the List
reference topic for a discussion of the relative performance of these
classes. See Non-generic collections shouldn't be used on GitHub for
general information on the use of generic instead of non-generic
collection types.
Link to Doc
1.You can use LINQ:
// Transpose the list
dynamic[][] transposedList = list
.OfType<ArrayList>()
.Select(row => row.OfType<dynamic>().ToArray())
.ToArray();
2.Or two nested for - Loops:
// Transpose the list
int rows = list.Count;
int cols = 0;
// Find the maximum number of columns
for (int i = 0; i < rows; i++)
{
cols = Math.Max(cols, list[i].Count);
}
// Create a new 2D list with the same number of rows as the original
// list's number of columns, and the same number of columns
// as the original list's number of rows
List<List<dynamic>> transposedList = new List<List<dynamic>>();
for (int i = 0; i < cols; i++)
{
transposedList.Add(new List<dynamic>());
for (int j = 0; j < rows; j++)
{
transposedList[i].Add(null);
}
}
// Copy the elements from the original list to the transposed list
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < list[i].Count; j++)
{
transposedList[j][i] = list[i][j];
}
}
is there any better (less complex) way to paste the smaller array[,] to the bigger array[,] than looping through them? My code:
private void PasteRoomIntoTheMap(GameObject[,] room, (int, int) positionOnTheMap)
{
(int, int) roomSize = (room.GetLength(0), room.GetLength(1));
int roomsXAxesDimention = 0;
int roomsYAxesDimention = 0;
for (int i = positionOnTheMap.Item1; i <= positionOnTheMap.Item1 + roomSize.Item1; i++)
{
for (int j = positionOnTheMap.Item2; j <= positionOnTheMap.Item2 + roomSize.Item2; j++)
{
Map[i, j] = room[roomsXAxesDimention, roomsYAxesDimention];
roomsYAxesDimention++;
}
roomsXAxesDimention++;
}
}
(I didn't run it yet. There might be some errors but I hope that you will understand this method)
The code that I would love to have:
Map [5...15, 2...5] = room[0...room.GetLength(0), 0...room.GetLength(1)]
Short answer: No
Longer answer:
Multidimensional arrays are really a wrapper around a 1D array, with some special code to handle indices etc. It does however lack some useful features.
You could perhaps improve your example code a bit by copying entire columns or rows using Array.Copy or blockCopy instead of processing item by item. This might help performance if you have very many items that need copying, but is unlikely to make any difference if the sizes are small.
You could probably provide the syntax you desire by creating something like a ArraySlice2D class that reference the original 2D array as well as the region you want to copy. But you probably need to either create your own wrapper to provide index operators, or create extension methods to do it. Perhaps something like
public class My2DArray<T>{
private T[,] array;
public ArraySlice2D<T> this[Range x, Range y]{
get => new ArraySlice2D<T>(array, x, y);
set {
// Copy values from value.RangeX/Y to x/y ranges
}
}
You might also consider writing your own 2D array class that wraps a 1D array. I find that this often makes interoperability easier since most systems can handle 1D arrays, but few handle multidimensional arrays without conversion.
I am not aware of any shortcut for that and there might be to many slightly distinct usecase so that the most apropriate solution would be to implement your own functionality.
It is advisable however to add a size check to your method so you dont end up indexing nonexisting areas of the big array:
private bool ReplaceWithSubArray(int[,] array, int[,] subarray, (int x,int y) indices)
{
if (array.GetLength(0) < subarray.GetLength(0) + indices.x ||
array.GetLength(1) < subarray.GetLength(1) + indices.y)
{
// 'array' to small
return false;
}
for (int x = 0; x <= subarray.GetLength(0); x++)
{
for (int y = 0; y <= subarray.GetLength(1); y++)
{
array[x + indices.x, y + indices.y] = subarray[x, y];
}
}
return true;
}
I am trying to performance tune a routine that needs to sort 8 large arrays "in tandem", where one of the arrays is the array to sort by.
I've already taken care of sorting the first array using a method of my choosing (I'm using TimSort)
I've already taken care of making sure my array of sorted objects have a property denoting their original index. (e.g. sortedArray[0].OriginalIndex would return 2983 if previously unsortedArry[2983] turned out to be the first item)
This means if I were to loop over my now sorted array of objects, I think I can just get all other arrays sorted in the same order in the following naïve way:
private List<object[]> SortInTandem(IndexedObj[] sortedArray, List<object[]> arraysToSort)
for(int i = 0; i < sortedArray.length; i++) {
int originalIndex = sortedArray[i].OriginalIndex;
// Swap the corresponding index from all other arrays to their new position
foreach(object[] array in arraysToSort) {
object temp = array[i];
array[i] = array[originalIndex];
array[originalIndex] = temp;
}
}
return arraysToSort; // Returning original arrays sorted in-place
}
I believe the above algorithm to have the desired result, but it feels less efficient than it could be. (3 times as many assignments as needed?)
I also considered the following approach which minimizes assignments, but requires allocating new arrays to store sorted items, and garbage collecting the old arrays (unless I come up with a way to recycle the allocations between calls):
private List<object[]> SortInTandem(IndexedObj[] sortedArray, List<object[]> arraysToSort) =>
arraysToSort.Select(array =>
{
object[] tandemArray = new object[array.length];
for(int i = 0; i < sortedArray.length; i++)
tandemArray[i] = array[sortedArray[i].OriginalIndex];
}); // Returning newly-allocated arrays
This sort of thing is done continuously in a performance-critical area of code, so I'm looking for thoughts on how I might get the best of both worlds.
Thinking more about the second solution above (allocating new arrays) - it occurred to me that the list of arrays passed in can also be "repurposed" once their sorted variant has been produced, so I actually only need to allocate one new array and then I can reuse the ones passed in to prepare additional results:
// Note the allocated arraysToSort passed in will be repurposed to produced a new set of sorted
// arrays, so the caller must be sure to discard their references and only use what is returned.
private List<object[]> SortInTandem(IndexedObj[] sortedArray, List<object[]> arraysToSort)
{
List<object[]> sortedArrays = new List<object[]>(arraysToSort.Count);
object[] tandemArray = new object[array.length];
for(int i = 0; i < arraysToSort.Count; i++)
{
for(int j = 0; j < sortedArray.length; j++)
tandemArray[j] = array[sortedArray[j].OriginalIndex];
sortedArrays.Add(tandemArray);
tandemArray = arraysToSort[i];
}
return sortedArrays; // Returning one newly-allocated + all but one original arrays repurposed
}
I had a look and couldn't see anything quite answering my question.
I'm not exactly the best at creating accurate 'real life' tests, so i'm not sure if that's the problem here. Basically I want to create a few simple neural networks to create something to the effect of Gridworld. Performance of these neural networks will be critical and i dont want the hidden layer to be a bottleneck as much as possible.
I would rather use more memory and be faster, so I opted to use arrays instead of lists (due to lists having an extra bounds check over arrays). The arrays aren't always full, but because the if statement (check if the element is null) is the same until the end, it can be predicted and there is no performance drop from that at all.
My question comes from how I store the data for the network to process. I figured due to 2D arrays storing all the data together it would be better cache wise and would run faster. But from my mock up test that an array of arrays performs much better in this scenario.
Some Code:
private void RunArrayOfArrayTest(float[][] testArray, Data[] data)
{
for (int i = 0; i < testArray.Length; i++) {
for (int j = 0; j < testArray[i].Length; j++) {
var inputTotal = data[i].bias;
for (int k = 0; k < data[i].weights.Length; k++) {
inputTotal += testArray[i][k];
}
}
}
}
private void Run2DArrayTest(float[,] testArray, Data[] data, int maxI, int maxJ)
{
for (int i = 0; i < maxI; i++) {
for (int j = 0; j < maxJ; j++) {
var inputTotal = data[i].bias;
for (int k = 0; k < maxJ; k++) {
inputTotal += testArray[i, k];
}
}
}
}
These are the two functions that are timed. Each 'creature' has its own network (The first for loop), each network has hidden nodes (The second for loop) and i need to find the sum of the weights for each input (The third loop). In my test i stripped it so that it's not really what i am doing in my actual code, but the same amount of loops happen (The data variable would have it's own 2D array, but i didn't want to possibly skew the results). From this i was trying to get a feel for which one is faster, and to my surprise the array of arrays was.
Code to start the tests:
// Array of Array test
Stopwatch timer = Stopwatch.StartNew();
RunArrayOfArrayTest(arrayOfArrays, dataArrays);
timer.Stop();
Console.WriteLine("Array of Arrays finished in: " + timer.ElapsedTicks);
// 2D Array test
timer = Stopwatch.StartNew();
Run2DArrayTest(array2D, dataArrays, NumberOfNetworks, NumberOfInputNeurons);
timer.Stop();
Console.WriteLine("2D Array finished in: " + timer.ElapsedTicks);
Just wanted to show how i was testing it. The results from this in release mode give me values like:
Array of Arrays finished in: 8972
2D Array finished in: 16376
Can someone explain to me what i'm doing wrong? Why is an array of arrays faster in this situation by so much? Isn't a 2D array all stored together, meaning it would be more cache friendly?
Note i really do need this to be fast as it needs to sum up hundreds of thousands - millions of numbers per frame, and like i said i don't want this is be a problem. I know this can be multi threaded in the future quite easily because each network is completely separate and even each node is completely separate.
Last question i suppose, would something like this be possible to run on the GPU instead? I figure a GPU would not struggle to have much larger amounts of networks with much larger numbers of input/hidden neurons.
In the CLR, there are two different types of array:
Vectors, which are zero-based, single-dimensional arrays
Arrays, which can have non-zero bases and multiple dimensions
Your "array of arrays" is a "vector of vectors" in CLR terms.
Vectors are significantly faster than arrays, basically. It's possible that arrays could be optimized further in later CLR versions, but I doubt that there'll get the same amount of love as vectors, as they're so relatively rarely used. There's not a lot you can do to make CLR arrays faster. As you say, they'll be more cache friendly, but they have this CLR penalty.
You can improve your array-of-arrays code already, however, by only performing the first indexing operation once per row:
private void RunArrayOfArrayTest(float[][] testArray, Data[] data)
{
for (int i = 0; i < testArray.Length; i++) {
// These don't change in the loop below, so extract them
var row = testArray[i];
var inputTotal = data[i].bias;
var weightLength = data[i].weights.Length;
for (int j = 0; j < row.Length; j++) {
for (int k = 0; k < weightLength; k++) {
inputTotal += row[k];
}
}
}
}
If you want to get the cache friendliness and still use a vector, you could have a single float[] and perform the indexing yourself... but I'd probably start off with the array-of-arrays approach.
For example: an array of varying-length arrays of integers.
In C++, we are used to doing things like:
int * * TwoDimAry = new int * [n] ;
for ( int i ( 0 ) ; i < n ; i ++ )
{
TwoDimAry[i] = new int [i + n] ;
}
In this case, if n == 3 then the result would be an array of three pointers to arrays of integers, and would appear like this:
http://img263.imageshack.us/img263/4149/multidimarray.png
Of course, .NET arrays are managed collections, so you don't have to deal with the manual allocation/deletion.
But declaring:
int[][] TwoDimAry ;
... in C# does not appear to have the same effect - namely, you have to innitialize ALL of the sub-arrays at the same time, and they have to be the same length.
I need my sub-arrays to be independent of each-other, as they are in native C++.
What's the best way to implement this using managed collections? Are there any drawbacks I should be aware of?
Like C++, you need to initialize every subarray in an int[][].
However, they don't need to have the same length. (That's why it's called a jagged array)
For example:
int[][] jagged1 = new int[][] { new int[1], new int[2], new int[3] };
Your C++ code can be translated directly to C#:
int[][] TwoDimAry = new int[n][];
for(int i = 0; i < n; i++) {
TwoDimAry[i] = new int[i + n];
}
Here is an example with a jagged array initialized with 1, 2, 3, .. elements for each row
int N = 20;
int[][] array = new int[N][]; // First index is rows, second is columns
for(int i=0; i < N; i++)
{
array[i] = new int[i+1]; // Initialize i-th row with 'i' columns
for( int j = 0; j <= i; j++)
{
array[i][j] = N*j+i; // Set a value for each column in the row
}
}
I have use this enough to know that there aren't many drawbacks overall. Hybrid appraches with List<int[]> or List<int>[] also work.
In .Net, most of the time you don't want to use arrays this way at all. This is because in .Net, arrays are thought of as a different animal from a collection. Managed, yes. Collection? Well, maybe, but it confuses terms because that means something special. If you want a collection (hint: most of the time you do), look in the Systems.Collections namespace, particularly Systems.Collections.Generic. It sounds like you really want either a List<List<int>> or a List<int[]>.