Determining the number of bytes used by a variable

I have the following array:
byte[][] A = new byte[256][];
Each element of this array references another array.
A[n] = new byte[256];
However, most elements reference the same array. In fact, array A only references two or three unique arrays.
Is there an easy way to determine how much memory the entire thing uses?

If your question is to find out the number of unique 1D arrays, you could do:
A.Distinct().Count()
This works because arrays are compared by reference by default.
But perhaps you're looking for:
A.Distinct().Sum(oneDimArray => oneDimArray.Length) * sizeof(byte)
Of course, "number of bytes used by variables" is a somewhat imprecise term. In particular, the above expression doesn't account for the storage of the variable A, references in the jagged array, overhead, alignment etc.
EDIT: As Rob points out, you may need to filter null references out if the jagged-array can contain them.
You can estimate the cost of storing the references in the jagged array with the following (sizeof(IntPtr) needs an unsafe context; IntPtr.Size gives the same value without one):
A.Length * sizeof(IntPtr)
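Putting the pieces of this answer together, a rough sketch might look like the following. The names JaggedArraySize and EstimateBytes are made up for illustration, and the result still ignores object headers, alignment and the variable A itself, so treat it as a lower bound rather than an exact figure.

```csharp
using System;
using System.Linq;

public static class JaggedArraySize
{
    // Hypothetical helper: payload bytes of the distinct inner arrays
    // plus the space taken by the reference slots of the outer array.
    public static long EstimateBytes(byte[][] jagged)
    {
        long payload = jagged
            .Where(inner => inner != null)   // skip empty slots
            .Distinct()                      // arrays compare by reference by default
            .Sum(inner => (long)inner.Length * sizeof(byte));

        // One pointer-sized slot per element of the outer array.
        long references = (long)jagged.Length * IntPtr.Size;
        return payload + references;
    }

    public static void Main()
    {
        var shared = new byte[256];
        var a = new byte[256][];
        for (int i = 0; i < a.Length; i++) a[i] = shared;
        a[7] = new byte[256];                // a second unique inner array
        a[9] = null;                         // null slots are ignored

        Console.WriteLine(EstimateBytes(a));
    }
}
```

IntPtr.Size is used here instead of sizeof(IntPtr) only so that the sketch compiles without an unsafe context.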

I don't believe there's any built in functionality.
I whipped this up very quickly and haven't tested it thoroughly, however:
void Main()
{
byte[][] a = new byte[256][];
var someArr = new byte[256];
a[0] = someArr;
a[1] = someArr;
a[2] = new byte[256];
getSize(a).Dump();
}
private long getSize(byte[][] arr)
{
var hashSet = new HashSet<byte[]>();
long size = 0;
foreach(var innerArray in arr)
{
if(innerArray != null)
hashSet.Add(innerArray);
}
foreach (var array in hashSet)
{
size += array.Length * sizeof(byte);
}
return size;
}

I just modified Rob's getSize method to use the Buffer helper class.
private long getSize(byte[][] arr)
{
Dictionary<byte[], bool> lookup = new Dictionary<byte[], bool>();
long size = 0;
foreach (byte[] innerArray in arr)
{
if (innerArray == null || lookup.ContainsKey(innerArray)) continue;
lookup.Add(innerArray, true);
size += Buffer.ByteLength(innerArray);
}
return size;
}

Related

c# A better way to Copy/Merge multiple arrays into one array

Would anyone be able to advise if there's a better way to copy multiple arrays into a single array? The resulting array must have the elements in the same order such as arrayOne values first then arraySecond values next etc.
Following is a mockup of what I'm currently executing, which is working as expected. I'm looking for a smarter way of doing this.
// Initialise the first array.
string[] arrayOne = new string[5];
arrayOne[0] = "arrayOneValue[0]";
arrayOne[1] = "arrayOneValue[1]";
arrayOne[2] = "arrayOneValue[2]";
arrayOne[3] = "arrayOneValue[3]";
arrayOne[4] = "arrayOneValue[4]";
// Initialise the second array.
string[] arrayTwo = new string[6];
arrayTwo[0] = "arrayTwoValue[0]";
arrayTwo[1] = "arrayTwoValue[1]";
arrayTwo[2] = "arrayTwoValue[2]";
arrayTwo[3] = "arrayTwoValue[3]";
arrayTwo[4] = "arrayTwoValue[4]";
arrayTwo[5] = "arrayTwoValue[5]";
// Initialise the third array.
string[] arrayThree = new string[3];
arrayThree[0] = "arrayThreeValue[0]";
arrayThree[1] = "arrayThreeValue[1]";
arrayThree[2] = "arrayThreeValue[2]";
// string[] arrayN = new string[n]
//.
//.
//.
// Initialise the target array.
string[] finalArray = new string[arrayOne.Length + arrayTwo.Length + arrayThree.Length];
// ArrayN - string[] finalArray = new string[arrayOne.Length + arrayTwo.Length + arrayThree.Length + arrayN.Length];
// Copy/merge the three arrays into the target array.
Array.Copy(arrayOne, 0, finalArray, 0, arrayOne.Length);
Array.Copy(arrayTwo, 0, finalArray, arrayOne.Length, arrayTwo.Length);
Array.Copy(arrayThree, 0, finalArray, (arrayOne.Length + arrayTwo.Length), arrayThree.Length);
//.
//.
//.
//.
// ArrayN - Array.Copy(arrayN, 0, finalArray, (arrayOne.Length + arrayTwo.Length + arrayThree.Length), arrayN.Length);
As you can see, for arrayN the code can get longer. I have a maximum of 5 arrays I'm trying to copy into one array, therefore, it's manageable. I'm using this technique as part of a WebAPI where a collection of Oracle parameter objects are consolidated based on business rules to be passed to several Oracle stored procedures. Any advice here is appreciated. Thanks in advance.
Result
Console output
/*--- Destination array -
arrayOneValue[0]
arrayOneValue[1]
arrayOneValue[2]
arrayOneValue[3]
arrayOneValue[4]
arrayTwoValue[0]
arrayTwoValue[1]
arrayTwoValue[2]
arrayTwoValue[3]
arrayTwoValue[4]
arrayTwoValue[5]
arrayThreeValue[0]
arrayThreeValue[1]
arrayThreeValue[2]*/

You can just use LINQ .Concat so that you don't need to manually take care of array lengths and offsets:
var finalArray = arrayOne.Concat(arrayTwo).Concat(arrayThree).ToArray();
It may be a little less performant than using Array.Copy, but this code is much more readable, maintainable and error-safe, which is more important.

By creating a big array up front, then using Array.Copy, we can achieve very reasonable speeds for concatenating, even with a very large number of arrays:
public static T[] ConcatArrays<T>(params T[][] p)
{
var position = 0;
var outputArray = new T[p.Sum(a => a.Length)];
foreach (var curr in p)
{
Array.Copy(curr, 0, outputArray, position, curr.Length);
position += curr.Length;
}
return outputArray;
}
So, now we can either:
string[] bigArray = ConcatArrays(arrayOne, arrayTwo, arrayThree);
or
string[][] arrays = new[]{arrayOne, arrayTwo, arrayThree};
string[] bigArray = ConcatArrays(arrays);

var one = new [] { arrayOne, arrayTwo, arrayThree }.SelectMany(x => x);

You can put your input arrays in a collection and iterate over them. I only mention this because this may be more efficient than using LINQ. It depends on the data you're dealing with, but probably not enough to make a difference.
In the code below, on my machine, LINQ takes 9000-13000 ticks (one tick = 100 ns) while calling Array.Copy is ~500 ticks.
public static void Benchmark1()
{
var arr1 = Enumerable.Range(1,10000).ToArray();
var arr2 = Enumerable.Range(10001,20000).ToArray();
var arr3 = Enumerable.Range(20001,30000).ToArray();
var arr4 = Enumerable.Range(30001,40000).ToArray();
var sw = Stopwatch.StartNew();
var result = arr1.Concat(arr2).Concat(arr3).Concat(arr4).ToArray();
sw.Stop();
Console.WriteLine($"Elapsed ticks: {sw.ElapsedTicks}");
}
public static void Benchmark2()
{
var arr1 = Enumerable.Range(1,10000).ToArray();
var arr2 = Enumerable.Range(10001,20000).ToArray();
var arr3 = Enumerable.Range(20001,30000).ToArray();
var arr4 = Enumerable.Range(30001,40000).ToArray();
var arrays = new List<int[]>() {arr1, arr2, arr3, arr4};
var sw = Stopwatch.StartNew();
int finalLen = 0;
foreach (var arr in arrays)
{
finalLen += arr.Length;
}
var result = new int[finalLen];
int currentPosition = 0;
foreach (var arr in arrays)
{
Array.Copy(arr, 0, result, currentPosition, arr.Length);
currentPosition += arr.Length;
}
sw.Stop();
Console.WriteLine($"Elapsed ticks: {sw.ElapsedTicks}");
}

Do you have to use arrays? Use lists and AddRange(). If you need to have an array eventually then just call ToArray() in the end.
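As a rough illustration of the list-based approach, the sketch below collects any number of arrays into a List<T> with AddRange() and converts to an array only at the end. The names ListMerge and Merge are hypothetical and not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class ListMerge
{
    // Hypothetical helper: append every input array to one list, in order,
    // then convert to an array once at the end.
    public static T[] Merge<T>(params T[][] arrays)
    {
        var merged = new List<T>();
        foreach (var array in arrays)
            merged.AddRange(array);          // keeps the original element order
        return merged.ToArray();
    }

    public static void Main()
    {
        string[] arrayOne = { "one[0]", "one[1]" };
        string[] arrayTwo = { "two[0]" };
        string[] arrayThree = { "three[0]", "three[1]" };

        string[] finalArray = Merge(arrayOne, arrayTwo, arrayThree);
        foreach (var value in finalArray)
            Console.WriteLine(value);
    }
}
```

The list may reallocate its internal storage as it grows, so this trades a little speed for not having to track offsets by hand.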

Loading multi-dimensional array dynamically

I have the following code. It's roughly analogous in concept to the Python reshape function. It successfully loads 1-dimensional data into a multi-dimensional array, the dimensions of which are not known until runtime, for example {209,64,64,3}. I have to iterate over the 1-dimensional data and create the correct indexes for each dimension of the array.
private void InitializeData()
{
var imageData = ImageData.Load(txtFileName.Text); // one dimensional array
var dimensions = txtDimensions.Text.Split(',').Select(d => int.Parse(d)).ToArray(); // e.g., {-1,64,64,3}
int elements = 1;
foreach (var dim in dimensions.Skip(1))
{
elements *= dim;
}
dimensions[0] = imageData.Length / elements; // {209,64,64,3}
// create multipliers
var multipliers = new int[dimensions.Length - 1];
for (var dimension = 1; dimension < dimensions.Length; dimension++)
{
var multiplier = 1;
for (var followingdimension = dimension; followingdimension < dimensions.Length; followingdimension++)
{
multiplier *= dimensions[followingdimension];
}
multipliers[dimension - 1] = multiplier;
}
// load data
var dataArray = Array.CreateInstance(typeof(int), dimensions);
var indexes = new int[dimensions.Length];
for (var imageDataIndex = 0; imageDataIndex < imageData.Length; imageDataIndex++)
{
indexes[0] = imageDataIndex / multipliers[0];
indexes[dimensions.Length - 1] = imageDataIndex % multipliers[multipliers.Length - 1];
for (var multiplier = 1; multiplier < dimensions.Length - 1; multiplier++)
indexes[multiplier] = (imageDataIndex / multipliers[multiplier]) % dimensions[multiplier];
dataArray.SetValue(imageData[imageDataIndex], indexes);
}
}
Is there a faster or more elegant way of doing this? I do realize those are two different things. I'll benchmark the elegant suggestions, but I'd still like to see them, because this is just too ugly to look at and was too painful to write to be the best way.
Note (Please)
The data may not always be image data, so I am not looking for bitmap operations. That just happens here but it's not necessarily a typical case. And, my goal is not to get a bitmap, but an array.

I have a partial answer thanks to "How to reshape an Array in c#".
The code can be replaced with just this:
var imageData = ImageData.Load(txtFileName.Text); // one dimensional array
// e.g., {209,64,64,3}
var dimensions = txtDimensions.Text.Split(',').Select(d => int.Parse(d)).ToArray();
int elements = 1;
foreach (var dim in dimensions.Skip(1))
{
elements *= dim;
}
dimensions[0] = imageData.Length / elements;
// load data
var dataArray = Array.CreateInstance(typeof(int), dimensions);
Buffer.BlockCopy(imageData, 0, dataArray, 0, imageData.Length * sizeof(int));
I would be surprised if there's a faster way to do the actual load than Buffer.BlockCopy, or a simpler one. It turns out that whatever dimensional form your original data is in, BlockCopy handles it as long as you can specify your target dimensions as part of a target array.
I'll keep looking for ways to further refine the rest of the original code.
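To make the BlockCopy idea concrete on a small scale, here is a minimal sketch that reshapes twelve ints into a 2x3x2 array. The names ReshapeSketch and Reshape are invented for this example, and it assumes the element type is int, as in the code above.

```csharp
using System;

public static class ReshapeSketch
{
    // Hypothetical helper: copy a flat int[] into a new array with the given dimensions.
    public static Array Reshape(int[] flat, params int[] dimensions)
    {
        long expected = 1;
        foreach (int d in dimensions) expected *= d;
        if (expected != flat.Length)
            throw new ArgumentException("Dimensions do not match the data length.");

        Array target = Array.CreateInstance(typeof(int), dimensions);
        // BlockCopy counts bytes, not elements, hence the sizeof(int) factor.
        Buffer.BlockCopy(flat, 0, target, 0, flat.Length * sizeof(int));
        return target;
    }

    public static void Main()
    {
        int[] flat = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
        Array cube = Reshape(flat, 2, 3, 2);

        // Multi-dimensional arrays are stored in row-major order, so
        // [1, 2, 1] is the last element: 1*6 + 2*2 + 1 = flat index 11.
        Console.WriteLine(cube.GetValue(1, 2, 1)); // prints 11
    }
}
```

Because the copy is a raw byte copy, it only works for arrays of primitive types such as int, byte or double.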

struct vs class performance test

I built a test and got the following results:
allocating classes: 15.3260622, allocating structs: 14.7216018.
That looks like a 4% advantage when allocating structs instead of classes. That's nice, but is it really enough to justify adding value types to the language? Where can I find an example which shows that structs really beat classes?
void Main()
{
var stopWatch = new System.Diagnostics.Stopwatch();
stopWatch.Start();
for (int i = 0; i < 100000000; i++)
{
var foo = new refFoo()
{
Str = "Alex" + i
};
}
stopWatch.Stop();
stopWatch.Dump();
stopWatch.Restart();
for (int i = 0; i < 100000000; i++)
{
var foo = new valFoo()
{
Str = "Alex" + i
};
}
stopWatch.Stop();
stopWatch.Dump();
}
public struct valFoo
{
public string Str;
}
public class refFoo
{
public string Str;
}

Your methodology is wrong. You are mostly measuring string allocations, conversions of integers to strings, and concatenation of strings. This benchmark is not worth the bits it is written on.
In order to see the benefit of structs, compare allocating an array of 1000 objects and an array of 1000 structs. In the case of the array of objects, you will need one allocation for the array itself, and then one allocation for each object in the array. In the case of the array of structs, you have one allocation for the array of structs.
Also, look at the implementation of the Enumerator of the List class in the C# source code of the .NET collections. It is declared as a struct. It holds only a few small fields (a reference to the list, an index, a version number and the current element), so it is inexpensive to create and copy, and iterating with foreach does not need a heap allocation for it.
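The array comparison described above can be sketched as follows. The types PointStruct and PointClass and the class AllocationSketch are hypothetical; the point is only where the allocations happen, not a timing measurement.

```csharp
using System;

public struct PointStruct { public int X; public int Y; }
public class PointClass { public int X; public int Y; }

public static class AllocationSketch
{
    public static PointStruct[] MakeStructs(int count)
    {
        // One allocation: the struct values live inline inside the array.
        return new PointStruct[count];
    }

    public static PointClass[] MakeObjects(int count)
    {
        // One allocation for the array of references...
        var objects = new PointClass[count];
        // ...plus one separate heap allocation per element.
        for (int i = 0; i < count; i++)
            objects[i] = new PointClass();
        return objects;
    }

    public static void Main()
    {
        PointStruct[] structs = MakeStructs(1000);
        structs[0].X = 42;                       // usable immediately, no 'new' per element

        PointClass[] empty = new PointClass[1000];
        Console.WriteLine(empty[0] == null);     // elements are null until allocated

        PointClass[] objects = MakeObjects(1000);
        objects[0].X = 42;
        Console.WriteLine(structs[0].X == objects[0].X);
    }
}
```

A benchmark built on these two methods would measure allocation cost without the string concatenation that dominates the test in the question.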

Try some simpler test:
int size = 1000000;
var listA = new List<int>(size);
for (int i = 0; i < size; i++)
listA.Add(i);
var listB = new List<object>(size);
for (int i = 0; i < size; i++)
listB.Add(i);
To store 1000000 integers, in the first case the system allocates 4000000 bytes. In the second, if I'm not mistaken, about 12000000 bytes, because each int is boxed into its own heap object and the list stores a reference to it. And I suspect the performance difference will be much greater.

String.Where Comparatively Poor Performance

I have two methods that take a string and remove any 'invalid' characters (characters contained in a hashset). One method uses Linq.Where, another uses a loop w/ char array.
The Linq method takes nearly twice as long (208756.9 ticks) as the loop (108688.2 ticks)
Linq:
string Linq(string field)
{
var c = field.Where(p => !hashChar.Contains(p));
return new string(c.ToArray());
}
Loop:
string CharArray(string field)
{
char[] c = new char[field.Length];
int count = 0;
for (int i = 0; i < field.Length; i++)
if (!hashChar.Contains(field[i]))
{
c[count] = field[i];
count++;
}
if (count == field.Length) // nothing was removed, so return the input as-is
return field;
char[] f = new char[count];
Buffer.BlockCopy(c, 0, f, 0, count * sizeof(char));
return new string(f);
}
My expectation would be that LINQ would beat, or at least be comparable to, the loop method. The loop method isn't even optimized. I must be missing something here.
How does Linq.Where work under the hood, and why does it lose to my method?

If the source code of ToArray in Mono is any indication, your implementation wins because it performs fewer allocations (scroll down to line 2874 to see the method).
Like many methods of LINQ, the ToArray method contains separate code paths for collections and for other enumerables:
TSource[] array;
var collection = source as ICollection<TSource>;
if (collection != null) {
...
return array;
}
In your case, this branch is not taken, so the code proceeds to this loop:
int pos = 0;
array = EmptyOf<TSource>.Instance;
foreach (var element in source) {
if (pos == array.Length) {
if (pos == 0)
array = new TSource [4];
else
// If the number of returned characters is significant,
// this method will be called multiple times
Array.Resize (ref array, pos * 2);
}
array[pos++] = element;
}
if (pos != array.Length)
Array.Resize (ref array, pos);
return array;
As you can see, LINQ's version may allocate and re-allocate the array several times. Your implementation, on the other hand, does just two allocations - the upfront one of the max size, and the final one, where the data is copied. That's why your code is faster.
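Following the same reasoning, the loop version could drop its second array as well. The sketch below is one possible variant, not the asker's code: the names CharFilter and RemoveInvalid are invented, and the invalid parameter stands in for the hashChar set from the question.

```csharp
using System;
using System.Collections.Generic;

public static class CharFilter
{
    // Hypothetical variant of the question's loop; 'invalid' plays the role of hashChar.
    public static string RemoveInvalid(string field, HashSet<char> invalid)
    {
        var buffer = new char[field.Length];     // single up-front allocation
        int count = 0;
        foreach (char ch in field)
        {
            if (!invalid.Contains(ch))
                buffer[count++] = ch;
        }

        // Nothing was removed, so the input can be returned unchanged.
        if (count == field.Length)
            return field;

        // Build the string from the used part of the buffer,
        // avoiding a second array and the BlockCopy.
        return new string(buffer, 0, count);
    }

    public static void Main()
    {
        var invalid = new HashSet<char> { '-', '_' };
        Console.WriteLine(RemoveInvalid("a-b_c", invalid));
    }
}
```

The string constructor still copies the characters, so this saves one intermediate array rather than all copying.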

change array size

Is it possible to change an array size after declaration?
If not, is there any alternative to arrays?
I do not want to create an array with a size of 1000, but I do not know the size of the array when I'm creating it.

You can use Array.Resize(), documented in MSDN.
But yeah, I agree with Corey, if you need a dynamically sized data structure, we have Lists for that.
Important: Array.Resize() doesn't resize the array (the method name is misleading), it creates a new array and only replaces the reference you passed to the method.
An example:
var array1 = new byte[10];
var array2 = array1;
Array.Resize<byte>(ref array1, 20);
// Now:
// array1.Length is 20
// array2.Length is 10
// Two different arrays.

No, try using a strongly typed List instead.
For example:
Instead of using
int[] myArray = new int[2];
myArray[0] = 1;
myArray[1] = 2;
You could do this:
List<int> myList = new List<int>();
myList.Add(1);
myList.Add(2);
Lists use arrays to store the data, so you get the speed benefit of arrays with the convenience of a LinkedList: you can add and remove items without worrying about having to manually change the size.
This doesn't mean the size of the underlying array isn't changed though - hence the emphasis on the word manually.
As soon as the List's internal array reaches its capacity, the List allocates a new array on the heap that is twice the size and copies the existing elements across.
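The growth of the backing array can be observed through the Capacity property. The following is a small sketch (the names CapacityGrowth and ObserveCapacities are made up) that records each capacity change while items are added; the exact starting capacity is an implementation detail and could differ between runtimes.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityGrowth
{
    // Records each new Capacity value seen while adding 'count' items to an empty list.
    public static List<int> ObserveCapacities(int count)
    {
        var capacities = new List<int>();
        var list = new List<int>();
        int last = list.Capacity;

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != last)           // the internal array was replaced
            {
                last = list.Capacity;
                capacities.Add(last);
            }
        }
        return capacities;
    }

    public static void Main()
    {
        // Each printed value is the size of a newly allocated internal array.
        Console.WriteLine(string.Join(", ", ObserveCapacities(100)));
    }
}
```

If the final size is roughly known in advance, passing it to the List<int>(int capacity) constructor avoids these intermediate reallocations.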

You can use Array.Resize(), which has been available since .NET Framework 2.0. This method allocates a new array with the specified size, copies elements from the old array to the new one, and then replaces the old array with the new one.
(So you will need memory available for both arrays at the same time, as this probably uses Array.Copy under the covers.)

Yes, it is possible to resize an array. For example:
int[] arr = new int[5];
// increase size to 10
Array.Resize(ref arr, 10);
// decrease size to 3
Array.Resize(ref arr, 3);
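If you create an array with the Array.CreateInstance() method, Resize() cannot be called on it directly, because the variable is typed as System.Array rather than as a specific array type such as int[]. For example:
// create an integer array with size of 5
var arr = Array.CreateInstance(typeof(int), 5);
// this does not compile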
Array.Resize(ref arr, 10);
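One possible workaround, sketched here under the assumption that the element type is known to be int at the call site, is to cast the Array back to int[] before resizing. The names CreateInstanceResize and Grow are hypothetical.

```csharp
using System;

public static class CreateInstanceResize
{
    // Hypothetical helper: resize an array that was created via Array.CreateInstance.
    public static int[] Grow(Array created, int newSize)
    {
        // Assumes 'created' really is a one-dimensional int array at run time;
        // the cast throws InvalidCastException otherwise.
        int[] typed = (int[])created;
        Array.Resize(ref typed, newSize);        // allocates a new array and copies
        return typed;
    }

    public static void Main()
    {
        Array arr = Array.CreateInstance(typeof(int), 5);
        arr.SetValue(7, 0);

        int[] bigger = Grow(arr, 10);
        Console.WriteLine(bigger.Length);        // 10
        Console.WriteLine(bigger[0]);            // the existing element is preserved
    }
}
```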
An array's size is not dynamic, even though we can resize it. If you want a dynamic array, I think you can use a generic List instead.
var list = new List<int>();
// add any item to the list
list.Add(5);
list.Add(8);
list.Add(12);
// we can remove it easily as well
list.Remove(5);
foreach(var item in list)
{
Console.WriteLine(item);
}

In C#, arrays cannot be resized dynamically.
One approach is to use System.Collections.ArrayList instead of a native array.
Another (faster) solution is to re-allocate the array with a different size and to copy the contents of the old array to the new array.
The general-purpose function ResizeArray (below) can be used to do that.
public static System.Array ResizeArray (System.Array oldArray, int newSize)
{
int oldSize = oldArray.Length;
System.Type elementType = oldArray.GetType().GetElementType();
System.Array newArray = System.Array.CreateInstance(elementType,newSize);
int preserveLength = System.Math.Min(oldSize,newSize);
if (preserveLength > 0)
System.Array.Copy (oldArray,newArray,preserveLength);
return newArray;
}
public static void Main ()
{
int[] a = {1,2,3};
a = (int[])ResizeArray(a,5);
a[3] = 4;
a[4] = 5;
for (int i=0; i<a.Length; i++)
System.Console.WriteLine (a[i]);
}

Used this approach for array of bytes:
Initially:
byte[] bytes = new byte[0];
Whenever required (Need to provide original length for extending):
Array.Resize<byte>(ref bytes, bytes.Length + requiredSize);
Reset:
Array.Resize<byte>(ref bytes, 0);
Typed List Method
Initially:
List<byte> bytes = new List<byte>();
Whenever required:
bytes.AddRange(new byte[length]);
Release/Clear:
bytes.Clear();

Use System.Collections.Generic.List

Use a List<T> instead. For instance, instead of an array of ints
private int[] _myIntegers = new int[1000];
use
private List<int> _myIntegers = new List<int>();
later
_myIntegers.Add(1);

In C#, Array.Resize is the simplest way to resize an array to a new size, e.g.:
Array.Resize<LinkButton>(ref area, size);
Here, I want to resize an array of LinkButton:
<LinkButton> = specifies the array's element type
ref area = ref is a keyword and 'area' is the array name
size = the new size of the array

private void HandleResizeArray()
{
int[] aa = new int[2];
aa[0] = 0;
aa[1] = 1;
aa = MyResizeArray(aa);
aa = MyResizeArray(aa);
}
private int[] MyResizeArray(int[] aa)
{
Array.Resize(ref aa, aa.GetUpperBound(0) + 2);
aa[aa.GetUpperBound(0)] = aa.GetUpperBound(0);
return aa;
}

If you really need to get it back into an array I find it easiest to convert the array to a list, expand the list then convert it back to an array.
string[] myArray = new string[1] {"Element One"};
// Convert it to a list
List<string> resizeList = myArray.ToList();
// Add some elements
resizeList.Add("Element Two");
// Back to an array
myArray = resizeList.ToArray();
// myArray has grown to two elements.

Use a List<T> (where T is any type or Object) when you want to add/remove data, since resizing arrays is expensive. You can read more in "Arrays considered somewhat harmful". A List, by contrast, can be added to: new records can be appended to the end, and it adjusts its size as needed.
A List can be initialized in the following ways:
Using collection initializer.
List<string> list1 = new List<string>()
{
"carrot",
"fox",
"explorer"
};
Using var keyword with collection initializer.
var list2 = new List<string>()
{
"carrot",
"fox",
"explorer"
};
Using new array as parameter.
string[] array = { "carrot", "fox", "explorer" };
List<string> list3 = new List<string>(array);
Using capacity in constructor and assign.
List<string> list4 = new List<string>(3);
list4.Add(null); // Add empty references. (Not Recommended)
list4.Add(null);
list4.Add(null);
list4[0] = "carrot"; // Assign those references.
list4[1] = "fox";
list4[2] = "explorer";
Using Add method for each element.
List<string> list5 = new List<string>();
list5.Add("carrot");
list5.Add("fox");
list5.Add("explorer");
Thus for an Object List you can allocate and assign the properties of objects inline with the List initialization. Object initializers and collection initializers share similar syntax.
class Test
{
public int A { get; set; }
public string B { get; set; }
}
Initialize list with collection initializer.
List<Test> list1 = new List<Test>()
{
new Test(){ A = 1, B = "Jessica"},
new Test(){ A = 2, B = "Mandy"}
};
Initialize list with new objects.
List<Test> list2 = new List<Test>();
list2.Add(new Test() { A = 3, B = "Sarah" });
list2.Add(new Test() { A = 4, B = "Melanie" });

This worked well for me to create a dynamic array from a class array.
var s = 0;
var songWriters = new SongWriterDetails[0];
foreach (var contributor in Contributors)
{
// Grow the array by one slot, then fill the new last slot.
Array.Resize(ref songWriters, s + 1);
songWriters[s] = new SongWriterDetails();
songWriters[s].DisplayName = contributor.Name;
songWriters[s].PartyId = contributor.Id;
s++;
}

In case you cannot use Array.Resize (the variable is not local), Concat and ToArray help:
anObject.anArray = anObject.anArray.Concat(new string[] { newArrayItem }).ToArray();

Use a generic List (System.Collections.Generic.List).
