How to manually calculate the memory being used - C#

Is there any way to manually calculate the memory that an array is going to consume?
I am using C# on a 64-bit OS.
Let's say I have the following array:
int[][] number = new int[2][];
number[0] = new int[2];
number[0][0] = 25;
number[0][1] = 60;
....
...
So my first question is: does each dimension of the array get the same bit allocation? Let's say number[0][0] is allocated 12 bits (I don't know if 12 bits is the right answer); would that make the first row 24 bits of allocated memory?
How much physical and virtual memory does each dimension take?
If I use int, double or string for the array, is there any difference in the memory used?
Finally, if I use GC.GetTotalMemory, will I receive the same result as the total memory used by the array?

You can use the sizeof operator to get how many bytes a value of your type occupies.
int[][] number = new int[2][];
for (int i = 0; i < number.Length; i++)
{
    number[i] = new int[2];
}
int size = sizeof(int) * number.Length * number[0].Length;
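The sizeof product counts only the element data. Each array is also a managed object with its own header, and the outer array of a jagged array stores references to the inner arrays, so a GC-based measurement will generally be larger than the computed figure. The following is a minimal sketch of comparing the two; the class and method names (ArrayMemory, PayloadBytes, MeasuredBytes) are made up for illustration, and the measured value is only approximate because other allocations can happen between the two GC.GetTotalMemory calls.

```csharp
using System;

public static class ArrayMemory
{
    // Builds a jagged int array with the given shape.
    public static int[][] Allocate(int rows, int cols)
    {
        var a = new int[rows][];
        for (int i = 0; i < a.Length; i++)
        {
            a[i] = new int[cols];
        }
        return a;
    }

    // Bytes used by the int elements only (no headers, no references).
    public static long PayloadBytes(int[][] a)
    {
        long total = 0;
        foreach (int[] row in a)
        {
            total += (long)sizeof(int) * row.Length;
        }
        return total;
    }

    // Approximate managed-heap growth caused by allocating the array.
    public static long MeasuredBytes(int rows, int cols)
    {
        long before = GC.GetTotalMemory(true);
        int[][] a = Allocate(rows, cols);
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(a); // keep the array alive until after the second reading
        return after - before;
    }
}
```

For a 2 x 2 array, PayloadBytes returns 16, while MeasuredBytes typically reports more. For double the element size is sizeof(double), which is 8. For string the array holds references and each string is a separate object, so sizeof does not apply.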

Related

Converting arrays ushort[] to int[] using Buffer.BlockCopy

I'm trying to convert a ushort[4k*4k] array with values 0-65k to a similar int[] array holding the same values.
It seems to me that Buffer.BlockCopy is the fastest way to do that.
I'm trying the following code:
ushort[] uPixels = MakeRandomShort(0, 65000, 4000 * 4000);// creates ushort[] array
int[] iPixels = new int[4000 * 4000];
int size = sizeof(ushort);
int length = uPixels.Length * size;
System.Buffer.BlockCopy(uPixels, 0, iPixels, 0, length);
But iPixels stores some strange values in a very strange range: +-1411814783, +-2078052064, etc.
What is wrong, and what do I need to do to make it work properly?
thanks!
There is a related discussion on GitHub.
Copying a ushort[] into an int[] array does not work with a routine tuned for contiguous memory ranges.
Basically, you have to clear the upper halves of the target int cells.
Then, some sort of (parallelized?) loop is needed to copy the actual data.
It could be possible to use unsafe code with pointers advanced in steps of two bytes. The implementation of Buffer.BlockCopy is not visible in the Microsoft source repository. It might make sense to hunt for the source and modify it.
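The strange values come from Buffer.BlockCopy copying raw bytes: two 16-bit source elements land in each 32-bit target element, and the second half of the target is never written. A small sketch of both behaviours follows; the class and method names are hypothetical, and the packed value in the usage note assumes a little-endian machine.

```csharp
using System;

public static class UShortWidening
{
    // Element-wise widening: each ushort becomes one int with the same value.
    public static int[] Widen(ushort[] source)
    {
        var result = new int[source.Length];
        for (int i = 0; i < source.Length; i++)
        {
            result[i] = source[i];
        }
        return result;
    }

    // What the question's code does: a raw byte copy, with no conversion.
    public static int[] RawBlockCopy(ushort[] source)
    {
        var result = new int[source.Length];
        Buffer.BlockCopy(source, 0, result, 0, source.Length * sizeof(ushort));
        return result;
    }
}
```

With the input { 1, 2, 3, 4 }, Widen returns { 1, 2, 3, 4 }. RawBlockCopy fills only the first two ints and leaves the last two at 0; on a little-endian machine the first int is 1 + (2 << 16) = 131073.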
Update
I implemented two C++ functions and did a rough measurement of the resulting performance compared to the C# loop copy.
C# implementation
const int LEN = 4000 * 4000;
for (int i = 0; i < LEN; i++)
{
    iPixels[i] = uPixels[i];
}
C++ implementation SpeedCopy1
// Copy loop with casting from unsigned short to int
__declspec(dllexport) void SpeedCopy1(unsigned short *uArray, int *iArray, int len)
{
    for (int i = 0; i < len; i++)
    {
        *iArray++ = *uArray++;
    }
}
C++ implementation SpeedCopy2
/// Copy loop with unsigned shorts
/// Clear upper half of int array elements in advance
/// (writes the low half of each int, so this assumes a little-endian target)
__declspec(dllexport) void SpeedCopy2(unsigned short* uArray, int* iArray, int len)
{
    unsigned short* up = (unsigned short*)iArray;
    memset(iArray, 0, sizeof(int) * len);
    for (int i = 0; i < len; i++)
    {
        *up = *uArray++;
        up += 2;
    }
}
Resulting times:
C# loop copy 27 ms
SpeedCopy1 9 ms
SpeedCopy2 18 ms
Compared to the C# loop, the external C++ function can reduce the copy time to about a third.
It remains to be shown what effect could be gained by multi-threading.

C# BigInteger for huge number of bits

I would like to know if there is any efficient way to store a big number using C#. I would like to create number consisting of 960 bytes but BigInteger can't hold it. I would be grateful for any advice.
UPDATE: I am using a random byte generator to fill the array needed for the BigInteger constructor. For a 960-byte array, BigInteger is returning a negative number.
static void Main(string[] args)
{
    var arr = new byte[960];
    for (int i = 0; i != arr.Length; i++)
    {
        arr[i] = byte.MaxValue;
    }
    var big = new BigInteger(arr);
}
This works fine, and the result is -1 because the number is represented in two's complement. That means a number whose binary digits are all 1s always resolves to -1, as you can see in the article.
If you make the array one byte longer and set the last element to zero, you should get a positive number that represents your binary number (this one extra byte will not hurt you):
var arr = new byte[961];
// fill arr[0..959] as before; the extra last byte stays zero
arr[arr.Length - 1] = 0;
var big2 = new BigInteger(arr);
But then you really should be sure what format your binary number is in and what BigInteger is "reading".
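As a sketch of the zero-byte approach, the helper below copies the bytes into an array one element longer, so the most significant byte (the last one, since the byte[] constructor reads little-endian) is always zero and the value is never treated as negative. The class and method names are hypothetical.

```csharp
using System;
using System.Numerics;

public static class UnsignedBigInteger
{
    // Interprets little-endian bytes as a non-negative number by appending
    // a zero byte, which keeps the sign bit clear.
    public static BigInteger FromUnsignedLittleEndian(byte[] magnitude)
    {
        var padded = new byte[magnitude.Length + 1]; // last byte stays 0
        Array.Copy(magnitude, padded, magnitude.Length);
        return new BigInteger(padded);
    }
}
```

For 960 bytes that are all 0xFF, new BigInteger(bytes) is -1, while FromUnsignedLittleEndian(bytes) is 2^7680 - 1.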

Problems with large bool array in C#

I decided to write a prime number generator as an easy exercise. The code is pretty simple:
static void generatePrimes (long min, long max)
{
    bool[] prime = new bool[max + 1];
    for (long i = 2; i < max + 1; i++)
        prime [i] = true;
    for (long i = 2; i < max + 1; i++) {
        if (prime [i]) {
            if (i >= min)
                Console.WriteLine (i);
            for (long j = i * 2; j < max + 1; j += i)
                prime [j] = false;
        }
    }
    Console.WriteLine ();
}
It works just fine with input like 1..10000. However, around max=1000000000 it starts to work EXTREMELY slowly; also, mono takes about 1 GB of memory. To me, it seems kind of strange: shouldn't bool[1000000000] take 1000000000 bits, not bytes? Maybe I'm making some stupid mistake that I don't see that makes it so inefficient?
The smallest unit of information a computer can address is a byte. Thus a bool is stored as a byte. You will need special code to put 8 bools in one byte. The BitArray class does this for you.
Nope. Contrary to C++'s vector<bool>, in C# an array of bool is, well, an array of bools.
If you want your values to be packed (8 bits per bool), use a BitArray instead.
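A minimal sketch of the same sieve on top of BitArray, which packs eight flags per byte. The class and method names are hypothetical, the method returns the primes instead of printing them, and it starts crossing out at i * i rather than i * 2, which is a common variation and not part of the original code. BitArray is indexed by int, so max must stay below int.MaxValue.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class PackedSieve
{
    // Returns all primes p with min <= p <= max.
    public static List<int> Primes(int min, int max)
    {
        var result = new List<int>();
        if (max < 2)
        {
            return result;
        }

        // One bit per number; a set bit marks a composite. All bits start false.
        var composite = new BitArray(max + 1);
        for (int i = 2; i <= max; i++)
        {
            if (composite[i])
            {
                continue;
            }
            if (i >= min)
            {
                result.Add(i);
            }
            // long avoids overflow of i * i for large i.
            for (long j = (long)i * i; j <= max; j += i)
            {
                composite[(int)j] = true;
            }
        }
        return result;
    }
}
```

For example, PackedSieve.Primes(10, 30) yields 11, 13, 17, 19, 23 and 29. For max = 1000000000 the flags take roughly 125 MB instead of about 1 GB.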

Converting a partial MD5 hash code into a long

I'm using the MD5 algorithm to hash the key for an on-disk hash table (I know it's questionable whether this is the best algorithm to use for this, but I'm going with it for now. The problem is generalizable to any algorithm that produces a byte array). My problem is this:
The size of the hash code determines the number of combinations (buckets) in the hash table. Since MD5 is 128 bit, there are a huge number of combinations (~ 3.4e38) which is way too big for my purpose. So what I want to do is pick off the first n bits of the byte array that MD5 produces, and convert those into a long (or ulong) value. Since MD5 produces a byte array, it would be easy to do if I wanted an integral number of bytes, but this leads to too big a jump in the number of combinations. I'm finding the single bit version to be a lot trickier.
Goal:
n = 10 // I.e. I want 2^10 combinations
long pos = someFcn(byte[] key, n)
where key is the value being hashed, and n is the number of bits of the MD5 result I want to use. pos, then, will be an integer from 0 to 1023 (in the case of n = 10). If n = 11, the code will be from 0 to 2^11-1 = 2047, etc. It has to be somewhat fast/efficient.
Doesn't seem that hard but it's eluding me. Any help would be much appreciated. Thanks.
First, convert the first four bytes into an integer with BitConverter.ToInt32. It reads four bytes no matter what, but this probably won't make it measurably slower, since you're working with 32-bit registers for the rest of the calculations anyway, and complex stuff like "if it's < 16 then do this with the first two bytes" will just make it more complicated.
Then, given that integer, take the lowest N bits. If you really want a specific number of bits [a power-of-two number of buckets] not known at compile time, ~((-1)<<N) is a nice trick to get 2^N-1.
Or you could simply use ToUInt32 instead and take it modulo a prime number [it might be slightly better to convert to UInt64 instead, then you've got fully half the bits to start with, in this case].
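The mask approach might look like the sketch below. The names are hypothetical, and the unsigned ToUInt32 variant is used so that the mask and the value have the same type. Which bits count as the "lowest" depends on the machine's byte order, because BitConverter reads in native order; n is limited to 31 because C# masks a 32-bit shift count to its low five bits.

```csharp
using System;

public static class HashBuckets
{
    // Maps a hash to a bucket index in [0, 2^n - 1] using the lowest n bits
    // of the first four bytes.
    public static long BucketFromLowBits(byte[] hash, int n)
    {
        if (hash == null || hash.Length < 4)
        {
            throw new ArgumentException("Need at least four hash bytes.", nameof(hash));
        }
        if (n < 1 || n > 31)
        {
            throw new ArgumentOutOfRangeException(nameof(n));
        }

        uint value = BitConverter.ToUInt32(hash, 0);
        uint mask = ~(uint.MaxValue << n); // 2^n - 1
        return value & mask;
    }
}
```

With n = 10 and a hash whose first four bytes are all 0xFF, the result is 1023.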
To obtain the first 10 bits, for example:
int result = ((int)key[0] << 2) | (((int)key[1] >> 6) & 0x03);
If you have an array like this,
unsigned char data[2000];
then you can just scrape off the first n bits into an integer like so:
#include <stddef.h>

typedef unsigned long long int MyInt;
MyInt scrape(size_t n, unsigned char * data)
{
    MyInt result = 0;
    size_t b;
    for (b = 0; b < n / 8; ++b)
    {
        result <<= 8;
        result += data[b];
    }
    const size_t remaining_bits = n % 8;
    if (remaining_bits != 0) /* avoids reading data[b] past the end when n is a multiple of 8 */
    {
        result <<= remaining_bits;
        result += (data[b] >> (8 - remaining_bits));
    }
    return result;
}
I'm assuming that CHAR_BIT == 8; feel free to generalize the code if you like. Also, the size of the array times 8 must be at least n.
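Since the question is about C#, here is a rough port of the same idea: it takes the first n bits of the array, most significant bit first, independent of the machine's byte order. The class and method names are hypothetical, and n is capped at 64 so the result fits in a ulong.

```csharp
using System;

public static class BitScraper
{
    // Returns the first n bits of data (most significant bit first) as an integer.
    public static ulong LeadingBits(byte[] data, int n)
    {
        if (data == null)
        {
            throw new ArgumentNullException(nameof(data));
        }
        if (n < 0 || n > 64 || n > data.Length * 8)
        {
            throw new ArgumentOutOfRangeException(nameof(n));
        }

        ulong result = 0;
        int fullBytes = n / 8;
        for (int b = 0; b < fullBytes; b++)
        {
            result = (result << 8) | data[b];
        }

        int remaining = n % 8;
        if (remaining != 0)
        {
            // Take only the top 'remaining' bits of the next byte.
            result = (result << remaining) | (uint)(data[fullBytes] >> (8 - remaining));
        }
        return result;
    }
}
```

For n = 10 this agrees with the two-byte expression given earlier: the bytes { 0x80, 0x40 } yield (0x80 << 2) | (0x40 >> 6) = 513.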

Fast casting in C# using BitConverter, can it be any faster?

In our application, we have a very large byte array and we have to convert these bytes into different types. Currently, we use BitConverter.ToXXXX() for this purpose. Our heavy hitters are ToInt16 and ToUInt64.
For UInt64, our problem is that the data stream actually has 6 bytes of data to represent a large integer. Since there is no native function to convert 6 bytes of data to UInt64, we do:
UInt64 value = BitConverter.ToUInt64() & 0x0000ffffffffffff;
Our use of ToInt16 is simpler, so we don't have to do any bit manipulation.
We do so many of these two operations that I wanted to ask the SO community whether there's a faster way to do these conversions. Right now, approximately 20% of our entire CPU cycles are consumed by these two functions.
Have you thought about using memory pointers directly? I can't vouch for its performance, but it is a common trick in C/C++...
byte[] arr = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
fixed (byte* a2rr = &arr[0])
{
    UInt64* uint64ptr = (UInt64*) a2rr;
    Console.WriteLine("The value is {0:X2}", (*uint64ptr & 0x0000FFFFFFFFFFFF));
    uint64ptr = (UInt64*) ((byte*) uint64ptr + 6);
    Console.WriteLine("The value is {0:X2}", (*uint64ptr & 0x0000FFFFFFFFFFFF));
}
You'll need to allow unsafe code for your assembly in the build settings, as well as mark the method in which you do this as unsafe. You are also tied to little endian with this approach.
You can use the System.Buffer class to copy a whole array over to another array of a different type as a fast, 'block copy' operation:
The BlockCopy method accesses the bytes in the src parameter array using offsets into memory, not programming constructs such as indexes or upper and lower array bounds.
The arrays must be of 'primitive' types, they must align, and the copy operation is endian-sensitive. In your case, 6-byte integers can't align with any of .NET's 'primitive' types, unless you can obtain the source array with two bytes of padding for each six, which will then align to Int64. But this method will work for arrays of Int16, which may speed up some of your operations.
Why not:
UInt16 valLow = BitConverter.ToUInt16();
UInt64 valHigh = (UInt64)BitConverter.ToUInt32();
UInt64 Value = (valHigh << 16) | valLow;
You can make that a single statement, although the JIT compiler will probably do that for you automatically.
That will prevent you from reading those extra two bytes that you end up throwing away.
If that doesn't reduce CPU, then you'll probably want to write your own converter that reads the bytes directly from the buffer. You can either use array indexing or, if you think it's necessary, unsafe code with pointers.
Note that, as a commenter pointed out, if you use any of these suggestions, then either you're limited to a particular "endian-ness", or you'll have to write your code to detect little/big endian and react accordingly. The code sample I showed above works for little endian (x86).
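A hand-written converter that reads the bytes directly from the buffer by array indexing could look like the sketch below. The names are hypothetical, and it assumes the data stream is little-endian; because it assembles the value with shifts, it gives the same result regardless of the machine's own byte order. Whether it is actually faster than BitConverter would need to be measured.

```csharp
using System;

public static class LittleEndianReader
{
    // Reads a 6-byte little-endian unsigned integer starting at offset.
    public static ulong ReadUInt48(byte[] buffer, int offset)
    {
        return (ulong)buffer[offset]
             | ((ulong)buffer[offset + 1] << 8)
             | ((ulong)buffer[offset + 2] << 16)
             | ((ulong)buffer[offset + 3] << 24)
             | ((ulong)buffer[offset + 4] << 32)
             | ((ulong)buffer[offset + 5] << 40);
    }

    // Reads a 2-byte little-endian signed integer starting at offset.
    public static short ReadInt16(byte[] buffer, int offset)
    {
        return (short)(buffer[offset] | (buffer[offset + 1] << 8));
    }
}
```

For the bytes { 1, 2, 3, 4, 5, 6 }, ReadUInt48 returns 0x060504030201, and only the six relevant bytes are touched, so no masking is needed.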
See my answer for a similar question here.
It's the same unsafe memory manipulation as in Jimmy's answer, but in a more "friendly" way for consumers. It'll allow you to view your byte array as UInt64 array.
For anyone else who stumbles across this: if you only need little endian and do not need to auto-detect big endian and convert from it, I've written an extended version of BitConverter with a number of additions to handle Span as well as converting arrays of type T, for example int[] or TimeSpan[].
It also extends the supported types to include TimeSpan, decimal and DateTime.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Serialization/BitConverterExtended.cs
Example usage:
Random rnd = new Random();
RentedBuffer<byte> buffer = RentedBuffer<byte>.Shared.Rent(BitConverterExtended.SizeOfUInt64
    + (20 * BitConverterExtended.SizeOfUInt16)
    + (20 * BitConverterExtended.SizeOfTimeSpan)
    + (10 * BitConverterExtended.SizeOfSingle));
UInt64 exampleLong = long.MaxValue;
int startIndex = 0;
startIndex += BitConverterExtended.GetBytes(exampleLong, buffer.BufferSpan, startIndex);
UInt16[] shortArray = new UInt16[20];
for (int I = 0; I < shortArray.Length; I++) { shortArray[I] = (ushort)rnd.Next(0, UInt16.MaxValue); }
// When using reflection / expression trees, the CLR cannot distinguish between UInt16 and Int16 or UInt64 and Int64 etc.
// Therefore the UInt methods are renamed.
startIndex += BitConverterExtended.GetBytesUShortArray(shortArray, buffer.BufferSpan, startIndex);
TimeSpan[] timespanArray = new TimeSpan[20];
for (int I = 0; I < timespanArray.Length; I++) { timespanArray[I] = TimeSpan.FromSeconds(rnd.Next(0, int.MaxValue)); }
startIndex += BitConverterExtended.GetBytes(timespanArray, buffer.BufferSpan, startIndex);
float[] floatArray = new float[10];
for (int I = 0; I < floatArray.Length; I++) { floatArray[I] = MathF.PI * rnd.Next(short.MinValue, short.MaxValue); }
startIndex += BitConverterExtended.GetBytes(floatArray, buffer.BufferSpan, startIndex);
// Do stuff with buffer and then
buffer.Return(); // always better to return it as soon as possible
// Or in case you forget
buffer = null;
// and let RentedBufferContract do this automatically
It supports reading from and writing to both byte[] and RentedBuffer; however, using the RentedBuffer class greatly reduces GC overhead.
The RentedBufferContract class internally handles returning buffers to the pool to prevent memory leaks.
It also includes a serializer which is similar to MessagePack.
Note: MessagePack is a faster serializer with more features; however, this serializer reduces GC overhead by reading from and writing to rented byte buffers.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Serialization/ChillXSerializer.cs
