What's the fastest way to convert a float[] to byte[]? - c#

OpenCV uses floats to store SIFT descriptors, although the values are actually integers from 0 to 255.
I am using OpenCvSharp, which is a C# wrapper for OpenCV, and I would like to convert the float-based descriptors to byte[] because that takes only 1/4 of the space.
Before I realized that, I was converting the float[] to byte[] like this to store the descriptors in a database:
float[] floatDescriptor = ...;
byte[] byteDescriptor = new byte[floatDescriptor.Length * sizeof(float)];
Buffer.BlockCopy(floatDescriptor, 0, byteDescriptor, 0, byteDescriptor.Length);
This was very fast, because I could copy the whole float array without any transformation into a byte array. But it takes 4 times more space than this:
float[] floatDescriptor = ...;
byte[] byteDescriptor = new byte[floatDescriptor.Length];
for (int i = 0; i < floatDescriptor.Length; ++i)
{
byteDescriptor[i] = (byte)floatDescriptor[i];
}
But this is a lot slower. Is there a faster way to do it?
Edit
I am aware that two different things are happening there. I'm just wondering whether there is a faster way to batch-process casts over an array, in the same way that Buffer.BlockCopy() is faster than calling BitConverter.GetBytes(float) for every float in an array.
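To make the "two different things" explicit, here is a minimal sketch that puts both operations side by side. The class and method names (DescriptorPacking, ToBytesByValue, ToBytesRaw) are made up for illustration, and the clamping to 0..255 is an assumption about how out-of-range or NaN values should be handled. It is a baseline to measure against, not a claim about which variant is fastest.

```csharp
using System;

static class DescriptorPacking
{
    // Value conversion: one byte per float. Values outside 0..255 are clamped,
    // and NaN maps to 0 (an assumption made for this sketch).
    public static byte[] ToBytesByValue(float[] source)
    {
        var result = new byte[source.Length];
        for (int i = 0; i < source.Length; i++)
        {
            float v = source[i];
            if (!(v > 0f)) result[i] = 0;          // negatives, zero and NaN
            else if (v >= 255f) result[i] = 255;   // clamp the upper end
            else result[i] = (byte)v;              // truncates the fraction
        }
        return result;
    }

    // Raw copy: four bytes per float, the bit pattern is preserved unchanged.
    public static byte[] ToBytesRaw(float[] source)
    {
        var result = new byte[source.Length * sizeof(float)];
        Buffer.BlockCopy(source, 0, result, 0, result.Length);
        return result;
    }
}
```

The first method produces Length bytes and loses any fractional part; the second produces Length * 4 bytes and is lossless.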

Related

Read and write more than 8 bit symbols

I am trying to write an encoded file. The file has 9- to 12-bit symbols. I suspect the 9-bit symbols are not written correctly, because I am unable to decode the file. When the file contains only 8-bit symbols, everything works fine. This is how I write the file:
File.AppendAllText(outputFileName, WriteBackContent, ASCIIEncoding.Default);
The same goes for reading with the ReadAllText function call.
What is the way to go here?
I am using ZXing library to encode my file using RS encoder.
ReedSolomonEncoder enc = new ReedSolomonEncoder(GenericGF.AZTEC_DATA_12); // with AZTEC_DATA_8 it works fine because the symbol size is 8 bits
int[] bytesAsInts = Array.ConvertAll(toBytes.ToArray(), c => (int)c);
enc.encode(bytesAsInts, parity);
byte[] bytes = bytesAsInts.Select(x => (byte)x).ToArray();
string contentWithParity = (ASCIIEncoding.Default.GetString(bytes.ToArray()));
WriteBackContent += contentWithParity;
File.AppendAllText(outputFileName, WriteBackContent, ASCIIEncoding.Default);
As the code shows, I initialize my encoder with AZTEC_DATA_12, which means 12-bit symbols. Because the RS encoder requires an int array, I convert the data to an int array and then write it to the file as shown. It works well with AZTEC_DATA_8 because of the 8-bit symbols, but not with AZTEC_DATA_12.
Main problem is here:
byte[] bytes = bytesAsInts.Select(x => (byte)x).ToArray();
You are basically throwing away part of the result when converting the single integers to single bytes.
If you look at the array after the call to encode(), you can see that some of the array elements have a value higher than 255, so they cannot be represented as bytes. However, in your code quoted above, you cast every single element in the integer array to byte, changing the element when it has a value greater than 255.
So to store the result of encode(), you have to convert the integer array to a byte array in a way that the values are not lost or modified.
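A small sketch of what that cast does to a 12-bit symbol. The helper name NarrowingDemo.Narrow is hypothetical and only exists to make the truncation visible:

```csharp
using System;

static class NarrowingDemo
{
    // Mirrors the (byte)x cast from the question: only the low 8 bits survive.
    public static byte Narrow(int symbol)
    {
        return unchecked((byte)symbol);
    }
}
```

For example, Narrow(300) yields 44 (300 is 0x12C, and only 0x2C is kept), so the decoder later sees a different symbol than the encoder produced.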
In order to make this kind of conversion between byte arrays and integer arrays, you can use the function Buffer.BlockCopy(). An example on how to use this function is in this answer.
Use the samples from the answer and the one from the comment to the answer for both conversions: Turning a byte array to an integer array to pass to the encode() function and to turn the integer array returned from the encode() function back into a byte array.
Here are the sample codes from the linked answer:
// Convert integer array to byte array
byte[] result = new byte[intArray.Length * sizeof(int)];
Buffer.BlockCopy(intArray, 0, result, 0, result.Length);
// Convert byte array to integer array (with bugs fixed)
int bytesCount = byteArray.Length;
int intsCount = bytesCount / sizeof(int);
if (bytesCount % sizeof(int) != 0) intsCount++;
int[] result = new int[intsCount];
Buffer.BlockCopy(byteArray, 0, result, 0, byteArray.Length);
Now about storing the data into files: Do not turn the data into a string directly via Encoding.GetString(). Not all bit sequences are valid representations of characters in any given character set. So, converting a random sequence of random bytes into a string will sometimes fail.
Instead, either write and read the byte array directly via File.WriteAllBytes() / File.ReadAllBytes(), or use Convert.ToBase64String() and Convert.FromBase64String() to work with a base64-encoded string representation of the byte array.
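If a text representation is required, the base64 route could look roughly like the sketch below. The class name Base64Storage and its method names are invented for this example; only Buffer.BlockCopy and the two Convert methods are framework calls.

```csharp
using System;

static class Base64Storage
{
    // int[] -> raw bytes -> base64 text that is safe to store in a text file.
    public static string Encode(int[] symbols)
    {
        var bytes = new byte[symbols.Length * sizeof(int)];
        Buffer.BlockCopy(symbols, 0, bytes, 0, bytes.Length);
        return Convert.ToBase64String(bytes);
    }

    // base64 text -> raw bytes -> int[] with the original values.
    public static int[] Decode(string text)
    {
        byte[] bytes = Convert.FromBase64String(text);
        if (bytes.Length % sizeof(int) != 0)
            throw new FormatException("Byte count is not a multiple of 4.");
        var symbols = new int[bytes.Length / sizeof(int)];
        Buffer.BlockCopy(bytes, 0, symbols, 0, bytes.Length);
        return symbols;
    }
}
```

Unlike Encoding.GetString(), base64 maps every possible byte sequence to valid characters, so nothing is altered on the way to disk and back.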
Combined here is some sample code:
ReedSolomonEncoder enc = new ReedSolomonEncoder(GenericGF.AZTEC_DATA_12); // with AZTEC_DATA_8 it works fine because the symbol size is 8 bits
int[] bytesAsInts = Array.ConvertAll(toBytes.ToArray(), c => (int)c);
enc.encode(bytesAsInts, parity);
// Turn int array to byte array without losing values
byte[] bytes = new byte[bytesAsInts.Length * sizeof(int)];
Buffer.BlockCopy(bytesAsInts, 0, bytes, 0, bytes.Length);
// Write to file
File.WriteAllBytes(outputFileName, bytes);
// Read from file
bytes = File.ReadAllBytes(outputFileName);
// Turn byte array to int array
int bytesCount = bytes.Length;
int intsCount = bytesCount / sizeof(int);
if (bytesCount % sizeof(int) != 0) intsCount++;
int[] dataAsInts = new int[intsCount];
Buffer.BlockCopy(bytes, 0, dataAsInts, 0, bytes.Length);
// Decoding
ReedSolomonDecoder dec = new ReedSolomonDecoder(GenericGF.AZTEC_DATA_12);
dec.decode(dataAsInts, parity);
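As a possible refinement, not part of the answer above: 12-bit symbols fit into 16 bits, so each symbol could be stored in two bytes instead of four. The sketch below assumes every symbol lies in 0..65535; SymbolPacking, Pack and Unpack are hypothetical names.

```csharp
using System;

static class SymbolPacking
{
    // Stores each symbol as a 16-bit value (two bytes per symbol).
    public static byte[] Pack(int[] symbols)
    {
        var narrow = new ushort[symbols.Length];
        for (int i = 0; i < symbols.Length; i++)
        {
            if (symbols[i] < 0 || symbols[i] > ushort.MaxValue)
                throw new ArgumentOutOfRangeException(nameof(symbols));
            narrow[i] = (ushort)symbols[i];
        }
        var bytes = new byte[narrow.Length * sizeof(ushort)];
        Buffer.BlockCopy(narrow, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Reverses Pack: two bytes per symbol back into an int[].
    public static int[] Unpack(byte[] bytes)
    {
        if (bytes.Length % sizeof(ushort) != 0)
            throw new ArgumentException("Byte count must be even.", nameof(bytes));
        var narrow = new ushort[bytes.Length / sizeof(ushort)];
        Buffer.BlockCopy(bytes, 0, narrow, 0, bytes.Length);
        return Array.ConvertAll(narrow, s => (int)s);
    }
}
```

The file is then half the size of the int-based version, and the same machine byte order is used for writing and reading.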

Fastest way to convert float to bytes and then save byte array in memory?

I am currently writing code that converts an audio clip into a float array and then want to convert that float array into bytes, and finally convert that byte array to hexadecimal.
Everything works, but we are saving arrays that are hundreds of thousands of elements long. Once this data is converted to bytes and we try to save it as a hexadecimal string, it is too much, or takes too long, for the mobile devices we are testing on.
So my question is: are there any ways to optimize or speed up this process?
Here is my code to convert our float array to bytes:
public byte[] ConvertFloatsToBytes(float[] audioData){
byte[] bytes = new byte[audioData.Length * 4];
//*** This function converts our current float array elements to the same exact place in byte data
Buffer.BlockCopy(audioData,0,bytes,0,bytes.Length);
return bytes;
}
Here we convert that data into a hex string :
public static string ByteArrayToString(byte[] ba)
{
string hex = BitConverter.ToString(ba);
//Debug.Log("ba.length = " + ba.Length.ToString() +"hex string = " + hex);
return hex.Replace("-","");
}
Ultimately, we save the string out and later convert it from the hex string back to a float array.
As I said, the code is slow but it works. I am just trying to find the best ways to optimize this process and improve performance.
Do you know which part is costing you? I strongly suspect that the conversion to a hexadecimal array isn't the bottleneck in your program.
The final part, where you remove the hyphens, ends up copying the string. You can probably do better by writing your own method that duplicates what BitConverter.ToString does, without the hyphens. That is:
const string chars = "0123456789ABCDEF";
public string ByteArrayToString(byte[] ba)
{
var sb = new StringBuilder(ba.Length*2);
for (int i = 0; i < ba.Length; ++i)
{
var b = ba[i];
sb.Append(chars[b >> 4]);
sb.Append(chars[b & 0x0F]);
}
return sb.ToString();
}
That will avoid one string copy.
If you're willing to use unsafe code (I don't know if you can on the devices you're working with), you can speed that up even further by not even copying to the array of bytes. Rather, you fix the array of floats in memory and then address it with a byte pointer. See Unsafe Code and Pointers if you're interested in that.
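Where unsafe code is not an option, a similar effect can be sketched with MemoryMarshal.AsBytes, which exposes the float array as bytes without copying. This swaps the pointer technique for spans and assumes a runtime that ships System.Memory (for example .NET Core 2.1+ or .NET Standard 2.1). FloatHex, ToHex and FromHex are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

static class FloatHex
{
    private const string HexDigits = "0123456789ABCDEF";

    // Reads the floats' bytes in place and writes two hex digits per byte.
    public static string ToHex(float[] samples)
    {
        ReadOnlySpan<byte> raw = MemoryMarshal.AsBytes(samples.AsSpan());
        var chars = new char[raw.Length * 2];
        for (int i = 0; i < raw.Length; i++)
        {
            byte b = raw[i];
            chars[2 * i] = HexDigits[b >> 4];
            chars[2 * i + 1] = HexDigits[b & 0x0F];
        }
        return new string(chars);
    }

    // Parses the hex text straight into the float array's memory.
    public static float[] FromHex(string hex)
    {
        if (hex.Length % 8 != 0)
            throw new FormatException("Expected 8 hex digits per float.");
        var samples = new float[hex.Length / 8];
        Span<byte> raw = MemoryMarshal.AsBytes(samples.AsSpan());
        for (int i = 0; i < raw.Length; i++)
            raw[i] = (byte)((HexValue(hex[2 * i]) << 4) | HexValue(hex[2 * i + 1]));
        return samples;
    }

    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        throw new FormatException("Invalid hex digit: " + c);
    }
}
```

The byte order in the string is the machine's own, so the text is only portable between devices with the same endianness.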
That sounds really convoluted - are audio samples not normally integers?
Anyway, StreamWriter supports writing of single and double natively, so you could use that to build a memory stream that you then convert to hex.

how to manually calculate the memory been used

Is there any way to manually calculate the memory that an array is going to consume?
I am using C# on a 64-bit OS.
Let's say I have the following array:
int number[][]= new int[2][2];
number[0][0]=25;
number[0][1]=60;
....
...
So my first question is: does each element of the array get the same bit allocation? Let's say number[0][0] is allocated 12 bits (I don't know if 12 bits is the right answer); would that make the first row a 24-bit allocation?
How much physical and virtual memory does each dimension take?
If I use int, double or string for the array, is there any difference in the memory used?
Finally, if I use GC.GetTotalMemory, will I get the same result as the total memory used by the array?
You need to use the sizeof operator to get how many bytes are allocated to your type.
int[][] number = new int[2][];
for (int i = 0; i < number.Length; i++)
{
number[i] = new int[2];
}
int size = sizeof(int) * number.Length * number[0].Length;
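The same calculation generalizes to rows of different lengths. The sketch below (ArraySize.PayloadBytes is a made-up name) counts element data only. Each array object also carries a header and a length field, and the outer array stores one reference per row, so a figure from GC.GetTotalMemory would be expected to come out larger.

```csharp
using System;

static class ArraySize
{
    // Sums the bytes occupied by the int elements of every row.
    // Object headers and the outer array's references are not included.
    public static long PayloadBytes(int[][] jagged)
    {
        long total = 0;
        foreach (int[] row in jagged)
        {
            if (row != null)
                total += (long)row.Length * sizeof(int);
        }
        return total;
    }
}
```

For the 2 x 2 array from the question this gives 16 bytes of element data.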

What is the Fastest way to convert byte[] to float[] and vice versa?

Which is the fastest way to convert a byte[] to float[] and vice versa (without a loop of course).
I'm using BlockCopy now, but then I need double the memory. I would like some kind of cast.
I need to do this conversion just to send the data through a socket and reconstruct the array in the other end.
Surely msarchet's proposal makes copies too. If you don't want to copy, you are talking about just changing the way .NET thinks about a memory area.
But, I don't think what you want is possible, as bytes and floats are represented totally different in memory. A byte uses exactly a byte in memory, but a float uses 4 bytes (32 bits).
If you don't have the memory requirements to store your data, just represent the data as the data type you will be using the most in memory, and convert the values you actually use, when you use them.
How do you want to convert a float (which can represent a value between ±1.5 × 10^−45 and ±3.4 × 10^38) into a byte (which can represent a value between 0 and 255) anyway?
See more info here about:
byte: http://msdn.microsoft.com/en-us/library/5bdb6693(v=VS.100).aspx
float: http://msdn.microsoft.com/en-us/library/b1e65aza.aspx
More about floating types in .NET here: http://csharpindepth.com/Articles/General/FloatingPoint.aspx
You can use StructLayout to achieve this (from Stack Overflow question C# unsafe value type array to byte array conversions):
[StructLayout(LayoutKind.Explicit)]
struct UnionArray
{
[FieldOffset(0)]
public Byte[] Bytes;
[FieldOffset(0)]
public float[] Floats;
}
static void Main(string[] args)
{
// From bytes to floats - works
byte[] bytes = { 0, 1, 2, 4, 8, 16, 32, 64 };
UnionArray arry = new UnionArray { Bytes = bytes };
for (int i = 0; i < arry.Bytes.Length / 4; i++)
Console.WriteLine(arry.Floats[i]);
}
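On runtimes with span support, a comparable reinterpretation can be sketched without the explicit-layout trick by using MemoryMarshal.Cast. This is an alternative to the struct above, not the linked answer's method. It assumes System.Memory is available, and FloatView.AsFloats is a hypothetical helper.

```csharp
using System;
using System.Runtime.InteropServices;

static class FloatView
{
    // Returns a view over the same memory: no copy is made, and writes
    // through the span change the underlying byte array.
    public static Span<float> AsFloats(byte[] bytes)
    {
        return MemoryMarshal.Cast<byte, float>(bytes.AsSpan());
    }
}
```

The resulting span has bytes.Length / 4 elements (any trailing bytes are ignored), and it interprets the bytes in the machine's byte order, which matters when the other end of the socket uses a different one.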
IEnumerable<float> ToFloats(byte[] bytes)
{
for(int i = 0; i < bytes.Length; i+=4)
yield return BitConverter.ToSingle(bytes, i);
}
Two ways if you have access to LINQ (note that both convert each byte value to one float; they do not reinterpret groups of four bytes):
var floatarray = ByteArray.Select(b => (float)b).ToArray();
or just using Array functions
var floatarray = Array.ConvertAll(ByteArray, item => (float)item);

Fast casting in C# using BitConverter, can it be any faster?

In our application, we have a very large byte-array and we have to convert these bytes into different types. Currently, we use BitConverter.ToXXXX() for this purpose. Our heavy hitters are, ToInt16 and ToUInt64.
For UInt64, our problem is that the data stream has actually 6-bytes of data to represent a large integer. Since there is no native function to convert 6-bytes of data to UInt64, we do:
UInt64 value = BitConverter.ToUInt64() & 0x0000ffffffffffff;
Our use of ToInt16 is simpler, so we don't have to do any bit manipulation.
We do so many of these 2 operations that I wanted to ask the SO community whether there's a faster way to do these conversions. Right now, approximately 20% of our entire CPU cycles is consumed by these two functions.
Have you thought about using memory pointers directly? I can't vouch for its performance, but it is a common trick in C++\C...
byte[] arr = { 1, 2, 3, 4, 5, 6, 7, 8 ,9,10,11,12,13,14,15,16};
fixed (byte* a2rr = &arr[0])
{
UInt64* uint64ptr = (UInt64*) a2rr;
Console.WriteLine("The value is {0:X2}", (*uint64ptr & 0x0000FFFFFFFFFFFF));
uint64ptr = (UInt64*) ((byte*) uint64ptr+6);
Console.WriteLine("The value is {0:X2}", (*uint64ptr & 0x0000FFFFFFFFFFFF));
}
You'll need to make your assembly "unsafe" in the build settings, as well as mark the method in which you'd be doing this as unsafe. You are also tied to little endian with this approach.
You can use the System.Buffer class to copy a whole array over to another array of a different type as a fast, 'block copy' operation:
The BlockCopy method accesses the bytes in the src parameter array using offsets into memory, not programming constructs such as indexes or upper and lower array bounds.
The array types must be of 'primitive' types, they must align, and the copy operation is endian-sensitive. In your case of 6-bytes integers, it can't align with any of .NET's 'primitive' types, unless you can obtain the source array with two bytes of padding for each six, which will then align to Int64. But this method will work for arrays of Int16, which may speed up some of your operations.
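For the Int16 case, the block copy might look like this sketch. Int16Block.ToInt16Array is a made-up name, and the values come out in the machine's byte order, as noted above.

```csharp
using System;

static class Int16Block
{
    // Copies the raw bytes into a short[] in one operation; every pair of
    // source bytes becomes one Int16 in the machine's native byte order.
    public static short[] ToInt16Array(byte[] source)
    {
        if (source.Length % sizeof(short) != 0)
            throw new ArgumentException("Byte count must be even.", nameof(source));
        var result = new short[source.Length / sizeof(short)];
        Buffer.BlockCopy(source, 0, result, 0, source.Length);
        return result;
    }
}
```

This trades one allocation and a bulk copy for the per-element BitConverter.ToInt16 calls.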
Why not:
UInt16 valLow = BitConverter.ToUInt16();
UInt64 valHigh = (UInt64)BitConverter.ToUInt32();
UInt64 Value = (valHigh << 16) | valLow;
You can make that a single statement, although the JIT compiler will probably do that for you automatically.
That will prevent you from reading those extra two bytes that you end up throwing away.
If that doesn't reduce CPU, then you'll probably want to write your own converter that reads the bytes directly from the buffer. You can either use array indexing or, if you think it's necessary, unsafe code with pointers.
Note that, as a commenter pointed out, if you use any of these suggestions, then either you're limited to a particular "endian-ness", or you'll have to write your code to detect little/big endian and react accordingly. The code sample I showed above works for little endian (x86).
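A converter that reads the bytes directly with array indexing could be sketched as follows. The name UInt48Reader.ReadLittleEndian is hypothetical, and the layout is assumed to be little endian (least significant byte first). Because the byte order is fixed by the shifts, the result does not depend on the machine's endianness.

```csharp
using System;

static class UInt48Reader
{
    // Assembles a 6-byte little-endian integer starting at offset.
    // Only the six bytes that are needed are read.
    public static ulong ReadLittleEndian(byte[] buffer, int offset)
    {
        return (ulong)buffer[offset]
             | ((ulong)buffer[offset + 1] << 8)
             | ((ulong)buffer[offset + 2] << 16)
             | ((ulong)buffer[offset + 3] << 24)
             | ((ulong)buffer[offset + 4] << 32)
             | ((ulong)buffer[offset + 5] << 40);
    }
}
```

Unlike masking the result of ToUInt64, this also works when the value sits in the last six bytes of the buffer. Whether it is actually faster would need to be measured.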
See my answer for a similar question here.
It's the same unsafe memory manipulation as in Jimmy's answer, but in a more "friendly" way for consumers. It'll allow you to view your byte array as UInt64 array.
For anyone else who stumbles across this: if you only need little endian and do not need to detect big endian and convert from it, I've written an extended version of BitConverter with a number of additions to handle Span, as well as converting arrays of type T, for example int[] or TimeSpan[].
I also extended the supported types to include TimeSpan, decimal and DateTime.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Serialization/BitConverterExtended.cs
Example usage:
Random rnd = new Random();
RentedBuffer<byte> buffer = RentedBuffer<byte>.Shared.Rent(BitConverterExtended.SizeOfUInt64
+ (20 * BitConverterExtended.SizeOfUInt16)
+ (20 * BitConverterExtended.SizeOfTimeSpan)
+ (10 * BitConverterExtended.SizeOfSingle));
UInt64 exampleLong = long.MaxValue;
int startIndex = 0;
startIndex += BitConverterExtended.GetBytes(exampleLong, buffer.BufferSpan, startIndex);
UInt16[] shortArray = new UInt16[20];
for (int I = 0; I < shortArray.Length; I++) { shortArray[I] = (ushort)rnd.Next(0, UInt16.MaxValue); }
//When using reflection / expression trees CLR cannot distinguish between UInt16 and Int16 or Uint64 and Int64 etc...
//Therefore Uint methods are renamed.
startIndex += BitConverterExtended.GetBytesUShortArray(shortArray, buffer.BufferSpan, startIndex);
TimeSpan[] timespanArray = new TimeSpan[20];
for (int I = 0; I < timespanArray.Length; I++) { timespanArray[I] = TimeSpan.FromSeconds(rnd.Next(0, int.MaxValue)); }
startIndex += BitConverterExtended.GetBytes(timespanArray, buffer.BufferSpan, startIndex);
float[] floatArray = new float[10];
for (int I = 0; I < floatArray.Length; I++) { floatArray[I] = MathF.PI * rnd.Next(short.MinValue, short.MaxValue); }
startIndex += BitConverterExtended.GetBytes(floatArray, buffer.BufferSpan, startIndex);
//Do stuff with buffer and then
buffer.Return(); //always better to return it as soon as possible
//Or in case you forget
buffer = null;
//and let RentedBufferContract do this automatically
It supports reading from and writing to both byte[] and RentedBuffer; however, using the RentedBuffer class greatly reduces GC collection overheads.
RentedBufferContract class internally handles returning buffers to the pool to prevent memory leaks.
Also includes a serializer which is similar to messagepack.
Note: MessagePack is a faster serializer with more features however this serializer reduces GC collection overheads by reading from and writing to rented byte buffers.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Serialization/ChillXSerializer.cs
