IEEE754 to floating point C#

The code below simply converts a 32-bit integer, taken from the object passed to the function, into the floating-point number it represents. I have checked with an online calculator that I am extracting the sign, exponent and mantissa correctly, but strangely I am getting the wrong answer.
Can anyone please check whether I am doing it wrong mathematically (or maybe programmatically)?
Regards
public double FromFloatSafe(object f)
{
    uint fb = Convert.ToUInt32(f);
    uint sign, exponent = 0, mantessa = 0;
    uint bias = 127;
    sign = (fb >> 31) & 1;
    exponent = (fb >> 23) & 0xFF;
    mantessa = (fb & 0x7FFFFF);

    double fSign = Math.Pow((-1), sign);
    double fMantessa = 1 + (1 / mantessa);
    double fExponent = Math.Pow(2, (exponent - bias));

    double ret = fSign * fMantessa * fExponent;
    return ret;
}

Something like this:
uint fb = Convert.ToUInt32(f);
return BitConverter.ToSingle(BitConverter.GetBytes((int) fb), 0);
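On newer runtimes the detour through a byte array can arguably be skipped; a minimal sketch, assuming .NET Core 2.0 or later where BitConverter.Int32BitsToSingle is available:
uint fb = Convert.ToUInt32(f);
// reinterpret the same 32 bits as a float, without allocating a byte array
return BitConverter.Int32BitsToSingle(unchecked((int)fb));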

This handles even denormal numbers:
public static float FromFloatSafe(object f)
{
    uint fb = Convert.ToUInt32(f);
    int sign = (int)((fb >> 31) & 1);
    int exponent = (int)((fb >> 23) & 0xFF);
    int mantissa = (int)(fb & 0x7FFFFF);
    float fMantissa;
    float fSign = sign == 0 ? 1.0f : -1.0f;

    if (exponent != 0)
    {
        exponent -= 127;
        fMantissa = 1.0f + (mantissa / (float)0x800000);
    }
    else
    {
        if (mantissa != 0)
        {
            // denormal: no implicit leading 1, effective exponent is -126
            exponent -= 126;
            fMantissa = mantissa / (float)0x800000;
        }
        else
        {
            // +0 and -0 cases
            fMantissa = 0;
        }
    }

    float fExponent = (float)Math.Pow(2.0, exponent);
    float ret = fSign * fMantissa * fExponent;
    return ret;
}
Note that I do think there is something fishy here; you asked for it, so I wrote it, but this feels like an XY problem.
Ah... and note that while what I wrote above is academically interesting, I normally do it this way:
[StructLayout(LayoutKind.Explicit)]
public struct UInt32ToFloat
{
    [FieldOffset(0)]
    public uint UInt32;

    [FieldOffset(0)]
    public float Single;
}
then
float value = new UInt32ToFloat { UInt32 = Convert.ToUInt32(f) }.Single; // f is the object passed in
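As a quick sanity check, here is a hedged sketch (reusing the UInt32ToFloat struct above; the pi bit pattern is only an illustrative constant) showing that the overlay agrees with the BitConverter route:
uint bits = 0x40490FDBu; // IEEE 754 single-precision bit pattern of pi
float viaUnion = new UInt32ToFloat { UInt32 = bits }.Single;
float viaBytes = BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
Console.WriteLine(viaUnion);             // ~3.1415927
Console.WriteLine(viaUnion == viaBytes); // True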

Related

What is the most efficient way to get denominator and numerator from decimal/fractional number

I want to get the numerator and denominator from a decimal and then simplify both. For example, if the decimal is 0.05, the numerator and denominator should be 1/20. Preferably without loops/iterations (or with as few as possible) and without heap allocations.
I'm currently using this code, but it results in 0/1 rather than the expected 1/20.
public Fraction(double chance) {
long num = 0;
long den = 1;
unchecked {
ulong bitwiseRepr = BitConverter.DoubleToUInt64Bits(chance);
ulong signBit = bitwiseRepr >> 63;
ulong expBits = bitwiseRepr >> 52;
ulong mntsBits = bitwiseRepr & 0x7FFFFFFFFF;
long signFactor = signBit == 1 ? -1 : 1;
if (expBits == 0x0000 || expBits == 0x0800) { // +0, -0
goto NormalWay;
}
else if (expBits == 0x07FF || expBits == 0x0FFF) { // +Inf, -Inf
num = 0;
den = signFactor;
goto NormalWay;
}
else if (expBits == 0x07FFF) { // NaN
num = long.MaxValue;
den = 0;
goto NormalWay;
}
long significand = (long)((1u << 52) | mntsBits);
int nTrailizingZeros = CountTrailingZeroes(significand);
significand >>= nTrailizingZeros;
int exp = (int)expBits - 127 - 52 + nTrailizingZeros;
if (exp < 0) {
num = significand;
den = 1 << -exp;
}
else {
num = signFactor * significand * (1 << exp);
den = 1;
}
goto NormalWay;
}
NormalWay:
numerator = num;
denominator = den;
Normalize();
}
private static int CountTrailingZeroes(long v) {
int counter = 0;
while (counter < 64 && (v & 1u) == 0) {
v >>= 1;
counter++;
}
return counter;
}
[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.AggressiveInlining)]
public void Normalize() {
bool numeratorIsNegative = numerator < 0;
bool denominatorIsNegative = denominator < 0;
if (numeratorIsNegative) {
numerator *= -1;
}
if (denominatorIsNegative) {
denominator *= -1;
}
if (numerator > long.MaxValue / 2 && denominator > long.MaxValue / 2) {
throw new ArithmeticException($"Numerator or denominator are greater than {long.MaxValue / 2} or lesser than {-long.MaxValue / 2}.");
}
numerator += denominator;
Reduce(GCD(numerator, denominator));
Reduce(Math.Sign(denominator));
numerator %= denominator;
if (numeratorIsNegative) {
numerator *= -1;
}
if (denominatorIsNegative) {
denominator *= -1;
}
}
[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.AggressiveInlining)]
private void Reduce(long x) {
numerator /= x;
denominator /= x;
}
[MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.AggressiveInlining)]
private static long GCD(long a, long b) {
while (b != 0) {
long t = b;
b = a % b;
a = t;
}
return a;
}
The current code is also going to be less accurate because it only works for doubles (so far).
As simple as that:
internal static (BigInteger numerator, BigInteger denominator) Fraction(decimal d)
{
int[] bits = decimal.GetBits(d);
BigInteger numerator = (1 - ((bits[3] >> 30) & 2)) *
unchecked(((BigInteger)(uint)bits[2] << 64) |
((BigInteger)(uint)bits[1] << 32) |
(BigInteger)(uint)bits[0]);
BigInteger denominator = BigInteger.Pow(10, (bits[3] >> 16) & 0xff);
return (numerator, denominator);
}
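For the 0.05 example this returns 5/100; a hedged usage sketch of reducing that result, assuming System.Numerics is referenced:
var (num, den) = Fraction(0.05m);                      // 5 / 100
var gcd = BigInteger.GreatestCommonDivisor(num, den);  // 5
Console.WriteLine($"{num / gcd}/{den / gcd}");         // prints "1/20"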

C# signed fixed point to floating point conversion

I have a temperature sensor returning 2 bytes.
The temperature is defined as follows:
What is the best way in C# to convert these 2 bytes to a float?
My solution is the following, but I don't like the powers of 2 and the for loop:
static void Main(string[] args)
{
    byte[] sensorData = new byte[] { 0b11000010, 0b10000001 }; // (-1) * (2^6 + 2^1 + 2^-1 + 2^-8) = -66.50390625
    Console.WriteLine(ByteArrayToTemp(sensorData));
}

static double ByteArrayToTemp(byte[] data)
{
    // Convert byte array to short to be able to shift it
    if (BitConverter.IsLittleEndian)
        Array.Reverse(data);
    Int16 dataInt16 = BitConverter.ToInt16(data, 0);
    double temp = 0;
    for (int i = 0; i < 15; i++)
    {
        // We take the LSB of the data and multiply it by the corresponding power of two (from -8 to 6),
        // then we shift the data for the next loop iteration
        temp += (dataInt16 & 0x01) * Math.Pow(2, -8 + i);
        dataInt16 >>= 1;
    }
    if ((dataInt16 & 0x01) == 1) temp *= -1; // Sign bit
    return temp;
}
This might be slightly more efficient, but I can't see it making much difference:
static double ByteArrayToTemp(byte[] data)
{
    if (BitConverter.IsLittleEndian)
        Array.Reverse(data);
    ushort bits = BitConverter.ToUInt16(data, 0);
    double scale = 1 << 6;
    double result = 0;
    for (int i = 0, bit = 1 << 14; i < 15; ++i, bit >>= 1, scale /= 2)
    {
        if ((bits & bit) != 0)
            result += scale;
    }
    if ((bits & 0x8000) != 0)
        result = -result;
    return result;
}
A loop is hard to avoid if the bit weights were arbitrary; for this particular fixed-point layout, though, a direct scale also works (see the sketch below).
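Assuming the format really is plain sign-magnitude fixed point with 8 fractional bits, as the example comment suggests, the whole conversion arguably reduces to a single scaling operation; a minimal sketch under that assumption:
static double ByteArrayToTemp(byte[] data)
{
    // data[0] is the high byte (big-endian sensor output)
    ushort bits = (ushort)((data[0] << 8) | data[1]);
    double magnitude = (bits & 0x7FFF) / 256.0; // 15 magnitude bits, scaled by 2^-8
    return (bits & 0x8000) != 0 ? -magnitude : magnitude;
}
For the example bytes { 0b11000010, 0b10000001 } this gives -66.50390625, matching the loop version.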

Same structs have different HashCode

I have this code:
public class Point
{
public int x;
public int y;
public Point() { x = 0; y = 0; }
public Point(int a, int b) { x = a; y = b; }
}
public struct Coefficients{
public double a;
public double b;
public double c;
public Coefficients(double a, double b, double c)
{
this.a = a;
this.b = b;
this.c = c;
}
public static Coefficients GetFromPoints(Point point1, Point point2)
{
int x1 = point1.x;
int x2 = point2.x;
int y1 = point1.y;
int y2 = point2.y;
double a = y1- y2;
double b = x2 - x1;
double c = x1 * y2 - y1 * x2 ;
double max = Math.Max(Math.Max(a, b), c);
double min= Math.Min(Math.Min(a, b), c);
double divider = Math.Abs(max)> Math.Abs(min)?max:min;
divider = Math.Abs(divider) > 1? divider : 1;
return new Coefficients(a/divider, b/divider, c/divider);
}
}
public class Solution
{
public int MaxPoints(Point[] points)
{
var coef_list = new List<Coefficients>();
for (var x = 0; x < points.Length - 1; x++)
{
for (var y = x + 1; y < points.Length; y++)
{
var coef = Coefficients.GetFromPoints(points[x], points[y]);
coef_list.Add(coef);
}
}
foreach (var item in coef_list) {
Debug.WriteLine(item.a);
Debug.WriteLine(item.b);
Debug.WriteLine(item.c);
Debug.WriteLine(item.GetHashCode());
Debug.WriteLine("---------------");
}
return 0;
}
}
As you can see I used a struct, and I noticed some weird behavior.
If I have input data like this:
prg.MaxPoints(new Point[] { new Point(4, -1), new Point(4, 0), new Point(4, 5) });
Debug output is:
-0,25
0
1
-450335288
---------------
-0,25
0
1
-450335288
---------------
-0,25
0
1
-450335288
---------------
But if I change the argument order to:
prg.MaxPoints(new Point[] { new Point(4, 0),new Point(4, -1) , new Point(4, 5) });
Debug output is:
-0,25
0
1
1697148360
---------------
-0,25
0
1
-450335288
---------------
-0,25
0
1
-450335288
---------------
One thing that may be important: in the first case all the "dividers" (in the GetFromPoints method) are positive (4, 24, 20), while in the second case one of them is negative and the other two are positive (-4, 20, 24).
Can anybody explain this?
UPD.
When I changed
return new Coefficients(a/divider, b/divider, c/divider);
to
return new Coefficients(a/divider, 0, c/divider); // in all of these cases the 2nd argument is 0 anyway
the hash codes matched. Does that mean that 0 divided by a negative number isn't 0?
Basically you are getting a negative-zero double value. However, the runtime's default GetHashCode for structs appears to blindly combine the underlying bytes rather than call each field's GetHashCode. Here is a simplified version of what you are seeing:
public struct S
{
public double value;
public S(double d)
{
value = d;
}
}
public static void Main(string[] args)
{
double d1 = 0;
double d2 = d1 / -1;
Console.WriteLine("using double");
Console.WriteLine("{0} {1}", d1, d1.GetHashCode());
Console.WriteLine(GetComponentParts(d1));
Console.WriteLine("{0} {1}", d2, d2.GetHashCode());
Console.WriteLine(GetComponentParts(d2));
Console.WriteLine("Equals: {0}, Hashcode:{1}, {2}", d1.Equals(d2), d1.GetHashCode(), d2.GetHashCode());
Console.WriteLine();
Console.WriteLine("using a custom struct");
var s1 = new S(d1);
var s2 = new S(d2);
Console.WriteLine(s1.Equals(s2));
Console.WriteLine(new S(d1).GetHashCode());
Console.WriteLine(new S(d2).GetHashCode());
}
// from: https://msdn.microsoft.com/en-us/library/system.double.epsilon(v=vs.110).aspx
private static string GetComponentParts(double value)
{
string result = String.Format("{0:R}: ", value);
int indent = result.Length;
// Convert the double to an 8-byte array.
byte[] bytes = BitConverter.GetBytes(value);
// Get the sign bit (byte 7, bit 7).
result += String.Format("Sign: {0}\n",
(bytes[7] & 0x80) == 0x80 ? "1 (-)" : "0 (+)");
// Get the exponent (byte 6 bits 4-7 to byte 7, bits 0-6)
int exponent = (bytes[7] & 0x07F) << 4;
exponent = exponent | ((bytes[6] & 0xF0) >> 4);
int adjustment = exponent != 0 ? 1023 : 1022;
result += String.Format("{0}Exponent: 0x{1:X4} ({1})\n", new String(' ', indent), exponent - adjustment);
// Get the significand (bits 0-51)
long significand = (long)(bytes[6] & 0x0F) << 48;
significand = significand | ((long) bytes[5] << 40);
significand = significand | ((long) bytes[4] << 32);
significand = significand | ((long) bytes[3] << 24);
significand = significand | ((long) bytes[2] << 16);
significand = significand | ((long) bytes[1] << 8);
significand = significand | bytes[0];
result += String.Format("{0}Mantissa: 0x{1:X13}\n", new String(' ', indent), significand);
return result;
}
The output:
using double
0 0
0: Sign: 0 (+)
Exponent: 0xFFFFFC02 (-1022)
Mantissa: 0x0000000000000
0 0
0: Sign: 1 (-)
Exponent: 0xFFFFFC02 (-1022)
Mantissa: 0x0000000000000
Equals: True, Hashcode:0, 0
using a custom struct
False
346948956
-1800534692
I've defined two doubles, one of which is the "normal" zero and the other "negative" zero. The difference between the two is in the double's sign bit. The two values are equal in every apparent way (Equals comparison, GetHashCode, ToString representation) except at the byte level. However, when they are put into a custom struct, the runtime's GetHashCode just combines the raw bits, which gives a different hash code for each struct even though they contain equal values. Equals does the same thing and returns False.
I admit this is a bit of a gotcha. The solution is to override Equals and GetHashCode yourself so you get the equality semantics you want.
A similar issue has been mentioned before; apparently the runtime only takes this fast path when the struct's fields are all 8 bytes wide.
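A minimal sketch of such an override for the Coefficients struct from the question (hand-written here, not from the original post), relying on the fact that double.Equals and double.GetHashCode already treat +0.0 and -0.0 as equal:
public struct Coefficients : IEquatable<Coefficients>
{
    public double a;
    public double b;
    public double c;

    public Coefficients(double a, double b, double c) { this.a = a; this.b = b; this.c = c; }

    // field-by-field comparison avoids the raw-bits fast path of the runtime
    public bool Equals(Coefficients other) =>
        a.Equals(other.a) && b.Equals(other.b) && c.Equals(other.c);

    public override bool Equals(object obj) => obj is Coefficients other && Equals(other);

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = a.GetHashCode();         // 0 for both +0.0 and -0.0
            hash = hash * 31 + b.GetHashCode();
            hash = hash * 31 + c.GetHashCode();
            return hash;
        }
    }
}
With this in place the three coefficient sets from the question hash identically regardless of the sign of the divider.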

Comparing bits efficiently ( overlap set of x )

I want to compare a stream of bits of arbitrary length to a mask in c# and return a ratio of how many bits were the same.
The mask to check against is anywhere from 2 bits to 8k bits long (with 90% of the masks being 5 bits long); the input can be anywhere from 2 bits up to ~500k bits, with an average input of 12k bits (but yeah, most of the time it will be comparing 5 bits with the first 5 bits of that 12k).
Now my naive implementation would be something like this:
bool[] mask = new[] { true, true, false, true };
float dendrite(bool[] input) {
    int correct = 0;
    for (int i = 0; i < mask.Length; i++) {
        if (input[i] == mask[i])
            correct++;
    }
    return (float)correct / (float)mask.Length;
}
but I expect this is better handled (more efficient) with some kind of binary operator magic?
Anyone got any pointers?
EDIT: the datatype is not fixed at this point in my design, so if ints or bytearrays work better, I'd also be a happy camper, trying to optimize for efficiency here, the faster the computation, the better.
eg if you can make it work like this:
int[] mask = new[] { 1, 1, 0, 1 };
float dendrite(int[] input) {
    int correct = 0;
    for (int i = 0; i < mask.Length; i++) {
        if (input[i] == mask[i])
            correct++;
    }
    return (float)correct / (float)mask.Length;
}
or this:
int mask = 13; // 1101
float dendrite(int input) {
    return // your magic here;
}
// would return 0.75 for an input of 101 (1100101 in binary),
// which matches 3 bits of the 4-bit mask == .75
ANSWER:
I ran each proposed answer against the others; Fredou's and Marten's solutions ran neck and neck, but Fredou submitted the fastest, leanest implementation in the end. Of course, since the average result varies quite wildly between implementations, I might have to revisit this post later on. :) But that's probably just me messing up in my test script. (I hope; too late now, going to bed =)
sparse1.Cyclone
1317ms 3467107ticks 10000iterations
result: 0,7851563
sparse1.Marten
288ms 759362ticks 10000iterations
result: 0,05066964
sparse1.Fredou
216ms 568747ticks 10000iterations
result: 0,8925781
sparse1.Marten
296ms 778862ticks 10000iterations
result: 0,05066964
sparse1.Fredou
216ms 568601ticks 10000iterations
result: 0,8925781
sparse1.Marten
300ms 789901ticks 10000iterations
result: 0,05066964
sparse1.Cyclone
1314ms 3457988ticks 10000iterations
result: 0,7851563
sparse1.Fredou
207ms 546606ticks 10000iterations
result: 0,8925781
sparse1.Marten
298ms 786352ticks 10000iterations
result: 0,05066964
sparse1.Cyclone
1301ms 3422611ticks 10000iterations
result: 0,7851563
sparse1.Marten
292ms 769850ticks 10000iterations
result: 0,05066964
sparse1.Cyclone
1305ms 3433320ticks 10000iterations
result: 0,7851563
sparse1.Fredou
209ms 551178ticks 10000iterations
result: 0,8925781
(Test script copied here; if I broke yours while modifying it, let me know: https://dotnetfiddle.net/h9nFSa )
how about this one - dotnetfiddle example
using System;

namespace ConsoleApplication1
{
    public class Program
    {
        public static void Main(string[] args)
        {
            int a = Convert.ToInt32("0001101", 2);
            int b = Convert.ToInt32("1100101", 2);
            Console.WriteLine(dendrite(a, 4, b));
        }

        private static float dendrite(int mask, int len, int input)
        {
            return 1 - getBitCount(mask ^ (input & (int.MaxValue >> 32 - len))) / (float)len;
        }

        private static int getBitCount(int bits)
        {
            bits = bits - ((bits >> 1) & 0x55555555);
            bits = (bits & 0x33333333) + ((bits >> 2) & 0x33333333);
            return ((bits + (bits >> 4) & 0xf0f0f0f) * 0x1010101) >> 24;
        }
    }
}
A 64-bit one here - dotnetfiddle
using System;

namespace ConsoleApplication1
{
    public class Program
    {
        public static void Main(string[] args)
        {
            // 1
            ulong a = Convert.ToUInt64("0000000000000000000000000000000000000000000000000000000000001101", 2);
            ulong b = Convert.ToUInt64("1110010101100101011001010110110101100101011001010110010101100101", 2);
            Console.WriteLine(dendrite(a, 4, b));
        }

        private static float dendrite(ulong mask, int len, ulong input)
        {
            return 1 - getBitCount(mask ^ (input & (ulong.MaxValue >> (64 - len)))) / (float)len;
        }

        private static ulong getBitCount(ulong bits)
        {
            bits = bits - ((bits >> 1) & 0x5555555555555555UL);
            bits = (bits & 0x3333333333333333UL) + ((bits >> 2) & 0x3333333333333333UL);
            return unchecked(((bits + (bits >> 4)) & 0xF0F0F0F0F0F0F0FUL) * 0x101010101010101UL) >> 56;
        }
    }
}
I came up with this code:
static float dendrite(ulong input, ulong mask)
{
    // get bits that are the same (0 or 1) in input and mask
    ulong samebits = mask & ~(input ^ mask);
    // count number of same bits
    int correct = cardinality(samebits);
    // count number of bits in mask
    int inmask = cardinality(mask);
    // compute fraction (0.0 to 1.0)
    return inmask == 0 ? 0f : correct / (float)inmask;
}

// this is a little hack to count the number of bits set to one in a 64-bit word
static int cardinality(ulong word)
{
    const ulong mult = 0x0101010101010101;
    const ulong mask1h = (~0UL) / 3 << 1;
    const ulong mask2l = (~0UL) / 5;
    const ulong mask4l = (~0UL) / 17;
    word -= (mask1h & word) >> 1;
    word = (word & mask2l) + ((word >> 2) & mask2l);
    word += word >> 4;
    word &= mask4l;
    return (int)((word * mult) >> 56);
}
This will check 64-bits at a time. If you need more than that you can just split the input data into 64-bit words and compare them one by one and compute the average result.
Here's a .NET fiddle with the code and a working test case:
https://dotnetfiddle.net/5hYFtE
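As a side note, on .NET Core 3.0 and later the hand-rolled popcount can arguably be replaced with the built-in System.Numerics.BitOperations.PopCount, which uses the hardware POPCNT instruction when available; a minimal sketch under that assumption, as a drop-in for the cardinality helper above:
using System.Numerics;

static int cardinality(ulong word)
{
    // counts the bits set to one in a 64-bit word
    return BitOperations.PopCount(word);
}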
I would change the code to something along these lines:
// hardcoded bitmask
byte mask = 255;

float dendrite(byte input) {
    int correct = 0;
    // store the xor:ed result (cast back to byte since ^ yields an int)
    byte xored = (byte)(input ^ mask);
    // loop through each bit
    for (int i = 0; i < 8; i++) {
        // if the bit is 0 then it was correct
        if ((xored & (1 << i)) == 0)
            correct++;
    }
    return (float)correct / 8f;
}
The above uses a mask and input of 8 bits, but of course you could modify this to use a 4 byte integer and so on.
Not sure if this will work as expected, but it might give you some clues on how to proceed.
For example, if you only want to check the first 4 bits, you could change the code to something like:
float dendrite(byte input) {
    // hardcoded bitmask, i.e. 1101
    byte mask = 13;
    // number of bits to check
    byte bits = 4;
    int correct = 0;
    // store the xor:ed result (cast back to byte since ^ yields an int)
    byte xored = (byte)(input ^ mask);
    // loop through each bit; notice that we only check the first 4 bits
    for (int i = 0; i < bits; i++) {
        // if the bit is 0 then it was correct
        if ((xored & (1 << i)) == 0)
            correct++;
    }
    return (float)correct / (float)bits;
}
Of course it might be faster to actually use an int instead of a byte.
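A quick hypothetical usage check of the 4-bit version above, using the low bits of the question's example input:
float r = dendrite(0b0101); // 0101 vs mask 1101: bits 0, 1 and 2 match, bit 3 differs
Console.WriteLine(r);       // 0.75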

Converting two bytes to an IEEE-11073 16-bit SFLOAT in C#

I need to convert a two byte array to SFloat format according to IEEE-11073.
How can I do that?
I answered my own question here.
public float ToSFloat(byte[] value)
{
    if (value.Length != 2)
        throw new ArgumentException();
    byte b0 = value[0];
    byte b1 = value[1];

    var mantissa = unsignedToSigned(ToInt(b0) + ((ToInt(b1) & 0x0F) << 8), 12);
    var exponent = unsignedToSigned(ToInt(b1) >> 4, 4);

    return (float)(mantissa * Math.Pow(10, exponent));
}

public int ToInt(byte value)
{
    return value & 0xFF;
}

private int unsignedToSigned(int unsigned, int size)
{
    if ((unsigned & (1 << size - 1)) != 0)
    {
        unsigned = -1 * ((1 << size - 1) - (unsigned & ((1 << size - 1) - 1)));
    }
    return unsigned;
}
Loosely based on the C implementation by Signove on GitHub I have created this function in C#:
Dictionary<Int32, Single> reservedValues = new Dictionary<Int32, Single> {
    { 0x07FE, Single.PositiveInfinity },
    { 0x07FF, Single.NaN },
    { 0x0800, Single.NaN },
    { 0x0801, Single.NaN },
    { 0x0802, Single.NegativeInfinity }
};

Single Ieee11073ToSingle(Byte[] bytes) {
    var ieee11073 = (UInt16)(bytes[0] + 0x100*bytes[1]);
    var mantissa = ieee11073 & 0x0FFF;
    if (reservedValues.ContainsKey(mantissa))
        return reservedValues[mantissa];
    if (mantissa >= 0x0800)
        mantissa = -(0x1000 - mantissa);
    var exponent = ieee11073 >> 12;
    if (exponent >= 0x08)
        exponent = -(0x10 - exponent);
    var magnitude = Math.Pow(10d, exponent);
    return (Single)(mantissa*magnitude);
}
This function assumes that the bytes are in little-endian format. If not, you will have to swap bytes[0] and bytes[1] in the first line of the function. Or, perhaps even better, remove the first line from the function, change the argument to accept a UInt16 (the IEEE 11073 value), and let the caller decide how to extract that value from the input.
I highly advise you to test this code, because I do not have any test values to verify the correctness of the conversion.
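As a hand-derived sanity check (not an official test vector, so treat it as an assumption about the encoding): the SFLOAT value 0xF148 has mantissa 0x148 = 328 and exponent 0xF = -1, i.e. 32.8.
byte[] raw = { 0x48, 0xF1 };               // little-endian SFLOAT 0xF148
Console.WriteLine(Ieee11073ToSingle(raw)); // expected: 32.8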
