Get next smallest Double number - c#

As part of a unit test, I need to test some boundary conditions. One method accepts a System.Double argument.
Is there a way to get the next-smallest double value? (i.e. decrement the mantissa by 1 unit-value)?
I considered using Double.Epsilon but this is unreliable as it's only the smallest delta from zero, and so doesn't work for larger values (i.e. 9999999999 - Double.Epsilon == 9999999999).
So what is the algorithm or code needed such that:
NextSmallest(Double d) < d
...is always true.

If your numbers are finite, you can use a couple of convenient methods in the BitConverter class:
long bits = BitConverter.DoubleToInt64Bits(value);
if (value > 0)
return BitConverter.Int64BitsToDouble(bits - 1);
else if (value < 0)
return BitConverter.Int64BitsToDouble(bits + 1);
else
return -double.Epsilon;
IEEE-754 formats were designed so that the bits that make up the exponent and mantissa together form an integer that has the same ordering as the floating-point numbers. So, to get the largest smaller number, you can subtract one from this number if the value is positive, and you can add one if the value is negative.
The key reason why this works is that the leading bit of the mantissa is not stored. If your mantissa is all zeros, then your number is a power of two. If you subtract 1 from the exponent/mantissa combination, you get all ones and you'll have to borrow from the exponent bits. In other words: you have to decrement the exponent, which is exactly what we want.
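To illustrate the ordering property described above, here is a minimal sketch in C rather than C#. The name prev_double is hypothetical, memcpy stands in for BitConverter.DoubleToInt64Bits and Int64BitsToDouble, and the sketch assumes the platform's double is the IEEE-754 64-bit format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Largest double strictly less than d (hypothetical helper, finite inputs only). */
double prev_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);            /* reinterpret the double as a 64-bit integer */
    assert(((bits >> 52) & 0x7FF) != 0x7FF);   /* exponent all ones means NaN or infinity */

    if (d > 0.0)
        bits -= 1;                             /* positive: smaller pattern, smaller value */
    else if (d < 0.0)
        bits += 1;                             /* negative: larger magnitude, smaller value */
    else
        bits = UINT64_C(0x8000000000000001);   /* +0.0 or -0.0: smallest negative subnormal */

    memcpy(&d, &bits, sizeof d);               /* reinterpret the integer as a double again */
    return d;
}
```

With this sketch, prev_double(2.0) borrows from the exponent field exactly as described: the stored mantissa goes from all zeros to all ones and the exponent drops by one.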

The Wikipedia page on double-precision floating point is here: http://en.wikipedia.org/wiki/Double_precision_floating-point_format
For fun I wrote some code that breaks out the binary representation of the double format, decrements the mantissa, and recomposes the resulting double. Because of the implicit bit in the mantissa we have to check for it and adjust the exponent accordingly, and it might fail near the limits.
Here's the code:
public static double PrevDouble(double src)
{
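// check for special values:
if (double.IsInfinity(src) || double.IsNaN(src))
return src;
if (src == 0)
return -double.Epsilon; // largest value below zero (-double.MinValue would be double.MaxValue)
// NOTE: the mantissa decrement below is only correct for positive, normal inputs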
// get bytes from double
byte[] srcbytes = System.BitConverter.GetBytes(src);
// extract components
byte sign = (byte)(srcbytes[7] & 0x80);
ulong exp = ((((ulong)srcbytes[7]) & 0x7F) << 4) + (((ulong)srcbytes[6] >> 4) & 0x0F);
ulong mant = ((ulong)1 << 52) | (((ulong)srcbytes[6] & 0x0F) << 48) | (((ulong)srcbytes[5]) << 40) | (((ulong)srcbytes[4]) << 32) | (((ulong)srcbytes[3]) << 24) | (((ulong)srcbytes[2]) << 16) | (((ulong)srcbytes[1]) << 8) | ((ulong)srcbytes[0]);
// decrement mantissa
--mant;
// check if implied bit has been removed and renormalize if so
if ((mant & ((ulong)1 << 52)) == 0)
{
mant = (mant << 1) | 1; // shift in a 1 so the result is exactly one ULP below the power of two
exp--;
}
// build byte representation of modified value
byte[] bytes = new byte[8];
bytes[7] = (byte)((ulong)sign | ((exp >> 4) & 0x7F));
bytes[6] = (byte)((((ulong)exp & 0x0F) << 4) | ((mant >> 48) & 0x0F));
bytes[5] = (byte)((mant >> 40) & 0xFF);
bytes[4] = (byte)((mant >> 32) & 0xFF);
bytes[3] = (byte)((mant >> 24) & 0xFF);
bytes[2] = (byte)((mant >> 16) & 0xFF);
bytes[1] = (byte)((mant >> 8) & 0xFF);
bytes[0] = (byte)(mant & 0xFF);
// convert back to double and return
double res = System.BitConverter.ToDouble(bytes, 0);
return res;
}
All of which gives you a value that is different from the initial value by a change in the lowest bit of the mantissa... in theory :)
Here's a test:
public static void Main(string[] args)
{
double test = 1.0/3;
double prev = PrevDouble(test);
Console.WriteLine("{0:r}, {1:r}, {2:r}", test, prev, test - prev);
}
Gives the following results on my PC:
0.33333333333333331, 0.33333333333333326, 5.5511151231257827E-17
The difference is tiny (one unit in the last place, 2^-54 for this value), but it is real: the expression test == prev evaluates to false, as the output above shows :)

In .NET Core 3.0 and later you can use Math.BitIncrement/Math.BitDecrement, so there is no need for manual bit manipulation anymore:
Math.BitIncrement: returns the smallest value that compares greater than a specified value.
Math.BitDecrement: returns the largest value that compares less than a specified value.
Since .NET 7 there are also Double.BitIncrement and Double.BitDecrement.

Related

Modify specific bit in byte

I need to modify (!not toggle XOR!) specific bit in byte value. I have:
source byte (e.g. b11010010);
index of bit to modify (e.g. 4);
new value of bit (0 or 1).
Now, what I need. If new value is 0, then bit[4] must be set to 0. If new value is 1, then bit[4] must be set to 1.
General part:
var bitIndex = 4;
var byte = b11010010;
var mask = 1 << bitIndex;
var newValue = 1;
This is the easiest way to do this:
if(newValue == 1)
byte |= mask; // set bit[bitIndex]
else
byte &= ~mask; // drop bit[bitIndex]
Another way does this without an if/else statement, but it looks too hard to understand:
byte = byte & ~mask | (newValue << bitIndex) & mask
Here, the first AND clears bit[bitIndex], the second AND calculates the new value for bit[bitIndex], and the OR sets bit[bitIndex] to the calculated value, no matter whether it is 0 or 1.
Is there any easier way to set specific bit into given value?
(newValue << bitIndex) only has a single bit set, there's no need for & mask.
So you have just 5 operations.
byte = byte & ~(1 << bitIndex) | (newValue << bitIndex); // bitIndex'th bit becomes newValue
It's still complex enough to be worth a comment, but easy to see that the comment is correct because it's two easily recognized operations chained together (unlike the current accepted answer, which requires every reader to sit down and think about it for a minute)
The canonical way to do this is:
byte ^= (-newValue ^ byte) & (1 << n);
Bit number n will be set if newValue == 1, and cleared if newValue == 0
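For comparison, here is a small sketch in C of the two branch-free forms from the answers above. The function names set_bit_merge and set_bit_xor are made up for illustration, and both assume new_value is exactly 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

/* Mask-and-merge form: clear the target bit, then OR in the new value. */
uint8_t set_bit_merge(uint8_t b, unsigned index, unsigned new_value)
{
    assert(index < 8 && new_value <= 1);
    return (uint8_t)((b & ~(1u << index)) | (new_value << index));
}

/* XOR form: flip the target bit only when it differs from new_value. */
uint8_t set_bit_xor(uint8_t b, unsigned index, unsigned new_value)
{
    assert(index < 8 && new_value <= 1);
    /* 0u - new_value is all ones for 1 and all zeros for 0 (unsigned wraparound). */
    return (uint8_t)(b ^ (((0u - new_value) ^ b) & (1u << index)));
}
```

Both give the same result for every input; for the question's sample byte 11010010, setting bit 4 to 0 yields 11000010.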

Looking for a more efficient pop count given a restriction

The popcount function returns the number of 1's in an input. 0010 1101 has a popcount of 4.
Currently, I am using this algorithm to get the popcount:
private int PopCount(int x)
{
x = x - ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
return (((x + (x >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
This works fine and the only reason I ask for more is because this operation is run awfully often and I am looking for additional performance gains.
I'm looking for a way to simplify the algorithm based on the fact that my 1's will always be right aligned. That is, the input will be something like 00000 11111 (returns 5) or 00000 11111 11111 (returns 10).
Is there a way to make a more efficient popcount based on this constraint? If the input was 01011 11101 10011, it would just return 2 because it only cares about the right-most ones. It seems any kind of looping is slower than the existing solution.
Here's a C# implementation that performs "find highest set" (binary logarithm). It may or may not be faster than your current PopCount, but it is surely slower than using the real clz and/or popcnt CPU instructions:
static int FindMSB( uint input )
{
if (input == 0) return 0;
return (int)(BitConverter.DoubleToInt64Bits(input) >> 52) - 1022;
}
Test: http://rextester.com/AOXD85351
And a slight variation without a conditional branch:
/* precondition: ones are right-justified, e.g. 00000111 or 00111111 */
static int FindMSB( uint input )
{
return (int)(input & (int)(BitConverter.DoubleToInt64Bits(input) >> 52) - 1022);
}
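The exponent trick used by FindMSB can be sketched in C as well. This is only an illustration under two assumptions: double is the IEEE-754 64-bit format, and the input's ones are right-aligned as the question states. The name ones_run_length is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Number of right-aligned ones in x, read from the exponent of (double)x. */
int ones_run_length(uint32_t x)
{
    assert((x & (x + 1)) == 0);   /* precondition: x looks like 0...01...1 (or is 0) */
    if (x == 0)
        return 0;                 /* 0.0 has a zero exponent field, so handle it here */

    double d = (double)x;         /* exact: every 32-bit integer fits in a double */
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);

    /* Biased exponent is 1023 + (index of the highest set bit); the count is index + 1. */
    return (int)(bits >> 52) - 1022;
}
```

Because x is never negative, the sign bit is zero and the shift by 52 leaves just the exponent field.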

bitwise shift in uint

uint number = 0x418 in bits : 0000010000011000
uint number1 = 0x8041 in bits: 1000000001000001
uint number2 = 0x1804 in bits: 0001100000000100
I cannot get 0x8041 with
number >> 4;
or
(number >> 4) & 0xffff;
How I can get 0x8041 and 0x1804 from 0x418 with shift?
SOLUTION
((number >> nbits) | (number << (16 - nbits))) & 0xffff
C# does not have a bitwise rotate operator - bits shifted past the right end just fall off and vanish. What you can do to solve this is
(number >> nbits) | (number << (32 - nbits))
which will right-rotate a 32-bit unsigned integer by nbits bits.
What you are describing is typically known as Rotation, not Shifting. In assembly (x86), this is exposed via ROR and ROL instructions.
I'm not aware of a bitwise operator available in C# to do this, but the algorithm is simple enough:
value = (value & 0x1) != 0 ? (1u << (Marshal.SizeOf(value) * 8 - 1)) | (value >> 1) : (value >> 1);
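As a rough sketch of the rotation described in these answers, written in C: rotr16 and rotr32 are illustrative names, and the 16-bit version matches the widths used in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value right by n bits (0 <= n < 16). */
uint16_t rotr16(uint16_t x, unsigned n)
{
    assert(n < 16);
    uint32_t w = x;                                /* widen so the left shift cannot overflow */
    return (uint16_t)((w >> n) | (w << (16 - n))); /* the cast drops everything above bit 15 */
}

/* Rotate a 32-bit value right by n bits; n is reduced modulo 32. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    /* Masking the left-shift count avoids an undefined shift by 32 when n is 0. */
    return (x >> n) | (x << ((32 - n) & 31));
}
```

With the question's value, rotr16(0x0418, 4) gives 0x8041 and rotr16(0x0418, 8) gives 0x1804.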

How to parse byte using bits value

I have to get values from a byte that stores three bit fields.
The bit layout is as follows:
| - - | - - - | - - - |
The first portion contains two bits.
The second portion contains three bits.
The third portion contains three bits.
A sample value is
11010001 = 209 decimal
What I want is to create three different properties that give me the decimal value of each of the three portions defined above.
How can I get the bit values from this decimal number, and then the decimal value of the respective bits?
Just use shifting and masking. Assuming that the two-bit value is in the high bits of the byte:
int value1 = (value >> 6) & 3; // 3 = binary 11
int value2 = (value >> 3) & 7; // 7 = binary 111
int value3 = (value >> 0) & 7;
The final line doesn't have to use the shift operator of course - shifting by 0 bits does nothing. I think it adds to the consistency though.
For your sample value, that would give value1 = 3, value2 = 2, value3 = 1.
Reversing:
byte value = (byte) ((value1 << 6) | (value2 << 3) | (value3 << 0));
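The same shift-and-mask round trip, sketched in C under the same assumption that the two-bit field sits in the high bits. The struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of the byte, laid out as | 2 bits | 3 bits | 3 bits |. */
struct fields {
    unsigned first;   /* bits 7..6 */
    unsigned second;  /* bits 5..3 */
    unsigned third;   /* bits 2..0 */
};

/* Split a byte into its three fields. */
struct fields unpack_fields(uint8_t value)
{
    struct fields f;
    f.first  = (value >> 6) & 3u;  /* 3 = binary 11 */
    f.second = (value >> 3) & 7u;  /* 7 = binary 111 */
    f.third  = value & 7u;
    return f;
}

/* Rebuild the byte from its three fields. */
uint8_t pack_fields(struct fields f)
{
    assert(f.first <= 3 && f.second <= 7 && f.third <= 7);
    return (uint8_t)((f.first << 6) | (f.second << 3) | f.third);
}
```

For the sample value 209 this yields 3, 2 and 1, and packing those fields again returns 209.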
You can extract the different parts using bit-masks, like this (here assuming the two-bit part is in the low bits of the byte):
int part1=b & 0x3;
int part2=(b>>2) & 0x7;
int part3=(b>>5) & 0x7;
This shifts each part into the least-significant-bits, and then uses binary and to mask all other bits away.
And I assume you don't want the decimal value of these bits, but an int containing their value. An integer is still represented as a binary number internally. Representing the int in base 10/decimal only happens once you convert to string.

Number of unset bit left of most significant set bit?

Assuming the 64bit integer 0x000000000000FFFF which would be represented as
00000000 00000000 00000000 00000000
00000000 00000000 >11111111 11111111
How do I find the amount of unset bits to the left of the most significant set bit (the one marked with >) ?
In straight C (long long is 64 bits on my setup), adapted from similar Java implementations (updated after a little more reading on Hamming weight).
A little more explanation: the top part sets all bits to the right of the most significant 1 and then negates the result (i.e. all the 0's to the 'left' of the most significant 1 are now 1's and everything else is 0).
Then I used a Hamming weight implementation to count the bits.
unsigned long long i = 0x0000000000000000LLU;
i |= i >> 1;
i |= i >> 2;
i |= i >> 4;
i |= i >> 8;
i |= i >> 16;
i |= i >> 32;
// Highest bit in input and all lower bits are now set. Invert to set the bits to count.
i=~i;
i -= (i >> 1) & 0x5555555555555555LLU; // each 2 bits now contains a count
i = (i & 0x3333333333333333LLU) + ((i >> 2) & 0x3333333333333333LLU); // each 4 bits now contains a count
i = (i + (i >> 4)) & 0x0f0f0f0f0f0f0f0fLLU; // each 8 bits now contains a count
i *= 0x0101010101010101LLU; // add each byte to all the bytes above it
i >>= 56; // the number of bits
printf("Leading 0's = %llu\n", i);
I'd be curious to see how this performs efficiency-wise. I tested it with several values, though, and it seems to work.
Based on: http://www.hackersdelight.org/HDcode/nlz.c.txt
template<typename T> int clz(T v) {int n=sizeof(T)*8;int c=n;while (n){n>>=1;if (v>>n) c-=n,v>>=n;}return c-v;}
If you'd like a version that allows you to keep your lunch down, here you go:
int clz(uint64_t v) {
int n=64,c=64;
while (n) {
n>>=1;
if (v>>n) c-=n,v>>=n;
}
return c-v;
}
As you'll see, you can save cycles on this by careful analysis of the assembler, but the strategy here is not a terrible one. The while loop performs log2(64) = 6 halving steps (a seventh pass runs with n = 0 and changes nothing); each step converts the problem into one of counting the number of leading bits on an integer of half the size.
The if statement inside the while loop asks the question: "can i represent this integer in half as many bits", or analogously, "if i cut this in half, have i lost it?". After the if() payload completes, our number will always be in the lowest n bits.
At the final stage, v is either 0 or 1, and this completes the calculation correctly.
If you are dealing with unsigned integers, you could do this:
#include <math.h>
#include <stdint.h>
int numunset(uint64_t number)
{
int nbits = sizeof(uint64_t)*8;
if(number == 0)
return nbits;
int first_set = floor(log2(number));
return nbits - first_set - 1;
}
I don't know how it will compare in performance to the loop and count methods that have already been offered because log2() could be expensive.
Edit:
This could cause some problems with high-valued integers, since log2() converts its argument to double and values above 2^53 may be rounded, so the result can be off by one. You could use the log2l() function that works with long double. A better solution would be to use an integer log2() function as in this question.
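As a sketch of the integer-only alternative suggested in the edit above: ilog2_u64 and numunset_int are hypothetical names, and the loop is the simplest possible integer log2 rather than a tuned one. It sidesteps the rounding problem, for instance UINT64_MAX rounding up to 2^64 when converted to double.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)) using only integer shifts; v must be non-zero. */
int ilog2_u64(uint64_t v)
{
    assert(v != 0);
    int log = 0;
    while (v >>= 1)   /* each shift that leaves a non-zero value is one more power of two */
        ++log;
    return log;
}

/* Number of unset bits to the left of the most significant set bit. */
int numunset_int(uint64_t number)
{
    if (number == 0)
        return 64;    /* no bit is set, so all 64 bits count */
    return 63 - ilog2_u64(number);
}
```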
// clear all bits except the lowest set bit
x &= -x;
// if x==0, add 0, otherwise add x - 1.
// This sets all bits below the one set above to 1.
x += (-(x != 0)) & (x - 1);
return 64 - count_bits_set(x);
Where count_bits_set is the fastest version of counting bits you can find. See https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel for various bit counting techniques. Note that x &= -x isolates the lowest set bit, so this counts the bits above the lowest set bit; that equals the number of leading zeros only when exactly one bit is set.
I'm not sure I understood the problem correctly. I think you have a 64bit value and want to find the number of leading zeros in it.
One way would be to find the most significant bit and simply subtract its position from 63 (assuming lowest bit is bit 0). You can find out the most significant bit by testing whether a bit is set from within a loop over all 64 bits.
Another way might be to use the (non-standard) __builtin_clzll in gcc (the unsigned long long variant of __builtin_clz; its result is undefined for an input of 0).
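A possible wrapper around the builtin, as a sketch only: leading_zeros_u64 is a made-up name, the builtin path assumes GCC or Clang, and the fallback is the plain bit-by-bit scan described above.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Count leading zero bits of a 64-bit value; returns 64 for zero. */
int leading_zeros_u64(uint64_t v)
{
    if (v == 0)
        return 64;                        /* the builtin is undefined for 0 */
#if defined(__GNUC__) || defined(__clang__)
    /* __builtin_clzll takes unsigned long long; assumed here to be 64 bits wide. */
    assert(sizeof(unsigned long long) * CHAR_BIT == 64);
    return __builtin_clzll(v);
#else
    /* Portable fallback: shift left until the top bit is set. */
    int n = 0;
    while (!(v & (UINT64_C(1) << 63))) {
        v <<= 1;
        ++n;
    }
    return n;
#endif
}
```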
I agree with the binary search idea. However two points are important here:
1. The range of valid answers to your question is from 0 to 64 inclusive. In other words, there may be 65 different answers to the question. I think (almost sure) all who posted the "binary search" solution missed this point, hence they'll get a wrong answer for either zero or a number with the MSB set.
2. If speed is critical, you may want to avoid the loop. There's an elegant way to achieve this using templates.
The following template stuff finds the MSB correctly of any unsigned type variable.
// helper
template <int bits, typename T>
bool IsBitReached(T x)
{
const T cmp = T(1) << (bits ? (bits-1) : 0);
return (x >= cmp);
}
template <int bits, typename T>
int FindMsbInternal(T x)
{
if (!bits)
return 0;
int ret;
if (IsBitReached<bits>(x))
{
ret = bits;
x >>= bits;
} else
ret = 0;
return ret + FindMsbInternal<bits/2, T>(x);
}
// Main routine
template <typename T>
int FindMsb(T x)
{
const int bits = sizeof(T) * 8;
if (IsBitReached<bits>(x))
return bits;
return FindMsbInternal<bits/2>(x);
}
Here you go, pretty trivial to update as you need for other sizes...
int bits_left(unsigned long long value)
{
static unsigned long long mask = 0x8000000000000000;
int c = 64;
// doh
if (value == 0)
return c;
// check byte by byte to see what has been set
if (value & 0xFF00000000000000)
c = 0;
else if (value & 0x00FF000000000000)
c = 8;
else if (value & 0x0000FF0000000000)
c = 16;
else if (value & 0x000000FF00000000)
c = 24;
else if (value & 0x00000000FF000000)
c = 32;
else if (value & 0x0000000000FF0000)
c = 40;
else if (value & 0x000000000000FF00)
c = 48;
else if (value & 0x00000000000000FF)
c = 56;
// skip
value <<= c;
while(!(value & mask))
{
value <<= 1;
c++;
}
return c;
}
Same idea as user470379's, but counting down ...
Assume all 64 bits are unset. While value is larger than 0 keep shifting the value right and decrementing number of unset bits:
/* untested */
int countunsetbits(uint64_t val) {
int x = 64;
while (val) { x--; val >>= 1; }
return x;
}
Try
#include <limits.h> // CHAR_BIT
int countBits(unsigned long long value)
{
int result = sizeof(value) * CHAR_BIT; // 64 for a 64-bit type
while(value != 0)
{
--result;
value = value >> 1; // Remove bottom bits until all 1 are gone.
}
return result;
}
Use log base 2 to get the position of the most significant 1 bit.
log2(2) = 1, meaning 0b10 -> 1
log2(4) = 2, 5-7 => 2.xx, or 0b100 -> 2
log2(8) = 3, 9-15 => 3.xx, 0b1000 -> 3
log2(16) = 4, you get the idea
and so on...
The numbers in between give a fractional log result, so truncating the value to an int gives you the zero-based position of the most significant 1 bit.
Once you have this position, say b, the answer is simply 63 - b (for n > 0).
function get_pos_msd(int n){
return int(log2(n))
}
last_zero = 63 - get_pos_msd(n)
