I'm trying to port Java code to C# and I'm running into odd bugs related to the unsigned right shift operator >>>. Normally the code:
long l = (long) ((ulong) number) >> 2;
Would be the equivalent of Java's:
long l = number >>> 2;
However, for the case of -2147483648L, which you might recognize as Integer.MIN_VALUE, this returns a different number than it would in Java: the cast to ulong changes the semantics of the number, hence I get a different result.
How would something like this be possible in C#?
I'd like to preserve the code semantics as much as possible since it's a pretty complex body of code.
I believe your expression is incorrect when you consider C#'s operator precedence. Your code converts your long to ulong, then back to long, and only then shifts. I'm assuming your intent was to perform the shift on the ulong.
Per the C# specification §7.2.1, unary operators (which include your casts) take precedence over the shift operators. Thus your code:
long l = (long) ((ulong) number) >> 2;
would be interpreted as:
ulong ulongNumber = (ulong)number;
long longNumber = (long)ulongNumber;
long shiftedlongNumber = longNumber >> 2;
Given -2147483648L for number, this yields -536870912.
By wrapping the conversion and shifting in parentheses:
long l = (long) (((ulong) number) >> 2);
you produce logic that could be rewritten as:
ulong ulongNumber = (ulong)number;
ulong shiftedulongNumber = ulongNumber >> 2;
long longShiftedNumber = (long)shiftedulongNumber;
Given -2147483648L for number, this yields 4611686017890516992.
EDIT: Note that given those ordering rules, there's an extra set of parentheses in my answer that isn't necessary. The correct expression could be written as:
long l = (long) ((ulong) number >> 2);
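If this comes up in many places while porting, you could also wrap it in a small helper. This is just a sketch of my own (the method name is mine, not from the spec or your code):

// A Java-style >>> for long values: reinterpret the bits as unsigned so the
// shift fills with zeros, then reinterpret the result back as signed.
static long UnsignedRightShift(long value, int shift)
{
    return (long)((ulong)value >> shift);
}

With that, UnsignedRightShift(-2147483648L, 2) returns 4611686017890516992, matching Java's -2147483648L >>> 2.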
Related
Just out of curiosity: is there a way to get the sign of a number of any kind (but obviously a signed type), not just integer, using some bitwise/masking or other kind of operation?
That is, without using any conditional statement or calling the Math.Sign() function.
Thanks in advance!
EDIT: I recognize it was a misleading question. What I had in mind was more likely something like: "get the same output as Math.Sign()" or, simplifying, "get 0 if x <= 0, 1 otherwise".
EDIT #2: to all those asking for code, I didn't have any in mind when I posted the question, but here's an example I came up with, just to give context for a possible application:
x = (x < 0) ? 0 : x;
Having the sign into a variable could lead to:
x = sign * x; //where sign = 0 for x <= 0, otherwise sign = 1;
The aim would be to achieve the same result as the above :)
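As an aside, one branchless way to get that exact effect for a 32-bit int is the following sketch (mine, not taken from any of the answers below):

// For x < 0, x >> 31 is -1 (all bits set), so ~(x >> 31) is 0 and the AND clears x.
// For x >= 0, x >> 31 is 0, so ~(x >> 31) is -1 (all bits set) and the AND keeps x.
x = x & ~(x >> 31);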
EDIT #3: FOUND IT! :D
// THIS IS NOT MEANT TO BE PLAIN C#!
// Returns 0 if x <= 0, 1 otherwise.
int SignOf(int x)
{
    return (1+x-(x+1)%x)/x;
}
Thanks to everyone!
is there a way to get the sign of a number (any kind, not just integer)
Not for any number type, no. For an integer, you can test the most significant bit: if it's 1, the number is negative. You could theoretically do the same with a floating point number, but bitwise operators don't work on float or double.
Here's a "zero safe" solution that works for all value types (int, float, double, decimal...etc):
(value.GetHashCode() >> 31) + 1;
Output: 1 → 1, -1 → 0, 0.5 → 1, -0.5 → 0, 0 → 1
It's also roughly 10% cheaper than (1+x-(x+1)%x)/x; in C#. Additionally, if value is an integer you can drop the GetHashCode() call entirely, in which case (value >> 31) + 1; is roughly 56% cheaper than (1+x-(x+1)%x)/x; (put the other way, the latter is 127% more expensive).
Since 0 has no sign bit set it counts as positive here, so it would be illogical to return 1 for positive numbers but 0 for 0. If you could pass a -0 (sign bit set), you would get an output of 0.
I understand that GetHashCode() is a function call, but the inner workings of the function in the C# language implementation are entirely arithmetic. Basically, the GetHashCode() function reads the memory section that stores your float type as an integer type:
*((int*)&singleValue);
How the GetHashCode function works (best source I could find quickly): https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c3fde60-1b4a-449f-afdc-fe5bba8fece3/hash-code-of-floats?forum=netfxbcl
If you want the output to be 1 or -1 matching the sign of the input, use (floatValue here stands for your float variable):
((floatValue.GetHashCode() >> 31) * 2) + 1;
The above floating-point method is roughly 39% cheaper than System.Math.Sign(float) (put the other way, System.Math.Sign(float) is roughly 65% more expensive). Where System.Math.Sign(float) throws an exception for float.NaN, ((float.NaN.GetHashCode() >> 31) * 2) + 1; does not, and returns -1 instead of crashing.
or for integers (intValue standing for your int variable):
((intValue >> 31) * 2) + 1;
The above integer method is roughly 56% cheaper than System.Math.Sign(int) (put the other way, System.Math.Sign(int) is roughly 125% more expensive).
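To make this concrete, here is a minimal sketch exercising both expressions; the variable names are mine, and it assumes the standard .NET behavior where a non-zero float's hash code carries the sign in bit 31:

float f = -0.5f;
int zeroOrOne = (f.GetHashCode() >> 31) + 1;            // 0 for negative input, 1 otherwise
int plusOrMinusOne = ((f.GetHashCode() >> 31) * 2) + 1; // -1 for negative input, +1 otherwise
Console.WriteLine(zeroOrOne);      // prints 0
Console.WriteLine(plusOrMinusOne); // prints -1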
It depends on the kind of value type you are targeting.
For signed integers, C# and most computer systems use the so-called two's complement representation.
That means the sign is stored in the most significant bit of the value.
So you can extract the sign like this:
Int16 number = -2;
Int16 sign = (Int16)((number & Int16.MinValue) >> 15); // isolate bit 15 (the sign bit), then shift it down: -1 if set, 0 if not
Boolean isNegative = Convert.ToBoolean(sign);
Note that up until now we have not used any conditional operator (explicitly, anyway).
But: you still don't know whether the number has a sign or not.
The logical equivalent of your question, "How do I know if my number is negative?", explicitly requires the usage of a conditional operator, as the question is, after all, conditional.
So you won't be able to dodge:
if (isNegative)
    doThis();
else
    doThat();
To just get the sign, you can avoid conditional operators, as you will see below in the Sign extension of the int32 struct. However, to get the name I don't think you can avoid a conditional operator.
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(0.Sign());
        Console.WriteLine(0.SignName());
        Console.WriteLine(12.Sign());
        Console.WriteLine(12.SignName());
        Console.WriteLine((-15).Sign());
        Console.WriteLine((-15).SignName());
        Console.ReadLine();
    }
}

public static class Extensions
{
    // Format with an explicit leading sign, then keep just that first character.
    public static string Sign(this int signedNumber)
    {
        return signedNumber.ToString("+00;-00").Substring(0, 1);
    }

    public static string SignName(this int signedNumber)
    {
        return signedNumber.ToString("+00;-00").Substring(0, 1) == "+" ? "positive" : "negative";
    }
}
If x == 0 you will get a divide-by-zero exception with the code you posted:
int SignOf(int x)
{
    return (1+x-(x+1)%x)/x;
}
For a little personal research project I want to generate a string list of all possible values a double precision floating point number can have.
I've found the "r" formatting option, which guarantees that the string can be parsed back into the exact same bit representation:
string s = myDouble.ToString("r");
But how to generate all possible bit combinations? Preferably ordered by value.
Maybe using the unchecked keyword somehow?
unchecked
{
    // for all long values
    myDouble[i] = myLong++;
}
Disclaimer: it's more a theoretical question; I am not going to read all the numbers... :)
Using unsafe code:

ulong i = 0; // ulong is 64 bits, like a double
unsafe
{
    double* d = (double*)&i;
    for (; i < ulong.MaxValue; i++) // note: stops one pattern short of ulong.MaxValue
        Console.WriteLine(*d);
}
You can start with all the values 0 <= x < 1. You can create those by keeping the exponent at zero and using different values for the mantissa.
The mantissa is stored in 52 of the 64 bits that make up a double precision number, so that makes 2^52 = 4503599627370496 different numbers between 0 and 1.
From the description of the double format you can figure out what the bit pattern (eight bytes) should be for those numbers, then you can use the BitConverter.ToDouble method to do the conversion.
Then you can set the sign bit to make the negative versions of all those numbers.
All those numbers are unique; beyond that you will start getting duplicate values, because there are several ways to express the same value when the exponent is non-zero. For each new non-zero exponent you would get the values that were not possible to express with the previously used exponents.
The values between 0 and 1 will however keep you busy for the foreseeable future, so you can just start with those.
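As a minimal sketch of that approach (the loop bound and names are mine), you can walk the lowest bit patterns, where the exponent field is zero and only the mantissa varies, and print each double with the round-trippable "r" format from the question:

// Interpret successive 64-bit patterns as doubles via BitConverter.
for (long bits = 0; bits < 10; bits++)
{
    double d = BitConverter.ToDouble(BitConverter.GetBytes(bits), 0);
    Console.WriteLine(d.ToString("r"));
}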
This should be doable in safe code: create a bit string, convert it to a double, output, increment, repeat... a LOT.

string bstr = "01010101010101010101010101010101"; // this is 32 bits instead of 64, adjust as needed
long v = 0;
for (int i = 0; i < bstr.Length; i++) v = (v << 1) + (bstr[i] - '0'); // parse most significant bit first
double d = BitConverter.ToDouble(BitConverter.GetBytes(v), 0);
// increment bstr and loop
Is it possible to convert a 24 bit integer value into a float and then back to a 24 bit integer without losing data?
For example, let's consider an 8 bit int (an sbyte), with range [-127..127] (we drop -128).
public static float ToFloatSample (sbyte x) { return x / 127f; }
So, if x == -127, result will be -1, if x == 127, result will be 1. If x == 64, result will be ~0.5
public static int ToIntSample (float x) { return (int) (x * 127f); }
So now:
sbyte x = some_number;
float f = ToFloatSample (x);
int y = ToIntSample (f);
Will x == y always hold? Using an 8 bit int, yes, but what if I use 24 bits?
Having thought about your question, I now understand what you're asking.
I understand you have 24 bits which represent a real number n such that -1 <= n <= +1, and you want to load this into an instance of System.Single, and back again.
In C/C++ this is actually quite easy with the frexp and ldexp functions, documented here ( how can I extract the mantissa of a double ), but in .NET it's a more involved process.
The C# language specification (and thus .NET) states it uses the IEEE 754 format, which means you'll need to dump the bits into an integer type so you can perform the bitwise logic to extract the components. This question has already been asked here on SO, except for System.Double instead of System.Single, but converting the answer to work with Single is a trivial exercise for the reader ( extracting mantissa and exponent from double in c# ).
In your case, you'd want to store your 24-bit mantissa value in the low-24 bits of an Int32 and then use the code in that linked question to load and extract it from a Single instance.
Every integer in the range [-16777216, 16777216] is exactly representable as an IEEE 754 32-bit binary floating point number. That includes both the unsigned and 2's complement 24 bit integer ranges. Simple casting will do the job.
The range is wider than you might expect because there is an extra significand bit that is not stored; it is a binary digit that is known not to be zero.
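If you want to convince yourself, a brute-force check is cheap. This is just a sketch, with the signed 24 bit range written out explicitly:

// Every value in [-8388608, 8388607] (signed 24 bit) should survive the round trip,
// since every |i| <= 2^24 is exactly representable in a float.
for (int i = -8388608; i <= 8388607; i++)
{
    float f = i;
    if ((int)f != i)
    {
        Console.WriteLine("Mismatch at " + i);
        break;
    }
}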
The following two operations are identical. However, MaxValues1 will not compile, failing with "The operation overflows at compile time in checked mode." Can someone please explain what is going on with the compiler, and how I can get around it without having to use a hard-coded value as in MaxValues2?
public const ulong MaxValues1 = 0xFFFF * 0xFFFF * 0xFFFF;
public const ulong MaxValues2 = 0xFFFD0002FFFF;
To get unsigned literals, add the u suffix, and to make them long, add the l suffix; i.e. you need ul.
If you really want the overflow behavior, you can write unchecked(0xFFFF * 0xFFFF * 0xFFFF), but that's likely not what you want. You get the overflow because the literals are interpreted as Int32, not as ulong, and 0xFFFF * 0xFFFF * 0xFFFF does not fit in a 32 bit integer, since it is approximately 2^48.
public const ulong MaxValues1 = 0xFFFFul * 0xFFFFul * 0xFFFFul;
By default, integer literals are of type int. You can add the 'UL' suffix to change them to ulong literals.
public const ulong MaxValues1 = 0xFFFFUL * 0xFFFFUL * 0xFFFFUL;
public const ulong MaxValues2 = 0xFFFD0002FFFFUL;
Add numeric suffixes 'UL' to each of the numbers. Otherwise, C# considers them as Int32.
C# - Numeric Suffixes
I think it's actually not a ulong until you assign it at the end; try
public const ulong MaxValues1 = (ulong)0xFFFF * (ulong)0xFFFF * (ulong)0xFFFF;
I.e. in MaxValues1 you are multiplying three 32-bit ints together, which overflows because the result is implied to be another 32-bit int. When you cast, the operation changes to multiplying three ulongs together, which won't overflow because the result is inferred to be a ulong.
(ulong)0xFFFF * 0xFFFF * 0xFFFF;
0xFFFF * (ulong)0xFFFF * 0xFFFF;
also work, as the result type is determined by the largest operand type,
but
0xFFFF * 0xFFFF * (ulong)0xFFFF;
won't work, as the first multiplication (0xFFFF * 0xFFFF) already overflows the int before the cast comes into play.
I'm using the MD5 algorithm to hash the key for an on-disk hash table (I know it's questionable whether this is the best algorithm to use for this, but I'm going with it for now. The problem is generalizable to any algorithm that produces a byte array). My problem is this:
The size of the hash code determines the number of combinations (buckets) in the hash table. Since MD5 is 128 bits, there is a huge number of combinations (~3.4e38), which is way too big for my purpose. So what I want to do is pick off the first n bits of the byte array that MD5 produces and convert those into a long (or ulong) value. Since MD5 produces a byte array, it would be easy to do if I wanted an integral number of bytes, but this leads to too big a jump in the number of combinations. I'm finding the single-bit version a lot trickier.
Goal:
n = 10 // I.e. I want 2^10 combinations
long pos = someFcn(byte[] key, n)
where key is the value being hashed, and n is the number of bits of the MD5 result I want to use. pos, then, will be an integer from 0 to 1023 (in the case of n = 10). If n = 11, the code will be from 0 to 2^11 - 1 = 2047, etc. It has to be somewhat fast/efficient.
Doesn't seem that hard but it's eluding me. Any help would be much appreciated. Thanks.
First, convert the first four bytes into an integer with BitConverter.ToInt32. It's getting four bytes no matter what, but this probably won't make it measurably slower, since you're working with 32-bit registers for the rest of the calculations anyway, and complex stuff like "if it's < 16 then do this with the first two bytes" will just make it more complicated.
Then, given that integer, take the lowest N bits. If you really want a specific number of bits [a power of two number of buckets] not known at compile time, ~((-1)<<N) is a nice trick to get 2^N-1.
Or you could simply use ToUInt32 instead and take it modulo a prime number (it might be slightly better to convert to UInt64 instead; then you've got fully half the bits to start with, in this case).
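Putting the first suggestion together, a sketch of the someFcn from the question could look like this (the names are mine, and it assumes hash is the byte array MD5 produced, with at least four bytes):

static long FirstNBits(byte[] hash, int n)
{
    // Take the first four hash bytes as a 32-bit integer,
    // then keep only the lowest n bits (valid for n < 32).
    int value = BitConverter.ToInt32(hash, 0);
    int mask = ~(-1 << n); // 2^n - 1
    return value & mask;
}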
To obtain the first 10 bits, for example:
int result = ((int)key[0] << 2) | (((int)key[1] >> 6) & 0x03);
If you have an array like this,
unsigned char data[2000];
then you can just scrape off the first n bits into an integer like so:
typedef unsigned long long int MyInt;

MyInt scrape(size_t n, unsigned char * data)
{
    MyInt result = 0;
    size_t b;
    for (b = 0; b < n / 8; ++b)   // whole bytes first
    {
        result <<= 8;
        result += data[b];
    }
    const size_t remaining_bits = n % 8;   // then any leftover bits
    result <<= remaining_bits;
    result += (data[b] >> (8 - remaining_bits));
    return result;
}
I'm assuming that CHAR_BIT == 8; feel free to generalize the code if you like. Also, the size of the array times 8 must be at least n.