Understanding bitwise operator with int list [duplicate] - c#

I cannot understand this shift operator example (from the C# reference):
class MainClass1
{
static void Main()
{
int i = 1;
long lg = 1;
Console.WriteLine("0x{0:x}", i << 1);
Console.WriteLine("0x{0:x}", i << 33);
Console.WriteLine("0x{0:x}", lg << 33);
}
}
/*
Output:
0x2
0x2
0x200000000
*/
class MainClass2
{
static void Main()
{
int a = 1000;
a <<= 4;
Console.WriteLine(a);
}
}
/*
Output:
16000
*/

<< is the left-shift operator; this takes the binary representation of a value, and moves all the bits "n" places to the left (except for "mod", see "1"), back-filling with zeros.
>> is the right-shift operator; this does nearly the opposite (moving to the right), except for signed values (i.e. those that can be negative) it back-fills with 1s for negative values, else zeros.
1:
The shift operator is essentially "mod" the width of the data. An int is 32 bits, so a left shift of 33 (in Int32) is exactly the same as a left shift of 1. You don't get all zeros. A long is 64 bits, so a left-shift of 33 gives a different answer (original times 2^33).
2:
Each left shift (within the data width) is the same (for integers) as x2 - so <<4 is x2x2x2x2 = x16.
This is simple binary:
0000000001 = 1
<< goes to
0000000010 = 2
<< goes to
0000000100 = 4
<< goes to
0000001000 = 8

Just to expand on Marc's answer a little (Marc, feel free to include this in yours and I'll delete this answer) this is specified in section 7.8 of the spec:
The predefined shift operators are listed below.
Shift left:
int operator <<(int x, int count);
uint operator <<(uint x, int count);
long operator <<(long x, int count);
ulong operator <<(ulong x, int count);
The << operator shifts x left by a number of bits computed as described below.
The high-order bits outside the range of the result type of x are discarded, the remaining bits are shifted left, and the low-order empty bit positions are set to zero.
Shift right:
int operator >>(int x, int count);
uint operator >>(uint x, int count);
long operator >>(long x, int count);
ulong operator >>(ulong x, int count);
The >> operator shifts x right by a number of bits computed as described below.
When x is of type int or long, the low-order bits of x are discarded, the remaining bits are shifted right, and the high-order empty bit positions are set to zero if x is non-negative and set to one if x is negative.
When x is of type uint or ulong, the low-order bits of x are discarded, the remaining bits are shifted right, and the high-order empty bit positions are set to zero.
For the predefined operators, the number of bits to shift is computed as follows:
When the type of x is int or uint, the shift count is given by the low-order five bits of count. In other words, the shift count is computed from count & 0x1F.
When the type of x is long or ulong, the shift count is given by the low-order six bits of count. In other words, the shift count is computed from count & 0x3F.
If the resulting shift count is zero, the shift operators simply return the value of x.
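The count rule quoted above can be sketched in C. This is only an illustration: C# applies the reduction implicitly, whereas in C a shift by the operand width or more is undefined, so the mask is written out by hand. The names shl32 and shl64 are hypothetical helpers, not part of any library.

```c
#include <stdint.h>

/* 32-bit operand: only the low five bits of count are used (count & 0x1F). */
uint32_t shl32(uint32_t x, int count) {
    return x << (count & 0x1F);
}

/* 64-bit operand: only the low six bits of count are used (count & 0x3F). */
uint64_t shl64(uint64_t x, int count) {
    return x << (count & 0x3F);
}
```

With these, shl32(1, 33) behaves like a shift by 1, while shl64(1, 33) really shifts by 33, which matches the output shown in the question.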

A few more notes for the novice programmer:
Why use shift operators? They don't seem to do much. Well, there are 2 reasons:
They are really fast, because nearly all CPUs have shift registers, meaning the shifting operation is done in the hardware, in the minimum amount of effort (cycles).
Because they are fast, a lot of protocols and standards are designed to take advantage of this. For example IP address operations, checking a CRC, graphic operations etc.

"The shift operator is essentially "mod" the width of the data."
Rubbish! If the amount of the shift is greater than, or equal to, the width of the data, the result is undefined. Do not expect the same 'mod' operation that you happen to have seen, to happen with different compilers, or different versions of the same compiler, or in different shift situations in the same program, or when anything else changes. That's what 'undefined' means.

Related

Wrong bitshift results

I'm working with bitshift for the first time and I'm experiencing unexpected results.
I'm declaring the shift amount as follows:
byte p_size = 0;
if (ver == 0x12 || ver == 0x13)
p_size = 20;
else
p_size = 40;
The value to be shifted is declared as
int t_size = rinput.ReadInt32();
And finally the code I use to shift:
int temp = t_size >> p_size << p_size;
Let's say t_size = 0x2000385E and p_size = 20. temp = 0x20000000 as expected.
Now if t_size = 0x40001014 and p_size = 40, temp = 0x40001000 instead of 0x40000000. I calculated "manually" using a bitwise calculator and it matches the expected result of 0x40000000.
It's probably a silly oversight on my behalf but I don't understand what would cause the weird results with p_size = 40... any advice is appreciated!
Shifting a 32-bit integer by 40 bits doesn't really make sense, since you would be shifting the integer by more bits than it contains.
Both the left and right shift operators document what they do in this case:
If the first operand is an int or uint (32-bit quantity), the shift
count is given by the low-order five bits of the second operand
(second operand & 0x1f).
So when p_size is 40, the shifts are shifting by 40 & 0x1f = 8 bits.
If you need to shift by 40 bits, put your value into a long.
The current behavior is expected, since 40 & 0x1f is 8, as described in the documentation for operator >>:
If the first operand is an int or uint (32-bit quantity), the shift count is given by the low-order five bits of the second operand (second operand & 0x1f).
You are probably looking for masking rather than shifts, maybe:
t_size & 0xFF000000
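A minimal C sketch of both points, assuming unsigned 32-bit values. The function names are made up for illustration, and the & 0x1F step is written out because C does not apply it on its own (an over-wide shift is undefined there).

```c
#include <stdint.h>

/* Mimics the C# rule for a 32-bit operand: the count is first reduced
   to its low five bits, so a requested 40 becomes 8. */
uint32_t shift_round_trip(uint32_t v, int count) {
    int n = count & 0x1F;
    return (v >> n) << n;   /* clears only the low n bits */
}

/* The masking alternative: keep the top byte, clear everything else. */
uint32_t keep_top_byte(uint32_t v) {
    return v & 0xFF000000u;
}
```

shift_round_trip(0x40001014, 40) clears just the low 8 bits, which reproduces the 0x40001000 seen in the question; keep_top_byte(0x40001014) yields 0x40000000.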

How to work with the bits in a byte

I have a single byte which contains two values. Here's the documentation:
The authority byte is split into two fields. The three least significant bits carry the user’s authority level (0-5). The five most
significant bits carry an override reject threshold. If these bits are
set to zero, the system reject threshold is used to determine whether
a score for this user is considered an accept or reject. If they are
not zero, then the value of these bits multiplied by ten will be the
threshold score for this user.
Authority Byte:
7 6 5 4 3 ......... 2 1 0
Reject Threshold .. Authority
I don't have any experience of working with bits in C#.
Can someone please help me convert a Byte and get the values as mentioned above?
I've tried the following code:
BitArray BA = new BitArray(mybyte);
But the length comes back as 29 and I would have expected 8, being each bit in the byte.
-- Thanks for everyone's quick help. Got it working now! Awesome internet.
Instead of BitArray, you can more easily use the built-in bitwise AND and right-shift operator as follows:
byte authorityByte = ...
int authorityLevel = authorityByte & 7;
int rejectThreshold = authorityByte >> 3;
To get the single byte back, you can use the bitwise OR and left-shift operator:
int authorityLevel = ...
int rejectThreshold = ...
Debug.Assert(authorityLevel >= 0 && authorityLevel <= 7);
Debug.Assert(rejectThreshold >= 0 && rejectThreshold <= 31);
byte authorityByte = (byte)((rejectThreshold << 3) | authorityLevel);
Your use of the BitArray is incorrect. This:
BitArray BA = new BitArray(mybyte);
...passes a byte, which is implicitly converted to an int. When that happens, you're triggering this constructor:
BitArray(int length);
...therefore, it's creating the array with that length instead of reading the bits of the byte.
Looking at MSDN (http://msdn.microsoft.com/en-us/library/x1xda43a.aspx) you want this:
BitArray BA = new BitArray(new byte[] { myByte });
Length will then be 8 (as expected).
To get a value of the five most significant bits in a byte as an integer, shift the byte to the right by 3 (i.e. by 8-5), and set the three upper bits to zero using bitwise AND operation, like this:
byte orig = ...
int rejThreshold = (orig >> 3) & 0x1F;
>> is the "shift right" operator. It moves bits 7..3 into positions 4..0, dropping the three lower bits.
0x1F is the binary number 00011111, which has the upper three bits set to zero, and the lower five bits set to one. AND-ing with this number zeroes out three upper bits.
This technique can be generalized to get other bit patterns and other integral data types. You shift the bits that you want into the least-significant position, and apply a mask that "cuts out" the number of bits that you want. In some cases, shifting would not be necessary (e.g. when you get the least significant group of bits). In other cases, such as above, the masking would not be necessary, because you get the most significant group of bits in an unsigned type (if the type is signed, ANDing would be required).
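The shift-then-mask pattern described above could look like this in C. extract_bits is a hypothetical helper name, and the sketch assumes 0 < width < 32 and shift + width <= 32.

```c
#include <stdint.h>

/* Returns `width` bits of `value`, starting at bit `shift`
   (bit 0 is the least significant bit).
   Assumes 0 < width < 32 and shift + width <= 32. */
uint32_t extract_bits(uint32_t value, unsigned shift, unsigned width) {
    uint32_t mask = (1u << width) - 1u;  /* `width` low bits set to one */
    return (value >> shift) & mask;
}
```

For the authority byte, extract_bits(b, 0, 3) would give the authority level and extract_bits(b, 3, 5) the reject threshold field.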
You're using the wrong constructor (probably).
The one that you're using is probably BitArray(int length), while you need BitArray(byte[] bytes):
var bitArray = new BitArray(new [] { myByte } );

Number of unset bit left of most significant set bit?

Assuming the 64bit integer 0x000000000000FFFF which would be represented as
00000000 00000000 00000000 00000000
00000000 00000000 >11111111 11111111
How do I find the amount of unset bits to the left of the most significant set bit (the one marked with >) ?
In straight C (long long are 64 bit on my setup), taken from similar Java implementations: (updated after a little more reading on Hamming weight)
A little more explanation: The top part just sets all bit to the right of the most significant 1, and then negates it. (i.e. all the 0's to the 'left' of the most significant 1 are now 1's and everything else is 0).
Then I used a Hamming Weight implementation to count the bits.
unsigned long long i = 0x0000000000000000LLU;
i |= i >> 1;
i |= i >> 2;
i |= i >> 4;
i |= i >> 8;
i |= i >> 16;
i |= i >> 32;
// Highest bit in input and all lower bits are now set. Invert to set the bits to count.
i=~i;
i -= (i >> 1) & 0x5555555555555555LLU; // each 2 bits now contains a count
i = (i & 0x3333333333333333LLU) + ((i >> 2) & 0x3333333333333333LLU); // each 4 bits now contains a count
i = (i + (i >> 4)) & 0x0f0f0f0f0f0f0f0fLLU; // each 8 bits now contains a count
i *= 0x0101010101010101LLU; // add each byte to all the bytes above it
i >>= 56; // the number of bits
printf("Leading 0's = %llu\n", i);
I'd be curious to see how efficient this is. I tested it with several values, though, and it seems to work.
Based on: http://www.hackersdelight.org/HDcode/nlz.c.txt
template<typename T> int clz(T v) {int n=sizeof(T)*8;int c=n;while (n){n>>=1;if (v>>n) c-=n,v>>=n;}return c-v;}
If you'd like a version that allows you to keep your lunch down, here you go:
int clz(uint64_t v) {
int n=64,c=64;
while (n) {
n>>=1;
if (v>>n) c-=n,v>>=n;
}
return c-v;
}
As you'll see, you can save cycles on this by careful analysis of the assembler, but the strategy here is not a terrible one. The while loop will operate Lg[64]=6 times; each time it will convert the problem into one of counting the number of leading bits on an integer of half the size.
The if statement inside the while loop asks the question: "can i represent this integer in half as many bits", or analogously, "if i cut this in half, have i lost it?". After the if() payload completes, our number will always be in the lowest n bits.
At the final stage, v is either 0 or 1, and this completes the calculation correctly.
If you are dealing with unsigned integers, you could do this:
#include <math.h>
int numunset(uint64_t number)
{
int nbits = sizeof(uint64_t)*8;
if(number == 0)
return nbits;
int first_set = floor(log2(number));
return nbits - first_set - 1;
}
I don't know how it will compare in performance to the loop and count methods that have already been offered because log2() could be expensive.
Edit:
This could cause some problems with high-valued integers since the log2() function is casting to double and some numerical issues may arise. You could use the log2l() function that works with long double. A better solution would be to use an integer log2() function as in this question.
// clear all bits except the lowest set bit
x &= -x;
// if x==0, add 0, otherwise add x - 1.
// This sets all bits below the one set above to 1.
x += (-(x != 0)) & (x - 1);
return 64 - count_bits_set(x);
Where count_bits_set is the fastest version of counting bits you can find. See https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel for various bit counting techniques.
I'm not sure I understood the problem correctly. I think you have a 64bit value and want to find the number of leading zeros in it.
One way would be to find the most significant bit and simply subtract its position from 63 (assuming lowest bit is bit 0). You can find out the most significant bit by testing whether a bit is set from within a loop over all 64 bits.
Another way might be to use the (non-standard) __builtin_clz in gcc (__builtin_clzll for 64-bit values).
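A small sketch of the builtin route, assuming GCC or Clang. The 64-bit variant of the builtin is __builtin_clzll, and its result is undefined for an input of zero, so that case is handled separately. leading_zeros64 is a hypothetical wrapper name.

```c
#include <stdint.h>

/* Number of zero bits above the most significant set bit of v.
   Relies on the GCC/Clang builtin __builtin_clzll, which must not be
   called with 0, hence the explicit check. */
int leading_zeros64(uint64_t v) {
    if (v == 0)
        return 64;
    return __builtin_clzll(v);
}
```

This covers all 65 possible answers, from 0 (top bit set) to 64 (no bit set).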
I agree with the binary search idea. However two points are important here:
The range of valid answers to your question is from 0 to 64 inclusive. In other words - there may be 65 different answers to the question. I think (almost sure) all who posted the "binary search" solution missed this point, hence they'll get wrong answer for either zero or a number with the MSB bit on.
If speed is critical - you may want to avoid the loop. There's an elegant way to achieve this using templates.
The following template stuff finds the MSB correctly of any unsigned type variable.
// helper
template <int bits, typename T>
bool IsBitReached(T x)
{
const T cmp = T(1) << (bits ? (bits-1) : 0);
return (x >= cmp);
}
template <int bits, typename T>
int FindMsbInternal(T x)
{
if (!bits)
return 0;
int ret;
if (IsBitReached<bits>(x))
{
ret = bits;
x >>= bits;
} else
ret = 0;
return ret + FindMsbInternal<bits/2, T>(x);
}
// Main routine
template <typename T>
int FindMsb(T x)
{
const int bits = sizeof(T) * 8;
if (IsBitReached<bits>(x))
return bits;
return FindMsbInternal<bits/2>(x);
}
Here you go, pretty trivial to update as you need for other sizes...
int bits_left(unsigned long long value)
{
static unsigned long long mask = 0x8000000000000000;
int c = 64;
// doh
if (value == 0)
return c;
// check byte by byte to see what has been set
if (value & 0xFF00000000000000)
c = 0;
else if (value & 0x00FF000000000000)
c = 8;
else if (value & 0x0000FF0000000000)
c = 16;
else if (value & 0x000000FF00000000)
c = 24;
else if (value & 0x00000000FF000000)
c = 32;
else if (value & 0x0000000000FF0000)
c = 40;
else if (value & 0x000000000000FF00)
c = 48;
else if (value & 0x00000000000000FF)
c = 56;
// skip
value <<= c;
while(!(value & mask))
{
value <<= 1;
c++;
}
return c;
}
Same idea as user470379's, but counting down ...
Assume all 64 bits are unset. While value is larger than 0 keep shifting the value right and decrementing number of unset bits:
/* untested */
int countunsetbits(uint64_t val) {
int x = 64;
while (val) { x--; val >>= 1; }
return x;
}
Try
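#include <limits.h> // CHAR_BIT
int countBits(unsigned long long value)
{
int result = sizeof(value) * CHAR_BIT; // 64 where unsigned long long is 64 bits
while(value != 0) // unsigned, so the right shift always brings in zeros
{
--result;
value = value >> 1; // Remove bottom bits until all 1 are gone.
}
return result;
}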
Use log base 2 to get you the most significant digit which is 1.
log(2) = 1, meaning 0b10 -> 1
log(4) = 2, 5-7 => 2.xx, or 0b100 -> 2
log(8) = 3, 9-15 => 3.xx, 0b1000 -> 3
log(16) = 4 you get the idea
and so on...
The numbers in between become fractions of the log result. So typecasting the value to an int gives you the most significant digit.
Once you get this number, say b (the zero-based position of the most significant set bit), 64 - b - 1 is the answer. This assumes n is not zero.
function get_pos_msd(int n){
return int(log2(n))
}
last_zero = 64 - get_pos_msd(n) - 1

Need a way to pick a common bit in two bitmasks at random

Imagine two bitmasks, I'll just use 8 bits for simplicity:
01101010
10111011
The 2nd, 4th, and 6th bits (counting from the right) are 1 in both masks. I want to pick one of those common "on" bits at random. But I want to do this in O(1).
The only way I've found to do this so far is pick a random "on" bit in one, then check the other to see if it's also on, then repeat until I find a match. This is still O(n), and in my case the majority of the bits are off in both masks. I do of course & them together to initially check if there's any common bits at all.
Is there a way to do this? If so, I can increase the speed of my function by around 6%. I'm using C# if that matters. Thanks!
Mike
If you are willing to have an O(lg n) solution, at the cost of a possibly nonuniform probability, recursively split in half, i.e. AND with a mask that has the top half of the bits set and with one that has the bottom half set. If both results are nonzero then choose one randomly, else choose the nonzero one. Then split what remains in half, etc. This will take 10 comparisons for a 32 bit number, maybe not as few as you would like, but better than 32.
You can save a few ands by choosing to and with the high half or low half at random, and if there are no hits taking the other half, and if there are hits taking the half tested.
The random number only needs to be generated once, as you are only using one bit at each test, just shift the used bit out when you are done with it.
If you have lots of bits, this will be more efficient. I do not see how you can get this down to O(1) though.
For example, if you have a 32 bit number, first AND the combined mask with either 0xffff0000 or 0x0000ffff. If the result is nonzero (say you ANDed with 0xffff0000), continue with 0xff000000 or 0x00ff0000, and so on until you get down to one bit. This ends up being a lot of tedious code. 32 bits takes 5 layers of code.
Do you want a uniform random distribution? If so, I don't see any good way around counting the bits and then selecting one at random, or selecting random bits until you hit one that is set.
If you don't care about uniform, you can select a set bit out of a word randomly with:
unsigned int pick_random(unsigned int w, int size) {
int bitpos = rng() % size;
unsigned int mask = ~((1U << bitpos) - 1);
if (mask & w)
w &= mask;
return w - (w & (w-1));
}
where rng() is your random number generator, w is the word you want to pick from, and size is the relevant size of the word in bits (which may be the machine wordsize, or may be less as long as you don't set the upper bits of the word). Then, for your example, you use pick_random(0x6a & 0xbb, 8) or whatever values you like.
This function uniformly randomly selects one bit which is high in both masks. If there are no possible bits to pick, zero is returned instead. The running time is O(n), where n is the number of high bits in the anded masks. So if you have a low number of high bits in your masks, this function could be faster even though the worst case is O(n) which happens when all the bits are high. The implementation in C is as follows:
unsigned int randomMasksBit(unsigned a, unsigned b){
unsigned int i = a & b; // Calculate the bits which are high in both masks.
unsigned int count = 0;
unsigned int randomBit = 0;
while (i){ // Loop through all high bits.
count++;
// Randomly pick one bit from the bit stream uniformly, by selecting
// a random floating point number between 0 and 1 and checking if it
// is less than the probability needed for random selection.
if ((rand() / (double)RAND_MAX) < (1 / (double)count)) randomBit = i & -i;
i &= i - 1; // Move on to the next high bit.
}
return randomBit;
}
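The other uniform option mentioned earlier in this thread, counting the set bits and then selecting one of them, might be sketched as follows. Both function names are hypothetical, and the random index is left to the caller so the helpers stay deterministic.

```c
#include <stdint.h>

/* Number of set bits in w; each iteration clears the lowest set bit. */
unsigned count_set_bits(uint32_t w) {
    unsigned n = 0;
    while (w) {
        w &= w - 1;
        n++;
    }
    return n;
}

/* Mask containing only the k-th set bit of w (k = 0 is the lowest).
   Returns 0 when w has k or fewer set bits. */
uint32_t kth_set_bit(uint32_t w, unsigned k) {
    while (k-- && w)
        w &= w - 1;          /* drop the k lowest set bits */
    return w & (~w + 1u);    /* isolate the lowest remaining set bit */
}
```

A caller would compute common = a & b, check that it is nonzero, and then pass a random k in the range 0 to count_set_bits(common) - 1. Like the loop above, this is linear in the number of set bits, not O(1).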
O(1) with uniform distribution (or as uniform as random generator offers) can be done, depending on whether you count certain mathematical operation as O(1). As a rule we would, though in the case of bit-tweaking one might make a case that they are not.
The trick is that while it's easy enough to get the lowest set bit and to get the highest set bit, in order to have uniform distribution we need to randomly pick a partitioning point, and then randomly pick whether we'll go for the highest bit below it or the lowest bit above (trying the other approach if that returns zero).
I've broken this down a bit more than might be usual to allow the steps to be more easily followed. The only question on constant timing I can see is whether Math.Pow and Math.Log should be considered O(1).
Hence:
public static uint FindRandomSharedBit(uint x, uint y)
{//and two nums together, to find shared bits.
return FindRandomBit(x & y);
}
public static uint FindRandomBit(uint val)
{//if there's none, we can escape out quickly.
if(val == 0)
return 0;
Random rnd = new Random();
//pick a partition point. Note that Random.Next(1, 32) is in range 1 to 31
int maskPoint = rnd.Next(1, 32);
//pick which to try first.
bool tryLowFirst = rnd.Next(0, 2) == 1;
// will turn off all bits above our partition point.
uint lowerMask = Convert.ToUInt32(Math.Pow(2, maskPoint) - 1);
//will turn off all bits below our partition point
uint higherMask = ~lowerMask;
if(tryLowFirst)
{
uint lowRes = FindLowestBit(val & higherMask);
return lowRes != 0 ? lowRes : FindHighestBit(val & lowerMask);
}
uint hiRes = FindHighestBit(val & lowerMask);
return hiRes != 0 ? hiRes : FindLowestBit(val & higherMask);
}
public static uint FindLowestBit(uint masked)
{ //e.g 00100100
uint minusOne = masked - 1; //e.g. 00100011
uint xord = masked ^ minusOne; //e.g. 00000111
uint plusOne = xord + 1; //e.g. 00001000
return plusOne >> 1; //e.g. 00000100
}
public static uint FindHighestBit(uint masked)
{
double db = masked;
return (uint)Math.Pow(2, Math.Floor(Math.Log(masked, 2)));
}
I believe that, if you want uniform, then the answer will have to be Theta(n) in terms of the number of bits, if it has to work for all possible combinations.
The following C++ snippet (stolen) should be able to check if any given num is a power of 2.
if (!var || (var & (var - 1))) {
printf("%u is not power of 2\n", var);
}
else {
printf("%u is power of 2\n", var);
}
If you have few enough bits to worry about, you can get O(1) using a lookup table:
var lookup8bits = new int[256][] {
new int[] {},
new [] {0},
new [] {1},
new [] {0, 1},
...
new [] {0, 1, 2, 3, 4, 5, 6, 7}
};
Failing that, you can find the least significant set bit of a number x with (x & -x), assuming 2s complement. For example, if x = 46 = 101110b, then -x = 111...111010010b, hence x & -x = 10b (i.e. 2).
You can use this technique to enumerate the set bits of x in O(n) time, where n is the number of set bits in x.
Note that computing a pseudo random number is going to take you a lot longer than enumerating the set bits in x!
This can't be done in O(1), and any solution for a fixed number of N bits (unless it's totally really ridiculously stupid) will have a constant upper bound, for that N.

Bitwise operations

A) (Int32)X | ((Int32)Y << 16);
B) (Int32)X + (Int32)Y * (Int32)Int16.MaxValue;
Shouldn't both be equivalent? I know from testing that the first works as expected, but for some reason the second doesn't. Both X and Y are shorts (Int16), and the return type is an integer (Int32).
Shouldn't Y << 16 <=> Y * Int16.MaxValue?
To get the desired behaviour, you need to multiply with 0x10000 (i.e. UInt16.MaxValue+1). Int16.MaxValue is 0x7fff.
5 << 16
327680
5 * 0x10000
327680
Compare to the decimal system: If you want to "shift" the number 5 to 500, you need to multiply with 100, not 99 :-)
There are 2 problems with your second approach:
Int16 is signed, so the max value is actually only 15 bits.
The maximum value that can be represented by 16 bits is 2^16 - 1.
Left-shift by 16 bits = * 2^16
But:
Int16.MaxValue = 2^15-1
I think that you want an unsigned 16-bit max value + 1
Overlooking your MaxValue being one less than a power of two, and since you have a bigger problem to cover first:
The OR and SUM operations are not similar. When you are working with 32-bit integers and 16-bit shifts, there will be carries with your + operation and bit-wise OR'ing with the OR operation.
So, the two ways are quite different.
Then, of course, the MaxValue interpretation makes your two 'shift' attempts different. It should be (Y * UInt16.MaxValue + Y) or (Y * (UInt16.MaxValue + 1)).
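The points from these answers can be checked with a short C sketch. It assumes unsigned 16-bit halves, which sidesteps the sign extension that signed shorts would add; pack_or and pack_mul are made-up names.

```c
#include <stdint.h>

/* Variant A: place y in the upper 16 bits with a shift, then OR in x. */
uint32_t pack_or(uint16_t x, uint16_t y) {
    return (uint32_t)x | ((uint32_t)y << 16);
}

/* Variant B, corrected: multiply by 0x10000 (2^16), not by 0x7FFF. */
uint32_t pack_mul(uint16_t x, uint16_t y) {
    return (uint32_t)x + (uint32_t)y * 0x10000u;
}
```

With unsigned halves the two fields never overlap, so OR and addition give the same result here; multiplying by 0x7FFF instead would not reproduce the shift.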
