Thanks for taking a look at this question.
I saw the following piece of code inside a traditional for block, but I was not sure of its significance in this context.
index <<= 1;
For further context, here is the full block of code.
ulong index = 1;
int distance = 0;
for (int i = 0; i < 64; i++)
{
if ((hash1 & index) != (hash2 & index))
{
distance++;
}
index <<= 1;
}
Is it simply making sure that index is still 1, and if it isn't, resetting its value to 1?
Secondly, what is this called, so I can read up on it some more?
Finally, thank you for your time and consideration in this matter.
The code in question is spinning through a pair of 64-bit hashes (probably as ulongs, like the index), and checking how many bits differ between them. I'm going to use 4-bit values for example purposes, but the principle is the same.
if ((hash1 & index) != (hash2 & index))
The & operator is doing a bitwise-AND operation. When the hash is ANDed with the index value, you get either 0 or the index value back, depending on whether that specific bit was 0 or 1. (1010 & 0010 == 0010 and 1010 & 0100 == 0000).
If both ANDs produce a 0, or both produce the index value, then the two bits of the hash match. Otherwise, they don't, and we distance++; to indicate that they are off by one more bit than we knew of before.
index <<= 1;
This line merely bumps the index to the next bit. It does this by taking the old index (which starts as 1, equal to 0001), left-shifting it by one place (<< 1), and assigning the result back into the index variable (that's what <<= does, as opposed to <<). So after the first loop iteration, index will be 0010, then 0100, and so on.
This has the effect of multiplying by 2, but that's not its intended use here.
So overall, you'd get a distance of 2 by running 0011 and 1111 through this algorithm, because two bits are different.
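If it helps, here is a minimal sketch that wraps the question's loop in a helper method and checks the 0011 vs 1111 example. The method name HammingDistance is my own label (counting differing bits is known as the Hamming distance, which is the term to search for); it is not from the question.

static int HammingDistance(ulong hash1, ulong hash2)
{
    ulong index = 1;
    int distance = 0;
    for (int i = 0; i < 64; i++)
    {
        if ((hash1 & index) != (hash2 & index))
        {
            distance++;
        }
        index <<= 1; // move the probe to the next bit
    }
    return distance;
}

// Usage: 0011 and 1111 differ in two bit positions.
Console.WriteLine(HammingDistance(0b0011, 0b1111)); // prints 2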
The code
index <<= 1;
Is a left shift by one bit. It has the same effect in this case as multiplying by two. But see comments for cautions.
Related
For example, is every 4th bit set?
1000.1000 true
1010.1000 true
0010.1000 false
with offset of 1
0100.0100 true
0101.0100 true
0001.0100 false
Currently I am doing this by looping through every 4 bits
int num = 170; //1010.1010
int N = 4;
int offset = 0; //[0, N-1]
int intervals = 32 / N; //number of N-bit blocks in an int
bool everyNth = true;
for (int i = 0; i < intervals; i++){
    if(((num >> (N*i)) & ((1 << (N - 1)) >> offset)) == 0){
        everyNth = false;
        break;
    }
}
return everyNth;
EXPLANATION OF CODE:
num = 1010.1010
The loop lets me look at each block of 4 bits by right-shifting in multiples of 4.
num >> 4 = 0000.1010
Then an & isolates a specific bit of each chunk, which can be offset. The mask for that bit is created by ((1 << (N - 1)) >> offset)
0000.1010
1000 (mask >> offset0)
OR 0100 (mask >> offset1)
OR 0010 (mask >> offset2)
OR 0001 (mask >> offset3)
Is there a purely computational way to do this? Like how you can XOR your way through to figure out parity. I am working with 64 bit integers for my case, but I am wondering this in a more general case.
Additionally, I am under the assumption that bit operators are one of the fastest methods for calculations or math in general. If this is not true, please feel free to correct me on what the time and place is for bit operators.
If we had a mask M in which every Nth bit is set, then testing whether every Nth bit in a given integer x is set could be calculated as (x & M) == M. Or with offset, you could use ((x << offset) & M) == M. Shifting M right is fine too.
If N is constant, that's all there is to it, just use the right M.
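For instance, a minimal sketch for the constant case N = 4 (using the question's offset-0 convention, where the high bit of each nibble is tested; the helper name is mine, not from the answer) could be:

// Every 4th bit set: the high bit of each nibble across a 64-bit value.
const ulong M = 0x8888888888888888UL;

static bool EveryFourthBitSet(ulong x, int offset = 0)
{
    // Shifting x left lines the tested bits up with the mask;
    // shifting M right by offset would work just as well.
    return ((x << offset) & M) == M;
}

// EveryFourthBitSet(0xFFFFFFFFFFFFFFFFUL) -> true
// EveryFourthBitSet(0x8888888888888888UL) -> true
// EveryFourthBitSet(0x8888888888888880UL) -> false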
If N is variable, the question becomes, how do we get a mask in which every Nth bit is set.
Here is a simple way to do that:
Start by setting the Nth bit
"Double" the mask until done
For example,
ulong M = 1UL << (N - 1);
do
{
M |= M << N;
N += N;
} while (N < 64);
That is clearly still a loop. But it's not a bit-by-bit loop, it makes only a logarithmic number of iterations.
You could precompute the masks and store them in a small array; the range of N is necessarily small.
There may also be a way based on ulong.MaxValue / ((1UL << N) - 1) but that needs something more to "align" the mask and 64-bit division is not so great anyway. Perhaps there is a smarter way to get the mask.
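Putting the pieces together, a rough sketch for variable N might look like this (BuildMask and EveryNthBitSet are illustrative names of my own; it assumes 1 <= N <= 64):

// Builds a mask with every Nth bit set (bits N-1, 2N-1, 3N-1, ...).
static ulong BuildMask(int N)
{
    ulong M = 1UL << (N - 1);
    do
    {
        M |= M << N; // "double" the mask
        N += N;
    } while (N < 64);
    return M;
}

// Tests whether every Nth bit of x is set, using the offset convention from the question.
static bool EveryNthBitSet(ulong x, int N, int offset = 0)
{
    ulong M = BuildMask(N);
    return ((x << offset) & M) == M;
}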
I am under the assumption that bit operators are one of the fastest methods for calculations or math in general
Bitwise operations are some of the fastest operations, but addition is equally fast, and multiplication is not that far behind (and a multiplication can do a lot more work at once, compared to how much more it costs).
How can I build a tree of possibilities from an integer array in C#? I need to generate every possible variant of the array when, at each step, one element is deleted from it.
For example, if we have an array of three integers [1,2,3], then the tree should look like this: tree view
I would approach this as a binary arithmetic problem:
static void Main(string[] args)
{
int[] arr = { 1, 2, 3 };
PickElements(0, arr);
}
static void PickElements<T>(int depth, T[] arr, int mask = -1)
{
int bits = Math.Min(32, arr.Length);
// keep just the bits from mask that are represented in arr
mask &= ~(-1 << bits);
if (mask == 0) return;
// UI: write the options
for (int i = 0; i < depth; i++ )
Console.Write('>'); // indent to depth
for (int i = 0; i < arr.Length; i++)
{
if ((mask & (1 << i)) != 0)
{
Console.Write(' ');
Console.Write(arr[i]);
}
}
Console.WriteLine();
// recurse, taking away one bit (naive and basic bit sweep)
for (int i = 0; i < bits; i++)
{
// try and subtract each bit separately; if it
// is different, recurse
var childMask = mask & ~(1 << i);
if (childMask != mask) PickElements(depth + 1, arr, childMask);
}
}
For a TreeView, simply replace the Console.Write etc with node creation, presumably passing the parent node in (and down) as part of the recursion (in place of depth, perhaps).
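As a hypothetical sketch of that adaptation (assuming WinForms, i.e. using System.Windows.Forms; the signature and names here are illustrative, not from the answer), you could pass the parent TreeNodeCollection down instead of the depth:

static void PickElements<T>(TreeNodeCollection parent, T[] arr, int mask = -1)
{
    int bits = Math.Min(32, arr.Length);
    mask &= ~(-1 << bits);
    if (mask == 0) return;
    // build the node label from the elements still present in the mask
    var label = new System.Text.StringBuilder();
    for (int i = 0; i < arr.Length; i++)
    {
        if ((mask & (1 << i)) != 0) label.Append(arr[i]).Append(' ');
    }
    TreeNode node = parent.Add(label.ToString().TrimEnd());
    // recurse, taking away one bit, attaching children to this node
    for (int i = 0; i < bits; i++)
    {
        var childMask = mask & ~(1 << i);
        if (childMask != mask) PickElements(node.Nodes, arr, childMask);
    }
}

// Usage, for example: PickElements(myTreeView.Nodes, new[] { 1, 2, 3 });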
To see what this is doing, consider the binary; -1 is:
11111111111111...111111111111111
we then look at bits, which we derive from the array length, and find to be 3 in this example. We only need to look at 3 bits, then; the line:
~(-1 << bits)
computes a mask for this, because:
-1 = 1111111....1111111111111
(-1 << 3) = 1111111....1111111111000 (left-shift back-fills with 0)
~(-1 << 3) = 0000000....0000000000111 (binary inverse)
we then apply this to our input mask, so we're only ever looking at the least significant 3 bits, via mask &= .... If that turns out to be zero, we've run out of things to do, so stop recursing.
The UI update is simple enough; we just scan over the 3 bits that we care about, checking whether the current bit is "on" for our mask; 1 << i creates a mask with just the "i-th set bit"; the & and != 0 checks whether that bit is set. If it is, we include the element in the output.
Finally, we need to start taking away bits, to look at the sub-tree; we could probably be more sophisticated about this, but I chose just to scan all the bits and try them - worst case this is 32 bit tests per level, which is nothing. As before, 1 << i creates a mask of just the "i-th set bit". This time we want to disable that bit, so we "negate" and "and" via mask & ~(...). It is possible that this bit was already disabled, so the childMask != mask check ensures we only actually recurse when we have disabled a bit that was previously enabled.
The end result is that we end up with the masks being successively:
11..1111111111111111 (special case for first call; all set)
110 (first bit disabled)
100 (first and second bits disabled)
010 (first and third bits disabled)
101 (second bit disabled)
100 (second and first bits disabled)
001 (second and third bits disabled)
011 (third bit disabled)
010 (third and first bits disabled)
001 (third and second bits disabled)
Note that for a simpler combination example, it would be possible to just iterate in a single for, using the bits to pick elements; however, I've done it a recursive way because we need to build a tree of successive subtractions, rather than just flat possibilities in no particular order.
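For reference, running the Main above with { 1, 2, 3 } should print output along these lines (each '>' marks one level of depth, and each surviving element is preceded by a space):

 1 2 3
> 2 3
>> 3
>> 2
> 1 3
>> 3
>> 1
> 1 2
>> 2
>> 1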
Recently I had to identify whether a number is odd or even for a large number of integers. I thought of an idea to identify a number as odd or even by AND-ing it against 1 and comparing the result to 1
(x & 1) == 1 // true when x is odd
I have never seen this implementation in practice. The most common way you always see is:
x % 2 == 0
I decided to do some performance check on both methods and the binary method seems slightly faster on my machine.
int size = 60000000;
List<int> numberList = new List<int>();
Random rnd = new Random();
for (int index = 0; index < size; index++)
{
numberList.Add(rnd.Next(size));
}
DateTime start;
bool even;
// regular mod
start = DateTime.Now;
for (int index = 0; index < size; index++)
{
even = (numberList[index] % 2 == 0);
}
Console.WriteLine("Regualr mod : {0}", DateTime.Now.Subtract(start).Ticks);
// binary
start = DateTime.Now;
for (int index = 0; index < size; index++)
{
even = ((numberList[index] & 1) != 1);
}
Console.WriteLine("Binary operation: {0}", DateTime.Now.Subtract(start).Ticks);
Console.ReadKey();
Has anyone seen the binary method implemented ? Any drawbacks ?
Well, yes, it is a slight optimization. This code snippet:
uint ix = 3; // uint.Parse(Console.ReadLine());
bool even = ix % 2 == 0;
generates this machine code in the release build:
uint ix = 3;
0000003c mov dword ptr [ebp-40h],3
bool even = ix % 2 == 0;
00000043 mov eax,dword ptr [ebp-40h]
00000046 and eax,1
00000049 test eax,eax
0000004b sete al
0000004e movzx eax,al
00000051 mov dword ptr [ebp-44h],eax
Do note that the JIT compiler is smart enough to use the AND processor instruction. It is not doing a division as the % operator would normally perform. Kudos there.
But your custom test generates this code:
uint ix = uint.Parse(Console.ReadLine());
// Bunch of machine code
bool even = (ix & 1) == 0;
00000024 test eax,1
00000029 sete al
0000002c movzx eax,al
0000002f mov esi,eax
I had to alter the assignment statement because the JIT compiler suddenly got smart and evaluated the expression at compile time. The code is very similar, but the AND instruction got replaced by a TEST instruction, saving one instruction in the process. Fairly ironic that this time it chose not to use an AND :)
These are the traps of making assumptions. Your original instinct was right, however; it ought to save about half a nanosecond. Very hard to see that back unless this code lives in a very tight loop. It gets drastically different when you change the variable from uint to int; the JIT compiler then generates code that tries to be smart about the sign bit. Unnecessarily.
For such operations you should prefer the more readable approach (in my opinion the modulo-way) over the one that is thought to be faster.
Moreover, the modulo operation above can be optimized by the compiler into the bitwise-and operation. Therefore, you actually don't need to care.
Note on your example: to get more precise results, consider passing the number of items into the list's constructor. This way you avoid discrepancies introduced by repeated reallocation of the backing array. For 60 million integer items (approx. 240 MB of memory), not preallocating the memory can introduce a significant bias.
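For example, a single change to the setup from the question would do it:

// Preallocate the backing array so the fill loop doesn't trigger repeated reallocations.
List<int> numberList = new List<int>(size);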
Bitwise and will beat modulo division every day of the week. Division by an arbitrary number takes a lot of clock cycles, whereas bitwise and is an essential primitive op that almost always completes in 1 clock cycle, regardless of your CPU architecture.
What you may be seeing, though, is that the compiler may be replacing x mod 2 with a bit shift or bit mask instruction which will have identical performance to your own bit mask operation.
To confirm that the compiler is playing tricks with your code, compare the performance of x mod 2 with x mod 7 or any other non-base 2 integer. Or obscure the operands from the compiler so that it cannot perform the optimization:
var y = 2;
result = x % y;
If you see a dramatic difference in execution time with these changes, then that's a pretty strong indicator that the compiler is treating x mod 2 as a special case and not using actual division to find the remainder.
And if you're going to use DateTime to benchmark single-instruction operations, make sure you have a long enough loop that the test runs at least 5 minutes or so to get your true measurement above the noise floor.
This webpage benchmarks at least half a dozen ways to determine whether a number is odd or even.
The fastest was (which I like for easy readability):
if (x % 2 == 0)
//even number
else
//odd number
Here were others tested (code is here). I'm actually surprised the bitwise and bit shifting operations didn't perform the best:
//bitwise
if ((x & 1) == 0)
//even number
else
//odd number
long outvalue;
System.Math.DivRem((long)x, (long)2, out outvalue);
if ( outvalue == 0)
//even number
else
//odd number
if (((x / 2) * 2) == x)
//even number
else
//odd number
//bit shifting
if (((x >> 1) << 1) == x)
//even number
else
//odd number
index = NumberOfNumbers;
while (index > 1)
index -= 2;
if (index == 0)
//even number
else
//odd number
tempstr = x.ToString();
index = tempstr.Length - 1;
//this assumes base 10
if (tempstr[index] == '0' || tempstr[index] == '2' || tempstr[index] == '4' || tempstr[index] == '6' || tempstr[index] == '8')
//even number
else
//odd number
Wouldn't the binary method be faster because the compiler is able to optimize this into a bitshift rather than actually forcing the cpu to perform the division calculation?
I agree with the other answers, that you should use the modulo check, because it best conveys intent.
However, for your specific results, try actually using the even variable. It can make a significant difference, because the compiler may optimize away some of the calculations when it knows the value is never used.
Using your program (modified to use Stopwatch), I get 70 ms for the regular mod and 88 ms for the binary operation. If I actually use the even variable, the difference is much smaller (327 vs 316 ms), and the modulo is fastest.
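A rough sketch of that kind of measurement (based on the question's setup, with the same size and numberList; the key differences are Stopwatch instead of DateTime, and actually consuming the result so the work cannot be optimized away):

var sw = System.Diagnostics.Stopwatch.StartNew();
int evenCount = 0; // consume the result
for (int index = 0; index < size; index++)
{
    if (numberList[index] % 2 == 0) evenCount++;
}
sw.Stop();
Console.WriteLine("Regular mod: {0} ms ({1} even)", sw.ElapsedMilliseconds, evenCount);

sw = System.Diagnostics.Stopwatch.StartNew();
evenCount = 0;
for (int index = 0; index < size; index++)
{
    if ((numberList[index] & 1) == 0) evenCount++;
}
sw.Stop();
Console.WriteLine("Binary operation: {0} ms ({1} even)", sw.ElapsedMilliseconds, evenCount);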
For unsigned numbers, many compilers will optimize the 'mod' operator as an 'and' test. For signed numbers, (x % 2) will be 1 if the number is odd and positive; -1 if it's odd and negative; even though both +1 and -1 are non-zero, they may not get recognized as equivalent.
BTW, when using the "and" operator, I would test for !=0 rather than ==1. Compilers may recognize the equivalence, but they may not.
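A small illustration of why != 0 is the safer test in C#:

int x = -3;
Console.WriteLine(x % 2);        // -1 in C#, not 1
Console.WriteLine(x % 2 == 1);   // False, even though x is odd
Console.WriteLine((x & 1) != 0); // True
Console.WriteLine(x % 2 != 0);   // True as well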
I have 1 bit in a byte (always in the lowest order position) that I'd like to invert.
ie given 00000001 I'd like to get 00000000 and with 00000000 I'd like 00000001.
I solved it like this:
bit = bit > 0 ? 0 : 1;
I'm curious to see how else it could be done.
How about:
bit ^= 1;
This simply XOR's the first bit with 1, which toggles it.
If you want to flip bit #N, counting from 0 on the right towards 7 on the left (for a byte), you can use this expression:
bit ^= (1 << N);
This won't disturb any other bits, but if the value is only ever going to be 0 or 1 in decimal value (ie. all other bits are 0), then the following can be used as well:
bit = 1 - bit;
Again, if there is only ever going to be one bit set, you can use the same subtraction trick to flip bit #N:
bit = (1 << N) - bit;
Of course, at that point you're not actually doing bit-manipulation in the same sense.
The expression you have is fine as well, but again will manipulate the entire value.
Also, if you had expressed a single bit as a bool value, you could do this:
bit = !bit;
Which toggles the value.
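To illustrate the options above (a minimal sketch; the values are arbitrary):

byte value = 0b0000_0101;
value ^= 1;        // toggles the lowest bit -> 0b0000_0100
value ^= 1 << 3;   // toggles bit #3         -> 0b0000_1100
bool flag = true;
flag = !flag;      // the bool equivalent    -> false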
More of a joke:
Of course, the "enterprisey" way would be to use a lookup table:
byte[] bitTranslations = new byte[256];
bitTranslations[0] = 1;
bitTranslations[1] = 0;
bit = bitTranslations[bit];
Your solution isn't correct, because if bit == 2 (10), then your assignment will yield bit == 0 (00), whereas toggling only the lowest bit should yield 3 (11).
This is what you want:
bit ^= 1;
What does the CreateMask() function of BitVector32 do?
I did not get what a mask is.
I would like to understand the following lines of code. Does CreateMask just set a bit to true?
// Creates and initializes a BitVector32 with all bit flags set to FALSE.
BitVector32 myBV = new BitVector32( 0 );
// Creates masks to isolate each of the first five bit flags.
int myBit1 = BitVector32.CreateMask();
int myBit2 = BitVector32.CreateMask( myBit1 );
int myBit3 = BitVector32.CreateMask( myBit2 );
int myBit4 = BitVector32.CreateMask( myBit3 );
int myBit5 = BitVector32.CreateMask( myBit4 );
// Sets the alternating bits to TRUE.
Console.WriteLine( "Setting alternating bits to TRUE:" );
Console.WriteLine( " Initial: {0}", myBV.ToString() );
myBV[myBit1] = true;
Console.WriteLine( " myBit1 = TRUE: {0}", myBV.ToString() );
myBV[myBit3] = true;
Console.WriteLine( " myBit3 = TRUE: {0}", myBV.ToString() );
myBV[myBit5] = true;
Console.WriteLine( " myBit5 = TRUE: {0}", myBV.ToString() );
What is the practical application of this?
It returns a mask which you can use to retrieve the bit you are interested in more easily.
You might want to check out Wikipedia for what a mask is.
In short: a mask is a pattern in form of array of 1s for the bits that you are interested in and 0s for the others.
If you have something like 01010 and you are interested in getting the last 3 bits, your mask would look like 00111. Then, when you perform a bitwise AND on 01010 and 00111 you will get the last three bits (00010), since AND only is 1 if both bits are set, and none of the bits beside the first three are set in the mask.
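In C#, that operation might look like this (a small sketch with made-up variable names):

int value = 0b01010;
int mask  = 0b00111;      // 1s for the bits we care about
int low3  = value & mask; // 0b00010 -- only the last three bits survive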
An example might be easier to understand:
BitVector32.CreateMask() => 1 (binary 1)
BitVector32.CreateMask(1) => 2 (binary 10)
BitVector32.CreateMask(2) => 4 (binary 100)
BitVector32.CreateMask(4) => 8 (binary 1000)
CreateMask(int) returns the given number multiplied by 2.
NOTE: The first bit is the least significant bit, i.e. the bit farthest to the right.
BitVector32.CreateMask() is a substitution for the left shift operator (<<), which in most cases results in multiplication by 2 (left shift is not circular, so you may start losing digits; more is explained here)
BitVector32 vector = new BitVector32();
int bit1 = BitVector32.CreateMask();
int bit2 = BitVector32.CreateMask(bit1);
int bit3 = 1 << 2;
int bit5 = 1 << 4;
Console.WriteLine(vector.ToString());
vector[bit1 | bit2 | bit3 | bit5] = true;
Console.WriteLine(vector.ToString());
Output:
BitVector32{00000000000000000000000000000000}
BitVector32{00000000000000000000000000010111}
Check this other post.
And also, CreateMask does not return the given number multiplied by 2.
CreateMask creates a bit-mask based on a specific position in the 32-bit word (that's the parameter that you are passing), which is generally a power of two (2^x) when you are talking about a single bit (flag).
I stumbled upon this question trying to find out what CreateMask does exactly. I did not feel the current answers answered the question for me. After some reading and experimenting, I would like to share my findings:
Basically what Maksymilian says is almost correct: "BitVector32.CreateMask is a substitution for the left shift operator (<<), which in most cases results in multiplication by 2".
Because << is a binary operator and CreateMask only takes one argument, I would like to add that BitVector32.CreateMask(x) is equivalent to x << 1.
Border cases
However, BitVector32.CreateMask(x) is not equivalent to x << 1 for two border cases:
BitVector32.CreateMask(int.MinValue):
An InvalidOperationException will be thrown. int.MinValue corresponds to 10000000000000000000000000000000. This seems a bit odd, especially considering every other value with a 1 as the leftmost bit (i.e. negative numbers) works fine. In contrast: int.MinValue << 1 would not throw an exception and would just return 0.
When you call BitVector32.CreateMask(0) (or BitVector32.CreateMask()). This will return 1
(i.e. 00000000000000000000000000000000 becomes 00000000000000000000000000000001),
whereas 0 << 1 would just return 0.
Multiplication by 2
CreateMask almost always is equivalent to multiplication by 2. Other than the above two special cases, it differs when the second bit from the left is different from the leftmost bit. An int is signed, so the leftmost bit indicates the sign. In that scenario the sign is flipped. E.g. CreateMask(-1) (or 11111111111111111111111111111111) results in -2 (or 11111111111111111111111111111110), but CreateMask(int.MaxValue) (or 01111111111111111111111111111111) also results in -2.
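A minimal sketch to see these values for yourself (assumes using System.Collections.Specialized; the expected results are the ones described above):

Console.WriteLine(BitVector32.CreateMask());             // 1
Console.WriteLine(BitVector32.CreateMask(1));            // 2
Console.WriteLine(BitVector32.CreateMask(-1));           // -2
Console.WriteLine(BitVector32.CreateMask(int.MaxValue)); // -2
// BitVector32.CreateMask(int.MinValue) throws InvalidOperationException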
Anyway, you probably shouldn't use it for this purpose. As I understand, when you use a BitVector32, you really should only consider it a sequence of 32 bits. The fact that they use ints in combination with the BitVector32 is probably just because it's convenient.
When is CreateMask useful?
I honestly don't know. It seems from the documentation and the name "previous" of the argument of the function that they intended it to be used in some kind of sequence: "Use CreateMask() to create the first mask in a series and CreateMask(int) for all subsequent masks.".
However, in the code example, they use it to create the masks for the first 5 bits, to subsequently do some operations on those bits. I cannot imagine they expect you to write 32 calls in a row to CreateMask to be able to do some stuff with the bits near the left.