Modify specific bit in byte - c#

I need to modify (!not toggle XOR!) specific bit in byte value. I have:
source byte (e.g. b11010010);
index of bit to modify (e.g. 4);
new value of bit (0 or 1).
Now, what I need. If new value is 0, then bit[4] must be set to 0. If new value is 1, then bit[4] must be set to 1.
General part:
var bitIndex = 4;
var byte = b11010010;
var mask = 1 << bitIndex;
var newValue = 1;
This is the easiest way to do this:
if(newValue == 1)
byte |= mask; // set bit[bitIndex]
else
byte &= ~mask; // drop bit[bitIndex]
Another way allows to do this without if else statement, but look to hard to understand:
byte = byte & ~mask | (newValue << bitIndex) & mask
Here, first AND drops bit[bitIndex], second AND calculates new value for bit[bitIndex], and OR set bit[bitIndex] to calculated value, not matter is it 0 or 1.
Is there any easier way to set specific bit into given value?

(newValue << bitIndex) only has a single bit set, there's no need for & mask.
So you have just 5 operations.
byte = byte & ~(1 << bitIndex) | (newValue << bitIndex); // bitIndex'th bit becomes newValue
It's still complex enough to be worth a comment, but easy to see that the comment is correct because it's two easily recognized operations chained together (unlike the current accepted answer, which requires every reader to sit down and think about it for a minute)

The canonical way to do this is:
byte ^= (-newValue ^ byte) & (1 << n);
Bit number n will be set if newValue == 1, and cleared if newValue == 0

Related

Non looping way to check if every Nth bit is set, with or without an offset?

For example, is every 4th bit set.
1000.1000 true
1010.1000 true
0010.1000 false
with offset of 1
0100.0100 true
0101.0100 true
0001.0100 false
Currently I am doing this by looping through every 4 bits
int num = 170; //1010.1010
int N = 4;
int offset = 0; //[0, N-1]
bool everyNth = true;
for (int i = 0; i < intervals ; i++){
if(((num >> (N*i)) & ((1 << (N - 1)) >> offset)) == 0){
every4th = false;
break;
}
}
return everyNth;
EXPLANATION OF CODE:
num = 1010.1010
The loop makes it so I look at each 4 bits as a block by right shifting * 4.
num >> 4 = 0000.1010
Then an & for a specific bit that can be offset.
And to only look at a specific bit of the chunk, a mask is created by ((1 << (N - 1)) >> offset)
0000.1010
1000 (mask >> offset0)
OR 0100 (mask >> offset1)
OR 0010 (mask >> offset2)
OR 0001 (mask >> offset3)
Is there a purely computational way to do this? Like how you can XOR your way through to figure out parity. I am working with 64 bit integers for my case, but I am wondering this in a more general case.
Additionally, I am under the assumption that bit operators are one of the fastest methods for calculations or math in general. If this is not true, please feel free to correct me on what the time and place is for bit operators.
If we had a mask M in which every Nth bit is set, then testing whether every Nth bit in a given integer x is set could be calculated as (x & M) == M. Or with offset, you could use ((x << offset) & M) == M. Shifting M right is fine too.
If N is constant, that's all there is to it, just use the right M.
If N is variable, the question becomes, how do we get a mask in which every Nth bit is set.
Here is a simple way to do that:
Start by setting the Nth bit
"Double" the mask until done
For example,
ulong M = 1UL << (N - 1);
do
{
M |= M << N;
N += N;
} while (N < 64);
That is clearly still a loop. But it's not a bit-by-bit loop, it makes only a logarithmic number of iterations.
You could precompute the masks and store them in a small array, the range of N is necessarily small.
There may also be a way based on ulong.MaxValue / ((1UL << N) - 1) but that needs something more to "align" the mask and 64-bit division is not so great anyway. Perhaps there is a smarter way to get the mask.
I am under the assumption that bit operators are one of the fastest methods for calculations or math in general
Bitwise operations are some of the fastest operations, but addition is equally fast, and multiplication is not that far behind (and a multiplication can do a lot more work at once, compared to how much more it costs).

Need help writing a binary reader extension method for a specific value format: 6 bit then 7 bits structure

Alright so here goes.
I currently need to write an extension method for the System.IO.BinaryReader class that is capable of reading a specific format.
I have no idea what this format is called but I do know exactly how it works so i will describe it below.
Each byte that makes up the value is flagged to indicate how the reader will need to behave next.
The first byte has 2 flags, and any subsequent bytes for the value have only 1 flag.
First byte:
01000111
^^^^^^^^
|||____|_ 6 bit value
||_______ flag: next byte required
|________ flag: signed value
Next bytes:
00000011
^^^^^^^^
||_____|_ 7 bit value
|________ flag: next byte required
The first byte in the value has 2 flags, the first bit is if the value is positive or negative.
The second bit is if another byte needs to be read.
The 6 remaining bits is the value so far which will need to be kept for later.
If no more bytes need to be read then you just return the 6 bit value with the right sign as dictated by the first bit flag.
If another byte needs to be read then you read the first bit of that byte, and that will indicate if another byte needs to be read.
The remaining 7 bits are the value here.
That value will need to be joined with the 6 bit value from the first byte.
So in the case of the example above:
The first value was this: 01000111.
Which means it is positive, another byte needs to be read, and the value so far is 000111.
Another byte is read and it is this: 00000011
Therefore no new bytes need to be read and value here is this: 0000011
That is joined onto the front of the value so far like so: 0000011000111
That is therefore the final value: 0000011000111 or 199
0100011100000011 turns into this: 0000011000111
Here is another example:
011001111000110100000001
^^^^^^^^^^^^^^^^^^^^^^^^
| || ||______|_____ Third Byte (1 flag)
| ||______|_____________ Second Byte (1 flag)
|______|_____________________ First Byte (2 flags)
First Byte:
0 - Positive
1 - Next Needed
100111 - Value
Second Byte:
1 - Next Needed
0001101 - Value
Third Byte:
0 - Next Not Needed
0000001 - Value
Value:
00000010001101100111 = 9063
Hopefully my explanation was clear :)
Now i need to be able to write a clear, simple and, and most importantly fast extension method for System.IO.BinaryReader to read such a value from a stream.
My attempts so far are kind of bad and unnecessarily complicated involving boolean arrays and bitarrays.
Therefore I could really do with somebody helping me out with this in writing such a method, that would be really appreciated!
Thanks for reading.
Based on the description in the comments I came up with this, unusually reading in signed bytes since it makes the continue flag slightly easier to check: (not tested)
static int ReadVLQInt32(this BinaryReader r)
{
sbyte b0 = r.ReadSByte();
// the first byte has 6 bits of the raw value
int shift = 6;
int raw = b0 & 0x3F;
// first continue flag is the second bit from the top, shift it into the sign
sbyte cont = (sbyte)(b0 << 1);
while (cont < 0)
{
sbyte b = r.ReadSByte();
// these bytes have 7 bits of the raw value
raw |= (b & 0x7F) << shift;
shift += 7;
// continue flag is already in the sign
cont = b;
}
return b0 < 0 ? -raw : raw;
}
It can easily be extended to read a long too, just make sure to use b & 0x7FL otherwise that value is shifted as an int and bits would get dropped.
Version that checks for illegal values (an overlong sequence of 0xFF, 0xFF... for example, plus works with checked math of C# (there is an option in the C# compiler to use cheched math to check for overflows)
public static int ReadVlqInt32(this BinaryReader r)
{
byte b = r.ReadByte();
// the first byte has 6 bits of the raw value
uint raw = (uint)(b & 0x3F);
bool negative = (b & 0x80) != 0;
// first continue flag is the second bit from the top, shift it into the sign
bool cont = (b & 0x40) != 0;
if (cont)
{
int shift = 6;
while (true)
{
b = r.ReadByte();
cont = (b & 0x80) != 0;
b &= 0x7F;
if (shift == 27)
{
if (negative)
{
// minumum value abs(int.MinValue)
if (b > 0x10 || (b == 0x10 && raw != 0))
{
throw new Exception();
}
}
else
{
// maximum value int.MaxValue
if (b > 0xF)
{
throw new Exception();
}
}
}
// these bytes have 7 bits of the raw value
raw |= ((uint)b) << shift;
if (!cont)
{
break;
}
if (shift == 27)
{
throw new Exception();
}
shift += 7;
}
}
// We use unchecked here to handle int.MinValue
return negative ? unchecked(-(int)raw) : (int)raw;
}

How to convert an int to a byte array of size 3?

I need to convert an int to a byte array of size 3. This means dropping the last byte, for example:
var temp = BitConverter.GetBytes(myNum).Take(3).ToArray());
However, is there a better way to do is? Maybe by creating a custom struct?
EDIT
For this requirement I have a predefined max value of 16777215 for this new data type.
Something like this (no Linq, just getting bytes)
int value = 123;
byte[] result = new byte[] {
(byte) (value & 0xFF),
(byte) ((value >> 8) & 0xFF),
(byte) ((value >> 16) & 0xFF),
};
Sounds like you want to create a new struct that represents a 3 byte unsigned integer (based solely on the max value quoted).
Using your original method is very prone to failure, firstly, Take(3) is dependent on whether the system you're running on is big-endian or little-endian, secondly, it doesn't take into account what happens when you get passed a negative int which your new struct can't handle.
You will need to write the conversions yourself, I would take in the int as given, check if it's negative, check if it's bigger than 16777215, if it passes those checks then it's between 0 and 16777215 and you can store it in your new struct, simply execute a Where(b => b != 0) instead of Take(3) to get around the endian-ness problem. Obviously take into account the 0 case where all bytes = 0.

C#: How to concatenate bits to create an UInt64?

I'm trying to create a hashing function for images in order to find similar ones from a database.
The hash is simply a series of bits (101110010) where each bit stands for one pixel. As there are about 60 pixels for each image I assume it would be best to save this as an UInt64.
Now, when looping through each pixel and calculating each bit, how can I concatenate those and save them as a UInt64?
Thanks for you help.
Use some bit twiddling:
long mask = 0;
// For each bit that is set, given its position (0-63):
mask |= 1 << position;
You use bitwise operators like this:
ulong it1 = 0;
ubyte b1 = 0x24;
ubyte b2 = 0x36;
...
it1 = (b1 << 48) | (b2 << 40) | (b3 << 32) .. ;
Alternatively you can use the BitConvert.Uint64() function to quickly convert a byte array to int64. But are you sure the target is of 8bytes long?

Number of unset bit left of most significant set bit?

Assuming the 64bit integer 0x000000000000FFFF which would be represented as
00000000 00000000 00000000 00000000
00000000 00000000 >11111111 11111111
How do I find the amount of unset bits to the left of the most significant set bit (the one marked with >) ?
In straight C (long long are 64 bit on my setup), taken from similar Java implementations: (updated after a little more reading on Hamming weight)
A little more explanation: The top part just sets all bit to the right of the most significant 1, and then negates it. (i.e. all the 0's to the 'left' of the most significant 1 are now 1's and everything else is 0).
Then I used a Hamming Weight implementation to count the bits.
unsigned long long i = 0x0000000000000000LLU;
i |= i >> 1;
i |= i >> 2;
i |= i >> 4;
i |= i >> 8;
i |= i >> 16;
i |= i >> 32;
// Highest bit in input and all lower bits are now set. Invert to set the bits to count.
i=~i;
i -= (i >> 1) & 0x5555555555555555LLU; // each 2 bits now contains a count
i = (i & 0x3333333333333333LLU) + ((i >> 2) & 0x3333333333333333LLU); // each 4 bits now contains a count
i = (i + (i >> 4)) & 0x0f0f0f0f0f0f0f0fLLU; // each 8 bits now contains a count
i *= 0x0101010101010101LLU; // add each byte to all the bytes above it
i >>= 56; // the number of bits
printf("Leading 0's = %lld\n", i);
I'd be curious to see how this was efficiency wise. Tested it with several values though and it seems to work.
Based on: http://www.hackersdelight.org/HDcode/nlz.c.txt
template<typename T> int clz(T v) {int n=sizeof(T)*8;int c=n;while (n){n>>=1;if (v>>n) c-=n,v>>=n;}return c-v;}
If you'd like a version that allows you to keep your lunch down, here you go:
int clz(uint64_t v) {
int n=64,c=64;
while (n) {
n>>=1;
if (v>>n) c-=n,v>>=n;
}
return c-v;
}
As you'll see, you can save cycles on this by careful analysis of the assembler, but the strategy here is not a terrible one. The while loop will operate Lg[64]=6 times; each time it will convert the problem into one of counting the number of leading bits on an integer of half the size.
The if statement inside the while loop asks the question: "can i represent this integer in half as many bits", or analogously, "if i cut this in half, have i lost it?". After the if() payload completes, our number will always be in the lowest n bits.
At the final stage, v is either 0 or 1, and this completes the calculation correctly.
If you are dealing with unsigned integers, you could do this:
#include <math.h>
int numunset(uint64_t number)
{
int nbits = sizeof(uint64_t)*8;
if(number == 0)
return nbits;
int first_set = floor(log2(number));
return nbits - first_set - 1;
}
I don't know how it will compare in performance to the loop and count methods that have already been offered because log2() could be expensive.
Edit:
This could cause some problems with high-valued integers since the log2() function is casting to double and some numerical issues may arise. You could use the log2l() function that works with long double. A better solution would be to use an integer log2() function as in this question.
// clear all bits except the lowest set bit
x &= -x;
// if x==0, add 0, otherwise add x - 1.
// This sets all bits below the one set above to 1.
x+= (-(x==0))&(x - 1);
return 64 - count_bits_set(x);
Where count_bits_set is the fastest version of counting bits you can find. See https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel for various bit counting techniques.
I'm not sure I understood the problem correctly. I think you have a 64bit value and want to find the number of leading zeros in it.
One way would be to find the most significant bit and simply subtract its position from 63 (assuming lowest bit is bit 0). You can find out the most significant bit by testing whether a bit is set from within a loop over all 64 bits.
Another way might be to use the (non-standard) __builtin_clz in gcc.
I agree with the binary search idea. However two points are important here:
The range of valid answers to your question is from 0 to 64 inclusive. In other words - there may be 65 different answers to the question. I think (almost sure) all who posted the "binary search" solution missed this point, hence they'll get wrong answer for either zero or a number with the MSB bit on.
If speed is critical - you may want to avoid the loop. There's an elegant way to achieve this using templates.
The following template stuff finds the MSB correctly of any unsigned type variable.
// helper
template <int bits, typename T>
bool IsBitReached(T x)
{
const T cmp = T(1) << (bits ? (bits-1) : 0);
return (x >= cmp);
}
template <int bits, typename T>
int FindMsbInternal(T x)
{
if (!bits)
return 0;
int ret;
if (IsBitReached<bits>(x))
{
ret = bits;
x >>= bits;
} else
ret = 0;
return ret + FindMsbInternal<bits/2, T>(x);
}
// Main routine
template <typename T>
int FindMsb(T x)
{
const int bits = sizeof(T) * 8;
if (IsBitReached<bits>(x))
return bits;
return FindMsbInternal<bits/2>(x);
}
Here you go, pretty trivial to update as you need for other sizes...
int bits_left(unsigned long long value)
{
static unsigned long long mask = 0x8000000000000000;
int c = 64;
// doh
if (value == 0)
return c;
// check byte by byte to see what has been set
if (value & 0xFF00000000000000)
c = 0;
else if (value & 0x00FF000000000000)
c = 8;
else if (value & 0x0000FF0000000000)
c = 16;
else if (value & 0x000000FF00000000)
c = 24;
else if (value & 0x00000000FF000000)
c = 32;
else if (value & 0x0000000000FF0000)
c = 40;
else if (value & 0x000000000000FF00)
c = 48;
else if (value & 0x00000000000000FF)
c = 56;
// skip
value <<= c;
while(!(value & mask))
{
value <<= 1;
c++;
}
return c;
}
Same idea as user470379's, but counting down ...
Assume all 64 bits are unset. While value is larger than 0 keep shifting the value right and decrementing number of unset bits:
/* untested */
int countunsetbits(uint64_t val) {
int x = 64;
while (val) { x--; val >>= 1; }
return x;
}
Try
int countBits(int value)
{
int result = sizeof(value) * CHAR_BITS; // should be 64
while(value != 0)
{
--result;
value = value >> 1; // Remove bottom bits until all 1 are gone.
}
return result;
}
Use log base 2 to get you the most significant digit which is 1.
log(2) = 1, meaning 0b10 -> 1
log(4) = 2, 5-7 => 2.xx, or 0b100 -> 2
log(8) = 3, 9-15 => 3.xx, 0b1000 -> 3
log(16) = 4 you get the idea
and so on...
The numbers in between become fractions of the log result. So typecasting the value to an int gives you the most significant digit.
Once you get this number, say b, the simple 64 - n will be the answer.
function get_pos_msd(int n){
return int(log2(n))
}
last_zero = 64 - get_pos_msd(n)

Categories

Resources