How Logical XOR [duplicate] - c#

This question already has answers here:
In c# what does the ^ character do? [duplicate]
(4 answers)
Closed 5 years ago.
I understand that for boolean values, exclusive OR says the output is on if the inputs are different.
But how does it work on non-boolean values like the ones below? In C# or JavaScript, why is the value of the following code 10? Can anyone explain this for me, please?
Console.WriteLine(9^3);

I get the impression you are thinking in purely logical terms, where the result must be true or false, 1 or 0. The ^ operator does act this way, but as a bitwise operator it does it per bit, one at a time, instead of on the whole value at once. It's not that 9 and 3 are both "true", and so the result must be false. It's that 9 is 1001 and 3 is 0011, and when you XOR each pair of corresponding bits you get 1010, which is 10:
  1001 (9)
^ 0011 (3)
------
  1010 (10)

Bitwise operators perform their operation on the bits stored in memory.
Hence, take the equivalent binary value of each decimal number and perform the operation:
9 ---Binary Value---> 0000 1001
3 ---Binary Value---> 0000 0011
Perform XOR (^)       ---------
0000 1010 ---Decimal Value--> 10
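A quick way to see this in C# is to print the operands and the result in binary (Convert.ToString with base 2):
using System;

class XorDemo
{
    static void Main()
    {
        // Convert.ToString(value, 2) renders the value in binary.
        Console.WriteLine(Convert.ToString(9, 2).PadLeft(4, '0'));     // 1001
        Console.WriteLine(Convert.ToString(3, 2).PadLeft(4, '0'));     // 0011
        Console.WriteLine(Convert.ToString(9 ^ 3, 2).PadLeft(4, '0')); // 1010
        Console.WriteLine(9 ^ 3);                                      // 10
    }
}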

Related

Checking if a char is equal to multiple other chars, with as little branching as possible

I'm writing some performance-sensitive C# code that deals with character comparisons. I recently discovered a trick where you can tell if a char is equal to one or more others without branching, if the difference between them is a power of 2.
For example, say you want to check if a char is U+0020 (space) or U+00A0 (non-breaking space). Since the difference between the two is 0x80, you can do this:
public static bool Is20OrA0(char c) => (c | 0x80) == 0xA0;
as opposed to this naive implementation, which would add an additional branch if the character was not a space:
public static bool Is20OrA0(char c) => c == 0x20 || c == 0xA0;
How the first one works: since the difference between the two chars is a power of 2, it has exactly one bit set. That means when you OR it with the character and it leads to a certain result, there are exactly 2^1 = 2 different characters that could have led to that result.
Anyway, my question is, can this trick somehow be extended to characters with differences that aren't powers of 2? For example, if I had the characters # and 0 (which have a difference of 13, by the way), is there any sort of bit-twiddling hack I could use to check if a char was equal to either of them, without branching?
Thanks for your help.
edit: For reference, here is where I first stumbled across this trick in the .NET Framework source code, in char.IsLetter. They take advantage of the fact that a - A == 97 - 65 == 32, and simply OR the char with 0x20 to lowercase it (as opposed to calling ToLower).
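In isolation, that lowercasing trick looks roughly like this (a sketch of the idea, not the actual framework source; IsAsciiLetter is a hypothetical name):
// For ASCII letters, bit 5 (0x20) is the only difference between cases:
// 'A' = 0x41 and 'a' = 0x61. ORing with 0x20 maps uppercase to lowercase,
// so a single range check covers both cases.
public static bool IsAsciiLetter(char c) => (uint)((c | 0x20) - 'a') <= 'z' - 'a';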
If you can tolerate a multiply instead of a branch, and the values you are testing against occupy only the lower bits of the data type you are using, then you can multiply the value by a constant to force the two values to be a power of 2 apart. (If overflow when multiplying by a smallish constant is a concern, cast to a larger data type and use a correspondingly larger mask value.)
For example, in the case of # and 0 (decimal values 35 and 48), the values are 13 apart. Rounding down, the nearest power of 2 below 13 is 8, which is 0.615384615 of 13. Multiplying that by 256 and rounding up, to give an 8.8 fixed-point value, yields 158.
Here are the binary values for 35 and 48, multiplied by 158, and their neighbours:
34 * 158 = 5372 = 0001 0100 1111 1100
35 * 158 = 5530 = 0001 0101 1001 1010
36 * 158 = 5688 = 0001 0110 0011 1000
47 * 158 = 7426 = 0001 1101 0000 0010
48 * 158 = 7584 = 0001 1101 1010 0000
49 * 158 = 7742 = 0001 1110 0011 1110
The lower 7 bits can be ignored because they aren't needed to separate any of the neighbouring values from each other. Apart from that, the values 5530 and 7584 differ only in bit 11, so you can use the mask-and-compare technique, but using an AND instead of an OR. The mask value in binary is 1111 0111 1000 0000 (63360) and the compare value is 0001 0101 1000 0000 (5504), so you can use this code:
public static bool Is23Or30(char c) => ((c * 158) & 63360) == 5504;
I haven't profiled this, so I can't promise it's faster than a simple compare.
If you do implement something like this, be sure to write some test code that loops through every possible value that can be passed to the function, to verify that it works as expected.
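A minimal test sketch along those lines, exhaustively comparing Is23Or30 against the naive check and reporting any disagreement:
// Compare the bit-twiddled version against the naive one for
// every possible char value (all 65536 of them).
for (int i = char.MinValue; i <= char.MaxValue; i++)
{
    char c = (char)i;
    bool fast = Is23Or30(c);            // ((c * 158) & 63360) == 5504
    bool naive = c == '#' || c == '0';  // '#' == 0x23, '0' == 0x30
    if (fast != naive)
        Console.WriteLine($"Mismatch at U+{i:X4}: fast={fast}, naive={naive}");
}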
You can use the same trick to compare against a set of 2^N values, provided that all of their bits are equal except for N bits. E.g. if the set of values is 0x01, 0x03, 0x81, 0x83, then N = 2 and you can use (c | 0x82) == 0x83. Note that the values in the set differ only in bits 1 and/or 7; all other bits are equal. There are not many cases where this kind of optimization can be applied, but when it can, and every little bit of extra speed counts, it's a good optimization.
This is the same way boolean expressions are optimized (e.g. when compiling VHDL). You may also want to look up Karnaugh maps.
That being said, it is really bad practice to do this kind of comparison on character values, especially with Unicode, unless you know what you are doing and are doing really low-level stuff (such as drivers, kernel code, etc.). Comparing characters (as opposed to bytes) has to take into account linguistic features (such as uppercase/lowercase, ligatures, accents, composed characters, etc.).
On the other hand, if all you need is binary comparison (or classification), you can use lookup tables. With single-byte character sets these can be reasonably small and really fast.
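For instance, for a single-byte character set a table of 256 flags gives a branch-light classifier (a sketch reusing the space/non-breaking-space example from the question; the names are hypothetical):
static class CharClass
{
    // One bool per byte value; built once, then each lookup is a single index.
    private static readonly bool[] IsSpaceLike = BuildTable();

    private static bool[] BuildTable()
    {
        var table = new bool[256];
        table[0x20] = true; // space
        table[0xA0] = true; // non-breaking space
        return table;
    }

    public static bool Check(byte b) => IsSpaceLike[b];
}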
If not having branches is really your main concern, you can do something like this:
if ( (x-c0|c0-x) & (x-c1|c1-x) & ... & (x-cn|cn-x) & 0x80 ) {
    // x is not equal to any ci
}
If x is not equal to a specific c, either x-c or c-x will be negative, so x-c|c-x will have bit 7 set. This should work for signed and unsigned chars alike. If you & the terms together for all c's, the result will have bit 7 set only if it is set for every c (i.e. x is not equal to any of them).
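In C#, where the subtractions promote to int, the same idea can be written against the int sign bit instead of bit 7 (a sketch for two values; it assumes the inputs are small enough, e.g. chars, that the subtractions cannot overflow):
// True when x differs from both c0 and c1, with no branches:
// (x - c) | (c - x) is zero when x == c, and has the sign bit set otherwise,
// so the AND of the terms is negative only when x matches neither value.
public static bool IsNeitherOf(int x, int c0, int c1)
{
    return (((x - c0) | (c0 - x)) & ((x - c1) | (c1 - x))) < 0;
}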

strange no error within C# application [duplicate]

This question already has answers here:
No overflow exception for int in C#?
(6 answers)
Closed 8 years ago.
I have a C# application in which I have this code:
public static void Main()
{
    int i = 2147483647;
    int j = i + 1;
    Console.WriteLine(j);
    Console.ReadKey();
}
The result is: -2147483648
I know that every int must be less than 2147483648. So:
Why don't I get a compilation or runtime error, like in this example?
What is the reason for the negative sign?
Thanks
The compiler defaults to unchecked arithmetic; you have simply overflowed and wrapped around, thanks to two's-complement storage.
This fails at runtime:
public static void Main()
{
    int i = 2147483647;
    int j = checked((int)(i + 1)); // <==== note "checked"
    Console.WriteLine(j);
    Console.ReadKey();
}
This can also be enabled globally as a compiler-switch.
As Christos says, the negative sign comes from integer overflow. The reason you do not get an error is that the compiler only evaluates constant expressions for overflow; expressions involving variables are not checked at compile time.
 0111 1111 1111 1111 1111 1111 1111 1111   (2^31 - 1)
+0000 0000 0000 0000 0000 0000 0000 0001   (1)
=1000 0000 0000 0000 0000 0000 0000 0000   (-2^31)
The reason for this is that the leftmost bit is the sign bit: it determines whether the int is positive or negative, with 0 meaning positive and 1 meaning negative. If you add one to the largest possible number, you essentially flip the sign bit and get the smallest representable number. This happens because integers use two's complement storage.
To check if the value overflows, do:
int j = checked(i + 1);
Why don't I have a compilation or runtime error?
Because the compiler can only determine that you have assigned a value larger than int.MaxValue when that value is hard-coded as a constant. For i + 1, the compiler doesn't execute the code, so it can't determine that the result of the calculation would be greater than int.MaxValue.
What is the reason for the negative sign?
It is because of integer overflow.
See: checked (C# Reference)
By default, an expression that contains only constant values causes a compiler error if the expression produces a value that is outside the range of the destination type. If the expression contains one or more non-constant values, the compiler does not detect the overflow.
What is the reason for the negative sign?
You get a negative sign because you have exceeded the maximum integer value, and the next integer after it is the lowest integer that can be represented.
Why don't I have a compilation or runtime error?
You don't get a compilation error because this is not an error, and it is not a runtime error either. You just add one to i at runtime. Since the value of i is the maximum value that can be stored in a variable of type int, and since integer arithmetic wraps around, you get the lowest integer that can be stored in a variable of type int.
(A variable of type int stores a 32-bit integer.)
Furthermore, by default, integer operations in C# don't throw exceptions upon overflow. You can alter this either from the project settings or by using a checked statement, as has already been pointed out here.
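For example, a checked block turns the silent wrap-around into an exception at runtime:
int i = int.MaxValue;
checked
{
    // This addition now throws System.OverflowException instead of wrapping.
    int j = i + 1;
    Console.WriteLine(j);
}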

Understanding the behavior of a single ampersand operator (&) on integers

I understand that the single ampersand operator is normally used for a 'bitwise AND' operation. However, can anyone help explain the interesting results you get when you use it for comparison between two numbers?
For example;
(6 & 2) = 2
(10 & 5) = 0
(20 & 25) = 16
(123 & 20) = 16
I'm not seeing any logical link between these results and I can only find information on comparing booleans or single bits.
Compare the binary representations of each of those.
110 & 010 = 010
1010 & 0101 = 0000
10100 & 11001 = 10000
1111011 & 0010100 = 0010000
In each case, a digit is 1 in the result only when it is 1 on both the left AND right side of the input.
You need to convert your numbers to binary representation, and then you will see the link between the results: 6 & 2 = 2 is actually 110 & 010 = 010, etc.
10 & 5 is 1010 & 0101 = 0000.
The bitwise AND operation is performed on the integers, represented in binary. For example:
110 (6)
010 (2)
--------
010 (2)
The bitwise AND does exactly that: it performs an AND operation on the bits.
So to anticipate the result you need to look at the bits, not the numbers.
AND gives you 1 only if there's a 1 in both numbers at the same position:
6(110) & 2(010) = 2(010)
10(1010) & 5(0101) = 0(0000)
A bitwise OR will give you 1 if there's a 1 in either number at the same position:
6(110) | 2(010) = 6(110)
10(1010) | 5(0101) = 15(1111)
6 = 0110
2 = 0010
6 & 2 = 0010
20 = 10100
25 = 11001
20 & 25 = 10000
(10000 in binary is 16, so this matches your result)
Etc...
Internally, integers are stored in binary format. I strongly suggest you read about that; knowing about the bitwise representation of numbers is very important.
That being said, the bitwise comparison compares the bits of the parameters:
Decimal: 6 & 2 = 2
Binary: 0110 & 0010 = 0010
Bitwise AND matches the bits in binary notation one by one, and the result keeps the bits that are common to the two numbers.
To convert a number to binary you need to understand the binary system.
For example
6 = 110 binary
The 110 represents 1x4 + 1x2 + 0x1 = 6.
2 then is
0x4 + 1x2 + 0x1 = 2.
Bitwise AND only retains the positions where both numbers have the bit set; in this case that is the bit for 2, and the result is then 2.
Every extra bit is double the last, so a 4-bit number uses the multipliers 8, 4, 2, 1 and can therefore represent all numbers from 0 to 15 (the sum of the multipliers).
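If you want to check these by machine, C# will print the binary forms for you (Convert.ToString with base 2):
int a = 20, b = 25;
Console.WriteLine(Convert.ToString(a, 2).PadLeft(8, '0'));     // 00010100
Console.WriteLine(Convert.ToString(b, 2).PadLeft(8, '0'));     // 00011001
Console.WriteLine(Convert.ToString(a & b, 2).PadLeft(8, '0')); // 00010000 = 16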

How to solve this bit operation?

I have one byte in which I need to replace the last (least significant) bits.
Examples below.
Original byte: xxxx0110
Replacement byte: 1111
What I want to get: xxxx1111
Original byte: xxxx1111
Replacement byte: 0000
What I want to get: xxxx0000
Original byte: xxxx0000
Replacement byte: 1111
What I want to get: xxxx1111
Original byte: xxxx1010
Replacement byte: 1111
What I want to get: xxxx1111
Original byte: xxxx0101
Replacement byte: 0111
What I want to get: xxxx0111
value = (byte)( (value & ~15) | newByte);
The ~15 creates a mask of everything except the last 4 bits; value & {that mask} takes the last 4 bits away, then | newByte puts the bits from the new data in their place.
This can be done with a combination of bitwise AND to clear the bits and bitwise OR to set the bits.
To clear the lowest four bits, you can AND with a value that is 1 everywhere except at those bits, where it's zero. One value like this would be ~0xF, which is the complement of 0xF, which is four ones: 0b1111.
To set the bits, you can then use bitwise OR with the bits to set. Since 0 OR x = x, this works as you'd intend it.
The net result would be
(x & ~0xF) | bits
EDIT: As per Eamon Nerbonne's comment, you should then cast back to a byte:
(byte)((x & ~0xF) | bits)
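A quick sketch exercising that on the examples from the question (the helper name is hypothetical):
static byte ReplaceLowNibble(byte value, byte newBits)
{
    // Clear the low 4 bits of value, then OR in the replacement nibble.
    return (byte)((value & ~0xF) | (newBits & 0xF));
}

// ReplaceLowNibble(0b1010_0110, 0b1111) == 0b1010_1111
// ReplaceLowNibble(0b1010_1111, 0b0000) == 0b1010_0000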
If my understanding is right, you can also do this with shifts: shift the original byte right 4 times (dropping the old low bits), shift it left 4 times (restoring the position), then OR with the replacement.
For example: a = 1001 1101
Replacement byte: 0000 1011
Right shift a 4 times: 0000 1001
Left shift back 4 times: 1001 0000
OR with replacement: 1001 1011 (end result).
Maybe this link is helpful: http://www.codeproject.com/KB/cs/leftrightshift.aspx
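In C#, assuming value and replacement are bytes as in the other answers, that shift-based variant would be (a sketch, equivalent to the mask approach when the replacement fits in 4 bits):
// Zero the low nibble via shifts, then OR in the new bits.
value = (byte)(((value >> 4) << 4) | replacement);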
Trim the last 4 bits and append the new ones.

Why AND two numbers to get a Boolean?

I am working on a little Hardware interface project based on the Velleman k8055 board.
The example code comes in VB.Net and I'm rewriting this into C#, mostly to have a chance to step through the code and make sense of it all.
One thing has me baffled though:
At one stage they read all digital inputs (which come back in an Integer) and then set each checkbox by ANDing this value with a number:
i = ReadAllDigital
cbi(1).Checked = (i And 1)
cbi(2).Checked = (i And 2) \ 2
cbi(3).Checked = (i And 4) \ 4
cbi(4).Checked = (i And 8) \ 8
cbi(5).Checked = (i And 16) \ 16
I have not done digital systems in a while and I understand what they are trying to do, but what effect does ANDing two numbers have? Doesn't everything above 0 equate to true?
How would you translate this to C#?
This is doing a bitwise AND, not a logical AND.
Each of those basically determines whether a single bit in i is set, for instance:
5 AND 4 = 4
5 AND 2 = 0
5 AND 1 = 1
(Because 5 = binary 101, and 4, 2 and 1 are the decimal values of binary 100, 010 and 001 respectively.)
I think you'll have to translate it to this:
(i & 1) == 1
(i & 2) == 2
(i & 4) == 4
etc...
(The parentheses matter: in C#, == binds more tightly than &.)
This is using the bitwise AND operator.
When you use the bitwise AND operator, it compares the binary representations of the two given values and returns a binary value in which only those bits are set that are set in both operands.
For instance, when you do this:
2 & 2
It will do this:
0010 & 0010
And this will result in:
0010
0010
&----
0010
Then if you compare this result with 2 (0010), it will of course return true.
Just to add:
It's called bitmasking
http://en.wikipedia.org/wiki/Mask_(computing)
A boolean only requires 1 bit. In most programming language implementations, however, a boolean takes up more than a single bit. On a PC this isn't a big waste, but embedded systems usually have very limited memory, so the waste is really significant. To save space, booleans are packed together; this way each boolean variable only takes up 1 bit.
You can think of it as doing something like an array indexing operation, with a byte (= 8 bits) becoming like an array of 8 boolean variables, so maybe that's your answer: use an array of booleans.
Think of this in binary e.g.
10101010
AND
00000010
yields 00000010
i.e. not zero. Now if the first value was
10101000
you'd get
00000000
i.e. zero.
Note the further division to reduce everything to 1 or 0.
(i And 16) \ 16 extracts the value (1 or 0) of the 5th bit:
1xxxx And 16 = 16, and 16 \ 16 = 1
0xxxx And 16 = 0, and 0 \ 16 = 0
The And operator performs "...bitwise conjunction on two numeric expressions", which maps to '&' in C#. The '\' is integer division, and the equivalent in C# is '/', provided that both operands are integer types.
The constant numbers are masks (think of them in binary). So what the code does is apply the bitwise AND operator to the byte and the mask, and then divide by the mask value, in order to get the bit.
For example:
xxxxxxxx & 00000100 = 00000x00
if x == 1
00000x00 / 00000100 = 00000001
else if x == 0
00000x00 / 00000100 = 00000000
In C# use the BitArray class to directly index individual bits.
To set an individual bit i is straightforward:
b |= 1 << i;
To reset an individual bit i is a little more awkward:
b &= ~(1 << i);
Be aware that both the bitwise operators and the shift operators tend to promote everything to int which may unexpectedly require casting.
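A minimal sketch of the BitArray route (ReadAllDigital and cbi are the names from the question):
using System.Collections;

int i = ReadAllDigital();             // packed digital inputs, as in the question
var bits = new BitArray(new[] { i }); // wraps the int as 32 individually indexable bits

cbi[1].Checked = bits[0]; // same as (i & 1) == 1
cbi[2].Checked = bits[1]; // same as (i & 2) == 2
cbi[3].Checked = bits[2];
cbi[4].Checked = bits[3];
cbi[5].Checked = bits[4];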
As said, this is a bitwise AND, not a logical AND. I do see that this has been said quite a few times before me, but IMO the explanations are not so easy to understand.
I like to think of it like this:
Write up the binary numbers under each other (here I'm doing 5 and 1):
101
001
Now we need to turn this into a binary number that keeps only the 1s which appear in both numbers, that is, in this case:
001
In this case we see it gives the same number as the 2nd number, so this operation (in VB) returns true. Let's look at the other examples (using 5 as i):
(5 and 2)
101
010
----
000
(false)
(5 and 4)
101
100
---
100
(true)
(5 and 8)
0101
1000
----
0000
(false)
(5 and 16)
00101
10000
-----
00000
(false)
EDIT: and obviously I missed the entire point of the question - here's the translation to C#:
cbi[1].Checked = (i & 1) == 1;
cbi[2].Checked = (i & 2) == 2;
cbi[3].Checked = (i & 4) == 4;
cbi[4].Checked = (i & 8) == 8;
cbi[5].Checked = (i & 16) == 16;
(The parentheses are needed because == has higher precedence than & in C#.)
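Or, since the masks are consecutive powers of two, the same thing with a shift in a loop (a sketch; cbi is the checkbox array from the question):
for (int bit = 0; bit < 5; bit++)
{
    // Shift the input right and test the lowest bit.
    cbi[bit + 1].Checked = ((i >> bit) & 1) == 1;
}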
I prefer to use hexadecimal notation when bit twiddling (e.g. 0x10 instead of 16). It makes more sense as your bit widths increase, since 0x20000 is clearer than 131072.
