Explain this use of XOR in this function? - c#

I saw this code snippet that was used to solve the ol' "Find the one number in an array that does not have a duplicate." question. I have been looking at this for a while this morning, but can not nail down exactly how it is doing it.
I don't understand how k always ends up holding the value of the non-duplicate. Can anyone explain how this works?
static void Main(string[] args)
{
    int[] list = { 3, 6, 9, 12, 3, 6, 9 };
    int k = 0;
    for (int i = 0; i < list.Length; i++)
    {
        k = k ^ list[i];
    }
    Console.WriteLine(k);
}

It only works if only one number is non-duplicated (or occurs any odd number of times) and all the other numbers occur an even number of times.
When you xor a number to another number twice (or any other even number of times) it cancels itself out leaving the original number.
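To make the cancellation concrete, here is a minimal sketch. The names XorCancelDemo and FindUnpaired are mine, not from the question; the loop is the same fold as the original snippet.

```csharp
using System;

public static class XorCancelDemo
{
    // XORs every element into an accumulator. Values that occur an even
    // number of times cancel out, leaving the XOR of the odd-count values.
    public static int FindUnpaired(int[] values)
    {
        int k = 0;
        foreach (int v in values)
        {
            k ^= v;
        }
        return k;
    }

    public static void Main()
    {
        int original = 12;
        int once = original ^ 3;   // 3 is XORed in
        int twice = once ^ 3;      // 3 is XORed out again
        Console.WriteLine(twice == original);                             // True
        Console.WriteLine(FindUnpaired(new[] { 3, 6, 9, 12, 3, 6, 9 }));  // 12
    }
}
```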

It's slightly similar to the "Nerds, Jocks and Lockers" problem, in terms of the "bit-flipping" going on leaving certain bits set and others not.
The basic behavior is that A XOR B works like "(A OR B) AND NOT (A AND B)". So, 0^0=0, 1^0 = 1, but 1^1 = 0 (unlike OR). Now, you start with zero (no bits set) on K. You then XOR it with the literal 3, which (as a byte) has the bits 00000011, and assign the result to K. You end up with 00000011 for K, because the bits that are set on the literal 3 are all not set on K when it's 0. Now, if you were to XOR K with the literal 3 again, you'd end up with 0, because all the bits match between the two values, so the XOR would return 0 on each bit.
XOR is commutative and associative, so ((((0 XOR 3) XOR 6) XOR 3) XOR 6) would give the same result (0) as ((((0 XOR 6) XOR 6) XOR 3) XOR 3), or pretty much any other combination of XORing 0 with 3 twice and 6 twice.
The net result is that, given a list of these numbers, any number that occurs twice (or an even number of times) is "XORed in" to K the first time, and then "XORed out" the second, leaving K with its bits set to the one value that only occurred once; 12.
Here's the binary demonstration of the full problem (using "nibbles" because we don't have any values over 15):
0000 0
^^^^ XOR
0011 3
---- =
0011 3
^^^^ XOR
0110 6
---- =
0101 5
^^^^ XOR
1001 9
---- =
1100 12
^^^^ XOR
1100 12
---- =
0000 0 <-this is coincidence; it'd work the same regardless of the unduped value
^^^^ XOR
0011 3
---- =
0011 3
^^^^ XOR
0110 6
---- =
0101 5
^^^^ XOR
1001 9
---- =
1100 12 <- QED
EDIT FROM COMMENTS: While this answer functions for the specific question asked, even the smallest change to the problem would "break" this implementation, such as:
The algorithm is completely unresponsive to the number zero; the algorithm is thus unable to tell the difference between having zero as a single unpaired value and having no unpaired values at all.
The algorithm only works for pairs, not triplets. If 3 occurred three times and was still a "dupe", and 12 was still the right answer, this algorithm would actually return 15 (1100 ^ 0011 == 1111) which isn't even in the list.
The algorithm only works if there is only one non-duplicated value in the list; if 8 and 12 were both unpaired values expected to be returned, the algorithm would return the XOR of the two (1100 ^ 1000 == 0100 == 4)
An efficient algorithm could be developed that would return the correct answer in all these cases in addition to the original case, but it would likely not involve XOR.
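The three failure cases listed above are easy to reproduce. This is only a sketch; XorLimitations and XorAll are made-up names, and XorAll is just the question's loop written as a fold.

```csharp
using System;
using System.Linq;

public static class XorLimitations
{
    // Same fold as the original loop: XOR of every element, starting from 0.
    public static int XorAll(params int[] values) =>
        values.Aggregate(0, (acc, v) => acc ^ v);

    public static void Main()
    {
        // Zero as the unpaired value looks the same as "no unpaired value".
        Console.WriteLine(XorAll(3, 3, 0));      // 0
        Console.WriteLine(XorAll(3, 3));         // 0

        // 3 occurs three times, 12 once: the result is 12 ^ 3.
        Console.WriteLine(XorAll(3, 3, 3, 12));  // 15

        // Two unpaired values (8 and 12): the result is 8 ^ 12.
        Console.WriteLine(XorAll(3, 3, 8, 12));  // 4
    }
}
```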

Any number can be expressed as a bit sequence:
3 == 0...00011
6 == 0...00110
9 == 0...01001
If you XOR the same number in twice, the bit flips cancel each other out.
Therefore, if a single number appears once (or an odd number of times) in the list, then it will be the only one with the bits left "uncancelled".

keithS is spot on, but this might be a bit easier to follow.
The simplest explanation I can think of has to do with four properties of XOR:
1. 0 XOR x = x, for any number x
2. x XOR x = 0, for any number x
3. XOR is commutative: x XOR y = y XOR x (this allows you to rearrange a series of XOR operations however you see fit)
4. XOR is associative: (x XOR y) XOR z = x XOR (y XOR z) (this allows you to evaluate XORs in any order you see fit)
By properties 3 and 4, you can rearrange your input list so that all duplicates are next to each other:
{ 3,6,9,12,3,6,9 } -> { 3, 3, 6, 6, 9, 9, 12 };
By properties 2 and 4, we can XOR these numbers pairwise in any order, and all pairs of identical numbers become 0. After this, only the unduplicated number remains non-zero:
{ 0, 0, 0, 12 };
Also by property 1, because k is 0 to begin with, and all duplicate numbers have XOR'ed to become zero, all that's left is 12.
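If you want to convince yourself of the four properties, a brute-force check over a small range is enough to see the pattern. This is a sketch with hypothetical names (XorProperties, HoldFor), not a proof.

```csharp
using System;

public static class XorProperties
{
    // Returns true if all four properties hold for the given triple.
    public static bool HoldFor(int x, int y, int z)
    {
        bool identity = (0 ^ x) == x;
        bool selfInverse = (x ^ x) == 0;
        bool commutative = (x ^ y) == (y ^ x);
        bool associative = ((x ^ y) ^ z) == (x ^ (y ^ z));
        return identity && selfInverse && commutative && associative;
    }

    public static void Main()
    {
        bool all = true;
        for (int x = -8; x <= 8; x++)
            for (int y = -8; y <= 8; y++)
                for (int z = -8; z <= 8; z++)
                    all &= HoldFor(x, y, z);
        Console.WriteLine(all);   // True

        // The regrouped list gives the same result as the original order.
        int original = 3 ^ 6 ^ 9 ^ 12 ^ 3 ^ 6 ^ 9;
        int grouped = (3 ^ 3) ^ (6 ^ 6) ^ (9 ^ 9) ^ 12;
        Console.WriteLine(original == grouped);   // True
    }
}
```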

This works because the XOR operator is both commutative and associative. This means you can rearrange the terms in any order:
a ^ b ^ c == a ^ c ^ b == c ^ a ^ b == ... (etc.)
This works for sequences of any length. Therefore:
3 ^ 6 ^ 9 ^ 12 ^ 3 ^ 6 ^ 9 == (3 ^ 3) ^ (6 ^ 6) ^ (9 ^ 9) ^ 12
Since x ^ x == 0 for any integer, you can eliminate all the pairs until you get 0 ^ 12, which is just equal to 12.

a ^ a = 0
a ^ 0 = a
0 ^ a = a ^ 0
a = b ^ c
a = c ^ b
int[] list = { 3,6,9,12,3,6,9 };
k ^ 3
k ^ 6
k ^ 9
k ^ 12
k ^ 3
k ^ 6
k ^ 9
=
k ^ 3 // dupe
k ^ 3
k ^ 6
k ^ 6
k ^ 9
k ^ 9
k ^ 12 // non-dupe
=
k ^ 12
=
0 ^ 12
=
12

When using XOR, keep in mind that you are working with bits. In other words, you only have 0 and 1 to deal with.
The definition of XOR is simple:
0 XOR 0 -> 0
0 XOR 1 -> 1
1 XOR 0 -> 1
1 XOR 1 -> 0
So if you apply this to your code and transform the 3 to 0011, the 6 to 0110, the 9 to 1001 and the 12 to 1100, and invoke the XOR method on them like in the example below, you will end up with 1100 that will represent the value 12.
Example:
0011
XOR
0110
=
0101 -> 5
This is not a general algorithm for eliminating duplicate values. You obtain 12 here only because every other value in the list appears an even number of times.

The code you reference does NOT actually solve the problem. For instance, put three 12s in the list and you will get 12, even though 12 is duplicated twice.
The result of XOR is essentially whether there is an odd or even number of that bit combination. So if the list contains an odd number of 12s (whether that be one 12 or a hundred-and-one 12s) and an even number of each other number, the result will always be 12.
Even worse, if the list contained an odd number of multiple different numbers, the result value may not be any of the numbers in the list. E.g., one 3 and one 14 would result in 13. This is because the XOR really operates on bits, not integers. The XOR of 14 (1110b) and 3 (0011b) results in 13 (1101b), since XOR sets any bits that are common between the numbers to zero.
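The "odd or even count per bit" view can be written out directly. A rough sketch, assuming 32-bit ints; XorParity and XorByParity are names I made up for illustration.

```csharp
using System;

public static class XorParity
{
    // Rebuilds the XOR of all values by counting, for each bit position,
    // how many values have that bit set and keeping the bit if the count is odd.
    public static int XorByParity(int[] values)
    {
        int result = 0;
        for (int bit = 0; bit < 32; bit++)
        {
            int count = 0;
            foreach (int v in values)
            {
                count += (v >> bit) & 1;
            }
            if (count % 2 == 1)
            {
                result |= 1 << bit;
            }
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(XorByParity(new[] { 3, 14 }));             // 13
        Console.WriteLine(XorByParity(new[] { 12, 12, 12, 3, 3 }));  // 12
    }
}
```

The result matches what the XOR loop produces for the same list, which is why an odd number of 12s always comes out as 12.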
Code that actually solves the problem:
using System;
using System.Linq;
using System.Collections.Generic;
...
static void Main(string[] args)
{
    int[] list = { 3, 6, 9, 12, 3, 6, 9 };
    // Group equal values and keep only the groups with exactly one member.
    int[] nonDupes = list
        .GroupBy(x => x)
        .Where(x => x.Count() == 1)
        .Select(x => x.Key)
        .ToArray();
    string output = string.Join(",", nonDupes);
    Console.WriteLine(output);
}

Related

How am I getting a single bit from an int?

I understand that:
int bit = (number >> 3) & 1;
This will give me bit 3 (counting from the right, starting at 0). So, let's say 8 is 1000; shifted right by 3, that would be 0001.
What I don't understand is how "& 1" will remove everything but the last bit to display an output of simply "1". I know that this works, I know how to get a bit from an int but how is it the code is extracting the single bit?
Code...
int number = 8;
int bit = (number >> 3) & 1;
Console.WriteLine(bit);
Unless my boolean algebra from school fails me, what's happening should be equivalent to the following:
*
1100110101101 // last bit is 1
& 0000000000001 // & 1
= 0000000000001 // = 1
*
1100110101100 // last bit is 0
& 0000000000001 // & 1
= 0000000000000 // = 0
So when you do & 1, what you're basically doing is to zero out all other bits except for the last one which will remain whatever it was. Or more technically speaking you do a bitwise AND operation between two numbers, where one of them happens to be a 1 with all leading bits set to 0
8 = 00001000
8 >> 1 = 00000100
8 >> 2 = 00000010
8 >> 3 = 00000001
If you use mask 1 = 00000001 then you have:
8 >> 3       = 00000001
1            = 00000001
(8 >> 3) & 1 = 00000001
Actually, this is not hard to understand.
The "& 1" operation sets every bit of the value to 0 except the bit in the same position as the set bit in the value 1, that is, the lowest bit.
The preceding operation just shifts all the bits to the right, moving the bit being checked into the position that will not be set to 0 by "& 1".
For example:
number is 1011101
number >> 3 makes it 0001011
but (number >> 3) & 1 makes it 0000001
When you right-shift 8 by 3 you get 0001.
0001 & 0001 = 0001, which converted to int gives you 1.
So, when the value 0001 is assigned to an int, it will print 1 and not 0001 or 0000 0001. All the leading zeroes are discarded.
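Putting the shift and the mask together, something like the following reads out every bit of a number one at a time. BitExtract and GetBit are illustrative names only.

```csharp
using System;

public static class BitExtract
{
    // Shifts the wanted bit down to position 0, then masks off everything else.
    public static int GetBit(int number, int position) => (number >> position) & 1;

    public static void Main()
    {
        int number = 8; // binary 1000
        for (int position = 3; position >= 0; position--)
        {
            Console.Write(GetBit(number, position));
        }
        Console.WriteLine(); // the loop above prints 1000
    }
}
```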

Merge first n bits of a byte with last 8-n bits of another byte

How can I merge first n bits of a byte with last 8-n bits of another byte?
I know something like below for picking 3 bits from first and 5 from second (Which I have observed in DES encryption algorithm)
zByte = (xByte & 0xE0) | (yByte & 0x1F);
But I don't know the maths behind why we need to use 0xE0 and 0x1F in this case. So I am trying to understand the details with regard to each bit.
In C#, that would be something like:
int mask = ~((-1) << n);
var result = (x & ~mask) | (y & mask);
i.e. we build a mask that is (for n = 5) : 000....0011111, then we combine (&) one operand with that mask, the other operand with the inverse (~) of the mask, and compose them (|).
You could also probably do something more quickly just using shift operations (avoiding a mask completely) - but only if the data can be treated as unsigned (so Java might struggle here).
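Wrapped up for bytes, the mask approach could look roughly like this. It is a sketch: BitMerge, Merge and the lowBits parameter are my own names, and lowBits is assumed to be between 0 and 8.

```csharp
using System;

public static class BitMerge
{
    // Takes the low `lowBits` bits from y and the remaining high bits from x.
    // lowBits is assumed to be in the range 0..8.
    public static byte Merge(byte x, byte y, int lowBits)
    {
        int mask = ~(-1 << lowBits);   // e.g. lowBits = 5 gives 00011111
        return (byte)((x & ~mask) | (y & mask));
    }

    public static void Main()
    {
        byte z = Merge(0xFF, 0x00, 5);
        Console.WriteLine(Convert.ToString(z, 2).PadLeft(8, '0')); // 11100000
    }
}
```

With lowBits = 5 the two masks are exactly the 0x1F and 0xE0 from the question.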
It just sounds like you don't understand how boolean arithmetic works? If this is your question it works like this:
0xE0 and 0x1F are hexadecimal representations of numbers. If we convert these numbers to binary they would be:
0xE0 = 11100000
0x1F = 00011111
Additionally & (and) and | (or) are bitwise logical operators. To understand logical operators, first remember the 1 = true and 0 = false.
The truth table for & is:
0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1
The truth table for | is:
0 | 0 = 0
0 | 1 = 1
1 | 0 = 1
1 | 1 = 1
So let's break down your equation piece by piece. We evaluate the code in parentheses first. We walk through each number in binary, and for the & operator, if both operands have a 1 in the same bit position we return 1. If either number has a 0 there, we return 0. After evaluating the operands in the parentheses, we take the 2 resulting numbers and apply the | operator bit by bit. If either number has a 1 in a given bit position we return 1. If both numbers have a 0 in that bit position we return 0.
For the sake of discussion, let's say that
xByte = 255 or (FF in hex and 11111111 in binary)
yByte = 0 or (00 in hex and 00000000 in binary)
When you apply the & and | operators we are going to compare each bit one at a time:
zByte = (xByte & 0xE0) | (yByte & 0x1F)
becomes:
zByte = (11111111 & 11100000) | (00000000 & 00011111)
zByte = 11100000 | 00000000
zByte = 11100000
If you understand this and how boolean logic works then you can use Marc Gravell's answer.
The math behind those numbers (0xE0 and 0x1F) is quite simple. First we are exploiting the fact that 0 & <bit> always equals 0 and 1 & <bit> always equals <bit>.
0x1F is 00011111 binary, which means that the first 3 bits will always be 0 after an & operation with another byte, and the last 5 bits will be the same as they were in the other byte. Remember that every 1 in a binary number represents a power of 2, so if you want to find the mask mathematically it would be the sum of 2^x from x = 0 to n-1 (here n = 5, giving 31 = 0x1F). Then, to find the opposite mask (the one that is 11100000) that extracts the first 3 bits, you simply subtract the mask from 11111111, and you get 11100000 (0xE0).
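The arithmetic for both masks can be checked in a few lines. Again just a sketch; MaskMath, LowMaskBySum and HighMask are hypothetical helper names.

```csharp
using System;

public static class MaskMath
{
    // Sum of 2^x for x = 0..n-1, which sets the lowest n bits.
    public static int LowMaskBySum(int n)
    {
        int mask = 0;
        for (int x = 0; x < n; x++)
        {
            mask += 1 << x;
        }
        return mask;
    }

    // The opposite mask within one byte: 11111111 minus the low mask.
    public static int HighMask(int n) => 0xFF - LowMaskBySum(n);

    public static void Main()
    {
        Console.WriteLine(LowMaskBySum(5).ToString("X2"));   // 1F
        Console.WriteLine(HighMask(5).ToString("X2"));       // E0
        Console.WriteLine(LowMaskBySum(5) == (1 << 5) - 1);  // True
    }
}
```

The last line shows the usual shortcut: the sum of the first n powers of two is (1 << n) - 1.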
In Java,
using the following function we can get the first n bits of the first byte and the last 8-n bits of the second byte.
public class BitExample {
    public static void main(String[] args) {
        byte a = 15;   // 00001111
        byte b = 16;   // 00010000
        String mergedValue = merge(4, a, b);
        System.out.println(mergedValue);
    }

    // Returns the first n bits of a followed by the last 8-n bits of b,
    // as an 8-character binary string.
    public static String merge(int n, byte a, byte b) {
        String sa = toBinary8(a);
        String sb = toBinary8(b);
        return sa.substring(0, n) + sb.substring(n);
    }

    // Formats a byte as exactly 8 binary digits; the & 0xFF avoids sign extension.
    private static String toBinary8(byte value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }
}

How to remove the leftmost bit and add bit in its rightmost bit

How to remove the leftmost bit?
I have a hexadecimal value BF
Its binary representation is 1011 1111
How can I remove the first bit, which is 1, and then it will become 0111 1110?
How to add "0" also to its last part?
To set bit N of variable x to 0
x &= ~(1 << N);
How it works: The expression 1 << N is one bit shifted N times to the left. For N = 7, this would be
1000 0000
The bitwise NOT operator ~ inverts this to
0111 1111
Then the result is bitwise ANDed with x, giving:
xxxx xxxx
0111 1111
--------- [AND]
0xxx xxxx
Result: bit 7 (zero-based count starting from the LSB) is turned off, all others retain their previous values.
To set bit N of variable x to 1
x |= 1 << N;
How it works: this time we take the shifted bit and bitwise OR it with x, giving:
xxxx xxxx
1000 0000
--------- [OR]
1xxx xxxx
Result: Bit 7 is turned on, all others retain their previous values.
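Here are the two operations as small helpers, applied to the BF value from the question. ClearBit and SetBit are names I picked for this sketch.

```csharp
using System;

public static class BitSetClear
{
    // Clears bit n: AND with a mask that has every bit set except bit n.
    public static int ClearBit(int x, int n) => x & ~(1 << n);

    // Sets bit n: OR with a mask that has only bit n set.
    public static int SetBit(int x, int n) => x | (1 << n);

    public static void Main()
    {
        int x = 0xBF;                        // 1011 1111
        int cleared = ClearBit(x, 7);        // 0011 1111
        int restored = SetBit(cleared, 7);   // 1011 1111
        Console.WriteLine(cleared.ToString("X2"));   // 3F
        Console.WriteLine(restored.ToString("X2"));  // BF
    }
}
```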
Finding highest order bit set to 1:
If you don't know which is the highest bit set to 1 you can find out on the fly. There are many ways of doing this; a reasonable approach is
int x = 0xbf;
int highestSetBit = -1; // assume that to begin with, x is all zeroes
while (x != 0) {
++highestSetBit;
x >>= 1;
}
At the end of the loop, highestSetBit will be 7 as expected.
int i = 0xbf;
int j = (i << 1) & 0xff;
or you could do:
(i * 2) & 0xff
if you'd rather avoid the shift. For non-negative values, >>1 is the equivalent of /2, and <<1 is the equivalent of *2.
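As a quick check against the numbers in the question (1011 1111 should become 0111 1110), a sketch along these lines works, assuming the value is meant to be treated as a single byte. ShiftOut and DropTopBit are illustrative names.

```csharp
using System;

public static class ShiftOut
{
    // Shifts left by one (appending a 0 on the right) and keeps only the
    // low 8 bits, so the old leftmost bit of the byte falls away.
    public static int DropTopBit(int value) => (value << 1) & 0xFF;

    public static void Main()
    {
        int result = DropTopBit(0xBF); // 1011 1111
        Console.WriteLine(Convert.ToString(result, 2).PadLeft(8, '0')); // 01111110
        Console.WriteLine(result == ((0xBF * 2) & 0xFF));               // True
    }
}
```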

How to convert from RGB555 to RGB888 in c#?

I need to convert 16-bit XRGB1555 into 24-bit RGB888. My function for this is below, but it's not perfect, i.e. a value of 0b11111 will give 248 as the pixel value, not 255. This function is for little-endian, but can easily be modified for big-endian.
public static Color XRGB1555(byte b0, byte b1)
{
return Color.FromArgb(0xFF, (b1 & 0x7C) << 1, ((b1 & 0x03) << 6) | ((b0 & 0xE0) >> 2), (b0 & 0x1F) << 3);
}
Any ideas how to make it work?
You would normally copy the highest bits down to the bottom bits, so if you had five bits as follows:
Bit position: 4 3 2 1 0
Bit variable: A B C D E
You would extend that to eight bits as:
Bit position: 7 6 5 4 3 2 1 0
Bit variable: A B C D E A B C
That way, all zeros remains all zeros, all ones becomes all ones, and values in between scale appropriately.
(Note that A,B,C etc aren't supposed to be hex digits - they are variables representing a single bit).
I'd go with a lookup table. Since there are only 32 different values it even fits in a cache-line.
You can get the 8 bit value from the 5 bit value with:
return (x << 3) | (x >> 2);
The rounding might not be perfect though, i.e. the result isn't always closest to the input, but it is never further away than 1/255.
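Here is a sketch of the bit-replication idea applied to a whole pixel. It assumes the same little-endian byte layout as the function in the question (b0 is the low byte), and it returns a plain tuple instead of a Color to stay self-contained. Rgb555, Expand5To8 and Decode are made-up names.

```csharp
using System;

public static class Rgb555
{
    // Expands a 5-bit channel to 8 bits by copying the top 3 bits
    // into the 3 new low bits.
    public static int Expand5To8(int x) => (x << 3) | (x >> 2);

    // Decodes a little-endian XRGB1555 pixel (b0 = low byte, b1 = high byte).
    public static (int R, int G, int B) Decode(byte b0, byte b1)
    {
        int pixel = (b1 << 8) | b0;
        int r = (pixel >> 10) & 0x1F;
        int g = (pixel >> 5) & 0x1F;
        int b = pixel & 0x1F;
        return (Expand5To8(r), Expand5To8(g), Expand5To8(b));
    }

    public static void Main()
    {
        Console.WriteLine(Expand5To8(0));       // 0
        Console.WriteLine(Expand5To8(0x1F));    // 255
        Console.WriteLine(Decode(0xFF, 0x7F));  // (255, 255, 255)
    }
}
```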

Arithmetic + and Bitwise OR

Is there any difference between arithmetic + and bitwise OR? In what way do they differ?
uint a = 10;
uint b = 20;
uint arithmeticresult = a + b;
uint bitwiseOR = a | b;
Both the results are 30.
Edit : Small changes to hide my stupidity.
(10 | 20) == 10 + 20 only because the 1-bits do not appear in the same digit.
    1010 = 10
or 10100 = 20
-------------
   11110 = 30
However,
      11 = 3            11 = 3
or   110 = 6      +    110 = 6
------------      -----¹------
     111 = 7          1001 = 9
#     ^                 ^
# (1|1==1)           (1+1=2)
Counterexample:
2 + 2 == 4
2 | 2 == 2
Bitwise OR means, for each bit position in both numbers, if one or two bits are on, then the result bit is on. Example:
0b01101001
|
0b01011001
=
0b01111001
(0b is a prefix for binary literals supported in some programming languages)
At the bit level, addition is similar to bitwise OR, except that it carries:
0b01101001
+
0b01011001
=
0b11000010
In your case, 10+20 and 10|20 happen to be the same because 10 (0b1010) and 20 (0b10100) have no 1s in common, meaning no carry happens in addition.
Try setting a = 230 and b = 120, and you'll observe the difference in results.
The reason is very simple. In arithmetic addition, adding a pair of bits may generate a carry bit, which is added in when the bit pair at the next position is added. Bitwise OR just ORs each bit pair and never generates a carry bit.
You're getting the same result in your case because the numbers coincidentally don't generate any carry bit during addition.
Bit-wise arithmetic addition (animation): http://www.is.wayne.edu/drbowen/casw01/AnimAdd.gif
Bitwise OR goes through every bit of the two numbers and applies the following truth table:
A B | A|B
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1
Meanwhile the arithmetic + operator actually goes through every bit applying the following table (where c is the carry-in, a and b are the bits of your number, s is the sum and c' is the carry out):
C A B | S C'
0 0 0 | 0 0
0 0 1 | 1 0
0 1 0 | 1 0
0 1 1 | 0 1
1 0 0 | 1 0
1 0 1 | 0 1
1 1 0 | 0 1
1 1 1 | 1 1
For obvious reasons, the carry-in starts-off being 0.
As you can see, sum is actually a lot more complicated. As a side effect of this, though, there is an easy trick you can use to detect overflow when adding positive signed numbers. More specifically, we expect that a+b >= a|b; if that fails, then you have an overflow!
The two results will be the same when, every time a bit in one of the two numbers is set, the corresponding bit in the second number is NOT set. That is to say that you have three possible states: either both bits aren't set, the bit is set in A but not B, or the bit is set in B but not A. In that case the arithmetic + and the bitwise OR would produce the same result... as would the bitwise XOR for that matter.
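The "no shared bits" condition is cheap to test with &. A small sketch (AddVsOr and SameAsOr are hypothetical names) that also shows the identity a + b == (a | b) + (a & b), which holds because the shared bits are exactly the ones that generate carries:

```csharp
using System;

public static class AddVsOr
{
    // + and | agree exactly when the operands share no set bits.
    public static bool SameAsOr(uint a, uint b) => (a & b) == 0;

    public static void Main()
    {
        uint a = 10, b = 20;
        Console.WriteLine(a + b == (a | b));    // True: no common bits
        Console.WriteLine(SameAsOr(a, b));      // True
        Console.WriteLine((a ^ b) == (a | b));  // True: XOR agrees as well

        uint c = 3, d = 6;
        Console.WriteLine(c + d);               // 9
        Console.WriteLine(c | d);               // 7
        Console.WriteLine(c + d == (c | d) + (c & d)); // True
    }
}
```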
Using arithmetic operations to manipulate bitmasks can produce unexpected results and even overflow. For instance, turning on the n-th bit of a bitmask if it is already on will turn off the n-th bit and turn on the n+1-th bit. This will cause overflow if there are only n-bits.
Example of turning on bit 2:
Arithmetic ADD    Bitwise OR
    0101             0101
  + 0100           | 0100
    ----             ----
    1001             0101    //expected result: 0101
Like-wise, using arithmetic subtract to turn off the n-th bit will fail if the n-th bit was not already on.
Example of turning off bit 2:
Arithmetic SUB    Bitwise AND~
    0001             0001
  - 0100          &~ 0100
    ----             ----
    0001             0001
  + 1100           & 1011
    ----             ----
    1101             0001    //expected result: 0001
So bitwise operators are safer than arithmetic operators when you are working with bitmasks.
The following bitwise operations have analogous arithmetic operations:
                    Bitwise             Arithmetic
Check n-th bit      x & (1 << n)        !(x - (1 << n))
Turn on n-th bit    x |= (1 << n)       x += (1 << n)
Turn off n-th bit   x &= ~(1 << n)      x -= (1 << n)
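The two examples above (turning bit 2 on when it is already on, and off when it is already off) can be reproduced like this. The helper names are mine, and the & 0xF in Main only trims the subtraction result to the four bits shown in the example.

```csharp
using System;

public static class MaskOps
{
    public static int TurnOn(int x, int n) => x | (1 << n);        // safe to repeat
    public static int TurnOnByAdd(int x, int n) => x + (1 << n);   // carries if bit n is already on
    public static int TurnOff(int x, int n) => x & ~(1 << n);      // safe to repeat
    public static int TurnOffBySub(int x, int n) => x - (1 << n);  // borrows if bit n is already off

    private static string Bits4(int value) =>
        Convert.ToString(value & 0xF, 2).PadLeft(4, '0');

    public static void Main()
    {
        Console.WriteLine(Bits4(TurnOn(0b0101, 2)));        // 0101
        Console.WriteLine(Bits4(TurnOnByAdd(0b0101, 2)));   // 1001
        Console.WriteLine(Bits4(TurnOff(0b0001, 2)));       // 0001
        Console.WriteLine(Bits4(TurnOffBySub(0b0001, 2)));  // 1101
    }
}
```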
Try a = 1 and b = 1 ;)
+ and | give different results when both bits at the same position are 1.
00000010
OR
00000010
Result
00000010
VS
00000010
+
00000010
Result
00000100
