Finding the length of the common prefix in two bytes


Given two bytes, how would I find the length of the run of common bits at the start of both bytes?
For example:
9 == 00001001
6 == 00000110
Common prefix is 0000, length 4
I'm working in C#, so please stick to C# operations only.
Addendum: This particular piece of code will run thousands of times and needs to be very fast.

byte x = 9;
byte y = 6;
while ( x != y )
{
x >>= 1;
y >>= 1;
}
Basically, remove a bit from the right of each number until the two are equal. When they become equal, their bits are equal too.
You can keep track of the length of the prefix easily by introducing another variable. I'll leave that to you.
If you want it to be fast, and considering that you're dealing with bytes, why not precompute the values and return the answer in a single operation? Run this algorithm for all possible combinations of two bytes and store the result in a table.
There are only 2^8 * 2^8 = 2^16 possible pairs (roughly 2^15 if you exploit symmetry, because x = 6 and y = 9 gives the same answer as x = 9 and y = 6). If you can afford the initial time and memory, precomputation should be fastest in the end.
Edit:
Another answer here has a solution that is at least better for precomputation and probably faster in general: find the leftmost 1 bit in x ^ y. Using this, build a table Pre where Pre[i] = position of the leftmost 1 bit in i. This table needs only 2^8 entries.

EDIT: Thanks to the comments, I found that I misunderstood the problem. (Below is a fixed version).
With a lookup table:
readonly static int[] bytePrefix = new int[] {
8, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};
And use it XORing the two bytes:
bytePrefix[9 ^ 6]
I believe this is about as fast as it can get: it's just one XOR operation and an array lookup. (You could instead index a two-dimensional table by both bytes and skip the XOR, but that would use 256 times more memory and would probably be slower; bitwise operations are really fast.)

First get the binary difference between the bytes using the xor operator. Then you just shift bits out to the right until the difference is zero:
byte b1 = 6;
byte b2 = 9;
int length = 8;
for (int diff = b1 ^ b2; diff != 0; length--) diff >>= 1;
This will give you a minimum of calculations in the loop, so it will be rather fast.

If you're in a space-limited environment (which obviously you're not if you're using C#, but just in general) and can't afford a lookup table:
byte test = (byte)(byte1 ^ byte2); // ^ on two bytes yields an int in C#, so cast back
int length = 0;
if ((test & 0x80) == 0)
{
if ((test & 0x40) == 0)
{
if ((test & 0x20) == 0)
{
if ((test & 0x10) == 0)
{
// I think you get the idea by now.
// Repeat for the lower nibble.
}
else
length = 3;
}
else
length = 2;
}
else
length = 1;
}
This is basically an unrolled loop to find the first 1 bit in the XOR'd number. I don't think it can get any faster than this without the lookup table.
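For completeness, here is one way the unrolled test could look when written out for all eight bits. This is only a sketch: the class and method names (PrefixUnrolled, CommonPrefixLength) are made up for illustration, and it uses early returns instead of the nested if/else shown above.

```csharp
static class PrefixUnrolled
{
    // Returns how many leading bits a and b have in common (0..8).
    public static int CommonPrefixLength(byte a, byte b)
    {
        int test = a ^ b;                     // set bits mark the positions that differ
        if ((test & 0x80) != 0) return 0;
        if ((test & 0x40) != 0) return 1;
        if ((test & 0x20) != 0) return 2;
        if ((test & 0x10) != 0) return 3;
        if ((test & 0x08) != 0) return 4;
        if ((test & 0x04) != 0) return 5;
        if ((test & 0x02) != 0) return 6;
        if ((test & 0x01) != 0) return 7;
        return 8;                             // the two bytes are identical
    }
}
```

Whether this beats the 256-entry table depends on branch prediction and caching, so it is worth measuring in the real application.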

This can be restated as a simpler problem with a known fast solution:
Find the left-most true bit in X ^ Y.
Some code:
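// Uses ulong so that >> is a logical shift even when the top bits differ.
int findCommonPrefix(ulong x, ulong y, out ulong common)
{
    int prefixPlace = 0;   // becomes the index of the highest differing bit
    int testPlace = 32;
    ulong w, mismatch = x ^ y;
    do {
        w = mismatch >> testPlace;
        if (w != 0) { prefixPlace |= testPlace; mismatch = w; }
        testPlace >>= 1;
    } while (testPlace != 0);
    // mismatch is now 1 if x and y differ anywhere, 0 if they are equal.
    if (mismatch != 0) prefixPlace++;   // the differing bit is not part of the prefix
    common = prefixPlace == 64 ? 0 : x >> prefixPlace;   // a shift count of 64 would wrap to 0
    return 64 - prefixPlace;
}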
This needs only 6 iterations to find the common prefix in a 64-bit long, the byte version will need only 3 iterations. Unroll the loop for even more speed.
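A byte-sized version of the same binary search might look like the sketch below. The names are illustrative, and it assumes the prefix length is 7 minus the index of the highest differing bit, with identical bytes handled as a special case.

```csharp
static class PrefixBinarySearch
{
    // Binary search for the highest set bit of x ^ y: three steps (4, 2, 1) for a byte.
    public static int CommonPrefixLength(byte x, byte y)
    {
        int mismatch = x ^ y;
        if (mismatch == 0) return 8;          // identical bytes share all 8 bits

        int highBit = 0;                      // index of the highest differing bit
        for (int testPlace = 4; testPlace != 0; testPlace >>= 1)
        {
            int w = mismatch >> testPlace;
            if (w != 0) { highBit |= testPlace; mismatch = w; }
        }
        return 7 - highBit;                   // bits above highBit are the common prefix
    }
}
```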

Another approach using exclusive or (xor):
public int GetCommonPrefixLength(byte a, byte b)
{
int c = a ^ b;
int len = -1;
while ((++len < 8) && ((c & 0x80) == 0))
c = c << 1;
return len;
}

Here's a procedural way:
int r = 8;
while (a != b)
{
a >>= 1;
b >>= 1;
r -= 1;
}
Here's a way that uses a lookup table with just 256 entries:
int[] lookupTable;
void createLookupTable()
{
lookupTable = new int[256];
for (int a = 0; a <= 255; ++a)
{
int n = 8;
byte b = (byte)a;
while (b > 0) {
b >>= 1;
n -= 1;
}
lookupTable[a] = n;
}
}
int commonPrefix(byte a, byte b)
{
return lookupTable[a ^ b];
}
And just for fun here's a way to do it with LINQ:
int r = 8 - Enumerable.Range(0, 9).Where(n => a >> n == b >> n).First();

Here's one without a table or a loop:
len = (a ^ b) != 0 ? (7 - (int)Math.Log(a ^ b, 2)) : 8; // C# needs an explicit != 0; an int is not a bool
Explanation:
log2 X is the power to which the number 2 must be raised to obtain the value X. Since each bit in a binary number represents the next power of 2, you can use this fact to find the highest bit set (counting from 0):
2**0 = 1 = 0b0001; log2(1) = 0
2**1 = 2 = 0b0010; log2(2) = 1
2**1.58 =~3 = 0b0011; log2(3) =~1.58; (int)log2(3) = 1
2**2 = 4 = 0b0100; log2(4) = 2
...
2**3 = 8 = 0b1000; log2(8) = 3
So the code works by taking a XOR b, which sets only the bits which are different. If the result is non-zero, we use log2 to find the highest bit set. 7 less the result gives the number of leading zeros = the number of common bits. There is a special case when a XOR b == 0: log2(0) is -Infinity, so that won't work, but we know that all the bits must match, so the answer is 8.
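If floating-point logarithms feel fragile, the same "highest bit set" idea can be expressed with an integer leading-zero count. The sketch below assumes a runtime that ships System.Numerics.BitOperations (.NET Core 3.0 or later; it is not available on the classic .NET Framework). The class and method names are illustrative.

```csharp
using System.Numerics;

static class PrefixLeadingZeros
{
    public static int CommonPrefixLength(byte a, byte b)
    {
        // LeadingZeroCount works on 32 bits and returns 32 for an input of 0.
        // A byte occupies only the low 8 bits, so 24 of the leading zeros are padding.
        return BitOperations.LeadingZeroCount((uint)(a ^ b)) - 24;
    }
}
```

No special case is needed for a == b here, because a zero input yields 32 - 24 = 8.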

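// sizeof(byte) is 1 (a size in bytes, not bits), so compare the 8 bits explicitly, top bit first.
int i;
for (i = 0; i < 8; i++)
    if (a >> (7 - i) != b >> (7 - i)) break;
// i now holds the length of the common prefix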

The 256-byte table versions seem quite nice; depending upon caching and branching issues, a 16-byte table version might or might not run faster. Something like:
/* Assumes table[16] is defined similarly to the table[256] in earlier examples */
unsigned int find_mismatch(unsigned char a, unsigned char b)
{
unsigned char mismatch;
mismatch = a^b;
if (mismatch & 0xF0)
return table[mismatch >> 4];
else
return table[mismatch]+4;
}
More instructions, including a branch, but since the table is now only 16 bytes it would take only one or two cache misses to fill entirely. Another approach, using a total of three lookups on a 16-byte table and a five-byte table, but no branching:
unsigned char table2[5] = {0,0,0,0,0xFF};
unsigned int find_mismatch(unsigned char a, unsigned char b)
{
unsigned char mismatch,temp2;
mismatch = a^b;
temp2 = table[mismatch >> 4];
return temp2 + (table2[temp2] & table[mismatch & 15]);
}
One would have to do some profiling in the real application to see whether the reduced cache load of the smaller tables was sufficient to offset the extra instructions.
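Since the question asks for C#, here is a rough C# rendering of the branching 16-entry variant above. The table contents are an assumption derived from the description (leading-zero count of a 4-bit value), and the names are illustrative.

```csharp
static class PrefixNibbleTable
{
    // Leading-zero count of a 4-bit value:
    // index 0 -> 4, 1 -> 3, 2..3 -> 2, 4..7 -> 1, 8..15 -> 0.
    private static readonly byte[] NibbleTable =
        { 4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 };

    public static int CommonPrefixLength(byte a, byte b)
    {
        int mismatch = a ^ b;
        return (mismatch & 0xF0) != 0
            ? NibbleTable[mismatch >> 4]      // first difference is in the high nibble
            : NibbleTable[mismatch] + 4;      // high nibble matches: 4 bits plus the low nibble's share
    }
}
```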

Related

How do I copy the array elements and also reversing certain elements with Array.Copy() in c#?

Elements of SourceArray are being copied to 2 separate Arrays namely DestArray1 and DestArray2.
output:
DestArray1 will have the first 4 elements of SourceArray but in the reverse form [4 3 2 1]
DestArray2 will have the last 4 elements of SourceArray. [5 6 7 8]
I want to replace the for loop with Array.Copy() method
Without the reversal, Array.Copy() works more or less, except for the last element. For the reversed copy, either Array.Copy() cannot do it or I am not able to implement it.
int i, j;
int bytelength =8;
int halfbytelength = 4;
byte[] SourceArray = new byte[]{ 1, 2, 3, 4, 5, 6, 7, 8 };
byte[] DestArray1 = new byte[4];
byte[] DestArray2 = new byte[4];
for (i = halfbytelength - 1, j = 0; i >= 0; i -= 1, j++)
{
DestArray1[j] = SourceArray[i];
}
for (i = halfbytelength; i < bytelength; i += 1)
{
DestArray2[i - halfbytelength] = SourceArray[i];
}
I tried the following code, but the results are not as expected (see Result below). Is there a way to do it?
Array.Copy(SourceArray, 0, DestArray1, 3, 0);
Array.Copy(SourceArray, 4, DestArray2, 0, 3);
Result:
DestArray1: [0 0 0 0]
DestArray2: [5 6 7 0]

First array.
To reverse the array, you can just call Array.Reverse() after copying:
Array.Copy(SourceArray, 0, DestArray1, 0, 4);
Array.Reverse(DestArray1);
Second array.
You wrote that without the reversal "Array.Copy() works kind of fine except for the last element".
That is because the last parameter is the number of elements to copy, not an index, and you pass 3:
Array.Copy(SourceArray, 4, DestArray2, 0, 3); // 3 is the element count, not an index
Simply replace 3 with 4:
Array.Copy(SourceArray, 4, DestArray2, 0, 4); // 4 and it will copy including the last element
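Putting both fixes together, a small helper could look like the sketch below. The names SplitDemo and Split are made up for illustration; the tuple return type assumes C# 7 or later.

```csharp
using System;

static class SplitDemo
{
    // Copies the first `half` elements (reversed) and the remaining elements
    // into two new arrays.
    public static (byte[] FirstReversed, byte[] Second) Split(byte[] source, int half)
    {
        var first = new byte[half];
        var second = new byte[source.Length - half];

        Array.Copy(source, 0, first, 0, half);               // last argument is a count
        Array.Reverse(first);                                // reverse in place after copying
        Array.Copy(source, half, second, 0, second.Length);  // copy the rest, including the last element

        return (first, second);
    }
}
```

Called with { 1, 2, 3, 4, 5, 6, 7, 8 } and half = 4, this yields [4 3 2 1] and [5 6 7 8].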

C# locate value in a array and move it to right

My task is to find every element in the array that equals a given value and move it to the right end while maintaining the order of the other elements. An example (test case):
{1, 2, 0, 1, 0, 1, 0, 3, 0, 1} for value = 0 => {1, 2, 1, 1, 3, 1, 0, 0, 0, 0}
My code handles the example above, but it fails in one specific case: if an element equals the value and the next element also equals the value, the second one is not moved. Another example:
{ 1, int.MinValue, int.MinValue, int.MaxValue, int.MinValue, -1, -3, -9, 1 }, value = int.MinValue
Expected result: { 1, int.MaxValue, -1, -3, -9, 1, int.MinValue, int.MinValue, int.MinValue }
Result with my code: { 1, int.MinValue, int.MaxValue, -1, -3, -9, 1, int.MinValue, int.MinValue }
I thought shifting was the solution. Is it? I am having a lot of problems with it. I also tried Array.Copy, but it always failed with an out-of-range error.
How can I make it shift/rotate correctly in all cases?
Code:
static void Main(string[] args)
{
int[] source = new int[] { 1, int.MinValue, int.MinValue, int.MaxValue, int.MinValue, -1, -3, -9, 1 };
int value = int.MinValue;
for (int i = 0; i < source.Length; i++)
{
if (source[i] == value)
{
LeftShiftArray(source, i);
}
}
for (int i = 0; i < source.Length; i++)
{
Console.WriteLine(source[i]);
}
}
public static void LeftShiftArray(int[] source, int i)
{
var temp1 = source[i];
for (var j = i; j < source.Length - 1; j++)
{
source[j] = source[j + 1];
}
source[source.Length - 1] = temp1;
}

I have a simple approach to this problem. Loop over the array, copy every element that is not equal to the given value to arr[count], and increment count each time. When the loop finishes, fill the remaining positions with the given value.
static void MoveToEnd(int[] arr, int n)
{
    int count = 0;
    for (int i = 0; i < arr.Length; i++)
        if (arr[i] != n)
            arr[count++] = arr[i];
    while (count < arr.Length)
        arr[count++] = n;
}
Please note that I typed this answer on my phone, so please excuse any typing mistakes.

This is a classic off by one error. Think about it like this:
Let's say you are moving all 0s to the back (value = 0). When you find a zero at some position, let's say source[2], you turn your array from this:
1 1 0 0 1 1
source[2] ^
To this:
1 1 0 1 1 0
source[2] ^
Now that you've shifted the array, the next step in your code is to increase i by 1. That means the next comparison you make will be with source[3]. In the above array, that looks like this:
1 1 0 1 1 0
source[3] ^
Do you see the problem? Let me know if not and I can explain further.
PS. There are a couple of issues with the other code that was posted that will probably stop you from getting full points if you were to turn it in for an assignment :)
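One way to act on the off-by-one diagnosis above, sketched under the assumption that the shifting approach should be kept: do not advance i after a shift, and stop scanning at the start of the block of values that have already been moved. The names ArrayShift, MoveValueToEnd and end are illustrative.

```csharp
static class ArrayShift
{
    // Moves every occurrence of value to the end of source,
    // preserving the relative order of the other elements.
    public static void MoveValueToEnd(int[] source, int value)
    {
        int end = source.Length;   // source[end..] holds values that were already moved
        int i = 0;
        while (i < end)
        {
            if (source[i] != value) { i++; continue; }

            // Shift source[i+1 .. end-1] one place to the left.
            for (int j = i; j < end - 1; j++)
                source[j] = source[j + 1];
            source[end - 1] = value;

            end--;                 // i is not advanced: re-check the element that slid into position i
        }
    }
}
```

Shrinking end also keeps the loop from re-examining the moved values forever once it reaches them.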

IndexError in array PYTHON

I am trying to write this C# function in Python. Unfortunately I get this error: IndexError: list assignment index out of range, on the line Population.BinaryX.insert([i][j], dec). Can anyone tell me how to fix this problem? What did I do wrong?
C# CODE:
public class Population
{
public float[] CodedX = new float[20];
public float[,] BinaryX = new float[10, 8];
}
private void BinaryTranslating()
{
int dec;
int j = 0;
for (var i = 0; i < 10; i++)
{
while (Population.CodedX[i] > 1 & j < 8)
{
dec = (int)Population.CodedX[i] % 2;
Population.BinaryX[i, j] = dec;
Population.CodedX[i] /= 2;
j++;
}
j = 0;
}
}
private void DecimalTranslating()
{
for (var i = 0; i < 10; i++)
{
Population.CodedX[i] = Population.BinaryX[i, 7] * 128 + Population.BinaryX[i, 6] * 64 +
Population.BinaryX[i, 5] * 32 + Population.BinaryX[i, 4] * 16 +
Population.BinaryX[i, 3] * 8 + Population.BinaryX[i, 2] * 4 +
Population.BinaryX[i, 1] * 2 + Population.BinaryX[i, 0];
}
}
Python CODE:
class Population:
    CodedX = []
    BinaryX = [[], []]
class Application:
    @staticmethod
    def binary_translating():
        j = 0
        for i in range(10):
            while Population.CodedX[i] > 1 & j < 8:
                dec = int(Population.CodedX[i]) % 2
                Population.BinaryX.insert([i][j], dec)
                Population.CodedX[i] /= 2
                j += 1
            j = 0
    @staticmethod
    def decimal_translating():
        for i in range(10):
            new_item = Population.BinaryX[i][7] * 128 + Population.BinaryX[i][6] * 64 + Population.BinaryX[i][5] * 32 +\
                Population.BinaryX[i][4] * 16 + Population.BinaryX[i][3] * 8 + Population.BinaryX[i][2] * 4 +\
                Population.BinaryX[i][1] * 2 + Population.BinaryX[i][0]
            Population.CodedX.insert(i, new_item)

Consider the [i][j] expression in Population.BinaryX.insert([i][j], dec). That expression creates a 1 item list containing the value of i and then tries to take the jth item from that list.
>>> i=1
>>> j=2
>>> [i][j]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
Python lists are one-dimensional. If you want a multidimensional list, you need to create a list of lists or use some other structure such as a NumPy array or a pandas DataFrame.
One option is to preallocate the list of lists with a known value so that you can simply index into it later:
@staticmethod
def binary_translating():
    Population.BinaryX = [[None] * 8 for _ in range(10)]
    j = 0
    for i in range(10):
        # use "and", not "&": in Python "&" binds tighter than ">" and "<"
        while Population.CodedX[i] > 1 and j < 8:
            dec = int(Population.CodedX[i]) % 2
            Population.BinaryX[i][j] = dec
            Population.CodedX[i] /= 2
            j += 1
        j = 0
Another option is to insert into the sublist:
@staticmethod
def binary_translating():
    Population.BinaryX = []
    j = 0
    for i in range(10):
        Population.BinaryX.append([])  # insert() needs an index; append() adds the new row at the end
        while Population.CodedX[i] > 1 and j < 8:
            dec = int(Population.CodedX[i]) % 2
            Population.BinaryX[i].insert(j, dec)
            Population.CodedX[i] /= 2
            j += 1
        j = 0

Please make sure that Population.BinaryX is a valid list with at least 10 elements in it, because you are running the loop 10 times. The same goes for CodedX.
If either of them does not have at least 10 elements, you will get
IndexError: list assignment index out of range
Try:
class Population:
    CodedX = [0 for i in range(10)]
    BinaryX = [[0 for i in range(10)] for j in range(10)]
This is preallocating, as mentioned by tdelaney.
Each function in the Application class uses either the BinaryX or the CodedX list, but if these lists have no elements in them, Python has nothing to index into.
What I mean is that before calling decimal_translating(), Population.BinaryX must have elements in it; it cannot be an empty list.
Similarly, before calling binary_translating(), Population.CodedX must have elements in it.
[edit #1]
After your comments, and after trying to understand your code, here is what I have:
class Population(object):
    def __init__(self):
        self.CodedX = [0 for i in range(10)]  # as per your C# code
        self.BinaryX = []
        # debug code to fill CodedX array - remove it
        for i in range(10):
            self.CodedX[i] = int(761)
    def binary_translating(self):
        for i in range(10):
            j = 0
            self.BinaryX.append([0 for k in range(10)])
            while (self.CodedX[i] > 0) and (j < 10):
                dec = int(self.CodedX[i] % 2)
                self.BinaryX[i][j] = dec  # assign the bit at index j
                self.CodedX[i] = int(self.CodedX[i] / 2)
                j += 1
        # debug code to print - remove it
        print(self.BinaryX)
        # debug code to clear CodedX - remove it
        for i in range(10):
            self.CodedX[i] = int(0)
    def decimal_translating(self):
        for i in range(10):
            value = self.BinaryX[i][7] * 128 + self.BinaryX[i][6] * 64 + self.BinaryX[i][5] * 32 + \
                self.BinaryX[i][4] * 16 + self.BinaryX[i][3] * 8 + self.BinaryX[i][2] * 4 + \
                self.BinaryX[i][1] * 2 + self.BinaryX[i][0]
            self.CodedX[i] = value
        print(self.CodedX)
pop = Population()
pop.binary_translating()
pop.decimal_translating()
I have added some debug code to have some starting values in CodedX and print statements for you to see the output.
It generates the output:
[[1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1]]
[249, 249, 249, 249, 249, 249, 249, 249, 249, 249]

How to incrementally iterate through all possible values of a byte array of size n?

For my question n=16, but a generic answer would be appreciated too.
So I have a byte array:
byte[] key;
My problem is that I want to iterate through all possible values of each element in this array, combined. I know this will take ages, and I'm not looking to actually complete this loop, just to make a loop which will at least attempt this.
So e.g.:
First iteration:
//Math.Pow(2,128) is the max no. of iterations right?
byte[] key;
for(int i = 0; i < Math.Pow(2,128); i++)
{
key = new byte[16] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
}
Second iteration:
//Math.Pow(2,128) is the max no. of iterations right?
byte[] key;
for(int i = 0; i < Math.Pow(2,128); i++)
{
key = new byte[16] {1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
}
Third iteration:
//Math.Pow(2,128) is the max no. of iterations right?
byte[] key;
for(int i = 0; i < Math.Pow(2,128); i++)
{
key = new byte[16] {2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
}
Final iteration:
//Math.Pow(2,128) is the max no. of iterations right?
byte[] key;
for(int i = 0; i < Math.Pow(2,128); i++)
{
key = new byte[16] {255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255};
}
Obviously I have just hardcoded the array above. I need a way of doing this in a proper way. Again, I know there are many different combinations. All I need is a way to start iterating through all possible values. How can I do this in my loop?
I.e. what should I replace the body of my loop with, in order to iterate through all possible values of a byte array of size 16.
What I have tried:
In the body of the loop I have tried the following:
key = new byte[16] { (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i, (byte)i };
Obviously wrong, will only test a small subset of possible values. Will just try i= 0,...,255 and then start over for when i=256 --> (byte)i = 0.
I suspect I need some more nesting. Possibly up to 16 nested loops, which sounds insane and probably wrong? I can't get my head around this problem, any help would be much appreciated!
Purpose:
The purpose of this question is to demonstrate how inefficient brute force cryptanalysis is in practice. The rest of my program works, I'm just stuck in this loop.

In case you don't realize: 16 bytes is the size of a Guid and a common cryptographic key size (128 bits). There are so many combinations that you cannot enumerate even a tiny fraction of them. Maybe you could enumerate the last 8 bytes if you parallelized across 1000 machines and waited a year.
You could do that easily by running a for loop from 0 to ulong.MaxValue. I'm submitting this as an answer because this very simple idea lets you start enumerating, even though you will essentially never reach the point where you finish.
for (ulong i = 0; i < ulong.MaxValue; i++) {
    // Spell out byte[] so the element type does not depend on inference from the 0 literals.
    var bytes = new byte[] {
        0, 0, 0, 0, 0, 0, 0, 0
        , (byte)(i >> (7 * 8))
        , (byte)(i >> (6 * 8))
        , (byte)(i >> (5 * 8))
        //...
        , (byte)(i >> (0 * 8)) };
}
Or, just use 16 nested for loops. I don't think that's insane at all because it is so simple that it's clearly correct.

This is sample code, without much error handling and somewhat inefficient, that simulates a counter like the one you described:
public static void NextIteration(byte[] input)
{
if (input.All(x => x == 255))
throw new InvalidOperationException("there is no iteration left");
var converted = input.Select(x => (int) x).ToArray();
converted[0]++;
for (var i = 0; i < converted.Length; i++)
{
if (converted[i] == 256)
{
converted[i] = 0;
converted[i + 1]++;
}
}
for (var i = 0; i < input.Length; i++)
{
input[i] = (byte) converted[i];
}
}
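A leaner variant of the same counter idea is to increment the array in place and carry into the next byte only when the current one wraps. This is a sketch with made-up names (ByteCounter, Increment); like the code above it treats index 0 as the fastest-changing byte, which is an arbitrary choice.

```csharp
static class ByteCounter
{
    // Advances key to the next value in place.
    // Returns false once the counter has wrapped from all 0xFF back to all 0x00.
    public static bool Increment(byte[] key)
    {
        for (int i = 0; i < key.Length; i++)
        {
            key[i] = unchecked((byte)(key[i] + 1));   // 255 wraps to 0
            if (key[i] != 0) return true;             // no carry out of this byte, done
        }
        return false;                                 // every byte wrapped: all values visited
    }
}
```

A driver loop could start from new byte[16] and run do { /* try key */ } while (ByteCounter.Increment(key)); it works for any array length, not just 16.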

How to read a range of bytes from a bytearray in c#

I have to read a range of bytes from a byte array. I have the starting position and the ending position to read.
-(NSData *) getSubDataFrom:(int)stPos To:(int)endPos withData:(NSData *) data{
NSRange range = NSMakeRange(stPos, endPos);
return [data subdataWithRange:range];
}
The Objective-C code above reads a range of bytes from an NSData object (a byte array). Is there an equivalent method in C# to do the same, or how else can we do this? Please advise!

What do you mean by read? Copy a range of bytes into another byte array?
var mainArray = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
var startPos = 5;
var endPos = 10;
var subset = new byte[endPos - startPos + 1];
Array.Copy(mainArray, startPos, subset, 0, endPos - startPos + 1);
From MSDN

Try the Array.Copy() or Array.CopyTo() method.
