Are these two functions the same?

There is a function in the AES algorithm that multiplies a byte by 2 in the Galois field GF(2^8).
This is the function given on a website:
private static byte gfmultby02(byte b)
{
    if (b < 0x80)
        return (byte)(int)(b << 1);
    else
        return (byte)((int)(b << 1) ^ (int)(0x1b));
}
This is the function I wrote:
private static byte MulGF2(byte x)
{
    if (x < 0x80)
        return (byte)(x << 1);
    else
    {
        return (byte)((x << 1) ^ 0x1b);
    }
}
What I need to know is whether, given any byte, these will behave in the same manner. I am worried about the extra cast to int and then back to byte. So far I have tested it and it looks fine. Does the extra cast to int and then to byte make a difference in rare cases?

I think in this case the cast to int does nothing, because b << 1 is already an int: C# promotes a byte operand to int before shifting. So let's take a little example:
byte b = 0x80;
// temp1 == 0x00000100
int temp1 = (int)(b << 1);
// temp2 == 0x00000100
int temp2 = ((int)b) << 1;
// temp3 == 0x00, only the final cast to byte drops the ninth bit
byte temp3 = (byte)(b << 1);
So as you can see, where the int cast sits has no impact on the result; only the final cast to byte truncates. If I took the formula from the website correctly, your code should behave the same.

I think it's correct, but:
The best way to make sure is to simply test it; there are only 256 cases and it shouldn't take many minutes to write the test case.
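The exhaustive check suggested above can be sketched like this. It is a Python model rather than C#: both functions are transcribed by hand, `& 0xFF` stands in for the final cast to byte, and the names `mul_gf2` and `all_bytes_agree` are made up for the sketch.

```python
def gfmultby02(b):
    """Website version: the shifted value is 'cast to int' before the byte cast."""
    if b < 0x80:
        return int(b << 1) & 0xFF
    return (int(b << 1) ^ int(0x1B)) & 0xFF


def mul_gf2(x):
    """Version from the question, without the intermediate int casts."""
    if x < 0x80:
        return (x << 1) & 0xFF
    return ((x << 1) ^ 0x1B) & 0xFF


def all_bytes_agree():
    """Compare the two functions on every possible byte value (0..255)."""
    return all(gfmultby02(b) == mul_gf2(b) for b in range(256))
```

Calling `all_bytes_agree()` covers all 256 inputs, so there are no rare cases left untested. The same loop is easy to port to a C# unit test.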

Related

Python to C# explanation

I have been converting a Python script to C#, I am 99% there but I am having trouble understanding the following piece of code
# The lower 8-bits from the Xorshift PNR are subtracted from byte
# values during extraction, and added to byte values on insertion.
# When calling deobfuscate_string() the whole string is processed.
def deobfuscate_string(pnr, obfuscated, operation=int.__sub__):
    return ''.join([chr((ord(c) operation pnr.next()) & 0xff) for c in obfuscated])
Could you please explain the code above? What does operation pnr.next() do? If you could help convert this method to C#, that would be even better, but an explanation of the above would be great.
Full source can be found at
https://raw.githubusercontent.com/sladen/pat/master/gar.py

The snippet you provided is not valid Python code. A function name cannot be written in place of an infix operator. I think it is meant to be like this:
# The lower 8-bits from the Xorshift PNR are subtracted from byte
# values during extraction, and added to byte values on insertion.
# When calling deobfuscate_string() the whole string is processed.
def deobfuscate_string(pnr, obfuscated, operation=int.__sub__):
    return ''.join([chr(operation(ord(c), pnr.next()) & 0xff) for c in obfuscated])
You see, this way it will execute the operation on ord(c) and pnr.next(). With that, the translation to C# is straightforward: operation should be of type Func<int, int, int>.
This might give you an idea:
public static T Next<T>(IEnumerator<T> en) {
    en.MoveNext();
    return en.Current;
}
public static string deobfuscate_string(IEnumerator<int> pnr, string obfuscated, Func<int, int, int> operation = null) {
    if (operation == null) operation = (a, b) => a - b;
    // Mask with 0xff as the Python code does, so the result stays in the byte range.
    return string.Join("", from c in obfuscated select (char)(operation((int)c, Next(pnr)) & 0xff));
}
EDIT: added default parameter to deobfuscate_string
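For comparison, here is a minimal runnable Python 3 sketch of the corrected function. It assumes pnr is any iterator of integers (so `next(pnr)` replaces the Python 2 style `pnr.next()`), uses `operator.sub` in place of `int.__sub__`, and adds a hypothetical `obfuscate_string` helper that is not in the original script.

```python
import operator


def deobfuscate_string(pnr, obfuscated, operation=operator.sub):
    """Apply operation(char code, next PRNG value) to each character, modulo 256."""
    return "".join(chr(operation(ord(c), next(pnr)) & 0xFF) for c in obfuscated)


def obfuscate_string(pnr, plain):
    """Opposite direction: add the PRNG values instead of subtracting them."""
    return deobfuscate_string(pnr, plain, operation=operator.add)
```

Passing the operation as a callable is what makes one function serve both directions, which is also why the C# version takes a Func<int, int, int>.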

The function deobfuscate_string takes an iterator pnr, a string obfuscated, and an operation that defaults to subtraction.
For each character c in the string obfuscated:
It applies the operation (subtraction by default) to the value of the character and the next element of pnr.
Then it uses & 0xff to keep the result in the range 0 to 255.
Everything is then combined into a string.
So it just shifts every character of the input by a known sequence of offsets.
Note: the code is not valid, as operation cannot be used this way; I am only explaining the goal here.

Thank you everyone for posting responses, I ended up grabbing a Python debugger and working it through.
private static byte[] deobfuscate_string(XORShift128 pnr, byte[] obfuscated)
{
byte[] deobfuscated = new byte[obfuscated.Length];
for (int i = 0; i < obfuscated.Length; i++)
{
byte b = Convert.ToByte((obfuscated[i] - pnr.next()) & 0xff);
deobfuscated[i] = b;
}
Array.Reverse(deobfuscated);
return deobfuscated;
}
private class XORShift128
{
private UInt32 x = 123456789;
private UInt32 y = 362436069;
private UInt32 z = 521288629;
private UInt32 w = 88675123;
public XORShift128(UInt32 x, UInt32 y)
{
this.x = x;
this.y = y;
}
public UInt32 next()
{
UInt32 t = (x ^ (x << 11)) & 0xffffffff;
x = y;
y = z;
z = w;
w = (w ^ (w >> 19) ^ (t ^ (t >> 8)));
return w;
}
}
Above is what I ended up with.
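To cross-check a port like this against Python, the generator can be modelled as below. This is a sketch, not the code from gar.py: the class name `XorShift128` and the helper `deobfuscate_bytes` are made up here, every intermediate value is masked to 32 bits to imitate UInt32 arithmetic, and the Array.Reverse step from the C# version is left out.

```python
MASK32 = 0xFFFFFFFF


class XorShift128:
    """Marsaglia-style xorshift128 with explicit 32-bit masking."""

    def __init__(self, x=123456789, y=362436069, z=521288629, w=88675123):
        self.x, self.y, self.z, self.w = x, y, z, w

    def next(self):
        # Python ints do not wrap, so mask after the left shift.
        t = (self.x ^ (self.x << 11)) & MASK32
        self.x, self.y, self.z = self.y, self.z, self.w
        self.w = (self.w ^ (self.w >> 19) ^ t ^ (t >> 8)) & MASK32
        return self.w


def deobfuscate_bytes(pnr, obfuscated):
    """Subtract the low 8 bits of each PRNG output from each byte."""
    return bytes((b - pnr.next()) & 0xFF for b in obfuscated)
```

Comparing the first few outputs of both implementations for the same seeds is a quick way to catch a missing mask or a swapped shift.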

Get the sign of a number in C# without conditional statement

Just out of curiosity, is there a way to get the sign of a number, any kind (but obviously a signed type), not just integer using some bitwise/masking, or other kind of, operation?
That is without using any conditional statement or calling the Math.Sign() function.
Thanks in advance!
EDIT: I recognize it was a misleading question. What I had in mind was more like: "get the same output as Math.Sign() or, simplifying, get 0 if x <= 0, 1 otherwise".
EDIT #2: to all those asking for code, I didn't have any in mind when I posted the question, but here's an example I came up with, just to give a context of a possible application:
x = (x < 0) ? 0 : x;
Having the sign into a variable could lead to:
x = sign * x; //where sign = 0 for x <= 0, otherwise sign = 1;
The aim would be to achieve the same result as the above :)
EDIT #3: FOUND IT! :D
// THIS IS NOT MEANT TO BE PLAIN C#!
// Returns 0 if x <= 0, 1 otherwise.
int SignOf(x)
{
return (1+x-(x+1)%x)/x;
}
Thanks to everyone!

is there a way to get the sign of a number (any kind, not just integer)
Not for any number type, no. For an integer, you can test the most significant bit: if it's 1, the number is negative. You could theoretically do the same with a floating point number, but bitwise operators don't work on float or double.
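Bitwise operators do not apply to floating-point values directly, but the sign bit can still be read by reinterpreting the stored bytes as an integer. A rough Python sketch of both cases follows; the function names are made up, and `struct` plays the role that something like BitConverter.DoubleToInt64Bits would play for a double in C#.

```python
import struct


def int32_sign_bit(x):
    """Sign bit of x viewed as a 32-bit two's complement integer."""
    return (x >> 31) & 1


def float32_sign_bit(value):
    """Sign bit of value when stored as an IEEE 754 single-precision float."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return bits >> 31
```

Note that the float version reports 1 for negative zero, because it looks only at the stored sign bit and not at the numeric value.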

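Here's a "zero safe" solution based on GetHashCode(). It works for int and float; for double and decimal the hash code is computed by combining several 32-bit parts, so its top bit is not guaranteed to be the sign bit: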
(value.GetHashCode() >> 31) + 1;
Output (input -> result): 1 -> 1, -1 -> 0, 0.5 -> 1, -0.5 -> 0, 0 -> 1
It's also roughly 10% cheaper than (1+x-(x+1)%x)/x; in C#. Additionally if "value" is an integer, you can drop the GetHashCode() function call in which case (1+x-(x+1)%x)/x; is 127% more expensive ((value >> 31) + 1; is 56% cheaper).
Zero has its sign bit clear, so it is treated like a positive number and yields 1; returning 1 for positive numbers but 0 for 0 would be inconsistent. If you could pass in a negative zero, you would get an output of 0.
I understand that GetHashCode() is a function call, but the inner workings of the function in the C# language implementation is entirely "arithmetic". Basically the GetHashCode() function reads the memory section, that stores your float type, as an integer type:
*((int*)&singleValue);
How the GetHashCode function works (best source I could find quickly) - https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c3fde60-1b4a-449f-afdc-fe5bba8fece3/hash-code-of-floats?forum=netfxbcl
If you want the output value to be 1 with the same sign as the input, use:
((float.GetHashCode() >> 31) * 2) + 1;
The above floating-point method is roughly 39% cheaper than System.Math.Sign(float) (System.Math.Sign(float) is roughly 65% more expensive). Where System.Math.Sign(float) throws an exception for float.NaN, ((float.NaN.GetHashCode() >> 31) * 2) + 1; does not and will return -1 instead of crashing.
or for integers:
((int >> 31) * 2) + 1;
The above integer method is roughly 56% cheaper than System.Math.Sign(int) (System.Math.Sign(int) is roughly 125% more expensive).

It depends on the type of number value type you are targeting.
For signed integers, C# and most computer systems use the so-called two's complement representation.
That means the sign is stored in the most significant bit of the value.
So you can extract the sign like this:
Int16 number = -2;
Int16 sign = (Int16)((number >> 15) & 1);
Boolean isNegative = Convert.ToBoolean(sign);
Note that up until now we have not used any conditional operator (explicitly anyways)
But: You still don't know whether the number has a sign or not.
The logical equivalent of your question: "How do I know, if my number is negative?" explicitly requires the usage of a conditional operator as the question is, after all conditional.
So you won't be able to dodge:
if(isNegative)
doThis();
else
doThat();

To just get the sign, you can avoid conditional operators, as you will see below in the Sign extension method for int. However, to get the name, I don't think you can avoid a conditional operator.
class Program
{
static void Main(string[] args)
{
Console.WriteLine(0.Sign());
Console.WriteLine(0.SignName());
Console.WriteLine(12.Sign());
Console.WriteLine(12.SignName());
Console.WriteLine((-15).Sign());
Console.WriteLine((-15).SignName());
Console.ReadLine();
}
}
public static class extensions
{
public static string Sign(this int signedNumber)
{
return (signedNumber.ToString("+00;-00").Substring(0, 1));
}
public static string SignName(this int signedNumber)
{
return (signedNumber.ToString("+00;-00").Substring(0, 1)=="+"?"positive":"negative");
}
}

If x == 0, you will get a DivideByZeroException with the code you posted:
int SignOf(x)
{
    return (1+x-(x+1)%x)/x;
}
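Besides the division by zero, the posted expression evaluates to (1 + 1 - 0) / 1 = 2 for x == 1. A shift-based alternative that avoids division entirely can be sketched as follows. It is a Python model that assumes x lies in the 32-bit signed range, and the function names are made up for illustration.

```python
def is_positive32(x):
    """Return 1 if x > 0, else 0, for x in the 32-bit signed range."""
    negative = x >> 31      # -1 if x < 0, otherwise 0
    negated = (-x) >> 31    # -1 if x > 0, otherwise 0 or 1
    # Keep the lowest bit only when -x is negative and x itself is not.
    return negated & ~negative & 1


def clamp_negative_to_zero(x):
    """Same effect as x = (x < 0) ? 0 : x, without a conditional."""
    return is_positive32(x) * x
```

The extra `& ~negative` term is there for the most negative value, whose negation is not negative and would otherwise be reported as positive.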

Setting all low order bits to 0 until two 1s remain (for a number stored as a byte array)

I need to set all the low order bits of a given BigInteger to 0 until only two 1 bits are left. In other words leave the highest and second-highest bits set while unsetting all others.
The number could be any combination of bits. It may even be all 1s or all 0s. Example:
MSB 0000 0000
1101 1010
0010 0111
...
...
...
LSB 0100 1010
We can easily take out corner cases such as 0, 1, PowerOf2, etc. Not sure how to apply popular bit manipulation algorithms on an array of bytes representing one number.
I have already looked at bithacks but have the following constraints. The BigInteger structure only exposes underlying data through the ToByteArray method which itself is expensive and unnecessary. Since there is no way around this, I don't want to slow things down further by implementing a bit counting algorithm optimized for 32/64 bit integers (which most are).
In short, I have a byte [] representing an arbitrarily large number. Speed is the key factor here.
NOTE: In case it helps, the numbers I am dealing with have around 5,000,000 bits. They keep on decreasing with each iteration of the algorithm so I could probably switch techniques as the magnitude of the number decreases.
Why I need to do this: I am working with a 2D graph and am particularly interested in coordinates whose x and y values are powers of 2. So (x+y) will always have two bits set and (x-y) will always have consecutive bits set. Given an arbitrary coordinate (x, y), I need to transform an intersection by getting values with all bits unset except the first two MSB.

Try the following (not sure if it's actually valid C#, but it should be close enough):
// find the next non-zero byte at or below index i (I'm assuming little endian,
// so the most significant byte is last) or return -1
int find_next_byte(byte[] data, int i) {
    while (i >= 0 && data[i] == 0) --i;
    return i;
}
// find a bit mask of the next non-zero bit or return 0
int find_next_bit(int value, int b) {
    while (b > 0 && ((value & b) == 0)) b >>= 1;
    return b;
}
byte[] data;
int i = find_next_byte(data, data.Length - 1);
// find the first 1 bit
int b = find_next_bit(data[i], 1 << 7);
// try to find the second 1 bit
b = find_next_bit(data[i], b >> 1);
if (b > 0) {
    // found 2 bits, removing the rest
    if (b > 1) data[i] &= (byte)(~(b - 1) & 0xFF);
} else {
    // we only found 1 bit, find the next non-zero byte
    i = find_next_byte(data, i - 1);
    if (i >= 0) {
        b = find_next_bit(data[i], 1 << 7);
        if (b > 1) data[i] &= (byte)(~(b - 1) & 0xFF);
    }
}
// remove the rest (a memcpy would be even better here,
// but that would probably require unmanaged code)
for (--i; i >= 0; --i) data[i] = 0;
Untested.
Probably this would be a bit more performant if compiled as unmanaged code or even with a C or C++ compiler.
As harold noted correctly, if you have no a priori knowledge about your number, this O(n) method is the best you can do. If you can, you should keep the position of the highest two non-zero bytes, which would drastically reduce the time needed to perform your transformation.
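As a reference to test a byte-array implementation against, the transformation itself can be written on arbitrary-precision integers. This is a Python sketch with a made-up function name; it relies on `int.bit_length()` and makes no claim about speed for numbers with millions of bits.

```python
def keep_top_two_bits(n):
    """Clear every bit of n except the two most significant set bits."""
    if n <= 0:
        return 0
    top = 1 << (n.bit_length() - 1)            # highest set bit
    rest = n ^ top                             # everything below it
    if rest == 0:
        return top                             # power of two: only one bit exists
    second = 1 << (rest.bit_length() - 1)      # second-highest set bit
    return top | second
```

Feeding the same random inputs to this function and to the byte-array code, then comparing results, is an easy way to catch off-by-one mistakes in the masking.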

I'm not sure if this is getting optimised out or not but this code appears to be 16x faster than ToByteArray. It also avoids the memory copy and it means you get to the results as uint instead of byte so you should have further improvements there.
//create delegate to get private _bit field
var par = Expression.Parameter(typeof(BigInteger));
var bits = Expression.Field(par, "_bits");
var lambda = Expression.Lambda(bits, par);
var func = (Func<BigInteger, uint[]>)lambda.Compile();
//test call our delegate
var bigint = BigInteger.Parse("3498574578238348969856895698745697868975687978");
int time = Environment.TickCount;
for (int y = 0; y < 10000000; y++)
{
var x = func(bigint);
}
Console.WriteLine(Environment.TickCount - time);
//compare time to ToByteArray
time = Environment.TickCount;
for (int y = 0; y < 10000000; y++)
{
var x = bigint.ToByteArray();
}
Console.WriteLine(Environment.TickCount - time);
From there, finding the top 2 bits should be pretty easy. The first bit will be in the first int I presume, then it is just a matter of searching for the second topmost bit. If it is in the same integer, then just set the first bit to zero and find the topmost bit; otherwise search for the next non-zero int and find the topmost bit.
EDIT: to make things simple just copy/paste this class into your project. This creates extension methods that means you can just call mybigint.GetUnderlyingBitsArray(). I added a method to get the Sign also and, to make it more generic, have created a function that will allow accessing any private field of any object. I found this to be slower than my original code in debug mode but the same speed in release mode. I would advise performance testing this yourself.
static class BigIntegerEx
{
private static Func<BigInteger, uint[]> getUnderlyingBitsArray;
private static Func<BigInteger, int> getUnderlyingSign;
static BigIntegerEx()
{
getUnderlyingBitsArray = CompileFuncToGetPrivateField<BigInteger, uint[]>("_bits");
getUnderlyingSign = CompileFuncToGetPrivateField<BigInteger, int>("_sign");
}
private static Func<TObject, TField> CompileFuncToGetPrivateField<TObject, TField>(string fieldName)
{
var par = Expression.Parameter(typeof(TObject));
var field = Expression.Field(par, fieldName);
var lambda = Expression.Lambda(field, par);
return (Func<TObject, TField>)lambda.Compile();
}
public static uint[] GetUnderlyingBitsArray(this BigInteger source)
{
return getUnderlyingBitsArray(source);
}
public static int GetUnderlyingSign(this BigInteger source)
{
return getUnderlyingSign(source);
}
}

Reversing a hash function

I have the following hash function, and I'm trying to get my way to reverse it, so that I can find the key from a hashed value.
uint Hash(string s)
{
uint result = 0;
for (int i = 0; i < s.Length; i++)
{
result = ((result << 5) + result) + s[i];
}
return result;
}
The code is in C# but I assume it is clear.
I am aware that for one hashed value, there can be more than one key, but my intent is not to find them all, just one that satisfies the hash function suffices.
EDIT :
The string that the function accepts is formed only from the digits 0 to 9 and the chars '*' and '#', hence the Unhash function must respect these criteria too.
Any ideas? Thank you.

This should reverse the operations:
string Unhash(uint hash)
{
List<char> s = new List<char>();
while (hash != 0)
{
s.Add((char)(hash % 33));
hash /= 33;
}
s.Reverse();
return new string(s.ToArray());
}
This should return a string that gives the same hash as the original string, but it is very unlikely to be the exact same string.

Characters 0-9,*,# have ASCII values 48-57,42,35, or binary: 00110000 ... 00111001, 00101010, 00100011
The low 5 bits of those values are all different, and the 6th bit (0x20) is always 1. This means that you can deduce your last character in a loop from the current hash:
uint lastChar = hash & 0x1F - ((hash >> 5) - 1) & 0x1F + 0x20;
(if this doesn't work, I don't know who wrote it)
Now roll back hash,
hash = (hash - lastChar) / 33;
and repeat the loop until hash becomes zero. I don't have C# on me, but I'm 70% confident that this should work with only minor changes.
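The one-line expression above is hard to verify (in C#, binary - and + bind tighter than &, so it needs parentheses at the very least). A simpler variant of the same "peel off the last character" loop uses the remainder modulo 33 instead, since the 12 allowed characters all have different remainders (2, 9 and 15 to 24). The Python sketch below is a swapped-in technique, not the formula from this answer, and it only handles hashes that never overflowed 32 bits, which holds for keys of up to 6 characters. The names `hash33` and `unhash_short` are made up.

```python
CHARSET = "0123456789*#"
# Each allowed character has a distinct remainder modulo 33.
BY_RESIDUE = {ord(c) % 33: c for c in CHARSET}


def hash33(s):
    """Python model of the C# Hash function (32-bit wraparound)."""
    result = 0
    for ch in s:
        result = (result * 33 + ord(ch)) & 0xFFFFFFFF
    return result


def unhash_short(h):
    """Recover a key for h, or None if no overflow-free key exists."""
    chars = []
    while h != 0:
        c = BY_RESIDUE.get(h % 33)
        if c is None or h < ord(c):
            return None
        chars.append(c)
        h = (h - ord(c)) // 33
    return "".join(reversed(chars))
```

For longer keys the hash wraps around and this loop returns None, so a search such as brute force is still needed there.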

Brute force should work if uint is 32 bits. Try at least 2^32 strings and one of them is likely to hash to the same value. Should only take a few minutes on a modern pc.
You have 12 possible characters, and 12^9 is about 2^32, so if you try 9 character strings you're likely to find your target hash. I'll do 10 character strings just to be safe.
(simple recursive implementation in C++, don't know C# that well)
#define NUM_VALID_CHARS 12
#define STRING_LENGTH 10
const char valid_chars[NUM_VALID_CHARS] = {'0', ..., '#' ,'*'};
void unhash(uint hash_value, char *string, int nchars) {
if (nchars == STRING_LENGTH) {
string[STRING_LENGTH] = 0;
if (Hash(string) == hash_value) { printf("%s\n", string); }
} else {
for (int i = 0; i < NUM_VALID_CHARS; i++) {
string[nchars] = valid_chars[i];
unhash(hash_value, string, nchars + 1);
}
}
}
Then call it with:
char string[STRING_LENGTH + 1];
unhash(hash_value, string, 0);

Hash functions are designed to be difficult or impossible to reverse, hence the name (visualize meat + potatoes being ground up)

I would start out by writing each step that result = ((result << 5) + result) + s[i]; does on a separate line. This will make solving a lot easier. Then all you have to do is the opposite of each line (in the opposite order too).

Number of unset bit left of most significant set bit?

Assuming the 64bit integer 0x000000000000FFFF which would be represented as
00000000 00000000 00000000 00000000
00000000 00000000 >11111111 11111111
How do I find the amount of unset bits to the left of the most significant set bit (the one marked with >) ?

In straight C (long long are 64 bit on my setup), taken from similar Java implementations: (updated after a little more reading on Hamming weight)
A little more explanation: The top part just sets all bits to the right of the most significant 1, and then negates the result. (i.e. all the 0's to the 'left' of the most significant 1 are now 1's and everything else is 0).
Then I used a Hamming Weight implementation to count the bits.
unsigned long long i = 0x0000000000000000LLU;
i |= i >> 1;
i |= i >> 2;
i |= i >> 4;
i |= i >> 8;
i |= i >> 16;
i |= i >> 32;
// Highest bit in input and all lower bits are now set. Invert to set the bits to count.
i=~i;
i -= (i >> 1) & 0x5555555555555555LLU; // each 2 bits now contains a count
i = (i & 0x3333333333333333LLU) + ((i >> 2) & 0x3333333333333333LLU); // each 4 bits now contains a count
i = (i + (i >> 4)) & 0x0f0f0f0f0f0f0f0fLLU; // each 8 bits now contains a count
i *= 0x0101010101010101LLU; // add each byte to all the bytes above it
i >>= 56; // the number of bits
printf("Leading 0's = %llu\n", i);
I'd be curious to see how this was efficiency wise. Tested it with several values though and it seems to work.
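The same two steps (smear the highest set bit downwards, then count the inverted bits) can be checked quickly in Python. This is only a sketch: the name `leading_zeros64` is made up, the explicit 64-bit mask replaces the fixed width of unsigned long long, and a plain bit count stands in for the Hamming weight arithmetic.

```python
MASK64 = (1 << 64) - 1


def leading_zeros64(v):
    """Count zero bits above the most significant set bit of a 64-bit value."""
    v &= MASK64
    # Smear the highest set bit into every lower position.
    for shift in (1, 2, 4, 8, 16, 32):
        v |= v >> shift
    # After inversion only the leading zeros of the input remain set.
    return bin(~v & MASK64).count("1")
```

For the value from the question, `leading_zeros64(0x000000000000FFFF)` gives 48, and an input of 0 gives 64.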

Based on: http://www.hackersdelight.org/HDcode/nlz.c.txt
template<typename T> int clz(T v) {int n=sizeof(T)*8;int c=n;while (n){n>>=1;if (v>>n) c-=n,v>>=n;}return c-v;}
If you'd like a version that allows you to keep your lunch down, here you go:
int clz(uint64_t v) {
int n=64,c=64;
while (n) {
n>>=1;
if (v>>n) c-=n,v>>=n;
}
return c-v;
}
As you'll see, you can save cycles on this by careful analysis of the assembler, but the strategy here is not a terrible one. The while loop will operate Lg[64]=6 times; each time it will convert the problem into one of counting the number of leading bits on an integer of half the size.
The if statement inside the while loop asks the question: "can i represent this integer in half as many bits", or analogously, "if i cut this in half, have i lost it?". After the if() payload completes, our number will always be in the lowest n bits.
At the final stage, v is either 0 or 1, and this completes the calculation correctly.

If you are dealing with unsigned integers, you could do this:
#include <math.h>
#include <stdint.h>
int numunset(uint64_t number)
{
int nbits = sizeof(uint64_t)*8;
if(number == 0)
return nbits;
int first_set = floor(log2(number));
return nbits - first_set - 1;
}
I don't know how it will compare in performance to the loop and count methods that have already been offered because log2() could be expensive.
Edit:
This could cause some problems with high-valued integers since the log2() function is casting to double and some numerical issues may arise. You could use the log2l() function that works with long double. A better solution would be to use an integer log2() function as in this question.

// clear all bits except the lowest set bit
x &= -x;
// if x==0, add 0, otherwise add x - 1.
// This sets all bits below the one set above to 1.
x+= (-(x==0))&(x - 1);
return 64 - count_bits_set(x);
Where count_bits_set is the fastest version of counting bits you can find. See https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel for various bit counting techniques.

I'm not sure I understood the problem correctly. I think you have a 64bit value and want to find the number of leading zeros in it.
One way would be to find the most significant bit and simply subtract its position from 63 (assuming lowest bit is bit 0). You can find out the most significant bit by testing whether a bit is set from within a loop over all 64 bits.
Another way might be to use the (non-standard) __builtin_clzll in gcc, the 64-bit variant of __builtin_clz; its result is undefined for an input of 0.

I agree with the binary search idea. However two points are important here:
The range of valid answers to your question is from 0 to 64 inclusive. In other words - there may be 65 different answers to the question. I think (almost sure) all who posted the "binary search" solution missed this point, hence they'll get wrong answer for either zero or a number with the MSB bit on.
If speed is critical - you may want to avoid the loop. There's an elegant way to achieve this using templates.
The following template stuff finds the MSB correctly of any unsigned type variable.
// helper
template <int bits, typename T>
bool IsBitReached(T x)
{
const T cmp = T(1) << (bits ? (bits-1) : 0);
return (x >= cmp);
}
template <int bits, typename T>
int FindMsbInternal(T x)
{
if (!bits)
return 0;
int ret;
if (IsBitReached<bits>(x))
{
ret = bits;
x >>= bits;
} else
ret = 0;
return ret + FindMsbInternal<bits/2, T>(x);
}
// Main routine
template <typename T>
int FindMsb(T x)
{
const int bits = sizeof(T) * 8;
if (IsBitReached<bits>(x))
return bits;
return FindMsbInternal<bits/2>(x);
}

Here you go, pretty trivial to update as you need for other sizes...
int bits_left(unsigned long long value)
{
static unsigned long long mask = 0x8000000000000000;
int c = 64;
// doh
if (value == 0)
return c;
// check byte by byte to see what has been set
if (value & 0xFF00000000000000)
c = 0;
else if (value & 0x00FF000000000000)
c = 8;
else if (value & 0x0000FF0000000000)
c = 16;
else if (value & 0x000000FF00000000)
c = 24;
else if (value & 0x00000000FF000000)
c = 32;
else if (value & 0x0000000000FF0000)
c = 40;
else if (value & 0x000000000000FF00)
c = 48;
else if (value & 0x00000000000000FF)
c = 56;
// skip
value <<= c;
while(!(value & mask))
{
value <<= 1;
c++;
}
return c;
}

Same idea as user470379's, but counting down ...
Assume all 64 bits are unset. While value is larger than 0 keep shifting the value right and decrementing number of unset bits:
/* untested */
int countunsetbits(uint64_t val) {
int x = 64;
while (val) { x--; val >>= 1; }
return x;
}

Try
#include <limits.h>
#include <stdint.h>
int countBits(uint64_t value)
{
    int result = sizeof(value) * CHAR_BIT; // 64 for a uint64_t
    while (value != 0)
    {
        --result;
        value = value >> 1; // Remove bottom bits until all 1s are gone.
    }
    return result;
}

Use log base 2 to get the position of the most significant digit which is 1.
log2(2) = 1, meaning 0b10 -> 1
log2(4) = 2, 5-7 => 2.xx, or 0b100 -> 2
log2(8) = 3, 9-15 => 3.xx, 0b1000 -> 3
log2(16) = 4, you get the idea
and so on...
The numbers in between become fractions of the log result. So typecasting the value to an int gives you the position of the most significant digit.
Once you get this number, say b, the simple 63 - b will be the answer (for n > 0; bit positions are counted from 0).
function get_pos_msd(int n){
    return int(log2(n))
}
last_zero = 63 - get_pos_msd(n)
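An integer-only version of the same idea avoids floating-point logarithms, which can round up for values just below a power of two. In Python, `int.bit_length()` returns floor(log2(n)) + 1 for positive n, so a sketch (with a made-up function name and an assumed default width of 64) is:

```python
def leading_zeros(n, width=64):
    """Number of zero bits above the most significant set bit of n."""
    if not 0 <= n < (1 << width):
        raise ValueError("n does not fit in the given width")
    # bit_length() is 0 for n == 0, so zero naturally maps to `width`.
    return width - n.bit_length()
```

This also handles n == 0 without a special case, which the logarithm cannot do.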
