I am having a problem with this method I wrote to convert a UInt64 to a binary array. For some numbers I am getting an incorrect binary representation.
Results
Correct
999 = 1111100111
Correct
18446744073709551615 = 1111111111111111111111111111111111111111111111111111111111111111
Incorrect?
18446744073709551614 = 0111111111111111111111111111111111111111111111111111111111111110
According to an online converter the binary value of 18446744073709551614 should be
1111111111111111111111111111111111111111111111111111111111111110
public static int[] GetBinaryArray(UInt64 n)
{
if (n == 0)
{
return new int[2] { 0, 0 };
}
var val = (int)(Math.Log(n) / Math.Log(2));
if (val == 0)
val++;
var arr = new int[val + 1];
for (int i = val, j = 0; i >= 0 && j <= val; i--, j++)
{
if ((n & ((UInt64)1 << i)) != 0)
arr[j] = 1;
else
arr[j] = 0;
}
return arr;
}
FYI: This is not a homework assignment. I need to convert an integer to a binary array for encryption purposes, hence the need for an array of bits. Many solutions I have found on this site convert an integer to a string representation of a binary number, which was useless to me, so I came up with this mashup of various other methods.
An explanation as to why the method works for some numbers and not others would be helpful. Yes I used Math.Log and it is slow, but performance can be fixed later.
EDIT: And yes I do need the line where I use Math.Log because my array will not always be 64 bits long, for example if my number was 4 then in binary it is 100 which is array length 3. It is a requirement of my application to do it this way.
It's not the returned array for the input UInt64.MaxValue - 1 which is wrong, it seems like UInt64.MaxValue is wrong.
The array is 65 elements long. This is intuitively wrong because UInt64.MaxValue must fit in 64 bits.
Firstly, instead of doing a natural log and dividing by a log to base 2, you can just do a log to base 2 with Math.Log(n, 2).
Secondly, the log itself is the real culprit. Math.Log takes a double, and a double cannot hold UInt64.MaxValue exactly: the value is rounded up to 2^64, the log comes out as 64, and val + 1 gives a 65-element array. Wrapping the log in Math.Ceiling instead of adding one fixes that case but breaks exact powers of two (4 would get 2 bits instead of 3), so it is safer to count the bits with integer shifts and not use Math.Log at all.
Thirdly, and finally, you cannot left-shift a UInt64 by 64 bits (the shift count wraps around, so 1 << 64 is the same as 1 << 0), hence i = val - 1 in the for loop initialization.
Haven't tested this exhaustively...
public static int[] GetBinaryArray(UInt64 n)
{
    if (n == 0)
    {
        return new int[2] { 0, 0 };
    }
    // Count the significant bits with integer shifts. Math.Log goes through
    // double, which rounds values near UInt64.MaxValue up to 2^64.
    var val = 0;
    for (var t = n; t != 0; t >>= 1)
        val++;
    var arr = new int[val];
    // Walk from the highest significant bit (val - 1) down to bit 0.
    for (int i = val - 1, j = 0; i >= 0; i--, j++)
    {
        if ((n & ((UInt64)1 << i)) != 0)
            arr[j] = 1;
        else
            arr[j] = 0;
    }
    return arr;
}
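As a cross-check, the array length can also be taken from the index of the highest set bit, with no floating point involved. The sketch below assumes .NET Core 3.0 or later, where System.Numerics.BitOperations.Log2 is available, and is written as top-level statements (C# 9 or later). The name GetBits is only illustrative, and unlike the method above it returns a single-element array for zero.

```csharp
using System;
using System.Numerics;

// Returns the bits of n, most significant first, without leading zeros.
static int[] GetBits(ulong n)
{
    // Log2 returns the index of the highest set bit (0 for n == 0),
    // so the bit count is that index plus one.
    int length = BitOperations.Log2(n) + 1;
    var bits = new int[length];
    for (int i = length - 1, j = 0; i >= 0; i--, j++)
    {
        bits[j] = (int)((n >> i) & 1UL);
    }
    return bits;
}

Console.WriteLine(string.Join("", GetBits(999)));      // 1111100111
Console.WriteLine(GetBits(ulong.MaxValue - 1).Length); // 64
```

Because everything stays in integer arithmetic, the result cannot drift for values close to UInt64.MaxValue.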
Related
Problem statement:
Given an array of non-negative integers, count the number of unordered pairs of array elements, such that their bitwise AND is a power of 2.
Example:
arr = [10, 7, 2, 8, 3]
Answer: 6 (10&7, 10&2, 10&8, 10&3, 7&2, 2&3)
Constraints:
1 <= arr.Count <= 2*10^5
0 <= arr[i] <= 2^12
Here's my brute-force solution that I've come up with:
private static Dictionary<int, bool> _dictionary = new Dictionary<int, bool>();
public static long CountPairs(List<int> arr)
{
long result = 0;
for (var i = 0; i < arr.Count - 1; ++i)
{
for (var j = i + 1; j < arr.Count; ++j)
{
if (IsPowerOfTwo(arr[i] & arr[j])) ++result;
}
}
return result;
}
public static bool IsPowerOfTwo(int number)
{
if (_dictionary.TryGetValue(number, out bool value)) return value;
var result = (number != 0) && ((number & (number - 1)) == 0);
_dictionary[number] = result;
return result;
}
For small inputs this works fine, but for big inputs it is slow.
My question is: what is the optimal (or at least a more efficient) solution for the problem? Please provide a graceful solution in C#. 😊
One way to accelerate your approach is to compute the histogram of your data values before counting.
This will reduce the number of computations for long arrays because there are fewer options for value (4096) than the length of your array (200000).
Be careful when counting bins that are powers of 2 to make sure you do not overcount the number of pairs by including cases when you are comparing a number with itself.
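A minimal sketch of the histogram idea, written as top-level statements (C# 9 or later). The names CountPairsHistogram and maxValue are illustrative, and the default of 4096 is taken from the stated constraint on arr[i]. Values that are equal only contribute when the value itself is a power of two, which is the overcounting case mentioned above.

```csharp
using System;
using System.Collections.Generic;

static bool IsPowerOfTwo(int x) => x != 0 && (x & (x - 1)) == 0;

// Counts unordered pairs whose bitwise AND is a power of two,
// assuming every value lies in [0, maxValue].
static long CountPairsHistogram(IReadOnlyList<int> arr, int maxValue = 4096)
{
    // Histogram: counts[v] is how many times v occurs.
    var counts = new long[maxValue + 1];
    foreach (var v in arr) counts[v]++;

    // Only values that actually occur need to be compared.
    var present = new List<int>();
    for (var v = 0; v <= maxValue; v++)
        if (counts[v] > 0) present.Add(v);

    long result = 0;
    for (var a = 0; a < present.Count; a++)
    {
        int u = present[a];
        // Equal values: u & u == u, so only power-of-two bins qualify,
        // and a bin of size c contains c * (c - 1) / 2 pairs.
        if (IsPowerOfTwo(u)) result += counts[u] * (counts[u] - 1) / 2;

        // Distinct values: each element of one bin pairs with each element of the other.
        for (var b = a + 1; b < present.Count; b++)
        {
            int w = present[b];
            if (IsPowerOfTwo(u & w)) result += counts[u] * counts[w];
        }
    }
    return result;
}

Console.WriteLine(CountPairsHistogram(new[] { 10, 7, 2, 8, 3 })); // 6
```

The double loop runs over at most 4097 distinct values, so its cost no longer depends on the length of the input array.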
We can adapt the bit-subset dynamic programming idea to have a solution with O(2^N * N^2 + n * N) complexity, where N is the number of bits in the range, and n is the number of elements in the list. (So if the integers were restricted to [1, 4096] or 2^12, with n at 100,000, we would have on the order of 2^12 * 12^2 + 100000*12 = 1,789,824 iterations.)
The idea is that we want to count instances for which we have overlapping bit subsets, with the twist of adding a fixed set bit. Given Ai -- for simplicity, take 6 = b110 -- if we were to find all partners that AND to zero, we'd take Ai's negation,
110 -> ~110 -> 001
Now we can build a dynamic program that takes a diminishing mask, starting with the full number and diminishing the mask towards the left
001
^^^
001
^^
001
^
Each set bit on the negation of Ai represents a zero, which can be ANDed with either 1 or 0 to the same effect. Each unset bit on the negation of Ai represents a set bit in Ai, which we'd like to pair only with zeros, except for a single set bit.
We construct this set bit by examining each possibility separately. So where to count pairs that would AND with Ai to zero, we'd do something like
001 ->
001
000
we now want to enumerate
011 ->
011
010
101 ->
101
100
fixing a single bit each time.
We can achieve this by adding a dimension to the inner iteration. When the mask does have a set bit at the end, we "fix" the relevant bit by counting only the result for the previous DP cell that would have the bit set, and not the usual union of subsets that could either have that bit set or not.
Here is some JavaScript code (sorry, I do not know C#) to demonstrate with testing at the end comparing to the brute-force solution.
var debug = 0;
function bruteForce(a){
let answer = 0;
for (let i = 0; i < a.length; i++) {
for (let j = i + 1; j < a.length; j++) {
let and = a[i] & a[j];
if ((and & (and - 1)) == 0 && and != 0){
answer++;
if (debug)
console.log(a[i], a[j], a[i].toString(2), a[j].toString(2))
}
}
}
return answer;
}
function f(A, N){
const n = A.length;
const hash = {};
const dp = new Array(1 << N);
for (let i=0; i<1<<N; i++){
dp[i] = new Array(N + 1);
for (let j=0; j<N+1; j++)
dp[i][j] = new Array(N + 1).fill(0);
}
for (let i=0; i<n; i++){
if (hash.hasOwnProperty(A[i]))
hash[A[i]] = hash[A[i]] + 1;
else
hash[A[i]] = 1;
}
for (let mask=0; mask<1<<N; mask++){
// j is an index where we fix a 1
for (let j=0; j<=N; j++){
if (mask & 1){
if (j == 0)
dp[mask][j][0] = hash[mask] || 0;
else
dp[mask][j][0] = (hash[mask] || 0) + (hash[mask ^ 1] || 0);
} else {
dp[mask][j][0] = hash[mask] || 0;
}
for (let i=1; i<=N; i++){
if (mask & (1 << i)){
if (j == i)
dp[mask][j][i] = dp[mask][j][i-1];
else
dp[mask][j][i] = dp[mask][j][i-1] + dp[mask ^ (1 << i)][j][i - 1];
} else {
dp[mask][j][i] = dp[mask][j][i-1];
}
}
}
}
let answer = 0;
for (let i=0; i<n; i++){
for (let j=0; j<N; j++)
if (A[i] & (1 << j))
answer += dp[((1 << N) - 1) ^ A[i] | (1 << j)][j][N];
}
for (let i=0; i<N + 1; i++)
if (hash[1 << i])
answer = answer - hash[1 << i];
return answer / 2;
}
var As = [
[10, 7, 2, 8, 3] // 6
];
for (let A of As){
console.log(JSON.stringify(A));
console.log(`DP, brute force: ${ f(A, 4) }, ${ bruteForce(A) }`);
console.log('');
}
var numTests = 1000;
for (let i=0; i<numTests; i++){
const N = 6;
const A = [];
const n = 10;
for (let j=0; j<n; j++){
const num = Math.floor(Math.random() * (1 << N));
A.push(num);
}
const fA = f(A, N);
const brute = bruteForce(A);
if (fA != brute){
console.log('Mismatch:');
console.log(A);
console.log(fA, brute);
console.log('');
}
}
console.log("Done testing.");
int[] numbers = new[] { 10, 7, 2, 8, 3 };
static bool IsPowerOfTwo(int n) => (n != 0) && ((n & (n - 1)) == 0);
long result = numbers.AsParallel()
.Select((a, i) => numbers
.Skip(i + 1)
.Select(b => a & b)
.Count(IsPowerOfTwo))
.Sum();
If I understand the problem correctly, this should work and should be faster.
First, for each number in the array we grab all elements in the array after it to get a collection of numbers to pair with.
Then we transform each pair with a bitwise AND and count the results that satisfy our IsPowerOfTwo predicate.
Finally we simply get the sum of all the counts - our output from this case is 6.
I think this should be more performant than your dictionary based solution - it avoids having to perform a lookup each time you wish to check power of 2.
I think also given the numerical constraints of your inputs it is fine to use int data types.
What is an effective (fast) way to get the last set bit in a BitArray? (LINQ or a simple backward for loop isn't very fast for large bitmaps, and I need it to be fast.)
The algorithm I have in mind: walk backwards through the BitArray's internal int array and use some compiler intrinsic like C++ _BitScanReverse (I don't know the C# analog).
The "normal" solution:
static long FindLastSetBit(BitArray array)
{
for (int i = array.Length - 1; i >= 0; i--)
{
if (array[i])
{
return i;
}
}
return -1;
}
The reflection solution (note - relies on implementation of BitArray):
static long FindLastSetBitReflection(BitArray array)
{
int[] intArray = (int[])array.GetType().GetField("m_array", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic).GetValue(array);
for (var i = intArray.Length - 1; i >= 0; i--)
{
var b = intArray[i];
if (b != 0)
{
var pos = (i << 5) + 31;
for (int bit = 31; bit >= 0; bit--)
{
if ((b & (1 << bit)) != 0)
return pos;
pos--;
}
return pos;
}
}
return -1;
}
The reflection solution is 50-100x faster for me on large BitArrays (on very small ones the overhead of reflection will start to appear). It takes about 0.2 ms per megabyte on my machine.
The main thing is that if (b != 0) checks 32 bits at once. The inner loop which checks specific bits only runs once, when the correct word is found.
Edited: unsafe code removed because I realized almost nothing is gained by it, it only avoids the array boundary check and as the code is so fast already it doesn't matter that much. For the record, unsafe solution (~30% faster for me):
static unsafe long FindLastSetBitUnsafe(BitArray array)
{
int[] intArray = (int[])array.GetType().GetField("m_array", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic).GetValue(array);
fixed (int* buffer = intArray)
{
for (var i = intArray.Length - 1; i >= 0; i--)
{
var b = buffer[i];
if (b != 0)
{
var pos = (i << 5) + 31;
for (int bit = 31; bit >= 0; bit--)
{
if ((b & (1 << bit)) != 0)
return pos;
pos--;
}
return pos;
}
}
}
return -1;
}
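If reflection on a private field is not acceptable, a similar word-at-a-time scan can be sketched with public APIs only. This is an alternative to the reflection approach, not the code measured above: BitArray.CopyTo can fill an int[], and System.Numerics.BitOperations.Log2 (assumed available, .NET Core 3.0 or later) plays the role of _BitScanReverse. The name FindLastSetBitCopy is illustrative, and the copy costs an extra allocation the size of the bitmap.

```csharp
using System;
using System.Collections;
using System.Numerics;

// Returns the index of the highest set bit, or -1 if no bit is set.
static int FindLastSetBitCopy(BitArray array)
{
    // CopyTo accepts an int[] target and fills it 32 bits per element.
    var words = new int[(array.Length + 31) / 32];
    array.CopyTo(words, 0);

    // Defensive: clear any bits beyond Length in the last word.
    int extra = array.Length % 32;
    if (extra != 0) words[words.Length - 1] &= (1 << extra) - 1;

    for (var i = words.Length - 1; i >= 0; i--)
    {
        if (words[i] != 0)
        {
            // Log2 gives the position of the highest set bit within the word.
            return i * 32 + BitOperations.Log2((uint)words[i]);
        }
    }
    return -1;
}

var bitmap = new BitArray(100_000);
bitmap[70_001] = true;
Console.WriteLine(FindLastSetBitCopy(bitmap)); // 70001
```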
If you want the index of that last set bit you can do this in C# 6. BitArray is a non-generic collection, so it needs Cast<bool>() before LINQ can be used.
int? index = array.Cast<bool>()
    .Select((b, i) => new { Index = i, Value = b })
    .LastOrDefault(x => x.Value)
    ?.Index;
Otherwise you have to do something like this
var last = array.Cast<bool>()
    .Select((b, i) => new { Index = i, Value = b })
    .LastOrDefault(x => x.Value);
int? index = last == null ? (int?)null : last.Index;
Either way the index will be null if all the bits are zero.
I don't believe there is anything that can be done other than iterating from the last bit to the first and asking for each one whether it is set. It could be done with something like:
BitArray bits = ...;
int lastSet = Enumerable.Range(1, bits.Length)
.Select(i => bits.Length - i)
.Where(i => bits[i])
.DefaultIfEmpty(-1)
.First();
That should return the last bit set, or -1 if none is. Haven't tested it myself, so it may need some adjustment.
Hope it helps.
I have a program which reads bytes from the network. Sometimes, those bytes are string representations of integer in decimal or hexadecimal form.
Normally, I parse this with something like
var s=Encoding.ASCII.GetString(p.GetBuffer(),0,(int)p.Length);
int.TryParse(s, out number);
I feel that this is wasteful, as it has to allocate memory to the string without any need for it.
Is there a better way I can do it in C#?
UPDATE
I've seen several suggestions to use BitConverter class. This is not what I need. BitConverter will transform binary representation of int (4 bytes) into int type, but since the int is in ascii form, this doesn't apply here.
I doubt it will have a substantial impact on performance or memory consumption, but you can do this relatively easily. One implementation for converting decimal numbers is shown below:
private static int IntFromDecimalAscii(byte[] bytes)
{
int result = 0;
// For each digit, add the digit's value times 10^n, where n is the
// column number counting from right to left starting at 0.
for(int i = 0; i < bytes.Length; ++i)
{
// ASCII digits are in the range 48 <= n <= 57. This code only
// makes sense if we are dealing exclusively with digits, so
// throw if we encounter a non-digit character
if(bytes[i] < 48 || bytes[i] > 57)
{
throw new ArgumentException("Non-digit character present", "bytes");
}
// The bytes are in order from most to least significant, so
// we need to reverse the index to get the right column number
int exp = bytes.Length - i - 1;
// Digits in ASCII start with 0 at 48, and move sequentially
// to 9 at 57, so we can simply subtract 48 from a valid digit
// to get its numeric value
int digitValue = bytes[i] - 48;
// Finally, add the digit value times the column value to the
// result accumulator
result += digitValue * (int)Math.Pow(10, exp);
}
return result;
}
This can easily be adapted to convert hex values as well:
private static int IntFromHexAscii(byte[] bytes)
{
int result = 0;
for(int i = 0; i < bytes.Length; ++i)
{
// ASCII hex digits are a bit more complex than decimal: '0'-'9' is
// 48-57 and 'A'-'F' is 65-70 (lowercase digits are not handled).
if(bytes[i] < 48 || bytes[i] > 70 || (bytes[i] > 57 && bytes[i] < 65))
{
throw new ArgumentException("Non-digit character present", "bytes");
}
int exp = bytes.Length - i - 1;
// Assume decimal first, then fix it if it's actually hex.
int digitValue = bytes[i] - 48;
// This is safe because we already excluded all non-digit
// characters above
if(bytes[i] > 57) // A-F
{
digitValue = bytes[i] - 55;
}
// For hex, we use 16^n instead of 10^n
result += digitValue * (int)Math.Pow(16, exp);
}
return result;
}
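Two variations on the same idea, sketched as top-level statements (C# 9 or later). The first accumulates digits left to right (result = result * 10 + digit), which avoids Math.Pow and makes an overflow check easy; the name TryParseDecimalAscii is illustrative. The second uses System.Buffers.Text.Utf8Parser, which parses directly from bytes without creating a string, assuming a runtime that ships it (.NET Core 2.1 or later).

```csharp
using System;
using System.Buffers.Text;
using System.Text;

// Parses unsigned decimal ASCII digits in a single pass.
static bool TryParseDecimalAscii(ReadOnlySpan<byte> bytes, out int value)
{
    value = 0;
    if (bytes.IsEmpty) return false;
    long acc = 0;
    foreach (byte b in bytes)
    {
        if (b < (byte)'0' || b > (byte)'9') return false; // non-digit
        acc = acc * 10 + (b - '0');
        if (acc > int.MaxValue) return false;             // would overflow an int
    }
    value = (int)acc;
    return true;
}

byte[] decimalBytes = Encoding.ASCII.GetBytes("513");
Console.WriteLine(TryParseDecimalAscii(decimalBytes, out int parsed) ? parsed : -1); // 513

// The framework parser reads the bytes directly; 'X' selects hexadecimal.
byte[] hexBytes = Encoding.ASCII.GetBytes("1F");
Console.WriteLine(Utf8Parser.TryParse(hexBytes, out int hex, out _, 'X') ? hex : -1); // 31
```

Returning false instead of throwing mirrors int.TryParse, which suits input arriving from the network.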
Well, you could be a little less wasteful (at least in the number of source code characters sense) by avoiding the s declaration like:
int.TryParse(Encoding.ASCII.GetString(p.GetBuffer(),0,(int)p.Length), out number);
But, I think the only other real way to get a speed-up would be to do as the commenter suggests and hard code a mapping into a Dictionary or something. This could save some time if you have to do this a lot, but it may not be worth the effort...
I am getting a number such as 513. I need to convert this number to a bitmask32 then I need to count where each 1 bit is in the array
For Example
513 = 0 and 9
How would I go about converting the number to a bit32 then reading the values?
Right now I am just converting the number to a string binary value:
string bit = Convert.ToString(513, 2);
Would there be a more effective way to do this? How would I convert the value to a bit array?
Thanks
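var val = 513;
// Test each of the 32 bit positions; a fixed bound avoids an endless
// loop when 1 << pos wraps around for large values.
for (var pos = 0; pos < 32; pos++)
{
    if ((val & (1 << pos)) != 0)
    {
        Console.WriteLine(pos);
    }
}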
The BitVector32 class is a utility class that can help you out with this, if you really want to keep a bit map.
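A small sketch of how that could look, written as top-level statements (C# 9 or later). The name SetBitPositions is illustrative. Note that the BitVector32 indexer expects a bit mask, not a bit index, so each position is turned into a mask with a shift.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;

// Lists the positions of the set bits, lowest first.
static List<int> SetBitPositions(int value)
{
    var vector = new BitVector32(value);
    var positions = new List<int>();
    for (var pos = 0; pos < 32; pos++)
    {
        // The indexer takes a mask and reports whether all of its bits are set.
        if (vector[1 << pos]) positions.Add(pos);
    }
    return positions;
}

Console.WriteLine(string.Join(", ", SetBitPositions(513))); // 0, 9
```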
using System.Collections;
int originalInt = 7;
byte[] bytes = BitConverter.GetBytes(originalInt);
BitArray bits = new BitArray(bytes);
int ndx = 9; //or whatever ndx you actually care about
if (bits[ndx] == true)
{
Console.WriteLine("Bit at index {0} is on!", ndx);
}
To test bit #i in number n:
if ((n & (1 << i)) != 0)
I want to convert an int to a byte[2] array using BCD.
The int in question will come from DateTime representing the Year and must be converted to two bytes.
Is there any pre-made function that does this or can you give me a simple way of doing this?
example:
int year = 2010
would output:
byte[2]{0x20, 0x10};
static byte[] Year2Bcd(int year) {
if (year < 0 || year > 9999) throw new ArgumentException();
int bcd = 0;
for (int digit = 0; digit < 4; ++digit) {
int nibble = year % 10;
bcd |= nibble << (digit * 4);
year /= 10;
}
return new byte[] { (byte)((bcd >> 8) & 0xff), (byte)(bcd & 0xff) };
}
Beware that you asked for a big-endian result, that's a bit unusual.
Use this method.
public static byte[] ToBcd(int value){
if(value<0 || value>99999999)
throw new ArgumentOutOfRangeException("value");
byte[] ret=new byte[4];
for(int i=0;i<4;i++){
ret[i]=(byte)(value%10);
value/=10;
ret[i]|=(byte)((value%10)<<4);
value/=10;
}
return ret;
}
This is essentially how it works.
If the value is less than 0 or greater than 99999999, the value won't fit in four bytes. More formally, if the value is less than 0 or is 10^(n*2) or greater, where n is the number of bytes, the value won't fit in n bytes.
For each byte:
Set that byte to the remainder of the value divided by 10. (This will place the last digit in the low nibble [half-byte] of the current byte.)
Divide the value by 10.
Add 16 times the remainder of the value-divided-by-10 to the byte. (This will place the now-last digit in the high nibble of the current byte.)
Divide the value by 10.
(One optimization is to set every byte to 0 beforehand -- which is implicitly done by .NET when it allocates a new array -- and to stop iterating when the value reaches 0. This latter optimization is not done in the code above, for simplicity. Also, if available, some compilers or assemblers offer a divide/remainder routine that allows retrieving the quotient and remainder in one division step, an optimization which is not usually necessary though.)
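The method above returns four bytes with the least significant digits first, so 2010 comes back as 0x10, 0x20, 0x00, 0x00. A variant that fills the array from the end and applies both optimizations from the note (stop when the value reaches 0, one Math.DivRem call per digit) could be sketched as follows. The name ToBcdBigEndian and the byteCount parameter are illustrative, not part of the answer above; the code is written as top-level statements (C# 9 or later).

```csharp
using System;

// Packs value into byteCount BCD bytes, most significant digits first.
static byte[] ToBcdBigEndian(int value, int byteCount)
{
    if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
    if (byteCount < 1) throw new ArgumentOutOfRangeException(nameof(byteCount));

    var result = new byte[byteCount]; // zero-initialised by the runtime
    // Fill from the last byte backwards and stop once no digits remain.
    for (var i = byteCount - 1; i >= 0 && value > 0; i--)
    {
        value = Math.DivRem(value, 10, out int low);  // digit for the low nibble
        value = Math.DivRem(value, 10, out int high); // digit for the high nibble
        result[i] = (byte)(high << 4 | low);
    }
    // Digits left over mean the value did not fit in byteCount bytes.
    if (value != 0) throw new ArgumentOutOfRangeException(nameof(value));
    return result;
}

Console.WriteLine(BitConverter.ToString(ToBcdBigEndian(2010, 2))); // 20-10
```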
Here's a terrible brute-force version. I'm sure there's a better way than this, but it ought to work anyway.
int digitOne = year / 1000;
int digitTwo = (year - digitOne * 1000) / 100;
int digitThree = (year - digitOne * 1000 - digitTwo * 100) / 10;
int digitFour = year - digitOne * 1000 - digitTwo * 100 - digitThree * 10;
byte[] bcdYear = new byte[] { (byte)(digitOne << 4 | digitTwo), (byte)(digitThree << 4 | digitFour) };
The sad part about it is that fast binary to BCD conversions are built into the x86 microprocessor architecture, if you could get at them!
Here is a slightly cleaner version than Jeffrey's
static byte[] IntToBCD(int input)
{
if (input > 9999 || input < 0)
throw new ArgumentOutOfRangeException("input");
int thousands = input / 1000;
int hundreds = (input -= thousands * 1000) / 100;
int tens = (input -= hundreds * 100) / 10;
int ones = (input -= tens * 10);
byte[] bcd = new byte[] {
(byte)(thousands << 4 | hundreds),
(byte)(tens << 4 | ones)
};
return bcd;
}
maybe a simple parse function containing this loop
i=0;
while (id>0)
{
twodigits=id%100; //need 2 digits per byte
arr[i]=twodigits%10 + twodigits/10*16; //first digit on first 4 bits second digit shifted with 4 bits
id/=100;
i++;
}
A more general solution
private IEnumerable<Byte> GetBytes(Decimal value)
{
Byte currentByte = 0;
Boolean odd = true;
while (value > 0)
{
if (odd)
currentByte = 0;
Decimal rest = value % 10;
value = (value-rest)/10;
currentByte |= (Byte)(odd ? (Byte)rest : (Byte)((Byte)rest << 4));
if(!odd)
yield return currentByte;
odd = !odd;
}
if(!odd)
yield return currentByte;
}
Same version as Peter O. but in VB.NET
Public Shared Function ToBcd(ByVal pValue As Integer) As Byte()
If pValue < 0 OrElse pValue > 99999999 Then Throw New ArgumentOutOfRangeException("value")
Dim ret As Byte() = New Byte(3) {} 'All bytes are init with 0's
For i As Integer = 0 To 3
ret(i) = CByte(pValue Mod 10)
pValue = Math.Floor(pValue / 10.0)
ret(i) = ret(i) Or CByte((pValue Mod 10) << 4)
pValue = Math.Floor(pValue / 10.0)
If pValue = 0 Then Exit For
Next
Return ret
End Function
The trick here is to be aware that simply using pValue /= 10 will round the value, so if for instance the argument is 16, the low nibble of the byte will be correct, but the result of the division will be 2 (as 1.6 will be rounded up). Therefore I use the Math.Floor method. The integer division operator (pValue \= 10) avoids the rounding as well.
I made a generic routine posted at IntToByteArray that you could use like:
var yearInBytes = ConvertBigIntToBcd(2010, 2);
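static byte[] IntToBCD(int input) {
    // Splits a 16-bit value into its high and low bytes. The result is BCD
    // only when input is already BCD-encoded (e.g. 0x2010); the decimal
    // value 2010 (0x07DA) would come out as { 0x07, 0xDA }.
    byte[] bcd = new byte[] {
        (byte)(input >> 8),
        (byte)(input & 0x00FF)
    };
    return bcd;
}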