How can I save space when storing boolean data in C#

I need to store boolean data in Windows Azure. I want to make the space these take up as small as possible. What I have is about fifteen fields with values that are true or false.
field_1 = true;
field_2 = true;
field_a = false;
field_xx = true;
I had an idea that I could take these, convert the true and false to 1s and 0s and then store them as a string, something like 1101. Is there a simple way to do this encoding and then decode when getting the data out? Note that the field names are all different, so I can't use a fancy for loop to go through the field names.

int bits = (field_1 ? 1 : 0) | (field_2 ? 2 : 0) | (field_3 ? 4 : 0) | (field_4 ? 8 : 0) | ...
field_1 = (bits & 1) != 0;
field_2 = (bits & 2) != 0;
field_3 = (bits & 4) != 0;
field_4 = (bits & 8) != 0;
...
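The "1101" string from the question can be produced from the packed integer above. This is only a sketch: the class name BitString and the width parameter are made up for illustration, and it assumes the packed value is non-negative.

```csharp
using System;

public static class BitString
{
    // Render the low `width` bits as '0'/'1' characters, most significant bit first.
    // The rightmost character therefore corresponds to the flag stored in bit 0.
    public static string Encode(int bits, int width)
    {
        return Convert.ToString(bits, 2).PadLeft(width, '0');
    }

    // Parse a string of '0'/'1' characters back into the packed integer.
    public static int Decode(string text)
    {
        return Convert.ToInt32(text, 2);
    }
}
```

BitString.Encode(13, 4) gives "1101" and BitString.Decode("1101") gives 13. Storing the integer itself is still smaller than storing the string.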

I don't think you can even imagine how skeptical I am that this will help in any way, shape or form. 15 booleans is literally nothing.
Now, if you insist on going down this path, the best way would be to store them as a single int and use & to read them out and | to write them back in.

You can use a BitArray to pack the booleans into an int:
BitArray b = new BitArray(new bool[] { field_1, field_2, ..., field_xy });
int[] buffer = new int[1];
b.CopyTo(buffer, 0);
int data = buffer[0];
You can use a byte or int array. A byte can hold up to 8 booleans, an int up to 32. To hold up to 16 booleans you could use a byte array with two bytes, or a single int, depending on whether the overhead of an array or the unused bits in the int take up more space. You could also use the BitConverter class to convert a two byte array into a short.
To get the booleans back you create a BitArray from an array of byte or int:
BitArray b = new BitArray(new int[] { data });
field_1 = b[0];
field_2 = b[1];
...
field_xy = b[14];

Consider an enumeration with the [Flags] attribute
[Flags]
public enum Values
{
Field1 = 1,
Field2 = 2,
Field3 = 4,
Field4 = 8
}
Values obj = Values.Field1 | Values.Field2;
obj.HasFlag(Values.Field1); //true
obj.HasFlag(Values.Field3); //false
int storage = (int)obj;// 3
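As a rough sketch of how such an enum might be updated and stored, the helper below turns a single flag on or off. The None member, the FlagsDemo class and the With method are assumptions added for illustration, not part of the answer.

```csharp
using System;

[Flags]
public enum Values
{
    None = 0,   // assumed "nothing set" member, handy as a starting point
    Field1 = 1,
    Field2 = 2,
    Field3 = 4,
    Field4 = 8
}

public static class FlagsDemo
{
    // Return a copy of `current` with `flag` switched on or off.
    public static Values With(Values current, Values flag, bool on)
    {
        return on ? (current | flag) : (current & ~flag);
    }
}
```

Starting from Values.None and switching on Field1 and Field3 yields a value whose integer form is 5. Casting that integer back with (Values)5 restores the same set of flags.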

Don't bother. You're using boolean values, which are already about as small (for an individual value) as you can get (1 byte I believe). And the small amount of space that you might be able to save would not be worth the added complexity of your code, plus the time it would take you to develop it.
A few more thoughts: think how you'd use such a construct. Currently, if you look at field_1 and see a value of true, you don't have to look further into the implementation to figure out the actual value. However, let's say you had the following string: "100101011011010" (or an integer value of 19162, which would be more efficient). Is field_1 true, or is it false? It's not inherently obvious -- you need to go find the implementation. And what happens when you need to support more fields somewhere down the line? You'll save yourself a lot of heartache by sticking with what you've got.

Storing these as characters will take either 8 or 16 bits per value. I'd pack them into an array of the longest unsigned integer available, using bit-shifting operations.
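A minimal sketch of that idea, assuming ulong (64 bits) is the longest unsigned integer and that the flags arrive as a bool[]. The BitPacking class and its method names are hypothetical.

```csharp
using System;

public static class BitPacking
{
    // Pack flags into 64-bit words: flag i lands in word i / 64, bit i % 64.
    public static ulong[] Pack(bool[] flags)
    {
        var words = new ulong[(flags.Length + 63) / 64];
        for (int i = 0; i < flags.Length; i++)
        {
            if (flags[i]) words[i / 64] |= 1UL << (i % 64);
        }
        return words;
    }

    // Read `count` flags back out of the packed words.
    public static bool[] Unpack(ulong[] words, int count)
    {
        var flags = new bool[count];
        for (int i = 0; i < count; i++)
        {
            flags[i] = (words[i / 64] & (1UL << (i % 64))) != 0;
        }
        return flags;
    }
}
```

The flag count has to be stored or known separately, because the packed words do not record how many of their bits are in use.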

There is a great post on this at:
http://blog.millermedeiros.com/2010/12/using-integers-to-store-multiple-boolean-values/

You could do this with an int and using xor http://www.dotnetperls.com/xor
I saw a project that did this about 15 years ago. But it ended up with a limitation of 32 roles in the system (it used a 32 bit number). That product does not exist today :)
So do not do it; store the values in an array or separate fields.

You could use an enum with the [Flags] attribute. You say you have 15 fields, so you could try something like
[Flags]
enum Fieldvals
{
Field1 = 1, Field2 = 2, ....
}
Take a look at http://msdn.microsoft.com/en-us/library/system.flagsattribute.aspx for the guidelines; each value must be a distinct power of two so that combinations stay unique.

Check out the BitArray class, it should do exactly what you need.
Example:
BitArray bits = new BitArray
(
new bool[]
{
false, true, false, false, true,
false, true, false, true, false,
false, true, false, true, false
}
);
short values = 0;
for( int index = 0; index < bits.Length; index++ )
{
if( bits[ index ] )
values |= ( short )( 1 << index );
}
Console.WriteLine( Convert.ToString( values, 2 ) );
You now have 15 bool variables stored in a single 16 bit field.

You can store your flags in an integer value. Here are some helper methods to accomplish that:
// Sets the given bit position in the UInt32 to the specified boolean value
public static UInt32 Set(UInt32 u, int position, bool newBitValue)
{
UInt32 mask = (UInt32)(1 << position);
if (newBitValue)
return (u | mask);
else
return (u & ~mask);
}
// Gets a bit value from the supplied UInt32
public static bool Get(UInt32 u, int position)
{
return ((u & (UInt32)(1 << position)) != 0);
}

Related

Attempt at setting a bit on with a bitmask [duplicate]

I need to mask certain string values read from a database by setting a specific bit in an int value for each possible database value. For example, if the database returns the string "value1" then the bit in position 0 will need to be set to 1, but if the database returns "value2" then the bit in position 1 will need to be set to 1 instead.
How can I ensure each bit of an int is set to 0 originally and then turn on just the specified bit?
If you have an int value "intValue" and you want to set a specific bit at position "bitPosition", do something like:
intValue = intValue | (1 << bitPosition);
or shorter:
intValue |= 1 << bitPosition;
If you want to reset a bit (i.e, set it to zero), you can do this:
intValue &= ~(1 << bitPosition);
(The operator ~ reverses each bit in a value, thus ~(1 << bitPosition) will result in an int where every bit is 1 except the bit at the given bitPosition.)
To set everything to 0, AND the value with 0x00000000:
int startValue = initialValue & 0x00000000;
//Or much easier :)
int startValue = 0;
To then set the bit, you have to determine what number has just that bit set and OR it. For example, to set the last bit:
int finalValue = startValue | 0x00000001;
As @Magus points out, to unset a bit you do the exact opposite:
int finalValue = startValue & unchecked((int)0xFFFFFFFE);
//Or
int finalValue = startValue & ~(0x00000001);
The ~ operator is bitwise NOT, which flips every bit.
so, this?
enum ConditionEnum
{
Aaa = 0,
Bbb = 1,
Ccc = 2,
}
static void SetBit(ref int bitFlag, ConditionEnum condition, bool bitValue)
{
int mask = 1 << (int)condition;
if (bitValue)
bitFlag |= mask;
else
bitFlag &= ~mask;
}
Just provide a value, bit value and position. Note that you might be able to modify this to work for other types.
public static void SetBit(ref int value, bool bitval, int bitpos)
{
if (!bitval) value&=~(1<<bitpos); else value|=1<<bitpos;
}
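Putting the pieces together for the original question, a lookup table can translate each database string into a bit position. This is only a sketch: the ValueMask class, the Build method and the "value3" entry are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ValueMask
{
    // Hypothetical mapping from database strings to bit positions.
    private static readonly Dictionary<string, int> Positions =
        new Dictionary<string, int>
        {
            { "value1", 0 },
            { "value2", 1 },
            { "value3", 2 }
        };

    // Start from 0 (every bit cleared) and switch on one bit per known value.
    public static int Build(IEnumerable<string> values)
    {
        int mask = 0;
        foreach (string value in values)
        {
            int position;
            if (Positions.TryGetValue(value, out position))
                mask |= 1 << position;
        }
        return mask;
    }
}
```

Unknown strings are silently ignored in this sketch. Throwing instead may be preferable if every database value is expected to have a bit.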

Encode bool array as ushort

Let's assume we have an array of Boolean values, some are true some are false.
I would like to generate a ushort and set its bits according to the array.
A ushort consists of 2 bytes, which makes 16 bits.
So the first bool in the array needs to set the first bit of the ushort if it's true; otherwise the bit should be 0.
This needs to be repeated for each bit in the ushort.
How would I set up a method stub which takes an array of bools as input and returns the encoded ushort? (C#)
You can make use of the BitConverter class (https://msdn.microsoft.com/en-us/library/bb384066.aspx) in order to convert from bytes to an int, and binary operations (like in this StackOverflow question: How can I convert bits to bytes?) to convert from bits to bytes
For Example:
//Bools to Bytes...
bool[] bools = ...
BitArray a = new BitArray(bools);
byte[] bytes = new byte[Math.Max(4, (a.Length + 7) / 8)]; //Round up to whole bytes; ToInt32 needs at least 4
a.CopyTo(bytes, 0);
//Bytes to ints
int newInt = BitConverter.ToInt32(bytes, 0); //Change the "32" to however many bits are in your number, like 16 for a short
This will only work for one int, so if you have multiple ints in a single bit array, you'll need to split up the array for this approach to work.
A BitArray might be more suitable for your use case: https://msdn.microsoft.com/en-us/library/system.collections.bitarray(v=vs.110).aspx
bool[] myBools = new bool[5] { true, false, true, true, false };
BitArray myBA = new BitArray(myBools);
foreach (var value in myBA)
{
if((bool)value == true)
{
}
else
{
}
}
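For the method stub the question asks about, a plain loop with shifts may be the simplest route. The sketch below assumes that "first bit" means the least significant bit; the BoolEncoder class and the Encode method are hypothetical names.

```csharp
using System;

public static class BoolEncoder
{
    // bools[0] maps to the least significant bit; at most 16 entries fit.
    public static ushort Encode(bool[] bools)
    {
        if (bools == null) throw new ArgumentNullException("bools");
        if (bools.Length > 16)
            throw new ArgumentException("At most 16 values fit in a ushort.", "bools");

        ushort result = 0;
        for (int i = 0; i < bools.Length; i++)
        {
            if (bools[i]) result |= (ushort)(1 << i);
        }
        return result;
    }
}
```

With the five-element array used above (true, false, true, true, false) this returns 13, which is binary 01101.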

Enums with positive and negative values

I need to create a large enum which will be used as bit flags. Using the standard doubling, i.e. 1, 2, 4, to ensure uniqueness of any combination is fine, except that I run out of numbers if I use an int (2 billion upper limit). I also cannot use a big int, as SQL Server has a limitation on bitwise operations and will truncate to 10 characters.
What I wanted to know is how to throw negative numbers in there as well and still ensure that all combinations remain unique. (for example some the enum values used in the ADO.NET library seem to have negative integers).
You can create an enum based on a ulong :
[Flags]
enum Foo : ulong
{
A = 1 ,
B = 2 ,
C = 4 ,
. . .
}
Store that in your database as two integers, something like this:
void Save( Foo value )
{
ulong bitfield = (ulong) value ;
int hiNibble = (int)( (bitfield>>32) & 0x00000000FFFFFFFF ) ;
int loNibble = (int)( (bitfield>>0) & 0x00000000FFFFFFFF ) ;
// store the hi and lo nibbles as two integer columns in your database
}
In your database, create the table as something like
create table dbo.some_table
(
hiNibble int ,
loNibble int ,
bitField as convert(bigint, convert(varbinary,hiNibble) + convert(varbinary,loNibble) )
)
Now you have two 32-bit integers you can bit twiddle in SQL and you've got a 64-bit integer you can pass back to your C# code and rehydrate as the ulong-based enum it represents.
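On the C# side, splitting and rejoining the 64-bit value might look like the sketch below. The FooStorage class and its method names are assumptions. A negative int is simply a 32-bit pattern with its top bit set, so every combination of flags remains unique.

```csharp
public static class FooStorage
{
    // Split a 64-bit flag set into two 32-bit halves for two int columns.
    public static void Split(ulong bitfield, out int hi, out int lo)
    {
        hi = unchecked((int)(bitfield >> 32));
        lo = unchecked((int)(bitfield & 0xFFFFFFFF));
    }

    // Rejoin the halves. Going through uint keeps a negative int
    // from sign-extending into the upper 32 bits.
    public static ulong Combine(int hi, int lo)
    {
        return ((ulong)unchecked((uint)hi) << 32) | unchecked((uint)lo);
    }
}
```

The result of Combine can then be cast back to the ulong-based enum, for example (Foo)FooStorage.Combine(hi, lo).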

Setting all low order bits to 0 until two 1s remain (for a number stored as a byte array)

I need to set all the low order bits of a given BigInteger to 0 until only two 1 bits are left. In other words leave the highest and second-highest bits set while unsetting all others.
The number could be any combination of bits. It may even be all 1s or all 0s. Example:
MSB 0000 0000
1101 1010
0010 0111
...
...
...
LSB 0100 1010
We can easily take out corner cases such as 0, 1, PowerOf2, etc. Not sure how to apply popular bit manipulation algorithms on an array of bytes representing one number.
I have already looked at bithacks but have the following constraints. The BigInteger structure only exposes underlying data through the ToByteArray method which itself is expensive and unnecessary. Since there is no way around this, I don't want to slow things down further by implementing a bit counting algorithm optimized for 32/64 bit integers (which most are).
In short, I have a byte [] representing an arbitrarily large number. Speed is the key factor here.
NOTE: In case it helps, the numbers I am dealing with have around 5,000,000 bits. They keep on decreasing with each iteration of the algorithm so I could probably switch techniques as the magnitude of the number decreases.
Why I need to do this: I am working with a 2D graph and am particularly interested in coordinates whose x and y values are powers of 2. So (x+y) will always have two bits set and (x-y) will always have consecutive bits set. Given an arbitrary coordinate (x, y), I need to transform an intersection by getting values with all bits unset except the first two MSB.
Try the following (not sure if it's actually valid C#, but it should be close enough):
// find the next non-zero byte at or below index i (I'm assuming little endian) or return -1
int find_next_byte(byte[] data, int i) {
while (i >= 0 && data[i] == 0) --i;
return i;
}
// find a bit mask of the next non-zero bit or return 0
int find_next_bit(int value, int b) {
while (b > 0 && ((value & b) == 0)) b >>= 1;
return b;
}
byte[] data;
int i = find_next_byte(data, data.Length - 1);
// find the first 1 bit
int b = find_next_bit(data[i], 1 << 7);
// try to find the second 1 bit
b = find_next_bit(data[i], b >> 1);
if (b > 0) {
// found 2 bits, removing the rest
if (b > 1) data[i] &= (byte)~(b - 1);
} else {
// we only found 1 bit, find the next non-zero byte
i = find_next_byte(data, i - 1);
if (i >= 0) {
b = find_next_bit(data[i], 1 << 7);
if (b > 1) data[i] &= (byte)~(b - 1);
}
}
// remove the rest (a memcpy would be even better here,
// but that would probably require unmanaged code)
for (--i; i >= 0; --i) data[i] = 0;
Untested.
Probably this would be a bit more performant if compiled as unmanaged code or even with a C or C++ compiler.
As harold noted correctly, if you have no a priori knowledge about your number, this O(n) method is the best you can do. If you can, you should keep the position of the highest two non-zero bytes, which would drastically reduce the time needed to perform your transformation.
I'm not sure if this is getting optimised out or not but this code appears to be 16x faster than ToByteArray. It also avoids the memory copy and it means you get to the results as uint instead of byte so you should have further improvements there.
//create delegate to get private _bits field
var par = Expression.Parameter(typeof(BigInteger));
var bits = Expression.Field(par, "_bits");
var lambda = Expression.Lambda(bits, par);
var func = (Func<BigInteger, uint[]>)lambda.Compile();
//test call our delegate
var bigint = BigInteger.Parse("3498574578238348969856895698745697868975687978");
int time = Environment.TickCount;
for (int y = 0; y < 10000000; y++)
{
var x = func(bigint);
}
Console.WriteLine(Environment.TickCount - time);
//compare time to ToByteArray
time = Environment.TickCount;
for (int y = 0; y < 10000000; y++)
{
var x = bigint.ToByteArray();
}
Console.WriteLine(Environment.TickCount - time);
From there finding the top 2 bits should be pretty easy. The first bit will be in the first int I presume, then it is just a matter of searching for the second topmost bit. If it is in the same integer then just set the first bit to zero and find the topmost bit, otherwise search for the next non-zero int and find the topmost bit.
EDIT: to make things simple just copy/paste this class into your project. This creates extension methods that means you can just call mybigint.GetUnderlyingBitsArray(). I added a method to get the Sign also and, to make it more generic, have created a function that will allow accessing any private field of any object. I found this to be slower than my original code in debug mode but the same speed in release mode. I would advise performance testing this yourself.
static class BigIntegerEx
{
private static Func<BigInteger, uint[]> getUnderlyingBitsArray;
private static Func<BigInteger, int> getUnderlyingSign;
static BigIntegerEx()
{
getUnderlyingBitsArray = CompileFuncToGetPrivateField<BigInteger, uint[]>("_bits");
getUnderlyingSign = CompileFuncToGetPrivateField<BigInteger, int>("_sign");
}
private static Func<TObject, TField> CompileFuncToGetPrivateField<TObject, TField>(string fieldName)
{
var par = Expression.Parameter(typeof(TObject));
var field = Expression.Field(par, fieldName);
var lambda = Expression.Lambda(field, par);
return (Func<TObject, TField>)lambda.Compile();
}
public static uint[] GetUnderlyingBitsArray(this BigInteger source)
{
return getUnderlyingBitsArray(source);
}
public static int GetUnderlyingSign(this BigInteger source)
{
return getUnderlyingSign(source);
}
}
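The search for the top two bits described above could be sketched as follows once the uint[] is in hand. This assumes words[0] is the least significant word, which should be checked against the actual layout of the private array. The TopTwoBits class and the KeepTopTwo method are hypothetical names.

```csharp
public static class TopTwoBits
{
    // Keeps the two highest set bits of a magnitude stored as 32-bit words
    // and clears every other bit. Assumes words[0] is the least significant word.
    public static void KeepTopTwo(uint[] words)
    {
        int kept = 0;
        for (int i = words.Length - 1; i >= 0; i--)
        {
            if (kept == 2) { words[i] = 0; continue; }

            uint result = 0;
            // Scan this word from its highest bit down until two bits are kept overall.
            for (uint mask = 0x80000000; mask != 0 && kept < 2; mask >>= 1)
            {
                if ((words[i] & mask) != 0) { result |= mask; kept++; }
            }
            words[i] = result;
        }
    }
}
```

This modifies the array in place, so if the array is the BigInteger's own private storage it would be safer to work on a copy.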

Why does the BitConverter return Bytes and how can I get the bits then?

As input I get an int (well, actually a string I should convert to an int).
This int should be converted to bits.
For each bit position that has a 1, I should get the position.
In my database, I want all records that have an int value field that has this position as value.
I currently have the following naive code that should ask my entity(holding the databaseValue) if it matches the position, but obviously doesn't work correctly:
Byte[] bits = BitConverter.GetBytes(theDatabaseValue);
return bits[position].equals(1);
Firstly, I have an array of byte because there apparently is no bit type. Should I use Boolean[]?
Then, how can I fill this array?
Lastly, if previous statements are solved, I should just return bits[position]
I feel like this should somehow be solved with bitmasks, but I don't know where to start..
Any help would be appreciated
Your feeling is correct. This should be solved with bitmasks. BitConverter does not return bits (and how could it? "bits" isn't an actual data type), it converts raw bytes to CLR data types. Whenever you want to extract the bits out of something, you should think bitmasks.
If you want to check if a bit at a certain position is set, use the & operator. Bitwise & is only true if both bits are set. For example if you had two bytes 109 and 33, the result of & would be
0110 1101
& 0010 0001
-----------
0010 0001
If you just want to see if a bit is set in an int, you & it with a number that has only the bit you're checking set (ie 1, 2, 4, 8, 16, 32 and so forth) and check if the result is not zero.
List<int> BitPositions(uint input) {
List<int> result = new List<int>();
uint mask = 1;
int position = 0;
do {
if ((input & mask) != 0) {
result.Add(position);
}
mask <<= 1;
position++;
} while (mask != 0);
return result;
}
I suspect BitArray is what you're after. Alternatively, using bitmasks yourself isn't hard:
for (int i=0; i < 32; i++)
{
if ((value & (1 << i)) != 0)
{
Console.WriteLine("Bit {0} was set!", i);
}
}
Do not use Boolean. Although a Boolean has only two values, it is not stored as a single bit; it can occupy as much as 32 bits, like an int, when marshalled to unmanaged code.
EDIT: Actually, in managed code sizeof(bool) is 1, so a Boolean[] uses one byte per element, not 4 bytes.
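For the BitArray route mentioned in this answer, a sketch along these lines would list the positions of the set bits. The BitQuery class and the SetPositions method are hypothetical names. BitArray treats index 0 as the least significant bit of the int.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BitQuery
{
    // Returns the zero-based positions of the set bits, lowest first.
    public static List<int> SetPositions(int value)
    {
        // A one-element int[] gives a 32-entry BitArray; index 0 is the lowest bit.
        var bits = new BitArray(new[] { value });
        var positions = new List<int>();
        for (int i = 0; i < bits.Length; i++)
        {
            if (bits[i]) positions.Add(i);
        }
        return positions;
    }
}
```

For the value 109 (binary 0110 1101) this yields positions 0, 2, 3, 5 and 6. A single membership check is simply bits[position].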
