We are thoroughly confused - we are trying to process an octet-stream binary file. We have various possible destination structs. The incoming file is a string of x bytes - a blob - which we understand we first need to convert to a byte array. We use a FOR loop to move one byte at a time into the byte array. Then, when we know the specific struct of the data - identified by a fixed-position text field within the data - we use a deserialize routine specific to that struct. Character arrays use one deserialize function to populate string variables, integer fields populate other variables (generally UINT16s), and so on through the received data. When we know we have an int16 (2-byte integer), processing fails if the high-order bit of the low-order byte is set, i.e. the byte looks negative. We don't know whether the 8 bits handled in the FOR loop are integer, char, or something else until after the blob has been moved to the byte array using the FOR loop (roughly
for (i = 1; i <= blob_length; i++)
{ dest(i) = source(i); }
) and we have identified which struct is in play.
By the time we exit deserialize, we see the data is corrupted as follows:
so decimal 511 binary 01 11111111 converts to decimal 256 binary 01 00000000
but decimal 383 binary 01 01111111 converts correctly
We cannot tell whether the FOR loop is somehow unable to handle an 8-bit field when the high-order bit is on, or whether the actual deserialize process for the UINT16 is failing. We have struggled through other ASCII-related issues where that 8th bit corrupts processing. We are not sure whether this is yet another one, or something else entirely.
Any insight or guidance would be gratefully appreciated.
Usually the indexes are 0-based and the for-loop should look like this:
for(int i = 0; i < blob_length; i++) {
dest[i] = source[i];
}
Probably you were one byte off.
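For what it's worth, when a low-order byte of 0xFF turns into 0x00 only when its high bit is set, it helps to verify two things separately: that the byte array still holds 0xFF immediately after the copy loop, and that the UINT16 deserializer recombines the two bytes using unsigned types only (no char or signed-byte conversion in between). A minimal C# sketch of that check, with invented data and little-endian byte order assumed - not your actual deserializer:
// Hypothetical illustration only.
byte[] dest = { 0xFF, 0x01 };        // low byte first: 0x01FF == decimal 511

// 1) Confirm the copy loop preserved the high-bit byte.
Console.WriteLine(dest[0]);          // should print 255; 0 here would point at the copy/encoding step

// 2) Recombine through unsigned types only - no sign extension, no character conversion.
ushort value = (ushort)(dest[0] | (dest[1] << 8));
Console.WriteLine(value);            // 511

// The same bytes can also be read with BitConverter on a little-endian machine.
ushort viaBitConverter = BitConverter.ToUInt16(dest, 0);
Console.WriteLine(viaBitConverter);  // 511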
I am getting a string of zeros and ones from a client API request. They are of a set length (28, in this case) and I want to convert them to a byte[] or something similar, with the goal of storing these in SQL via EF Core and later using bitwise operators to compare them.
I can't seem to wrap my head around this one. I'm seeing a lot of posts/questions about converting characters to byte arrays, or byte arrays to strings, neither of which is what I need.
I need a "00111000010101010" to become a literal binary 00111000010101010 I can use a ^ on.
Leading zeros would be fine if necessary, I think the length might be forced to be a multiple of 8?
You can convert a binary string to an integer easily with this:
string source = "00111000010101010";
int number = Convert.ToInt32(source, 2); // The `2` is "base 2"
That gives: 28842.
Then you can go one step further and convert to a byte array, if needed.
byte[] bytes = BitConverter.GetBytes(number);
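If the end goal is the ^ comparison from the question, the intermediate int can be XORed directly; a small illustration with made-up sample values:
// Illustrative only: compare two bit strings of equal length via XOR.
string a = "00111000010101010";
string b = "00111000010101110";
int x = Convert.ToInt32(a, 2);
int y = Convert.ToInt32(b, 2);
int diff = x ^ y;                              // non-zero bits mark the positions that differ
Console.WriteLine(Convert.ToString(diff, 2));  // "100" - the two strings differ in exactly one bit
Since the strings are only 28 bits long they fit comfortably in an int; longer strings would need Convert.ToInt64 or a BitArray.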
I've downloaded the library https://github.com/robertvazan/crc32c.net for a project I'm working on. I need to use a CRC in part of the project, so I downloaded this library, as it is obviously going to be much faster than anything I'm going to write in the near future.
I have some understanding of how CRC works; I once made a software implementation of it (as part of learning) that worked. But I must be doing something incredibly stupid while trying to get this library to work and not realizing it. No matter what I do, I can't seem to get crc = 0 even though the arrays were not changed.
Basically, my question is, how do I actually use this library to check for integrity of a byte array?
The way I understand it, I should call Crc32CAlgorithm.Compute(array) once to compute the CRC the first time, and then call it again on an array that has the previously returned value appended (I've tried appending it, as well as setting the last 4 bytes of the array to zeroes before putting the returned value there); if the second call returns 0, the array was unchanged.
Please help me, I don't know what I'm doing wrong.
EDIT: It doesn't work right when I do this (yes, I realize LINQ is very slow, this is just an example):
using (var hash = new Crc32CAlgorithm())
{
    var array = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 };
    var crc = hash.ComputeHash(array);
    var arrayWithCrc = array.Concat(crc).ToArray();
    Console.WriteLine(string.Join(" ", hash.ComputeHash(arrayWithCrc)));
}
Console outputs: 199 75 103 72
You do not need to append a CRC to a message and compute the CRC of that in order to check a CRC. Just compute the CRC on the message on one end, send that CRC along with the message, compute CRC on just the message on the other end (not including the sent CRC), and then compare the CRC you computed to the one that was sent with the message.
They should be equal to each other. That's all there is to it. That works for any hash you might use, not just CRCs.
If you feel deeply compelled to make use of the lovely mathematical property of CRCs where computing the CRC on the message with its CRC appended gives a specific result, you can. You have to append the CRC bits in the correct order, and you need to look for the "residue" of the CRC, which may not be zero.
In your case, you are in fact appending the bits in the correct order (by appending the bytes in little-endian order), and the result you are getting is the correct residue for the CRC-32C. That residue is 0x48674bc7, which separated into bytes, in little-endian order, and then converted into decimal is your 199 75 103 72.
You will find that if you take any sequence of bytes, compute the CRC-32C of that, append that CRC to the sequence in little-endian order, and compute the CRC-32C of the sequence plus CRC, you will always get 0x48674bc7.
However, that's a smidge slower than just comparing the two CRCs, since now you have to compute a CRC on four more bytes than before. So, really, there's no need to do it this way.
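To make that concrete with the library from the question - assuming the static Crc32CAlgorithm.Compute(byte[]) overload the question already refers to; the return type is written as uint here, which may differ by version - the plain comparison looks roughly like this:
// Sketch of the simple compare-two-CRCs check with Crc32C.NET.
var message = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 };

// Sender side: compute the CRC and send it along with the message.
uint crcSent = Crc32CAlgorithm.Compute(message);

// Receiver side: compute the CRC over just the message bytes (not including the received CRC).
uint crcComputed = Crc32CAlgorithm.Compute(message);

// The message is considered intact if the two values match.
Console.WriteLine(crcComputed == crcSent);   // True when nothing was altered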
I have a byte array with 10 elements. I need to convert the entire byte array into a single hex string.
The above is fairly straightforward to do; however, I need to drop the leading 0 when an element contains a decimal value less than 16, i.e. 0xF or less. See the example below.
byte[] myByteArr = {10,11,12,13,14,15,16,17,18,19};
"Regular" method to convert the above to hex should give me 0x0A0B0C0D0E0F10111213
What I actually need is 0xABCDEF10111213
Is there a quick method to just drop the upper nibble and take only the lower nibble when none of the bits in the upper nibble are set?
Thanks in advance!
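In case a sketch helps - one possible approach (not necessarily the quickest) is to format each byte with one hex digit when its upper nibble is clear and two digits otherwise:
// Sketch: build the hex string, dropping the leading 0 of any byte whose upper nibble is zero.
byte[] myByteArr = { 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 };
var sb = new System.Text.StringBuilder("0x");
foreach (byte b in myByteArr)
{
    // "X" yields a single digit for 0-15, "X2" pads to two digits.
    sb.Append(b.ToString((b & 0xF0) == 0 ? "X" : "X2"));
}
Console.WriteLine(sb.ToString());   // 0xABCDEF10111213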
I am modifying an existing C# solution, wherein data is validated and status is stored as below:
a) A given record is validated against a certain number of conditions (say 5). Failed/passed status is represented by a bit value (0 - passed; 1 - failed).
b) So, if a record failed all 5 validations, the value will be 11111. This will be converted to a decimal and stored in a DB.
Once again, this decimal value will be converted back to binary (using the bitwise & operator), which will be used to show the passed/failed records.
The issue is that the long datatype is used in C# to handle this decimal value, and the 'decimal' datatype in SQL Server 2008 is used to store it. A long can hold only up to 64 bits, so the validation count is currently restricted to 64.
My requirement is to remove this limit to allow any number of validations.
How do I store a large number of bits and also retrieve them? Also, please keep in mind that this is an existing (.NET 2.0) solution; we can't afford to upgrade or use any 3rd-party libraries, and changes must be minimal.
Latest update
Yes, this solution seems to be OK from an application perspective, i.e. if the solution only had to deal with C#. However, the designers of the existing solution made things complicated by storing the binary value (11111 represents a record failing all 5 validations, 10111 - all but the 4th, and so on...) converted into a decimal in the SQL Server DB. A stored procedure takes this value to arrive at the number of records that failed each validation.
OPEN sValidateCUR
FETCH NEXT FROM sValidateCUR INTO @ValidationOID, @ValidationBit, @ValidationType
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Fetch the error record count
    SET @nBitVal = ABS(RPT.fGetPowerValue(@ValidationBit)) -- @ValidationBit is the number of a validation type (up to, say, 60); the first time through the loop it will be 0
    SELECT @ErrorRecordCount = COUNT(1) FROM
        <<Error_Table_where_flags_are_available_in_decimal_values>>
        WITH (NOLOCK) WHERE ExpressionValidationFlags & CAST(CAST(@nBitVal AS VARCHAR(20)) AS BIGINT) = CAST(@nBitVal AS VARCHAR(20)) -- For @ValidationBit = 3, @nBitVal = 2^3 = 8
Now, in the application, I managed to store the passed/failed records in a BitArray, converted this to a byte[] and stored it in SQL Server as VARBINARY(100)... (the same column ExpressionValidationFlags, which was earlier BIGINT, is now VARBINARY and holds the byte array). However, to complete my changes, I need to modify the SP above.
Again, looking forward to your help!
Thanks
Why not use the specially designed BitArray class?
http://msdn.microsoft.com/query/dev11.query?appId=Dev11IDEF1&l=EN-US&k=k(System.Collections.BitArray);k(TargetFrameworkMoniker-.NETFramework,Version%3Dv4.5);k(DevLang-csharp)&rd=true
e.g.
BitArray array = new BitArray(150); // <- up to 150 bits
...
array[140] = true; // <- set 140th bit
array[130] = false; // <- reset 130th bit
...
if (array[120]) { // <- if 120th bit is set
...
There are several ways to go about this, based on the limitations of the database you are using.
If you are able to store byte arrays within the database, you can use the BitArray class. You can pass the constructor a byte array, use it to easily check and set each bit by index, and then use its built-in CopyTo method to copy it back out into a byte array.
Example:
byte[] statusBytes = yourDatabase.Get("passed_failed_bits");
BitArray statusBits = new BitArray(statusBytes);
...
statusBits[65] = false;
statusBits[66] = true;
...
statusBits.CopyTo(statusBytes, 0);
yourDatabase.Set("passed_failed_bits", statusBytes);
If the database is unable to deal with raw byte arrays, you can always encode the byte array as a hex string:
string hex = BitConverter.ToString(statusBytes);
hex = hex.Replace("-", ""); // String.Replace returns a new string; it does not modify hex in place
and then get it back into a byte array again:
int numberChars = hex.Length;
byte[] statusBytes = new byte[numberChars / 2];
for (int i = 0; i < numberChars; i += 2) {
    statusBytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
}
And if you can't even store strings, there are more creative ways to turn the byte array into multiple longs or doubles.
Also, if space efficiency is an issue, there are other, more efficient (but more complicated) ways to encode bytes as ASCII text by using more of the character range without using control characters. You may also want to look into run-length encoding the byte array, if you find the data keeps the same value for long stretches.
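For instance (an illustration, not something required for the approach above), Base64 uses a larger character set than hex, packs 3 bytes into 4 characters, and round-trips cleanly:
// Illustration: Base64 is denser than hex (4 characters per 3 bytes instead of 2 per byte).
byte[] statusBytes = { 0xDE, 0xAD, 0xBE, 0xEF, 0x01 };   // sample data, not from the question
string encoded = Convert.ToBase64String(statusBytes);    // "3q2+7wE="
byte[] decoded = Convert.FromBase64String(encoded);      // round-trips to the original 5 bytes
Console.WriteLine(encoded);
Console.WriteLine(decoded.Length);                       // 5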
Hope this helps!
Why not use a string instead? You could put a very large number of characters in the database (use VARCHAR and not NVARCHAR since you control the input).
Using your example, if you had "11111", you could skip bitwise operations and just do things like this:
string myBits = "11111";
bool failedPosition0 = myBits[0] == '1';
bool failedPosition1 = myBits[1] == '1';
bool failedPosition2 = myBits[2] == '1';
bool failedPosition3 = myBits[3] == '1';
bool failedPosition4 = myBits[4] == '1';
I am working on a piece of software that analyzes E01 bitstream images. Basically these are forensic data files that allow a user to compress all the data on a disk into a single file. The E01 format embeds data about the original data, including MD5 hash of the source and resulting data, etc. If you are interested in some light reading, the EWF/E01 specification is here. Onto my problem:
The e01 file contains a "table" section which is a series of 32 bit numbers that are offsets to other locations within the e01 file where the actual data chunks are located. I have successfully parsed this data out into a list doing the following:
this.ChunkLocations = new List<int>();
// hack: Will this overflow? We are adding two integers to a long?
long currentReadLocation = TableSectionDescriptorRef.OffsetFromFileStart + c_SECTION_DESCRIPTOR_LENGTH + c_TABLE_HEADER_LENGTH;
byte[] currReadBytes;
using (var fs = new FileStream(E01File.FullName, FileMode.Open))
{
    fs.Seek(currentReadLocation, SeekOrigin.Begin);
    for (int i = 0; i < NumberOfEntries; i++)
    {
        currReadBytes = new byte[c_CHUNK_DATA_OFFSET_LENGTH];
        fs.Read(currReadBytes, 0, c_CHUNK_DATA_OFFSET_LENGTH);
        this.ChunkLocations.Add(BitConverter.ToInt32(currReadBytes, 0));
    }
}
c_CHUNK_DATA_OFFSET_LENGTH is 4 bytes, i.e. one "32 bit" number.
According to the EWF/E01 specification, "The most significant bit in the chunk data offset indicates if the chunk is compressed (1) or uncompressed (0)". This appears to be borne out by the fact that, if I convert the offsets to ints, there are large negative numbers in the results (for the compressed chunks, no doubt). Most of the other offsets appear to be correctly incremented, but every once in a while there is crazy data. The data in ChunkLocations looks something like this:
346256
379028
-2147071848
444556
477328
510100
Where with -2147071848 it appears the MSB was set to indicate compression (or the lack of it).
QUESTIONS: So, if the MSB is used to flag the presence of compression, then I'm really dealing with a 31-bit number, right?
1. How do I ignore the MSB / compute the 31-bit number when working out the offset value?
2. This seems like a strange standard, since it would significantly limit the size of the offsets you could have, so I'm wondering whether I'm missing something. These offsets do seem correct when I navigate to those locations within the e01 file.
Thanks for any help!
This sort of thing is typical when dealing with binary formats. As dtb pointed out, 31 bits is probably plenty large for this application, because it can address offsets up to 2 GiB. So they use that extra bit as a flag to save space.
You can just mask off the bit with a bitwise AND:
const UInt32 COMPRESSED = 0x80000000; // Only bit 31 on
UInt32 raw_value = 0x80004000; // test value
bool compressed = (raw_value & COMPRESSED) > 0;
UInt32 offset = raw_value & ~COMPRESSED;
Console.WriteLine("Compressed={0} Offset=0x{1:X}", compressed, offset);
Output:
Compressed=True Offset=0x4000
If you just want to strip off the leading bit, perform a bitwise AND (&) of the value with 0x7FFFFFFF.
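Applied to the signed ints shown in the question's list (an illustration using one of the values from above):
// Sketch: recover the 31-bit offset and the compression flag from one of the signed int values.
int rawValue = -2147071848;            // taken from the ChunkLocations output above
bool compressed = rawValue < 0;        // MSB set => negative when viewed as a signed int
int offset = rawValue & 0x7FFFFFFF;    // clear bit 31, keeping the 31-bit offset
Console.WriteLine("Compressed={0} Offset={1}", compressed, offset);   // Compressed=True Offset=411800
411800 falls between the neighbouring uncompressed offsets 379028 and 444556, which is consistent with the chunks being laid out sequentially.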