I have a project I'm working on where the file format stores the locations of various parts of the file in offsets. So, for example, the file will hold information about 8 different layers. There will be an offset in bytes to the data for each layer.
I'm having trouble calculating that offset, because the way it is stored is confusing to me. I have enough documentation to do it by hand, but I don't know how to do it in code.
The docs say:
A packed offset is 32 bits. The unpacked offset is also a 32 bit number to be used as a byte count. An offset is packed in memory as two words, or 4 bytes.
So, for example,
byte0 = aaaaaaaa
byte1 = bbbbbbbb
byte2 = cccccccc
byte3 = ddddeeee
The hi nibble of the low byte is appended to byte 0 and byte 2 as follows:
dddd aaaaaaaa cccccccc
Four 0 bits are added to the lo part (enforcing 16 byte chunkiness)
dddd aaaaaaaa cccccccc 0000
For completeness we specify that the high 8 bits of a 32 bit offset are 0.
The final unpacked offset looks like this:
00000000 ddddaaaa aaaacccc cccc0000
I can follow those instructions manually and come up with the correct number, but I don't know how to code that. I was copying the code of another person who was working with the same file type, and they used:
offset = (val1 << 12) + (val2 << 4) + (val3 <<4) + (val4 >> 4)
val1, val2, val3, and val4 are just the 4 individual bytes. This worked fine for smaller numbers, but as soon as they got over a certain value, it no longer worked.
Can anyone help in getting this to work in C#?
Judging by your description, it looks like you need the following:
offset = val1 << 12 | val3 << 4 | (val4 & 0xF0) << 16;
In this case, val1 means aaaaaaaa, val3 means cccccccc, and val4 means ddddeeee (only its high nibble is used). val2 (bbbbbbbb) appears to be ignored, as does the low nibble of val4.
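If it helps, here is a minimal C# sketch of that unpacking (the method name is mine; b0..b3 are the four packed bytes in file order, so b0 = aaaaaaaa, b1 = bbbbbbbb, b2 = cccccccc, b3 = ddddeeee):

static uint UnpackOffset(byte b0, byte b1, byte b2, byte b3)
{
    // Final layout: 00000000 ddddaaaa aaaacccc cccc0000
    return (uint)(((b3 & 0xF0) << 16)   // dddd     -> bits 20..23
                 | (b0 << 12)           // aaaaaaaa -> bits 12..19
                 | (b2 << 4));          // cccccccc -> bits 4..11
    // b1 (bbbbbbbb) and the low nibble of b3 (eeee) are not used.
}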
I have this block of C code that I cannot for the life of me understand. I need to calculate the CRC-16 for a certain byte array I send to the method, and it should give me the MSB (most significant byte) and the LSB (least significant byte). I was also given an app written in C to test some functionality, and that app also gives me a log of what is sent and what is received via the COM port.
What is weird is that I entered the hex string that I found in the log into an online calculator, but it gives me a different result.
I took a stab at translating the method to C#, but I don't understand certain aspects:
What is pucPTR doing there (it's not being used anywhere else)?
What do the 2 lines of code under the first for mean?
Why is the short "i" in the second for <= 7? Shouldn't it be <= 8?
Does the last line in the if statement mean that usCRC is in fact ushort 0x8005?
Here is the block of code:
unsigned short CalculateCRC(unsigned char* a_szBufuer, short a_sBufferLen)
{
    unsigned short usCRC = 0;
    for (short j = 0; j < a_sBufferLen; j++)
    {
        unsigned char* pucPtr = (unsigned char*)&usCRC;
        *(pucPtr + 1) = *(pucPtr + 1) ^ *a_szBufuer++;
        for (short i = 0; i <= 7; i++)
        {
            if (usCRC & ((short)0x8000))
            {
                usCRC = usCRC << 1;
                usCRC = usCRC ^ ((ushort)0x8005);
            }
            else
                usCRC = usCRC << 1;
        }
    }
    return (usCRC);
}
This is the hex string that I convert to byte array and send to the method:
02 00 04 a0 00 01 01 03
This is the result that should be given from the CRC calculus:
06 35
The document I have been given says that this is a CRC16 IBM (msb, lsb) of the entire data.
Can anyone please help? I've been stuck on it for a while now.
Any code guru out there capable of translating that C method to C#? Apparently I'm not capable of such sourcery.
First of all, please note that in C, the ^ operator means bitwise XOR.
What is pucPTR doing there (it's not being used anywhere else)?
What do the 2 lines of code under the first for mean?
Causing bugs, by the looks of it. It is only used to grab one of the two bytes of the FCS, but the code is written in an endianness-dependent way.
Endianness is very important when dealing with checksum algorithms, since they were originally designed for hardware shift registers, which require MSB first, aka big endian. In addition, CRC often means data communication, and data communication means possibly different endianness between the sender, the protocol and the receiver.
I would guess that this code was written for little-endian machines only and the intent is to XOR with the MS byte. The code points to the first byte, then uses +1 pointer arithmetic to get to the second byte. Corrected code should be something like:
uint8_t puc = (unsigned int)usCRC >> 8;             /* grab the MS byte of the CRC */
puc ^= *a_szBufuer;                                 /* XOR in the next data byte   */
usCRC = (usCRC & 0xFF) | ((unsigned int)puc << 8);  /* write the MS byte back      */
a_szBufuer++;
The casts to unsigned int are there to portably prevent mishaps with implicit integer promotion.
Why is the short "i" in the second for <= 7? Shouldn't it be <= 8?
I think it is correct, but more readably it could have been written as i < 8.
Does the last line in the if statement mean that usCRC is in fact ushort 0x8005?
No, it means to XOR your FCS with the polynomial 0x8005; see the CRC guide linked at the end of this answer.
The document I have been given says that this is a CRC16 IBM
Yeah it is sometimes called that. Though from what I recall, "CRC16 IBM" also involves some bit inversion of the final result(?). I'd double check that.
Overall, be careful with this code. Whoever wrote it didn't have much of a clue about endianness, integer signedness and implicit type promotions. It is amateur-level code. You should be able to find safer, portable, professional versions of the same CRC algorithm on the net.
Very good reading about the topic is A Painless Guide To CRC.
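For what it's worth, here is a small, untested C# port of the same routine that skips the pointer trick entirely; it mirrors what the original C code does on a little-endian machine (MSB-first CRC, polynomial 0x8005, initial value 0, no final XOR):

static class Crc16
{
    public static ushort Calculate(byte[] buffer)
    {
        ushort crc = 0;
        foreach (byte b in buffer)
        {
            // XOR the data byte into the high byte of the CRC
            // (this is what *(pucPtr + 1) ^= ... does on a little-endian machine).
            crc ^= (ushort)(b << 8);

            for (int i = 0; i < 8; i++)
            {
                if ((crc & 0x8000) != 0)
                    crc = (ushort)((crc << 1) ^ 0x8005);
                else
                    crc = (ushort)(crc << 1);
            }
        }
        return crc;
    }
}

// Example: Crc16.Calculate(new byte[] { 0x02, 0x00, 0x04, 0xA0, 0x00, 0x01, 0x01, 0x03 });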
What is pucPTR doing there (it's not being used anywhere else)?
pucPtr is used to treat an unsigned short as an array of 2 unsigned char (an ugly trick). Depending on the endianness of the platform, pucPtr will point to the first byte of the unsigned short and pucPtr + 1 to the second byte (or vice versa). You have to know whether this algorithm was designed for little-endian or big-endian machines.
Equivalent (and portable) code, assuming the original was developed for little-endian machines (so pucPtr + 1 is the most significant byte):
unsigned char rawCrc[2];
rawCrc[0] = (unsigned char)(usCRC & 0x00FF);
rawCrc[1] = (unsigned char)((usCRC >> 8) & 0x00FF);
rawCrc[1] = rawCrc[1] ^ *a_szBufuer++;
usCRC = (unsigned short)rawCrc[0]
      | (unsigned short)((unsigned int)rawCrc[1] << 8);
If the original code was instead written for big-endian machines, swap rawCrc[0] and rawCrc[1].
What do the 2 lines of code under the first for mean?
The first line does the ugly transformation described in 1.
The second line reads the value pointed to by a_szBufuer and increments the pointer. It XORs that value with the second (or first, depending on endianness) byte of the CRC (note that *(pucPtr + 1) is equivalent to pucPtr[1]) and stores the result back into that byte.
*(pucPtr + 1) = *(pucPtr + 1) ^ *a_szBufuer++;
is equivalent to
pucPtr[1] = pucPtr[1] ^ *a_szBufuer++;
Why is the short "i" in the second for <= 7? Shouldn't it be <= 8?
You have to do 8 iterations, from 0 to 7. You can change the condition to i = 0; i < 8 or to i = 1; i <= 8.
Does the last line in the if statement mean that usCRC is in fact ushort 0x8005?
No, it doesn't. It means that usCRC is now equal to usCRC XOR 0x8005. ^ is the bitwise XOR operation (also called exclusive or). Example:
 0b1100110
^0b1001011
----------
 0b0101101
I've had a good search and spent a few wasted hours, and I still can't do a simple bit shift in reverse :(
Dim result = VALUE >> 8 And &HFF
I have existing code that reads VALUE (a UInt16) from a file and applies that bit shift to it. What I am trying to do is the reverse, so the value can be saved and then read back using the existing code above.
I've read up on bit shifting and read a great Code Project article, but it may as well be in Latin.
UInt16 tt = 12123; //10111101011011
int aa = tt >> 8 & 0xFF; //101111 = 47
The low 8 bits have disappeared; you can never get them back.
If you have the value 54, in binary 110110
If you shift 54 >> 2, it moves the bits to the right
00110110
00011011 (shift once)
00001101 (shift twice)
You end up with 13. If you shift 13 to the left (13 << 2):
00001101
00011010 (shift once)
00110100 (shift twice)
You end up with 52, not the 54 you started with.
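In other words, the best you can do when writing back is to shift the byte you kept up into the high half and supply the missing low byte from somewhere else (zero, if it doesn't matter). A rough C# sketch of the round trip, with illustrative names:

ushort original = 12123;                 // 0010 1111 0101 1011

// Existing read path: keep only the high byte.
int highByte = (original >> 8) & 0xFF;   // 47 (0010 1111)

// Reverse for saving: move the kept byte back up.
// The low 8 bits were discarded, so they have to come from
// somewhere else; here we just use 0.
byte lowByte = 0;
ushort rebuilt = (ushort)((highByte << 8) | lowByte);   // 12032, not 12123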
I am given an 11-bit signed hex value that must be stored in an Int32. When I cast the hex value to an Int32, the 11-bit value is obviously smaller than the Int32, so the higher-order bits are zero-filled.
Basically I need to be able to store 11-bit signed values in an Int32 or Int16, given an 11-bit hex value.
For example.
string hex = "0x7FF";
If I convert this to an Int32, for example with Convert.ToInt32(hex, 16),
I get 2047 when it should be -1 (according to the 11-bit binary 111 1111 1111).
How can I accomplish this in C#?
It's actually very simple, just two shifts. Shifting right keeps the sign, so that's useful. In order to use it, the sign of the 11 bit thing has to be aligned with the sign of the int:
x <<= -11;
Then do the right shift:
x >>= -11;
That's all.
The -11, which may seem odd, is just a shorter way to write 32 - 11. That's not in general the same thing, but shift counts are masked by 31 (ints) or 63 (longs), so in this case you can use that shortcut.
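A quick sketch of that in C# (assuming x already holds the raw 11-bit value in an int):

int x = 0x7FF;    // 2047 as an unsigned 11-bit value, -1 as a signed one

x <<= -11;        // same as x << 21: moves bit 10 of the value up into bit 31 (the sign bit)
x >>= -11;        // arithmetic shift right by 21 copies the sign bit back down

// x is now -1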
string hex = "0x7FF";
var i = (Convert.ToInt32(hex, 16) << 21) >> 21;
Preferably this would be done in C#.
Suppose I have an integer of 1024.
I want to be able to generate equations like these:
4096 >> 2 = 1024
65536 >> 6 = 1024
64 << 4 = 1024
and so on...
Any clues or tips or guides or ideas?
Edit: Ok, in simple terms, what I want is, for example...Hey, I'm giving you an integer of 1024, now give me a list of possible bit-shift equations that will always return the value of 1024.
Ok, scratch that. It seems my question wasn't very concise and clear. I'll try again.
What I want, is to generate a list of possible bit-shift equations based on a numerical value. For example, if I have a value of 1024, how would I generate a list of possible equations that would always return the value of 1024?
Sample Equations:
4096 >> 2 = 1024
65536 >> 6 = 1024
64 << 4 = 1024
In a similar way, if I asked you to give me some additional equations that would give me 5, you would respond:
3 + 2 = 5
10 - 5 = 5
4 + 1 = 5
Am I still too vague? I apologize for that.
You may reverse each equation and thus "generate" possible equations:
1024 >> 4 == 64
and therefore
64 << 4 == 1024
Thus, generate all right/left shifts of 1024 that don't lose bits to overflow or underflow of your variable, and then invert the corresponding equation.
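A small C# sketch of that approach (the class, method and variable names are my own): keep shifting the target right while no set bits fall off, and left while it still fits, emitting the reversed equation each time.

using System;
using System.Collections.Generic;

static class ShiftEquations
{
    public static IEnumerable<string> For(uint target)
    {
        // target >> n == smaller  implies  smaller << n == target,
        // as long as no 1-bits were shifted out at the bottom.
        for (int n = 1; n < 32; n++)
        {
            uint smaller = target >> n;
            if ((smaller << n) != target) break;   // a set bit fell off
            yield return $"{smaller} << {n} = {target}";
        }

        // target << n == bigger  implies  bigger >> n == target,
        // as long as nothing overflowed out at the top.
        for (int n = 1; n < 32; n++)
        {
            uint bigger = target << n;
            if ((bigger >> n) != target) break;    // overflowed
            yield return $"{bigger} >> {n} = {target}";
        }
    }
}

// Example: foreach (var eq in ShiftEquations.For(1024)) Console.WriteLine(eq);
// prints "512 << 1 = 1024", ..., "1 << 10 = 1024", "2048 >> 1 = 1024", and so on.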
Just add an extra '>' or '<':
uint value1= 4096 >> 2;
uint value2 = 65536 >> 6;
uint value3 = 64 << 4;
http://www.blackwasp.co.uk/CSharpShiftOperators.aspx
Are you asking why these relationships exist? Shifting left by 1 bit is equivalent to multiplying by 2, so 512 << 1 = 512 * 2 = 1024. Shifting right by 1 is dividing by 2. Shifting by 2 is multiplying/dividing by 4; shifting by n is multiplying/dividing by 2^n. So 1 << 10 = 1 * 2^10 = 1024. To see why, write the number out in binary; let's take 7 as an example:
7 -> 0000 0111b
7 << 1 -> 0000 1110b = 14
7 << 3 -> 0011 1000b = 56
If you already knew all this, I apologize, but you might want to make the question less vague.
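The same relationships, checked quickly in C#:

int a = 512 << 1;    // 1024  (512 * 2)
int b = 1 << 10;     // 1024  (1 * 2^10)
int c = 7 << 3;      // 56    (7 * 8)
int d = 1024 >> 2;   // 256   (1024 / 4)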
First an explanation of why:
I have a list of links to a variety of MP3 files and I'm trying to read the ID3 information for these files quickly. I'm only downloading the first 1500 or so bytes and trying to analyze the data within this chunk. I came across ID3Lib, but I could only get it to work on completely downloaded files and didn't notice any support for streams. (If I'm wrong on this, feel free to point that out.)
So basically, I'm left trying to parse the ID3 tag by myself. The size of the tag can be determined from four bytes near the start of the file. From the ID3 site:
The ID3v2 tag size is encoded with four bytes where the most
significant bit (bit 7) is set to zero in every byte, making a total
of 28 bits. The zeroed bits are ignored, so a 257 bytes long tag is
represented as $00 00 02 01.
So basically:
00000000 00000000 00000010 00000001
becomes
0000 00000000 00000001 00000001
I'm not too familiar with bit-level operations and was wondering if someone could shed some insight on an elegant way to ignore the leftmost bit of each of these four bytes. I'm ultimately trying to pull a base-10 integer out of it, so an answer that gets me straight there works as well.
If you've got the four individual bytes, you'd want:
int value = ((byte1 & 0x7f) << 21) |
            ((byte2 & 0x7f) << 14) |
            ((byte3 & 0x7f) << 7) |
            ((byte4 & 0x7f) << 0);
If you've got it in a single int already (call it packed):
int value = ((packed & 0x7f000000) >> 3) |
            ((packed & 0x7f0000) >> 2) |
            ((packed & 0x7f00) >> 1) |
            (packed & 0x7f);
To clear the most significant bit, AND with 127 (0x7F); this keeps all bits apart from the MSB.
int tag1 = tag1Byte & 0x7F; // this is the first one read from the file
int tag2 = tag2Byte & 0x7F;
int tag3 = tag3Byte & 0x7F;
int tag4 = tag4Byte & 0x7F; // this is the last one
To convert this into a single number, realize that each tag value is a base-128 digit. So the least significant is multiplied by 128^0 (1), the next by 128^1 (128), the third by 128^2 (16384), and so on.
int tagLength = tag4 + (tag3 << 7) + (tag2 << 14) + (tag1 << 21);
You mention you want to convert this to base 10. You can then convert it, say for printing, using the usual int-to-string conversion:
string base10 = tagLength.ToString();
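Putting the pieces together, a rough sketch of pulling the tag size out of the downloaded chunk (this assumes a standard ID3v2 header, where the four synchsafe size bytes sit at offsets 6 to 9; the class and method names are mine):

using System;

static class Id3
{
    public static int ReadTagSize(byte[] header)
    {
        // ID3v2 header: "ID3", 2 version bytes, 1 flags byte,
        // then the 4-byte synchsafe size at offsets 6..9.
        if (header.Length < 10 ||
            header[0] != (byte)'I' || header[1] != (byte)'D' || header[2] != (byte)'3')
            throw new ArgumentException("Not an ID3v2 header");

        return ((header[6] & 0x7F) << 21) |
               ((header[7] & 0x7F) << 14) |
               ((header[8] & 0x7F) << 7) |
               (header[9] & 0x7F);
    }
}

// Example: the spec's $00 00 02 01 size bytes decode to 257.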