BinaryReader.ReadInt32 result unexpected compared to input file, why? - c#

I am puzzled with a particular BinaryReader operation.
When viewing a binary file with a hex editor (UltraEdit), the first four bytes are: 52 62 38 11.
When iterating over the same file with a BinaryReader, if I call ReadInt32() first, I would expect the int value to be 1,382,168,593.
.ReadInt32(): Reads a 4-byte signed integer from the current stream and advances the current position of the stream by four bytes.
Instead, I get 288,907,858.
Clearly I am missing something obvious... can anyone explain what is going on?

BinaryReader reads bytes in little-endian order.
Observe:
csharp> 0x52623811; // What you expected it to read.
1382168593
csharp> 0x11386252; // What it actually read.
288907858
If you need to specify the byte ordering of the data you are reading, I would suggest using Mono.DataConvert. I've used it in several projects and it is incredibly useful, as well as MIT-licensed. (It does use unsafe code for performance reasons, so you can't use it in untrusted contexts.)
See the Wikipedia article on endianness for more info on the concept.
See Microsoft's Reference Source for details on the implementation of BinaryReader.

Intel architecture is little-endian: the last byte in the sequence is the most significant. So the bytes 52 62 38 11 represent the value 0x11386252.
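If the file actually stores its integers big-endian, a common workaround is to read the raw bytes and reverse them before converting; a minimal sketch (the type and helper names here are mine):

using System;
using System.IO;

class BigEndianDemo
{
    static int ReadInt32BigEndian(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(4); // file order: 52 62 38 11
        Array.Reverse(bytes);               // flip to the machine's little-endian order
        return BitConverter.ToInt32(bytes, 0);
    }

    static void Main()
    {
        byte[] data = { 0x52, 0x62, 0x38, 0x11 };
        using (var reader = new BinaryReader(new MemoryStream(data)))
            Console.WriteLine(ReadInt32BigEndian(reader)); // prints 1382168593
    }
}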

Related

Sending Int32 equal to 4, received as equal to 67108864

What's going on? I do this on the server:
var msg = Server.Api.CreateMessage();
msg.Write(2);
msg.Write(FreshChunks.Count());
Server.Api.SendMessage(msg, peer.Connection, NetDeliveryMethod.ReliableUnordered);
Then on the client it successfully reads the byte = 2, and the switch then routes to the function which reads the Int32 (FreshChunks.Count), which was equal to 4, but when received it equals 67108864. I've tried Int16 through Int64 and UInt16 through UInt64; none of them read out the correct value.
Given that:
- in your usage of msg.Write(2), the compiler treats the 2 as an int (Int32), and
- you mentioned that you "successfully read the byte = 2",
it seems that one of these options is happening:
- msg.Write writes only the bytes that have at least one bit set, to save space, or
- msg.Write always casts the given argument to a byte.
When you then asked for 4 bytes (an Int32), you got 0x04 00 00 00, where the first byte is exactly the 4 you passed. It seems that when msg.Read is asked for more bytes than the message holds (you requested 4 bytes and it has only 1, due to the msg.Write logic), it does one of these:
- pads the remaining bytes with zeros, or
- keeps on reading, and in your case there happened to be three 0 bytes in the message's metadata that were returned to you.
To solve your problem, you should read the documentation of the Write and Read methods and understand how they behave.
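For what it's worth, the types in the question (NetDeliveryMethod and so on) suggest the Lidgren.Network library, where the usual advice is to keep every Write paired with a Read of the same width. A hedged sketch, assuming Lidgren's standard overloads and an incoming message object I'm calling incoming:

// Writing: be explicit about the width of each field.
var msg = Server.Api.CreateMessage();
msg.Write((byte)2);                 // one byte for the message id
msg.Write(FreshChunks.Count());     // four bytes for the Int32 count
Server.Api.SendMessage(msg, peer.Connection, NetDeliveryMethod.ReliableUnordered);

// Reading: consume the fields in the same order and with the same widths.
byte id = incoming.ReadByte();      // matches the (byte)2 above
int count = incoming.ReadInt32();   // matches the Int32 count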

How do I use this CRC-32C C# library?

I've downloaded this library https://github.com/robertvazan/crc32c.net for a project I'm working on. I need to use a CRC in part of my project, so I downloaded the library, as it is obviously going to be much faster than anything I'm going to write in the near future.
I have some understanding of how CRCs work; I once made a software implementation of one (as part of learning) that worked. But I have got to be doing something incredibly stupid while trying to get this library to work and not realize it. No matter what I do, I can't seem to get crc = 0, even though the arrays were not changed.
Basically, my question is, how do I actually use this library to check for integrity of a byte array?
The way I understand it, I should call Crc32CAlgorithm.Compute(array) once to compute the CRC the first time, and then call it again on an array that has the previously returned value appended (I've tried appending it, as well as zeroing the last 4 bytes of the array before putting the returned value there); if the second call returns 0, the array was unchanged.
Please help me, I don't know what I'm doing wrong.
EDIT: It doesn't work right when I do this (yes, I realize LINQ is slow; this is just an example):
using (var hash = new Crc32CAlgorithm())
{
    var array = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 };
    var crc = hash.ComputeHash(array);
    var arrayWithCrc = array.Concat(crc).ToArray();
    Console.WriteLine(string.Join(" ", hash.ComputeHash(arrayWithCrc)));
}
Console outputs: 199 75 103 72
You do not need to append a CRC to a message and compute the CRC of that in order to check a CRC. Just compute the CRC on the message on one end, send that CRC along with the message, compute CRC on just the message on the other end (not including the sent CRC), and then compare the CRC you computed to the one that was sent with the message.
They should be equal to each other. That's all there is to it. That works for any hash you might use, not just CRCs.
If you feel deeply compelled to make use of the lovely mathematical property of CRCs where computing the CRC on the message with its CRC appended gives a specific result, you can. You have to append the CRC bits in the correct order, and you need to look for the "residue" of the CRC, which may not be zero.
In your case, you are in fact appending the bits in the correct order (by appending the bytes in little-endian order), and the result you are getting is the correct residue for the CRC-32C. That residue is 0x48674bc7, which separated into bytes, in little-endian order, and then converted into decimal is your 199 75 103 72.
You will find that if you take any sequence of bytes, compute the CRC-32C of that, append that CRC to the sequence in little-endian order, and compute the CRC-32C of the sequence plus CRC, you will always get 0x48674bc7.
However, that's a smidge slower than just comparing the two CRCs, since now you have to compute a CRC on four more bytes than before. So, really, there's no need to do it this way.
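For completeness, a minimal sketch of the compare-the-two-CRCs approach, assuming the library's static Crc32CAlgorithm.Compute(byte[]) helper:

using System;
using Crc32C;

class CrcCheckDemo
{
    static void Main()
    {
        byte[] message = { 1, 2, 3, 4, 5, 6, 7, 8 };

        // Sender: compute the CRC of the message alone and send it along.
        uint sentCrc = Crc32CAlgorithm.Compute(message);

        // Receiver: recompute over the received message only, then compare.
        uint computedCrc = Crc32CAlgorithm.Compute(message);
        Console.WriteLine(computedCrc == sentCrc); // True if the message is intact
    }
}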

Cutting random bytes off of file byte array in C#

So I've been working on this project for a while now, involving LSB steganography. Really fun stuff. Anyways, I just finished writing the code for embedding and extracting files from an image (instead of just plaintext), and I'm running into a problem. I can recognize the MIME type and extension of the bytes, but because the embedded file doesn't usually take up all of the LSBs of the image, there's a lot of garbage data. So I have the extracted file plus some garbage in the byte array right after it. I need to figure out how to cut the garbage off, so that the exported file is the correct, smaller size.
TLDR: I have a byte array with a recognized file in it, with some additional random bytes. How do I find out where the file ends and the random bytes begin?
Remember this is all in C#.
Any advice is appreciated.
Link to my project for reference: https://github.com/nicosogangstar/Steg
Generally you have two options.
End of stream marker
This is the more direct approach of the two, but it may lack some versatility depending on what data you want to hide. After you embed your data, continue by embedding a unique sequence of bits/bytes that you know cannot be prematurely encountered in the data before it. As you extract the bits, you can stop reading once you encounter this sequence. If you expect to hide only readable text, i.e. bytes with ASCII codes between 32 and 127, your marker can be as short as eight 0s or eight 1s. However, if you intend to hide any sort of binary data, where every byte value has a chance of appearing, you may accidentally encounter the marker while extracting legitimate data and thus halt the process prematurely.
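As a rough illustration of the marker idea (the helper names are mine, and it assumes a text-only payload where a zero byte can never occur):

using System;
using System.Collections.Generic;
using System.Text;

class MarkerDemo
{
    static byte[] ExtractUntilMarker(IEnumerable<byte> decodedBytes)
    {
        var payload = new List<byte>();
        foreach (byte b in decodedBytes)
        {
            if (b == 0)
                break;          // eight 0-bits: the marker, impossible in ASCII text
            payload.Add(b);
        }
        return payload.ToArray();
    }

    static void Main()
    {
        // "Hi" followed by the zero marker, then leftover LSB garbage.
        byte[] stream = { 72, 105, 0, 213, 7, 99 };
        Console.WriteLine(Encoding.ASCII.GetString(ExtractUntilMarker(stream))); // Hi
    }
}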
Header information
You can add a header preceding the data, e.g. another 16-24 bits (or any other amount), which can be translated to a number that tells you how many bits/bytes/pixels to read before stopping. For example, if you want to hide a byte array of size 1000, first embed 2 bytes encoding the length of the secret and then follow with the actual data. More specifically, split the length into 2 bytes, where the first byte holds the 8th to 15th bits and the second byte the 0th to 7th bits of the number 1000 in binary:
00000011 11101000     1000 in binary
       3      232     byte values
You can embed all sorts of information in a header, such as whether the data is encrypted or compressed with some algorithm, the original filename of the data, how many LSBs to read when extracting the information, etc.
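A minimal sketch of that 2-byte length header (variable names are mine; it assumes the payload length fits in 16 bits):

using System;

class HeaderDemo
{
    static void Main()
    {
        // Embedding side: prefix the payload with a 2-byte big-endian length.
        byte[] payload = { 10, 20, 30 };       // stand-in for the real secret data
        int length = payload.Length;           // must fit in 16 bits for a 2-byte header

        byte[] withHeader = new byte[2 + payload.Length];
        withHeader[0] = (byte)(length >> 8);   // bits 8..15 of the length
        withHeader[1] = (byte)(length & 0xFF); // bits 0..7 of the length
        Array.Copy(payload, 0, withHeader, 2, payload.Length);

        // Extraction side: rebuild the length from the first two bytes, then
        // read exactly that many payload bytes and ignore any garbage after.
        int recovered = (withHeader[0] << 8) | withHeader[1];
        byte[] secret = new byte[recovered];
        Array.Copy(withHeader, 2, secret, 0, recovered);
        Console.WriteLine(string.Join(" ", secret)); // 10 20 30
    }
}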

Compression of small string

I have data of 340 bytes in a string, mostly consisting of signs and numbers, like "føàA¹º#ƒUë5§Ž§".
I want to compress it into 250 or fewer bytes to save it on my RFID card.
As this data is related to a fingerprint template, I want lossless compression.
So is there any algorithm which I can implement in C# to compress it?
If the data is strictly numbers and signs, I highly recommend changing the numbers into int-based values, e.g.:
+12939272-23923+927392
can be compressed into three 32-bit integers, which takes it from 22 bytes down to 12 bytes. Picking the right integer size (32-bit, 24-bit, or 16-bit) should help.
If the integer sizes vary greatly, you could start with 8-bit values and use the value 255 to signal that the next 8 bits become the more significant bits of the integer, making it effectively 15-bit.
Alternatively, you could identify the most frequent character and assign the code 0 to it; the second most frequent gets 10, and the third 110. This is a very crude compression, but if your data is very limited, it might just do the job for you.
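A quick sketch of the integer-packing idea, assuming the text is a run of signed decimal numbers like the example above:

using System;
using System.IO;
using System.Text.RegularExpressions;

class PackDemo
{
    static void Main()
    {
        string input = "+12939272-23923+927392"; // 22 characters as text

        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            // Each match like "+12939272" becomes one 4-byte integer.
            foreach (Match m in Regex.Matches(input, @"[+-]\d+"))
                writer.Write(int.Parse(m.Value));

            Console.WriteLine(ms.Length); // 12 bytes instead of 22
        }
    }
}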
Is there any other information you know about your string? For instance does it contain certain characters more often than others? Does it contain all 255 characters or just a subset of them?
If so, Huffman encoding may help you; see this or this other link for implementations in C#.
To be honest, it just depends on what your input string looks like. What I'd do is try rar, zip, and 7zip (LZMA) with very small dictionary sizes (otherwise they'll just use up too much space on preprocessed information) and see how big the raw compressed file they produce is (you'll probably have to use their libraries in order to strip the headers and conserve space). If any of them produces a file under 250 bytes, then find the C# library for it and there you go.
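Along those lines, .NET's built-in Deflate is an easy first thing to try before pulling in external libraries; a minimal sketch (the string literal is just a stand-in for your real 340-byte payload):

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class CompressCheck
{
    static void Main()
    {
        // Stand-in for the real 340-byte payload from the question.
        byte[] raw = Encoding.UTF8.GetBytes("føàA¹º#ƒUë5§Ž§");

        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
                deflate.Write(raw, 0, raw.Length);

            // Note: very random data may come out larger, not smaller.
            Console.WriteLine("{0} -> {1} bytes", raw.Length, output.ToArray().Length);
        }
    }
}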

How to properly read 16 byte unsigned integer with BinaryReader

I need to parse a binary stream in .NET to convert a 16-byte unsigned integer. I would like to use the BinaryReader.ReadUIntXX() functions, but there isn't a BinaryReader.ReadUInt128() function available. I assume I will have to roll my own function using the ReadByte function and build an array, but I don't know if that is the most efficient method.
Thanks!
I would love to take credit for this, but one quick search of the net, and voilà:
http://msdn.microsoft.com/en-us/library/bb384066.aspx
Here is the code sample (which is on the same page)
byte[] bytes = { 0, 0, 0, 25 };

// If the system architecture is little-endian (that is, little end first),
// reverse the byte array.
if (BitConverter.IsLittleEndian)
    Array.Reverse(bytes);

int i = BitConverter.ToInt32(bytes, 0);
Console.WriteLine("int: {0}", i);
// Output: int: 25
The only thing that most developers do not know is the difference between big-endian and little-endian. Well, like most things in life, the human race simply can't agree on very simple things (left- and right-hand-drive cars are a good example as well). When the bits (remember 1s and 0s and binary math) are laid out, their order determines the value of the field. One byte is eight bits; then there is signed and unsigned, but let's stick to the order. The number 1 (one) can be represented in one of two ways, 10000000 or 00000001 (see clarification in the comments for a detailed explanation). As the comment in the code suggests, big-endian is the one with the one in front, little-endian is the one with the zero. (See http://en.wikipedia.org/wiki/Endianness - sorry, new user and they won't let me hyperlink more than once.) Why can't we all just agree???
I learned this lesson many years ago when dealing with embedded systems....remember linking? :) Am I showing my age??
I think the comments from 0xA3, SLaks, and Darin Dimitrov answered the question but to put it all together.
BinaryReader.ReadUInt128() is not supported by the BinaryReader class in .NET, and the only solution I could find was to create my own function. As 0xA3 mentioned, there is a BigInteger data type in .NET 4.0. I am in the process of creating my own function based upon everyone's comments.
Thanks!
A Guid is exactly 16 bytes in size.
Guid guid = new Guid(byteArray);
But you cannot do math with a Guid. If you need to, you can search for implementations of a BigInteger for .NET on the internet. You can then convert your byte array into a BigInteger.
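Putting the two answers together, a minimal sketch of the BigInteger route (System.Numerics, .NET 4.0+); the helper name is mine, and it assumes the stream stores the value little-endian:

using System;
using System.IO;
using System.Numerics;

class ReadUInt128Demo
{
    static BigInteger ReadUInt128(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(16);  // little-endian data assumed
        byte[] padded = new byte[17];         // extra zero byte keeps the value non-negative
        Array.Copy(bytes, padded, 16);        // BigInteger(byte[]) is little-endian and signed
        return new BigInteger(padded);
    }

    static void Main()
    {
        byte[] data = new byte[16];
        data[15] = 0x80;                      // high byte set: would read negative without padding
        using (var reader = new BinaryReader(new MemoryStream(data)))
            Console.WriteLine(ReadUInt128(reader)); // 2^127
    }
}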
