Write one single bit to binary file using BinaryWriter - c#

I want to write one single bit to a binary file.
using (FileStream fileStream = new FileStream(@"myfile.bin", FileMode.Create))
using (BinaryWriter binaryWriter = new BinaryWriter(fileStream))
{
    binaryWriter.Write(true);
}
I'd like something like binaryWriter.Write((bit)1);. When I use binaryWriter.Write(true), the file is one byte long, but I want to write one single bit. Is this possible?

No, it's not possible to write a single bit; the smallest unit you can write is a full byte. If you find yourself needing to write individual bits, queue them up in memory until you have 8, then combine them into a byte (using bit shifts etc.) and write that byte out.
Also from Wikipedia:
Historically, a byte was the number of bits used to encode a single
character of text in a computer and for this reason it is the
basic addressable element in many computer architectures.
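The queue-and-flush idea can be sketched like this (all the names here are made up for illustration, not a framework API): collect single bits in memory, then pack them MSB-first into whole bytes.

```csharp
using System;
using System.Collections.Generic;

// A minimal sketch: queue individual bits, then pack them MSB-first
// into whole bytes that BinaryWriter can handle.
List<bool> bits = new List<bool>();

void WriteBit(bool bit) => bits.Add(bit);

byte[] ToBytes()
{
    // One byte per 8 queued bits; the final byte is zero-padded.
    byte[] bytes = new byte[(bits.Count + 7) / 8];
    for (int i = 0; i < bits.Count; i++)
        if (bits[i])
            bytes[i / 8] |= (byte)(0x80 >> (i % 8));
    return bytes;
}

// Queue eight bits: 1111 0000 packs into the single byte 0xF0.
foreach (bool b in new[] { true, true, true, true, false, false, false, false })
    WriteBit(b);

byte[] packed = ToBytes();
Console.WriteLine(packed[0]); // 240 (0xF0)
```

The resulting array can then be handed to binaryWriter.Write(byte[]) as usual.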

You cannot store only 1 bit in a file. Almost all modern filesystems and hardware store data in segments of 8 bits, aka bytes or octets.
If you want to store a bit value in a file, store either 1 or 0 as a whole byte (00000001 or 00000000).

If you are writing only bits, you can pack 8 bits into a single byte,
but it is not possible to write a single bit on its own.

You could probably do it by read/modify/write, but why do you want to do such a thing? Whatever the reason, find another way: pack bits into bytes, read/write boolean bytes and convert them back into bits, use ASCII '0' and '1' - almost anything except reading and writing one bit at a time.

Related

Cutting random bytes off of file byte array in C#

So I've been working on this project for a while now, involving LSB steganography. Really fun stuff. Anyways, I just finished writing the code for embedding and extracting files from an image (instead of just plaintext), and I'm running into this problem. I can recognize the MIME and extension of the bytes, but because the embedded file doesn't usually take up all of the LSBs of the image, there's a lot of garbage data. So I have the extracted file + some garbage in the byte array right after it. I need to figure out how to cut these, so that the file that is being exported is the correct, smaller size.
TLDR: I have a byte array with a recognized file in it, with some additional random bytes. How do I find out where the file ends and the random bytes begin?
Remember this is all in C#.
Any advice is appreciated.
Link to my project for reference: https://github.com/nicosogangstar/Steg
Generally you have two options.
End of stream marker
This is the more direct approach of the two, but it may lack some versatility depending on what data you want to hide. After you embed your data, continue with embedding a unique sequence of bits/bytes such that you know it cannot be prematurely encountered in the data before. As you extract the bits, you can stop reading once you encounter this sequence. If you expect to hide only readable text, i.e. bytes with ASCII codes between 32 and 127, your marker can be as short as eight 0s, or eight 1s. However, if you intend to hide any sort of binary data, where each byte has a chance of appearing, you may accidentally encounter the marker while extracting legitimate data and thus halt the process prematurely.
Header information
You can add a header preceding the data, e.g., another 16-24 bits (or any other amount) which can be translated to a number that tells you how many bits/bytes/pixels to read before stopping. For example, if you want to hide a byte array of size 1000, first embed 2 bytes related to the length of the secret and then follow it with the actual data. More specifically, split the length into 2 bytes, where the first byte has the 8th to 15th bits and the second byte has the 0th to 7th bits of the number 1000 in binary.
00000011 11101000    1000 in binary
   3       232       byte values
You can embed all sorts of information in a header, such as whether the data is encrypted or compressed with some algorithm, the original filename of the data, how many LSBs to read for extracting the information, etc.
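The two-byte header arithmetic can be sketched directly (the variable names are just for illustration):

```csharp
using System;

// Sketch of the 16-bit length header: split the payload length into
// a high and a low byte before embedding, then recombine them when
// extracting. For a length of 1000 this gives the bytes 3 and 232.
int length = 1000;

byte high = (byte)(length >> 8);    // bits 8-15: 00000011 = 3
byte low  = (byte)(length & 0xFF);  // bits 0-7:  11101000 = 232

int recovered = (high << 8) | low;  // back to 1000 on extraction

Console.WriteLine($"{high} {low} -> {recovered}"); // 3 232 -> 1000
```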

reading binary from any type of file in c#

I need to read data from different types of files (wav, dll etc.) for a compression algorithm. The algorithm itself is mostly sorted out, but I'm having a problem when reading from non-text files.
What I need to do is read the ASCII representation of each character in the file and then apply my algorithm to what I've read.
I've used this for reading (path is the string that represents the path of the file, byte[] abc):
if (path != "") {
abc = File.ReadAllBytes(path);
}
It works just fine for text files (doc, txt, .m, etc.) but if I try to do this for a dll file I get the following error: Value was either too large or too small for an unsigned byte.
I've also tried setting abc as a string and using File.ReadAllText and then converting each character in the string to a byte value, but I get the same error.
I know that a wav file, for example, is composed of special characters if you open it in a text editor, and so far I think that the ASCII value for some of those characters is beyond 255, which may lead to the error. However, I don't know if that is in fact the case, and I'm a bit stuck on what I might do to sort out my issue.
If anyone has any idea I would most appreciate it. It would also be nice if you could stick to the language used (C#).
Thanks!
A byte is a value between 0 and 255. Every file on your computer consists of a number of bytes, regardless of whether they are wave files, dll files, text files or even files without extensions. You can ReadAllBytes from any file and all bytes returned contain values between 0 and 255.
ASCII is a character set that contains values between 0 and 127 - there are ASCII extensions or code pages that contain 256 possible values. Not all values can be represented (or displayed) though - a portion of ASCII and these extensions are control characters which have no default representation.
There are no ASCII characters beyond 255 - the characters you see is the text editor trying to make the best of it.
The error you get is from converting something (a character read via ReadAllText?) to an unsigned byte, which only allows values between 0 and 255 - decoding a binary file as text will certainly produce characters with code points above 255.
In short: you can't use ASCII to represent every possible value for a byte. You could use an ASCII extension to hold the byte's value but it would not make sense for a non-textual file (the 'A's you see when you open a .wav file in a text editor are not meant to be 'A's).
If you do want to continue down the path you have chosen, you'll have to post the code where you convert the bytes to an unsigned byte or ASCII value. But you probably should try to "convert" your algorithm into a binary one.
Use the following code:
using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
{
    byte[] buffer = new byte[fileStream.Length];
    int offset = 0;
    // Read may return fewer bytes than requested, so loop until the buffer is full.
    while (offset < buffer.Length)
    {
        int read = fileStream.Read(buffer, offset, buffer.Length - offset);
        if (read == 0) break; // end of stream
        offset += read;
    }
    return buffer;
}
Tried to read kernel32.dll and user32.dll with this code, and it worked just fine.

Converting Uint64 to 5 bytes and vice versa in c#

I have an application that expects 5 bytes derived from a number. Right now I am using a UInt32, which is 4 bytes. I've been told to make a class that uses a byte and a UInt32. However, I'm not sure how to combine them, since they are numbers.
I figured the best way may be to use a UInt64 and convert it down to 5 bytes. However, I am not sure how that can be done. I need to be able to convert it to 5 bytes, with a separate function in the class to convert it back to a UInt64.
Does anyone have any ideas on the best way to do this?
Thank you
Use BitConverter.GetBytes and then just remove the three bytes you don't need.
To convert it back use BitConverter.ToUInt64 remembering that you'll have to first pad your byte array with three extra (zero) bytes.
Just watch out for endianness. You may need to reverse your array if the application you are sending these bytes to expects the opposite endianness. The endianness also dictates whether you need to add/remove bytes from the start or the end of the array (check BitConverter.IsLittleEndian).
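A sketch of that round trip, assuming a little-endian machine (so the three discarded bytes sit at the end of the BitConverter array):

```csharp
using System;
using System.Linq;

// Keep the low 5 of the 8 bytes that BitConverter produces for a
// ulong, then pad back to 8 bytes to recover the value. On a
// little-endian machine the 3 dropped bytes are the high, zero-valued
// ones at the end of the array.
ulong value = (1UL << 40) - 1; // largest value that fits in 5 bytes

byte[] eight = BitConverter.GetBytes(value);        // 8 bytes
byte[] five = eight.Take(5).ToArray();              // drop the 3 high bytes

byte[] padded = five.Concat(new byte[3]).ToArray(); // pad back to 8
ulong roundTripped = BitConverter.ToUInt64(padded, 0);

Console.WriteLine(roundTripped == value); // True
```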

Compression of small string

I have data of 340 bytes in a string, mostly consisting of signs and numbers, like "føàA¹º#ƒUë5§Ž§".
I want to compress it into 250 or fewer bytes to save it on my RFID card.
As this data is related to finger print temp., I want lossless compression.
So is there any algorithm which I can implement in C# to compress it?
If the data is strictly numbers and signs, I highly recommend changing the numbers into int-based values, e.g.:
+12939272-23923+927392
can be compressed into three 32-bit integers, taking it from 22 bytes down to 12 bytes. Picking the right integer size (whether 32-bit, 24-bit or 16-bit) should help.
If the integer sizes vary greatly, you could start with 8 bits and use the value 255 to signal that the next 8 bits become the more significant bits of the integer, making it a 15-bit value.
Alternatively, you could identify the most frequent character and assign the code 0 to it; the second most frequent gets 10, and the third 110. This is a very crude compression, but if your data is very limited, it might just do the job for you.
Is there any other information you know about your string? For instance does it contain certain characters more often than others? Does it contain all 255 characters or just a subset of them?
If so, Huffman encoding may help you; see this or this other link for implementations in C#.
To be honest, it just depends on what your input string looks like. What I'd do is try rar, zip and 7zip (LZMA) with very small dictionary sizes (otherwise they'll just use up too much space for preprocessed information) and see how big the raw compressed output is (you'll probably have to use their libraries and strip the headers to conserve space). If any of them produce a file under 250 bytes, find the C# library for it and there you go.
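As a quick experiment along those lines, DeflateStream (raw deflate, no zip/gzip headers) gives a feel for how well a small payload compresses. The repeated-character string below is only a stand-in for the real data; high-entropy data like the sample in the question will not shrink nearly as much:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Compress a 340-byte stand-in payload with raw deflate and compare
// sizes. Real fingerprint-like data is far less repetitive, so its
// compressed size will be larger than this best-case example.
string data = new string('x', 340);

byte[] input = Encoding.UTF8.GetBytes(data);
byte[] compressed;
using (var output = new MemoryStream())
{
    using (var deflate = new DeflateStream(output, CompressionMode.Compress))
        deflate.Write(input, 0, input.Length);
    compressed = output.ToArray();
}

Console.WriteLine($"{input.Length} -> {compressed.Length} bytes");
```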

Getting a string, int, etc in binary representation?

Is it possible to get strings, ints, etc in binary format? What I mean is that assume I have the string:
"Hello" and I want to store it in binary format, so assume "Hello" is
11110000110011001111111100000000 in binary (I know it's not; I just typed something quickly).
Can I store the above binary not as a string, but in the actual format with the bits.
In addition to this, is it actually possible to store less than 8 bits? What I am getting at is: if the letter A is the most frequent letter used in a text, can I use 1 bit to store it for compression purposes, instead of building a binary tree?
Is it possible to get strings, ints,
etc in binary format?
Yes. There are several different methods for doing so. One common method is to make a MemoryStream out of an array of bytes, and then make a BinaryWriter on top of that memory stream, and then write ints, bools, chars, strings, whatever, to the BinaryWriter. That will fill the array with the bytes that represent the data you wrote. There are other ways to do this too.
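That MemoryStream/BinaryWriter approach looks like this; the byte layout in the comments follows BinaryWriter's documented format (little-endian ints, a length-prefixed UTF-8 string):

```csharp
using System;
using System.IO;
using System.Text;

// Write an int, a bool and a string into an in-memory buffer, then
// inspect the raw bytes that represent them.
byte[] bytes;
using (var stream = new MemoryStream())
using (var writer = new BinaryWriter(stream, Encoding.UTF8))
{
    writer.Write(42);      // 4 bytes, little-endian: 2A 00 00 00
    writer.Write(true);    // 1 byte: 01
    writer.Write("Hello"); // length prefix 05, then UTF-8: 48 65 6C 6C 6F
    bytes = stream.ToArray();
}

Console.WriteLine(BitConverter.ToString(bytes));
// 2A-00-00-00-01-05-48-65-6C-6C-6F
```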
Can I store the above binary not as a string, but in the actual format with the bits.
Sure, you can store an array of bytes.
is it actually possible to store less than 8 bits.
No. The smallest unit of storage in C# is a byte. However, there are classes that will let you treat an array of bytes as an array of bits. You should read about the BitArray class.
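For example, BitArray wraps a byte array and exposes each bit individually (bit 0 is the least significant bit of the first byte):

```csharp
using System;
using System.Collections;

// Treat one byte as eight addressable bits. BitArray maps index i to
// bit (i % 8) of byte (i / 8), least significant bit first.
var bits = new BitArray(new byte[] { 0b0000_0101 }); // bits 0 and 2 set

Console.WriteLine(bits[0]); // True
Console.WriteLine(bits[1]); // False
Console.WriteLine(bits[2]); // True

bits[7] = true; // set the most significant bit of the byte

byte[] back = new byte[1];
bits.CopyTo(back, 0);       // 0b1000_0101
Console.WriteLine(back[0]); // 133
```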
What encoding would you be assuming?
What you are looking for is something like Huffman coding, it's used to represent more common values with a shorter bit pattern.
How you store the bit codes is still limited to whole bytes. There is no data type that uses less than a byte. The way that you store variable width bit values is to pack them end to end in a byte array. That way you have a stream of bit values, but that also means that you can only read the stream from start to end, there is no random access to the values like you have with the byte values in a byte array.
What I am getting at is if the letter
A is the most frequent letter used in
a text, can I use 1 bit to store it
with regards to compression instead of
building a binary tree.
The algorithm you're describing is known as Huffman coding. To relate to your example, if 'A' appears frequently in the data, then the algorithm will represent 'A' as simply 1. If 'B' also appears frequently (but less frequently than A), the algorithm usually would represent 'B' as 01. Then, the rest of the characters would be 00xxxxx... etc.
In essence, the algorithm performs statistical analysis on the data and generates a code that will give you the most compression.
You can use things like:
BitConverter.GetBytes(1);
Encoding.ASCII.GetBytes("text");
Encoding.Unicode.GetBytes("text");
Once you have the bytes, you can do all the bit twiddling you want. You would need an algorithm of some sort before we can give you much more useful information.
The string is actually stored in binary format, as are all strings.
The difference between a string and another data type is that when your program displays the string, it retrieves the binary and shows the corresponding (ASCII) characters.
If you were to store data in a compressed format, you would need to assign more than 1 bit per character. How else would you identify which character is the most frequent?
If 1 represents an 'A', what does 0 mean? all the other characters?
