I am trying to write an encoded file. The file has 9 to 12 bit symbols. I suspect the 9 bit symbols are not being written correctly, because I am unable to decode the file afterwards. When the file contains only 8 bit symbols, everything works fine. This is how I am writing the file:
File.AppendAllText(outputFileName, WriteBackContent, ASCIIEncoding.Default);
The same goes for reading, using a ReadAllText call.
What is the way to go here?
I am using the ZXing library to encode my file with its Reed-Solomon encoder.
ReedSolomonEncoder enc = new ReedSolomonEncoder(GenericGF.AZTEC_DATA_12); // if I use AZTEC_DATA_8 it works fine because the symbol size is 8 bits
int[] bytesAsInts = Array.ConvertAll(toBytes.ToArray(), c => (int)c);
enc.encode(bytesAsInts, parity);
byte[] bytes = bytesAsInts.Select(x => (byte)x).ToArray();
string contentWithParity = (ASCIIEncoding.Default.GetString(bytes.ToArray()));
WriteBackContent += contentWithParity;
File.AppendAllText(outputFileName, WriteBackContent, ASCIIEncoding.Default);
As shown in the code, I am initializing my encoder with AZTEC_DATA_12, which means 12 bit symbols. Because the RS encoder requires an int array, I convert the data to an int array and then write it to the file as shown above. It works well with AZTEC_DATA_8 because of the 8 bit symbol size, but not with AZTEC_DATA_12.
The main problem is here:
byte[] bytes = bytesAsInts.Select(x => (byte)x).ToArray();
You are basically throwing away part of the result when converting the single integers to single bytes.
If you look at the array after the call to encode(), you can see that some of the array elements have a value higher than 255, so they cannot be represented as bytes. However, in your code quoted above, you cast every single element in the integer array to byte, changing the element when it has a value greater than 255.
So to store the result of encode(), you have to convert the integer array to a byte array in a way that the values are not lost or modified.
To make this kind of conversion between byte arrays and integer arrays, you can use Buffer.BlockCopy(). An example of how to use this function is in this answer.
Use the samples from that answer (and the one from the comment on it) for both conversions: turning a byte array into an integer array to pass to encode(), and turning the integer array returned from encode() back into a byte array.
Here are the code samples from the linked answer:
// Convert integer array to byte array
byte[] result = new byte[intArray.Length * sizeof(int)];
Buffer.BlockCopy(intArray, 0, result, 0, result.Length);
// Convert byte array to integer array (with bugs fixed)
int bytesCount = byteArray.Length;
int intsCount = bytesCount / sizeof(int);
if (bytesCount % sizeof(int) != 0) intsCount++;
int[] result = new int[intsCount];
Buffer.BlockCopy(byteArray, 0, result, 0, byteArray.Length);
Now, about storing the data in files: do not turn the data into a string directly via Encoding.GetString(). Not all byte sequences are valid representations of characters in a given character set, so converting an arbitrary sequence of bytes into a string will sometimes fail or corrupt the data.
Instead, either store/read the byte array directly via File.WriteAllBytes() / File.ReadAllBytes(), or use Convert.ToBase64String() and Convert.FromBase64String() to work with a base64 encoded string representation of the byte array.
Here is some combined sample code:
ReedSolomonEncoder enc = new ReedSolomonEncoder(GenericGF.AZTEC_DATA_12); // if I use AZTEC_DATA_8 it works fine because the symbol size is 8 bits
int[] bytesAsInts = Array.ConvertAll(toBytes.ToArray(), c => (int)c);
enc.encode(bytesAsInts, parity);
// Turn the int array into a byte array without losing values
byte[] bytes = new byte[bytesAsInts.Length * sizeof(int)];
Buffer.BlockCopy(bytesAsInts, 0, bytes, 0, bytes.Length);
// Write to file
File.WriteAllBytes(outputFileName, bytes);
// Read from file
bytes = File.ReadAllBytes(outputFileName);
// Turn byte array to int array
int bytesCount = bytes.Length;
int intsCount = bytesCount / sizeof(int);
if (bytesCount % sizeof(int) != 0) intsCount++;
int[] dataAsInts = new int[intsCount];
Buffer.BlockCopy(bytes, 0, dataAsInts, 0, bytes.Length);
// Decoding
ReedSolomonDecoder dec = new ReedSolomonDecoder(GenericGF.AZTEC_DATA_12);
dec.decode(dataAsInts, parity);
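If you prefer the base64 route mentioned above over raw bytes, here is a minimal sketch of that variant (reusing the bytes array from the sample above; keeping the file text-based is an assumption, not something required by the code):
// Store the encoded data as a base64 text file instead of raw bytes.
string base64 = Convert.ToBase64String(bytes);
File.WriteAllText(outputFileName, base64);
// Read the text back and decode it into the original byte array.
byte[] restored = Convert.FromBase64String(File.ReadAllText(outputFileName));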
Related
I am trying to upload a base64 string of a signature, but I need it to be a base64 encoding of an array of 16-bit words stored in little-endian byte order. Can anyone help me convert the base64 string to a 16-bit array in little-endian byte order and then convert it back to base64?
To do this you can create arrays of the correct type (byte[] and short[]) and use Buffer.BlockCopy() to copy the bytes between them, thus converting the data.
This does not account for little-endian/big-endian differences, but since you state that this only needs to run on little-endian systems, we don't need to worry about it.
Here's a sample console app that demonstrates how to do the conversion. It does the following:
Create an array of shorts 0..99 inclusive.
Convert array of shorts to array of bytes (preserving endianness).
Convert array of bytes to base 64 string.
Convert base 64 string back into array of bytes.
Convert array of bytes back into array of shorts (preserving endianness).
Compare converted array of shorts with original array to prove correctness.
Here's the code:
using System;
using System.Linq;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create demo array of shorts 0..99 inclusive.
            short[] sourceShorts = Enumerable.Range(0, 100).Select(i => (short)i).ToArray();

            // Convert array of shorts to array of bytes. (Will be little-endian on Intel.)
            int byteCount = sizeof(short) * sourceShorts.Length;
            byte[] dataAsByteArray = new byte[byteCount];
            Buffer.BlockCopy(sourceShorts, 0, dataAsByteArray, 0, byteCount);

            // Convert array of bytes to base 64 string.
            var asBase64 = Convert.ToBase64String(dataAsByteArray);
            Console.WriteLine(asBase64);

            // Convert base 64 string back to array of bytes.
            byte[] fromBase64 = Convert.FromBase64String(asBase64);

            // Convert array of bytes back to array of shorts.
            if (fromBase64.Length % sizeof(short) != 0)
                throw new InvalidOperationException("Byte array size must be multiple of sizeof(short) to be convertable to shorts");
            short[] destShorts = new short[fromBase64.Length / sizeof(short)];
            Buffer.BlockCopy(fromBase64, 0, destShorts, 0, fromBase64.Length);

            // Prove that the unconverted shorts match the source shorts.
            if (destShorts.SequenceEqual(sourceShorts))
                Console.WriteLine("Converted and unconverted successfully");
            else
                Console.WriteLine("Error: conversion was unsuccessful");
        }
    }
}
Q: Is there any benefit of storing the length of a large array within the array itself?
Explanation:
Let's say we compress some large binary serialized object by using the GZipStream class of the System.IO.Compression namespace.
The output will be a Base64 string of some compressed byte array.
At some later point the Base64 string gets converted back to a byte array and the data needs to be decompressed.
While compressing the data we create a new byte array with the size of the compressed byte array + 4.
In the first 4 bytes we store the length of the original (uncompressed) data, and we then BlockCopy the length and the compressed data into the new array. This new array gets converted into a Base64 string.
While decompressing we convert the Base64 string into a byte array.
Now we can extract the length of the original data by using the BitConverter class, which will extract an Int32 from the first 4 bytes.
We then allocate a byte array with the length that we got from the first 4 bytes and let the Stream write the decompressed bytes to the byte array.
I can't imagine that something like this actually has any benefit at all.
It adds more complexity to the code and more operations need to be executed.
Readability is reduced too.
The BlockCopy operations alone should consume so many resources that this just cannot have a benefit, right?
Compression example code:
byte[] buffer = new byte[0xffff]; // Some large binary serialized object
// Compress in-memory.
using (var mem = new MemoryStream())
{
    // The actual compression takes place here.
    using (var zipStream = new GZipStream(mem, CompressionMode.Compress, true))
    {
        zipStream.Write(buffer, 0, buffer.Length);
    }
    // Store the compressed byte data here.
    var compressedData = new byte[mem.Length];
    mem.Position = 0;
    mem.Read(compressedData, 0, compressedData.Length);
    /* Increase the size by 4 to accommodate an Int32 that
    ** will store the length of the original (uncompressed) data. */
    var zipBuffer = new byte[compressedData.Length + 4];
    // Store the compressedData array after the first 4 bytes, which hold the length.
    Buffer.BlockCopy(compressedData, 0, zipBuffer, 4, compressedData.Length);
    // Store the length of the original buffer in the first 4 bytes.
    Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, zipBuffer, 0, 4);
    return Convert.ToBase64String(zipBuffer);
}
Decompression example code:
byte[] zipBuffer = Convert.FromBase64String("some base64 string");
using (var inStream = new MemoryStream())
{
    // The length that was stored in the first 4 bytes.
    int dataLength = BitConverter.ToInt32(zipBuffer, 0);
    // Allocate an array of that size for the decompressed data.
    byte[] buffer = new byte[dataLength];
    // Write the compressed data (everything after the first 4 bytes) to the stream.
    inStream.Write(zipBuffer, 4, zipBuffer.Length - 4);
    inStream.Position = 0;
    // Decompress data.
    using (var zipStream = new GZipStream(inStream, CompressionMode.Decompress))
    {
        zipStream.Read(buffer, 0, buffer.Length);
    }
    ... code
    ... code
    ... code
}
You tagged the question as C#, which means .NET, so the question is moot:
The framework already stores the length with the array. That is how the array classes do the sanity checks on indexers, and how overflow attacks are prevented in managed code. That safety alone is worth any minor inefficiency (note that the JIT is actually able to prune most of the checks; with a loop, for example, it will simply look at the loop variable once per loop).
You would have to go all the way into unmanaged code and handle raw pointers to have any hope of getting rid of it. But why would you? The difference is so small that it falls under the speed rant. If it matters, you probably have a real-time programming case, and starting one of those with .NET was a bad idea in the first place.
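To illustrate the point with a trivial sketch (nothing specific to the question's code, just the built-in behavior):
int[] data = new int[10];
Console.WriteLine(data.Length); // 10 - the CLR already stores the size with the array
try
{
    int x = data[10]; // out of range
}
catch (IndexOutOfRangeException)
{
    Console.WriteLine("The built-in bounds check caught the bad index.");
}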
As the title states, I'm trying to convert a byte array to a BitArray and back to a byte array again.
I am aware that Array.CopyTo() takes care of that, but the byte array I get back is not the same as the original one, due to how BitArray stores values LSB-first.
How do you go about it in C#?
This should do it:
static byte[] ConvertToByte(BitArray bits)
{
    // Make sure we have enough space allocated even when number of bits is not a multiple of 8
    var bytes = new byte[(bits.Length - 1) / 8 + 1];
    bits.CopyTo(bytes, 0);
    return bytes;
}
You can verify it using a simple driver program like below
// test to make sure it works
static void Main(string[] args)
{
    var bytes = new byte[] { 10, 12, 200, 255, 0 };
    var bits = new BitArray(bytes);
    var newBytes = ConvertToByte(bits);

    if (bytes.SequenceEqual(newBytes))
        Console.WriteLine("Successfully converted byte[] to bits and then back to byte[]");
    else
        Console.WriteLine("Conversion Problem");
}
I know that the OP is aware of the Array.CopyTo solution (which is similar to what I have here), but I don't see why it would cause any bit-order issues. FYI, I am using .NET 4.5.2 to verify it, which is why I have provided the test case above to confirm the results.
To get a BitArray from a byte[] you can simply use the constructor of BitArray:
BitArray bits = new BitArray(bytes);
To get the byte[] back from the BitArray there are many possible solutions. I think a very elegant one is to use the BitArray.CopyTo method. Just create a new array and copy the bits into it:
byte[] resultBytes = new byte[(bits.Length - 1) / 8 + 1];
bits.CopyTo(resultBytes, 0);
I have a string that only contains 1 and 0 and I need to save this to a .txt-File.
I also want it to be as small as possible. Since I have binary code, I can turn it into pretty much anything. Saving the string as-is is not an option, since every character takes up a whole byte, even though it only represents a 1 or a 0.
I thought about turning my string into an array of bytes, but trying to convert "11111111" to a Byte gave me a System.OverflowException (presumably because it was parsed as the decimal number 11111111).
My next thought was using an ASCII code page or something, but I don't know how reliable that is. Alternatively, I could turn every 8-bit piece of my string into the corresponding number: 8 characters would turn into at most 3 (for 255), which seems pretty nice to me. And since I know the highest individual number will be 255, I don't even need any delimiter for decoding (a sketch of this idea follows the question).
But I'm sure there's a better way.
So:
What exactly is the best/most efficient way to store a string that only contains 1 and 0?
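(For reference, the 8-characters-per-byte idea from the question could be sketched like this; PackBits is a hypothetical helper, not part of any answer below. Passing base 2 to Convert.ToByte avoids the overflow mentioned above.)
static byte[] PackBits(string bits)
{
    // Pad to a multiple of 8 so the last byte is fully defined.
    if (bits.Length % 8 != 0)
        bits = bits.PadRight(bits.Length + (8 - bits.Length % 8), '0');
    byte[] packed = new byte[bits.Length / 8];
    for (int i = 0; i < packed.Length; i++)
        packed[i] = Convert.ToByte(bits.Substring(i * 8, 8), 2); // parse each 8-character chunk as base 2
    return packed;
}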
You could represent all your data as 64 bit integers and then write them to a binary file:
// The string we are working with.
string str = @"1010101010010100010101101";
// The number of bits in a 64 bit integer!
int size = 64;
// Pad the end of the string with zeros so the length of the string is divisible by 64.
if (str.Length % size != 0)
    str += new string('0', size - str.Length % size);
// Convert each 64 character segment into a 64 bit integer.
long[] binary = new long[str.Length / size]
    .Select((x, idx) => Convert.ToInt64(str.Substring(idx * size, size), 2)).ToArray();
// Copy the result to a byte array.
byte[] bytes = new byte[binary.Length * sizeof(long)];
Buffer.BlockCopy(binary, 0, bytes, 0, bytes.Length);
// Write the result to file.
File.WriteAllBytes("MyFile.bin", bytes);
EDIT:
If you're only writing 64 bits then it's a one-liner:
File.WriteAllBytes("MyFile.bin", BitConverter.GetBytes(Convert.ToUInt64(str, 2)));
I would suggest using BinaryWriter. Like this:
BinaryWriter writer = new BinaryWriter(File.Open(fileName, FileMode.Create));
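The answer stops at opening the writer; a minimal sketch of how it might be used, assuming the 1/0 string has already been packed into a byte[] called packedBytes (an assumed variable, not something from the answer):
using (BinaryWriter writer = new BinaryWriter(File.Open(fileName, FileMode.Create)))
{
    writer.Write(packedBytes.Length); // optionally store the byte count first
    writer.Write(packedBytes);        // then the packed data itself
}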
I am using BitConverter.ToInt32 to convert a byte array into an int.
I have only two bytes [0][26], but the function needs 4 bytes, so I have to add two 0 bytes to the front of the existing bytes.
What is the quickest method?
Thank you.
You should probably do (int)BitConverter.ToInt16(..) instead. ToInt16 is made to read two bytes into a short. Then you simply convert that to an int with the cast.
You should call BitConverter.ToInt16, which only reads two bytes.
short is implicitly convertible to int.
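A quick sketch of that suggestion (data is just an assumed example array holding the two bytes from the question):
byte[] data = { 0, 26 };                       // the two bytes from the question
short asShort = BitConverter.ToInt16(data, 0); // interpreted using the machine's byte order
int value = asShort;                           // short widens to int implicitly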
Use Array.Copy. Here is some code:
byte[] arr = new byte[] { 0x12, 0x34 };
byte[] done = new byte[4];
Array.Copy(arr, 0, done, 2, 2); // http://msdn.microsoft.com/en-us/library/z50k9bft.aspx
int myInt = BitConverter.ToInt32(done, 0); // 0x34120000 on a little-endian machine (see the note on endianness below)
However, a call to BitConverter.ToInt16(byte[], int) seems like a better idea; then just save the result to an int:
int myInt = BitConverter.ToInt16(...);
Keep endianness in mind, however. On little-endian machines, the byte array { 0x00, 0x02 } is read as 512, not 2 (the numeric literal 0x0002 is still 2, regardless of endianness).
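If the incoming bytes are big-endian (as [0][26] representing the value 26 would suggest, which is an assumption about the data), one way to normalize them before calling BitConverter could be:
byte[] data = { 0, 26 };                   // big-endian representation of 26
if (BitConverter.IsLittleEndian)
    Array.Reverse(data);                   // flip to match the machine's byte order
int value = BitConverter.ToInt16(data, 0); // 26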