I have the following code:
using (BinaryReader br = new BinaryReader(
    File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
    int pos = 0;
    int length = (int)br.BaseStream.Length;
    byte[] b = new byte[length]; // buffer for the whole file
    while (pos < length)
    {
        b[pos] = br.ReadByte();
        pos++;
    }
    pos = 0;
    while (pos < length)
    {
        Console.WriteLine(Convert.ToString(b[pos]));
        pos++;
    }
}
The FILE_PATH is a const string that contains the path to the binary file being read.
The binary file is a mixture of integers and characters.
The integers are 1 byte each and each character is written to the file as 2 bytes.
For example, the file has the following data:
1HELLO HOW ARE YOU45YOU ARE LOOKING GREAT //and so on
Please note: Each integer is associated with the string of characters following it. So 1 is associated with "HELLO HOW ARE YOU" and 45 with "YOU ARE LOOKING GREAT" and so on.
Now the binary is written (I do not know why but I have to live with this) such that '1' will take only 1 byte while 'H' (and other characters) take 2 bytes each.
So here is what the file actually contains:
0100480045..and so on
Here's the breakdown:
01 is the first byte for the integer 1
0048 are the 2 bytes for 'H' ('H' is 0x48)
0045 are the 2 bytes for 'E' ('E' is 0x45)
and so on..
I want my console to print this file in a human-readable format: that is, I want it to print "1 HELLO HOW ARE YOU" and then "45 YOU ARE LOOKING GREAT" and so on...
Is what I am doing correct? Is there an easier/more efficient way?
My line Console.WriteLine(Convert.ToString(b[pos])); does nothing but print the integer value, not the actual character I want. That is fine for the integers in the file, but how do I read out the characters?
Any help would be much appreciated.
Thanks
I think what you are looking for is Encoding.GetString.
Since your string data is composed of 2-byte characters, here is how you can get your string out:
for (int i = 0; i < b.Length; i++)
{
    byte curByte = b[i];
    // Assuming that the first byte of a 2-byte character sequence will be 0
    if (curByte != 0)
    {
        // This is a 1-byte number
        Console.WriteLine(Convert.ToString(curByte));
    }
    else
    {
        // This is a 2-byte character stored high byte first, so decode it
        // with the big-endian UTF-16 encoding. Print it out.
        Console.WriteLine(Encoding.BigEndianUnicode.GetString(b, i, 2));
        // We consumed the next byte as well, no need to deal with it
        // in the next round of the loop.
        i++;
    }
}
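To get the exact output you describe ("1 HELLO HOW ARE YOU" on one line, "45 YOU ARE LOOKING GREAT" on the next), here is a minimal sketch built on the same assumption, that a zero high byte marks a 2-byte character. The FILE_PATH value here is a placeholder:
using System;
using System.IO;
using System.Text;

class Program
{
    const string FILE_PATH = "data.bin"; // placeholder path

    static void Main()
    {
        byte[] b = File.ReadAllBytes(FILE_PATH);
        var line = new StringBuilder();

        for (int i = 0; i < b.Length; i++)
        {
            if (b[i] != 0)
            {
                // A 1-byte integer starts a new record: flush the previous one.
                if (line.Length > 0)
                {
                    Console.WriteLine(line);
                    line.Clear();
                }
                line.Append(b[i]).Append(' ');
            }
            else
            {
                // Two bytes, high byte first, form one character.
                line.Append(Encoding.BigEndianUnicode.GetString(b, i, 2));
                i++; // the low byte was consumed too
            }
        }

        if (line.Length > 0)
            Console.WriteLine(line);
    }
}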
You can use System.Text.UnicodeEncoding.GetString(), which takes a byte[] array and produces a string.
Note that this is not the same as just blindly copying the bytes from the byte[] array into a hunk of memory and calling it a string. The GetString() method must validate the bytes and forbid invalid surrogates, for example.
using (BinaryReader br = new BinaryReader(File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
    int pos = 0;
    int length = (int)br.BaseStream.Length;
    // Worst case, a 1-byte integer expands to three digit characters (6 bytes).
    byte[] buffer = new byte[length * 6];
    int bufferPosition = 0;
    while (pos < length)
    {
        byte b = br.ReadByte();
        if (b != 0)
        {
            // A 1-byte integer (assumed never 0, since a character's high byte
            // is 0): write its decimal digits as big-endian 2-byte characters.
            foreach (char digit in b.ToString())
            {
                buffer[bufferPosition] = 0;
                buffer[bufferPosition + 1] = (byte)digit;
                bufferPosition += 2;
            }
            pos++;
        }
        else
        {
            // A 2-byte character: b is its zero high byte, copy both bytes.
            buffer[bufferPosition] = b;
            buffer[bufferPosition + 1] = br.ReadByte();
            bufferPosition += 2;
            pos += 2;
        }
    }
    // The pairs are stored high byte first, so decode them as big-endian UTF-16.
    Console.WriteLine(System.Text.Encoding.BigEndianUnicode.GetString(buffer, 0, bufferPosition));
}
I'm trying to convert a hex value back to a string, but it's not working.
I have the following value 0x01BB92E7F716F55B144768FCB2EA40187AE6CF6B2E52A64F7331D0539507441F7D770112510D679F0B310116B0D709E049A19467672FFA532A7C30DFB72
The result I hope for is this base64 string:
AbuS5/cW9VsUR2j8supAGHrmz2suUqZPczHQU5UHRB99dwESUQ1nnwsxARaw1wngSaGUZ2cv+lMqfDDftw==
but executing the function below displays this result:
»’ Ç ÷ õ [Ghü²ê # zæÏk.R¦Os1ÐS • D} w Q gŸ 1 ° × àI¡ ”gg / úS * | 0ß ·) = ¤
Any idea how I can extract the information as expected?
// requires: using System.Globalization; using System.Text;
public static string Hex2String(string input)
{
    var builder = new StringBuilder();
    for (int i = 0; i < input.Length; i += 2)
    {
        // Substring/Parse throw an exception if the input is not properly formatted
        string hexdec = input.Substring(i, 2);
        int number = Int32.Parse(hexdec, NumberStyles.HexNumber);
        char charToAdd = (char)number;
        builder.Append(charToAdd);
    }
    return builder.ToString();
}
Your result is base64-encoded. Base64 is a way of taking a byte array and turning it into human-readable characters.
Your code tries to take these raw bytes and cast them to chars, but not all byte values are valid printable characters: some are control characters, some can't be printed, etc.
Instead, let's turn the hex string into a byte array, and then turn that byte array into a base64 string.
string input = "01BB92E7F716F55B144768FCB2EA40187AE6CF6B2E52A64F7331D0539507441F7D770112510D679F0B310116B0D709E049A19467672FFA532A7C30DFB72";
byte[] bytes = new byte[input.Length / 2];
for (int i = 0; i < bytes.Length; i++)
{
    bytes[i] = byte.Parse(input.Substring(i * 2, 2), NumberStyles.HexNumber);
}
string result = Convert.ToBase64String(bytes);
This results in:
AbuS5/cW9VsUR2j8supAGHrmz2suUqZPczHQU5UHRB99dwESUQ1nnwsxARaw1wngSaGUZ2cv+lMqfDDftw==
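As a side note: if you are targeting .NET 5 or later (an assumption about your project), the manual parsing loop can be replaced with Convert.FromHexString:
byte[] bytes = Convert.FromHexString(input); // .NET 5+ only
string result = Convert.ToBase64String(bytes);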
Below is the checksum description.
The checksum is four ASCII character digits representing the binary sum of the characters including the
first character of the transmission and up to and including the checksum field identifier characters.
To calculate the checksum add each character as an unsigned binary number, take the lower 16 bits of the
total and perform a 2's complement. The checksum field is the result represented by four hex digits.
To verify the correct checksum on received data, simply add all the hex values including the checksum. It
should equal zero.
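To illustrate the verification rule in that last paragraph, here is a minimal sketch; it assumes (for the example only) that the four hex digits are the very last characters of the message:
static bool ChecksumIsValid(string received)
{
    // Split off the four trailing hex digits and parse them as a 16-bit value.
    string body = received.Substring(0, received.Length - 4);
    ushort checksum = Convert.ToUInt16(received.Substring(received.Length - 4), 16);

    // Add every character's value, then the checksum itself;
    // the lower 16 bits of the total must come out to zero.
    int sum = 0;
    foreach (char c in body)
        sum += c;
    return (ushort)(sum + checksum) == 0;
}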
Below is the implementation for an ASCII string, but my input string is now UTF-8.
Can anyone give some ideas on revising the implementation for UTF-8 encoding? Thanks very much.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace SIP2
{
    // Adapted from VB.NET from the Library Tech Guy blog
    // http://librarytechguy.blogspot.com/2009/11/sip2-checksum_13.html
    public class CheckSum
    {
        public static string ApplyChecksum(string strMsg)
        {
            int intCtr;
            char[] chrArray;
            int intAscSum;
            bool blnCarryBit;
            string strBinVal = String.Empty;
            string strInvBinVal;
            string strNewBinVal = String.Empty;

            // Transfer the SIP message to a character array. Loop through each
            // character of the array, converting the character to an ASCII value
            // and adding the value to a running total.
            intAscSum = 0;
            chrArray = strMsg.ToCharArray();
            for (intCtr = 0; intCtr <= chrArray.Length - 1; intCtr++)
            {
                intAscSum = intAscSum + (chrArray[intCtr]);
            }

            // Next, convert the ASCII sum to a binary string by:
            // 1) taking the remainder of the ASCII sum divided by 2
            // 2) repeating until the sum reaches 0
            // 3) padding to 16 digits with leading zeroes
            do
            {
                strBinVal = (intAscSum % 2).ToString() + strBinVal;
                intAscSum = intAscSum / 2;
            } while (intAscSum > 0);
            strBinVal = strBinVal.PadLeft(16, '0');

            // Next, invert all bits in the binary string.
            chrArray = strBinVal.ToCharArray();
            strInvBinVal = "";
            for (intCtr = 0; intCtr <= chrArray.Length - 1; intCtr++)
            {
                if (chrArray[intCtr] == '0') { strInvBinVal = strInvBinVal + '1'; }
                else { strInvBinVal = strInvBinVal + '0'; }
            }

            // Next, add 1 to the inverted binary string. Loop from the least
            // significant digit (rightmost) to the most significant (leftmost);
            // while the carry bit is set, flip '0' to '1' (clearing the carry)
            // and '1' to '0' (keeping the carry).
            blnCarryBit = true;
            chrArray = strInvBinVal.ToCharArray();
            for (intCtr = chrArray.Length - 1; intCtr >= 0; intCtr--)
            {
                if (blnCarryBit == true)
                {
                    if (chrArray[intCtr] == '0')
                    {
                        chrArray[intCtr] = '1';
                        blnCarryBit = false;
                    }
                    else
                    {
                        chrArray[intCtr] = '0';
                        blnCarryBit = true;
                    }
                }
                strNewBinVal = chrArray[intCtr] + strNewBinVal;
            }

            // Finally, convert the binary string to a four-digit hex value and
            // append it to the original SIP message.
            return strMsg + (Convert.ToInt16(strNewBinVal, 2)).ToString("X4");
        }
    }
}
Replace the code
for (intCtr = 0; intCtr <= chrArray.Length - 1; intCtr++)
{
    intAscSum = intAscSum + (chrArray[intCtr]);
}
chrArray[intCtr] yields each character of the input ASCII string as its ASCII code in decimal; for example, 'A' is 65. ASCII encoding uses only 1 byte per character, whereas UTF-8 uses one or more bytes per character. The chrArray[intCtr] sum was designed for ASCII, so feeding it UTF-8 input containing multi-byte characters gives the wrong result.
with:
byte[] bytes = Encoding.UTF8.GetBytes(strMsg);
for (int i = 0; i < bytes.Length; i++)
{
    intAscSum = intAscSum + bytes[i];
}
Add up all the bytes, because one UTF-8 character can be more than one byte.
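Putting it together, here is a compact sketch of the whole method for UTF-8 input. Note that I've swapped the binary-string manipulation for plain 16-bit arithmetic; that is my simplification, not the original code:
using System.Text;

public static string ApplyChecksum(string strMsg)
{
    // Sum every byte of the UTF-8 encoding; one UTF-8 character
    // can contribute more than one byte to the total.
    int sum = 0;
    foreach (byte b in Encoding.UTF8.GetBytes(strMsg))
        sum += b;

    // Lower 16 bits, two's complement, rendered as four hex digits.
    ushort checksum = (ushort)(-sum & 0xFFFF);
    return strMsg + checksum.ToString("X4");
}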
I want to open a Bitmap File in C# as an array of bytes, and replace certain bytes within that array, and rewrite the Byte array back to disk as a bitmap again.
My current approach is to read into a byte[] array, then convert that array to a list to begin editing individual bytes.
originalBytes = File.ReadAllBytes(path);
List<byte> listBytes = new List<Byte>(originalBytes);
How does one go about replacing every nth byte in the array with a user-configured byte (a different one each time) and rewriting it back to the file?
There is no need for a List<byte>.
This replaces every nth byte with customByte:
var n = 5;
byte customByte = 0xFF;
var bytes = File.ReadAllBytes(path);
for (var i = 0; i < bytes.Length; i++)
{
    if (i % n == 0)
    {
        bytes[i] = customByte;
    }
}
File.WriteAllBytes(path, bytes);
Assuming that you want to replace every nth byte with the same new byte, you could do something like this (shown for every 3rd byte):
int n = 3;
byte newValue = 0xFF;
for (int i = n; i < listBytes.Count; i += n)
{
    listBytes[i] = newValue;
}
File.WriteAllBytes(path, listBytes.ToArray());
Of course, you could also do this with a fancy LINQ expression, but that would be harder to read, I guess.
Technically, you can implement something like this:
// ReadAllBytes returns byte[] array, we have no need in List<byte>
byte[] data = File.ReadAllBytes(path);
// starting from 0 - int i = 0 - will ruin BMP header which we must spare
// if n is small, you may want to start from 2 * n, 3 * n etc.
// or from some fixed offset
for (int i = n; i < data.Length; i += n)
    data[i] = yourValue;
File.WriteAllBytes(path, data);
Please note that a bitmap file has a header (https://en.wikipedia.org/wiki/BMP_file_format), which is why I've started the loop from n, not from 0.
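If you want to spare the header exactly rather than approximately, a BMP file stores the offset of its pixel data in bytes 10-13 of the header, so the loop can start there. A sketch, assuming a well-formed file and a little-endian machine (which BitConverter.ToInt32 relies on here):
byte[] data = File.ReadAllBytes(path);

// Bytes 10-13 of the BMP header hold the offset of the pixel data.
int pixelOffset = BitConverter.ToInt32(data, 10);

for (int i = pixelOffset; i < data.Length; i += n)
    data[i] = yourValue;

File.WriteAllBytes(path, data);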
The Microsoft website has the code snippet:
using (FileStream fsSource = new FileStream(pathSource,
    FileMode.Open, FileAccess.Read))
{
    // Read the source file into a byte array.
    byte[] bytes = new byte[fsSource.Length];
    int numBytesToRead = (int)fsSource.Length;
    int numBytesRead = 0;
    while (numBytesToRead > 0)
    {
        // Read may return anything from 0 to numBytesToRead.
        int n = fsSource.Read(bytes, numBytesRead, numBytesToRead);

        // Break when the end of the file is reached.
        if (n == 0)
            break;

        numBytesRead += n;
        numBytesToRead -= n;
    }
}
What concerns me is that fsSource.Length is a long whereas numBytesRead is an int, so the (int) cast truncates for any file larger than int.MaxValue bytes, and at most int.MaxValue bytes can ever be read into bytes. So my questions are:
Is there some reason that this is OK?
If not, how should you read a FileStream into a byte[]?
In this situation I wouldn't even bother processing the FileStream manually; use File.ReadAllBytes instead:
byte[] bytes = File.ReadAllBytes(pathSource);
To answer your question:
The sample code is fine for most applications that never reach such extremes.
If you have a really long stream, say a video, use a BufferedStream; sample code is available on the MSDN site.
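For completeness, here is a sketch of processing a large file in chunks; a single byte[] cannot hold more than int.MaxValue elements anyway, so a file over ~2 GB has to be handled piecewise:
using (var fsSource = new FileStream(pathSource, FileMode.Open, FileAccess.Read))
{
    // 81920 is the same default chunk size Stream.CopyTo uses.
    byte[] buffer = new byte[81920];
    int n;
    while ((n = fsSource.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Process buffer[0..n) here: hash it, copy it onward, etc.
    }
}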
Example using ReadAllBytes:
private byte[] m_cfgBuffer;

m_cfgBuffer = File.ReadAllBytes(m_FileName);

StringBuilder PartNbr = new StringBuilder();
StringBuilder Version = new StringBuilder();
int i, j;
byte b;

i = 356; // We know that the cfg file header ends at position 356 (1st hex(80))
b = m_cfgBuffer[i];
while (b != 0x80) // Scan for 2nd hex(80)
{
    i++;
    b = m_cfgBuffer[i];
}

// Now extract the part number - 6 bytes after hex(80)
m_PartNbrPos = i + 5;
for (j = m_PartNbrPos; j < m_PartNbrPos + 6; j++)
{
    char cP = (char)m_cfgBuffer[j];
    PartNbr.Append(cP);
}
m_PartNbr = PartNbr.ToString();

// Now, extract version number - 6 bytes after part number
m_VersionPos = (m_PartNbrPos + 6) + 6;
for (j = m_VersionPos; j < m_VersionPos + 2; j++)
{
    char cP = (char)m_cfgBuffer[j];
    Version.Append(cP);
}
m_Version = Version.ToString();
Assumption: converting a byte[] from little-endian to big-endian means inverting the order of the bits in each byte of the byte[].
Assuming this is correct, I tried the following to understand this:
byte[] data = new byte[] { 1, 2, 3, 4, 5, 15, 24 };
byte[] inverted = ToBig(data);

var little = new BitArray(data);
var big = new BitArray(inverted);

int i = 1;
foreach (bool b in little)
{
    Console.Write(b ? "1" : "0");
    if (i == 8)
    {
        i = 0;
        Console.Write(" ");
    }
    i++;
}
Console.WriteLine();

i = 1;
foreach (bool b in big)
{
    Console.Write(b ? "1" : "0");
    if (i == 8)
    {
        i = 0;
        Console.Write(" ");
    }
    i++;
}
Console.WriteLine();

Console.WriteLine(BitConverter.ToString(data));
Console.WriteLine(BitConverter.ToString(ToBig(data)));

foreach (byte b in data)
{
    Console.Write("{0} ", b);
}
Console.WriteLine();
foreach (byte b in inverted)
{
    Console.Write("{0} ", b);
}
The convert method:
private static byte[] ToBig(byte[] data)
{
    byte[] inverted = new byte[data.Length];
    for (int i = 0; i < data.Length; i++)
    {
        var bits = new BitArray(new byte[] { data[i] });
        var invertedBits = new BitArray(bits.Count);
        int x = 0;
        for (int p = bits.Count - 1; p >= 0; p--)
        {
            invertedBits[x] = bits[p];
            x++;
        }
        invertedBits.CopyTo(inverted, i);
    }
    return inverted;
}
The output of this little application is different from what I expected:
00000001 00000010 00000011 00000100 00000101 00001111 00011000
00000001 00000010 00000011 00000100 00000101 00001111 00011000
80-40-C0-20-A0-F0-18
01-02-03-04-05-0F-18
1 2 3 4 5 15 24
1 2 3 4 5 15 24
For some reason the data remains the same, unless printed using BitConverter.
What am I not understanding?
Update
New code produces the following output:
10000000 01000000 11000000 00100000 10100000 11110000 00011000
00000001 00000010 00000011 00000100 00000101 00001111 00011000
01-02-03-04-05-0F-18
80-40-C0-20-A0-F0-18
1 2 3 4 5 15 24
128 64 192 32 160 240 24
But as I have been told now, my method is incorrect anyway because I should invert the bytes and not the bits?
This hardware developer I'm working with told me to invert the bits because he cannot read the data.
Context where I'm using this
The application that will use this does not really work with numbers.
I'm supposed to save a stream of bits to file where
1 = white and 0 = black.
They represent pixels of a bitmap 256x64.
byte 0 to byte 31 represents the first row of pixels
byte 32 to byte 63 the second row of pixels.
I have code that outputs these bits... but the developer is telling me they are in the wrong order. He says the bytes are fine but the bits are not.
So I'm left confused :p
No. Endianness refers to the order of bytes, not bits. Big endian systems store the most-significant byte first and little-endian systems store the least-significant first. The bits within a byte remain in the same order.
Your ToBig() function is returning the original data rather than the bit-swapped data, it seems.
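If it helps to see byte order concretely, BitConverter reports and uses your machine's native ordering:
short value = 0x0102;
Console.WriteLine(BitConverter.IsLittleEndian);                         // True on x86/x64
Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(value))); // 02-01 on a little-endian machine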
Your method may be correct at this point. There are different meanings of endianness, and it depends on the hardware.
Typically, it's used for converting between computing platforms. Most CPU vendors (now) use the same bit ordering, but different byte ordering, for different chipsets. This means, that, if you are passing a 2-byte int from one system to another, you leave the bits alone, but swap bytes 1 and 2, ie:
int somenumber -> byte[2]: somenumber[high],somenumber[low] ->
byte[2]: somenumber[low],somenumber[high] -> int newNumber
However, this isn't always true. Some hardware still uses inverted BIT ordering, so what you have may be correct. You'll need to either trust your hardware dev. or look into it further.
I recommend reading up on this on Wikipedia - always a great source of info:
http://en.wikipedia.org/wiki/Endianness
Your ToBig method has a bug.
At the end:
invertedBits.CopyTo(data, i);
}
return data;
You need to change that to:
byte[] newData = new byte[data.Length];
invertedBits.CopyTo(newData, i);
}
return newData;
You're overwriting your input data, so both arrays come back inverted. The problem is that arrays are reference types, so the method can modify the caller's original array.
As greyfade already said, endianness is not about bit ordering.
The reason that your code doesn't do what you expect, is that the ToBig method changes the array that you send to it. That means that after calling the method the array is inverted, and data and inverted are just two references pointing to the same array.
Here's a corrected version of the method.
private static byte[] ToBig(byte[] data) {
    byte[] result = new byte[data.Length];
    for (int i = 0; i < data.Length; i++) {
        var bits = new BitArray(new byte[] { data[i] });
        var invertedBits = new BitArray(bits.Count);
        int x = 0;
        for (int p = bits.Count - 1; p >= 0; p--) {
            invertedBits[x] = bits[p];
            x++;
        }
        invertedBits.CopyTo(result, i);
    }
    return result;
}
Edit:
Here's a method that changes endianness for a byte array:
static byte[] ConvertEndianness(byte[] data, int wordSize) {
    if (data.Length % wordSize != 0)
        throw new ArgumentException("The data length does not divide into an even number of words.");
    byte[] result = new byte[data.Length];
    int offset = wordSize - 1;
    for (int i = 0; i < data.Length; i++) {
        result[i + offset] = data[i];
        offset -= 2;
        if (offset < -wordSize) {
            offset += wordSize * 2;
        }
    }
    return result;
}
Example:
byte[] data = { 1,2,3,4,5,6 };
byte[] inverted = ConvertEndianness(data, 2);
Console.WriteLine(BitConverter.ToString(inverted));
Output:
02-01-04-03-06-05
The second parameter is the word size. As endianness is the ordering of bytes in a word, you have to specify how large the words are.
Edit 2:
Here is a more efficient method for reversing the bits:
static byte[] ReverseBits(byte[] data) {
    byte[] result = new byte[data.Length];
    for (int i = 0; i < data.Length; i++) {
        int b = data[i];
        int r = 0;
        for (int j = 0; j < 8; j++) {
            r <<= 1;
            r |= b & 1;
            b >>= 1;
        }
        result[i] = (byte)r;
    }
    return result;
}
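For the pixel-stream case in the question, this per-byte bit reversal is probably what the hardware developer wants. A quick check with made-up values:
byte[] data = { 0x01, 0x0F, 0x80 };
Console.WriteLine(BitConverter.ToString(ReverseBits(data))); // 80-F0-01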
One big problem I see is that ToBig changes the contents of the data[] array that is passed to it.
You're calling ToBig on an array named data and assigning the result to inverted, but since you didn't create a new array inside ToBig, you modified both. The code then treats data and inverted as different arrays when in reality they are not.