I recently discovered C#, which is really what I want. Before C# I was coding in AS3. I've recoded all my old programs in C#, but I'm stuck on this:
public function Envoie_Serveur(param1:String) : void
{
var _loc_2:* = String(this.CMDTEC % 9000 + 1000).split("");
this.Serveur.send(this.MDT[_loc_2[0]] + this.MDT[_loc_2[1]] + this.MDT[_loc_2[2]] + this.MDT[_loc_2[3]] + param1);
var _loc_3:* = this;
var _loc_4:* = this.CMDTEC + 1;
_loc_3.CMDTEC = _loc_4;
return;
}
CMDTEC and MDT are two ByteArrays (byte[] in C#, I guess).
This is what I have tried, but it is not working:
byte[] _loc_1 = Encode((Int64.Parse(this.CMDTEC[0].ToString("X", System.Globalization.NumberStyles.HexNumber)) % 9000 + 1000) + "");
var fingerprint = new byte[4];
fingerprint[0] = byte.Parse(this.MDT[_loc_1[0]].ToString("X"), System.Globalization.NumberStyles.HexNumber);
fingerprint[1] = byte.Parse(this.MDT[_loc_1[1]].ToString("X"), System.Globalization.NumberStyles.HexNumber);
fingerprint[2] = byte.Parse(this.MDT[_loc_1[2]].ToString("X"), System.Globalization.NumberStyles.HexNumber);
fingerprint[3] = byte.Parse(this.MDT[_loc_1[3]].ToString("X"), System.Globalization.NumberStyles.HexNumber);
this.CMDTEC++;
For example, this is what CMDTEC and MDT contain:
this.MDT = "1400175151406"; (just an example; I get this via socket)
this.CMDTEC = "8306"; (same as above)
How can I properly convert this to C#? Thanks in advance for any answers.
Here's an attempt, but you need to add more details to your question regarding inputs and outputs, datatypes, etc. Although you are dealing with strings, it appears you are mainly handling numeric values.
The code below is verbose for clarity; it can be condensed a lot more. Please note I haven't actually compiled and tried the code (because I don't have a Serveur object, it won't compile for me).
byte[] MDT = System.Text.Encoding.ASCII.GetBytes ("1400175151406");
byte[] CMDTEC = System.Text.Encoding.ASCII.GetBytes ("8306");
void Envoie_Serveur(string param1)
{
// firstly, get CMDTEC as a string, assuming ascii encoded bytes
string sCMDTEC = System.Text.Encoding.ASCII.GetString(CMDTEC);
// now convert CMDTEC string to an int
int iCMDTEC = int.Parse(sCMDTEC);
// now do the modulo arithmetic on the int value
iCMDTEC = iCMDTEC % 9000 + 1000;
// now convert modulated int back into a string
sCMDTEC = iCMDTEC.ToString();
// now convert modulated string back to byte array, assuming ascii encoded bytes
byte[] bCMDTEC = System.Text.Encoding.ASCII.GetBytes(sCMDTEC);
// send the data: each ASCII digit byte minus '0' gives the numeric index into MDT,
// and the four byte values are summed before param1 is appended (as in the AS3)
this.Serveur.send((((int)this.MDT[bCMDTEC[0] - '0']) + ((int)this.MDT[bCMDTEC[1] - '0']) + ((int)this.MDT[bCMDTEC[2] - '0']) + ((int)this.MDT[bCMDTEC[3] - '0'])) + param1);
// convert CMDTEC bytes to string again
sCMDTEC = System.Text.Encoding.ASCII.GetString(CMDTEC);
// convert CMDTEC string to int again
iCMDTEC = int.Parse(sCMDTEC);
// increment CMDTEC
iCMDTEC += 1;
// convert back to string
sCMDTEC = iCMDTEC.ToString();
// convert back to bytes
this.CMDTEC = System.Text.Encoding.ASCII.GetBytes(sCMDTEC);
}
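For reference, here is a more condensed version of the same logic. This is an untested sketch: it assumes Serveur.send accepts a string, that using System.Linq and using System.Text are in scope, and that the AS3 code numerically sums the four looked-up byte values before appending param1.
void Envoie_Serveur(string param1)
{
    // CMDTEC holds ASCII digits; parse them, apply the modulo, take the digits.
    int counter = int.Parse(Encoding.ASCII.GetString(CMDTEC));
    string digits = (counter % 9000 + 1000).ToString();
    // d - '0' turns each ASCII digit into the numeric index that the
    // AS3 split("") lookup produced; the four byte values are then summed.
    int sum = digits.Sum(d => (int)MDT[d - '0']);
    this.Serveur.send(sum + param1);
    // Store the incremented counter back as ASCII bytes.
    CMDTEC = Encoding.ASCII.GetBytes((counter + 1).ToString());
}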
I have a Java code example of how the verification code should be computed, and I have to convert the Java code to C#.
First of all, the code is computed as:
integer(SHA256(hash)[-2: -1]) mod 10000
where we take the SHA256 result, extract the 2 rightmost bytes from it, interpret them as a big-endian unsigned integer, and take the last 4 digits in decimal for display.
Java code:
public static String calculate(byte[] documentHash) {
byte[] digest = DigestCalculator.calculateDigest(documentHash, HashType.SHA256);
ByteBuffer byteBuffer = ByteBuffer.wrap(digest);
int shortBytes = Short.SIZE / Byte.SIZE; // Short.BYTES in java 8
int rightMostBytesIndex = byteBuffer.limit() - shortBytes;
short twoRightmostBytes = byteBuffer.getShort(rightMostBytesIndex);
int positiveInteger = ((int) twoRightmostBytes) & 0xffff;
String code = String.valueOf(positiveInteger);
String paddedCode = "0000" + code;
return paddedCode.substring(code.length());
}
public static byte[] calculateDigest(byte[] dataToDigest, HashType hashType) {
String algorithmName = hashType.getAlgorithmName();
return DigestUtils.getDigest(algorithmName).digest(dataToDigest);
}
So in C#, the following Base64 string:
2afAxT+nH5qNYrfM+D7F6cKAaCKLLA23pj8ro3SksqwsdwmC3xTndKJotewzu7HlDy/DiqgkR+HXBiA0sW1x0Q==
should compute a code equal to 3676.
Any ideas how to implement this?
class Program
{
static void Main(string[] args)
{
Console.WriteLine(GetCode("2afAxT+nH5qNYrfM+D7F6cKAaCKLLA23pj8ro3SksqwsdwmC3xTndKJotewzu7HlDy/DiqgkR+HXBiA0sW1x0Q=="));
}
public static string GetCode(string str)
{
var sha = System.Security.Cryptography.SHA256.Create();
var hash = sha.ComputeHash(Convert.FromBase64String(str));
// Take the two rightmost bytes of the digest.
var last2 = hash[^2..];
// Interpret them as a big-endian unsigned integer.
var intVal = ((int) last2[0]) * 0x0100 + ((int) last2[1]);
// Keep the last four decimal digits, zero-padded for display.
var digits = intVal % 10000;
return $"{digits:0000}";
}
}
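For comparison, the big-endian read of the last two bytes can also be expressed with BinaryPrimitives. A sketch assuming .NET 5+ (where SHA256.HashData and range indexing are available); the method name GetCodeAlt is illustrative:
using System;
using System.Buffers.Binary;
using System.Security.Cryptography;
public static string GetCodeAlt(string base64)
{
    byte[] hash = SHA256.HashData(Convert.FromBase64String(base64));
    // Read the last two bytes as a big-endian ushort, mirroring
    // Java's ByteBuffer.getShort at the rightmost index.
    ushort value = BinaryPrimitives.ReadUInt16BigEndian(hash.AsSpan(^2..));
    return (value % 10000).ToString("D4");
}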
I have a byte[] array that is loaded from a file that I happen to known contains UTF-8.
In some debugging code, I need to convert it to a string. Is there a one-liner that will do this?
Under the covers it should be just an allocation and a memcopy, so even if it is not implemented, it should be possible.
string result = System.Text.Encoding.UTF8.GetString(byteArray);
There are at least four different ways of doing this conversion.
Encoding's GetString, but you won't be able to get the original bytes back if those bytes contain non-ASCII characters.
BitConverter.ToString: the output is a "-"-delimited string, but there's no .NET built-in method to convert the string back to a byte array.
Convert.ToBase64String: you can easily convert the output string back to a byte array using Convert.FromBase64String. Note: the output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.
HttpServerUtility.UrlTokenEncode: you can easily convert the output string back to a byte array using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is it needs the System.Web assembly if your project is not a web project.
A full example:
byte[] bytes = { 130, 200, 234, 23 }; // A byte array containing non-ASCII (non-printable) characters
string s1 = Encoding.UTF8.GetString(bytes); // ���
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1); // decBytes1.Length == 10 !!
// decBytes1 not same as bytes
// Using UTF-8 or other Encoding object will get similar results
string s2 = BitConverter.ToString(bytes); // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes
string s3 = Convert.ToBase64String(bytes); // gsjqFw==
byte[] decByte3 = Convert.FromBase64String(s3);
// decByte3 same as bytes
string s4 = HttpServerUtility.UrlTokenEncode(bytes); // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
A general solution to convert from byte array to string when you don't know the encoding:
static string BytesToStringConverted(byte[] bytes)
{
using (var stream = new MemoryStream(bytes))
{
using (var streamReader = new StreamReader(stream))
{
return streamReader.ReadToEnd();
}
}
}
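Note that StreamReader defaults to UTF-8 and will honor a BOM if one is present. The same method with that detection made explicit (a sketch):
static string BytesToStringConverted(byte[] bytes)
{
    using (var stream = new MemoryStream(bytes))
    // Sniff a BOM (UTF-8/UTF-16/UTF-32) and fall back to UTF-8 otherwise.
    using (var streamReader = new StreamReader(stream, Encoding.UTF8,
               detectEncodingFromByteOrderMarks: true))
    {
        return streamReader.ReadToEnd();
    }
}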
Definition:
public static string ConvertByteToString(this byte[] source)
{
return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
}
Using:
string result = input.ConvertByteToString();
Converting a byte[] to a string seems simple, but any kind of encoding is likely to mess up the output string. This little function just works without any unexpected results:
private string ToString(byte[] bytes)
{
string response = string.Empty;
foreach (byte b in bytes)
response += (Char)b;
return response;
}
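Incidentally, the per-byte cast above is byte-for-byte identical to ISO-8859-1 (Latin-1) decoding, so on .NET 5 or later the same result can presumably be had in one line:
// Each byte maps directly to the char with the same code point (0-255).
string response = System.Text.Encoding.Latin1.GetString(bytes);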
I saw several answers on this post that could be considered complete background knowledge, because there are several approaches in C# to solve the same problem. The only thing that still needs to be considered is the difference between pure UTF-8 and UTF-8 with a BOM.
Last week, at my job, I needed to develop functionality that outputs some CSV files with a BOM and other CSV files in pure UTF-8 (without a BOM). Each CSV file encoding type is consumed by a different non-standardized API: one API reads UTF-8 with a BOM and the other reads it without a BOM. I researched this concept, reading the "What's the difference between UTF-8 and UTF-8 without BOM?" Stack Overflow question and the Wikipedia article "Byte order mark", to build my approach.
Finally, my C# code for both UTF-8 encoding types (with BOM and pure) needed to be similar to the example below:
// For UTF-8 with BOM, equals shared by Zanoni (at top)
string result = System.Text.Encoding.UTF8.GetString(byteArray);
//for Pure UTF-8 (without B.O.M.)
string result = (new UTF8Encoding(false)).GetString(byteArray);
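On the writing side, the same UTF8Encoding flag controls whether the BOM is emitted, e.g. when producing the two kinds of CSV files. A sketch; csvContent and the file names are placeholders:
// UTF8Encoding(true) prepends the BOM (EF BB BF); UTF8Encoding(false) does not.
File.WriteAllText("with-bom.csv", csvContent, new UTF8Encoding(true));
File.WriteAllText("without-bom.csv", csvContent, new UTF8Encoding(false));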
Using b.ToString("x2") outputs the bytes as a hex string, e.g. b4b5dfe475e58b67:
public static class Ext {
public static string ToHexString(this byte[] hex)
{
if (hex == null) return null;
if (hex.Length == 0) return string.Empty;
var s = new StringBuilder();
foreach (byte b in hex) {
s.Append(b.ToString("x2"));
}
return s.ToString();
}
public static byte[] ToHexBytes(this string hex)
{
if (hex == null) return null;
if (hex.Length == 0) return new byte[0];
int l = hex.Length / 2;
var b = new byte[l];
for (int i = 0; i < l; ++i) {
b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
}
return b;
}
public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
{
if (bytes == null && bytesToCompare == null) return true; // ?
if (bytes == null || bytesToCompare == null) return false;
if (object.ReferenceEquals(bytes, bytesToCompare)) return true;
if (bytes.Length != bytesToCompare.Length) return false;
for (int i = 0; i < bytes.Length; ++i) {
if (bytes[i] != bytesToCompare[i]) return false;
}
return true;
}
}
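A round-trip usage example for the extensions above:
byte[] data = { 0xb4, 0xb5, 0xdf, 0xe4 };
string hex = data.ToHexString();   // "b4b5dfe4"
byte[] back = hex.ToHexBytes();    // the original bytes again
bool same = data.EqualsTo(back);   // true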
There is also the UnicodeEncoding class, which is quite simple to use:
UnicodeEncoding ByteConverter = new UnicodeEncoding();
string stringDataForEncoding = "My Secret Data!";
byte[] dataEncoded = ByteConverter.GetBytes(stringDataForEncoding);
Console.WriteLine("Data after decoding: {0}", ByteConverter.GetString(dataEncoded));
In addition to the selected answer, if you're using .NET 3.5 or .NET 3.5 CE, you have to specify the index of the first byte to decode, and the number of bytes to decode:
string result = System.Text.Encoding.UTF8.GetString(byteArray, 0, byteArray.Length);
Alternatively:
var byteStr = Convert.ToBase64String(bytes);
The BitConverter class can be used to convert a byte[] to a string.
var convertedString = BitConverter.ToString(byteArray);
Documentation for the BitConverter class can be found on MSDN.
To my knowledge, none of the given answers guarantees correct behavior with null termination. Until someone shows me differently, I wrote my own static class for handling this, with the following methods:
// Mimics the functionality of strlen() in C/C++.
// Needed because neither StringBuilder nor Encoding.*.GetString() handles \0 well.
static int StringLength(byte[] buffer, int startIndex = 0)
{
int strlen = 0;
while
(
(startIndex + strlen) < buffer.Length // Make sure indexing won't go out of bounds
&& buffer[startIndex + strlen] != 0 // The typical null-termination check
)
{
++strlen;
}
return strlen;
}
// This is messy, but I haven't found a built-in way in C# that guarantees null termination
public static string ParseBytes(byte[] buffer, out int strlen, int startIndex = 0)
{
strlen = StringLength(buffer, startIndex);
byte[] c_str = new byte[strlen];
Array.Copy(buffer, startIndex, c_str, 0, strlen);
return Encoding.UTF8.GetString(c_str);
}
The reason for the startIndex parameter is that, in the example I was working on, I specifically needed to parse a byte[] as an array of null-terminated strings, as in the sketch below. It can be safely ignored in the simple case.
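A sketch of that multi-string case, chaining the methods above by advancing startIndex past each terminator (buffer is assumed to hold consecutive null-terminated strings):
var strings = new List<string>();
int index = 0;
while (index < buffer.Length)
{
    strings.Add(ParseBytes(buffer, out int strlen, index));
    index += strlen + 1; // skip the parsed string plus its '\0' terminator
}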
A LINQ one-liner for converting a byte array byteArrFilename, read from a file, to a pure-ASCII C-style zero-terminated string would be this. It's handy for reading things like file index tables in old archive formats.
String filename = new String(byteArrFilename.TakeWhile(x => x != 0)
.Select(x => x < 128 ? (Char)x : '?').ToArray());
I use '?' as the default character for anything not pure ASCII here, but that can be changed, of course. If you want to be sure you can detect it, just use '\0' instead, since the TakeWhile at the start ensures that a string built this way cannot possibly contain '\0' values from the input source.
Try this console application:
static void Main(string[] args)
{
//Encoding _UTF8 = Encoding.UTF8;
string[] _mainString = { "Hello, World!" };
Console.WriteLine("Main String: " + _mainString);
// Convert a string to UTF-8 bytes.
byte[] _utf8Bytes = Encoding.UTF8.GetBytes(_mainString[0]);
// Convert UTF-8 bytes to a string.
string _stringUnicode = Encoding.UTF8.GetString(_utf8Bytes);
Console.WriteLine("String Unicode: " + _stringUnicode);
}
Here is a solution where you don't have to bother with encoding. I used it in my network class to send binary objects as strings.
public static byte[] String2ByteArray(string str)
{
char[] chars = str.ToArray();
byte[] bytes = new byte[chars.Length * 2];
for (int i = 0; i < chars.Length; i++)
Array.Copy(BitConverter.GetBytes(chars[i]), 0, bytes, i * 2, 2);
return bytes;
}
public static string ByteArray2String(byte[] bytes)
{
char[] chars = new char[bytes.Length / 2];
for (int i = 0; i < chars.Length; i++)
chars[i] = BitConverter.ToChar(bytes, i * 2);
return new string(chars);
}
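Since BitConverter.GetBytes(char) emits each char as two little-endian bytes, this pair should round-trip the same way Encoding.Unicode (UTF-16LE) does, e.g.:
string original = "binary-safe?";
byte[] bytes = String2ByteArray(original);     // same bytes as Encoding.Unicode.GetBytes(original)
string roundTripped = ByteArray2String(bytes); // == original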
string result = ASCIIEncoding.UTF8.GetString(byteArray);
I want to hash a password using mx.utils.SHA256 (or another SHA-256-based algorithm) in ActionScript, to match an entered password against the hashed password stored in my local SQLite database. I am using a salt as well.
I want to do the same thing in ActionScript that I have done in VB code.
How can I translate the following from VB.NET to ActionScript?
Encoding.UTF8.GetBytes("String")
(Salt is a String-typed parameter.)
System.Text.Encoding.Default.GetBytes(Salt.ToString.ToCharArray))
(HashOut is a Byte() parameter.)
Convert.ToBase64String(HashOut)
The Array.Copy() method copies one byte array to another up to the specified length:
Array.Copy(Data, DataAndSalt, Data.Length) // concatenation of arrays, to be reproduced in ActionScript
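For reference, the VB.NET side being ported presumably chains those pieces together roughly like this. A hedged C# sketch: the method name is illustrative, and UTF-8 is used for both strings even though the original mixes Encoding.UTF8 and Encoding.Default.
using System;
using System.Security.Cryptography;
using System.Text;
static string HashPassword(string password, string salt)
{
    byte[] data = Encoding.UTF8.GetBytes(password);
    byte[] saltBytes = Encoding.UTF8.GetBytes(salt);
    // The Array.Copy step: concatenate password bytes and salt bytes.
    byte[] dataAndSalt = new byte[data.Length + saltBytes.Length];
    Array.Copy(data, dataAndSalt, data.Length);
    Array.Copy(saltBytes, 0, dataAndSalt, data.Length, saltBytes.Length);
    // Hash and Base64-encode (the Convert.ToBase64String(HashOut) step).
    using (var sha = SHA256.Create())
        return Convert.ToBase64String(sha.ComputeHash(dataAndSalt));
}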
It's a fairly simple process, but the documentation of ActionScript's SHA256 class is pretty lackluster. What you need to do is:
Write your salted string to a ByteArray
Call SHA256.computeDigest()
E.g.:
public function hashMyString(mySaltedInput:String):String
{
var bytes:ByteArray = new ByteArray();
bytes.writeUTFBytes(mySaltedInput);
return SHA256.computeDigest(bytes);
}
I created the whole thing on my own according to my requirements, matching what was done in the VB, and now both produce the same results.
Encoding.UTF8.GetBytes("String") VB code in ActionScript is
yourByteArray.writeMultiByte("String", "iso-8859-1");
System.Text.Encoding.Default.GetBytes(Salt.ToString.ToCharArray))
VB code in ActionScript is
byterrSalt.writeMultiByte(Salt,Salt);
Array.Copy(Data, DataAndSalt, Data.Length)
was for concatenating byte arrays; in ActionScript this is done by:
var DataAndSalt:ByteArray = new ByteArray();
DataAndSalt.writeBytes(Data);
DataAndSalt.writeBytes(Salt);
The DataAndSalt ByteArray will now hold both byte arrays (Data + Salt). You can concatenate as many byte arrays as you like with .writeBytes(yourByteArray).
Convert.ToBase64String(HashOut) is done by the following function:
private static const BASE64_CHARS:String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
public static function encodeByteArray(data:ByteArray):String {
// Initialise output
var output:String = "";
// Create data and output buffers
var dataBuffer:Array;
var outputBuffer:Array = new Array(4);
// Rewind ByteArray
data.position = 0;
// while there are still bytes to be processed
while (data.bytesAvailable > 0) {
// Create new data buffer and populate next 3 bytes from data
dataBuffer = new Array();
for (var i:uint = 0; i < 3 && data.bytesAvailable > 0; i++) {
dataBuffer[i] = data.readUnsignedByte();
}
// Convert to data buffer Base64 character positions and
// store in output buffer
outputBuffer[0] = (dataBuffer[0] & 0xfc) >> 2;
outputBuffer[1] = ((dataBuffer[0] & 0x03) << 4) | ((dataBuffer[1]) >> 4);
outputBuffer[2] = ((dataBuffer[1] & 0x0f) << 2) | ((dataBuffer[2]) >> 6);
outputBuffer[3] = dataBuffer[2] & 0x3f;
// If data buffer was short (i.e not 3 characters) then set
// end character indexes in data buffer to index of '=' symbol.
// This is necessary because Base64 data is always a multiple of
// 4 bytes and is padded with '=' symbols.
for (var j:uint = dataBuffer.length; j < 3; j++) {
outputBuffer[j + 1] = 64;
}
// Loop through output buffer and add Base64 characters to
// encoded data string for each character.
for (var k:uint = 0; k < outputBuffer.length; k++) {
output += BASE64_CHARS.charAt(outputBuffer[k]);
}
}
// Return encoded data
return output;
}
I'm trying to convert this code (from David Clunie) to C#, for the purpose of creating OIDs from UUIDs (or GUIDs) in my program:
public static String createOIDFromUUIDCanonicalHexString(String hexString) throws IllegalArgumentException {
UUID uuid = UUID.fromString(hexString);
long leastSignificantBits = uuid.getLeastSignificantBits();
long mostSignificantBits = uuid.getMostSignificantBits();
BigInteger decimalValue = makeBigIntegerFromUnsignedLong(mostSignificantBits);
decimalValue = decimalValue.shiftLeft(64);
BigInteger bigValueOfLeastSignificantBits = makeBigIntegerFromUnsignedLong(leastSignificantBits);
decimalValue = decimalValue.or(bigValueOfLeastSignificantBits); // not add() ... do not want to introduce question of signedness of long
return OID_PREFIX+"."+decimalValue.toString();
I don't understand why he makes the longs (leastSignificantBits, mostSignificantBits) from the parts of the UUID and then makes the BigInteger from them - why not just make a BigInteger directly (since he's shifting the most significant bits left anyway)?
Can anyone give me any insight into why this is written the way it is? (Disclaimer: I have not tried to run the Java code; I'm just trying to implement this in C#.)
[EDIT]
Turns out there were several problems - a big one, as Kevin Coulombe points out, is the byte order Microsoft stores GUIDs in. The Java code seems to get the bytes in the obvious (left-to-right) order, but in two separate chunks, apparently (also thanks to Kevin) because there is no easy way to get the whole byte array in Java.
Here is working C# code, forwards and (partially) backwards:
class Program
{
const string OidPrefix = "2.25.";
static void Main(string[] args)
{
Guid guid = new Guid("000000FF-0000-0000-0000-000000000000");
//Guid guid = new Guid("f81d4fae-7dec-11d0-a765-00a0c91e6bf6");
Console.WriteLine("Original guid: " + guid.ToString());
byte[] guidBytes = StringToByteArray(guid.ToString().Replace("-", ""));
BigInteger decimalValue = new BigInteger(guidBytes);
Console.WriteLine("The OID is " + OidPrefix + decimalValue.ToString());
string hexGuid = decimalValue.ToHexString().PadLeft(32, '0');//padded for later use
Console.WriteLine("The hex value of the big int is " + hexGuid);
Guid testGuid = new Guid(hexGuid);
Console.WriteLine("This guid should match the orginal one: " + testGuid);
Console.ReadKey();
}
public static byte[] StringToByteArray(String hex)
{
int NumberChars = hex.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
return bytes;
}
}
I think they do it this way because there is no easy way of converting the UUID to a byte array to pass to the BigInteger in Java.
See this: GUID to ByteArray
In C#, this should be what you are looking for:
String oid = "prefix" + "." + new System.Numerics.BigInteger(
System.Guid.NewGuid().ToByteArray()).ToString();
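One caveat with that one-liner: System.Numerics.BigInteger reads the array as little-endian two's complement, so a GUID whose last byte has its top bit set yields a negative number (and Guid.ToByteArray uses Microsoft's mixed-endian layout rather than RFC 4122 order). A hedged sketch that at least forces a non-negative value:
static string GuidToOid(Guid guid)
{
    byte[] bytes = guid.ToByteArray();            // note: mixed-endian GUID layout
    byte[] unsigned = new byte[bytes.Length + 1]; // trailing 0x00 keeps the value positive
    Array.Copy(bytes, unsigned, bytes.Length);
    return "2.25." + new System.Numerics.BigInteger(unsigned);
}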
String oid_prefix = "2.25"
String hexString = "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
UUID uuid = UUID.fromString(hexString);
long leastSignificantBits = uuid.getLeastSignificantBits();
long mostSignificantBits = uuid.getMostSignificantBits();
mostSignificantBits = mostSignificantBits & Long.MAX_VALUE;
BigInteger decimalValue = BigInteger.valueOf(mostSignificantBits);
decimalValue = decimalValue.setBit(63);
decimalValue = decimalValue.shiftLeft(64);
leastSignificantBits = leastSignificantBits & Long.MAX_VALUE;
BigInteger bigValueLeastSignificantBit = BigInteger.valueOf(leastSignificantBits);
bigValueLeastSignificantBit = bigValueLeastSignificantBit.setBit(63);
decimalValue = decimalValue.or(bigValueLeastSignificantBit);
println "oid is = "+oid_prefix+"."+decimalValue