Checksum for UTF-8 string - C#

Below is the checksum description.
The checksum is four ASCII character digits representing the binary sum of the characters including the
first character of the transmission and up to and including the checksum field identifier characters.
To calculate the checksum add each character as an unsigned binary number, take the lower 16 bits of the
total and perform a 2's complement. The checksum field is the result represented by four hex digits.
To verify the correct checksum on received data, simply add all the hex values including the checksum. It
should equal zero.
Below is the implementation for an ASCII string, but my input string is now UTF-8.
Can anyone give me an idea of how to revise the implementation for UTF-8 encoding? Thanks very much.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace SIP2
{
    // Adapted from VB.NET from the Library Tech Guy blog
    // http://librarytechguy.blogspot.com/2009/11/sip2-checksum_13.html
    public class CheckSum
    {
        public static string ApplyChecksum(string strMsg)
        {
            int intCtr;
            char[] chrArray;
            int intAscSum;
            bool blnCarryBit;
            string strBinVal = String.Empty;
            string strInvBinVal;
            string strNewBinVal = String.Empty;

            // Transfer the SIP message to a character array. Loop through each character
            // of the array, converting the character to an ASCII value and adding the
            // value to a running total.
            intAscSum = 0;
            chrArray = strMsg.ToCharArray();
            for (intCtr = 0; intCtr <= chrArray.Length - 1; intCtr++)
            {
                intAscSum = intAscSum + (chrArray[intCtr]);
            }

            // Keep only the lower 16 bits of the total, per the spec; without this,
            // long messages overflow Convert.ToInt16 below.
            intAscSum = intAscSum & 0xFFFF;

            // Next, convert the ASCII sum to a binary string by:
            // 1) taking the remainder of the ASCII sum divided by 2
            // 2) repeating until the sum reaches 0
            // 3) padding to 16 digits with leading zeroes
            do
            {
                strBinVal = (intAscSum % 2).ToString() + strBinVal;
                intAscSum = intAscSum / 2;
            } while (intAscSum > 0);
            strBinVal = strBinVal.PadLeft(16, '0');

            // Next, invert all bits in the binary number.
            chrArray = strBinVal.ToCharArray();
            strInvBinVal = "";
            for (intCtr = 0; intCtr <= chrArray.Length - 1; intCtr++)
            {
                if (chrArray[intCtr] == '0') { strInvBinVal = strInvBinVal + '1'; }
                else { strInvBinVal = strInvBinVal + '0'; }
            }

            // Next, add 1 to the inverted binary number. Loop from the least significant
            // digit (rightmost) to the most significant (leftmost); if a digit is 1,
            // flip it to 0 and carry to the next significant digit.
            blnCarryBit = true;
            chrArray = strInvBinVal.ToCharArray();
            for (intCtr = chrArray.Length - 1; intCtr >= 0; intCtr--)
            {
                if (blnCarryBit == true)
                {
                    if (chrArray[intCtr] == '0')
                    {
                        chrArray[intCtr] = '1';
                        blnCarryBit = false;
                    }
                    else
                    {
                        chrArray[intCtr] = '0';
                        blnCarryBit = true;
                    }
                }
                strNewBinVal = chrArray[intCtr] + strNewBinVal;
            }

            // Finally, convert the binary string to a hex value and append it to the
            // original SIP message. "X4" pads to the four hex digits the spec requires.
            return strMsg + (Convert.ToInt16(strNewBinVal, 2)).ToString("X4");
        }
    }
}

Replace the code

for (intCtr = 0; intCtr <= chrArray.Length - 1; intCtr++)
{
    intAscSum = intAscSum + (chrArray[intCtr]);
}

chrArray[intCtr] gives the character's code as a decimal value; for example, 'A' is 65. ASCII encoding uses only one byte per character, while UTF-8 uses one or more bytes per character, so chrArray[intCtr] is designed for ASCII and a multi-byte UTF-8 input is not handled correctly.

With

byte[] bytes = Encoding.UTF8.GetBytes(strMsg);
for (int i = 0; i < bytes.Length; i++)
{
    intAscSum = intAscSum + bytes[i];
}

Add up all the bytes, because one UTF-8 character can be more than one byte.
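For a UTF-8 input you can also skip the binary-string gymnastics entirely and do the arithmetic with integers. A minimal sketch (the class name is mine; the algorithm is the one described above: sum the bytes, keep the low 16 bits, take the two's complement, format as four hex digits):

using System;
using System.Text;

namespace SIP2
{
    public static class Utf8CheckSum
    {
        public static string ApplyChecksum(string strMsg)
        {
            int sum = 0;
            foreach (byte b in Encoding.UTF8.GetBytes(strMsg))
            {
                sum += b; // sum every UTF-8 byte, not every char
            }
            // Keep the low 16 bits of the total and take the two's complement.
            int checksum = (-(sum & 0xFFFF)) & 0xFFFF;
            return strMsg + checksum.ToString("X4"); // four hex digits
        }
    }
}

To verify a received message, add every byte up to and including the checksum field identifier, add the checksum value itself, and check that the low 16 bits of the total are zero.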

Related

C# How to convert unHex to String

I'm trying to convert an unHex value to a string but it's not working.
I have the following value 0x01BB92E7F716F55B144768FCB2EA40187AE6CF6B2E52A64F7331D0539507441F7D770112510D679F0B310116B0D709E049A19467672FFA532A7C30DFB72
The result I hoped for would be this, but executing the function below displays this result:
»’ Ç ÷ õ [Ghü²ê # zæÏk.R¦Os1ÐS • D} w Q gŸ 1 ° × àI¡ ”gg / úS * | 0ß ·) = ¤
Any idea how I can extract the information as expected?
public static string Hex2String(string input)
{
    var builder = new StringBuilder();
    for (int i = 0; i < input.Length; i += 2)
    {
        // throws an exception if not properly formatted;
        // NumberStyles lives in System.Globalization
        string hexdec = input.Substring(i, 2);
        int number = Int32.Parse(hexdec, NumberStyles.HexNumber);
        char charToAdd = (char)number;
        builder.Append(charToAdd);
    }
    return builder.ToString();
}
Your result is base64-encoded. Base64 is a way of taking a byte array and turning it into human-readable characters.
Your code tries to take these raw bytes and cast them to chars, but not all byte values are valid printable characters: some are control characters, some can't be printed, etc.
Instead, let's turn the hex string into a byte array, and then turn that byte array into a base64 string.
string input = "01BB92E7F716F55B144768FCB2EA40187AE6CF6B2E52A64F7331D0539507441F7D770112510D679F0B310116B0D709E049A19467672FFA532A7C30DFB72";
byte[] bytes = new byte[input.Length / 2];
for (int i = 0; i < bytes.Length; i++)
{
    bytes[i] = byte.Parse(input.Substring(i * 2, 2), NumberStyles.HexNumber);
}
string result = Convert.ToBase64String(bytes);
This results in:
AbuS5/cW9VsUR2j8supAGHrmz2suUqZPczHQU5UHRB99dwESUQ1nnwsxARaw1wngSaGUZ2cv+lMqfDDftw==
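On .NET 5 or later (an assumption about your runtime), the hex-to-bytes loop collapses to a built-in call:

string result = Convert.ToBase64String(Convert.FromHexString(input));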

Calculating a checksum in C#

I am writing a checksum for a manifest file for a courier-based system written in C# in the .NET environment.
I need to have an 8 digit field representing the checksum which is calculated as per the following:
Record Check Sum Algorithm
Form the 32-bit arithmetic sum of the products of
• the 7 low order bits of each ASCII character in the record
• the position of each character in the record numbered from 1 for the first character.
for the length of the record up to but excluding the check sum field itself:
Sum = Σ_i ASCII(i-th character in the record) × i
where i runs over the length of the record excluding the check sum field.
After performing this calculation, convert the resultant sum to binary and split the 32 low order
bits of the Sum into eight blocks of 4 bits (octets). Note that each of the octets has a decimal
number value ranging from 0 to 15.
Add an offset of ASCII 0 ( zero ) to each octet to form an ASCII code number.
Convert the ASCII code number to its equivalent ASCII character thus forming printable
characters in the range 0123456789:;<=>?.
Concatenate each of these characters to form a single string of eight (8) characters in overall
length.
I am not the greatest at mathematics so I am struggling to write the code correctly as per the documentation.
I have written the following so far:
byte[] sumOfAscii = null;
for (int i = 1; i < recordCheckSum.Length; i++)
{
    string indexChar = recordCheckSum.ElementAt(i).ToString();
    byte[] asciiChar = Encoding.ASCII.GetBytes(indexChar);
    for (int x = 0; x < asciiChar[6]; x++)
    {
        sumOfAscii += asciiChar[x];
    }
}
//Turn into octets
byte firstOctet = 0;
for (int i = 0; i < sumOfAscii[6]; i++)
{
    firstOctet += recordCheckSum;
}
Where recordCheckSum is a string made up of deliveryAddresses, product names etc and excludes the 8-digit checksum.
Any help with calculating this would be greatly appreciated as I am struggling.
There are notes inline as I go along, and some more notes on the calculation at the end.
uint sum = 0;
uint zeroOffset = 0x30; // ASCII '0'
byte[] inputData = Encoding.ASCII.GetBytes(recordCheckSum);
for (int i = 0; i < inputData.Length; i++)
{
    int product = inputData[i] & 0x7F; // Take the low 7 bits from the record.
    product *= i + 1;                  // Multiply by the 1-based position.
    sum += (uint)product;              // Add the product to the running sum.
}

byte[] result = new byte[8];
for (int i = 0; i < 8; i++) // if the checksum is reversed, make this:
                            // for (int i = 7; i >= 0; i--)
{
    uint current = (uint)(sum & 0x0F); // Take the lowest 4 bits.
    current += zeroOffset;             // Add '0'.
    result[i] = (byte)current;
    sum = sum >> 4;                    // Right-shift the bottom 4 bits off.
}
string checksum = Encoding.ASCII.GetString(result);
One note, I use the & and >> operators, which you may or may not be familiar with. The & operator is the bitwise and operator. The >> operator is logical shift right.
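As a quick sanity check (my own example): for a two-character record "AB", the sum is 65×1 + 66×2 = 197 = 0xC5. Peeling off 4 bits at a time from the least significant end and adding '0' gives '5' (0x30 + 5), then '<' (0x30 + 12), then six '0' characters, so the loop as written produces "5<000000".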

Direct conversion between ASCII byte[] and int

I have a program which reads bytes from the network. Sometimes, those bytes are string representations of an integer in decimal or hexadecimal form.
Normally, I parse this with something like
var s = Encoding.ASCII.GetString(p.GetBuffer(), 0, (int)p.Length);
int.TryParse(s, out number);
I feel that this is wasteful, as it has to allocate memory for a string without any need for it.
Is there a better way to do it in C#?
UPDATE
I've seen several suggestions to use the BitConverter class. This is not what I need. BitConverter transforms the binary representation of an int (4 bytes) into the int type, but since my int is in ASCII form, that doesn't apply here.
I doubt it will have a substantial impact on performance or memory consumption, but you can do this relatively easily. One implementation for converting decimal numbers is shown below:
private static int IntFromDecimalAscii(byte[] bytes)
{
    int result = 0;

    // For each digit, add the digit's value times 10^n, where n is the
    // column number counting from right to left starting at 0.
    for (int i = 0; i < bytes.Length; ++i)
    {
        // ASCII digits are in the range 48 <= n <= 57. This code only
        // makes sense if we are dealing exclusively with digits, so
        // throw if we encounter a non-digit character
        if (bytes[i] < 48 || bytes[i] > 57)
        {
            throw new ArgumentException("Non-digit character present", "bytes");
        }

        // The bytes are in order from most to least significant, so
        // we need to reverse the index to get the right column number
        int exp = bytes.Length - i - 1;

        // Digits in ASCII start with 0 at 48, and move sequentially
        // to 9 at 57, so we can simply subtract 48 from a valid digit
        // to get its numeric value
        int digitValue = bytes[i] - 48;

        // Finally, add the digit value times the column value to the
        // result accumulator
        result += digitValue * (int)Math.Pow(10, exp);
    }
    return result;
}
This can easily be adapted to convert hex values as well:
private static int IntFromHexAscii(byte[] bytes)
{
    int result = 0;
    for (int i = 0; i < bytes.Length; ++i)
    {
        // ASCII hex digits are a bit more complex than decimal: valid
        // bytes are '0'-'9' (48-57) and 'A'-'F' (65-70).
        if (bytes[i] < 48 || bytes[i] > 70 || (bytes[i] > 57 && bytes[i] < 65))
        {
            throw new ArgumentException("Non-digit character present", "bytes");
        }

        int exp = bytes.Length - i - 1;

        // Assume decimal first, then fix it if it's actually hex.
        int digitValue = bytes[i] - 48;

        // This is safe because we already excluded all non-digit
        // characters above
        if (bytes[i] > 57) // A-F
        {
            digitValue = bytes[i] - 55;
        }

        // For hex, we use 16^n instead of 10^n
        result += digitValue * (int)Math.Pow(16, exp);
    }
    return result;
}
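For example (my own test values): IntFromDecimalAscii(Encoding.ASCII.GetBytes("123")) returns 123, and IntFromHexAscii(Encoding.ASCII.GetBytes("1A")) returns 26.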
Well, you could be a little less wasteful (at least in the number of source code characters sense) by avoiding the s declaration like:
int.TryParse(Encoding.ASCII.GetString(p.GetBuffer(),0,(int)p.Length), out number);
But, I think the only other real way to get a speed-up would be to do as the commenter suggests and hard code a mapping into a Dictionary or something. This could save some time if you have to do this a lot, but it may not be worth the effort...
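If the allocation genuinely matters and you can target .NET Core 2.1+ (or pull in the System.Memory package), System.Buffers.Text.Utf8Parser parses numbers straight out of ASCII/UTF-8 bytes with no intermediate string. A minimal sketch (the sample payloads are mine):

using System;
using System.Buffers.Text;

class Program
{
    static void Main()
    {
        byte[] payload = { (byte)'4', (byte)'2' }; // ASCII "42" off the network
        if (Utf8Parser.TryParse(payload, out int number, out int bytesConsumed))
        {
            Console.WriteLine($"{number} ({bytesConsumed} bytes)"); // 42 (2 bytes)
        }

        // Hexadecimal input uses the 'x' standard format.
        byte[] hexPayload = { (byte)'1', (byte)'A' };
        if (Utf8Parser.TryParse(hexPayload, out int fromHex, out _, 'x'))
        {
            Console.WriteLine(fromHex); // 26
        }
    }
}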

Having trouble converting binary to decimal for rather large numbers

I'm converting binary numbers to decimal numbers fine until my number seems to surpass $2^{64}$. It seems the data type can't hold numbers larger than $2^{64}$; any insight? What happens is that the number stored in my base2 variable can't grow past $2^{64}$ as it should when handling huge binary numbers; it overflows because the data type is too small and resets to 0. Any ideas how I can bypass or fix this?
//Vector to store the Binary # that user has input
List<ulong> binaryVector = new List<ulong>();
//Vector to store the Decimal Vector I will output
List<string> decimalVector = new List<string>();
//Variable to store the input
string input = "";
//Variables to do conversions
ulong base2 = 1;
ulong decimalOutput = 0;

Console.WriteLine("2^64=" + Math.Pow(2.00, 64));

//Prompt user
Console.WriteLine("Enter the Binary Number you would like to convert to decimal: ");
input = Console.ReadLine();

//Store the user input in a vector
for (int i = 0; i < input.Length; i++)
{
    //If we find a 0, store it in the appropriate vector; otherwise we found a 1
    if (input[i].Equals('0'))
    {
        binaryVector.Add(0);
    }
    else
    {
        binaryVector.Add(1);
    }
}

//Reverse the vector
binaryVector.Reverse();

//Convert the binary # to decimal
for (int i = 0; i < binaryVector.Count; i++)
{
    //0101 for example: 0 + (0*1) = 0, thus 0 is our current decimal,
    //while our base2 variable is now a multiple of 2 (1 * 2 = 2)
    decimalOutput = decimalOutput + (binaryVector[i] * base2);
    base2 = base2 * 2;
    Console.WriteLine("\nTest base2 Output Position[" + i + "]::" + base2);
}

//Convert decimal output to string
string tempString = decimalOutput.ToString();
A ulong can only hold values from 0 to $2^{64} - 1$; see UInt64.MaxValue.
Use a BigInteger when you want to deal with bigger values.
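A minimal sketch of the conversion with BigInteger (System.Numerics; the variable names are mine):

using System;
using System.Numerics;

class Program
{
    static void Main()
    {
        Console.WriteLine("Enter the Binary Number you would like to convert to decimal: ");
        string input = Console.ReadLine();

        BigInteger result = BigInteger.Zero;
        foreach (char c in input)
        {
            // Shift the accumulator left one bit and add the next digit.
            // BigInteger never overflows; it simply grows.
            result = result * 2 + (c == '1' ? 1 : 0);
        }
        Console.WriteLine(result);
    }
}

Processing the most significant bit first also removes the need to reverse the vector.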

Converting from byte[] to string

I have the following code:
using (BinaryReader br = new BinaryReader(
    File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
    int pos = 0;
    int length = (int)br.BaseStream.Length;
    byte[] b = new byte[length]; // holds every byte of the file
    while (pos < length)
    {
        b[pos] = br.ReadByte();
        pos++;
    }
    pos = 0;
    while (pos < length)
    {
        Console.WriteLine(Convert.ToString(b[pos]));
        pos++;
    }
}
The FILE_PATH is a const string that contains the path to the binary file being read.
The binary file is a mixture of integers and characters.
The integers are 1 bytes each and each character is written to the file as 2 bytes.
For example, the file has the following data :
1HELLO HOW ARE YOU45YOU ARE LOOKING GREAT //and so on
Please note: Each integer is associated with the string of characters following it. So 1 is associated with "HELLO HOW ARE YOU" and 45 with "YOU ARE LOOKING GREAT" and so on.
Now the binary is written (I do not know why but I have to live with this) such that '1' will take only 1 byte while 'H' (and other characters) take 2 bytes each.
So here is what the file actually contains:
0100480045... and so on
Here's the breakdown:
01 is the first byte for the integer 1
0048 are the 2 bytes for 'H' (H is 48 in Hex)
0045 are the 2 bytes for 'E' (E = 0x45)
and so on..
I want my console to print a human-readable format of this file: I want it to print "1 HELLO HOW ARE YOU" and then "45 YOU ARE LOOKING GREAT" and so on...
Is what I am doing correct? Is there an easier/more efficient way?
My line Console.WriteLine(Convert.ToString(b[pos])); does nothing but print the integer value, not the actual character I want. That is fine for the integers in the file, but how do I read out the characters?
Any help would be much appreciated.
Thanks
I think what you are looking for is Encoding.GetString.
Since your string data is composed of 2-byte characters, here is how you can get your string out:
for (int i = 0; i < b.Length; i++)
{
    byte curByte = b[i];
    // Assuming that the first byte of a 2-byte character sequence will be 0
    if (curByte != 0)
    {
        // This is a 1-byte number
        Console.WriteLine(Convert.ToString(curByte));
    }
    else
    {
        // This is a 2-byte character with the high byte first, so decode
        // it as big-endian UTF-16. Print it out.
        Console.WriteLine(Encoding.BigEndianUnicode.GetString(b, i, 2));
        // We consumed the next byte as well, no need to deal with it
        // in the next round of the loop.
        i++;
    }
}
You can use System.Text.UnicodeEncoding.GetString(), which takes a byte[] array and produces a string.
Note that this is not the same as just blindly copying the bytes from the byte[] array into a hunk of memory and calling it a string. The GetString() method must validate the bytes and forbid invalid surrogates, for example.
using (BinaryReader br = new BinaryReader(File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
    int pos = 0; // current read position in the file
    int length = (int)br.BaseStream.Length;
    byte[] buffer = new byte[length * 2];
    int bufferPosition = 0;
    while (pos < length)
    {
        byte b = br.ReadByte();
        if (b < 10)
        {
            // A 1-byte number: widen it to a 2-byte digit character
            // ('0' + value) so the whole buffer decodes uniformly.
            buffer[bufferPosition] = 0;
            buffer[bufferPosition + 1] = (byte)(b + 0x30);
            pos++;
        }
        else
        {
            // A 2-byte character: copy both bytes as-is.
            buffer[bufferPosition] = b;
            buffer[bufferPosition + 1] = br.ReadByte();
            pos += 2;
        }
        bufferPosition += 2;
    }
    // The file stores the high byte first, so decode as big-endian UTF-16.
    Console.WriteLine(System.Text.Encoding.BigEndianUnicode.GetString(buffer, 0, bufferPosition));
}
