Different MD5 file hash in C# and PHP - c#

I have a small problem in checking MD5 checksum of files in C# and PHP. The hash calculated by PHP script vary from hash calculated by C#.
libcurl.dll C# = c3506360ce8f42f10dc844e3ff6ed999
libcurl.dll PHP = f02b47e41e9fa77909031bdef07532af
In PHP I use md5_file function, and my C# code is:
protected string GetFileMD5(string fileName)
{
FileStream file = new FileStream(fileName, FileMode.Open);
MD5 md5 = new MD5CryptoServiceProvider();
byte[] retVal = md5.ComputeHash(file);
file.Close();
StringBuilder sb = new StringBuilder();
for (int i = 0; i < retVal.Length; i++)
{
sb.Append(retVal[i].ToString("x2"));
}
return sb.ToString();
}
Any ideas how to calculate the same hash? I think that it may be something about encoding.
Thanks in advance!

My C# is rusty, but will:
byte[] retVal = md5.ComputeHash(file);
actually read in the entire file? I think it is just hashing the stream object. I believe you need to read the stream, then hash on the entire file contents?
int length = (int)file.Length; // get file length
buffer = new byte[length]; // create buffer
int count; // actual number of bytes read
int sum = 0; // total number of bytes read
// read until Read method returns 0 (end of the stream has been reached)
while ((count = file.Read(buffer, sum, length - sum)) > 0)
sum += count; // sum is a buffer offset for next reading
byte[] retVal = md5.ComputeHash(buffer);
I'm not sure if that actually runs as is, but I think something along those lines will be needed.

I use this:
I havent had yet any issues with comparison of php md5 with c# md5
System.Text.UTF8Encoding text = new System.Text.UTF8Encoding();
System.Security.Cryptography.MD5CryptoServiceProvider md5 = new System.Security.Cryptography.MD5CryptoServiceProvider();
Convert2.ToBase16(md5.ComputeHash(text.GetBytes(encPassString + sess)));
class Convert2
{
public static string ToBase16(byte[] input)
{
return string.Concat((from x in input select x.ToString("x2")).ToArray());
}
}

Related

Convert an byte array to MD5 Hash in react native

I have a system in C# which receives a password and this password is encrypted into a MD5 Hash using this function. I had read a lot of posts and suggestion, but I couldn't create the MD5 byte array as in C#.
public static string GetMD5HashData(string data)
{
//create new instance of md5
MD5 md5 = MD5.Create();
//convert the input text to array of bytes
byte[] hashData = md5.ComputeHash(Encoding.Default.GetBytes(data));
//create new instance of StringBuilder to save hashed data
StringBuilder returnValue = new StringBuilder();
//loop for each byte and add it to StringBuilder
for (int i = 0; i < hashData.Length; i++)
{
returnValue.Append(hashData[i].ToString());
}
// return hexadecimal string
return returnValue.ToString();
}
The return of this function is this string 207154234292557519022585191701391052252168 . I need to generate the same string in React Native.
This part Encoding.Default.GetBytes(data) in the C# function I've reproduced in React native, so both C# and React native return the same array of bytes from the input string.
Input string: 'system123' byte array: '[115, 121, 115, 116, 101, 109,
49, 50, 51]'
The React native function to generate the array of bytes.
convertStringToByteArray = (str) =>{
var bufferedVal = Buffer.from(str, 'utf8').toString('hex');
String.prototype.encodeHex = function () {
var bytes = [];
for (var i = 0; i < this.length; ++i) {
bytes.push(this.charCodeAt(i));
}
return bytes;
};
var byteArray = str.encodeHex();
return byteArray;
};
I've tried some libs like crypto-js for react-native to create the MD5 hash, but could not generate the same value as C# '207154234292557519022585191701391052252168'. Could someone help me?
Applying CryptoJS and assuming UTF8 encoding, the C# logic can be implemented as follows:
var result = '';
var hashBytes = CryptoJS.MD5('system123').toString(CryptoJS.enc.Latin1);
for (var i = 0; i < hashBytes.length; i++)
result += hashBytes.codePointAt(i).toString();
console.log(result); // 207154234292557519022585191701391052252168
<script src="https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js"></script>
Explanation:
CryptoJS.MD5() implicitly performs a UTF-8 encoding since the data is passed as string (here). The Latin1 encoder converts the WordArray into a bytes string. In the loop, the Unicode code point value for each byte is determined as non-negative integer, converted to a string, and concatenated.
The issue is that you use a different encoding in your C# code compared to your js code.
Try to use Encoding.UTF8 instead of Encoding.Default in your code.
public static string GetMD5HashData(string data)
{
//create new instance of md5
MD5 md5 = MD5.Create();
//convert the input text to array of bytes
byte[] hashData = md5.ComputeHash(Encoding.UTF8.GetBytes(data));
//create new instance of StringBuilder to save hashed data
StringBuilder returnValue = new StringBuilder();
//loop for each byte and add it to StringBuilder
for (int i = 0; i < hashData.Length; i++)
{
returnValue.Append(hashData[i].ToString());
}
// return hexadecimal string
return returnValue.ToString();
}

How to read blob data from MySQL in c# [duplicate]

I have a byte[] array that is loaded from a file that I happen to known contains UTF-8.
In some debugging code, I need to convert it to a string. Is there a one-liner that will do this?
Under the covers it should be just an allocation and a memcopy, so even if it is not implemented, it should be possible.
string result = System.Text.Encoding.UTF8.GetString(byteArray);
There're at least four different ways doing this conversion.
Encoding's GetString, but you won't be able to get the original bytes back if those bytes have non-ASCII characters.
BitConverter.ToString The output is a "-" delimited string, but there's no .NET built-in method to convert the string back to byte array.
Convert.ToBase64String You can easily convert the output string back to byte array by using Convert.FromBase64String. Note: The output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.
HttpServerUtility.UrlTokenEncodeYou can easily convert the output string back to byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is it needs System.Web assembly if your project is not a web project.
A full example:
byte[] bytes = { 130, 200, 234, 23 }; // A byte array contains non-ASCII (or non-readable) characters
string s1 = Encoding.UTF8.GetString(bytes); // ���
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1); // decBytes1.Length == 10 !!
// decBytes1 not same as bytes
// Using UTF-8 or other Encoding object will get similar results
string s2 = BitConverter.ToString(bytes); // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes
string s3 = Convert.ToBase64String(bytes); // gsjqFw==
byte[] decByte3 = Convert.FromBase64String(s3);
// decByte3 same as bytes
string s4 = HttpServerUtility.UrlTokenEncode(bytes); // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
A general solution to convert from byte array to string when you don't know the encoding:
static string BytesToStringConverted(byte[] bytes)
{
using (var stream = new MemoryStream(bytes))
{
using (var streamReader = new StreamReader(stream))
{
return streamReader.ReadToEnd();
}
}
}
Definition:
public static string ConvertByteToString(this byte[] source)
{
return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
}
Using:
string result = input.ConvertByteToString();
Converting a byte[] to a string seems simple, but any kind of encoding is likely to mess up the output string. This little function just works without any unexpected results:
private string ToString(byte[] bytes)
{
string response = string.Empty;
foreach (byte b in bytes)
response += (Char)b;
return response;
}
I saw some answers at this post and it's possible to be considered completed base knowledge, because I have a several approaches in C# Programming to resolve the same problem. The only thing that is necessary to be considered is about a difference between pure UTF-8 and UTF-8 with a BOM.
Last week, at my job, I needed to develop one functionality that outputs CSV files with a BOM and other CSV files with pure UTF-8 (without a BOM). Each CSV file encoding type will be consumed by different non-standardized APIs. One API reads UTF-8 with a BOM and the other API reads without a BOM. I needed to research the references about this concept, reading the "What's the difference between UTF-8 and UTF-8 without BOM?" Stack Overflow question, and the Wikipedia article "Byte order mark" to build my approach.
Finally, my C# Programming for both UTF-8 encoding types (with BOM and pure) needed to be similar to this example below:
// For UTF-8 with BOM, equals shared by Zanoni (at top)
string result = System.Text.Encoding.UTF8.GetString(byteArray);
//for Pure UTF-8 (without B.O.M.)
string result = (new UTF8Encoding(false)).GetString(byteArray);
Using (byte)b.ToString("x2"), Outputs b4b5dfe475e58b67
public static class Ext {
public static string ToHexString(this byte[] hex)
{
if (hex == null) return null;
if (hex.Length == 0) return string.Empty;
var s = new StringBuilder();
foreach (byte b in hex) {
s.Append(b.ToString("x2"));
}
return s.ToString();
}
public static byte[] ToHexBytes(this string hex)
{
if (hex == null) return null;
if (hex.Length == 0) return new byte[0];
int l = hex.Length / 2;
var b = new byte[l];
for (int i = 0; i < l; ++i) {
b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
}
return b;
}
public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
{
if (bytes == null && bytesToCompare == null) return true; // ?
if (bytes == null || bytesToCompare == null) return false;
if (object.ReferenceEquals(bytes, bytesToCompare)) return true;
if (bytes.Length != bytesToCompare.Length) return false;
for (int i = 0; i < bytes.Length; ++i) {
if (bytes[i] != bytesToCompare[i]) return false;
}
return true;
}
}
There is also class UnicodeEncoding, quite simple in usage:
ByteConverter = new UnicodeEncoding();
string stringDataForEncoding = "My Secret Data!";
byte[] dataEncoded = ByteConverter.GetBytes(stringDataForEncoding);
Console.WriteLine("Data after decoding: {0}", ByteConverter.GetString(dataEncoded));
In addition to the selected answer, if you're using .NET 3.5 or .NET 3.5 CE, you have to specify the index of the first byte to decode, and the number of bytes to decode:
string result = System.Text.Encoding.UTF8.GetString(byteArray, 0, byteArray.Length);
Alternatively:
var byteStr = Convert.ToBase64String(bytes);
The BitConverter class can be used to convert a byte[] to string.
var convertedString = BitConverter.ToString(byteAttay);
Documentation of BitConverter class can be fount on MSDN.
To my knowledge none of the given answers guarantee correct behavior with null termination. Until someone shows me differently I wrote my own static class for handling this with the following methods:
// Mimics the functionality of strlen() in c/c++
// Needed because niether StringBuilder or Encoding.*.GetString() handle \0 well
static int StringLength(byte[] buffer, int startIndex = 0)
{
int strlen = 0;
while
(
(startIndex + strlen + 1) < buffer.Length // Make sure incrementing won't break any bounds
&& buffer[startIndex + strlen] != 0 // The typical null terimation check
)
{
++strlen;
}
return strlen;
}
// This is messy, but I haven't found a built-in way in c# that guarentees null termination
public static string ParseBytes(byte[] buffer, out int strlen, int startIndex = 0)
{
strlen = StringLength(buffer, startIndex);
byte[] c_str = new byte[strlen];
Array.Copy(buffer, startIndex, c_str, 0, strlen);
return Encoding.UTF8.GetString(c_str);
}
The reason for the startIndex was in the example I was working on specifically I needed to parse a byte[] as an array of null terminated strings. It can be safely ignored in the simple case
A LINQ one-liner for converting a byte array byteArrFilename read from a file to a pure ASCII C-style zero-terminated string would be this: Handy for reading things like file index tables in old archive formats.
String filename = new String(byteArrFilename.TakeWhile(x => x != 0)
.Select(x => x < 128 ? (Char)x : '?').ToArray());
I use '?' as the default character for anything not pure ASCII here, but that can be changed, of course. If you want to be sure you can detect it, just use '\0' instead, since the TakeWhile at the start ensures that a string built this way cannot possibly contain '\0' values from the input source.
Try this console application:
static void Main(string[] args)
{
//Encoding _UTF8 = Encoding.UTF8;
string[] _mainString = { "Hello, World!" };
Console.WriteLine("Main String: " + _mainString);
// Convert a string to UTF-8 bytes.
byte[] _utf8Bytes = Encoding.UTF8.GetBytes(_mainString[0]);
// Convert UTF-8 bytes to a string.
string _stringuUnicode = Encoding.UTF8.GetString(_utf8Bytes);
Console.WriteLine("String Unicode: " + _stringuUnicode);
}
Here is a result where you didn’t have to bother with encoding. I used it in my network class and send binary objects as string with it.
public static byte[] String2ByteArray(string str)
{
char[] chars = str.ToArray();
byte[] bytes = new byte[chars.Length * 2];
for (int i = 0; i < chars.Length; i++)
Array.Copy(BitConverter.GetBytes(chars[i]), 0, bytes, i * 2, 2);
return bytes;
}
public static string ByteArray2String(byte[] bytes)
{
char[] chars = new char[bytes.Length / 2];
for (int i = 0; i < chars.Length; i++)
chars[i] = BitConverter.ToChar(bytes, i * 2);
return new string(chars);
}
string result = ASCIIEncoding.UTF8.GetString(byteArray);

Does WCF pad all byte arrays in SOAP messages?

I am doing some data chunking and I'm seeing an interesting issue when sending binary data in my response. I can confirm that the length of the byte array is below my data limit of 4 megabytes, but when I receive the message, it's total size is over 4 megabytes.
For the example below, I used the largest chunk size I could so I could illustrate the issue while still receiving a usable chunk.
The size of the binary data is 3,040,870 on the service side and the client (once the message is deserialized). However, I can also confirm that the byte array is actually just under 4 megabytes (this was done by actually copying the binary data from the message and pasting it into a text file).
So, is WCF causing these issues and, if so, is there anything I can do to prevent it? If not, what might be causing this inflation on my side?
Thanks!
The usual way of sending byte[]s in SOAP messages is to base64-encode the data. This encoding takes 33% more space than binary encoding, which accounts for the size difference almost precisely.
You could adjust the max size or chunk size slightly so that the end result is within the right range, or use another encoding, e.g. MTOM, to eliminate this 33% overhead.
If you're stuck with soap, you can offset the buffer overhead Tim S. talked about using the System.IO.Compression library in .Net - You'd use the compress function first, before building and sending the soap message.
You'd compress with this:
public static byte[] Compress(byte[] data)
{
MemoryStream ms = new MemoryStream();
DeflateStream ds = new DeflateStream(ms, CompressionMode.Compress);
ds.Write(data, 0, data.Length);
ds.Flush();
ds.Close();
return ms.ToArray();
}
On the receiving end, you'd use this to decompress:
public static byte[] Decompress(byte[] data)
{
const int BUFFER_SIZE = 256;
byte[] tempArray = new byte[BUFFER_SIZE];
List<byte[]> tempList = new List<byte[]>();
int count = 0;
int length = 0;
MemoryStream ms = new MemoryStream(data);
DeflateStream ds = new DeflateStream(ms, CompressionMode.Decompress);
while ((InlineAssignHelper(count, ds.Read(tempArray, 0, BUFFER_SIZE))) > 0) {
if (count == BUFFER_SIZE) {
tempList.Add(tempArray);
tempArray = new byte[BUFFER_SIZE];
} else {
byte[] temp = new byte[count];
Array.Copy(tempArray, 0, temp, 0, count);
tempList.Add(temp);
}
length += count;
}
byte[] retVal = new byte[length];
count = 0;
foreach (byte[] temp in tempList) {
Array.Copy(temp, 0, retVal, count, temp.Length);
count += temp.Length;
}
return retVal;
}

Need help Creating a big array from small byte arrays

i got the following code:
byte[] myBytes = new byte[10 * 10000];
for (long i = 0; i < 10000; i++)
{
byte[] a1 = BitConverter.GetBytes(i);
byte[] a2 = BitConverter.GetBytes(true);
byte[] a3 = BitConverter.GetBytes(false);
byte[] rv = new byte[10];
System.Buffer.BlockCopy(a1, 0, rv, 0, a1.Length);
System.Buffer.BlockCopy(a2, 0, rv, a1.Length, a2.Length);
System.Buffer.BlockCopy(a3, 0, rv, a1.Length + a2.Length, a3.Length);
}
everything works as it should. i was trying to convert this code so everything will be written into myBytes but then i realised, that i use a long and if its value will be higher then int.MaxValue casting will fail.
how could one solve this?
another question would be, since i dont want to create a very large bytearray in memory, how could i send it directry to my .WriteBytes(path, myBytes); function ?
If the final destination for this is, as suggested, a file: then write to a file more directly, rather than buffering in memory:
using (var file = File.Create(path)) // or append file FileStream etc
using (var writer = new BinaryWriter(file))
{
for (long i = 0; i < 10000; i++)
{
writer.Write(i);
writer.Write(true);
writer.Write(false);
}
}
Perhaps the ideal way of doing this in your case would be to pass a single BinaryWriter instance to each object in turn as you serialize them (don't open and close the file per-object).
Why don't you just Write() the bytes out as you process them rather than converting to a massive buffer, or use a smaller buffer at least?

C# Split byte[] array

I am doing RSA encryption and I have to split my long string into small byte[] and encrypt them. I then combine the arrays and convert to string and write to a secure file.
Then encryption creates byte[128]
I use this the following to combine:
public static byte[] Combine(params byte[][] arrays)
{
byte[] ret = new byte[arrays.Sum(x => x.Length)];
int offset = 0;
foreach (byte[] data in arrays)
{
Buffer.BlockCopy(data, 0, ret, offset, data.Length);
offset += data.Length;
}
return ret;
}
When I decrypt I take the string, convert it to a byte[] array and now need to split it to decode the chunks and then convert to string.
Any ideas?
Thanks
EDIT:
I think I have the split working now however the decryption fails. Is this because of RSA keys etc? At TimePointA it encrypts it, then at TimePointB it tries to decrypt and it fails. The public keys are different so not sure if that is the issue.
When you decrypt, you can create one array for your decrypt buffer and reuse it:
Also, normally RSA gets used to encrypt a symmetric key for something like AES, and the symmetric algorithm is used to encrypt the actual data. This is enormously faster for anything longer than 1 cipher block. To decrypt the data, you decrypt the symmetric key with RSA, followed by decrypting the data with that key.
byte[] buffer = new byte[BlockLength];
// ASSUMES SOURCE IS padded to BlockLength
for (int i = 0; i < source.Length; i += BlockLength)
{
Buffer.BlockCopy(source, i, buffer, 0, BlockLength);
// ... decode buffer and copy the result somewhere else
}
Edit 2: If you are storing the data as strings and not as raw bytes, use Convert.ToBase64String() and Convert.FromBase64String() as the safest conversion solution.
Edit 3: From his edit:
private static List<byte[]> splitByteArray(string longString)
{
byte[] source = Convert.FromBase64String(longString);
List<byte[]> result = new List<byte[]>();
for (int i = 0; i < source.Length; i += 128)
{
byte[] buffer = new byte[128];
Buffer.BlockCopy(source, i, buffer, 0, 128);
result.Add(buffer);
}
return result;
}
I'd say something like this would do it:
byte[] text = Encoding.UTF8.GetBytes(longString);
int len = 128;
for (int i = 0; i < text.Length; )
{
int j = 0;
byte[] chunk = new byte[len];
while (++j < chunk.Length && i < text.Length)
{
chunk[j] = text[i++];
}
Convert(chunk); //do something with the chunk
}
Why do you need to break the string into variable length chunks? Fixed-length chunks, or no chunks at all, would simplify this a lot.
why not use a framework instead of doing the byte-stuff yourself?
http://www.codinghorror.com/blog/archives/001275.html
"the public keys are different"?
You encrypt with a private key, and decrypt with the public key that corresponds to the private key.
Anything else will give you gibberish.

Categories

Resources