I have to calculate CRC32 checksum for a string in C# and send it to an external application.
On the other end they will calculate it using Java.
But my checksum does not match on the their end.
e.g. CRC32 checksum of the following string
43HLV109520DAP10072la19z6
is 1269993351 on their end.
And 2947932745 at my end using C#
Please tell me what's going wrong in my code.
I am using this 0xffffffff default seed and following crc table
readonly static uint[] CRCTable = new uint[] {
0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA, 0x076DC419,
0x706AF48F, 0xE963A535, 0x9E6495A3, 0x0EDB8832, 0x79DCB8A4,
0xE0D5E91E, 0x97D2D988, 0x09B64C2B, 0x7EB17CBD, 0xE7B82D07,
0x90BF1D91, 0x1DB71064, 0x6AB020F2, 0xF3B97148, 0x84BE41DE,
0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7, 0x136C9856,
0x646BA8C0, 0xFD62F97A, 0x8A65C9EC, 0x14015C4F, 0x63066CD9,
0xFA0F3D63, 0x8D080DF5, 0x3B6E20C8, 0x4C69105E, 0xD56041E4,
0xA2677172, 0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B,
0x35B5A8FA, 0x42B2986C, 0xDBBBC9D6, 0xACBCF940, 0x32D86CE3,
0x45DF5C75, 0xDCD60DCF, 0xABD13D59, 0x26D930AC, 0x51DE003A,
0xC8D75180, 0xBFD06116, 0x21B4F4B5, 0x56B3C423, 0xCFBA9599,
0xB8BDA50F, 0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924,
0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D, 0x76DC4190,
0x01DB7106, 0x98D220BC, 0xEFD5102A, 0x71B18589, 0x06B6B51F,
0x9FBFE4A5, 0xE8B8D433, 0x7807C9A2, 0x0F00F934, 0x9609A88E,
0xE10E9818, 0x7F6A0DBB, 0x086D3D2D, 0x91646C97, 0xE6635C01,
0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E, 0x6C0695ED,
0x1B01A57B, 0x8208F4C1, 0xF50FC457, 0x65B0D9C6, 0x12B7E950,
0x8BBEB8EA, 0xFCB9887C, 0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3,
0xFBD44C65, 0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2,
0x4ADFA541, 0x3DD895D7, 0xA4D1C46D, 0xD3D6F4FB, 0x4369E96A,
0x346ED9FC, 0xAD678846, 0xDA60B8D0, 0x44042D73, 0x33031DE5,
0xAA0A4C5F, 0xDD0D7CC9, 0x5005713C, 0x270241AA, 0xBE0B1010,
0xC90C2086, 0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F,
0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4, 0x59B33D17,
0x2EB40D81, 0xB7BD5C3B, 0xC0BA6CAD, 0xEDB88320, 0x9ABFB3B6,
0x03B6E20C, 0x74B1D29A, 0xEAD54739, 0x9DD277AF, 0x04DB2615,
0x73DC1683, 0xE3630B12, 0x94643B84, 0x0D6D6A3E, 0x7A6A5AA8,
0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1, 0xF00F9344,
0x8708A3D2, 0x1E01F268, 0x6906C2FE, 0xF762575D, 0x806567CB,
0x196C3671, 0x6E6B06E7, 0xFED41B76, 0x89D32BE0, 0x10DA7A5A,
0x67DD4ACC, 0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5,
0xD6D6A3E8, 0xA1D1937E, 0x38D8C2C4, 0x4FDFF252, 0xD1BB67F1,
0xA6BC5767, 0x3FB506DD, 0x48B2364B, 0xD80D2BDA, 0xAF0A1B4C,
0x36034AF6, 0x41047A60, 0xDF60EFC3, 0xA867DF55, 0x316E8EEF,
0x4669BE79, 0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236,
0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F, 0xC5BA3BBE,
0xB2BD0B28, 0x2BB45A92, 0x5CB36A04, 0xC2D7FFA7, 0xB5D0CF31,
0x2CD99E8B, 0x5BDEAE1D, 0x9B64C2B0, 0xEC63F226, 0x756AA39C,
0x026D930A, 0x9C0906A9, 0xEB0E363F, 0x72076785, 0x05005713,
0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38, 0x92D28E9B,
0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21, 0x86D3D2D4, 0xF1D4E242,
0x68DDB3F8, 0x1FDA836E, 0x81BE16CD, 0xF6B9265B, 0x6FB077E1,
0x18B74777, 0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C,
0x8F659EFF, 0xF862AE69, 0x616BFFD3, 0x166CCF45, 0xA00AE278,
0xD70DD2EE, 0x4E048354, 0x3903B3C2, 0xA7672661, 0xD06016F7,
0x4969474D, 0x3E6E77DB, 0xAED16A4A, 0xD9D65ADC, 0x40DF0B66,
0x37D83BF0, 0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9,
0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6, 0xBAD03605,
0xCDD70693, 0x54DE5729, 0x23D967BF, 0xB3667A2E, 0xC4614AB8,
0x5D681B02, 0x2A6F2B94, 0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B,
0x2D02EF8D
};
CRC32 is calculated over a sequence of bytes and not over a string. So to calculate CRC32 you need to transform the string into bytes first. If you use a different encoding to transform a string to a sequence of bytes the result will be different.
Thus you need to use the same encoding on both sides. I recommend using UTF-8 without BOM.
I have calculated CRC32 with Java and got the same you got in C#. I.e. CRC32(43HLV109520DAP10072la19z6)=2947932745. This means that either they have a bug in java, or you have a bug during transmission.
Code follows.
I suggest you try to send simple data to java application, like zeros or ones, and try to deduce how do they compute CRC.
public static void main(String[] args) {
CRC32 crc32 = new CRC32();
String data = "43HLV109520DAP10072la19z6";
String[] cs = new String[] {"utf8" /*, "cp1252", "cp866" */};
byte[] array;
byte b;
for(int i=0; i<cs.length; ++i) {
array = data.getBytes(Charset.forName(cs[i]));
crc32.reset();
crc32.update(array);
System.out.println(String.format("%s: %d", cs[i], crc32.getValue()));
/*
for(int j=0; j<array.length/2; j++) {
b = array[i];
array[i] = array[array.length-1-i];
array[array.length-1-i] = b;
}
*/
for(int j=0; j<array.length; j+=2) {
b = array[i];
array[i] = array[i+2];
array[i+1] = b;
}
crc32.reset();
crc32.update(array);
System.out.println(String.format("of modified: %d", crc32.getValue()));
}
}
UPDATE
Endiannes reverse also not help
for(int j=0; j<array.length; j+=4) {
b = array[i];
array[i] = array[i+3];
array[i+3] = b;
b = array[i+1];
array[i+1] = array[i+2];
array[i+2] = b;
}
Without delving into any detail, the problem can be related to Java's lack of unsigned integer types. The problem could happen at the int level, but also at the byte level. This is one avenue of investigation.
CRC is calculated over a sequence of bytes and not over a string.
Whichever CRC in java looks different due unavailability of Unsigned int in java.
Convert calculated Int CRC into Hex String and take last 2 Bytes (length 4)
That is your actual CRC unsigned Int.
String hexCrc = Integer.toHexString(crcCalculated);
hexCrc = hexCrc.substring(hexCrc.length()-4);
compare hex CRC of c# and Java both should be same.
This question already has answers here:
What is the equivalent of memset in C#?
(17 answers)
Closed 8 years ago.
I'm busy rewriting an old project that was done in C++, to C#.
My task is to rewrite the program so that it functions as close to the original as possible.
During a bunch of file-handling the previous developer who wrote this program creates a structure containing a ton of fields that correspond to the set format that a file has to be written in, so all that work is already done for me.
These fields are all byte arrays. What the C++ code then does is use memset to set this entire structure to all spaces characters (0x20). One line of code. Easy.
This is very important as the utility that this file eventually goes to is expecting the file in this format. What I've had to do is change this struct to a class in C#, but I cannot find a way to easily initialize each of these byte arrays to all space characters.
What I've ended up having to do is this in the class constructor:
//Initialize all of the variables to spaces.
int index = 0;
foreach (byte b in UserCode)
{
UserCode[index] = 0x20;
index++;
}
This works fine, but I'm sure there must be a simpler way to do this. When the array is set to UserCode = new byte[6] in the constructor the byte array gets automatically initialized to the default null values. Is there no way that I can make it become all spaces upon declaration, so that when I call my class' constructor that it is initialized straight away like this? Or some memset-like function?
For small arrays use array initialisation syntax:
var sevenItems = new byte[] { 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20 };
For larger arrays use a standard for loop. This is the most readable and efficient way to do it:
var sevenThousandItems = new byte[7000];
for (int i = 0; i < sevenThousandItems.Length; i++)
{
sevenThousandItems[i] = 0x20;
}
Of course, if you need to do this a lot then you could create a helper method to help keep your code concise:
byte[] sevenItems = CreateSpecialByteArray(7);
byte[] sevenThousandItems = CreateSpecialByteArray(7000);
// ...
public static byte[] CreateSpecialByteArray(int length)
{
var arr = new byte[length];
for (int i = 0; i < arr.Length; i++)
{
arr[i] = 0x20;
}
return arr;
}
Use this to create the array in the first place:
byte[] array = Enumerable.Repeat((byte)0x20, <number of elements>).ToArray();
Replace <number of elements> with the desired array size.
You can use Enumerable.Repeat()
Enumerable.Repeat generates a sequence that contains one repeated value.
Array of 100 items initialized to 0x20:
byte[] arr1 = Enumerable.Repeat((byte)0x20,100).ToArray();
var array = Encoding.ASCII.GetBytes(new string(' ', 100));
If you need to initialise a small array you can use:
byte[] smallArray = new byte[] { 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20 };
If you have a larger array, then you could use:
byte[] bitBiggerArray Enumerable.Repeat(0x20, 7000).ToArray();
Which is simple, and easy for the next guy/girl to read. And will be fast enough 99.9% of the time.
(Normally will be the BestOption™)
However if you really really need super speed, calling out to the optimized memset method, using P/invoke, is for you:
(Here wrapped up in a nice to use class)
public static class Superfast
{
[DllImport("msvcrt.dll",
EntryPoint = "memset",
CallingConvention = CallingConvention.Cdecl,
SetLastError = false)]
private static extern IntPtr MemSet(IntPtr dest, int c, int count);
//If you need super speed, calling out to M$ memset optimized method using P/invoke
public static byte[] InitByteArray(byte fillWith, int size)
{
byte[] arrayBytes = new byte[size];
GCHandle gch = GCHandle.Alloc(arrayBytes, GCHandleType.Pinned);
MemSet(gch.AddrOfPinnedObject(), fillWith, arrayBytes.Length);
gch.Free();
return arrayBytes;
}
}
Usage:
byte[] oneofManyBigArrays = Superfast.InitByteArray(0x20,700000);
Maybe these could be helpful?
What is the equivalent of memset in C#?
http://techmikael.blogspot.com/2009/12/filling-array-with-default-value.html
Guys before me gave you your answer. I just want to point out your misuse of foreach loop. See, since you have to increment index standard "for loop" would be not only more compact, but also more efficient ("foreach" does many things under the hood):
for (int index = 0; index < UserCode.Length; ++index)
{
UserCode[index] = 0x20;
}
This is a faster version of the code from the post marked as the answer.
All of the benchmarks that I have performed show that a simple for loop that only contains something like an array fill is typically twice as fast if it is decrementing versus if it is incrementing.
Also, the array Length property is already passed as the parameter so it doesn't need to be retrieved from the array properties. It should also be pre-calculated and assigned to a local variable.
Loop bounds calculations that involve a property accessor will re-compute the value of the bounds before each iteration of the loop.
public static byte[] CreateSpecialByteArray(int length)
{
byte[] array = new byte[length];
int len = length - 1;
for (int i = len; i >= 0; i--)
{
array[i] = 0x20;
}
return array;
}
Just to expand on my answer a neater way of doing this multiple times would probably be:
PopulateByteArray(UserCode, 0x20);
which calls:
public static void PopulateByteArray(byte[] byteArray, byte value)
{
for (int i = 0; i < byteArray.Length; i++)
{
byteArray[i] = value;
}
}
This has the advantage of a nice efficient for loop (mention to gwiazdorrr's answer) as well as a nice neat looking call if it is being used a lot. And a lot mroe at a glance readable than the enumeration one I personally think. :)
The fastest way to do this is to use the api:
bR = 0xFF;
RtlFillMemory(pBuffer, nFileLen, bR);
using a pointer to a buffer, the length to write, and the encoded byte. I think the fastest way to do it in managed code (much slower), is to create a small block of initialized bytes, then use Buffer.Blockcopy to write them to the byte array in a loop. I threw this together but haven't tested it, but you get the idea:
long size = GetFileSize(FileName);
// zero byte
const int blocksize = 1024;
// 1's array
byte[] ntemp = new byte[blocksize];
byte[] nbyte = new byte[size];
// init 1's array
for (int i = 0; i < blocksize; i++)
ntemp[i] = 0xff;
// get dimensions
int blocks = (int)(size / blocksize);
int remainder = (int)(size - (blocks * blocksize));
int count = 0;
// copy to the buffer
do
{
Buffer.BlockCopy(ntemp, 0, nbyte, blocksize * count, blocksize);
count++;
} while (count < blocks);
// copy remaining bytes
Buffer.BlockCopy(ntemp, 0, nbyte, blocksize * count, remainder);
This function is way faster than a for loop for filling an array.
The Array.Copy command is a very fast memory copy function. This function takes advantage of that by repeatedly calling the Array.Copy command and doubling the size of what we copy until the array is full.
I discuss this on my blog at https://grax32.com/2013/06/fast-array-fill-function-revisited.html (Link updated 12/16/2019). Also see Nuget package that provides this extension method. http://sites.grax32.com/ArrayExtensions/
Note that this would be easy to make into an extension method by just adding the word "this" to the method declarations i.e. public static void ArrayFill<T>(this T[] arrayToFill ...
public static void ArrayFill<T>(T[] arrayToFill, T fillValue)
{
// if called with a single value, wrap the value in an array and call the main function
ArrayFill(arrayToFill, new T[] { fillValue });
}
public static void ArrayFill<T>(T[] arrayToFill, T[] fillValue)
{
if (fillValue.Length >= arrayToFill.Length)
{
throw new ArgumentException("fillValue array length must be smaller than length of arrayToFill");
}
// set the initial array value
Array.Copy(fillValue, arrayToFill, fillValue.Length);
int arrayToFillHalfLength = arrayToFill.Length / 2;
for (int i = fillValue.Length; i < arrayToFill.Length; i *= 2)
{
int copyLength = i;
if (i > arrayToFillHalfLength)
{
copyLength = arrayToFill.Length - i;
}
Array.Copy(arrayToFill, 0, arrayToFill, i, copyLength);
}
}
You can use a collection initializer:
UserCode = new byte[]{0x20,0x20,0x20,0x20,0x20,0x20};
This will work better than Repeat if the values are not identical.
You could speed up the initialization and simplify the code by using the the Parallel class (.NET 4 and newer):
public static void PopulateByteArray(byte[] byteArray, byte value)
{
Parallel.For(0, byteArray.Length, i => byteArray[i] = value);
}
Of course you can create the array at the same time:
public static byte[] CreateSpecialByteArray(int length, byte value)
{
var byteArray = new byte[length];
Parallel.For(0, length, i => byteArray[i] = value);
return byteArray;
}