CRC-16 and CRC-32 Checks - C#

I need help verifying CRC-16 values (and I also need help with CRC-32 values). I tried to sit down and understand how CRC works, but I am drawing a blank.
My first problem is with using an online calculator to check that the message "BD001325E032091B94C412AC" has CRC16 = 12AC. The documentation states that the last two octets are the CRC16 value, so I am inputting "BD001325E032091B94C4" into the site http://www.lammertbies.nl/comm/info/crc-calculation.html and receiving 5A90 as the result instead of 12AC.
Does anybody know why these values are different, and where I can find code for calculating CRC16 and CRC32 values? (I plan to learn how to do this myself later, but time doesn't allow right now.)
Some more messages follow:
16000040FFFFFFFF00015FCB
3C00003144010405E57022C7
BA00001144010101B970F0ED
3900010101390401B3049FF1
09900C800000000000008CF3
8590000000000000000035F7
00900259025902590259EBC9
0200002B00080191014BF5A2
BB0000BEE0014401B970E51E
3D000322D0320A2510A263A0
2C0001440000D60000D65E54
--Edit--
I have included more information. The documentation I was referencing is TIA-102.BAAA-A (from the TIA standard). The following is what the documentation states (trying to avoid copyright infringement as much as possible):
The Last Block in a packet comprises several octets of user information and / or
pad octets, followed by a 4-octet CRC parity check. This is referred to as the
packet CRC.
The packet CRC is a 4-octet cyclic redundancy check coded over all of the data
octets included in the Intermediate Blocks and the octets of user information of
the Last Block. The specific calculation is as follows.
Let k be the total number of user information and pad bits over which the packet
CRC is to be calculated. Consider the k message bits as the coefficients of a
polynomial M(x) of degree k–1, associating the MSB of the zero-th message
octet with x^k–1 and the LSB of the last message octet with x^0. Define the
generator polynomial, GM(x), and the inversion polynomial, IM(x).
GM(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 +
x^4 + x^2 + x + 1
IM(x) = x^31 + x^30 + x^29 + ... + x^2 + x + 1
The packet CRC polynomial, FM(x), is then computed from the following formula.
FM(x) = ( x^32 M(x) mod GM(x) ) + IM(x) modulo 2, i.e., in GF(2)
The coefficients of FM(x) are placed in the CRC field with the MSB of the zero-th
octet of the CRC corresponding to x^31 and the LSB of the third octet of the CRC
corresponding to x^0.
In the above quote, I have put ^ to show powers as the formatting didn't stay the same when quoted. I'm not sure what goes to what but does this help?

I have a class I converted from C++ code I found on the internet; it uses a long to calculate a CRC32. It adheres to the standard and is the one used by PKZIP, WinZip and Ethernet. To test it, compress a file with WinZip, then calculate the CRC of the same file with this class; it should return the same CRC. It does for me. (The class below is Java/Android, as it uses Context and FileInputStream, but the logic ports directly to C#.)
import android.content.Context;
import java.io.FileInputStream;

public class CRC32
{
    private int[] iTable;

    public CRC32()
    {
        this.iTable = new int[256];
        Init();
    }

    /**
     * Initializes iTable by applying the polynomial used by PKZIP, WinZip and Ethernet.
     */
    private void Init()
    {
        // 0x04C11DB7 is the official polynomial used by PKZip, WinZip and Ethernet.
        int iPolynomial = 0x04C11DB7;

        // 256 values representing ASCII character codes.
        for (int iAscii = 0; iAscii <= 0xFF; iAscii++)
        {
            this.iTable[iAscii] = this.Reflect(iAscii, (byte) 8) << 24;
            for (int i = 0; i <= 7; i++)
            {
                if ((this.iTable[iAscii] & 0x80000000L) == 0) this.iTable[iAscii] = (this.iTable[iAscii] << 1) ^ 0;
                else this.iTable[iAscii] = (this.iTable[iAscii] << 1) ^ iPolynomial;
            }
            this.iTable[iAscii] = this.Reflect(this.iTable[iAscii], (byte) 32);
        }
    }

    /**
     * Reflection is a requirement for the official CRC-32 standard. Note that you can create a CRC
     * without it, but it won't conform to the standard.
     *
     * @param iReflect the value to reflect
     * @param iValue   the number of bits to reflect
     * @return the reflected value
     */
    private int Reflect(int iReflect, int iValue)
    {
        int iReturned = 0;
        // Swap bit 0 for bit 7, bit 1 for bit 6, etc.
        for (int i = 1; i < (iValue + 1); i++)
        {
            if ((iReflect & 1) != 0)
            {
                iReturned |= (1 << (iValue - i));
            }
            iReflect >>= 1;
        }
        return iReturned;
    }

    /**
     * CalculateCRC computes a partial CRC32 by looping through each byte in sData.
     *
     * @param lCRC        the running CRC value; it must have been initialized
     *                    (see FullCRC for an example)
     * @param sData       the bytes over which to calculate the CRC
     * @param iDataLength the length of the data
     * @return the updated CRC
     */
    public long CalculateCRC(long lCRC, byte[] sData, int iDataLength)
    {
        for (int i = 0; i < iDataLength; i++)
        {
            lCRC = (lCRC >> 8) ^ (long) (this.iTable[(int) (lCRC & 0xFF) ^ (int) (sData[i] & 0xff)] & 0xffffffffL);
        }
        return lCRC;
    }

    /**
     * Calculates the CRC32 for the given data.
     *
     * @param sData       the data to calculate the CRC over
     * @param iDataLength the length of the data
     * @return the calculated CRC32
     */
    public long FullCRC(byte[] sData, int iDataLength)
    {
        long lCRC = 0xffffffffL;
        lCRC = this.CalculateCRC(lCRC, sData, iDataLength);
        return (lCRC ^ 0xffffffffL);
    }

    /**
     * Calculates the CRC32 of a file.
     *
     * @param sFileName the complete file path
     * @param context   the context used to open the file
     * @return the calculated CRC32, or -1 if an error occurs (e.g. file not found)
     */
    long FileCRC(String sFileName, Context context)
    {
        long iOutCRC = 0xffffffffL; // Initialize the CRC.
        int iBytesRead = 0;
        int buffSize = 32 * 1024;
        FileInputStream isFile = null;
        try
        {
            byte[] data = new byte[buffSize]; // 32 KB buffer
            isFile = context.openFileInput(sFileName);
            try
            {
                while ((iBytesRead = isFile.read(data, 0, buffSize)) > 0)
                {
                    iOutCRC = this.CalculateCRC(iOutCRC, data, iBytesRead);
                }
                return (iOutCRC ^ 0xffffffffL); // Finalize the CRC.
            }
            catch (Exception e)
            {
                // Error reading file
            }
            finally
            {
                isFile.close();
            }
        }
        catch (Exception e)
        {
            // File not found
        }
        return -1L;
    }
}

Read Ross Williams' tutorial on CRCs to get a better understanding of CRCs, what defines a particular CRC, and their implementations.
The reveng website has an excellent catalog of known CRCs, and for each the CRC of a test string (nine bytes: "123456789" in ASCII/UTF-8). Note that there are 22 different 16-bit CRCs defined there.
The reveng software on that same site can be used to reverse engineer the polynomial, initialization, post-processing, and bit reversal given several examples as you have for the 16-bit CRC. (Hence the name "reveng".) I ran your data through and got:
./reveng -w 16 -s 16000040FFFFFFFF00015FCB 3C00003144010405E57022C7 BA00001144010101B970F0ED 3900010101390401B3049FF1 09900C800000000000008CF3 8590000000000000000035F7 00900259025902590259EBC9 0200002B00080191014BF5A2 BB0000BEE0014401B970E51E 3D000322D0320A2510A263A0 2C0001440000D60000D65E54
width=16 poly=0x1021 init=0xc921 refin=false refout=false xorout=0x0000 check=0x2fcf name=(none)
As indicated by the "(none)", that 16-bit CRC is not any of the 22 listed on reveng, though it is similar to several of them, differing only in the initialization.
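To make those parameters concrete, here is a minimal bit-wise C# sketch (the function name is mine) of a CRC-16 with poly=0x1021, init=0xC921, no reflection and no final XOR:

public static ushort Crc16Tia(byte[] data)
{
    ushort crc = 0xC921;                       // init value reported by reveng
    foreach (byte b in data)
    {
        crc ^= (ushort)(b << 8);               // refin=false: feed each byte in MSB-first
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) != 0
                ? (ushort)((crc << 1) ^ 0x1021)
                : (ushort)(crc << 1);
    }
    return crc;                                // xorout=0x0000
}

Over the first ten bytes of a sample message this should reproduce the trailing two octets (0x5FCB for 16000040FFFFFFFF0001), and since xorout is zero, running it over all twelve bytes of a message should yield 0x0000.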
The additional information you provided is for a 32-bit CRC, either CRC-32 or CRC-32/BZIP in the reveng catalog, depending on whether the bits are reversed or not.
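Read literally, the quoted formula is an MSB-first CRC with generator 0x04C11DB7 (the terms of GM(x) below x^32), no initial preset, and a final inversion (the addition of IM(x)). Here is a bit-wise C# transcription of that reading, offered as a sketch to check against known-good data rather than a definitive implementation:

public static uint PacketCrc32(byte[] message)
{
    uint crc = 0;                              // x^32 * M(x) mod GM(x): no preset
    foreach (byte b in message)
    {
        crc ^= (uint)b << 24;                  // MSB of each octet enters first
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x80000000) != 0
                ? (crc << 1) ^ 0x04C11DB7      // GM(x) with the x^32 term dropped
                : crc << 1;
    }
    return crc ^ 0xFFFFFFFF;                   // adding IM(x) inverts all 32 bits
}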

There are quite a few parameters to CRC calculations: polynomial, initial value, final XOR... see Wikipedia for details. Your CRC does not seem to fit the ones on the site you used, but you can try to find the right parameters from your documentation and use a different calculator, e.g. this one (though I'm afraid it doesn't support HEX input).
One thing to keep in mind is that CRC-16 is usually calculated over the data that is supposed to be checksummed plus two zero-bytes, e.g. you are probably looking for a CRC16 function where CRC16(BD001325E032091B94C40000) == 12AC. With checksums calculated in this way, the CRC of the data with checksum appended will work out to 0, which makes checking easier, e.g. CRC16(BD001325E032091B94C412AC) == 0000

16-bit CRC-ITU calculation for Concox tracker

I am creating C# code for a server program that receives data from a Concox TR06 GPS tracker via TCP:
http://www.iconcox.com/uploads/soft/140920/1-140920023130.pdf
When first starting up, the tracker sends a login message, which needs to be acknowledged before it will send any position data. My first problem is that, according to the documentation, the acknowledge message is 18 bytes long, yet the example they provide is only 10 bytes long.
P.S. In the table above, I'm pretty sure the "bits" column should be labelled "bytes" instead...
Now, my main problem is in calculating the Error Check. According to the documentation:
The check code is generated by the CRC-ITU checking method. The check codes of data in the structure of the protocol, from the Packet Length to the Information Serial Number (including "Packet Length" and "Information Serial Number"), are values of CRC-ITU.
Ok, so in the above example, I need to calculate CRC on 0x05 0x01 0x00 0x01
Now, I'm guessing it's a 16-bit CRC since, according to the diagram above, the CRC is 2 bytes long. I've implemented two different CRC implementations I found online at http://www.sanity-free.org/134/standard_crc_16_in_csharp.html and http://www.sanity-free.org/133/crc_16_ccitt_in_csharp.html, but neither gives me the answer that, according to the diagram, I am supposed to be getting - 0xD9 0xDC. I've even used this site - https://www.lammertbies.nl/comm/info/crc-calculation.html - to manually enter the 4 bytes, but nothing gives me the expected result...
Any ideas where I might be going wrong? Any pointers/hints would be greatly appreciated. Thank you
I have implemented the same logic in Node.js (JavaScript). I hope this helps someone.
const crc16itu = hexString => {
    if (!hexString) return 0x00;
    const table = [
        0x0000, 0x1189, 0x2312, 0x329B, 0x4624, 0x57AD, 0x6536, 0x74BF,
        0x8C48, 0x9DC1, 0xAF5A, 0xBED3, 0xCA6C, 0xDBE5, 0xE97E, 0xF8F7,
        0x1081, 0x0108, 0x3393, 0x221A, 0x56A5, 0x472C, 0x75B7, 0x643E,
        0x9CC9, 0x8D40, 0xBFDB, 0xAE52, 0xDAED, 0xCB64, 0xF9FF, 0xE876,
        0x2102, 0x308B, 0x0210, 0x1399, 0x6726, 0x76AF, 0x4434, 0x55BD,
        0xAD4A, 0xBCC3, 0x8E58, 0x9FD1, 0xEB6E, 0xFAE7, 0xC87C, 0xD9F5,
        0x3183, 0x200A, 0x1291, 0x0318, 0x77A7, 0x662E, 0x54B5, 0x453C,
        0xBDCB, 0xAC42, 0x9ED9, 0x8F50, 0xFBEF, 0xEA66, 0xD8FD, 0xC974,
        0x4204, 0x538D, 0x6116, 0x709F, 0x0420, 0x15A9, 0x2732, 0x36BB,
        0xCE4C, 0xDFC5, 0xED5E, 0xFCD7, 0x8868, 0x99E1, 0xAB7A, 0xBAF3,
        0x5285, 0x430C, 0x7197, 0x601E, 0x14A1, 0x0528, 0x37B3, 0x263A,
        0xDECD, 0xCF44, 0xFDDF, 0xEC56, 0x98E9, 0x8960, 0xBBFB, 0xAA72,
        0x6306, 0x728F, 0x4014, 0x519D, 0x2522, 0x34AB, 0x0630, 0x17B9,
        0xEF4E, 0xFEC7, 0xCC5C, 0xDDD5, 0xA96A, 0xB8E3, 0x8A78, 0x9BF1,
        0x7387, 0x620E, 0x5095, 0x411C, 0x35A3, 0x242A, 0x16B1, 0x0738,
        0xFFCF, 0xEE46, 0xDCDD, 0xCD54, 0xB9EB, 0xA862, 0x9AF9, 0x8B70,
        0x8408, 0x9581, 0xA71A, 0xB693, 0xC22C, 0xD3A5, 0xE13E, 0xF0B7,
        0x0840, 0x19C9, 0x2B52, 0x3ADB, 0x4E64, 0x5FED, 0x6D76, 0x7CFF,
        0x9489, 0x8500, 0xB79B, 0xA612, 0xD2AD, 0xC324, 0xF1BF, 0xE036,
        0x18C1, 0x0948, 0x3BD3, 0x2A5A, 0x5EE5, 0x4F6C, 0x7DF7, 0x6C7E,
        0xA50A, 0xB483, 0x8618, 0x9791, 0xE32E, 0xF2A7, 0xC03C, 0xD1B5,
        0x2942, 0x38CB, 0x0A50, 0x1BD9, 0x6F66, 0x7EEF, 0x4C74, 0x5DFD,
        0xB58B, 0xA402, 0x9699, 0x8710, 0xF3AF, 0xE226, 0xD0BD, 0xC134,
        0x39C3, 0x284A, 0x1AD1, 0x0B58, 0x7FE7, 0x6E6E, 0x5CF5, 0x4D7C,
        0xC60C, 0xD785, 0xE51E, 0xF497, 0x8028, 0x91A1, 0xA33A, 0xB2B3,
        0x4A44, 0x5BCD, 0x6956, 0x78DF, 0x0C60, 0x1DE9, 0x2F72, 0x3EFB,
        0xD68D, 0xC704, 0xF59F, 0xE416, 0x90A9, 0x8120, 0xB3BB, 0xA232,
        0x5AC5, 0x4B4C, 0x79D7, 0x685E, 0x1CE1, 0x0D68, 0x3FF3, 0x2E7A,
        0xE70E, 0xF687, 0xC41C, 0xD595, 0xA12A, 0xB0A3, 0x8238, 0x93B1,
        0x6B46, 0x7ACF, 0x4854, 0x59DD, 0x2D62, 0x3CEB, 0x0E70, 0x1FF9,
        0xF78F, 0xE606, 0xD49D, 0xC514, 0xB1AB, 0xA022, 0x92B9, 0x8330,
        0x7BC7, 0x6A4E, 0x58D5, 0x495C, 0x3DE3, 0x2C6A, 0x1EF1, 0x0F78
    ];
    let fcs = 0xFFFF;
    let i = 0;
    while (i < hexString.length) {
        const strHexNumber = hexString.substring(i, i + 2);
        const intNumber = parseInt(strHexNumber, 16);
        const crc16tabIndex = (fcs ^ intNumber) & 0xFF;
        fcs = (fcs >> 8) ^ table[crc16tabIndex];
        i += 2;
    }
    return fcs ^ 0xFFFF;
};

module.exports = crc16itu;
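For the login example above, crc16itu("05010001") returns 0xD9DC (55772 decimal), matching the expected error check.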
The ITU CRC-16 is also called the X-25 CRC. You can find its specification here, which is:
width=16 poly=0x1021 init=0xffff refin=true refout=true xorout=0xffff check=0x906e name="X-25"
My crcany code will take that specification and generate C code to compute the CRC.
Here is the bit-wise (slow) code thusly generated:
#include <stddef.h>

unsigned crc16x_25_bit(unsigned crc, void const *data, size_t len) {
    if (data == NULL)
        return 0;
    crc = ~crc;
    crc &= 0xffff;
    while (len--) {
        crc ^= *(unsigned char const *)data++;
        for (unsigned k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0x8408 : crc >> 1;
    }
    crc ^= 0xffff;
    return crc;
}
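Note that in this generated code the running CRC is seeded with zero and complemented internally, so calling crc16x_25_bit(0, "\x05\x01\x00\x01", 4) returns 0xD9DC, the error check expected in the protocol example above.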

EBCDIC to ASCII conversion, handling numeric values

I'm attempting to convert files from EBCDIC to ASCII format and have run into an interesting issue. The files contain fixed-length records, with some fields being signed binary integers (described as B4 in the record layout) and long-precision numeric values (described as L8 in the record layout). I've been able to convert character data with no problem, but I'm not sure how to go about converting these numeric values. From a reference manual for the original system (an IBM 5110), the fields are described below.
B indicates the length (2, 4, or 8 bytes) of numeric data items in
fixed-point signed binary integer format that are to be converted to
BASIC internal data format. For record I/O file input, the next 2,
4, or 8 bytes in the record contain a signed binary value to be
converted by the system into internal data format and assigned to the
variable(s) specified in the READ FILE or REREAD FILE statement using
a FORM statement.
and
L indicates long-precision (8 characters) for numeric values. For
input, this entry indicates that an eight-position, long-precision
value in the record is to be assigned without conversion to a
corresponding numeric variable specified in the READ FILE or REREAD
FILE statement.
EDIT: Here's the code I'm using for the conversion
private void ConvertFile(EbcdicFile file)
{
    if (file == null) return;
    var filePath = Path.Combine(file.Path, file.FileName);
    if (!File.Exists(filePath))
    {
        this.Logger.Info(string.Format("Cannot convert file {0}. It does not exist.", filePath));
        return;
    }
    var ebcdic = Encoding.GetEncoding(37);
    string convertedFilepath = Path.Combine(file.Path, file.ConvertedFileName);
    byte[] fileData = File.ReadAllBytes(filePath);
    if (!file.HasNumericFields)
        File.WriteAllBytes(convertedFilepath, Encoding.Convert(ebcdic, Encoding.ASCII, fileData));
    else
    {
        var convertedFileData = new List<byte>();
        for (int position = 0; position < fileData.Length; position += file.RecordLength)
        {
            var segment = new ArraySegment<byte>(fileData, position, file.RecordLength);
            file.Fields.ForEach(field =>
            {
                var fieldSegment = segment.Array.Skip(segment.Offset + field.Start - 1).Take(field.Length);
                if (field.Type.Equals("string", StringComparison.OrdinalIgnoreCase))
                {
                    convertedFileData.AddRange(
                        Encoding.Convert(ebcdic, Encoding.ASCII, fieldSegment.ToArray())
                    );
                }
                else if (field.Type.Equals("B4", StringComparison.OrdinalIgnoreCase))
                {
                    // Not sure how to convert this field
                }
                else if (field.Type.Equals("L8", StringComparison.OrdinalIgnoreCase))
                {
                    // Not sure how to convert this field
                }
            });
        }
        File.WriteAllBytes(convertedFilepath, convertedFileData.ToArray());
    }
}
You must first know the fixed record size. Use FileStream.Read() to read one record worth of bytes. Then Encoding.GetString() to convert it to a string.
Then fish the fields out of the record using String.Substring(). A B4 is simply a Substring call with a length of 4, an L8 one with a length of 8. Further convert such a field to a number with Decimal.Parse(). You may have to divide the result; it wasn't clear what fixed-point multiplier is used. Good odds for 100.
Okay, so I've figured out how to convert both fields. B4 fields are very straightforward. They are essentially a 4-byte array which can be converted to an integer.
// 'by' holds the four bytes of the B4 field taken from the record.
// The IBM 5110 was a big-endian machine, so reverse the array on little-endian hosts.
if (BitConverter.IsLittleEndian)
    Array.Reverse(by);
int value = BitConverter.ToInt32(by, 0);
The L8 fields are 8-byte arrays that represent an IBM double-precision float. There are many ways this can be converted to an IEEE 754 float. A few examples can be found at:
How To Read IBM 370 Data from a Binary File
Transform between IEEE, IBM or VAX floating point number formats and bytes expressions
Here's the version I used based on guidance from the articles.
private double IbmFloatToDouble(byte[] value)
{
    if (ReferenceEquals(null, value))
        throw new ArgumentNullException("value");
    if (BitConverter.ToInt64(value, 0) == 0)
        return 0;
    int exponentBias = 64;
    int ibmBase = 16;
    double sign = 0.0D;
    int signValue = (value[0] & 0x80) >> 7;
    int exponentValue = (value[0] & 0x7f);
    double fraction1 = (value[1] << 16) + (value[2] << 8) + value[3];
    // Cast to long so value[4] << 24 cannot overflow into a negative int.
    double fraction2 = ((long)value[4] << 24) + (value[5] << 16) + (value[6] << 8) + value[7];
    double exponent24 = 16777216.0; // 2^24
    double exponent56 = 72057594037927936.0; // 2^56
    double mantissa1 = fraction1 / exponent24;
    double mantissa2 = fraction2 / exponent56;
    double mantissa = mantissa1 + mantissa2;
    double exponent = Math.Pow(ibmBase, exponentValue - exponentBias);
    if (signValue == 0)
        sign = 1.0;
    else
        sign = -1.0;
    return (sign * mantissa * exponent);
}
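As a quick sanity check (this test vector is mine, not from the file): 0x41 0x10 0x00 0x00 0x00 0x00 0x00 0x00 encodes 1.0 in IBM hexadecimal float, since the exponent byte 0x41 gives 16^(65-64) = 16 and the mantissa bytes give 1/16:

byte[] ibmOne = { 0x41, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
double d = IbmFloatToDouble(ibmOne); // 1.0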

CRC programming help needed, CRC32 conversion from the .NET class to C

Code (written in C):
unsigned long crc_tab[256];   /* global CRC table, filled in by chksum_crc32gentab() */

unsigned long chksum_crc32 (unsigned char *block, unsigned int length)
{
   register unsigned long crc;
   unsigned long i;

   crc = 0xFFFFFFFF;
   for (i = 0; i < length; i++)
   {
      crc = ((crc >> 8) & 0x00FFFFFF) ^ crc_tab[(crc ^ *block++) & 0xFF];
   }
   return (crc ^ 0xFFFFFFFF);
}

/* chksum_crc32gentab() -- fills the global crc_tab[256] with the
 * lookup table for CRC32 checksums, generated from the polynomial below.
 */
void chksum_crc32gentab ()
{
   unsigned long crc, poly;
   int i, j;

   poly = 0xEDB88320L;
   for (i = 0; i < 256; i++)
   {
      crc = i;
      for (j = 8; j > 0; j--)
      {
         if (crc & 1)
         {
            crc = (crc >> 1) ^ poly;
         }
         else
         {
            crc >>= 1;
         }
      }
      crc_tab[i] = crc;
   }
}
For starters: I know how CRC works. First the divisor is calculated with a specified polynomial, then this FCS (frame check sequence) is appended to the data set and sent to the end user's system. Once the transfer is finished, the FCS is checked with the same polynomial used to calculate it, and if the remainder of the data with that divisor is zero, then you know the data is correct.
I do not understand the implementation of these two functions. From what I have learned, the function chksum_crc32gentab() generates all the possible hex values the checksum could take with the 32-bit CRC polynomial. One thing I don't get is how poly = 0xEDB88320L; is equivalent to a polynomial. I don't understand the logic at the bottom of this function either. For example, the conditional if (crc & 1), does this mean that for every bit in crc that is 1, compute, otherwise shift right one bit?
I also do not understand chksum_crc32(unsigned char *block, unsigned int length);. Does this function just take in a string of bytes and convert them to the proper CRC value computed with the table? I guess I am confused about the logic it uses within the for loop.
If anyone understands this code, an explanation would be great; this does work for the CRC32 conversion from the .NET class. An example of how data is converted and then used by these functions would be something like:
(C# source)
MemoryStream ms = new MemoryStream(System.Text.Encoding.Default.GetBytes(input));
foreach (byte b in crc32.ComputeHash(ms))
    hash += b.ToString("x2").ToLower();
Here is the original site and project the C code was taken from. http://www.codeproject.com/Articles/35134/How-to-calculate-CRC-in-C
Any explanation would help
Or just google it... Second hit is: http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/libkern/crc32.c
Backporting it from C# is the hard way to do it; most of these algorithms are already in C.
In CRC calculations, binary polynomials, which are sums of x^n with either a 0 or 1 coefficient, are represented simply as binary words where the position of the 0 or 1 indicates which power of x it is a coefficient of.
0xEDB88320L represents the coefficients of the CRC32 polynomial as 1's where there is an x^n term (except for the x^32 term, which is left out). The CRC32 polynomial (why oh why doesn't stackoverflow have TeX equations like math.stackexchange -- I can't write decent equations here! sigh, sorry for the rant ...) is:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
Because of how this CRC is defined with respect to bit-ordering, the lowest coefficients are in the highest bits. So the first E in the hex constant above is 1110 representing (in order from left to right in the bits), 1 + x + x^2.
You can find the construction in the crc32.c source file of zlib, from which a snippet is shown here:
static const unsigned char p[] = {0,1,2,4,5,7,8,10,11,12,16,22,23,26};

/* make exclusive-or pattern from polynomial (0xedb88320UL) */
poly = 0;
for (n = 0; n < (int)(sizeof(p)/sizeof(unsigned char)); n++)
    poly |= (z_crc_t)1 << (31 - p[n]);

/* generate a crc for every 8-bit value */
for (n = 0; n < 256; n++) {
    c = (z_crc_t)n;
    for (k = 0; k < 8; k++)
        c = c & 1 ? poly ^ (c >> 1) : c >> 1;
    crc_table[0][n] = c;
}
The if (crc & 1) or c & 1 ? above looks at the low bit of the CRC at each step before it is shifted away. That is effectively a carry bit for the polynomial subtraction operation, so if it is a one, the polynomial is subtracted (exclusive-ored) from the shifted down polynomial in the CRC (multiplied by x). The CRC is shifted down whether the low bit is 1 or not.
The chksum_crc32() function that you show indeed computes the CRC on the provided block of data. It is the standard table-based approach for CRC calculations on strings of bytes, which indexes the table by the exclusive-or of the data byte and the low byte of the CRC. This does the same thing as shifting in a bit at a time and applying the polynomial for 1 bits, but does it in one step instead of eight. The CRC is effectively multiplied by x^8 (the >> 8), and is exclusive-ored with the effect of exclusive-oring with the polynomial 0 to 8 times at various shifted locations depending on the index value. It is simply a speed trick using a pre-computed table.
You can find even more extreme speed tricks used in zlib's crc32.c, which uses larger tables and processes more data at a time.

How to generate a LONG guid?

I would like to generate a long UUID - something like the session key used by gmail. It should be at least 256 chars and no more than 512. It can contain all alpha-numeric chars and a few special chars (the ones below the function keys on the keyboard). Has this been done already or is there a sample out there?
C++ or C#
Update: A GUID is not enough. We have already been seeing collisions and need to remedy this. 512 is the max as of now, because it will prevent us from changing stuff that has already shipped.
Update 2: For the guys who are insisting on how unique the GUID is: if someone wants to guess your next session ID, they don't have to compute the combinations for the next 1 trillion years. All they have to do is constrain the time factor, and they will be done in hours.
If your GUIDs are colliding, may I ask how you're generating them?
It is astronomically improbable that GUIDs would collide as they are based on:
60 bits - timestamp during generation
48 bits - computer identifier
14 bits - unique ID
6 bits are fixed
You would have to run the GUID generation on the same machine about 50 times in the exact same instant in time in order to have a 50% chance of collision. Note that instant is measured down to nanoseconds.
Update:
As per your comment "putting GUIDs into a hashtable"... the GetHashCode() method is what is causing the collision, not the GUIDs:
public override int GetHashCode()
{
    return ((this._a ^ ((this._b << 0x10) | ((ushort) this._c))) ^ ((this._f << 0x18) | this._k));
}
You can see it returns an int, so if you have more than 2^32 "GUIDs" in the hashtable, you are 100% going to have a collision.
As per your update 2, you are correct that GUIDs are predictable; even MSDN references that. Here is a method that uses a cryptographically strong random number generator to create the ID.
static long counter; // store and load the counter from persistent storage every time the program loads or closes.

public static string CreateRandomString(int length)
{
    long count = System.Threading.Interlocked.Increment(ref counter);
    int PasswordLength = length;
    String _allowedChars = "abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNOPQRSTUVWXYZ23456789";
    Byte[] randomBytes = new Byte[PasswordLength];
    RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
    rng.GetBytes(randomBytes);
    char[] chars = new char[PasswordLength];
    int allowedCharCount = _allowedChars.Length;
    for (int i = 0; i < PasswordLength; i++)
    {
        // Rejection sampling: discard bytes from the top partial range so each
        // character stays equally likely (256 is not a multiple of allowedCharCount).
        while (randomBytes[i] >= 256 - (256 % allowedCharCount))
        {
            byte[] tmp = new byte[1];
            rng.GetBytes(tmp);
            randomBytes[i] = tmp[0];
        }
        chars[i] = _allowedChars[(int)randomBytes[i] % allowedCharCount];
    }
    byte[] buf = new byte[8];
    buf[0] = (byte) count;
    buf[1] = (byte) (count >> 8);
    buf[2] = (byte) (count >> 16);
    buf[3] = (byte) (count >> 24);
    buf[4] = (byte) (count >> 32);
    buf[5] = (byte) (count >> 40);
    buf[6] = (byte) (count >> 48);
    buf[7] = (byte) (count >> 56);
    return Convert.ToBase64String(buf) + new string(chars);
}
EDIT: I know there is some bias because 256 is not evenly divisible by allowedCharCount; you can get rid of it by throwing the byte away and drawing a new random one whenever it lands in the no-man's-land of the remainder.
EDIT 2: This is not guaranteed to be unique. You could hold a static 64-bit (or higher if necessary) monotonic counter, encode it to base64 and have that be the first few characters of the id.
UPDATE: Now guaranteed to be unique.
UPDATE 2: The algorithm is now slower, but the biasing is removed.
EDIT: I just ran a test; I wanted to let you know that ToBase64String can return non-alphanumeric characters (1 encodes to "AQAAAAAAAAA=", for example), just so you are aware.
New version:
Taking from Matt Dotson's answer on this page, if you are not so worried about the keyspace, you can do it this way and it will run a LOT faster.
public static string CreateRandomString(int length)
{
    length -= 12; // 12 digits are the counter
    if (length <= 0)
        throw new ArgumentOutOfRangeException("length");
    long count = System.Threading.Interlocked.Increment(ref counter);
    Byte[] randomBytes = new Byte[length * 3 / 4];
    RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
    rng.GetBytes(randomBytes);
    byte[] buf = new byte[8];
    buf[0] = (byte)count;
    buf[1] = (byte)(count >> 8);
    buf[2] = (byte)(count >> 16);
    buf[3] = (byte)(count >> 24);
    buf[4] = (byte)(count >> 32);
    buf[5] = (byte)(count >> 40);
    buf[6] = (byte)(count >> 48);
    buf[7] = (byte)(count >> 56);
    return Convert.ToBase64String(buf) + Convert.ToBase64String(randomBytes);
}
StringBuilder sb = new StringBuilder();
for (int i = 0; i < HOW_MUCH_YOU_WANT / 32; i++)
    sb.Append(Guid.NewGuid().ToString("N"));
return sb.ToString();
but what for?
The problem here is why, not how. A session ID bigger than a GUID is useless, because it's already big enough to thwart brute force attacks.
If you're concerned about predicting GUIDs, don't be. Unlike the earlier, sequential GUIDs, V4 GUIDs are cryptographically secure, based on RC4. The only exploit I know about depends on having full access to the internal state of the process that's generating the values, so it can't get you anywhere if all you have is a partial sequence of GUIDs.
If you're paranoid, generate a GUID, hash it with something like SHA-1, and use that value. However, this is a waste of time. If you're concerned about session hijacking, you should be looking at SSL, not this.
byte[] random = new Byte[384];
// RNGCryptoServiceProvider is an implementation of a random number generator.
RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
rng.GetBytes(random);
var sessionId = Convert.ToBase64String(random);
You can replace the "/" and "=" from the base64 encoding to be whatever special characters are acceptable to you.
Base64 encoding creates a string that is 4/3 larger than the byte array (hence the 384 bytes should give you 512 characters).
This should give you orders of magnitude more values than a base16 (hex) encoded GUID: 512^16 vs 512^64.
Also, if you are putting these in SQL Server, make sure to turn OFF case insensitivity, since Base64 is case-sensitive.
There are two really easy ways (C#):
1) Generate a bunch of GUIDs using Guid.NewGuid().ToString("N"). Each GUID will be 32 characters long, so just generate 8 of them and concatenate them to get 256 chars.
2) Create a constant string (const string sChars = "abcdef") of acceptable characters you'd like in your UID. Then, in a loop, randomly pick characters from that string by generating a random number from 0 to the length of sChars, and concatenate them into a new string (use a StringBuilder to make it more performant, but string will work too); see the sketch below.
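A minimal sketch of option 2 (the names are mine; System.Random is fine for illustration, but substitute RNGCryptoServiceProvider if the value will guard sessions):

public static string RandomUid(int length)
{
    const string sChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    var sb = new StringBuilder(length);
    var rng = new Random();
    for (int i = 0; i < length; i++)
        sb.Append(sChars[rng.Next(sChars.Length)]); // random index into the allowed set
    return sb.ToString();
}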
You may want to check out boost's Uuid Library. It supports a variety of generators, including a random generator that might suit your needs.
I would use some kind of hash of std::time(), probably SHA-512.
Example (using Crypto++ for the SHA hash + Base64 encoding):
#include <iostream>
#include <sstream>
#include <ctime>
#include <crypto++/sha.h>
#include <crypto++/base64.h>

int main() {
    std::string digest;
    std::stringstream ss("");
    ss << std::time(NULL);

    // borrowed from http://www.cryptopp.com/fom-serve/cache/50.html
    CryptoPP::SHA512 hash;
    CryptoPP::StringSource foo(ss.str(), true,
        new CryptoPP::HashFilter(hash,
            new CryptoPP::Base64Encoder(
                new CryptoPP::StringSink(digest))));

    std::cout << digest << std::endl;
    return 0;
}
https://github.com/bigfatsea/SUID Simple Unique Identifier
It's in Java, but it can easily be ported to any other language. You may expect duplicated ids on the same instance 136 years later; good enough for medium-small projects.
Example:
long id = SUID.id().get();

Unpacking EBCDIC Packed Decimals (COMP-3) in an ASCII Conversion

I am using Jon Skeet's EBCDIC implementation in .NET to read a VSAM file downloaded in binary mode with FTP from a mainframe system. It works very well for reading/writing in this encoding, but it does not have anything to read packed-decimal values. My file contains these, and I need to unpack them (at the cost of more bytes, obviously).
How can I do this?
My fields are defined as PIC S9(7)V99 COMP-3.
Ahh, BCD. Honk if you used it in 6502 assembly.
Of course, the best bet is to let the COBOL MOVE do the job for you! One of these possibilities may help.
(Possibility #1) Assuming you do have access to the mainframe and the source code, and the output file is ONLY for your use, modify the program so it just MOVEs the value to a plain unpacked PIC S9(7)V99.
(Possibility #2) Assuming it's not that easy, (e.g., file is input for other pgms, or can't change the code), you can write another COBOL program on the system that reads that file and writes another. Cut and paste the file record layout with the BCD into the new program for input and output files. Modify the output version to be non-packed. Read a record, do a 'move corresponding' to transfer the data, and write, until eof. Then transfer that file.
(Possibility #3) If you can't touch the mainframe, note the description in the article you linked in your comment. BCD is relatively simple. It could be as easy as this (VB.NET):
Private Function FromBCD(ByVal BCD As String, ByVal intsz As Integer, ByVal decsz As Integer) As Decimal
    Dim PicLen As Integer = intsz + decsz
    Dim result As Decimal = 0
    Dim val As Integer = Asc(Mid(BCD, 1, 1))
    Do While PicLen > 0
        result *= 10D
        result += val \ 16
        PicLen -= 1
        If PicLen > 0 Then
            result *= 10D
            result += val Mod 16
            PicLen -= 1
            BCD = Mid(BCD, 2)
        End If
        val = Asc(Mid(BCD, 1, 1))
    Loop
    If val Mod 16 = &HD& Then
        result = -result
    End If
    Return result / CDec(10 ^ decsz)
End Function
I tested it with a few variations of this call:
MsgBox(FromBCD("@" & Chr(13 + 16), 2, 1))
E.g., this one is -40.1 ("@" is &H40, giving digits 4 and 0, and Chr(29) is &H1D, giving digit 1 plus the negative sign nibble D). But I only tested a few, so it might still be wrong.
So then, if your comp-3 starts, say, at byte 10 of the input record layout, this would solve it:
Dim valu As Decimal = FromBCD(Mid(InputLine, 10, 5), 7, 2)
Noting the formulas from the data-conversion article for the # of bytes to send in, and the # of 9's before and after the V.
Store the result in a Decimal to avoid rounding errors, especially if it's $$$. Float and Double WILL cause you grief! If you're not processing it, even a string is better.
Of course it could be harder. Where I work, the mainframe is 9 bits per byte. Serious. That's what makes the first 2 possibilities so salient. Of course what really makes them better is the fact that you may be a PC-only programmer, and this is a great excuse to get a mainframe programmer to do the work for you! If you are so lucky as to have that option...
Peace,
-Al
I use this extension method for packed decimal (BCD) conversion:
/// <summary>
/// Computes the actual decimal value from an IBM "Packed Decimal" 9(x)v99 (COBOL COMP-3) format.
/// </summary>
/// <param name="value">byte[]</param>
/// <param name="precision">byte; decimal places, default 2</param>
/// <returns>decimal</returns>
public static decimal FromPackedDecimal(this byte[] value, byte precision = 2)
{
    if (value.Length < 1)
    {
        throw new System.InvalidOperationException("Cannot unpack empty bytes.");
    }
    double power = System.Math.Pow(10, precision);
    if (power > long.MaxValue)
    {
        throw new System.InvalidOperationException(
            $"Precision too large for valid calculation: {precision}");
    }
    string hex = System.BitConverter.ToString(value).Replace("-", "");
    var nibbles = Enumerable.Range(0, hex.Length)
        .Select(x => System.Convert.ToByte($"0{hex.Substring(x, 1)}", 16))
        .ToList();
    long place = 1;
    decimal ret = 0;
    // Accumulate digits from the last digit nibble (ones place) back to the first.
    for (int i = nibbles.Count - 2; i > -1; i--)
    {
        ret += (nibbles[i] * place);
        place *= 10;
    }
    ret /= (long)power;
    // The low nibble of the last byte is the sign: 0xD (or 0xB) means negative.
    byte sign = nibbles.Last();
    return sign == 0x0D || sign == 0x0B ? -ret : ret;
}
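For example (my own test vector): the five bytes 0x12 0x34 0x56 0x78 0x9D hold the digits 1234567.89 of a PIC S9(7)V99 COMP-3 field with a 0xD (negative) sign nibble:

var packed = new byte[] { 0x12, 0x34, 0x56, 0x78, 0x9D };
decimal d = packed.FromPackedDecimal(); // -1234567.89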
