Reading binary file into struct

Reading binary file into struct - c#

I know this has been answered but after reading the other questions I'm still with no solution. I have a file which was written with the following C++ struct:
typedef struct myStruct{
char Name[127];
char s1[2];
char MailBox[149];
char s2[2];
char RouteID[10];
} MY_STRUCT;
My approach was to be able to parse one field at a time in the struct, but my issue is that I cannot get s1 and MailBox to parse correctly. In the file, the s1 field contains "\r\n" (binary 0D0A), and this causes my parsing code to not parse the MailBox field correctly. Here's my parsing code:
[StructLayout(LayoutKind.Explicit, Size = 0x80 + 0x2 + 0x96)]
unsafe struct MY_STRUCT
{
[FieldOffset(0)]
[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 0x80)]
public string Name;
[FieldOffset(0x80)]
public fixed char s1[2];
/* Does not work, "Could not load type 'MY_STRUCT' ... because it contains an object field at offset 130 that is incorrectly aligned or overlapped by a non-object field." */
[FieldOffset(0x80 + 0x2)]
[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 0x96)]
public string MailBox;
}
If I comment out the last field and reduce the struct's size to 0x80+0x2 it will work correctly for the first two variables.
One thing to note is that the Name and Mailbox strings contain the null terminating character, but since s1 doesn't have the null-terminating character it seems to be messing up the parser, but I don't know why because to me it looks like the code is explicitly telling the Marshaler that the s1 field in the struct is only a fixed 2-char buffer, not a null-terminated string.
Here is a pic of my test data (in code I seek past the first row in the BinaryReader, so "Name" begins at 0x0, not 0x10).

Here's one way, it doesn't use unsafe (nor is it particularly elegant/efficient)
using System.Text;
using System.IO;
namespace ReadCppStruct
{
/*
typedef struct myStruct{
char Name[127];
char s1[2];
char MailBox[149];
char s2[2];
char RouteID[10];
} MY_STRUCT;
*/
class MyStruct
{
public string Name { get; set; }
public string MailBox { get; set; }
public string RouteID { get; set; }
}
class Program
{
static string GetString(Encoding encoding, byte[] bytes, int index, int count)
{
string retval = encoding.GetString(bytes, index, count);
int nullIndex = retval.IndexOf('\0');
if (nullIndex != -1)
retval = retval.Substring(0, nullIndex);
return retval;
}
static MyStruct ReadStruct(string path)
{
byte[] bytes = File.ReadAllBytes(path);
var utf8 = new UTF8Encoding();
var retval = new MyStruct();
int index = 0; int cb = 127;
retval.Name = GetString(utf8, bytes, index, cb);
index += cb + 2;
cb = 149;
retval.MailBox = GetString(utf8, bytes, index, cb);
index += cb + 2;
cb = 10;
retval.RouteID = GetString(utf8, bytes, index, cb);
return retval;
} // http://stackoverflow.com/questions/30742019/reading-binary-file-into-struct
static void Main(string[] args)
{
MyStruct ms = ReadStruct("MY_STRUCT.data");
}
}
}

Here's how I got it to work for me:
public static unsafe string BytesToString(byte* bytes, int len)
{
return new string((sbyte*)bytes, 0, len).Trim(new char[] { ' ' }); // trim trailing spaces (but keep newline characters)
}
[StructLayout(LayoutKind.Explicit, Size = 127 + 2 + 149 + 2 + 10)]
unsafe struct USRRECORD_ANSI
{
[FieldOffset(0)]
public fixed byte Name[127];
[FieldOffset(127)]
public fixed byte s1[2];
[FieldOffset(127 + 2)]
public fixed byte MailBox[149];
[FieldOffset(127 + 2 + 149)]
public fixed byte s2[2];
[FieldOffset(127 + 2 + 149 + 2)]
public fixed byte RouteID[10];
}
After the struct has been parsed, I can access the strings by calling the BytesToString method, e.g. string name = BytesToString(record.Name, 127);
I did notice that I don't need the Size attribute in the StructLayout, I'm not sure if it's the best practice to keep it or remove it, ideas?

Your struct sizes are not adding up correctly. The MailBox size is 0x95 as listed in MY_STRUCT, not 0x96 as you're calling it in the C# code.

Related

Bytes sequence problem during Marshaling in C#

I want to use my laptop to communicate with MES(Manufacturing Execution System).
And when I serialized the data (struct type), something happen.
The code below is what I have done:
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct DataPackage
{
public int a;
public ushort b;
public byte c;
[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 5)] public string d;
}
class Program
{
static void Main(string[] args)
{
DataPackage pack1 = new DataPackage();
pack1.a = 0x33333301;
pack1.b = 200;
pack1.c = 21;
pack1.d = "hello";
byte[] pack1_serialized = getBytes(pack1);
Console.WriteLine(BitConverter.ToString(pack1_serialized));
byte[] getBytes(DataPackage str)
{
int size = Marshal.SizeOf(str);
byte[] arr = new byte[size];
IntPtr ptr = Marshal.AllocHGlobal(size);
Marshal.StructureToPtr(str, ptr, true);
Marshal.Copy(ptr, arr, 0, size);
Marshal.FreeHGlobal(ptr);
return arr;
}
}
}
And here is the outcome:
I want the outcome to be like this:
33-33-33-01-00-C8-15-68-65-6C-6C-6F
So the questions are:
Why is the uint / ushort type data reverse after Marshalling?
Is there any other way that I can send the data in the sequence that I want ?
Why is the last word "o" in string "hello" disappear in the byte array ?
Thanks.

1 - Because your expected outcome is big endian, and your system appears to use little endian, so basically reversed order of bytes compared to what you expect.
2- Easiest way is to "convert" your numbers to big endian before marshalling (that is change them in a way which will produce desired result while converting them using little endian), for example like this:
static int ToBigEndianInt(int x) {
if (!BitConverter.IsLittleEndian)
return x; // already fine
var ar = BitConverter.GetBytes(x);
Array.Reverse(ar);
return BitConverter.ToInt32(ar, 0);
}
static ushort ToBigEndianShort(ushort x) {
if (!BitConverter.IsLittleEndian)
return x; // already fine
var ar = BitConverter.GetBytes(x);
Array.Reverse(ar);
return BitConverter.ToUInt16(ar, 0);
}
And then:
pack1.a = ToBigEndianInt(0x33333301);
pack1.b = ToBigEndianShort(200);
Note that this way of conversion is not very efficient and if you need more perfomance you can do this with some bit manipulations.
3 - Because string is null terminated, and this null terminator counts in SizeConst. Since you have it 5, there will be 4 characters of your string + 1 null terminator. Just increase SizeConst = 6 (that might add additional zeroes at the end because of Pack = 4).

Creating a managed structure for data

Last night I was working on cleaning up some code I found that was a port of an old game I used to play. One of the tricks I used to clean up data was to get rid of the home-brew DataOffsetAttribute that was made for a struct and just make it a plain jane struct and then I can (later) convert it to something a little more useful. Kind of like a person would do with a DataLayer. This worked really good for fixed sized data from a file. Using this method I was about to convert between the two "data types".
public static byte[] ToByteArray<T>(this T dataStructure) where T : struct
{
int size = Marshal.SizeOf(dataStructure);
byte[] arr = new byte[size];
IntPtr ptr = Marshal.AllocHGlobal(size);
Marshal.StructureToPtr(dataStructure, ptr, true);
Marshal.Copy(ptr, arr, 0, size);
Marshal.FreeHGlobal(ptr);
return arr;
}
public static T MarshalAs<T>(this byte[] rawDataStructure) where T : struct
{
var type = typeof(T);
int size = Marshal.SizeOf(type);
IntPtr ptr = Marshal.AllocHGlobal(size);
Marshal.Copy(rawDataStructure, 0, ptr, size);
T structure = (T)Marshal.PtrToStructure(ptr, type);
Marshal.FreeHGlobal(ptr);
return structure;
}
So then I started to wonder if this would work on another project I worked on a long time ago where the data was variable. Here is what I was hoping my data structure would look like
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RfidReaderResponse
{
public byte PreambleA;
public byte PreambleB;
public byte Length;
public byte LengthLRC;
public ushort ReaderId;
public byte CommandId;
public byte ErrorCode;
public byte[] Data;
public byte LRC;
}
I would probably combine the preamble bytes into a ushort and check if it is a specific value... but that is a different discussion. I have 3 known good responses that I used for testing back in the day. I put those into linqpad as well as my hopeful datastructure. So here is what I am currently using for testing
void Main()
{
var readerResponses = new string[]
{
//data is null because of error EC
"AA-BB-05-05-02-39-0C-EC-21",
//data is 44-00
"AA-BB-07-07-02-39-0C-00-44-00-8B",
//data is 44-00-20-04-13-5E-1A-A4-33-80
"AA-BB-0F-0F-02-39-10-00-44-00-20-04-13-5E-1A-A4-33-80-FB",
};
readerResponses
.Select(x=> x.ToByteArray().MarshalAs<RfidReaderResponse>())
.Dump();
}
now if I comment out the last two fields of the struct I get back what I expect for the first part of the response, but I am just lost on how to get back the data portion. I would prefer to have it as I have it with some magical attribute that the Marshaler understands but so far I can't figure it out. I have tried
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RfidReaderResponse
{
public byte PreambleA;
public byte PreambleB;
public byte Length;
public byte LengthLRC;
[MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]
public IntPtr Data;
}
and
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RfidReaderResponse
{
public byte PreambleA;
public byte PreambleB;
public byte Length;
public byte LengthLRC;
public ushort ReaderId;
public byte CommandId;
public byte ErrorCode;
[MarshalAs(UnmanagedType.SafeArray, SafeArraySubType=VarEnum.VT_I1)]
public byte[] Data;
public byte LRC;
}
after a few hours of research. Both of which didn't work stating that my method couldn't make a size for my structure. So I dunno. What am I doing wrong?
EDIT
forgot the method for converting hex strings to byte arrays
public static byte[] ToByteArray(this string hexString)
{
hexString = System.Text.RegularExpressions.Regex.Replace(hexString, "[^0-9A-F.]", "").ToUpper();
if (hexString.Length % 2 == 1)
throw new Exception("The binary key cannot have an odd number of digits");
byte[] arr = new byte[hexString.Length >> 1];
for (int i = 0; i < hexString.Length >> 1; ++i)
{
arr[i] = (byte)((GetHexVal(hexString[i << 1]) << 4) + (GetHexVal(hexString[(i << 1) + 1])));
}
return arr;
}
private static int GetHexVal(char hex)
{
int val = (int)hex;
return val - (val < 58 ? 48 : 55);
}

Remove junk value from string

I am getting data from Device(Time attendance) using C++ library in C# 4.0, issue is that with name field have some junk value.
Name field is byte array and I had try using Encoding.Default.GetString(user.Name), here user is a Struct.
[StructLayout(LayoutKind.Sequential, Size = 48, CharSet = CharSet.Ansi), Serializable]
public struct User
{
public int ID;
[MarshalAsAttribute(UnmanagedType.ByValArray, SizeConst = 12)]
public byte[] Name;
}
Output
"Jon\0 41 0"
"rakesh\0 6"
I want to remove \0 41 0 and \0 6.
Any help would be appreciated.

Keep it simple:
static class StringExtensions
{
public static string TrimNullTerminatedString(this string s)
{
if (s == null)
throw new NotImplementedException();
int i = s.IndexOf('\0');
if (i >= 0)
return s.Substring(0, i);
return s;
}
}
Use it like this:
string name = Encoding.Default.GetString(user.Name).TrimNullTerminatedString();
That being said, a better option would be to handle that at declaration level. If Name is a string, there is no reason to declare it as byte[]; declare it as a string, and the null terminating character will be handled properly:
[MarshalAsAttribute(UnmanagedType.ByValTStr, SizeConst = 12)]
public string Name;
It would also be easier to manipulate in code...

RegEx is a best way for removing junk value, In this example with W I remove all character that is not word,
textBox1.Text = Regex.Replace("rakesh\0 6", "W", "");
You can find complete library for regex on http://regexlib.com/

do it like this
Regex re = New Regex("[\x0A\x0D]", RegexOptions.Compiled)
str = re.Replace(str.Trim(), String.Empty)
OR
string str1="";
for(int i = 0 ; i < str.lengh ; i++) {
if(!char.IsLetter(str[i])
str1 += str[i];
}
return str1

You are dealing with null-terminated strings. So you want to strip zero byte and all bytes after the zero byte in your arrays before passing it to Encoding.Default.GetString(byte[]).
Update:
Example code (may be not very optimal):
static byte[] RemoveJunk(byte[] input)
{
var end = Array.IndexOf(input, (byte)0);
Console.WriteLine(end);
if (end < 0)
return input;
var result = new byte[end];
Array.Copy(input, result, end);
return result;
}

Serializing & Deserializing a Struct Array

In the following bit of code, I am observing that the marshaler is reading past the 3 byte source array to populate another 8 bytes of data. With time, the code eventually throws a memory access violation. Is there a way to tell the marshaller to only marshal the first 3 bytes when converting a pointer to a structure? If I make the Articulations array "NonSerialized" then the constructor will throw an access violation when processing a source array of 11 bytes.
using System;
using System.Runtime.InteropServices;
namespace MarshallingTest
{
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Articulation
{
public const int BaseSize = 8;
public byte attribute1;
public byte attribute2;
public byte attribute3;
public byte attribute4;
public byte attribute5;
public byte attribute6;
public byte attribute7;
public byte attribute8;
}
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public class TestEntity
{
public const int BaseSize = 3;
public byte EntityId; // 1 byte
public byte Variable; // 1 byte
public byte NumArticulations; // 1 byte
[MarshalAs(UnmanagedType.ByValArray, ArraySubType = UnmanagedType.Struct)]
public Articulation[] Articulations; // 8 bytes each
public TestEntity(byte[] rawData)
{
unsafe
{
fixed (byte* pData = rawData)
{
// I am observing that the marshaler is reading past the 3 bytes
// to populate another 8 bytes of data. With time, the code
// will eventually throw a memory access violation.
//
// Is there a way to tell the marshaller to only marshal the
// first 3 bytes when converting a pointer to a structure?
Marshal.PtrToStructure((IntPtr) pData, this);
for (int i = 0; i < BaseSize + Articulation.BaseSize; i++)
{
Console.WriteLine("pData + " + i + " = " + *(pData + i));
}
}
}
}
public byte[] ToRaw()
{
byte[] byteArray = new byte[BaseSize + Articulation.BaseSize*Articulations.Length];
unsafe
{
fixed (byte* pData = byteArray)
{
Marshal.StructureToPtr(this, (IntPtr) pData, false);
}
}
return byteArray;
}
}
internal class Program
{
private const int TestDataSize = TestEntity.BaseSize;
private static void Main()
{
byte[] testData = new byte[TestDataSize];
for (int i = 0; i < TestDataSize; i++)
{
testData[i] = (byte) (i + 1);
}
TestEntity test = new TestEntity(testData);
// Print resulting array. You'll see that data outside the source
// byte array was marshalled into the test structure.
Console.WriteLine(test.EntityId);
Console.WriteLine(test.Variable);
Console.WriteLine(test.NumArticulations);
Console.WriteLine(test.Articulations[0].attribute1);
Console.WriteLine(test.Articulations[0].attribute2);
Console.WriteLine(test.Articulations[0].attribute3);
Console.WriteLine(test.Articulations[0].attribute4);
Console.WriteLine(test.Articulations[0].attribute5);
Console.WriteLine(test.Articulations[0].attribute6);
Console.WriteLine(test.Articulations[0].attribute7);
Console.WriteLine(test.Articulations[0].attribute8);
Console.WriteLine("Test complete.");
}
}
}

Are constants pinned in C#?

I'm working in C# with a Borland C API that uses a lot of byte pointers for strings. I've been faced with the need to pass some C# strings as (short lived) byte*.
It would be my natural assumption that a const object would not be allocated on the heap, but would be stored directly in program memory, but I've been unable to verify this in any documentation.
Here's an example of what I've done in order to generate a pointer to a constant string. This does work as intended in testing, I'm just not sure if it's really safe, or it's only working via luck.
private const string pinnedStringGetWeight = "getWeight";
unsafe public static byte* ExampleReturnWeightPtr(int serial)
{
fixed (byte* pGetWeight = ASCIIEncoding.ASCII.GetBytes(pinnedStringGetWeight))
return pGetWeight;
}
Is this const really pinned, or is there a chance it could be moved?
#Kragen:
Here is the import:
[DllImport("sidekick.dll", CallingConvention = CallingConvention.Winapi)]
public static extern int getValueByFunctionFromObject(int serial, int function, byte* debugCallString);
This is the actual function. Yes, it actually requires a static function pointer:
private const int FUNC_GetWeight = 0x004243D0;
private const string pinnedStringGetWeight = "getWeight";
unsafe public static int getWeight(int serial)
{
fixed (byte* pGetWeight = ASCIIEncoding.ASCII.GetBytes(pinnedStringGetWeight))
return Core.getValueByFunctionFromObject(serial, FUNC_GetWeight, pGetWeight);
}
Following is another method that I used when mocking my API, using a static struct, which I also hoped was pinned. I was hoping to find a way to simplify this.
public byte* getObjVarString(int serial, byte* varName)
{
string varname = StringPointerUtils.GetAsciiString(varName);
string value = MockObjVarAttachments.GetString(serial, varname);
if (value == null)
return null;
return bytePtrFactory.MakePointerToTempString(value);
}
static UnsafeBytePointerFactoryStruct bytePtrFactory = new UnsafeBytePointerFactoryStruct();
private unsafe struct UnsafeBytePointerFactoryStruct
{
fixed byte _InvalidScriptClass[255];
fixed byte _ItemNotFound[255];
fixed byte _MiscBuffer[255];
public byte* InvalidScriptClass
{
get
{
fixed (byte* p = _InvalidScriptClass)
{
CopyNullString(p, "Failed to get script class");
return p;
}
}
}
public byte* ItemNotFound
{
get
{
fixed (byte* p = _ItemNotFound)
{
CopyNullString(p, "Item not found");
return p;
}
}
}
public byte* MakePointerToTempString(string text)
{
fixed (byte* p = _ItemNotFound)
{
CopyNullString(p, text);
return p;
}
}
private static void CopyNullString(byte* ptrDest, string text)
{
byte[] textBytes = ASCIIEncoding.ASCII.GetBytes(text);
fixed (byte* p = textBytes)
{
int i = 0;
while (*(p + i) != 0 && i < 254 && i < textBytes.Length)
{
*(ptrDest + i) = *(p + i);
i++;
}
*(ptrDest + i) = 0;
}
}
}

How constants are allocated shouldn't matter in this case, because ASCIIEncoding.ASCII.GetBytes() returns a new byte array (it cannot return the constant's internal array, since it is encoded differently (edit: there is hopefully no way to get a pointer to a string's internal array, since strings are immutable)). However, the guarantee that the GC won't touch the array only lasts as long as the fixed scope does - in other words, when the function returns, the memory is no longer pinned.

Based on Kragans comment, I looked into the proper way to marshall my string to a byte pointer and am now using the following for the first example I used in my question:
[DllImport("sidekick.dll", CallingConvention = CallingConvention.Winapi)]
public static extern int getValueByFunctionFromObject(int serial, int function, [MarshalAs(UnmanagedType.LPStr)]string debugCallString);

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Reading binary file into struct - c#

Your struct sizes are not adding up correctly. The MailBox size is 0x95 as listed in MY_STRUCT, not 0x96 as you're calling it in the C# code.

Related

Bytes sequence problem during Marshaling in C#

Creating a managed structure for data

Remove junk value from string

Serializing & Deserializing a Struct Array

Are constants pinned in C#?

Categories

Resources