C# Struct Layouts and unexpected benchmarking result? - c#

I'm not really micro-managing the performance of an application, but I'm curious about the scenario below.
For structs, the C# compiler uses LayoutKind.Sequential by default. This means the fields stay in the order defined by the programmer. I believe this is to support interoperability with unmanaged code. However, most user-defined structs have nothing to do with interoperability. I have read that, for better performance, we can explicitly specify LayoutKind.Auto and let the CLR decide the best possible layout. To test this, I ran a quick benchmark on both layouts. However, my results say the default layout (LayoutKind.Sequential) is a bit quicker than the explicit layout (LayoutKind.Auto). I was expecting the reverse.
Below is the test I ran on my machine (x86 running .NET 4)
//uses LayoutKind.Sequential by default
public struct StructSeq
{
private readonly Byte mb;
private readonly Int16 mx;
public string a;
public string b;
public string c;
public string d;
}
[StructLayout(LayoutKind.Auto)]
public struct StructAuto
{
private readonly Byte mb;
private readonly Int16 mx;
public string a;
public string b;
public string c;
public string d;
}
public sealed class Program
{
public static void Main()
{
StructSeq sq = new StructSeq();
Stopwatch sw1 = new Stopwatch();
sw1.Start();
for (int i = 0; i < 10000; i++)
{
sq = ProcessStructSeq(sq);
}
sw1.Stop();
Console.WriteLine("Struct LayoutKind.Sequence (default) {0}", sw1.Elapsed.TotalMilliseconds);
StructAuto so = new StructAuto();
Stopwatch sw2 = new Stopwatch();
sw2.Start();
for (int i = 0; i < 10000; i++)
{
so = ProcessStructAuto(so);
}
sw2.Stop();
Console.WriteLine("Struct LayoutKind.Auto (explicit) {0}", sw2.Elapsed.TotalMilliseconds);
Console.ReadLine();
}
public static StructSeq ProcessStructSeq(StructSeq structSeq)
{
structSeq.a = "1";
structSeq.b = "2";
structSeq.c = "3";
structSeq.d = "4";
return structSeq;
}
public static StructAuto ProcessStructAuto(StructAuto structAuto)
{
structAuto.a = "1";
structAuto.b = "2";
structAuto.c = "3";
structAuto.d = "4";
return structAuto;
}
}
Below is a sample result I get on my machine (x86 running .NET 4)
Struct LayoutKind.Sequence (default) 0.7488
Struct LayoutKind.Auto (explicit) 0.7643
I ran this test multiple times and I always get Struct LayoutKind.Sequence (default) < Struct LayoutKind.Auto (explicit).
Even though the difference is only a fraction of a millisecond, I'm expecting the Struct LayoutKind.Auto (explicit) time to be lower than the Struct LayoutKind.Sequence (default) time.
Does anyone know the reason for this? Or is my benchmarking not accurate enough to give me the right result?

I have tested your code on my system, and found that the average time taken is the same when the test is run a large number of times, with each test run slightly favoring one or the other alternative. This applies both to debug and release builds.
Also, as a quick check, I looked at the x86 code in the debugger, and I see no difference in the generated code whatsoever. So with your program as it is, the difference you observed in your measurements essentially seems to be noise.

Honestly, it's so close that it wouldn't make any sort of visible difference unless you were processing a few million of these structs. In fact, running it multiple times may yield different results. I would up the number of iterations and try to run the program without the debugger attached to see if anything changes.
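To make the "more iterations, several runs" advice concrete, here is a minimal timing sketch. The names MiniBench and MedianMilliseconds are made up for this illustration, and the delegate call adds its own overhead, so treat the numbers as rough. It warms up the code once so JIT compilation is not measured, then reports the median of several runs.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original benchmark.
public static class MiniBench
{
    // Runs 'body' 'iterations' times per run and returns the median
    // elapsed time (in milliseconds) over 'runs' runs.
    public static double MedianMilliseconds(Action body, int iterations, int runs)
    {
        body(); // warm-up call so JIT compilation is not part of the timing

        var samples = new double[runs];
        for (int r = 0; r < runs; r++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                body();
            }
            sw.Stop();
            samples[r] = sw.Elapsed.TotalMilliseconds;
        }

        Array.Sort(samples);
        return samples[runs / 2]; // median is less sensitive to outliers than the mean
    }

    public static void Main()
    {
        long sum = 0;
        double ms = MedianMilliseconds(() => { sum += 1; }, 10000000, 9);
        Console.WriteLine("median: {0} ms (sum = {1})", ms, sum);
    }
}
```

Build in release mode and run without the debugger attached, as suggested above. For serious measurements a dedicated benchmarking library is probably the better tool.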
Just using structs doesn't immediately make your code faster though, there are many pitfalls that make structs far slower than their class equivalents.
If you want to optimize this benchmark, you should pass the structs to the process methods as references and not return another struct (avoiding the creation of 2 additional structs for the method), which should provide a much larger speedup than the different layout kinds:
public static void ProcessStructSeq(ref StructSeq structSeq)
{
structSeq.a = "1";
structSeq.b = "2";
structSeq.c = "3";
structSeq.d = "4";
}
public static void ProcessStructAuto(ref StructAuto structAuto)
{
structAuto.a = "1";
structAuto.b = "2";
structAuto.c = "3";
structAuto.d = "4";
}
Also, there's a point where structs become slower than their class counterparts, and that's estimated to be at about 16 bytes according to this MSDN article and further explained in this StackOverflow question.

I believe there is no difference due to how your fields are laid out. The way you declared them, the padding will be the same either way. If you try interlacing the fields of different sizes, you should see a difference, at least in size, if not in speed.
Also, according to this blog post, a struct with a reference field is changed to auto layout (meaning you were benching literally the exact same thing!).
public struct MyStruct
{
private byte b1;
public long a;
private byte b2;
public long b;
private byte b3;
public long c;
private byte b4;
public long d;
}
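A quick way to check whether the layout kind changes anything for an interleaved struct like the one above is to compare managed sizes. This is only a sketch: the struct names are invented for the example, and it assumes a runtime where System.Runtime.CompilerServices.Unsafe is available (for example .NET 5 or later, or the corresponding NuGet package). The exact sizes depend on the runtime and platform, so no particular numbers are claimed here.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Same fields as MyStruct above, declared with an explicit Sequential layout.
[StructLayout(LayoutKind.Sequential)]
public struct InterleavedSeq
{
    public byte b1; public long a;
    public byte b2; public long b;
    public byte b3; public long c;
    public byte b4; public long d;
}

// Same fields again, but the CLR may reorder them.
[StructLayout(LayoutKind.Auto)]
public struct InterleavedAuto
{
    public byte b1; public long a;
    public byte b2; public long b;
    public byte b3; public long c;
    public byte b4; public long d;
}

public static class LayoutSizeDemo
{
    public static void Main()
    {
        // Unsafe.SizeOf<T>() reports the managed size, including padding.
        Console.WriteLine("Sequential: {0} bytes", Unsafe.SizeOf<InterleavedSeq>());
        Console.WriteLine("Auto:       {0} bytes", Unsafe.SizeOf<InterleavedAuto>());
    }
}
```

If the two sizes differ, the CLR has grouped the small fields together in the Auto version. If they match, the layout attribute made no difference for that struct.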

Related

Garbage data after stackalloc field initializer

I am using .NET 6.0 on Windows 10 with the latest Visual Studio 2022 build. This code runs fine and even seemingly does what I want.
But take a closer look at "Init(in ReadOnlySpan<char> a)".
I can assign new(a) to nodes[0] with no error. But when I do, nodes[0] still contains garbage data, so I cannot store my type in element [0] of the stack-allocated span field.
Is this a bug in .NET 6.0, or am I doing something wrong?
Please help!
Here is the exact code:
public ref struct MyType
{
private Span<RefType> nodes = stackalloc RefType[30];
public unsafe struct RefType
{
public char* data;
public int length;
public RefType(in CharSpan a)
{
//GetRefTo(span) is a private method that gives me back a valid reference to the span's data; this works in my code, so don't bother too much with it
data = (char*)Unsafe.AsPointer(ref GetRefTo(a));
length = a.Length;
}
}
public unsafe MyType()
{
nodes.Clear();
}
public unsafe void Init(in ReadOnlySpan<char> a)
{
//nodes[0] = new(a); //this does not work!
//this does not work either!
nodes[0].data = (char*)Unsafe.AsPointer(ref MemoryMarshal.GetRef(a));
nodes[0].length = a.Length;
Console.WriteLine(*nodes[0].data + " " +
nodes[0].length);
}
public string DoWork()
{
//Thread.Sleep(5000);
return nodes[0].ToString();
}
}

C#: Looping through member objects of nested structs

Hi all you C# wizards!
I need to store the memory offset values of (packed) nested structs within these respective structs.
Recursively looping through all the members works fine so far. I also get the appropriate memory offset values.
This struct contraption might contain several dozen structs and several hundred other members in the end.
But I do this whole thing at initialization time, so CPU performance won't be an issue here.
But:
In this iteration process, it seems I have trouble accessing the actual instances of those structs. As it turns out, when I try to store these offset values, they don't end up where I need them (of course, I need them in the instance "SomeStruct1" and the other struct instances it contains, but the debugger clearly shows me the init values (-1)).
I suspect "field_info.GetValue" or "obj_type.InvokeMember" is not the proper way to get the object reference? Is there any other way to loop through nested struct instances?
Please help! I've desperately debugged and googled for three days, but I'm out of ideas now...
Thanks for your efforts!
-Albert
PS - the reason I do this unusual stuff:
I communicate between two embedded CPU cores via the mentioned nested struct (both are mixed C/C++ projects). This works like a charm, as both cores share the same memory, where the struct resides.
Additionally, I have to communicate between a C# host application and these embedded cores, so I thought it would be neat to implement a third instance of this struct. Only this time, I obviously can't use shared RAM. Instead, I implement value setters and getters for the data-holding members, find out the memory offset as well as the length of the data-holding members, and feed this information (along with the value itself) via USB or Ethernet down to the embedded system. So the "API" to my embedded system will simply be a struct. The only maintenance I have to do every time I change the struct: I have to copy the holding .h file (of the embedded project) to a .cs file (host project).
I know it's crazy - but it works now.
Thanks for your interest. -Albert
This is a simplified (buggy, see below) example that should compile and execute (WinForms, C# 7.3):
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Windows.Forms;
namespace CodingExample
{
public interface Interf
{
Int32 Offset {get; set; }
}
[StructLayout (LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct sSomeStruct2 : Interf
{
public sSomeStruct2 (bool dummy)
{
Offset = -1;
SomeMember3 = 0;
}
public Int32 Offset {get; set; }
public Int32 SomeMember3;
// much more various-typed members (e. g. nested structs)...
}
[StructLayout (LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct sSomeStruct1 : Interf
{
public sSomeStruct1 (bool dummy)
{
Offset = -1;
SomeMember1 = 0;
SomeStruct2 = new sSomeStruct2 (true);
SomeMember2 = 0;
}
public Int32 Offset {get; set; }
public Int32 SomeMember1;
public sSomeStruct2 SomeStruct2;
public Int16 SomeMember2;
// much more various-typed members...
}
public partial class Form1 : Form
{
void InitializeOffsets (object obj)
{
Console.WriteLine ("obj: {0}", obj);
Type obj_type = obj.GetType ();
foreach (FieldInfo field_info in obj_type.GetFields ())
{
string field_name = field_info.Name;
Int32 offset = (Int32) Marshal.OffsetOf (obj_type, field_name);
Type field_type = field_info.FieldType;
bool is_leafe = field_type.IsPrimitive;
// none of these three options seems to give me the right reference:
// object node_obj = field_info.GetValue (obj);
// object node_obj = field_info.GetValue (null);
object node_obj = obj_type.InvokeMember (field_name, BindingFlags.GetField, null, obj, null);
Console.WriteLine ("field: {0}; field_type: {1}; is_leafe: {2}; offset: {3}", field_name, field_type, is_leafe, offset);
if (! is_leafe)
{
// this writes not as expected:
(node_obj as Interf).Offset = offset;
InitializeOffsets (node_obj);
}
}
}
sSomeStruct1 SomeStruct1;
public Form1 ()
{
InitializeComponent ();
SomeStruct1 = new sSomeStruct1 (true);
InitializeOffsets (SomeStruct1);
}
}
}
Meanwhile I found out what I did wrong:
I have to box the struct, so that I can pass it by "ref" when I call my initialize function:
// instead of this:
SomeStruct1 = new sSomeStruct1 (true);
// i have to do it this way:
object boxed_SomeStruct1 = new sSomeStruct1 (true);
InitializeOffsets (ref boxed_SomeStruct1);
SomeStruct1 = (sSomeStruct1) boxed_SomeStruct1;
Within the "InitializeOffsets" function, "field_info.GetValue (obj)" delivers a boxed copy of my member object. That's why I have to write the modified copy back at the very end of the foreach loop:
field_info.SetValue (obj, node_obj);
After these changes, the code works as intended.
Thanks for your interest. -Albert
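Putting the two changes together, the fix might look roughly like the console sketch below. The type names (IHasOffset, Inner, Outer, OffsetDemo) are simplified stand-ins for the ones in the question, and the sketch assumes every non-primitive field is a struct implementing the interface. The outer struct is boxed once, each nested struct is fetched as a boxed copy, modified, and written back with SetValue.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

public interface IHasOffset
{
    int Offset { get; set; }
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Inner : IHasOffset
{
    public int Offset { get; set; }
    public int SomeMember3;
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Outer : IHasOffset
{
    public int Offset { get; set; }
    public int SomeMember1;
    public Inner SomeStruct2;
    public short SomeMember2;
}

public static class OffsetDemo
{
    // 'obj' must already be a boxed struct, so that SetValue changes
    // the caller's box instead of a temporary copy.
    public static void InitializeOffsets(object obj)
    {
        Type type = obj.GetType();
        foreach (FieldInfo field in type.GetFields())
        {
            if (field.FieldType.IsPrimitive)
            {
                continue; // leaf member, nothing to store
            }

            int offset = (int)Marshal.OffsetOf(type, field.Name);
            object node = field.GetValue(obj);   // boxed COPY of the nested struct
            ((IHasOffset)node).Offset = offset;  // mutates the boxed copy in place
            InitializeOffsets(node);             // recurse into deeper structs
            field.SetValue(obj, node);           // copy the modified struct back
        }
    }

    public static void Main()
    {
        object boxed = new Outer { Offset = -1, SomeStruct2 = new Inner { Offset = -1 } };
        InitializeOffsets(boxed);
        Outer result = (Outer)boxed; // unbox the fully initialized struct
        Console.WriteLine("SomeStruct2.Offset = {0}", result.SomeStruct2.Offset);
    }
}
```

Because the box is a reference type, passing it without "ref" is enough in this sketch. "ref" would only be needed if the method replaced the box itself.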

I'm trying to multi-thread an MD5 hash string method but getting error CS0123: No overload for 'MD5' matches delegate 'ThreadStart'

I have done some research and found solutions that apply to a void method. However, I have been unable to adapt my code to a void method: there is no overload of my method 'MD5' that matches the delegate 'ThreadStart', and I have been unable to convert void to string. This program's intent is to show how multi-threading allows more than one process to run at once. I intend to add additional processes on different threads; however, it is important that this works first.
using System.Security.Cryptography;//Added to allow for UTF8 encoding
using System.Threading;//Added to allow for multi-threading
namespace MTSTask5
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
}
//MD5 Hash method
public void MD5(string input)
{
MD5 md5 = new MD5CryptoServiceProvider();
//Convert the input string to a byte array and compute the hash, return the hexadecimal
byte[] bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
string result = BitConverter.ToString(bytes).Replace("-", string.Empty);
return result.ToLower();
}
private void btnStartHash_Click(object sender, EventArgs e)
{
int loopQty = Int32.Parse(txtboxLoopQty.Text);
int i = 0;
//Create a while loop for the MD5 method below
while (i < loopQty)
{
//loop output
string HashOutput = MD5(MD5(txtboxHashOne.Text + txtboxHashTwo.Text));
txtboxHashOutput.Text = HashOutput + " " + i;
Thread HashThread = new Thread(new ThreadStart(MD5));
HashThread.Start();
i++;
}
}
Some suggestions that may help you troubleshoot and solve your problem:
First, your method named MD5 returns result.ToLower(), which is a string, but the method is declared as returning void (i.e. nothing). I'm guessing you meant to declare it as returning string:
//MD5 Hash method
public string MD5(string input)
{
MD5 md5 = new MD5CryptoServiceProvider();
//Convert the input string to a byte array and compute the hash, return the hexadecimal
byte[] bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
string result = BitConverter.ToString(bytes).Replace("-", string.Empty);
return result.ToLower();
}
That may not be the entire problem, so let's check that your method works: copy the code you have in your btnStartHash_Click method to a safe place, then replace it with a simple message to yourself.
private void btnStartHash_Click(object sender, EventArgs e)
{
//Convert the input string to a byte array and compute the hash, return the hexadecimal and display it in a message box
MessageBox.Show(MD5("abcdefg"));//pass whatever known test value you like
}
If you are still unsure of the hash result from your MD5 method, then start taking parts out one by one.
Build the button click up again once you're certain of desired MD5 method's output:
private void btnStartHash_Click_(object sender, EventArgs e)
{
txtboxHashOne.Text = MD5(txtboxHashInput.Text);
string hashOfHash = MD5(txtboxHashOne.Text);
txtboxHashTwo.Text = hashOfHash;
}
In the above situation I'm using the MD5 method to hash the input text box, txtboxHashInput.Text, and then updating the txtboxHashOne text box to reflect the change on the form. The entire string in txtboxHashOne is then hashed again to produce the hashOfHash string.
Instead of having a fixed text box for each hash (txtboxHashOne, txtboxHashTwo), you could create the text boxes programmatically on the form:
//lets say the loopqty input is the number of times I wanted to hash this
int numberOfTimesToHash = Int32.Parse(txtboxLoopQty.Text);
//x and y represent where you want them to start appearing on your form..
int x = 10;
int y = 100;
int howeverManyThreadsIWant = numberOfTimesToHash;
for (int i = 0; i < howeverManyThreadsIWant; i++)
{
TextBox textBox = new TextBox();
textBox.Location = new Point(x, y);
//Could go into a recursive function such as MD5(Input,recursionDepth)
//But instead going to reprint same hash for demonstration purposes
textBox.Text = MD5(txtboxHashInput.Text);
//MessageBox.Show(textBox.Text);
this.Controls.Add(textBox);
y += 30;
}
At this point you may want to switch to a multithreaded approach, but that takes a lot more work.
For example, unfortunately it's not as simple as doing this:
//##!!Don't do this!!
var thread = new Thread(() =>
{
int x = 10;
int y = 100;
int howeverManyThreadsIWant = numberOfTimesToHash;
for (int i = 0; i < howeverManyThreadsIWant; i++)
{
TextBox textBox = new TextBox();
textBox.Tag = i;
textBox.Location = new Point(x, y);
//Could go into a recursive function such as MD5(Input,recursionDepth)
//But instead going to simply reprint same hash
textBox.Text = MD5(txtboxHashInput.Text);
//MessageBox.Show(textBox.Text);
this.Controls.Add(textBox);//<--invalid operations error
y += 30;
}
});
thread.Start();
This would result in:
System.InvalidOperationException: 'Cross-thread operation not valid: Control 'Form1' accessed from a thread other than the thread it was created on.'
Strongly consider whether you really need multithreading to solve this task.
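As a rough illustration of how the CS0123 error and the cross-thread problem can both be avoided, the sketch below wraps the hashing call in a parameterless lambda (which is what the ThreadStart delegate expects) and keeps the result in a local variable rather than touching any control. It is written as a console program so it can run on its own. The names HashThreadDemo, Md5Hex and HashOnWorkerThread are invented for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading;

public static class HashThreadDemo
{
    // Returns the lowercase hexadecimal MD5 hash of the input string.
    public static string Md5Hex(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            return BitConverter.ToString(bytes).Replace("-", string.Empty).ToLowerInvariant();
        }
    }

    // ThreadStart takes no arguments and returns void, so the call is
    // wrapped in a lambda that captures the input and stores the result.
    public static string HashOnWorkerThread(string input)
    {
        string result = null;
        Thread worker = new Thread(() => { result = Md5Hex(input); });
        worker.Start();
        worker.Join(); // wait here only because this is a console demo
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(HashOnWorkerThread("abcdefg"));
    }
}
```

In a WinForms application you would not call Join on the UI thread, because that blocks the form. Instead, the worker would hand the finished string back to the UI thread, for example through Control.Invoke or Control.BeginInvoke, and only that code would update the text boxes.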
Microsoft suggests:
When to Use Multiple Threads
Multithreading can be used in many common situations to significantly improve the responsiveness and usability of your application.
You should strongly consider using multiple threads to:
- Communicate over a network, for example to a Web server, database, or remote object.
- Perform time-consuming local operations that would cause the UI to freeze.
- Distinguish tasks of varying priority.
- Improve the performance of application startup and initialization.
It is useful to examine these uses in more detail.
Communicating Over a Network
Smart-clients may communicate over a network in a number of ways, including:
- Remote object calls, such as DCOM, RPC or .NET remoting
- Message-based communications, such as Web service calls and HTTP requests
- Distributed transactions
With that in mind, if you really need to do these things, now that you have bug-free code that hashes the way you want, read Microsoft's "Using Multiple Threads" guide.
You may also want to check out "Threading in Windows Forms", which has an example you can run that covers most of what you need to know.
Hopefully some of this was what you were looking for.

improving conversions to binary and back in C#

I'm trying to write a general-purpose socket server for a game I'm working on. I know I could very well use existing servers like SmartFox and Photon, but I want to go through the pain of creating one myself for learning purposes.
I've come up with a BSON-inspired protocol to convert the basic data types, their arrays, and a special GSObject to binary, arranged so that they can be put back together into object form on the client end. At the core, the conversion methods use the .NET BitConverter class to convert the basic data types to binary. Anyway, the problem is performance: if I loop 50,000 times and convert my GSObject to binary each time, it takes about 5500ms (the resulting byte[] is just 192 bytes per conversion). I think this would be way too slow for an MMO that sends 5-10 position updates per second with 1000 concurrent users. Yes, I know it's unlikely that a game will have 1000 users on at the same time, but as I said earlier, this is supposed to be a learning process for me. I want to go out of my way and build something that scales well and can handle at least a few thousand users.
So yes, if anyone is aware of other conversion techniques or sees where I'm losing performance, I would appreciate the help.
GSBitConverter.cs
This is the main conversion class, it adds extension methods to main datatypes to convert to the binary format. It uses the BitConverter class to convert the base types. I've shown only the code to convert integer and integer arrays, but the rest of the method are pretty much replicas of those two, they just overload the type.
public static class GSBitConverter
{
public static byte[] ToGSBinary(this short value)
{
return BitConverter.GetBytes(value);
}
public static byte[] ToGSBinary(this IEnumerable<short> value)
{
List<byte> bytes = new List<byte>();
short length = (short)value.Count();
bytes.AddRange(length.ToGSBinary());
for (int i = 0; i < length; i++)
bytes.AddRange(value.ElementAt(i).ToGSBinary());
return bytes.ToArray();
}
public static byte[] ToGSBinary(this bool value);
public static byte[] ToGSBinary(this IEnumerable<bool> value);
public static byte[] ToGSBinary(this IEnumerable<byte> value);
public static byte[] ToGSBinary(this int value);
public static byte[] ToGSBinary(this IEnumerable<int> value);
public static byte[] ToGSBinary(this long value);
public static byte[] ToGSBinary(this IEnumerable<long> value);
public static byte[] ToGSBinary(this float value);
public static byte[] ToGSBinary(this IEnumerable<float> value);
public static byte[] ToGSBinary(this double value);
public static byte[] ToGSBinary(this IEnumerable<double> value);
public static byte[] ToGSBinary(this string value);
public static byte[] ToGSBinary(this IEnumerable<string> value);
public static string GetHexDump(this IEnumerable<byte> value);
}
Program.cs
Here's the object that I'm converting to binary in a loop.
class Program
{
static void Main(string[] args)
{
GSObject obj = new GSObject();
obj.AttachShort("smallInt", 15);
obj.AttachInt("medInt", 120700);
obj.AttachLong("bigInt", 10900800700);
obj.AttachDouble("doubleVal", Math.PI);
obj.AttachStringArray("muppetNames", new string[] { "Kermit", "Fozzy", "Piggy", "Animal", "Gonzo" });
GSObject apple = new GSObject();
apple.AttachString("name", "Apple");
apple.AttachString("color", "red");
apple.AttachBool("inStock", true);
apple.AttachFloat("price", (float)1.5);
GSObject lemon = new GSObject();
lemon.AttachString("name", "Lemon");
lemon.AttachString("color", "yellow");
lemon.AttachBool("inStock", false);
lemon.AttachFloat("price", (float)0.8);
GSObject apricoat = new GSObject();
apricoat.AttachString("name", "Apricoat");
apricoat.AttachString("color", "orange");
apricoat.AttachBool("inStock", true);
apricoat.AttachFloat("price", (float)1.9);
GSObject kiwi = new GSObject();
kiwi.AttachString("name", "Kiwi");
kiwi.AttachString("color", "green");
kiwi.AttachBool("inStock", true);
kiwi.AttachFloat("price", (float)2.3);
GSArray fruits = new GSArray();
fruits.AddGSObject(apple);
fruits.AddGSObject(lemon);
fruits.AddGSObject(apricoat);
fruits.AddGSObject(kiwi);
obj.AttachGSArray("fruits", fruits);
Stopwatch w1 = Stopwatch.StartNew();
for (int i = 0; i < 50000; i++)
{
byte[] b = obj.ToGSBinary();
}
w1.Stop();
Console.WriteLine(BitConverter.IsLittleEndian ? "Little Endian" : "Big Endian");
Console.WriteLine(w1.ElapsedMilliseconds + "ms");
}
}
Here's the code for some of my other classes that are used in the code above. Most of it is repetitive.
GSObject
GSArray
GSWrappedObject
My first hunch, without much to go off, would be that a lot of your time is being sunk into constantly re-creating arrays and lists.
I would be inclined to move to a Stream-based approach rather than trying to create arrays constantly. That being, make all the GSBinary methods accept a Stream then write to it rather than making their own arrays, then if you want it in local memory use a MemoryStream at the base and then get your array out of it at the end (Or even better if you're planning this to be a networked application, write directly to the network stream).
As per Chris's comment earlier, however, the best way to start is to run a profiler such as dotTrace or Redgate's ANTS Performance Profiler to find out which step is actually taking the most time, before investing time in refactoring something which, while inefficient, may only account for a small fraction of the total time.
1) ElementAt is very expensive. Use foreach (var v in value) instead of for (int i = 0; i < length; i++) .. .ElementAt(i) ..
2) The ToGSBinary methods are expensive because they copy arrays frequently.
Use the signature void WriteToGSBinary(Stream stream) instead of byte[] ToGSBinary()
3) Add overloads for arrays: void WriteToGSBinary(Stream stream, byte[] values), void WriteToGSBinary(Stream stream, short[] values), etc.
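The stream-based suggestions above could be sketched as follows. This is only an illustration of the idea, not the asker's actual protocol: the class name GSStreamWriter and the method name WriteGSBinary are made up, and BinaryWriter is used as the stream wrapper. Note that BinaryWriter always writes little-endian, whereas BitConverter follows the platform's byte order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class GSStreamWriter
{
    // Writes a single short straight into the underlying stream.
    public static void WriteGSBinary(this BinaryWriter writer, short value)
    {
        writer.Write(value);
    }

    // Writes a length prefix followed by the elements, with no
    // intermediate List<byte> or per-element byte[] allocations.
    public static void WriteGSBinary(this BinaryWriter writer, IReadOnlyCollection<short> values)
    {
        writer.Write((short)values.Count);
        foreach (short v in values) // foreach avoids the repeated ElementAt(i) walks
        {
            writer.Write(v);
        }
    }

    public static void Main()
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.WriteGSBinary(new short[] { 1, 2, 3 });
            writer.Flush();
            byte[] bytes = stream.ToArray(); // one copy at the very end
            Console.WriteLine(BitConverter.ToString(bytes)); // prints 03-00-01-00-02-00-03-00
        }
    }
}
```

For a networked server the MemoryStream could be replaced by the network stream itself, so that no final array is created at all.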

C# equivalent of the Ruby symbol

I'm developing a little C# application for fun. I love this language, but something bothers me...
Is there any way to do a #define (C style) or a symbol (Ruby style)?
The Ruby symbol is quite useful. It's just a name preceded by a ":" (for example ":guy"); every symbol is unique and can be used anywhere in the code.
In my case I'd like to send a flag (connect or disconnect) to a function.
What is the most elegant C# way to do that?
Here is what i'd like to do :
BgWorker.RunWorkerAsync(:connect)
//...
private void BgWorker_DoWork(object sender, DoWorkEventArgs e)
{
if (e.Argument == :connect)
//Do the job
}
At this point my favorite answer is the enum solution ;)
In your case, sending a flag can be done by using an enum...
public enum Message
{
Connect,
Disconnect
}
public void Action(Message msg)
{
switch(msg)
{
case Message.Connect:
//do connect here
break;
case Message.Disconnect:
//disconnect
break;
default:
//Fail!
break;
}
}
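Applied to the BackgroundWorker scenario from the question, the enum approach might look like the sketch below. The enum name WorkerCommand and the class name EnumFlagDemo are hypothetical. The enum value is passed through RunWorkerAsync and arrives as e.Argument, where it is cast back. It is written as a console program, where the completion handler runs on a thread-pool thread, so a wait handle keeps Main alive until the work is done.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Plays the role of the Ruby symbols :connect and :disconnect.
public enum WorkerCommand
{
    Connect,
    Disconnect
}

public static class EnumFlagDemo
{
    // DoWork handler: e.Argument carries the value given to RunWorkerAsync.
    public static void OnDoWork(object sender, DoWorkEventArgs e)
    {
        switch ((WorkerCommand)e.Argument)
        {
            case WorkerCommand.Connect:
                e.Result = "connected";
                break;
            case WorkerCommand.Disconnect:
                e.Result = "disconnected";
                break;
            default:
                throw new ArgumentOutOfRangeException("e", "Unknown command");
        }
    }

    public static void Main()
    {
        using (BackgroundWorker worker = new BackgroundWorker())
        using (ManualResetEventSlim done = new ManualResetEventSlim())
        {
            worker.DoWork += OnDoWork;
            worker.RunWorkerCompleted += (s, e) =>
            {
                Console.WriteLine(e.Result); // prints "connected"
                done.Set();
            };
            worker.RunWorkerAsync(WorkerCommand.Connect);
            done.Wait(); // keep the console process alive until the worker finishes
        }
    }
}
```

Compared with a string constant, the enum lets the compiler catch misspelled flags and lets a switch statement list every case explicitly.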
You could use a string constant:
public const string Guy = "guy";
In fact, string literals in .NET are special. If you declare two string variables with the same literal value, they actually point to the same object, because the runtime interns literals:
string a = "guy";
string b = "guy";
Console.WriteLine(object.ReferenceEquals(a, b)); // prints True
C# doesn't support C-style macros, although it does still have #define. For their reasoning on this take a look at the csharp FAQ blog on msdn.
If your flag is for conditional compilation purposes, then you can still do this:
#define MY_FLAG
#if MY_FLAG
//do something
#endif
But if not, then what you're describing is a configuration option and should perhaps be stored in a class variable or config file instead of a macro.
Similar to @Darin, but I often create a Defs class in my project to hold all such constants, so there is an easy way to access them from anywhere.
class Program
{
static void Main(string[] args)
{
string s = Defs.pi;
}
}
class Defs
{
public const int Val = 5;
public const string pi = "3.14159";
}
