C# reinterpret bool as byte/int (branch-free) - c#

Is it possible in C# to turn a bool into a byte or int (or any integral type, really) without branching?
In other words, this is not good enough:
var myInt = myBool ? 1 : 0;
We might say we want to reinterpret a bool as the underlying byte, preferably in as few instructions as possible. The purpose is to avoid branch prediction fails as seen here.

unsafe
{
byte myByte = *(byte*)&myBool;
}
Another option is System.Runtime.CompilerServices.Unsafe, which requires a NuGet package on non-Core platforms:
byte myByte = Unsafe.As<bool, byte>(ref myBool);
The CLI specification only defines false as 0 and true as anything except 0 , so technically speaking this might not work as expected on all platforms. However, as far as I know the C# compiler also makes the assumption that there are only two values for bool, so in practice I would expect it to work outside of mostly academic cases.

The usual C# equivalent to "reinterpret cast" is to define a struct with fields of the types you want to reinterpret. That approach works fine in most cases. In your case, that would look like this:
[StructLayout(LayoutKind.Explicit)]
struct BoolByte
{
[FieldOffset(0)]
public bool Bool;
[FieldOffset(0)]
public byte Byte;
}
Then you can do something like this:
BoolByte bb = new BoolByte();
bb.Bool = true;
int myInt = bb.Byte;
Note that you only have to initialize the variable once, then you can set Bool and retrieve Byte as often as you like. This should perform as well or better than any approach involving unsafe code, calling methods, etc., especially with respect to addressing any branch-prediction issues.
It's important to point out that if you can read a bool as a byte, then of course anyone can write a bool as a byte, and the actual int value of the bool when it's true may or may not be 1. It technically could be any non-zero value.
All that said, this will make the code a lot harder to maintain. Both because of the lack of guarantees of what a true value actually looks like, and just because of the added complexity. It would be extremely rare to run into a real-world scenario that suffers from the missed branch-prediction issue you're asking about. Even if you had a legitimate real-world example, it's arguable that it would be better solved some other way. The exact alternative would depend on the specific real-world example, but one example might be to keep the data organized in a way that allows for batch processing on a given condition instead of testing for each element.
I strongly advise against doing something like this, until you have a demonstrated, reproducible real-world problem, and have exhausted other more idiomatic and maintainable options.

Here is a solution that takes more lines (and presumably more instructions) than I would like, but that actually solves the problem directly, i.e. by reinterpreting.
Since .NET Core 2.1, we have some reinterpret methods available in MemoryMarshal. We can treat our bool as a ReadOnlySpan<bool>, which in turn we can treat as a ReadOnlySpan<byte>. From there, it is trivial to read the single byte value.
var myBool = true;
var myBoolSpan = MemoryMarshal.CreateReadOnlySpan(ref myBool, length: 1);
var myByteSpan = MemoryMarshal.AsBytes(myBoolSpan);
var myByte = myByteSpan[0]; // =1

maybe this would work? (source of the idea)
using System;
using System.Reflection.Emit;
namespace ConsoleApp10
{
class Program
{
static Func<bool, int> BoolToInt;
static Func<bool, byte> BoolToByte;
static void Main(string[] args)
{
InitIL();
Console.WriteLine(BoolToInt(true));
Console.WriteLine(BoolToInt(false));
Console.WriteLine(BoolToByte(true));
Console.WriteLine(BoolToByte(false));
Console.ReadLine();
}
static void InitIL()
{
var methodBoolToInt = new DynamicMethod("BoolToInt", typeof(int), new Type[] { typeof(bool) });
var ilBoolToInt = methodBoolToInt.GetILGenerator();
ilBoolToInt.Emit(OpCodes.Ldarg_0);
ilBoolToInt.Emit(OpCodes.Ldc_I4_0); //these 2 lines
ilBoolToInt.Emit(OpCodes.Cgt_Un); //might not be needed
ilBoolToInt.Emit(OpCodes.Ret);
BoolToInt = (Func<bool, int>)methodBoolToInt.CreateDelegate(typeof(Func<bool, int>));
var methodBoolToByte = new DynamicMethod("BoolToByte", typeof(byte), new Type[] { typeof(bool) });
var ilBoolToByte = methodBoolToByte.GetILGenerator();
ilBoolToByte.Emit(OpCodes.Ldarg_0);
ilBoolToByte.Emit(OpCodes.Ldc_I4_0); //these 2 lines
ilBoolToByte.Emit(OpCodes.Cgt_Un); //might not be needed
ilBoolToByte.Emit(OpCodes.Ret);
BoolToByte = (Func<bool, byte>)methodBoolToByte.CreateDelegate(typeof(Func<bool, byte>));
}
}
}
based on microsoft documentation of each emit.
load the parameter in memory (the boolean)
load in memory a value of int = 0
compare if any the parameter is greater than the value (branching here maybe?)
return 1 if true else 0
line 2 and 3 can be removed but the return value could be something else than 0 / 1
like i said in the beginning this code is taken from another response, this seem to be working yes but it seem slow while being benchmarking, lookup .net DynamicMethod slow to find way to make it "faster"
you could maybe use the .GetHashCode of the boolean?
true will return int of 1 and false 0
you can then var myByte = (byte)bool.GetHashCode();

Related

Discard feature significance in C# 7.0?

While going through new C# 7.0 features, I stuck up with discard feature. It says:
Discards are local variables which you can assign but cannot read
from. i.e. they are “write-only” local variables.
and, then, an example follows:
if (bool.TryParse("TRUE", out bool _))
What is real use case when this will be beneficial? I mean what if I would have defined it in normal way, say:
if (bool.TryParse("TRUE", out bool isOK))
The discards are basically a way to intentionally ignore local variables which are irrelevant for the purposes of the code being produced. It's like when you call a method that returns a value but, since you are interested only in the underlying operations it performs, you don't assign its output to a local variable defined in the caller method, for example:
public static void Main(string[] args)
{
// I want to modify the records but I'm not interested
// in knowing how many of them have been modified.
ModifyRecords();
}
public static Int32 ModifyRecords()
{
Int32 affectedRecords = 0;
for (Int32 i = 0; i < s_Records.Count; ++i)
{
Record r = s_Records[i];
if (String.IsNullOrWhiteSpace(r.Name))
{
r.Name = "Default Name";
++affectedRecords;
}
}
return affectedRecords;
}
Actually, I would call it a cosmetic feature... in the sense that it's a design time feature (the computations concerning the discarded variables are performed anyway) that helps keeping the code clear, readable and easy to maintain.
I find the example shown in the link you provided kinda misleading. If I try to parse a String as a Boolean, chances are I want to use the parsed value somewhere in my code. Otherwise I would just try to see if the String corresponds to the text representation of a Boolean (a regular expression, for example... even a simple if statement could do the job if casing is properly handled). I'm far from saying that this never happens or that it's a bad practice, I'm just saying it's not the most common coding pattern you may need to produce.
The example provided in this article, on the opposite, really shows the full potential of this feature:
public static void Main()
{
var (_, _, _, pop1, _, pop2) = QueryCityDataForYears("New York City", 1960, 2010);
Console.WriteLine($"Population change, 1960 to 2010: {pop2 - pop1:N0}");
}
private static (string, double, int, int, int, int) QueryCityDataForYears(string name, int year1, int year2)
{
int population1 = 0, population2 = 0;
double area = 0;
if (name == "New York City")
{
area = 468.48;
if (year1 == 1960) {
population1 = 7781984;
}
if (year2 == 2010) {
population2 = 8175133;
}
return (name, area, year1, population1, year2, population2);
}
return ("", 0, 0, 0, 0, 0);
}
From what I can see reading the above code, it seems that the discards have a higher sinergy with other paradigms introduced in the most recent versions of C# like tuples deconstruction.
For Matlab programmers, discards are far from being a new concept because the programming language implements them since very, very, very long time (probably since the beginning, but I can't say for sure). The official documentation describes them as follows (link here):
Request all three possible outputs from the fileparts function:
helpFile = which('help');
[helpPath,name,ext] = fileparts('C:\Path\data.txt');
The current workspace now contains three variables from fileparts: helpPath, name, and ext. In this case, the variables are small. However, some functions return results that use much more memory. If you do not need those variables, they waste space on your system.
Ignore the first output using a tilde (~):
[~,name,ext] = fileparts(helpFile);
The only difference is that, in Matlab, inner computations for discarded outputs are normally skipped because output arguments are flexible and you can know how many and which one of them have been requested by the caller.
I have seen discards used mainly against methods which return Task<T> but you don't want to await the output.
So in the example below, we don't want to await the output of SomeOtherMethod() so we could do something like this:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
Example();
Except this will generate the following warning:
CS4014 Because this call is not awaited, execution of the
current method continues before the call is completed. Consider
applying the 'await' operator to the result of the call.
To mitigate this warning and essentially ensure the compiler that we know what we are doing, you can use a discard:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
_ = Example();
No more warnings.
To add another use case to the above answers.
You can use a discard in conjunction with a null coalescing operator to do a nice one-line null check at the start of your functions:
_ = myParam ?? throw new MyException();
Many times I've done code along these lines:
TextBox.BackColor = int32.TryParse(TextBox.Text, out int32 _) ? Color.LightGreen : Color.Pink;
Note that this would be part of a larger collection of data, not a standalone thing. The idea is to provide immediate feedback on the validity of each field of the data they are entering.
I use light green and pink rather than the green and red one would expect--the latter colors are dark enough that the text becomes a bit hard to read and the meaning of the lighter versions is still totally obvious.
(In some cases I also have a Color.Yellow to flag something which is not valid but neither is it totally invalid. Say the parser will accept fractions and the field currently contains "2 1". That could be part of "2 1/2" so it's not garbage, but neither is it valid.)
Discard pattern can be used with a switch expression as well.
string result = shape switch
{
Rectangule r => $"Rectangule",
Circle c => $"Circle",
_ => "Unknown Shape"
};
For a list of patterns with discards refer to this article: Discards.
Consider this:
5 + 7;
This "statement" performs an evaluation but is not assigned to something. It will be immediately highlighted with the CS error code CS0201.
// Only assignment, call, increment, decrement, and new object expressions can be used as a statement
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-messages/cs0201?f1url=%3FappId%3Droslyn%26k%3Dk(CS0201)
A discard variable used here will not change the fact that it is an unused expression, rather it will appear to the compiler, to you, and others reviewing your code that it was intentionally unused.
_ = 5 + 7; //acceptable
It can also be used in lambda expressions when having unused parameters:
builder.Services.AddSingleton<ICommandDispatcher>(_ => dispatcher);

What is it that makes Enum.HasFlag so slow?

I was doing some speed tests and I noticed that Enum.HasFlag is about 16 times slower than using the bitwise operation.
Does anyone know the internals of Enum.HasFlag and why it is so slow? I mean twice as slow wouldn't be too bad but it makes the function unusable when its 16 times slower.
In case anyone is wondering, here is the code I am using to test its speed.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
namespace app
{
public class Program
{
[Flags]
public enum Test
{
Flag1 = 1,
Flag2 = 2,
Flag3 = 4,
Flag4 = 8
}
static int num = 0;
static Random rand;
static void Main(string[] args)
{
int seed = (int)DateTime.UtcNow.Ticks;
var st1 = new SpeedTest(delegate
{
Test t = Test.Flag1;
t |= (Test)rand.Next(1, 9);
if (t.HasFlag(Test.Flag4))
num++;
});
var st2 = new SpeedTest(delegate
{
Test t = Test.Flag1;
t |= (Test)rand.Next(1, 9);
if (HasFlag(t , Test.Flag4))
num++;
});
rand = new Random(seed);
st1.Test();
rand = new Random(seed);
st2.Test();
Console.WriteLine("Random to prevent optimizing out things {0}", num);
Console.WriteLine("HasFlag: {0}ms {1}ms {2}ms", st1.Min, st1.Average, st1.Max);
Console.WriteLine("Bitwise: {0}ms {1}ms {2}ms", st2.Min, st2.Average, st2.Max);
Console.ReadLine();
}
static bool HasFlag(Test flags, Test flag)
{
return (flags & flag) != 0;
}
}
[DebuggerDisplay("Average = {Average}")]
class SpeedTest
{
public int Iterations { get; set; }
public int Times { get; set; }
public List<Stopwatch> Watches { get; set; }
public Action Function { get; set; }
public long Min { get { return Watches.Min(s => s.ElapsedMilliseconds); } }
public long Max { get { return Watches.Max(s => s.ElapsedMilliseconds); } }
public double Average { get { return Watches.Average(s => s.ElapsedMilliseconds); } }
public SpeedTest(Action func)
{
Times = 10;
Iterations = 100000;
Function = func;
Watches = new List<Stopwatch>();
}
public void Test()
{
Watches.Clear();
for (int i = 0; i < Times; i++)
{
var sw = Stopwatch.StartNew();
for (int o = 0; o < Iterations; o++)
{
Function();
}
sw.Stop();
Watches.Add(sw);
}
}
}
}
Results:
HasFlag: 52ms 53.6ms 55ms
Bitwise: 3ms 3ms 3ms
Does anyone know the internals of Enum.HasFlag and why it is so slow?
The actual check is just a simple bit check in Enum.HasFlag - it's not the problem here. That being said, it is slower than your own bit check...
There are a couple of reasons for this slowdown:
First, Enum.HasFlag does an explicit check to make sure that the type of the enum and the type of the flag are both the same type, and from the same Enum. There is some cost in this check.
Secondly, there is an unfortunate box and unbox of the value during a conversion to UInt64 that occurs inside of HasFlag. This is, I believe, due to the requirement that Enum.HasFlag work with all enums, regardless of the underlying storage type.
That being said, there is a huge advantage to Enum.HasFlag - it's reliable, clean, and makes the code very obvious and expressive. For the most part, I feel that this makes it worth the cost - but if you're using this in a very performance critical loop, it may be worth doing your own check.
Decompiled code of Enum.HasFlags() looks like this:
public bool HasFlag(Enum flag)
{
if (!base.GetType().IsEquivalentTo(flag.GetType()))
{
throw new ArgumentException(Environment.GetResourceString("Argument_EnumTypeDoesNotMatch", new object[] { flag.GetType(), base.GetType() }));
}
ulong num = ToUInt64(flag.GetValue());
return ((ToUInt64(this.GetValue()) & num) == num);
}
If I were to guess, I would say that checking the type was what's slowing it down most.
Note that in recent versions of .Net Core, this has been improved and Enum.HasFlag compiles to the same code as using bitwise comparisons.
The performance penalty due to boxing discussed on this page also affects the public .NET functions Enum.GetValues and Enum.GetNames, which both forward to (Runtime)Type.GetEnumValues and (Runtime)Type.GetEnumNames respectively.
All of these functions use a (non-generic) Array as a return type--which is not so bad for the names (since String is a reference type)--but is quite inappropriate for the ulong[] values.
Here's a peek at the offending code (.NET 4.7):
public override Array /* RuntimeType.*/ GetEnumValues()
{
if (!this.IsEnum)
throw new ArgumentException();
ulong[] values = Enum.InternalGetValues(this);
Array array = Array.UnsafeCreateInstance(this, values.Length);
for (int i = 0; i < values.Length; i++)
{
var obj = Enum.ToObject(this, values[i]); // ew. boxing.
array.SetValue(obj, i); // yuck
}
return array; // Array of object references, bleh.
}
We can see that prior to doing the copy, RuntimeType goes back again to System.Enum to get an internal array, a singleton which is cached, on demand, for each specific Enum. Notice also that this version of the values array does use the proper strong signature, ulong[].
Here's the .NET function (again we're back in System.Enum now). There's a similar function for getting the names (not shown).
internal static ulong[] InternalGetValues(RuntimeType enumType) =>
GetCachedValuesAndNames(enumType, false).Values;
See the return type? This looks like a function we'd like to use... But first consider that a second reason that .NET re-copys the array each time (as you saw above) is that .NET must ensure that each caller gets an unaltered copy of the original data, given that a malevolent coder could change her copy of the returned Array, introducing a persistent corruption. Thus, the re-copying precaution is especially intended to protect the cached internal master copy.
If you aren't worried about that risk, perhaps because you feel confident you won't accidentally change the array, or maybe just to eke-out a few cycles of (what's surely premature) optimization, it's simple to fetch the internal cached array copy of the names or values for any Enum:
→ The following two functions comprise the sum contribution of this article ←
→ (but see edit below for improved version) ←
static ulong[] GetEnumValues<T>() where T : struct =>
(ulong[])typeof(System.Enum)
.GetMethod("InternalGetValues", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
static String[] GetEnumNames<T>() where T : struct =>
(String[])typeof(System.Enum)
.GetMethod("InternalGetNames", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
Note that the generic constraint on T isn't fully sufficient for guaranteeing Enum. For simplicity, I left off checking any further beyond struct, but you might want to improve on that. Also for simplicity, this (ref-fetches and) reflects directly off the MethodInfo every time rather than trying to build and cache a Delegate. The reason for this is that creating the proper delegate with a first argument of non-public type RuntimeType is tedious. A bit more on this below.
First, I'll wrap up with usage examples:
var values = GetEnumValues<DayOfWeek>();
var names = GetEnumNames<DayOfWeek>();
and debugger results:
'values' ulong[7]
[0] 0
[1] 1
[2] 2
[3] 3
[4] 4
[5] 5
[6] 6
'names' string[7]
[0] "Sunday"
[1] "Monday"
[2] "Tuesday"
[3] "Wednesday"
[4] "Thursday"
[5] "Friday"
[6] "Saturday"
So I mentioned that the "first argument" of Func<RuntimeType,ulong[]> is annoying to reflect over. However, because this "problem" arg happens to be first, there's a cute workaround where you can bind each specific Enum type as a Target of its own delegate, where each is then reduced to Func<ulong[]>.)
Clearly, its pointless to make any of those delegates, since each would just be a function that always return the same value... but the same logic seems to apply, perhaps less obviously, to the original situation as well (i.e., Func<RuntimeType,ulong[]>). Although we do get by with a just one delegate here, you'd never really want to call it more than once per Enum type. Anyway, all of this leads to a much better solution, which is included in the edit below.
[edit:]Here's a slightly more elegant version of the same thing. If you will be calling the functions repeatedly for the same Enum type, the version shown here will only use reflection one time per Enum type. It saves the results in a locally-accessible cache for extremely rapid access subsequently.
static class enum_info_cache<T> where T : struct
{
static _enum_info_cache()
{
values = (ulong[])typeof(System.Enum)
.GetMethod("InternalGetValues", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
names = (String[])typeof(System.Enum)
.GetMethod("InternalGetNames", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
}
public static readonly ulong[] values;
public static readonly String[] names;
};
The two functions become trivial:
static ulong[] GetEnumValues<T>() where T : struct => enum_info_cache<T>.values;
static String[] GetEnumNames<T>() where T : struct => enum_info_cache<T>.names;
The code shown here illustrates a pattern of combining three specific tricks that seem to mutually result in an unusualy elegant lazy caching scheme. I've found the particular technique to have surprisingly wide application.
using a generic static class to cache independent copies of the arrays for each distinct Enum. Notably, this happens automatically and on demand;
related to this, the loader lock guarantees unique atomic initialization and does this without the clutter of conditional checking constructs. We can also protect static fields with readonly (which, for obvious reasons, typically can't be used with other lazy/deferred/demand methods);
finally, we can capitalize on C# type inference to automatically map the generic function (entry point) into its respective generic static class, so that the demand caching is ultimately even driven implicitly (viz., the best code is the code that isn't there--since it can never have bugs)
You probably noticed that the particular example shown here doesn't really illustrate point (3) very well. Rather than relying on type inference, the void-taking function has to manually propagate forward the type argument T. I didn't choose to expose these simple functions such that there would be an opportunity to show how C# type inference makes the overall technique shine...
However, you can imagine that when you do combine a static generic function that can infer its type argument(s)--i.e., so you don't even have to provide them at the call site--then it gets quite powerful.
The key insight is that, while generic functions have the full type-inference capability, generic classes do not, that is, the compiler will never infer T if you try to call the first of the following lines. But we can still get fully inferred access to a generic class, and all the benefits that entails, by traversing into them via generic function implicit typing (last line):
int t = 4;
typed_cache<int>.MyTypedCachedFunc(t); // no inference from 't', explicit type required
MyTypedCacheFunc<int>(t); // ok, (but redundant)
MyTypedCacheFunc(t); // ok, full inference
Designed well, inferred typing can effortlessly launch you into the appropriate automatically demand-cached data and behaviors, customized for each type (recall points 1. and 2). As noted, I find the approach useful, especially considering its simplicity.
The JITter ought to be inlining this as a simple bitwise operation. The JITter is aware enough to custom-handle even certain framework methods (via MethodImplOptions.InternalCall I think?) but HasFlag seems to have escaped Microsoft's serious attention.

Enum vs Constants/Class with Static Members?

I have a set of codes that are particular to the application (one to one mapping of the code to its name), and I've been using enums in C# to represent them. I'm not sure now if that is even necessary. The values never change, and they are always going to be associated with those labels:
Workflow_Status_Complete = 1
Workflow_Status_Stalled = 2
Workflow_Status_Progress = 3
Workflow_Status_Complete = 4
Workflow_Status_Fail = 5
Should I use an enum or a class with static members?
Static members of type int seems to be inferior to an enum to me. You lose the typesafety of an enum. And when debugging you don't see the symbolic name but just a number.
On the other hand if an entry consists of more than just a name/integervalue pair a class can be a good idea. But then the fields should be of that class and not int. Something like:
class MyFakeEnum
{
public static readonly MyFakeEnum Value1=new MyFakeEnum(...);
}
Use an enum. Even though your codes never change, it will be difficult to know what the value represents just by inspection. One of the many strengths of using enums.
enum RealEnum : uint
{
SomeValue = 0xDEADBEEF,
}
static class FakeEnum
{
public const uint SomeValue = 0xDEADBEEF;
}
var x = RealEnum.SomeValue;
var y = FakeEnum.SomeValue;
// what's the value?
var xstr = x.ToString(); // SomeValue
var ystr = y.ToString(); // 3735928559
Not even the debugger will help you much here, especially if there are many different values.
Check out the State Pattern as this is a better design. With the idea you are using you'll end up with a large switch/if-else statement which can be very difficult to keep up.
I would lean towards enums as they provide more information and they make your codes "easier to use correctly and difficult to use incorrectly". (I think the quote is from The Pragmatic Programmer.

What are your favorite little used C# trick? [duplicate]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
This came to my mind after I learned the following from this question:
where T : struct
We, C# developers, all know the basics of C#. I mean declarations, conditionals, loops, operators, etc.
Some of us even mastered the stuff like Generics, anonymous types, lambdas, LINQ, ...
But what are the most hidden features or tricks of C# that even C# fans, addicts, experts barely know?
Here are the revealed features so far:
Keywords
yield by Michael Stum
var by Michael Stum
using() statement by kokos
readonly by kokos
as by Mike Stone
as / is by Ed Swangren
as / is (improved) by Rocketpants
default by deathofrats
global:: by pzycoman
using() blocks by AlexCuse
volatile by Jakub Šturc
extern alias by Jakub Šturc
Attributes
DefaultValueAttribute by Michael Stum
ObsoleteAttribute by DannySmurf
DebuggerDisplayAttribute by Stu
DebuggerBrowsable and DebuggerStepThrough by bdukes
ThreadStaticAttribute by marxidad
FlagsAttribute by Martin Clarke
ConditionalAttribute by AndrewBurns
Syntax
?? (coalesce nulls) operator by kokos
Number flaggings by Nick Berardi
where T:new by Lars Mæhlum
Implicit generics by Keith
One-parameter lambdas by Keith
Auto properties by Keith
Namespace aliases by Keith
Verbatim string literals with # by Patrick
enum values by lfoust
#variablenames by marxidad
event operators by marxidad
Format string brackets by Portman
Property accessor accessibility modifiers by xanadont
Conditional (ternary) operator (?:) by JasonS
checked and unchecked operators by Binoj Antony
implicit and explicit operators by Flory
Language Features
Nullable types by Brad Barker
Anonymous types by Keith
__makeref __reftype __refvalue by Judah Himango
Object initializers by lomaxx
Format strings by David in Dakota
Extension Methods by marxidad
partial methods by Jon Erickson
Preprocessor directives by John Asbeck
DEBUG pre-processor directive by Robert Durgin
Operator overloading by SefBkn
Type inferrence by chakrit
Boolean operators taken to next level by Rob Gough
Pass value-type variable as interface without boxing by Roman Boiko
Programmatically determine declared variable type by Roman Boiko
Static Constructors by Chris
Easier-on-the-eyes / condensed ORM-mapping using LINQ by roosteronacid
__arglist by Zac Bowling
Visual Studio Features
Select block of text in editor by Himadri
Snippets by DannySmurf
Framework
TransactionScope by KiwiBastard
DependantTransaction by KiwiBastard
Nullable<T> by IainMH
Mutex by Diago
System.IO.Path by ageektrapped
WeakReference by Juan Manuel
Methods and Properties
String.IsNullOrEmpty() method by KiwiBastard
List.ForEach() method by KiwiBastard
BeginInvoke(), EndInvoke() methods by Will Dean
Nullable<T>.HasValue and Nullable<T>.Value properties by Rismo
GetValueOrDefault method by John Sheehan
Tips & Tricks
Nice method for event handlers by Andreas H.R. Nilsson
Uppercase comparisons by John
Access anonymous types without reflection by dp
A quick way to lazily instantiate collection properties by Will
JavaScript-like anonymous inline-functions by roosteronacid
Other
netmodules by kokos
LINQBridge by Duncan Smart
Parallel Extensions by Joel Coehoorn
This isn't C# per se, but I haven't seen anyone who really uses System.IO.Path.Combine() to the extent that they should. In fact, the whole Path class is really useful, but no one uses it!
I'm willing to bet that every production app has the following code, even though it shouldn't:
string path = dir + "\\" + fileName;
lambdas and type inference are underrated. Lambdas can have multiple statements and they double as a compatible delegate object automatically (just make sure the signature match) as in:
Console.CancelKeyPress +=
(sender, e) => {
Console.WriteLine("CTRL+C detected!\n");
e.Cancel = true;
};
Note that I don't have a new CancellationEventHandler nor do I have to specify types of sender and e, they're inferable from the event. Which is why this is less cumbersome to writing the whole delegate (blah blah) which also requires you to specify types of parameters.
Lambdas don't need to return anything and type inference is extremely powerful in context like this.
And BTW, you can always return Lambdas that make Lambdas in the functional programming sense. For example, here's a lambda that makes a lambda that handles a Button.Click event:
Func<int, int, EventHandler> makeHandler =
(dx, dy) => (sender, e) => {
var btn = (Button) sender;
btn.Top += dy;
btn.Left += dx;
};
btnUp.Click += makeHandler(0, -1);
btnDown.Click += makeHandler(0, 1);
btnLeft.Click += makeHandler(-1, 0);
btnRight.Click += makeHandler(1, 0);
Note the chaining: (dx, dy) => (sender, e) =>
Now that's why I'm happy to have taken the functional programming class :-)
Other than the pointers in C, I think it's the other fundamental thing you should learn :-)
From Rick Strahl:
You can chain the ?? operator so that you can do a bunch of null comparisons.
string result = value1 ?? value2 ?? value3 ?? String.Empty;
Aliased generics:
using ASimpleName = Dictionary<string, Dictionary<string, List<string>>>;
It allows you to use ASimpleName, instead of Dictionary<string, Dictionary<string, List<string>>>.
Use it when you would use the same generic big long complex thing in a lot of places.
From CLR via C#:
When normalizing strings, it is highly
recommended that you use
ToUpperInvariant instead of
ToLowerInvariant because Microsoft has
optimized the code for performing
uppercase comparisons.
I remember one time my coworker always changed strings to uppercase before comparing. I've always wondered why he does that because I feel it's more "natural" to convert to lowercase first. After reading the book now I know why.
My favorite trick is using the null coalesce operator and parentheses to automagically instantiate collections for me.
private IList<Foo> _foo;
public IList<Foo> ListOfFoo
{ get { return _foo ?? (_foo = new List<Foo>()); } }
Avoid checking for null event handlers
Adding an empty delegate to events at declaration, suppressing the need to always check the event for null before calling it is awesome. Example:
public delegate void MyClickHandler(object sender, string myValue);
public event MyClickHandler Click = delegate {}; // add empty delegate!
Let you do this
public void DoSomething()
{
Click(this, "foo");
}
Instead of this
public void DoSomething()
{
// Unnecessary!
MyClickHandler click = Click;
if (click != null) // Unnecessary!
{
click(this, "foo");
}
}
Please also see this related discussion and this blog post by Eric Lippert on this topic (and possible downsides).
Everything else, plus
1) implicit generics (why only on methods and not on classes?)
void GenericMethod<T>( T input ) { ... }
//Infer type, so
GenericMethod<int>(23); //You don't need the <>.
GenericMethod(23); //Is enough.
2) simple lambdas with one parameter:
x => x.ToString() //simplify so many calls
3) anonymous types and initialisers:
//Duck-typed: works with any .Add method.
var colours = new Dictionary<string, string> {
{ "red", "#ff0000" },
{ "green", "#00ff00" },
{ "blue", "#0000ff" }
};
int[] arrayOfInt = { 1, 2, 3, 4, 5 };
Another one:
4) Auto properties can have different scopes:
public int MyId { get; private set; }
Thanks #pzycoman for reminding me:
5) Namespace aliases (not that you're likely to need this particular distinction):
using web = System.Web.UI.WebControls;
using win = System.Windows.Forms;
web::Control aWebControl = new web::Control();
win::Control aFormControl = new win::Control();
I didn't know the "as" keyword for quite a while.
MyClass myObject = (MyClass) obj;
vs
MyClass myObject = obj as MyClass;
The second will return null if obj isn't a MyClass, rather than throw a class cast exception.
Two things I like are Automatic properties so you can collapse your code down even further:
private string _name;
public string Name
{
get
{
return _name;
}
set
{
_name = value;
}
}
becomes
public string Name { get; set;}
Also object initializers:
Employee emp = new Employee();
emp.Name = "John Smith";
emp.StartDate = DateTime.Now();
becomes
Employee emp = new Employee {Name="John Smith", StartDate=DateTime.Now()}
The 'default' keyword in generic types:
T t = default(T);
results in a 'null' if T is a reference type, and 0 if it is an int, false if it is a boolean,
etcetera.
Attributes in general, but most of all DebuggerDisplay. Saves you years.
The # tells the compiler to ignore any
escape characters in a string.
Just wanted to clarify this one... it doesn't tell it to ignore the escape characters, it actually tells the compiler to interpret the string as a literal.
If you have
string s = #"cat
dog
fish"
it will actually print out as (note that it even includes the whitespace used for indentation):
cat
dog
fish
I think one of the most under-appreciated and lesser-known features of C# (.NET 3.5) are Expression Trees, especially when combined with Generics and Lambdas. This is an approach to API creation that newer libraries like NInject and Moq are using.
For example, let's say that I want to register a method with an API and that API needs to get the method name
Given this class:
public class MyClass
{
public void SomeMethod() { /* Do Something */ }
}
Before, it was very common to see developers do this with strings and types (or something else largely string-based):
RegisterMethod(typeof(MyClass), "SomeMethod");
Well, that sucks because of the lack of strong-typing. What if I rename "SomeMethod"? Now, in 3.5 however, I can do this in a strongly-typed fashion:
RegisterMethod<MyClass>(cl => cl.SomeMethod());
In which the RegisterMethod class uses Expression<Action<T>> like this:
void RegisterMethod<T>(Expression<Action<T>> action) where T : class
{
var expression = (action.Body as MethodCallExpression);
if (expression != null)
{
// TODO: Register method
Console.WriteLine(expression.Method.Name);
}
}
This is one big reason that I'm in love with Lambdas and Expression Trees right now.
"yield" would come to my mind. Some of the attributes like [DefaultValue()] are also among my favorites.
The "var" keyword is a bit more known, but that you can use it in .NET 2.0 applications as well (as long as you use the .NET 3.5 compiler and set it to output 2.0 code) does not seem to be known very well.
Edit: kokos, thanks for pointing out the ?? operator, that's indeed really useful. Since it's a bit hard to google for it (as ?? is just ignored), here is the MSDN documentation page for that operator: ?? Operator (C# Reference)
I tend to find that most C# developers don't know about 'nullable' types. Basically, primitives that can have a null value.
double? num1 = null;
double num2 = num1 ?? -100;
Set a nullable double, num1, to null, then set a regular double, num2, to num1 or -100 if num1 was null.
http://msdn.microsoft.com/en-us/library/1t3y8s4s(VS.80).aspx
one more thing about Nullable type:
DateTime? tmp = new DateTime();
tmp = null;
return tmp.ToString();
it is return String.Empty. Check this link for more details
Here are some interesting hidden C# features, in the form of undocumented C# keywords:
__makeref
__reftype
__refvalue
__arglist
These are undocumented C# keywords (even Visual Studio recognizes them!) that were added to for a more efficient boxing/unboxing prior to generics. They work in coordination with the System.TypedReference struct.
There's also __arglist, which is used for variable length parameter lists.
One thing folks don't know much about is System.WeakReference -- a very useful class that keeps track of an object but still allows the garbage collector to collect it.
The most useful "hidden" feature would be the yield return keyword. It's not really hidden, but a lot of folks don't know about it. LINQ is built atop this; it allows for delay-executed queries by generating a state machine under the hood. Raymond Chen recently posted about the internal, gritty details.
Unions (the C++ shared memory kind) in pure, safe C#
Without resorting to unsafe mode and pointers, you can have class members share memory space in a class/struct. Given the following class:
[StructLayout(LayoutKind.Explicit)]
public class A
{
[FieldOffset(0)]
public byte One;
[FieldOffset(1)]
public byte Two;
[FieldOffset(2)]
public byte Three;
[FieldOffset(3)]
public byte Four;
[FieldOffset(0)]
public int Int32;
}
You can modify the values of the byte fields by manipulating the Int32 field and vice-versa. For example, this program:
static void Main(string[] args)
{
A a = new A { Int32 = int.MaxValue };
Console.WriteLine(a.Int32);
Console.WriteLine("{0:X} {1:X} {2:X} {3:X}", a.One, a.Two, a.Three, a.Four);
a.Four = 0;
a.Three = 0;
Console.WriteLine(a.Int32);
}
Outputs this:
2147483647
FF FF FF 7F
65535
just add
using System.Runtime.InteropServices;
Using # for variable names that are keywords.
var #object = new object();
var #string = "";
var #if = IpsoFacto();
If you want to exit your program without calling any finally blocks or finalizers use FailFast:
Environment.FailFast()
Returning anonymous types from a method and accessing members without reflection.
// Useful? probably not.
private void foo()
{
var user = AnonCast(GetUserTuple(), new { Name = default(string), Badges = default(int) });
Console.WriteLine("Name: {0} Badges: {1}", user.Name, user.Badges);
}
object GetUserTuple()
{
return new { Name = "dp", Badges = 5 };
}
// Using the magic of Type Inference...
static T AnonCast<T>(object obj, T t)
{
return (T) obj;
}
Here's a useful one for regular expressions and file paths:
"c:\\program files\\oldway"
#"c:\program file\newway"
The # tells the compiler to ignore any escape characters in a string.
Mixins. Basically, if you want to add a feature to several classes, but cannot use one base class for all of them, get each class to implement an interface (with no members). Then, write an extension method for the interface, i.e.
public static DeepCopy(this IPrototype p) { ... }
Of course, some clarity is sacrificed. But it works!
Not sure why anyone would ever want to use Nullable<bool> though. :-)
True, False, FileNotFound?
This one is not "hidden" so much as it is misnamed.
A lot of attention is paid to the algorithms "map", "reduce", and "filter". What most people don't realize is that .NET 3.5 added all three of these algorithms, but it gave them very SQL-ish names, based on the fact that they're part of LINQ.
"map" => Select Transforms data
from one form into another
"reduce" => Aggregate Aggregates
values into a single result
"filter" => Where Filters data
based on a criteria
The ability to use LINQ to do inline work on collections that used to take iteration and conditionals can be incredibly valuable. It's worth learning how all the LINQ extension methods can help make your code much more compact and maintainable.
Environment.NewLine
for system independent newlines.
If you're trying to use curly brackets inside a String.Format expression...
int foo = 3;
string bar = "blind mice";
String.Format("{{I am in brackets!}} {0} {1}", foo, bar);
//Outputs "{I am in brackets!} 3 blind mice"
?? - coalescing operator
using (statement / directive) - great keyword that can be used for more than just calling Dispose
readonly - should be used more
netmodules - too bad there's no support in Visual Studio
#Ed, I'm a bit reticent about posting this as it's little more than nitpicking. However, I would point out that in your code sample:
MyClass c;
if (obj is MyClass)
c = obj as MyClass
If you're going to use 'is', why follow it up with a safe cast using 'as'? If you've ascertained that obj is indeed MyClass, a bog-standard cast:
c = (MyClass)obj
...is never going to fail.
Similarly, you could just say:
MyClass c = obj as MyClass;
if(c != null)
{
...
}
I don't know enough about .NET's innards to be sure, but my instincts tell me that this would cut a maximum of two type casts operations down to a maximum of one. It's hardly likely to break the processing bank either way; personally, I think the latter form looks cleaner too.
Maybe not an advanced technique, but one I see all the time that drives me crazy:
if (x == 1)
{
x = 2;
}
else
{
x = 3;
}
can be condensed to:
x = (x==1) ? 2 : 3;

Indexing arrays with enums in C#

I have a lot of fixed-size collections of numbers where each entry can be accessed with a constant. Naturally this seems to point to arrays and enums:
enum StatType {
Foo = 0,
Bar
// ...
}
float[] stats = new float[...];
stats[StatType.Foo] = 1.23f;
The problem with this is of course that you cannot use an enum to index an array without a cast (though the compiled IL is using plain ints). So you have to write this all over the place:
stats[(int)StatType.foo] = 1.23f;
I have tried to find ways to use the same easy syntax without casting but haven't found a perfect solution yet. Using a dictionary seems to be out of the question since I found it to be around 320 times slower than an array. I also tried to write a generic class for an array with enums as index:
public sealed class EnumArray<T>
{
private T[] array;
public EnumArray(int size)
{
array = new T[size];
}
// slow!
public T this[Enum idx]
{
get { return array[(int)(object)idx]; }
set { array[(int)(object)idx] = value; }
}
}
or even a variant with a second generic parameter specifying the enum. This comes quite close to what I want but the problem is that you cannot just cast an unspecific enum (be it from a generic parameter or the boxed type Enum) to int. Instead you have to first box it with a cast to object and then cast it back. This works, but is quite slow. I found that the generated IL for the indexer looks something like this:
.method public hidebysig specialname instance !T get_Item(!E idx) cil managed
{
.maxstack 8
L_0000: ldarg.0
L_0001: ldfld !0[] EnumArray`2<!T, !E>::array
L_0006: ldarg.1
L_0007: box !E
L_000c: unbox.any int32
L_0011: ldelem.any !T
L_0016: ret
}
As you can see there are unnecessary box and unbox instructions there. If you strip them from the binary the code works just fine and is just a tad slower than pure array access.
Is there any way to easily overcome this problem? Or maybe even better ways?
I think it would also be possible to tag such indexer methods with a custom attribute and strip those two instructions post-compile. What would be a suitable library for that? Maybe Mono.Cecil?
Of course there's always the possibility to drop enums and use constants like this:
static class StatType {
public const int Foo = 0;
public const int Bar = 1;
public const int End = 2;
}
which may be the fastest way since you can directly access the array.
I suspect you may be able to make it a bit faster by compiling a delegate to do the conversion for you, such that it doesn't require boxing and unboxing. An expression tree may well be the simplest way of doing that if you're using .NET 3.5. (You'd use that in your EnumArray example.)
Personally I'd be very tempted to use your const int solution. It's not like .NET provides enum value validation anyway by default - i.e. your callers could always cast int.MaxValue to your enum type, and you'd get an ArrayIndexException (or whatever). So, given the relative lack of protection / type safety you're already getting, the constant value answer is appealing.
Hopefully Marc Gravell will be along in a minute to flesh out the compiled conversion delegate idea though...
If your EnumArray wasn't generic, but instead explicitly took a StatType indexer - then you'd be fine. If that's not desirable, then I'd probably use the const approach myself. However, a quick test with passing in a Func<T, E> shows no appreciable difference vs direct access.
public class EnumArray<T, E> where E:struct {
private T[] _array;
private Func<E, int> _convert;
public EnumArray(int size, Func<E, int> convert) {
this._array = new T[size];
this._convert = convert;
}
public T this[E index] {
get { return this._array[this._convert(index)]; }
set { this._array[this._convert(index)] = value; }
}
}
If you have a lot of fixed-size collections, then it would probably be easier to wrap up your properties in an object than a float[]:
public class Stats
{
public float Foo = 1.23F;
public float Bar = 3.14159F;
}
Passing an object around will give you the type safety, concise code, and constant-time access that you want.
And if you really need to use an array, its easy enough to add a ToArray() method which maps the properties of your object to a float[].
struct PseudoEnum
{
public const int INPT = 0;
public const int CTXT = 1;
public const int OUTP = 2;
};
// ...
String[] arr = new String[3];
arr[PseudoEnum.CTXT] = "can";
arr[PseudoEnum.INPT] = "use";
arr[PseudoEnum.CTXT] = "as";
arr[PseudoEnum.CTXT] = "array";
arr[PseudoEnum.OUTP] = "index";
(I also posted this answer at https://stackoverflow.com/a/12901745/147511)
[edit: oops I just noticed that Steven Behnke mentioned this approach elsewhere on this page. Sorry; but at least this shows an example of doing it...]
enums are supposed to be type safe. If you're using them as the index of an array, you're fixing both the type and the values of the enum, so you have no benefit over declaring a static class of int constants.
I don't believe there is any way to add an implicit conversion operator to an enum, unfortunately. So you'll have to either live with ugly typecasts or just use a static class with consts.
Here's a StackOverflow question that discusses more on the implicit conversion operator:
Can we define implicit conversions of enums in c#?
I'm not 100% familiar with C#, but I've seen implicit operators used to map one type to another before. Can you create an implicit operator for the Enum type that allows you to use it as an int?

Categories

Resources