Slimming down a switch statement

Wondering if there are good alternatives to this that perform no worse than what I have below? The real switch statement has additional sections for other non-English characters.
Note that I'd love to put multiple case statements per line, but StyleCop doesn't like it and will fail our release build as a result.
var retVal = String.Empty;
switch(valToCheck)
{
case "é":
case "ê":
case "è":
case "ë":
retVal = "e";
break;
case "à":
case "â":
case "ä":
case "å":
retVal = "a";
break;
default:
retVal = "-";
break;
}

The first thing that comes to mind is a Dictionary<char,char>()
(I prefer char instead of strings because you are dealing with chars)
Dictionary<char,char> dict = new Dictionary<char,char>();
dict.Add('å', 'a');
......
then you could remove your entire switch
char retValue;
char testValue = 'å';
if (!dict.TryGetValue(testValue, out retValue))
retValue = '-';
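Putting the pieces above together, a minimal sketch might look like the following. The class and method names (AccentFolder, Fold) are made up for illustration, and the table is deliberately partial; the '-' fallback mirrors the default branch of the switch in the question.

```csharp
using System.Collections.Generic;

public static class AccentFolder
{
    // Hypothetical, partial table; the real one would list every character of interest.
    private static readonly Dictionary<char, char> Map = new Dictionary<char, char>
    {
        { 'é', 'e' }, { 'ê', 'e' }, { 'è', 'e' }, { 'ë', 'e' },
        { 'à', 'a' }, { 'â', 'a' }, { 'ä', 'a' }, { 'å', 'a' },
    };

    // Returns the mapped character, or '-' when there is no entry
    // (the equivalent of the switch's default branch).
    public static char Fold(char c)
    {
        char mapped;
        return Map.TryGetValue(c, out mapped) ? mapped : '-';
    }
}
```

Calling AccentFolder.Fold('å') gives 'a', and any character without an entry gives '-'.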

Well, start off by doing this transformation.
public class CharacterSanitizer
{
private static Dictionary<string, string> characterMappings = new Dictionary<string, string>();
static CharacterSanitizer()
{
characterMappings.Add("é", "e");
characterMappings.Add("ê", "e");
//...
}
public static string mapCharacter(string input)
{
string output;
if (characterMappings.TryGetValue(input, out output))
{
return output;
}
else
{
return input;
}
}
}
Now you're in the position where the character mappings are part of the data, rather than the code. I've hard-coded the values here, but at this point it is simple enough to store the mappings in a file, read the file in and populate the dictionary accordingly. This way you not only clean up the code a lot by reducing the case statement to a single text file (outside of the code), but you can also modify the mappings without needing to recompile.
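As a rough sketch of the file-driven variant, assuming a made-up line format of "é=e" with '#' comment lines (neither the format nor the MappingLoader name comes from the answer):

```csharp
using System;
using System.Collections.Generic;

public static class MappingLoader
{
    // Parses lines of the assumed form "é=e".
    // Blank lines and lines starting with '#' are skipped.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var mappings = new Dictionary<string, string>();
        foreach (var line in lines)
        {
            var trimmed = line.Trim();
            if (trimmed.Length == 0 || trimmed.StartsWith("#", StringComparison.Ordinal))
            {
                continue;
            }
            int separator = trimmed.IndexOf('=');
            if (separator <= 0)
            {
                continue; // malformed lines are simply ignored in this sketch
            }
            mappings[trimmed.Substring(0, separator)] = trimmed.Substring(separator + 1);
        }
        return mappings;
    }
}
```

The static constructor could then populate characterMappings from something like MappingLoader.Parse(File.ReadLines("mappings.txt")), where the file name is hypothetical.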

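You could make a small range check on the character's code point. (These values lie outside ASCII; they are the Latin-1/Unicode code points.)
Assuming InRange(val, min, max) checks whether a number lies in the inclusive range, and that valToCheck is a one-character string as in the question:
int code = valToCheck[0]; // code point of the first character
if (InRange(code, 232, 235)) // è é ê ë
return 'e';
else if (InRange(code, 224, 229)) // à á â ã ä å
return 'a';
This makes the code a little cryptic, and it depends on the character encoding in use, but it is perhaps something to consider.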

This answer presumes that you are going to apply that switch statement to a string, not just to single characters (though that would also work).
The best approach seems to be the one outlined in this StackOverflow answer.
I adapted it to use LINQ:
var chars = from character in valToCheck.Normalize(NormalizationForm.FormD)
where CharUnicodeInfo.GetUnicodeCategory(character)
!= UnicodeCategory.NonSpacingMark
select character;
return string.Join("", chars).Normalize(NormalizationForm.FormC);
You'll need using directives for System.Globalization and System.Text.
Sample input:
string valToCheck = "êéÈöü";
Sample output:
eeEou

Based on Michael Kaplan's RemoveDiacritics(), you could do something like this:
static char RemoveDiacritics(char c)
{
string stFormD = c.ToString().Normalize(NormalizationForm.FormD);
StringBuilder sb = new StringBuilder();
for (int ich = 0; ich < stFormD.Length; ich++)
{
UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
if (uc != UnicodeCategory.NonSpacingMark)
{
sb.Append(stFormD[ich]);
}
}
return (sb.ToString()[0]);
}
switch(RemoveDiacritics(valToCheck))
{
case 'e':
//...
break;
case 'a':
//...
break;
//...
}
or, potentially even:
retval = RemoveDiacritics(valToCheck);

Use Contains instead of switch.
var retVal = String.Empty;
string es = "éêèë";
if (es.Contains(valToCheck)) retVal = "e";
//etc.

Related

Is there a better way to enumerate text elements from an IEnumerable<char>?

I want to enumerate text elements (groups of Unicode code points that are displayed as single character, like e+´=é) from an IEnumerable<char>. Right now I have the following:
// This code is untested! I assume it works because it's fairly simple and I checked the specification though.
public static IEnumerable<string> AsTextElements(this IEnumerable<char> input)
{
StringBuilder currentElement = new StringBuilder();
char highSurrogate = (char)0;
foreach (var c in input)
{
// Assuming input contains valid UTF-16:
if (char.IsHighSurrogate(c))
{
highSurrogate = c;
continue;
}
int codepoint;
if (char.IsLowSurrogate(c))
{ codepoint = char.ConvertToUtf32(highSurrogate, c); }
else
{ codepoint = c; }
var codepointString = char.ConvertFromUtf32(codepoint);
var category = CharUnicodeInfo.GetUnicodeCategory(codepointString, 0);
switch (category)
{
// Do these catch all combining characters?
case UnicodeCategory.EnclosingMark:
case UnicodeCategory.NonSpacingMark:
case UnicodeCategory.SpacingCombiningMark:
if (currentElement == null)
{ currentElement = new StringBuilder(codepointString); }
else
{ currentElement.Append(codepointString); }
break;
default:
if (currentElement.Length != 0)
{
yield return currentElement.ToString();
currentElement.Clear();
}
currentElement.Append(codepointString);
break;
}
}
yield return currentElement.ToString();
}
What irks me about this are all the codepointString strings being created here, even though I need at most 32 bits for each code point. I couldn't find a method that gets the Unicode category directly from an int or two chars.
Adding the char(s) to the currentElement StringBuilder is easily possible though.
I'm aware of the "measure before optimizing" advice; this question is mainly because it would seem strange to me if it wasn't possible without heap allocations.
I didn't have to iterate text elements without having them available in the same string so far, but I may in the future.

If by text elements you mean "user-perceived characters", then Unicode Standard Annex 29 contains an algorithm for finding the boundaries between "extended grapheme clusters", which may better correspond to "user-perceived characters" than the code points resulting from normalization.
(My previous answer was incorrect, so I deleted it; it suggested using normalization form C, but it's inadequate for finding text elements in many cases.)
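For input that is already a string, .NET ships an enumerator for text elements, StringInfo.GetTextElementEnumerator. The sketch below only shows that API; it does not address the IEnumerable<char> streaming case or the allocation concern from the question, and how closely its notion of a text element follows the extended grapheme clusters of UAX #29 depends on the runtime version. The TextElements wrapper name is made up.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class TextElements
{
    // Splits a string into text elements using the framework's own enumerator.
    public static List<string> Split(string input)
    {
        var result = new List<string>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(input);
        while (enumerator.MoveNext())
        {
            result.Add(enumerator.GetTextElement());
        }
        return result;
    }
}
```

For example, TextElements.Split("e\u0301a") yields two elements: the "e" together with its combining acute accent, and then "a".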

C# - Dealing with contradictions in string.replace

I'm getting started with C# and programming in general, and I've been playing with "if" statements and arrays and generally getting to grips with things. However, one thing that has stumped me is how you would go about performing a replace operation which is inherently contradictory.
I.e., I have the string "AAABBB", but I want to search through my text and replace all "A"s with "B"s and vice versa. So my intended output would be "BBBAAA".
I'm currently trying to use string.Replace and if statements, but it's not working: it follows the order of the statements, so in the above example I'd get all "A" or all "B".
Code examples:
if (string.Contains("a"));
{
string = string.Replace("a", "b");
}
if (string.Contains("b"));
{
string = string.Replace("b", "a");
}
Any help would be super welcome!

If you're always replacing one character with another, it's probably simplest to convert it to a char[] and go through it one character at a time, fixing each one appropriately - rather than doing "all the As" and then "all the Bs".
public static string PerformReplacements(string text)
{
char[] chars = text.ToCharArray();
for (int i = 0; i < chars.Length; i++)
{
switch (chars[i])
{
case 'A':
chars[i] = 'B';
break;
case 'B':
chars[i] = 'A';
break;
}
}
return new string(chars);
}

Consider using Linq:
s = new string(s.Select(x => x == 'A' ? 'B' : x == 'B' ? 'A' : x).ToArray());

The reason this fails is that all A's are first replaced by B's, and then every B (including the freshly created ones) is replaced back with an A.
A generic way to solve this is the following:
using System;
using System.Text;
using System.Diagnostics.Contracts;
public class Foo {
public static string ParallelReplace (string text, char[] fromc, char[] toc) {
Contract.Requires(text != null);
Contract.Requires(fromc != null);
Contract.Requires(toc != null);
Contract.Requires(fromc.Length == toc.Length);
Contract.Ensures(Contract.Result<string>().Length == text.Length);
Array.Sort(fromc,toc); // sorts the keys in place and reorders toc to match; note that this modifies the caller's arrays
StringBuilder sb = new StringBuilder();
foreach(char c in text) {
int i = Array.BinarySearch(fromc,c);
if(i >= 0) {
sb.Append(toc[i]);
} else {
sb.Append(c);
}
}
return sb.ToString();
}
}
Demo with csharp interactive shell:
csharp> Foo.ParallelReplace("ABasdsadsadaABABB",new char[] {'b','a','s'},new char[] {'f','s','a'});
"ABsadasdasdsABABB"
This represents a mapping {b->f,a->s,s->a}. The method works in O(s*log(n)+n*log(n)), with s the length of the string and n the number of rules.
The contracts are not necessary, but they can help prevent errors if one uses a static code analysis tool.
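A variation on the same single-pass idea, sketched with a dictionary instead of sorted arrays so that the caller's data is left untouched. The SinglePassReplace name is hypothetical, and this is an alternative, not the method above.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SinglePassReplace
{
    // Each character is looked up exactly once in the original text,
    // so a replacement can never be replaced again by a later rule.
    public static string Apply(string text, IDictionary<char, char> rules)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            char replacement;
            sb.Append(rules.TryGetValue(c, out replacement) ? replacement : c);
        }
        return sb.ToString();
    }
}
```

With the rules { 'A' -> 'B', 'B' -> 'A' }, "AAABBB" becomes "BBBAAA", whereas chaining "AAABBB".Replace("A", "B").Replace("B", "A") ends up as "AAAAAA".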

Should a regular expression used to break up lines account for unix/dos issue?

I didn't feel like using XML for the input file of my T4 so I made this snippet that splits up a document into chunks separated by a blank line.
Am I appropriately making the carriage return optional here?
string s = @"Default
Default
CurrencyConversion
Details of currency conversions.
BudgetReportCache
Indicates wheather the budget report is taken from query results or cache.";
string oneLine = @"[\r]\n";
string twoLines = @"[\r]\n[\r]\n";
var chunks = Regex.Split(s, twoLines, RegexOptions.Multiline);
var items = chunks.Select(c=>Regex.Split(c, oneLine, RegexOptions.Multiline)).ToDictionary(c=>c[0], c=>c[1]);
Note: I would never have thought of this, but since I started using Git, I have seen it "say" things that reminded me of the unix2dos issues, which in turn made me think of Mono and finally if I needed to deal with portability (assuming the goal is perfection).

Your regular expressions don't do what you think they do. Putting \r inside a set doesn't accomplish anything; the expression [\r]\n means the same thing as just \r\n.
You can make them work using the ? operator:
string oneLine = @"\r?\n";
string twoLines = @"\r?\n\r?\n";
However, I would suggest that you use the regular String.Split method instead of regular expressions:
// regular (non-verbatim) strings here, so the escapes are real CR/LF characters
string[] oneLine = { "\r\n", "\n" };
string[] twoLines = { "\r\n\r\n", "\n\n" };
var chunks = s.Split(twoLines, StringSplitOptions.None);
var items =
chunks.Select(c => c.Split(oneLine, StringSplitOptions.None))
.ToDictionary(c => c[0], c => c[1]);

Yes, you should allow for different line separators, but that's not how you do it. The square brackets don't make their contents optional, and you aren't taking the old Mac-style \r into account. I'd use these regexes:
string oneLine = @"\r\n|[\r\n]";
string twoLines = @"(?>\r\n|[\r\n]){2}";
That's "carriage-return + linefeed OR carriage-return OR linefeed". The group in twoLines is atomic, (?>...), so that a single \r\n cannot be backtracked into a separate \r and \n and counted as two line breaks.
Also, you don't need the Multiline option. It only changes the meaning of the ^ and $ anchors, which you aren't using (and don't need to use).
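A small sketch of how such patterns could be wired into the chunk-splitting code from the question. The ChunkParser name is made up, and it assumes every chunk has at least two lines and that chunks are separated by exactly one blank line. The blank-line pattern uses an atomic group so that a lone \r\n is never counted as two breaks.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ChunkParser
{
    // One line break: CRLF first, otherwise a lone CR or LF.
    private static readonly Regex LineBreak = new Regex(@"\r\n|[\r\n]");

    // Two consecutive line breaks, i.e. a blank line between chunks.
    private static readonly Regex BlankLine = new Regex(@"(?>\r\n|[\r\n]){2}");

    // Maps the first line of each chunk to its second line.
    public static Dictionary<string, string> Parse(string text)
    {
        return BlankLine.Split(text)
            .Select(chunk => LineBreak.Split(chunk))
            .ToDictionary(lines => lines[0], lines => lines[1]);
    }
}
```

The same input can mix Windows, Unix and old Mac line endings and still split into the same chunks.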

If you want to go the whole hog on portability (and yes, I'm only adding this answer in response to Alan's mention of old Mac-style \r), then you want to cover:
*nix style: \n
DOS/Windows style: \r\n
Old Mac style: \r
EBCDIC style: \u0085 (probably slightly more current-day use than old mac, I'd guess).
Line-separator formatting character: \u2028
Paragraph-separator formatting character: \u2029
Let's not dwell on the precise semantics of \u000B and \u000C, and instead turn this into something sensible (eventually). If we were to try to deal with all of those, how would we do it?
With 6 different line breaks, one of which is a combination of two of the others but must not be treated as two line breaks, dealing with this in the regex itself could be nasty.
Much better would be to filter them all out in a TextReader wrapper:
public class LineBreakNormaliser : TextReader
{
private readonly TextReader _source;
private bool isNewLine(int charAsInt)
{
switch(charAsInt)
{
case '\n': case '\r':
case '\u0085': case '\u2028': case '\u2029':
case '\u000B': case '\u000C':
return true;
default:
return false;
}
}
public LineBreakNormaliser(TextReader source)
{
_source = source;
}
public override void Close()
{
_source.Close();
base.Close();
}
protected override void Dispose(bool disposing)
{
if(disposing)
_source.Dispose();
base.Dispose(disposing);
}
public override int Peek()
{
int i = _source.Peek();
if(i == -1)
return -1;
if(isNewLine(i))
return '\n';
return i;
}
public override int Read()
{
int i = _source.Read();
if(i == -1)
return -1;
if(i == '\r')
{
if(_source.Peek() == '\n')
_source.Read(); //eat next half of CRLF pair.
return '\n'; //a lone CR or a CRLF pair both become a single LF
}
if(isNewLine(i))
return '\n';
return i;
}
public override int Read(char[] buffer, int index, int count)
{
//We take advantage of the fact that we are allowed to return fewer than requested.
//ReadBlock does the work for us for those who need the full amount:
char[] tmpBuffer = new char[count];
int cChars = count = _source.Read(tmpBuffer, 0, count);
if(cChars == 0)
return 0;
for(int i = 0; i != cChars; ++i)
{
char cur = tmpBuffer[i];
if(cur == '\r')
{
if(i == cChars -1)
{
if(_source.Peek() == '\n')
{
_source.Read(); //eat second half of CRLF
--count;
}
}
else if(tmpBuffer[i + 1] == '\n')
{
++i; //skip second half of CRLF
--count;
}
buffer[index++] = '\n';
}
else if(isNewLine(cur))
buffer[index++] = '\n';
else
buffer[index++] = cur; //ordinary characters pass through unchanged
}
return count;
}
}
If you read the file via this text reader, then from this point on your regex can depend on the only newline being \n, and so can any other code.
This done, the regex can actually be simpler than ever. While it's totally overkill for this single case (and only written because, after Alan's mention of OS 9 and earlier, the idea of supporting IBM EBCDIC machines amused me), it is reusable for all other cases. In that context it's not overkill at all, because it becomes "just use the well-tested line-normaliser to make things simpler". (Once it is well tested, that is; I haven't tested any of the above.)
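When the whole input already fits in memory as a string, a much lighter sketch of the same normalisation idea is a single regex replace. The LineBreaks name is made up, and unlike isNewLine above this version deliberately leaves \u000B and \u000C alone; that is an assumption, not something taken from the reader class.

```csharp
using System.Text.RegularExpressions;

public static class LineBreaks
{
    // \r\n is listed first so that a CRLF pair becomes one '\n' rather than two.
    private static readonly Regex AnyBreak =
        new Regex(@"\r\n|[\r\n\u0085\u2028\u2029]");

    // Returns the text with every recognised line break replaced by '\n'.
    public static string Normalise(string text)
    {
        return AnyBreak.Replace(text, "\n");
    }
}
```

After LineBreaks.Normalise(text), splitting on "\n" (or on "\n\n" for blank lines) is enough, which is the simplification the reader wrapper is aiming for.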

Parsing custom data tokens and replacing with values in C#

I have about 10 pieces of data from a record and I want to be able to define the layout of a string where this data is returned, with the option of leaving some pieces out. My thought was to use an enum to give integer values to my tokens/fields and then have a format like {0}{1}{2}{3} or something as complicated as {4} - {3}{1} [{8}]. The meaning of the tokens relates to fields in my database. For instance I have this enum for my tokens relating to payments made.
AccountMask = 0,
AccountLast4 = 1,
AccountFirstDigit = 2,
AccountFirstLetter = 3,
ItemNumber = 4,
Amount = 5
The account mask is a string like VXXXXX1234, where the V is for Visa and 1234 are the last 4 digits of the card. Sometimes clients want the V, sometimes they want the first digit (it's easy to translate a card type into a first digit).
My goal is to create something reusable to generate a string using tokens in a format string that will then use the data associated with the digit inside the token to do an in place replace of the data.
So, for an example using the mask above and my enum, suppose I define the format 9{2}{1}{4:[0:0000000000]} and the item number is 678934. That would translate to 9412340000678934, where the inner part of token 4 becomes the definition of a String.Format for that value. Also, the text placed around the tokens is left untouched and kept in place.
My issue comes down to the manipulation of the string and the best practice. I have been told that regular expressions can be costly if you are going to create them on the fly. As a CS major, I have a feeling the "right" (however complex) solution is to make a lexer/parser for my tokens. I have no experience writing a lexer/parser in C#, so I'm not sure of the best practices around it. I'm looking for guidance here on a system that is efficient and easy to tweak.

I see this problem, and I immediately thought that the solution is pretty simple; store the masks you wish to use, with constant values for various data fields you wish to include, and pass the masks into a constant call to String.Format():
const string CreditCardWithFirstLetterMask = "{3}XXXXXXXXXXX{1}";
const string CreditCardWithFirstDigitMask = "{2}XXXXXXXXXXX{1}";
...
var formattedString = String.Format(CreditCardWithFirstDigitMask,
record.AccountMask,
record.AccountLast4,
record.AccountFirstDigit,
record.AccountFirstLetter,
record.ItemNumber,
record.Amount);

I ended up putting the regex as a static object in the class and then looping through matches to perform replacements and build out my token.
var token = request.TokenFormat;
var matches = tokenExpression.Matches(request.TokenFormat);
foreach (Match match in matches)
{
var value = match.Value;
var tokenCode = (Token)Convert.ToInt32(value.Substring(1, (value.Contains(":") ? value.IndexOf(":") : value.IndexOf("}")) - 1));
object data = null;
switch (tokenCode)
{
case Token.AccountMask:
data = accountMask;
break;
case Token.AccountLast4:
data = accountMask.Substring(accountMask.Length - 4);
break;
case Token.AccountFirstDigit:
string firstLetter = accountMask.Substring(0, 1);
switch (firstLetter.ToUpper())
{
case "A":
data = 3;
break;
case "V":
data = 4;
break;
case "M":
data = 5;
break;
case "D":
data = 6;
break;
}
break;
case Token.AccountFirstLetter:
data = accountMask.Substring(0, 1);
break;
case Token.ItemNumber:
if(item != null)
data = item.PaymentId;
break;
case Token.Amount:
if (item != null)
data = item.Amount;
break;
case Token.PaymentMethodId:
if (paymentMethod != null)
data = paymentMethod.PaymentMethodId;
break;
}
if (formatExpression.IsMatch(value))
{
Match formatMatch = formatExpression.Match(value);
string format = formatMatch.Value.Replace("[", "{").Replace("]", "}");
token = token.Replace(value, String.Format(format, data));
}
else
{
token = token.Replace(value, String.Format("{0}", data));
}
}
return token;
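The answer does not show its tokenExpression or formatExpression, so the following is only a guess at the general shape: one static regex that recognises {n} and {n:[format]}, applied with a match evaluator. The pattern, the TokenFormatter name and the idea of passing the field values as an array indexed by token number are all assumptions.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class TokenFormatter
{
    // Matches {n} or {n:[fmt]}, where fmt is the inside of a String.Format item
    // such as 0:0000000000. Built once and reused, not created per call.
    private static readonly Regex TokenExpression =
        new Regex(@"\{(\d+)(?::\[([^\]]*)\])?\}", RegexOptions.Compiled);

    // values[i] holds the data for token number i; text outside tokens is kept as-is.
    public static string Apply(string layout, params object[] values)
    {
        return TokenExpression.Replace(layout, match =>
        {
            int index = int.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
            object data = index < values.Length ? values[index] : null;
            if (match.Groups[2].Success)
            {
                // Turn [0:0000000000] into {0:0000000000} and let String.Format do the work.
                string format = "{" + match.Groups[2].Value + "}";
                return string.Format(CultureInfo.InvariantCulture, format, data);
            }
            return Convert.ToString(data, CultureInfo.InvariantCulture);
        });
    }
}
```

With the question's example, TokenFormatter.Apply("9{2}{1}{4:[0:0000000000]}", "VXXXXX1234", "1234", 4, "V", 678934, 12.5m) produces 9412340000678934.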

What is quicker, switch on string or elseif on type?

Let's say I have the option of identifying a code path to take on the basis of a string comparison or else by if-ing on the type:
Which is quicker and why?
switch(childNode.Name)
{
case "Bob":
break;
case "Jill":
break;
case "Marko":
break;
}
if(childNode is Bob)
{
}
else if(childNode is Jill)
{
}
else if(childNode is Marko)
{
}
Update: The main reason I ask this is because the switch statement is peculiar about what counts as a case. For example, it won't allow you to use variables, only constants, which get moved to the main assembly. I assumed it had this restriction due to some funky stuff it was doing. If it is only translating to else-ifs (as one poster commented), then why are we not allowed variables in case statements?
Caveat: I am post-optimising. This method is called many times in a slow part of the app.

Greg's profile results are great for the exact scenario he covered, but interestingly, the relative costs of the different methods change dramatically when considering a number of different factors including the number of types being compared, and the relative frequency and any patterns in the underlying data.
The simple answer is that nobody can tell you what the performance difference is going to be in your specific scenario, you will need to measure the performance in different ways yourself in your own system to get an accurate answer.
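A bare-bones way to take such a measurement yourself is a Stopwatch loop. This is only a sketch (the MicroBenchmark name is made up); a real comparison would use representative data, many repetitions and a release build.

```csharp
using System;
using System.Diagnostics;

public static class MicroBenchmark
{
    // Calls the action once as a warm-up (so JIT compilation is not timed),
    // then times the given number of iterations.
    public static TimeSpan Time(Action action, int iterations)
    {
        action();
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Each candidate (switch on string, if/else on type, dictionary lookup) would be wrapped in its own action and timed against the same input.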
The If/Else chain is an effective approach for a small number of type comparisons, or if you can reliably predict which few types are going to make up the majority of the ones that you see. The potential problem with the approach is that as the number of types increases, the number of comparisons that must be executed increases as well.
If I execute the following:
int value = 25124;
if(value == 0) ...
else if (value == 1) ...
else if (value == 2) ...
...
else if (value == 25124) ...
each of the previous if conditions must be evaluated before the correct block is entered. On the other hand
switch(value) {
case 0:...break;
case 1:...break;
case 2:...break;
...
case 25124:...break;
}
will perform one simple jump to the correct bit of code.
Where it gets more complicated in your example is that your other method uses a switch on strings rather than integers. At a low level, strings can't be switched on in the same way that integer values can, so the C# compiler does some magic to make this work for you.
If the switch statement is "small enough" (the compiler decides automatically what it thinks is best), switching on strings generates code that is the same as an if/else chain.
switch(someString) {
case "Foo": DoFoo(); break;
case "Bar": DoBar(); break;
default: DoOther(); break;
}
is the same as:
if(someString == "Foo") {
DoFoo();
} else if(someString == "Bar") {
DoBar();
} else {
DoOther();
}
Once the list of items in the switch gets "big enough", the compiler will automatically create an internal dictionary that maps from the strings in the switch to an integer index, and then a switch based on that index.
It looks something like this (just imagine more entries than I am going to bother to type).
A static field of type Dictionary<string, int> with a mangled name is defined in a "hidden" location associated with the class containing the switch statement:
//Make sure the dictionary is loaded
if(theDictionary == null) {
//This is simplified for clarity, the actual implementation is more complex
// in order to ensure thread safety
theDictionary = new Dictionary<string,int>();
theDictionary["Foo"] = 0;
theDictionary["Bar"] = 1;
}
int switchIndex;
if(theDictionary.TryGetValue(someString, out switchIndex)) {
switch(switchIndex) {
case 0: DoFoo(); break;
case 1: DoBar(); break;
}
} else {
DoOther();
}
In some quick tests that I just ran, the If/Else method is about 3x as fast as the switch for 3 different types (where the types are randomly distributed). At 25 types the switch is faster by a small margin (16%); at 50 types the switch is more than twice as fast.
If you are going to be switching on a large number of types, I would suggest a 3rd method:
private delegate void NodeHandler(ChildNode node);
static Dictionary<RuntimeTypeHandle, NodeHandler> TypeHandleSwitcher = CreateSwitcher();
private static Dictionary<RuntimeTypeHandle, NodeHandler> CreateSwitcher()
{
var ret = new Dictionary<RuntimeTypeHandle, NodeHandler>();
ret[typeof(Bob).TypeHandle] = HandleBob;
ret[typeof(Jill).TypeHandle] = HandleJill;
ret[typeof(Marko).TypeHandle] = HandleMarko;
return ret;
}
void HandleChildNode(ChildNode node)
{
NodeHandler handler;
if (TypeHandleSwitcher.TryGetValue(Type.GetTypeHandle(node), out handler))
{
handler(node);
}
else
{
//Unexpected type...
}
}
This is similar to what Ted Elliot suggested, but the usage of runtime type handles instead of full type objects avoids the overhead of loading the type object through reflection.
Here are some quick timings on my machine:
Testing 3 iterations with 5,000,000 data elements (mode=Random) and 5 types
Method Time % of optimal
If/Else 179.67 100.00
TypeHandleDictionary 321.33 178.85
TypeDictionary 377.67 210.20
Switch 492.67 274.21
Testing 3 iterations with 5,000,000 data elements (mode=Random) and 10 types
Method Time % of optimal
If/Else 271.33 100.00
TypeHandleDictionary 312.00 114.99
TypeDictionary 374.33 137.96
Switch 490.33 180.71
Testing 3 iterations with 5,000,000 data elements (mode=Random) and 15 types
Method Time % of optimal
TypeHandleDictionary 312.00 100.00
If/Else 369.00 118.27
TypeDictionary 371.67 119.12
Switch 491.67 157.59
Testing 3 iterations with 5,000,000 data elements (mode=Random) and 20 types
Method Time % of optimal
TypeHandleDictionary 335.33 100.00
TypeDictionary 373.00 111.23
If/Else 462.67 137.97
Switch 490.33 146.22
Testing 3 iterations with 5,000,000 data elements (mode=Random) and 25 types
Method Time % of optimal
TypeHandleDictionary 319.33 100.00
TypeDictionary 371.00 116.18
Switch 483.00 151.25
If/Else 562.00 175.99
Testing 3 iterations with 5,000,000 data elements (mode=Random) and 50 types
Method Time % of optimal
TypeHandleDictionary 319.67 100.00
TypeDictionary 376.67 117.83
Switch 453.33 141.81
If/Else 1,032.67 323.04
On my machine at least, the type handle dictionary approach beats all of the others for anything over 15 different types when the distribution of the types used as input to the method is random.
If on the other hand, the input is composed entirely of the type that is checked first in the if/else chain that method is much faster:
Testing 3 iterations with 5,000,000 data elements (mode=UniformFirst) and 50 types
Method Time % of optimal
If/Else 39.00 100.00
TypeHandleDictionary 317.33 813.68
TypeDictionary 396.00 1,015.38
Switch 403.00 1,033.33
Conversely, if the input is always the last thing in the if/else chain, it has the opposite effect:
Testing 3 iterations with 5,000,000 data elements (mode=UniformLast) and 50 types
Method Time % of optimal
TypeHandleDictionary 317.67 100.00
Switch 354.33 111.54
TypeDictionary 377.67 118.89
If/Else 1,907.67 600.52
If you can make some assumptions about your input, you might get the best performance from a hybrid approach where you perform if/else checks for the few types that are most common, and then fall back to a dictionary-driven approach if those fail.
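One way such a hybrid could look, as a sketch only: an if check for the type assumed to be most common, then a dictionary lookup for the rest. Which type is "common" is an assumption here, the handlers just return a label, and the dictionary is keyed on Type for brevity instead of RuntimeTypeHandle.

```csharp
using System;
using System.Collections.Generic;

public class ChildNode { }
public class Bob : ChildNode { }
public class Jill : ChildNode { }
public class Marko : ChildNode { }

public static class HybridDispatcher
{
    // Handlers for the less common types, looked up by exact runtime type.
    private static readonly Dictionary<Type, Func<ChildNode, string>> Rare =
        new Dictionary<Type, Func<ChildNode, string>>
        {
            { typeof(Jill), node => "Jill" },
            { typeof(Marko), node => "Marko" },
        };

    public static string Handle(ChildNode node)
    {
        // Fast path: the type assumed to dominate the input.
        if (node is Bob)
        {
            return "Bob";
        }

        // Slow path: one dictionary lookup covers every remaining known type.
        Func<ChildNode, string> handler;
        return Rare.TryGetValue(node.GetType(), out handler) ? handler(node) : "unknown";
    }
}
```

Note one behavioural difference: the is check also matches subclasses of Bob, while the dictionary lookup only matches the exact runtime type.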

Firstly, you're comparing apples and oranges. You'd first need to compare switch on type vs switch on string, and then if on type vs if on string, and then compare the winners.
Secondly, this is the kind of thing OO was designed for. In languages that support OO, switching on type (of any kind) is a code smell that points to poor design. The solution is to derive from a common base with an abstract or virtual method (or a similar construct, depending on your language)
eg.
class Node
{
public virtual void Action()
{
// Perform default action
}
}
class Bob : Node
{
public override void Action()
{
// Perform action for Bob
}
}
class Jill : Node
{
public override void Action()
{
// Perform action for Jill
}
}
Then, instead of doing the switch statement, you just call childNode.Action()

I just implemented a quick test application and profiled it with ANTS 4.
Spec: .Net 3.5 sp1 in 32bit Windows XP, code built in release mode.
3 million tests:
Switch: 1.842 seconds
If: 0.344 seconds.
Furthermore, the switch statement results reveal (unsurprisingly) that longer names take longer.
1 million tests
Bob: 0.612 seconds.
Jill: 0.835 seconds.
Marko: 1.093 seconds.
It looks like the "If Else" is faster, at least in the scenario I created.
class Program
{
static void Main( string[] args )
{
Bob bob = new Bob();
Jill jill = new Jill();
Marko marko = new Marko();
for( int i = 0; i < 1000000; i++ )
{
Test( bob );
Test( jill );
Test( marko );
}
}
public static void Test( ChildNode childNode )
{
TestSwitch( childNode );
TestIfElse( childNode );
}
private static void TestIfElse( ChildNode childNode )
{
if( childNode is Bob ){}
else if( childNode is Jill ){}
else if( childNode is Marko ){}
}
private static void TestSwitch( ChildNode childNode )
{
switch( childNode.Name )
{
case "Bob":
break;
case "Jill":
break;
case "Marko":
break;
}
}
}
class ChildNode { public string Name { get; set; } }
class Bob : ChildNode { public Bob(){ this.Name = "Bob"; }}
class Jill : ChildNode{public Jill(){this.Name = "Jill";}}
class Marko : ChildNode{public Marko(){this.Name = "Marko";}}

Switch statement is faster to execute than the if-else-if ladder. This is due to the compiler's ability to optimise the switch statement. In the case of the if-else-if ladder, the code must process each if statement in the order determined by the programmer. However, because each case within a switch statement does not rely on earlier cases, the compiler is able to re-order the testing in such a way as to provide the fastest execution.

If you've got the classes made, I'd suggest using a Strategy design pattern instead of switch or elseif.

Try using enumerations for each object, you can switch on enums quickly and easily.

Unless you've already written this and find you have a performance problem I wouldn't worry about which is quicker. Go with the one that's more readable. Remember, "Premature optimization is the root of all evil." - Donald Knuth

A SWITCH construct was originally intended for integer data; its intent was to use the argument directly as an index into a "dispatch table", a table of pointers. As such, there would be a single test, then a jump directly to the relevant code, rather than a series of tests.
The difficulty here is that its use has been generalized to "string" types, which obviously cannot be used as an index, and all advantage of the SWITCH construct is lost.
If speed is your intended goal, the problem is NOT your code, but your data structure. If the "name" space is as simple as you show it, better to code it into an integer value (when data is created, for example), and use this integer in the "many times in a slow part of the app".
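A sketch of that suggestion, assuming the set of names is fixed and known in advance: the name is translated to an enum value once, when the object is created, and the hot path switches on the integer-backed value. NamedNode, NodeKind and KindDispatcher are made-up names.

```csharp
using System;

public enum NodeKind { Bob, Jill, Marko }

public sealed class NamedNode
{
    public string Name { get; private set; }
    public NodeKind Kind { get; private set; }

    public NamedNode(string name)
    {
        Name = name;
        // The string work happens once, here, not on every dispatch.
        // Enum.Parse throws ArgumentException for a name that is not in NodeKind.
        Kind = (NodeKind)Enum.Parse(typeof(NodeKind), name);
    }
}

public static class KindDispatcher
{
    // Switching on an enum lets the compiler use an integer jump rather than string comparisons.
    public static int Handle(NamedNode node)
    {
        switch (node.Kind)
        {
            case NodeKind.Bob: return 1;
            case NodeKind.Jill: return 2;
            case NodeKind.Marko: return 3;
            default: return 0;
        }
    }
}
```

The return values are placeholders standing in for whatever each branch really does.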

If the types you're switching on are primitive .NET types you can use Type.GetTypeCode(Type), but if they're custom types they will all come back as TypeCode.Object.
A dictionary with delegates or handler classes might work as well.
Dictionary<Type, HandlerDelegate> handlers = new Dictionary<Type, HandlerDelegate>();
handlers[typeof(Bob)] = this.HandleBob;
handlers[typeof(Jill)] = this.HandleJill;
handlers[typeof(Marko)] = this.HandleMarko;
handlers[childNode.GetType()](childNode);
/// ...
private void HandleBob(Node childNode) {
// code to handle Bob
}

The switch() will compile out to code equivalent to a set of else ifs. The string comparisons will be much slower than the type comparisons.

I recall reading in several reference books that the if/else branching is quicker than the switch statement. However, a bit of research on Blackwasp shows that the switch statement is actually faster:
http://www.blackwasp.co.uk/SpeedTestIfElseSwitch.aspx
In reality, if you're comparing the typical 3 to 10 (or so) statements, I seriously doubt there's any real performance gain using one or the other.
As Chris has already said, go for readability.

I think the main performance issue here is, that in the switch block, you compare strings, and that in the if-else block, you check for types... Those two are not the same, and therefore, I'd say you're "comparing potatoes to bananas".
I'd start by comparing this:
switch(childNode.Name)
{
case "Bob":
break;
case "Jill":
break;
case "Marko":
break;
}
if(childNode.Name == "Bob")
{}
else if(childNode.Name == "Jill")
{}
else if(childNode.Name == "Marko")
{}

I'm not sure how much faster it would be, but the right design would be to go for polymorphism.
interface INode
{
void Action();
}
class Bob : INode
{
public void Action()
{
}
}
class Jill : INode
{
public void Action()
{
}
}
class Marko : INode
{
public void Action()
{
}
}
//Your function:
void Do(INode childNode)
{
childNode.Action();
}
Seeing what your switch statement does would help. If your function is not really about an action on the type, maybe you could define an enum value for each type.
enum NodeType { Bob, Jill, Marko, Default }
interface INode
{
NodeType Node { get; }
}
class Bob : INode
{
public NodeType Node { get { return NodeType.Bob; } }
}
class Jill : INode
{
public NodeType Node { get { return NodeType.Jill; } }
}
class Marko : INode
{
public NodeType Node { get { return NodeType.Marko; } }
}
//Your function:
void Do(INode childNode)
{
switch(childNode.Node)
{
case NodeType.Bob:
break;
case NodeType.Jill:
break;
case NodeType.Marko:
break;
default:
throw new ArgumentException();
}
}
I assume this has to be faster than both approaches in the question. You might want to try the abstract class route if nanoseconds matter to you.

I created a little console app to show my solution, just to highlight the speed difference. I used a different string hash algorithm, as the certificate version is too slow for me at runtime. Duplicates are unlikely; if one occurred, my switch statement would fail (this has never happened so far). My unique hash extension method is included in the code below.
I will take 29 ticks over 695 ticks any time, especially in critical code.
With a set of strings from a given database, you can create a small application that generates the constants in a file for you to use in your code. If values are added, you just re-run your batch, and the constants are regenerated and picked up by the solution.
using System;
using System.Text;
public static class StringExtention
{
public static long ToUniqueHash(this string text)
{
long value = 0;
var array = text.ToCharArray();
unchecked
{
for (int i = 0; i < array.Length; i++)
{
value = (value * 397) ^ array[i].GetHashCode();
value = (value * 397) ^ i;
}
return value;
}
}
}
public class AccountTypes
{
static void Main()
{
var sb = new StringBuilder();
sb.AppendLine($"const long ACCOUNT_TYPE = {"AccountType".ToUniqueHash()};");
sb.AppendLine($"const long NET_LIQUIDATION = {"NetLiquidation".ToUniqueHash()};");
sb.AppendLine($"const long TOTAL_CASH_VALUE = {"TotalCashValue".ToUniqueHash()};");
sb.AppendLine($"const long SETTLED_CASH = {"SettledCash".ToUniqueHash()};");
sb.AppendLine($"const long ACCRUED_CASH = {"AccruedCash".ToUniqueHash()};");
sb.AppendLine($"const long BUYING_POWER = {"BuyingPower".ToUniqueHash()};");
sb.AppendLine($"const long EQUITY_WITH_LOAN_VALUE = {"EquityWithLoanValue".ToUniqueHash()};");
sb.AppendLine($"const long PREVIOUS_EQUITY_WITH_LOAN_VALUE = {"PreviousEquityWithLoanValue".ToUniqueHash()};");
sb.AppendLine($"const long GROSS_POSITION_VALUE = {"GrossPositionValue".ToUniqueHash()};");
sb.AppendLine($"const long REQT_EQUITY = {"ReqTEquity".ToUniqueHash()};");
sb.AppendLine($"const long REQT_MARGIN = {"ReqTMargin".ToUniqueHash()};");
sb.AppendLine($"const long SPECIAL_MEMORANDUM_ACCOUNT = {"SMA".ToUniqueHash()};");
sb.AppendLine($"const long INIT_MARGIN_REQ = {"InitMarginReq".ToUniqueHash()};");
sb.AppendLine($"const long MAINT_MARGIN_REQ = {"MaintMarginReq".ToUniqueHash()};");
sb.AppendLine($"const long AVAILABLE_FUNDS = {"AvailableFunds".ToUniqueHash()};");
sb.AppendLine($"const long EXCESS_LIQUIDITY = {"ExcessLiquidity".ToUniqueHash()};");
sb.AppendLine($"const long CUSHION = {"Cushion".ToUniqueHash()};");
sb.AppendLine($"const long FULL_INIT_MARGIN_REQ = {"FullInitMarginReq".ToUniqueHash()};");
sb.AppendLine($"const long FULL_MAINTMARGIN_REQ = {"FullMaintMarginReq".ToUniqueHash()};");
sb.AppendLine($"const long FULL_AVAILABLE_FUNDS = {"FullAvailableFunds".ToUniqueHash()};");
sb.AppendLine($"const long FULL_EXCESS_LIQUIDITY = {"FullExcessLiquidity".ToUniqueHash()};");
sb.AppendLine($"const long LOOK_AHEAD_INIT_MARGIN_REQ = {"LookAheadInitMarginReq".ToUniqueHash()};");
sb.AppendLine($"const long LOOK_AHEAD_MAINT_MARGIN_REQ = {"LookAheadMaintMarginReq".ToUniqueHash()};");
sb.AppendLine($"const long LOOK_AHEAD_AVAILABLE_FUNDS = {"LookAheadAvailableFunds".ToUniqueHash()};");
sb.AppendLine($"const long LOOK_AHEAD_EXCESS_LIQUIDITY = {"LookAheadExcessLiquidity".ToUniqueHash()};");
sb.AppendLine($"const long HIGHEST_SEVERITY = {"HighestSeverity".ToUniqueHash()};");
sb.AppendLine($"const long DAY_TRADES_REMAINING = {"DayTradesRemaining".ToUniqueHash()};");
sb.AppendLine($"const long LEVERAGE = {"Leverage".ToUniqueHash()};");
Console.WriteLine(sb.ToString());
Test();
}
public static void Test()
{
//generated constant values
const long ACCOUNT_TYPE = -3012481629590703298;
const long NET_LIQUIDATION = 5886477638280951639;
const long TOTAL_CASH_VALUE = 2715174589598334721;
const long SETTLED_CASH = 9013818865418133625;
const long ACCRUED_CASH = -1095823472425902515;
const long BUYING_POWER = -4447052054809609098;
const long EQUITY_WITH_LOAN_VALUE = -4088154623329785565;
const long PREVIOUS_EQUITY_WITH_LOAN_VALUE = 6224054330592996694;
const long GROSS_POSITION_VALUE = -7316842993788269735;
const long REQT_EQUITY = -7457439202928979430;
const long REQT_MARGIN = -7525806483981945115;
const long SPECIAL_MEMORANDUM_ACCOUNT = -1696406879233404584;
const long INIT_MARGIN_REQ = 4495254338330797326;
const long MAINT_MARGIN_REQ = 3923858659879350034;
const long AVAILABLE_FUNDS = 2736927433442081110;
const long EXCESS_LIQUIDITY = 5975045739561521360;
const long CUSHION = 5079153439662500166;
const long FULL_INIT_MARGIN_REQ = -6446443340724968443;
const long FULL_MAINTMARGIN_REQ = -8084126626285123011;
const long FULL_AVAILABLE_FUNDS = 1594040062751632873;
const long FULL_EXCESS_LIQUIDITY = -2360941491690082189;
const long LOOK_AHEAD_INIT_MARGIN_REQ = 5230305572167766821;
const long LOOK_AHEAD_MAINT_MARGIN_REQ = 4895875570930256738;
const long LOOK_AHEAD_AVAILABLE_FUNDS = -7687608210548571554;
const long LOOK_AHEAD_EXCESS_LIQUIDITY = -4299898188451362207;
const long HIGHEST_SEVERITY = 5831097798646393988;
const long DAY_TRADES_REMAINING = 3899479916235857560;
const long LEVERAGE = 1018053116254258495;
bool found = false;
var sValues = new string[] {
"AccountType"
,"NetLiquidation"
,"TotalCashValue"
,"SettledCash"
,"AccruedCash"
,"BuyingPower"
,"EquityWithLoanValue"
,"PreviousEquityWithLoanValue"
,"GrossPositionValue"
,"ReqTEquity"
,"ReqTMargin"
,"SMA"
,"InitMarginReq"
,"MaintMarginReq"
,"AvailableFunds"
,"ExcessLiquidity"
,"Cushion"
,"FullInitMarginReq"
,"FullMaintMarginReq"
,"FullAvailableFunds"
,"FullExcessLiquidity"
,"LookAheadInitMarginReq"
,"LookAheadMaintMarginReq"
,"LookAheadAvailableFunds"
,"LookAheadExcessLiquidity"
,"HighestSeverity"
,"DayTradesRemaining"
,"Leverage"
};
long t1, t2;
var sw = System.Diagnostics.Stopwatch.StartNew();
foreach (var name in sValues)
{
switch (name)
{
case "AccountType": found = true; break;
case "NetLiquidation": found = true; break;
case "TotalCashValue": found = true; break;
case "SettledCash": found = true; break;
case "AccruedCash": found = true; break;
case "BuyingPower": found = true; break;
case "EquityWithLoanValue": found = true; break;
case "PreviousEquityWithLoanValue": found = true; break;
case "GrossPositionValue": found = true; break;
case "ReqTEquity": found = true; break;
case "ReqTMargin": found = true; break;
case "SMA": found = true; break;
case "InitMarginReq": found = true; break;
case "MaintMarginReq": found = true; break;
case "AvailableFunds": found = true; break;
case "ExcessLiquidity": found = true; break;
case "Cushion": found = true; break;
case "FullInitMarginReq": found = true; break;
case "FullMaintMarginReq": found = true; break;
case "FullAvailableFunds": found = true; break;
case "FullExcessLiquidity": found = true; break;
case "LookAheadInitMarginReq": found = true; break;
case "LookAheadMaintMarginReq": found = true; break;
case "LookAheadAvailableFunds": found = true; break;
case "LookAheadExcessLiquidity": found = true; break;
case "HighestSeverity": found = true; break;
case "DayTradesRemaining": found = true; break;
case "Leverage": found = true; break;
default: found = false; break;
}
if (!found)
throw new NotImplementedException();
}
t1 = sw.ElapsedTicks;
sw.Restart();
foreach (var name in sValues)
{
switch (name.ToUniqueHash())
{
case ACCOUNT_TYPE:
found = true;
break;
case NET_LIQUIDATION:
found = true;
break;
case TOTAL_CASH_VALUE:
found = true;
break;
case SETTLED_CASH:
found = true;
break;
case ACCRUED_CASH:
found = true;
break;
case BUYING_POWER:
found = true;
break;
case EQUITY_WITH_LOAN_VALUE:
found = true;
break;
case PREVIOUS_EQUITY_WITH_LOAN_VALUE:
found = true;
break;
case GROSS_POSITION_VALUE:
found = true;
break;
case REQT_EQUITY:
found = true;
break;
case REQT_MARGIN:
found = true;
break;
case SPECIAL_MEMORANDUM_ACCOUNT:
found = true;
break;
case INIT_MARGIN_REQ:
found = true;
break;
case MAINT_MARGIN_REQ:
found = true;
break;
case AVAILABLE_FUNDS:
found = true;
break;
case EXCESS_LIQUIDITY:
found = true;
break;
case CUSHION:
found = true;
break;
case FULL_INIT_MARGIN_REQ:
found = true;
break;
case FULL_MAINTMARGIN_REQ:
found = true;
break;
case FULL_AVAILABLE_FUNDS:
found = true;
break;
case FULL_EXCESS_LIQUIDITY:
found = true;
break;
case LOOK_AHEAD_INIT_MARGIN_REQ:
found = true;
break;
case LOOK_AHEAD_MAINT_MARGIN_REQ:
found = true;
break;
case LOOK_AHEAD_AVAILABLE_FUNDS:
found = true;
break;
case LOOK_AHEAD_EXCESS_LIQUIDITY:
found = true;
break;
case HIGHEST_SEVERITY:
found = true;
break;
case DAY_TRADES_REMAINING:
found = true;
break;
case LEVERAGE:
found = true;
break;
default:
found = false;
break;
}
if (!found)
throw new NotImplementedException();
}
t2 = sw.ElapsedTicks;
sw.Stop();
Console.WriteLine($"String switch:{t1:N0} long switch:{t2:N0}");
var faster = (t1 > t2) ? "slower" : "faster";
Console.WriteLine($"String switch: is {faster} than long switch: by {Math.Abs(t1-t2)} Ticks");
Console.ReadLine();
}
}
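The hash-switch approach above only works if no two known strings share a hash value. One way to guard against that is to check the known names once, for example at startup or inside the generator. This is a rough sketch: HashCollisionCheck and FindCollisions are hypothetical names, and the hash is a copy of ToUniqueHash so the block stands on its own.

```csharp
using System;
using System.Collections.Generic;

public static class HashCollisionCheck
{
    // Same algorithm as ToUniqueHash above, repeated so the sketch is self-contained.
    public static long ToUniqueHash(string text)
    {
        long value = 0;
        unchecked
        {
            for (int i = 0; i < text.Length; i++)
            {
                value = (value * 397) ^ text[i].GetHashCode();
                value = (value * 397) ^ i;
            }
        }
        return value;
    }

    // Returns pairs of distinct names that hash to the same value (empty when the set is safe).
    public static List<Tuple<string, string>> FindCollisions(IEnumerable<string> names)
    {
        var seen = new Dictionary<long, string>();
        var collisions = new List<Tuple<string, string>>();
        foreach (var name in names)
        {
            long hash = ToUniqueHash(name);
            string existing;
            if (seen.TryGetValue(hash, out existing))
            {
                // The same name appearing twice is not a collision.
                if (existing != name)
                    collisions.Add(Tuple.Create(existing, name));
            }
            else
            {
                seen.Add(hash, name);
            }
        }
        return collisions;
    }
}
```

If FindCollisions returns anything for your list of names, the generated constants cannot be used as case labels and a different hash is needed.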

Well, it depends on the language; you need to test it yourself to see which one is faster. In PHP, for example, if / else if is faster than switch, so you need to find out by running some basic benchmark code in your language of choice.
Personally, I prefer if / else if for readability. Switch statements can be a nightmare to read when each case contains a big block of code, because you have to hunt for the break keyword at the end of each one manually, whereas with if / else if the opening and closing braces make code blocks easy to trace.

String comparison will always rely completely on the runtime environment (unless the strings are statically allocated, though the need to compare those to each other is debatable). Type comparison, however, can be done through dynamic or static binding, and either way it's more efficient for the runtime environment than comparing individual characters in a string.

Surely the switch on String would compile down to a String comparison (one per case) which is slower than a type comparison (and far slower than the typical integer compare that is used for switch/case)?

Three thoughts:
1) If you're going to do something different based on the types of the objects, it might make sense to move that behavior into those classes. Then instead of switch or if-else, you'd just call childNode.DoSomething().
2) Comparing types will be much faster than string comparisons.
3) In the if-else design, you might be able to take advantage of reordering the tests. If "Jill" objects make up 90% of the objects going through there, test for them first.

One of the issues you have with the switch is that it uses strings like "Bob", which will cause a lot more cycles and lines in the compiled code. The IL that is generated will have to declare a string, set it to "Bob", then use it in the comparison. So with that in mind, your IF statements will run faster.
PS. Aeon's example won't work because you can't switch on Types. (No, I don't know exactly why, but we've tried it and it doesn't work. It has to do with the type being variable.)
If you want to test this, just build a separate application with two simple methods that do what is written above, and use something like Ildasm.exe to see the IL. You'll notice far fewer lines in the IL of the IF statement method.
Ildasm comes with Visual Studio...
ILDASM page - http://msdn.microsoft.com/en-us/library/f7dy01k1(VS.80).aspx
ILDASM Tutorial - http://msdn.microsoft.com/en-us/library/aa309387(VS.71).aspx

Remember, the profiler is your friend. Any guesswork is a waste of time most of the time.
BTW, I have had a good experience with JetBrains' dotTrace profiler.
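When no profiler is at hand, a rough first measurement can be taken with Stopwatch. The sketch below is only an illustration: SwitchTiming, Measure, ViaSwitch, and ViaIfElse are invented names, and micro-timings like this are noisy, so treat the numbers as a hint rather than proof.

```csharp
using System;
using System.Diagnostics;

public static class SwitchTiming
{
    // Hypothetical helper: runs the action repeatedly and returns the elapsed ticks.
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not part of the timing
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static int ViaSwitch(string name)
    {
        switch (name)
        {
            case "Bob": return 1;
            case "Jill": return 2;
            case "Marko": return 3;
            default: return 0;
        }
    }

    public static int ViaIfElse(string name)
    {
        if (name == "Bob") return 1;
        else if (name == "Jill") return 2;
        else if (name == "Marko") return 3;
        return 0;
    }
}
```

A comparison could then be made with SwitchTiming.Measure(() => SwitchTiming.ViaSwitch("Marko"), 1000000) against the same call using ViaIfElse, ideally repeated several times in a release build.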

Switch on string basically gets compiled into an if-else-if ladder. Try decompiling a simple one. In any case, testing string equality should be cheaper since they are interned and all that would be needed is a reference check. Do what makes sense in terms of maintainability: if you are comparing strings, do the string switch. If you are selecting based on type, a type ladder is more appropriate.
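One caveat on the interning point: string literals are pooled, but strings built at run time are separate objects unless they are interned explicitly, so a plain reference check is not always enough. A small sketch to see this for yourself (InterningDemo and its methods are hypothetical names):

```csharp
using System;

public static class InterningDemo
{
    // True when both variables point at the very same string object.
    public static bool SameReference(string a, string b)
    {
        return object.ReferenceEquals(a, b);
    }

    // Builds "Bob" at run time, so the result is a fresh object, not the pooled literal.
    public static string BuildAtRuntime()
    {
        return new string(new[] { 'B', 'o', 'b' });
    }
}
```

With string built = InterningDemo.BuildAtRuntime(), the comparison built == "Bob" is true because the characters match, while SameReference("Bob", built) is false. Passing both through string.Intern yields the same pooled instance, so the reference check succeeds again.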

I do it a bit differently.
The strings you're switching on are going to be constants, so you can predict their values at compile time.
In your case I'd use the hash values, which turns this into an integer switch. You have two options: use compile-time constants or calculate them at run time.
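//somewhere in your code
static long _bob = "Bob".GetUniquetHashCode();
static long _jill = "Jill".GetUniquetHashCode();
static long _marko = "Marko".GetUniquetHashCode();
void MyMethod()
{
...
//compute the hash once per node and cache it in Tag
if(childNode.Tag==0)
childNode.Tag = childNode.Name.GetUniquetHashCode();
//note: C# case labels must be compile-time constants, so for values
//calculated at run time (like the static fields above) use an
//if / else if chain, or paste the precomputed values in as const long
switch(childNode.Tag)
{
case _bob :
break;
case _jill :
break;
case _marko :
break;
}
}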
The extension method for GetUniquetHashCode can be something like this:
public static class StringExtentions
{
/// <summary>
/// Return unique Int64 value for input string
/// </summary>
/// <param name="strText"></param>
/// <returns></returns>
public static Int64 GetUniquetHashCode(this string strText)
{
Int64 hashCode = 0;
if (!string.IsNullOrEmpty(strText))
{
//Unicode Encode Covering all character-set
byte[] byteContents = Encoding.Unicode.GetBytes(strText);
System.Security.Cryptography.SHA256 hash = new System.Security.Cryptography.SHA256CryptoServiceProvider();
byte[] hashText = hash.ComputeHash(byteContents);
//Split the 32-byte hash into 8-byte parts
//hashCodeStart = bytes 0-7
//hashCodeMedium = bytes 8-15
//hashCodeEnd = bytes 24-31
//(bytes 16-23 are not used) and fold the parts together with XOR
Int64 hashCodeStart = BitConverter.ToInt64(hashText, 0);
Int64 hashCodeMedium = BitConverter.ToInt64(hashText, 8);
Int64 hashCodeEnd = BitConverter.ToInt64(hashText, 24);
hashCode = hashCodeStart ^ hashCodeMedium ^ hashCodeEnd;
}
return (hashCode);
}
}
The source of this code was published here
Please note that cryptographic hashing is slow, so you would typically warm up the supported strings at application start. I do this by saving them in static fields, as they will not change and are not instance-relevant. Also note that I set the Tag value of the node object; I could use any property or add one. Just make sure it stays in sync with the actual text.
I work on low-latency systems, and all my commands arrive as a string of the form command:value,command:value....
The commands are all known as 64-bit integer values, so switching like this saves some CPU time.
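To make the command:value idea concrete, here is a rough sketch of dispatching on hashes that are computed once and cached in static fields. Everything in it is an assumption for illustration: the command names buy and sell, the CommandDispatcher class, and the FNV-1a-style hash over UTF-16 code units, which is swapped in as a cheap stand-in for the SHA256-based method above. Because the hashes are calculated at run time they cannot be case labels, so the sketch compares them with if / else if. Any real saving depends on the incoming command's hash already being available or cached rather than recomputed per message.

```csharp
using System;
using System.Collections.Generic;

public static class CommandDispatcher
{
    // FNV-1a-style 64-bit hash; a stand-in for the SHA256-based hash, not the original method.
    public static ulong Fnv1a(string text)
    {
        ulong hash = 14695981039346656037UL; // FNV offset basis
        unchecked
        {
            foreach (char c in text)
            {
                hash ^= c;
                hash *= 1099511628211UL; // FNV prime
            }
        }
        return hash;
    }

    // Hypothetical command names, hashed once when the class is first used.
    private static readonly ulong Buy = Fnv1a("buy");
    private static readonly ulong Sell = Fnv1a("sell");

    // Parses "command:value,command:value" and describes each entry it recognises.
    public static List<string> Handle(string message)
    {
        var results = new List<string>();
        foreach (var pair in message.Split(','))
        {
            var parts = pair.Split(':');
            if (parts.Length != 2) continue; // skip malformed entries
            ulong key = Fnv1a(parts[0]);
            // Run-time hashes are not constants, so compare directly instead of using case labels.
            if (key == Buy) results.Add("buy " + parts[1]);
            else if (key == Sell) results.Add("sell " + parts[1]);
            else results.Add("unknown " + parts[0]);
        }
        return results;
    }
}
```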

I was just reading through the list of answers here and wanted to share this benchmark test, which compares the switch construct with the if-else and ternary ?: operators.
What I like about that post is that it compares not only single-level constructs (e.g., if-else) but also double- and triple-level constructs (e.g., if-else-if-else).
According to the results, the if-else construct was the fastest in 8/9 test cases; the switch construct tied for the fastest in 5/9 test cases.
So if you're looking for speed, if-else appears to be the fastest way to go.

I may be missing something, but couldn't you do a switch statement on the type instead of the String? That is,
switch(childNode.Type)
{
case Bob:
break;
case Jill:
break;
case Marko:
break;
}
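Newer compilers allow something close to this: from C# 7.0 on, a switch statement can match on the run-time type of an object using type patterns. A minimal sketch, with empty placeholder classes and a hypothetical TypeSwitchDemo helper:

```csharp
using System;

// Placeholder node types for the example.
public class Bob { }
public class Jill { }
public class Marko { }

public static class TypeSwitchDemo
{
    // Requires C# 7.0 or later: each case is a type pattern, not a constant.
    public static string Describe(object childNode)
    {
        switch (childNode)
        {
            case Bob b:
                return "Bob";
            case Jill j:
                return "Jill";
            case Marko m:
                return "Marko";
            case null:
                return "null";
            default:
                return "unknown";
        }
    }
}
```

The cases are tested in order, so this behaves like an if-else ladder of "is" checks rather than a jump table.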

Slimming down a switch statement - c#

Use Contains instead of switch.
var retVal = String.Empty;
string es = "éêèë";
if (es.Contains(valToCheck))
retVal = "e";
//etc.
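A slightly fuller sketch of the Contains idea, wrapped in a method. AccentFolder and Fold are hypothetical names. The length guard is there because Contains("") is always true, so an empty input would otherwise be mapped to "e".

```csharp
using System;

public static class AccentFolder
{
    // Maps a single accented character (passed as a one-character string) to its base letter.
    public static string Fold(string valToCheck)
    {
        // Guard: reject anything that is not exactly one character.
        if (string.IsNullOrEmpty(valToCheck) || valToCheck.Length != 1)
            return "-";

        if ("éêèë".Contains(valToCheck)) return "e";
        if ("àâäå".Contains(valToCheck)) return "a";
        return "-"; // same default as the original switch
    }
}
```

Each additional group of characters becomes one more line, which keeps StyleCop happy without a long run of case labels.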
