Which outperforms the other?(related to string) [closed] - c#

I don't know how the IndexOf() method of String works, so I would like to know which of the following two implementations performs better.
First, a little about the problem: the method takes a character as its only parameter and returns the character that corresponds to it. The mapping between the source characters and the destination characters is given below:
a <=> 9
b <=> 8
c <=> 7
d <=> 6
e <=> 5
f <=> 4
g <=> 3
h <=> 2
i <=> 1
j <=> 0
Please note that the rule above is only made up to be easy to follow. It is not fixed and could be any mapping, so please don't rely on it to implement the methods in some other way.
Here are the two methods I would like to compare:
1. The first one is very short and based on IndexOf():
string source = "abcdefghij";
string destination = "9876543210";
public char SourceToDest(char c){
return destination[source.IndexOf(c)];//Suppose the c is always in source.
}
2. The second one is longer and uses switch case:
public char SourceToDest(char c){
switch(c){
case 'a': return '9';
case 'b': return '8';
case 'c': return '7';
case 'd': return '6';
case 'e': return '5';
case 'f': return '4';
case 'g': return '3';
case 'h': return '2';
case 'i': return '1';
case 'j': return '0';
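default: throw new ArgumentException("c is not in the source set", "c"); // needed so that every code path returns or throws
}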
}
As I mentioned before, the rule is only an easy-to-follow example. If you overlook that, you might come up with another method like this:
public char SourceToDest(char c){
return (char)(154 - (int)c); //154 = 106 + 48
}
If you have another method that outperforms both of the methods I presented, please share it with me.

You can make the other method easier to follow, and still be fast:
public char SourceToDest(char c)
{
return (char)((int)'j' - (int)c + (int)'0');
}
Another option is:
const string destination = "9876543210";
public char SourceToDest(char c)
{
return destination[(int)c - (int)'a'];
}
Which will be faster than your other two methods.

You can use SortedDictionary<char, char> in your case. A lookup in SortedDictionary is O(log n). A search in a string with IndexOf should, I guess, be O(n); I don't think it has any special optimizations (at least MSDN does not mention any). So your example would be:
SortedDictionary<char, char> encoding = new SortedDictionary<char, char>()
{
{ 'a', '9' }, { 'b', '8' } /* ... */ , { 'j', '0' }
};
public char SourceToDest(char c){
return encoding[c];
}

In general, for large(r) string lengths N, the first will be O(N) by virtue of being a linear search, while the second will be O(1) as an indexed access.
For small(er) string lengths N the asymptotic performance is swamped by the constant factor, and you would have to measure hundreds of millions of accesses to get a meaningful comparison. But would you even care in these cases? Surely there are dozens of more productive performance cases to investigate in the application.
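Since the question says the mapping can be arbitrary, here is a minimal sketch of a precomputed lookup table that keeps the indexed-access idea without relying on the characters being consecutive. The class name CharMapper and the helper BuildTable are made up for illustration; the two strings are the ones from the question.

```csharp
using System;

public static class CharMapper
{
    private const string Source = "abcdefghij";
    private const string Destination = "9876543210";

    // Table indexed by character code; built once, so each lookup is a single array read.
    private static readonly char[] Table = BuildTable();

    private static char[] BuildTable()
    {
        // Size the table so that it covers the largest source character.
        int size = 0;
        foreach (char c in Source)
            size = Math.Max(size, c + 1);

        var table = new char[size];
        for (int i = 0; i < Source.Length; i++)
            table[Source[i]] = Destination[i];
        return table;
    }

    // Assumes, like the question, that c is always one of the source characters.
    public static char SourceToDest(char c)
    {
        return Table[c];
    }
}
```

The table costs a little memory (one char per code point up to the largest source character), which is the usual trade for an O(1) lookup. Whether it actually beats the switch for ten entries would have to be measured.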


why in this case, I can use for instead of switch-case [closed]

I have a property. Inside its setter, I first used a switch-case like this:
public int setDeviceName
{
set
{
device_id = value;
DeviceNoList[(page_no - 1) * display_data_num + selectedTagNo - 1] = value;
switch (selectedDeviceNo)
{
case 1:
lblLocation1.Text = lblDeviceName_Selected(device_id);
break;
case 2:
lblLocation2.Text = lblDeviceName_Selected(device_id);
break;
case 3:
lblLocation3.Text = lblDeviceName_Selected(device_id);
break;
case 4:
lblLocation4.Text = lblDeviceName_Selected(device_id);
break;
case 5:
lblLocation5.Text = lblDeviceName_Selected(device_id);
break;
case 6:
lblLocation6.Text = lblDeviceName_Selected(device_id);
break;
case 7:
lblLocation7.Text = lblDeviceName_Selected(device_id);
break;
case 8:
lblLocation8.Text = lblDeviceName_Selected(device_id);
break;
default:
break;
}
}
get
{
return device_id;
}
}
Afterwards, I tried using a for loop instead, as below:
public int setDeviceName
{
set
{
device_id = value;
DeviceNoList[(page_no - 1) * display_data_num + selectedTagNo - 1] = value;
Label [] lbl_Location = { lblLocation1, lblLocation2, lblLocation3, lblLocation4, lblLocation5, lblLocation6, lblLocation7, lblLocation8 };
//set deviceName to each label in form
for(int i = 0; i<display_data_num; i++)
{
lbl_Location[i].Text = lblDeviceName_Selected(device_id);
}
}
get
{
return device_id;
}
}
I have checked and my program still runs fine, but I don't know whether I can use a for loop instead of the switch-case like that. If not, why does my program still work? I am new to programming.
A for loop is better than switches and ifs when there are many different cases, all keyed by consecutive numbers. There are many reasons why, but the best is: which is easier to read and write, 100 lines of cases or a few lines of a for loop?
In this scenario, if you can change it to this form, it will be much better. We cannot move it straight over the way you wanted, though, because indexing ([i]) doesn't work how you seem to think it does. Indexing grabs the value at the specified index of a collection such as an array or a list.
If we change our lbl_Location1, ... to an array:
string[] lbl_Location = new string[]{ "a value 1", "a value 2", "and more" };
Obviously, change the strings to the values and type of your choice. You can then do what you want and iterate through the indexes, which is better. Note that indexes start at 0! The first value is at index 0, the second at index 1, and so on.
Implementation (notice that I changed the condition in your for statement to the length of the array, which is a cleaner and simpler way to do it; unless your condition was meant to be something other than the length of the array, use this instead):
for(int i = 0; i<lbl_Location.Length; i++)
{
lbl_Location[i].Text = lblDeviceName_Selected(device_id);
}
If you don't understand something or I misunderstood please comment!
More information on indexing:
So let's say I have a variable called MyString1. What you were trying to do with MyString[1] (I have replaced i with a 1 to further simplify this) is "look for a (for simplicity's sake) array called MyString and look at the second (remember, indexes start at zero!) position." What you wanted it to do was just add a 1 at the end. When we make it an array, we are essentially making one variable that is a container for many values.
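One thing worth spelling out about the two versions in the question: the switch changes only the label chosen by selectedDeviceNo, whereas the for loop assigns the text to every label it visits. If the intent is to keep the switch's behavior, a single indexed access is enough. The sketch below shows both side by side. Label is a stand-in class so the code compiles without WinForms, and LabelUpdater, SetSelected and SetAll are hypothetical names.

```csharp
using System;

// Stand-in for System.Windows.Forms.Label so the sketch compiles without WinForms.
public class Label
{
    public string Text { get; set; }
}

public static class LabelUpdater
{
    // Equivalent of the switch: only the label picked by selectedDeviceNo (1-based) changes.
    public static void SetSelected(Label[] labels, int selectedDeviceNo, string name)
    {
        int index = selectedDeviceNo - 1; // array indexes start at 0
        if (index >= 0 && index < labels.Length)
            labels[index].Text = name;
        // Out-of-range numbers are ignored, like the switch's empty default case.
    }

    // Equivalent of the for loop: every label in the array is overwritten.
    public static void SetAll(Label[] labels, string name)
    {
        for (int i = 0; i < labels.Length; i++)
            labels[i].Text = name;
    }
}
```

So the program "still works" with the loop in the sense that it compiles and runs, but it only does the same thing as the switch if overwriting all the labels is what you want.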

Designing For Configurable Return Values in C# [closed]

I was given a spreadsheet with possible return codes and their description from a third party web service. They look like this (simplified):
Code Description
M1 Some description of M1
M2 Some description of M2
M3 Some description of M3
M4 Some description of M4
P1 Some description of P1
P2 Some description of P2
N1 Some description of N1
N2 Some description of N2
In the list above, M codes are classified as Match, P codes are Partial Match and N codes are No Match.
In the C# function, these return values are handled by a switch case like this:
...
switch(returncode)
{
case "M1":
case "M2":
case "M3":
case "M4":
DoSomethingForMatch(ReturnCodeDescription);
break;
case "P1":
case "P2":
case "P3":
case "P4":
DoSomethingForPartialMatch(ReturnCodeDescription);
break;
case "N1":
case "N2":
default:
DoSomethingForNoMatch(ReturnCodeDescription);
break;
}
Though the returncodes look similar, there is no naming convention. There may be other return codes in future that may have a different format. But they will still fall under one of the three categories: match, partial match and no match.
In case there are new return codes in the future, with this design, I have to update the code and rebuild, redeploy etc.
There's got to be a better way to do this than to hard code the return values in the code like this. I would like to ask your advice on how to do this in a configurable, scalable manner. Is saving all the possible codes and description in a DB table the best way to accomplish this?
Thank you.
Why not just check the first character?
switch(returncode[0])
{
case 'M':
DoSomethingForMatch(ReturnCodeDescription);
break;
case 'P':
DoSomethingForPartialMatch(ReturnCodeDescription);
break;
case 'N':
default:
DoSomethingForNoMatch(ReturnCodeDescription);
break;
}
If the return codes are somewhat predictable for future use, you could use regular expressions.
The pseudocode would look like this:
Regex fullMatch = new Regex("some_reg_ex_here");
Regex partialMatch = new Regex("some_reg_ex_here");
if (fullMatch.IsMatch(returnCode))
    DoSomethingForMatch(ReturnCodeDescription);
else if (partialMatch.IsMatch(returnCode))
    DoSomethingForPartialMatch(ReturnCodeDescription);
else
    DoSomethingForNoMatch(ReturnCodeDescription);
I would be tempted to convert to an enum.
Code.cs
enum Code
{
Unknown = 0,
Match = 'M',
PartialMatch = 'P',
NoMatch = 'N'
}
static class CodeExtensions
{
public static Code ToCode(this string value)
{
if (String.IsNullOrEmpty(value))
return Code.Unknown;
value = value.Trim(); // trim only after the null check, otherwise a null input throws
if (value.Length != 2)
return Code.Unknown;
return value[0].ToCode();
}
public static Code ToCode(this char value)
{
int numericValue = value;
if (!Enum.IsDefined(typeof(Code), numericValue))
return Code.Unknown;
return (Code)numericValue;
}
}
Usage
var code = returnCode.ToCode();
switch (code)
{
case Code.Match:
DoSomethingForMatch(ReturnCodeDescription);
break;
case Code.PartialMatch:
DoSomethingForPartialMatch(ReturnCodeDescription);
break;
default:
DoSomethingForNoMatch(ReturnCodeDescription);
break;
}
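None of the answers above really covers the "configurable" part of the question, so here is a rough sketch of one way it could look: the code-to-category mapping is loaded into a dictionary at startup, and only the three categories stay in code. The "CODE=Category" line format, the MatchCategory enum and the ReturnCodeMap class are all assumptions made for this example; the lines could equally come from a config file or a DB table.

```csharp
using System;
using System.Collections.Generic;

public enum MatchCategory { NoMatch, PartialMatch, Match }

public static class ReturnCodeMap
{
    // Parses lines such as "M1=Match". The source of the lines (file, DB, ...) is up to the caller.
    public static Dictionary<string, MatchCategory> Load(IEnumerable<string> lines)
    {
        var map = new Dictionary<string, MatchCategory>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            string[] parts = line.Split('=');
            if (parts.Length != 2)
                continue; // skip malformed lines

            MatchCategory category;
            if (Enum.TryParse(parts[1].Trim(), true, out category)
                && Enum.IsDefined(typeof(MatchCategory), category))
            {
                map[parts[0].Trim()] = category;
            }
        }
        return map;
    }

    // Unknown codes fall back to NoMatch, mirroring the default branch of the switch.
    public static MatchCategory Classify(Dictionary<string, MatchCategory> map, string returnCode)
    {
        MatchCategory category;
        if (returnCode != null && map.TryGetValue(returnCode, out category))
            return category;
        return MatchCategory.NoMatch;
    }
}
```

The calling code can then switch on the MatchCategory value, so a new return code only needs a new line in the configuration and no rebuild.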

What are the restrictions of the pattern matching mechanics?

Personally, I only know that dynamic cannot be used in pattern matching, which is a pity :(
dynamic foo = 10;
switch(foo) {
case int i:
break;
}
Also, value tuples (the new tuples) cannot be used in pattern matching:
dynamic foo = (420, 360);
switch(foo) {
case (int, int) i:
break;
}
This was removed from the current version of C# 7 and reserved for future use.
What other things can I not do?
The new pattern matching features in C# 7 consist of the following:
Support for type switching,
Simple use of var patterns,
The addition of when guards to case statements,
the x is T y pattern expression.
Your examples focus on the first of these. And type switching is likely to be the most popular and commonly used of these new features. Whilst there are limitations, such as those that you mention, other features can be used to work around many of them.
For example, your first limitation is easily solved by boxing foo to object:
dynamic foo = 10;
switch ((object)foo)
{
case int i:
Console.WriteLine("int");
break;
default:
Console.WriteLine("other");
break;
}
Will print int as expected.
The var pattern and a guard can be used to work around your second restriction:
dynamic foo = (420, 360);
switch (foo)
{
case var ii when ii.GetType() == typeof((int, int)):
Console.WriteLine("(int,int)");
break;
default:
Console.WriteLine("other");
break;
}
will print (int,int).
Additionally, value tuples can be used for type switching, you just have to use the long-hand syntax:
var foo = (420, 360);
switch (foo)
{
case ValueTuple<int,int> x:
Console.WriteLine($"({x.Item1},{x.Item2})");
break;
default:
Console.WriteLine("other");
break;
}
The above will print (420,360).
For me, personally, the biggest restriction to pattern matching in C# 7 is the lack of pattern matching expressions using the match keyword. Originally, the following was planned for this release, but was pulled due to time constraints:
var x = 1;
var y = x match (
case int _ : "int",
case * : "other"
);
This can be worked around using switch, but it's messy:
var x = 1;
var y = IntOrOther(x);
...
private string IntOrOther(int i)
{
switch (i)
{
case int _ : return "int";
default: return "other";
}
}
But help is at hand here with numerous 3rd party pattern matching libraries, such as my own Succinc<T> library, which lets you write it as:
var x = 1;
var y = x.TypeMatch().To<string>()
.Caseof<int>().Do("int")
.Else("other")
.Result();
It's not as good as having the match keyword, but it's an optional workaround until that feature appears in a later language release.
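For readers on a newer compiler: C# 8.0 added switch expressions, which cover much of the ground the planned match expression was aiming at. A minimal sketch, assuming C# 8.0 or later; the class and method names are made up, and the parameter is typed as object so that both arms are reachable:

```csharp
using System;

public static class TypeNames
{
    // Requires C# 8.0 or later. With an int parameter the second arm would be
    // unreachable and the compiler would reject it, hence object.
    public static string IntOrOther(object value) => value switch
    {
        int _ => "int",
        _ => "other"
    };
}
```

Called as TypeNames.IntOrOther(1), this yields "int"; any other value, including null, yields "other".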
To really understand the restrictions imposed by C# 7, it's worth referring to the pattern matching spec on GitHub and comparing that with what will be in the next release of C#. Looking at it though, it becomes apparent that there are work-arounds to all of these.
This question was originally closed because it's open-ended as currently phrased. To give a couple of silly examples, restrictions to C# 7's pattern matching are that it won't make you a perfect cup of coffee, or fly you across the world in seconds ... but I prefer to answer the spirit of the question. And the answer really is that the only restriction is your imagination. If you don't let that restrict you, then one must take into account the fact that the work-arounds have readability and/or performance implications. They are likely the only real-world restrictions.

Multiple variables in switch statement in c

How can I write the following statement in C using a switch statement?
int i = 10;
int j = 20;
if (i == 10 && j == 20)
{
Mymethod();
}
else if (i == 100 && j == 200)
{
Yourmethod();
}
else if (i == 1000 || j == 2000) // OR
{
Anymethod();
}
EDIT:
I changed the last case from 'and' to 'or' later, so I apologise to the people who answered my question before this edit.
This scenario is just an example; I only wanted to know whether it is possible or not. I have googled this and found that it is not possible, but I trust the gurus on Stack Overflow more.
Thanks
You're pressing for answers that will unnaturally force this code into a switch - that's not the right approach in C, C++ or C# for the problem you've described. Live with the if statements, as using a switch in this instance leads to less readable code and the possibility that a slip-up will introduce a bug.
There are languages that will evaluate a switch statement syntax similar to a sequence of if statements, but C, C++, and C# aren't among them.
After Jon Skeet's comment that it can be "interesting to try to make it work", I'm going to go against my initial judgment and play along because it's certainly true that one can learn by trying alternatives to see where they work and where they don't work. Hopefully I won't end up muddling things more than I should...
The targets for a switch statement in the languages under consideration need to be constants - they aren't expressions that are evaluated at runtime. However, you can potentially get a behavior similar to what you're looking for if you can map the conditions that you want to have as switch targets to a hash function that will produce a perfect hash that matches up to the conditions. If that can be done, you can call the hash function and switch on the value it produces.
The C# compiler does something similar to this automatically for you when you want to switch on a string value. In C, I've manually done something similar when I want to switch on a string. I place the target strings in a table along with enumerations that are used to identify the strings, and I switch on the enum:
char* cmdString = "copystuff"; // a string with a command identifier,
// maybe obtained from console input
StrLookupValueStruct CmdStringTable[] = {
{ "liststuff", CMD_LIST },
{ "docalcs", CMD_CALC },
{ "copystuff", CMD_COPY },
{ "delete", CMD_DELETE },
{ NULL, CMD_UNKNOWN },
};
int cmdId = strLookupValue( cmdString, CmdStringTable); // transform the string
// into an enum
switch (cmdId) {
case CMD_LIST:
doList();
break;
case CMD_CALC:
doCalc();
break;
case CMD_COPY:
doCopy();
break;
// etc...
}
Instead of having to use a sequence of if statements:
if (strcmp( cmdString, "liststuff") == 0) {
doList();
}
else if (strcmp( cmdString, "docalcs") == 0) {
doCalc();
}
else if (strcmp( cmdString, "copystuff") == 0) {
doCopy();
}
// etc....
As an aside, for the string to function mapping here I personally find the table lookup/switch statement combination to be a bit more readable, but I imagine there are people who might prefer the more direct approach of the if sequence.
The set of expressions you have in your question don't look particularly simple to transform into a hash - your hash function would almost certainly end up being a sequence of if statements - you would have basically just moved the construct somewhere else. Jon Skeet's original answer was essentially to turn your expressions into a hash, but when the or operation got thrown into the mix of one of the tests, the hash function broke down.
In general you can't. What you are doing already is fine, although you might want to add an else clause at the end to catch unexpected inputs.
In your specific example it seems that j is often twice the value of i. If that is a general rule you could try to take advantage of that by doing something like this instead:
if (i * 2 == j) /* Watch out for overflow here if i could be large! */
{
switch (i)
{
case 10:
// ...
break;
case 100:
// ...
break;
// ...
}
}
(Removed original answer: I'd missed the fact that the condition was an "OR" rather than an "AND". EDIT: Ah, because apparently it wasn't to start with.)
You could still theoretically use something like my original code (combining two 32-bit integers into one 64-bit integer and switching on that), although there would be 2^33 case statements covering the last condition. I doubt that any compiler would actually make it through such code :)
But basically, no: use the if/else structure instead.
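For the curious, here is roughly what the "combine two 32-bit integers into one 64-bit integer" idea could look like in C#, purely as an illustration of why it stops being attractive. The two AND conditions become case labels; the OR condition cannot be enumerated sensibly, so it stays an if in the default branch. PairSwitch, Pack and Dispatch are hypothetical names, and the method returns the name of the method the question's if/else chain would have called.

```csharp
using System;

public static class PairSwitch
{
    // Packs two 32-bit values into one 64-bit key: i in the high half, j in the low half.
    private static long Pack(int i, int j)
    {
        return ((long)i << 32) | (uint)j;
    }

    public static string Dispatch(int i, int j)
    {
        switch (Pack(i, j))
        {
            case (10L << 32) | 20L:      // i == 10 && j == 20
                return "Mymethod";
            case (100L << 32) | 200L:    // i == 100 && j == 200
                return "Yourmethod";
            default:
                // The OR condition cannot be listed as a handful of case labels,
                // so it remains an ordinary if.
                if (i == 1000 || j == 2000)
                    return "Anymethod";
                return "none";
        }
    }
}
```

Compared with the original if/else chain this is longer and harder to read, which is the point the answers above are making.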

How would you make this switch statement as fast as possible?

2009-12-04 UPDATE: For profiling results on a number of the suggestions posted here, see below!
The Question
Consider the following very harmless, very straightforward method, which uses a switch statement to return a defined enum value:
public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
if (ActivCode == null) return MarketDataExchange.NONE;
switch (ActivCode) {
case "": return MarketDataExchange.NBBO;
case "A": return MarketDataExchange.AMEX;
case "B": return MarketDataExchange.BSE;
case "BT": return MarketDataExchange.BATS;
case "C": return MarketDataExchange.NSE;
case "MW": return MarketDataExchange.CHX;
case "N": return MarketDataExchange.NYSE;
case "PA": return MarketDataExchange.ARCA;
case "Q": return MarketDataExchange.NASDAQ;
case "QD": return MarketDataExchange.NASDAQ_ADF;
case "W": return MarketDataExchange.CBOE;
case "X": return MarketDataExchange.PHLX;
case "Y": return MarketDataExchange.DIRECTEDGE;
}
return MarketDataExchange.NONE;
}
My colleague and I batted around a few ideas today about how to actually make this method faster, and we came up with some interesting modifications that did in fact improve its performance rather significantly (proportionally speaking, of course). I'd be interested to know what sorts of optimizations anyone else out there can think up that might not have occurred to us.
Right off the bat, let me just offer a quick disclaimer: this is for fun, and not to fuel the whole "to optimize or not to optimize" debate. That said, if you count yourself among those who dogmatically believe "premature optimization is the root of all evil," just be aware that I work for a high-frequency trading firm, where everything needs to run absolutely as fast as possible--bottleneck or not. So, even though I'm posting this on SO for fun, it isn't just a huge waste of time, either.
One more quick note: I'm interested in two kinds of answers--those that assume every input will be a valid ActivCode (one of the strings in the switch statement above), and those that do not. I am almost certain that making the first assumption allows for further speed improvements; anyway, it did for us. But I know that improvements are possible either way.
The Results
Well, it turns out that the fastest solution so far (that I've tested) came from João Angelo, whose suggestion was actually very simple, yet extremely clever. The solution that my coworker and I had devised (after trying out several approaches, many of which were thought up here as well) came in second place; I was going to post it, but it turns out that Mark Ransom came up with the exact same idea, so just check out his answer!
Since I ran these tests, some other users have posted even newer ideas... I will test them in due time, when I have a few more minutes to spare.
I ran these tests on two different machines: my personal computer at home (a dual-core Athlon with 4 GB RAM running Windows 7 64-bit) and my development machine at work (a dual-core Athlon with 2 GB RAM running Windows XP SP3). Obviously, the times were different; however, the relative times, meaning how each method compared to every other method, were the same. That is to say, the fastest was the fastest on both machines, etc.
Now for the results. (The times I'm posting below are from my home computer.)
But first, for reference--the original switch statement:
1000000 runs: 98.88 ms
Average: 0.09888 microsecond
Fastest optimizations so far:
João Angelo's idea of assigning values to the enums based on the hash codes of the ActivCode strings and then directly casting ActivCode.GetHashCode() to MarketDataExchange:
1000000 runs: 23.64 ms
Average: 0.02364 microsecond
Speed increase: 329.90%
My colleague's and my idea of casting ActivCode[0] to an int and retrieving the appropriate MarketDataExchange from an array initialized on startup (this exact same idea was suggested by Mark Ransom):
1000000 runs: 28.76 ms
Average: 0.02876 microsecond
Speed increase: 253.13%
tster's idea of switching on the output of ActivCode.GetHashCode() instead of ActivCode:
1000000 runs: 34.69 ms
Average: 0.03469 microsecond
Speed increase: 185.04%
The idea, suggested by several users including Auraseer, tster, and kyoryu, of switching on ActivCode[0] instead of ActivCode:
1000000 runs: 36.57 ms
Average: 0.03657 microsecond
Speed increase: 174.66%
Loadmaster's idea of using the fast hash, ActivCode[0] + ActivCode[1]*0x100:
1000000 runs: 39.53 ms
Average: 0.03953 microsecond
Speed increase: 153.53%
Using a hashtable (Dictionary<string, MarketDataExchange>), as suggested by many:
1000000 runs: 88.32 ms
Average: 0.08832 microsecond
Speed increase: 12.36%
Using a binary search:
1000000 runs: 1031 ms
Average: 1.031 microseconds
Speed increase: none (performance worsened)
Let me just say that it has been really cool to see how many different ideas people had on this simple problem. This was very interesting to me, and I'm quite thankful to everyone who has contributed and made a suggestion so far.
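For anyone who wants to run a comparison of their own, timings of this kind can be taken with a simple Stopwatch loop. The sketch below is not the harness used for the numbers above (that code was not posted); MicroBenchmark, Measure and SpeedIncreasePercent are made-up names, and the percentage formula is only an assumption about how a "speed increase" might be computed.

```csharp
using System;
using System.Diagnostics;

public static class MicroBenchmark
{
    // Runs the action 'runs' times and returns the total elapsed time in milliseconds.
    public static double Measure(Action action, int runs)
    {
        action(); // warm-up call so that JIT compilation is not part of the timing
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
            action();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    // Assumed definition: how much faster the candidate is, relative to its own time.
    public static double SpeedIncreasePercent(double baselineMs, double candidateMs)
    {
        return (baselineMs / candidateMs - 1.0) * 100.0;
    }
}
```

A call such as MicroBenchmark.Measure(() => GetMarketDataExchange("QD"), 1000000) would time one million lookups of a single code. A fairer test would cycle through all the codes and run in a release build.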
Assuming that every input will be a valid ActivCode, that you can change the enumeration values, and that you accept being tightly coupled to the GetHashCode implementation:
enum MarketDataExchange
{
NONE,
NBBO = 371857150,
AMEX = 372029405,
BSE = 372029408,
BATS = -1850320644,
NSE = 372029407,
CHX = -284236702,
NYSE = 372029412,
ARCA = -734575383,
NASDAQ = 372029421,
NASDAQ_ADF = -1137859911,
CBOE = 372029419,
PHLX = 372029430,
DIRECTEDGE = 372029429
}
public static MarketDataExchange GetMarketDataExchange(string ActivCode)
{
if (ActivCode == null) return MarketDataExchange.NONE;
return (MarketDataExchange)ActivCode.GetHashCode();
}
I'd roll my own fast hash function and use an integer switch statement to avoid string comparisons:
int h = 0;
// Compute fast hash: A[0] + A[1]*0x100
if (ActivCode.Length > 0)
h += (int) ActivCode[0];
if (ActivCode.Length > 1)
h += (int) ActivCode[1] << 8;
// Find a match
switch (h)
{
case 0x0000: return MarketDataExchange.NBBO; // ""
case 0x0041: return MarketDataExchange.AMEX; // "A"
case 0x0042: return MarketDataExchange.BSE; // "B"
case 0x5442: return MarketDataExchange.BATS; // "BT"
case 0x0043: return MarketDataExchange.NSE; // "C"
case 0x574D: return MarketDataExchange.CHX; // "MW"
case 0x004E: return MarketDataExchange.NYSE; // "N"
case 0x4150: return MarketDataExchange.ARCA; // "PA"
case 0x0051: return MarketDataExchange.NASDAQ; // "Q"
case 0x4451: return MarketDataExchange.NASDAQ_ADF; // "QD"
case 0x0057: return MarketDataExchange.CBOE; // "W"
case 0x0058: return MarketDataExchange.PHLX; // "X"
case 0x0059: return MarketDataExchange.DIRECTEDGE; // "Y"
default: return MarketDataExchange.NONE;
}
My tests show that this is about 4.5 times faster than the original code.
If C# had a preprocessor, I'd use a macro to form the case constants.
This technique is faster than using a hash table and certainly faster than using string comparisons. It works for up to four-character strings with 32-bit ints, and up to 8 characters using 64-bit longs.
If you know how often the various codes show up, the more common ones should go at the top of the list, so fewer comparisons are done. But let's assume you don't have that.
Assuming the ActivCode is always valid will of course speed things up. You don't need to test for null or the empty string, and you can leave off one test from the end of the switch. That is, test for everything except Y, and then return DIRECTEDGE if no match is found.
Instead of switching on the whole string, switch on its first letter. For the codes that have more letters, put a second test inside the switch case. Something like this:
switch(ActivCode[0])
{
//etc.
case 'B':
if ( ActivCode.Length == 1 ) return MarketDataExchange.BSE;
else return MarketDataExchange.BATS;
// etc.
}
It would be better if you could go back and change the codes so they are all single characters, because you would then never need more than one test. Better yet would be using the numerical value of the enum, so you can simply cast instead of having to switch/translate in the first place.
I'd use a dictionary for the key value pairs and take advantage of the O(1) lookup time.
Do you have any statistics on which strings are more common? So that those could be checked first?
With valid input you could use:
if (ActivCode.Length == 0)
return MarketDataExchange.NBBO;
if (ActivCode.Length == 1)
return (MarketDataExchange) (ActivCode[0]);
return (MarketDataExchange) (ActivCode[0] | ActivCode[1] << 8);
Change the switch to switch on the GetHashCode() of the strings.
I'd extrapolate tster's reply to "switch over a custom hash function", assuming that the code generator creates a lookup table, or - failing that - building the lookup table myself.
The custom hash function should be simple, e.g.:
(int)ActivCode[0]*2 + ActivCode.Length-1
This would require a table of 51 elements, easily kept in L1 cache, under the following assumptions:
Input data must already be validated
empty string must be handled separately
no two-character-codes start with the same character
adding new cases is hard
the empty string case could be incorporated if you could use an unsafe access to ActivCode[0] yielding the '\0' terminator.
Forgive me if I get something wrong here, I'm extrapolating from my knowledge of C++. For example, if you take ActivCode[0] of an empty string, in C++ you get a character whose value is zero.
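To make the custom-hash idea concrete, here is a sketch of a hand-built table in C#. It uses the hash from above, shifted so that "A" lands on index 0, and handles the empty string separately, as the assumptions require. The enum ordering, the class name HashedLookup and the helper names are invented for the example, and the table is sized at 52 slots (26 letters times 2 lengths) to keep the arithmetic obvious.

```csharp
using System;

public enum MarketDataExchange
{
    NONE, NBBO, AMEX, BSE, BATS, NSE, CHX, NYSE, ARCA,
    NASDAQ, NASDAQ_ADF, CBOE, PHLX, DIRECTEDGE
}

public static class HashedLookup
{
    // Smallest possible hash, produced by the one-character code "A".
    private const int Offset = 'A' * 2;

    private static readonly MarketDataExchange[] Table = BuildTable();

    // First character times two, plus length minus one.
    private static int Hash(string code)
    {
        return code[0] * 2 + code.Length - 1;
    }

    private static void Add(MarketDataExchange[] table, string code, MarketDataExchange value)
    {
        table[Hash(code) - Offset] = value;
    }

    private static MarketDataExchange[] BuildTable()
    {
        var table = new MarketDataExchange[26 * 2]; // unused slots stay NONE (value 0)
        Add(table, "A", MarketDataExchange.AMEX);
        Add(table, "B", MarketDataExchange.BSE);
        Add(table, "BT", MarketDataExchange.BATS);
        Add(table, "C", MarketDataExchange.NSE);
        Add(table, "MW", MarketDataExchange.CHX);
        Add(table, "N", MarketDataExchange.NYSE);
        Add(table, "PA", MarketDataExchange.ARCA);
        Add(table, "Q", MarketDataExchange.NASDAQ);
        Add(table, "QD", MarketDataExchange.NASDAQ_ADF);
        Add(table, "W", MarketDataExchange.CBOE);
        Add(table, "X", MarketDataExchange.PHLX);
        Add(table, "Y", MarketDataExchange.DIRECTEDGE);
        return table;
    }

    // Assumes validated input: the empty string or a code starting with 'A'..'Z'.
    public static MarketDataExchange Get(string activCode)
    {
        if (activCode.Length == 0)
            return MarketDataExchange.NBBO; // handled separately
        return Table[Hash(activCode) - Offset];
    }
}
```

Note that in C# indexing an empty string throws instead of yielding a terminator, so the separate check for the empty string cannot simply be dropped.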
Create a two dimensional array which you initialize once; the first dimension is the length of the code, the second is a character value. Populate with the enumeration value you'd like to return. Now your entire function becomes:
public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
return LookupTable[ActivCode.Length][ActivCode[0]];
}
Lucky for you all the two-character codes are unique in the first letter compared to the other two-character codes.
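The table initialization is not shown above, so here is one possible way to set it up, as a sketch only. It uses a jagged array indexed by code length and then by first character. The empty string is mapped to character index 0, which is a deviation from the one-line function above, because reading ActivCode[0] from "" throws in C#. The enum ordering and all names other than LookupTable are assumptions.

```csharp
using System;

public enum MarketDataExchange
{
    NONE, NBBO, AMEX, BSE, BATS, NSE, CHX, NYSE, ARCA,
    NASDAQ, NASDAQ_ADF, CBOE, PHLX, DIRECTEDGE
}

public static class LengthIndexedLookup
{
    // First index: code length (0, 1 or 2). Second index: first character of the code.
    private static readonly MarketDataExchange[][] LookupTable = BuildTable();

    private static MarketDataExchange[][] BuildTable()
    {
        var table = new MarketDataExchange[3][];
        for (int len = 0; len < table.Length; len++)
            table[len] = new MarketDataExchange['Z' + 1]; // all slots start as NONE

        table[0][0] = MarketDataExchange.NBBO;            // the empty code
        table[1]['A'] = MarketDataExchange.AMEX;
        table[1]['B'] = MarketDataExchange.BSE;
        table[2]['B'] = MarketDataExchange.BATS;          // "BT"
        table[1]['C'] = MarketDataExchange.NSE;
        table[2]['M'] = MarketDataExchange.CHX;           // "MW"
        table[1]['N'] = MarketDataExchange.NYSE;
        table[2]['P'] = MarketDataExchange.ARCA;          // "PA"
        table[1]['Q'] = MarketDataExchange.NASDAQ;
        table[2]['Q'] = MarketDataExchange.NASDAQ_ADF;    // "QD"
        table[1]['W'] = MarketDataExchange.CBOE;
        table[1]['X'] = MarketDataExchange.PHLX;
        table[1]['Y'] = MarketDataExchange.DIRECTEDGE;
        return table;
    }

    // Assumes validated input: at most two characters, first character no greater than 'Z'.
    public static MarketDataExchange Get(string activCode)
    {
        char first = activCode.Length == 0 ? '\0' : activCode[0];
        return LookupTable[activCode.Length][first];
    }
}
```

Three arrays of 91 entries each is still tiny, so the memory cost of skipping the switch entirely is negligible here.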
I would put it in a dictionary instead of using a switch statement. That being said, it may not make a difference. Or it might. See C# switch statement limitations - why?.
Avoid all string comparisons.
Avoid looking at more than a single character (ever)
Avoid if-else, since I want the compiler to be able to optimize as well as it can
Try to get the result in a single switch jump
code:
public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
if (ActivCode == null) return MarketDataExchange.NONE;
int length = ActivCode.Length;
if (length == 0) return MarketDataExchange.NBBO;
switch (ActivCode[0]) {
case 'A': return MarketDataExchange.AMEX;
case 'B': return (length == 2) ? MarketDataExchange.BATS : MarketDataExchange.BSE;
case 'C': return MarketDataExchange.NSE;
case 'M': return MarketDataExchange.CHX;
case 'N': return MarketDataExchange.NYSE;
case 'P': return MarketDataExchange.ARCA;
case 'Q': return (length == 2) ? MarketDataExchange.NASDAQ_ADF : MarketDataExchange.NASDAQ;
case 'W': return MarketDataExchange.CBOE;
case 'X': return MarketDataExchange.PHLX;
case 'Y': return MarketDataExchange.DIRECTEDGE;
default: return MarketDataExchange.NONE;
}
}
Trade memory for speed by pre-populating an index table to leverage simple pointer arithmetic.
public class Service
{
public static MarketDataExchange GetMarketDataExchange(string ActivCode)
{
int x = 65, y = 65;
switch(ActivCode.Length)
{
case 1:
x = ActivCode[0];
break;
case 2:
x = ActivCode[0];
y = ActivCode[1];
break;
}
return _table[x, y];
}
static Service()
{
InitTable();
}
public static MarketDataExchange[,] _table =
new MarketDataExchange['Z','Z'];
public static void InitTable()
{
for (int x = 0; x < 'Z'; x++)
for (int y = 0; y < 'Z'; y++)
_table[x, y] = MarketDataExchange.NONE;
SetCell("", MarketDataExchange.NBBO);
SetCell("A", MarketDataExchange.AMEX);
SetCell("B", MarketDataExchange.BSE);
SetCell("BT", MarketDataExchange.BATS);
SetCell("C", MarketDataExchange.NSE);
SetCell("MW", MarketDataExchange.CHX);
SetCell("N", MarketDataExchange.NYSE);
SetCell("PA", MarketDataExchange.ARCA);
SetCell("Q", MarketDataExchange.NASDAQ);
SetCell("QD", MarketDataExchange.NASDAQ_ADF);
SetCell("W", MarketDataExchange.CBOE);
SetCell("X", MarketDataExchange.PHLX);
SetCell("Y", MarketDataExchange.DIRECTEDGE);
}
private static void SetCell(string s, MarketDataExchange exchange)
{
char x = 'A', y = 'A';
switch(s.Length)
{
case 1:
x = s[0];
break;
case 2:
x = s[0];
y = s[1];
break;
}
_table[x, y] = exchange;
}
}
Make the enum byte-based to save a little space.
public enum MarketDataExchange : byte
{
NBBO, AMEX, BSE, BATS, NSE, CHX, NYSE, ARCA,
NASDAQ, NASDAQ_ADF, CBOE, PHLX, DIRECTEDGE, NONE
}
If the enumeration values are arbitrary you could do this...
public static MarketDataExchange GetValue(string input)
{
switch (input.Length)
{
case 0: return MarketDataExchange.NBBO;
case 1: return (MarketDataExchange)input[0];
case 2: return (MarketDataExchange)(input[0] << 8 | input[1]);
default: return MarketDataExchange.None;
}
}
... if you want to go totally nuts you can also use an unsafe call with pointers as noted by Pavel Minaev ...
The pure cast version above is faster than this unsafe version.
unsafe static MarketDataExchange GetValue(string input)
{
if (input.Length == 1)
return (MarketDataExchange)(input[0]);
fixed (char* buffer = input)
return (MarketDataExchange)(buffer[0] << 8 | buffer[1]);
}
public enum MarketDataExchange
{
NBBO = 0x00, //
AMEX = 0x41, //A
BSE = 0x42, //B
BATS = 0x4254, //BT
NSE = 0x43, //C
CHX = 0x4D57, //MW
NYSE = 0x4E, //N
ARCA = 0x5041, //PA
NASDAQ = 0x51, //Q
NASDAQ_ADF = 0x5144, //QD
CBOE = 0x57, //W
PHLX = 0x58, //X
DIRECTEDGE = 0x59, //Y
None = -1
}
+1 for using a dictionary. Not necessarily for optimization, but it'd be cleaner.
I would probably use constants for the strings as well, though I doubt that'd buy you anything performance-wise.
Messy but using a combination of nested ifs and hard coding might just beat the optimiser:-
if (ActivCode < "N") {
// "" to "MW"
if (ActiveCode < "BT") {
// "" to "B"
if (ActiveCode < "B") {
// "" or "A"
if (ActiveCode < "A") {
// must be ""
return MarketDataExchange.NBBO;
} else {
// must be "A"
return MarketDataExchange.AMEX;
}
} else {
// must be "B"
return MarketDataExchange.BSE;
}
} else {
// "BT" to "MW"
if (ActiveCode < "MW") {
// "BT" or "C"
if (ActiveCode < "C") {
// must be "BT"
return MarketDataExchange.BATS;
} else {
// must be "C"
return MarketDataExchange.NSE;
}
} else {
// must be "MW"
return MarketDataExchange.CHX;
}
}
} else {
// "N" TO "Y"
if (ActiveCode < "QD") {
// "N" to "Q"
if (ActiveCode < "Q") {
// "N" or "PA"
if (ActiveCode < "PA") {
// must be "N"
return MarketDataExchange.NYSE;
} else {
// must be "PA"
return MarketDataExchange.ARCA;
}
} else {
// must be "Q"
return MarketDataExchange.NASDAQ;
}
} else {
// "QD" to "Y"
if (ActiveCode < "X") {
// "QD" or "W"
if (ActiveCode < "W") {
// must be "QD"
return MarketDataExchange.NASDAQ_ADF;
} else {
// must be "W"
return MarketDataExchange.CBOE;
}
} else {
// "X" or "Y"
if (ActiveCode < "Y") {
// must be "X"
return MarketDataExchange.PHLX;
} else {
// must be "Y"
return MarketDataExchange.DIRECTEDGE;
}
}
}
}
This reaches the right return with three or four compares. I wouldn't even think of doing this for real unless your piece of code is expected to run several times a second!
You could further optimise it so that only single-character compares occur,
e.g. replace '< "BT"' with '<= "B"' -- ever so slightly faster and even less readable!
All your strings are at most 2 chars long, and ASCII, so we can use 1 byte per char.
Furthermore, more likely than not, they can never contain \0 (a .NET string allows embedded null characters, but many other things don't). With that assumption, we can null-pad all your strings to be exactly 2 bytes each, i.e. a ushort:
"" -> (byte) 0 , (byte) 0 -> (ushort)0x0000
"A" -> (byte)'A', (byte) 0 -> (ushort)0x0041
"B" -> (byte)'B', (byte) 0 -> (ushort)0x0042
"BT" -> (byte)'B', (byte)'T' -> (ushort)0x5442
Now that we have a single integer in a relatively small range (64K values), we can use a lookup table:
MarketDataExchange[] lookup = {
MarketDataExchange.NBBO,
MarketDataExchange.NONE,
MarketDataExchange.NONE,
...
/* at index 0x0041 */
MarketDataExchange.AMEX,
MarketDataExchange.BSE,
MarketDataExchange.NSE,
...
};
Now, obtaining the value given a string is:
public static unsafe MarketDataExchange GetMarketDataExchange(string s)
{
// Assume valid input
if (s.Length == 0) return MarketDataExchange.NBBO;
// .NET strings always have '\0' after end of data - abuse that
// to avoid extra checks for 1-char strings. Skip index checks as well.
ushort hash;
fixed (char* data = s)
{
// The | operator yields an int in C#, so cast the combined value back to ushort.
hash = (ushort)(data[0] | (data[1] << 8));
}
return lookup[hash];
}
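The table itself does not have to be typed out by hand. Here is a minimal sketch that fills it at startup, assuming the same low-byte/high-byte packing as above and the code list from the question. The names ExchangeTable, BuildTable, Key and Get are hypothetical, and this variant uses ordinary bounds-checked indexing instead of the unsafe pointer read, so it is a safer substitute rather than the same technique.

```csharp
using System;

public enum MarketDataExchange : byte
{
    NBBO, AMEX, BSE, BATS, NSE, CHX, NYSE, ARCA,
    NASDAQ, NASDAQ_ADF, CBOE, PHLX, DIRECTEDGE, NONE
}

public static class ExchangeTable
{
    private static readonly MarketDataExchange[] Lookup = BuildTable();

    // First char in the low byte, second char (or 0) in the high byte.
    private static int Key(string s)
    {
        int lo = s.Length > 0 ? s[0] : 0;
        int hi = s.Length > 1 ? s[1] : 0;
        return (lo | (hi << 8)) & 0xFFFF; // mask keeps non-ASCII input in range
    }

    private static MarketDataExchange[] BuildTable()
    {
        var table = new MarketDataExchange[ushort.MaxValue + 1];
        for (int i = 0; i < table.Length; i++)
            table[i] = MarketDataExchange.NONE;

        table[Key("")] = MarketDataExchange.NBBO;
        table[Key("A")] = MarketDataExchange.AMEX;
        table[Key("B")] = MarketDataExchange.BSE;
        table[Key("BT")] = MarketDataExchange.BATS;
        table[Key("C")] = MarketDataExchange.NSE;
        table[Key("MW")] = MarketDataExchange.CHX;
        table[Key("N")] = MarketDataExchange.NYSE;
        table[Key("PA")] = MarketDataExchange.ARCA;
        table[Key("Q")] = MarketDataExchange.NASDAQ;
        table[Key("QD")] = MarketDataExchange.NASDAQ_ADF;
        table[Key("W")] = MarketDataExchange.CBOE;
        table[Key("X")] = MarketDataExchange.PHLX;
        table[Key("Y")] = MarketDataExchange.DIRECTEDGE;
        return table;
    }

    // Assumes s is non-null; only the first two characters are examined.
    public static MarketDataExchange Get(string s)
    {
        return Lookup[Key(s)];
    }
}
```

With the byte-based enum the table takes 64 KB, and unknown one- or two-character codes fall through to NONE.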
Put the cases in a sorted structure with non-linear access (like a hash table).
The switch that you have will run in linear time.
You can get a mild speed-up by ordering the codes according to which ones are most used.
But I agree with Cletus: the best speed-up I can think of would be to use a hash map with plenty of room (so that there are no collisions).
A couple of random thoughts, that may not all be applicable together:
Switch on the first character in the string, rather than the string itself, and do a sub-switch for strings which can contain more than one letter?
A hashtable would certainly guarantee O(1) retrieval, though it might not be faster for smaller numbers of comparisons.
Don't use strings, use enums or something like a flyweight instead. Using strings in this case seems a bit fragile anyway...
And if you really need it to be as fast as possible, why aren't you writing it in assembly? :)
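The first of those thoughts, switching on the first character and sub-switching where a second letter is possible, might look roughly like this. It is only a sketch: the class name FirstCharSwitch and the method name Get are made up, the code list is taken from the question, and it assumes codes never contain an embedded '\0' (which is used here to mean "no second character").

```csharp
public enum MarketDataExchange
{
    NBBO, AMEX, BSE, BATS, NSE, CHX, NYSE, ARCA,
    NASDAQ, NASDAQ_ADF, CBOE, PHLX, DIRECTEDGE, NONE
}

public static class FirstCharSwitch
{
    // Helper for codes that are valid only as a single letter.
    private static MarketDataExchange Single(char second, MarketDataExchange value)
    {
        return second == '\0' ? value : MarketDataExchange.NONE;
    }

    public static MarketDataExchange Get(string code)
    {
        if (code == null || code.Length > 2) return MarketDataExchange.NONE;
        if (code.Length == 0) return MarketDataExchange.NBBO;

        // '\0' stands in for "no second character".
        char second = code.Length == 2 ? code[1] : '\0';

        switch (code[0])
        {
            case 'A': return Single(second, MarketDataExchange.AMEX);
            case 'B':
                switch (second)
                {
                    case '\0': return MarketDataExchange.BSE;
                    case 'T': return MarketDataExchange.BATS;
                    default: return MarketDataExchange.NONE;
                }
            case 'C': return Single(second, MarketDataExchange.NSE);
            case 'M': return second == 'W' ? MarketDataExchange.CHX : MarketDataExchange.NONE;
            case 'N': return Single(second, MarketDataExchange.NYSE);
            case 'P': return second == 'A' ? MarketDataExchange.ARCA : MarketDataExchange.NONE;
            case 'Q':
                switch (second)
                {
                    case '\0': return MarketDataExchange.NASDAQ;
                    case 'D': return MarketDataExchange.NASDAQ_ADF;
                    default: return MarketDataExchange.NONE;
                }
            case 'W': return Single(second, MarketDataExchange.CBOE);
            case 'X': return Single(second, MarketDataExchange.PHLX);
            case 'Y': return Single(second, MarketDataExchange.DIRECTEDGE);
            default: return MarketDataExchange.NONE;
        }
    }
}
```

A switch over char values can be compiled to a jump table, so no string comparison is needed at all.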
Can we cast the ActivCode to int and then use int in our case statements?
Use the length of the code to create a unique value from that code instead of using GetHashCode(). It turns out there are no collisions if you use the first letter of the code shifted left by the length of the code. This reduces the cost to two comparisons, one array index and one shift (on average).
public static MarketDataExchange GetMarketDataExchange(string ActivCode)
{
if (ActivCode == null)
return MarketDataExchange.NONE;
if (ActivCode.Length == 0)
return MarketDataExchange.NBBO;
return (MarketDataExchange)((ActivCode[0] << ActivCode.Length));
}
public enum MarketDataExchange
{
NONE = 0,
NBBO = 1,
AMEX = ('A'<<1),
BSE = ('B'<<1),
BATS = ('B'<<2),
NSE = ('C'<<1),
CHX = ('M'<<2),
NYSE = ('N'<<1),
ARCA = ('P'<<2),
NASDAQ = ('Q'<<1),
NASDAQ_ADF = ('Q'<<2),
CBOE = ('W'<<1),
PHLX = ('X'<<1),
DIRECTEDGE = ('Y'<<1),
}
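The no-collision claim is easy to check mechanically. Here is a small sketch, assuming the twelve non-empty codes from the question; the class name ShiftKeyCheck and its members are invented for the example.

```csharp
using System.Collections.Generic;

public static class ShiftKeyCheck
{
    // Non-empty codes from the question; "" is handled separately as NBBO.
    public static readonly string[] Codes =
    {
        "A", "B", "BT", "C", "MW", "N", "PA", "Q", "QD", "W", "X", "Y"
    };

    // Same key as the enum values above: first letter shifted left by the length.
    public static int Key(string code)
    {
        return code[0] << code.Length;
    }

    // True when every code yields a distinct key that is also
    // distinct from NONE (0) and NBBO (1).
    public static bool KeysAreUnique()
    {
        var seen = new HashSet<int> { 0, 1 };
        foreach (string code in Codes)
        {
            if (!seen.Add(Key(code)))
                return false;
        }
        return true;
    }
}
```

Note that the scheme depends on this particular code list: a new two-letter code starting with an existing two-letter code's first letter (say "BX" next to "BT") would collide, so the check is worth rerunning whenever codes are added.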
