Which one is faster? Regex or EndsWith? - c#

What would be faster?
public String Roll()
{
Random rnd = new Random();
int roll = rnd.Next(1, 100000);
if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$"))
{
return "doubles";
}
return "none";
}
Or
public String Roll()
{
Random rnd = new Random();
int roll = rnd.Next(1, 100000);
if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22") || roll.ToString().EndsWith("33") || roll.ToString().EndsWith("44") || roll.ToString().EndsWith("55") || roll.ToString().EndsWith("66") || roll.ToString().EndsWith("77") || roll.ToString().EndsWith("88") || roll.ToString().EndsWith("99") || roll.ToString().EndsWith("00"))
{
return "doubles";
}
return "none";
}
I'm currently using a really long list of if statements full of regexes to check whether an int ends with doubles, triples, quads, quints, etc., so I would like to know which one is faster before I change all of it.

In your particular case, Regex is actually faster, but that is likely because your EndsWith version chains many ORs and calls ToString() redundantly. If you simplify the logic, a simple string operation will likely be faster.
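As one way to simplify the logic without any string work, the sketch below counts how many identical digits an int ends with using only integer arithmetic, so doubles, triples, quads and so on can share a single check. The names TrailingDigits, TrailingRun and Describe are illustrative assumptions, not part of the question's code, and the sketch assumes non-negative input.

```csharp
using System;

static class TrailingDigits
{
    // Counts how many times the last decimal digit repeats at the end of a
    // non-negative value, e.g. 12333 -> 3, 7 -> 1, 100 -> 2.
    public static int TrailingRun(int value)
    {
        int last = value % 10;
        int run = 0;
        do
        {
            run++;
            value /= 10;
        } while (value > 0 && value % 10 == last);
        return run;
    }

    // Hypothetical mapping from run length to a label.
    public static string Describe(int value)
    {
        switch (TrailingRun(value))
        {
            case 1: return "none";
            case 2: return "doubles";
            case 3: return "triples";
            case 4: return "quads";
            default: return "quints or more";
        }
    }
}
```

Unlike the regex in the question, which reports "doubles" for any run of two or more, this version distinguishes the run lengths, so one call can replace the whole list of if statements.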
Here is the performance summary for text processing, from the fastest to the slowest (10 million iterations [Prefer/Non-Prefer 32-bit]; the rank is ordered by the faster of the two):
Large Lookup Fast Random UInt (not counted for bounty): 219/273 ms - Mine, improved from Evk's
Large Lookup Optimized Random: 228/273 ms - Ivan Stoev's Alternate Solution
Large Lookup Fast Random: 238/294 ms - Evk's Alternative Solution
Large Lookup Parameterless Random: 248/287 ms - Thomas Ayoub
There are a few notes I want to make on this solution (based on the comments below it):
This solution introduces a 0.0039% bias towards small numbers (< 100000) (ref: Eric Lippert's blog post, linked by Lucas Trzesniewski)
Does not generate the same number sequence as the others while being tested (ref: Michael Liu's comment), since it changes the way Random is used (from Random.Next(int) to Random.Next()), and Random is used for the testing itself.
While the testing cannot be performed with the exact same number sequence for this method as for the rest (as mentioned by Phil1970), I have two points to make:
Some might be interested to look at the implementation of Random.Next() vs Random.Next(int) to understand why this solution will still be faster even if the same sequence of numbers is used.
In real use, Random will (most of the time) not be assumed to produce the same (or a predictable) number sequence. It is only for testing our methods that we want the Random sequence to be identical (for fair testing purposes). The expected faster result for this method cannot be fully derived from the test result alone, but also from looking at the Next() vs Next(int) implementation.
Large Look-up: 320/284 ms - Evk
Fastest Optimized Random Modded: 286/333 ms - Ivan Stoev
Lookup Optimized Modded: 315/329 ms - Corak
Optimized Modded: 471/330 ms - Stian Standahl
Optimized Modded + Constant: 472/337 ms - Gjermund Grøneng
Fastest Optimized Modded: 345/340 ms - Gjermund Grøneng
Modded: 496/370 ms - Corak + possibly Alexei Levenkov
Numbers: 537/408 ms - Alois Kraus
Simple: 1668/1176 ms - Mine
HashSet Contains: 2138/1609 ms - Dandré
List Contains: 3013/2465 ms - Another Mine
Compiled Regex: 8956/7675 ms - Radin Gospodinov
Regex: 15032/16640 ms - OP's Solution 1
EndsWith: 24763/20702 ms - OP's Solution 2
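The bias noted above for the parameterless Random entry comes from reducing Random.Next(), which is documented to return values in [0, int.MaxValue), with % 100000. A minimal sketch that counts how many source values land on a given remainder (ModuloBias and CountForResidue are made-up names for illustration):

```csharp
using System;

static class ModuloBias
{
    // Number of values v in [0, sourceCount) for which v % modulus == residue.
    public static long CountForResidue(long sourceCount, long modulus, long residue)
    {
        long fullCycles = sourceCount / modulus;   // every residue gets this many
        long leftover = sourceCount % modulus;     // the first 'leftover' residues get one more
        return fullCycles + (residue < leftover ? 1 : 0);
    }
}
```

With sourceCount = int.MaxValue and modulus = 100000, remainders 0 through 83646 each receive 21475 source values and the remaining ones receive 21474, which is the small skew towards low numbers.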
Here are my simple test cases:
Random rnd = new Random(10000);
FastRandom fastRnd = new FastRandom(10000);
//OP's Solution 1
public String RollRegex() {
int roll = rnd.Next(1, 100000);
if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$")) {
return "doubles";
} else {
return "none";
}
}
//Radin Gospodinov's Solution
Regex optionRegex = new Regex(@"(.)\1{1,}$", RegexOptions.Compiled);
public String RollOptionRegex() {
int roll = rnd.Next(1, 100000);
string rollString = roll.ToString();
if (optionRegex.IsMatch(roll.ToString())) {
return "doubles";
} else {
return "none";
}
}
//OP's Solution 2
public String RollEndsWith() {
int roll = rnd.Next(1, 100000);
if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22") || roll.ToString().EndsWith("33") || roll.ToString().EndsWith("44") || roll.ToString().EndsWith("55") || roll.ToString().EndsWith("66") || roll.ToString().EndsWith("77") || roll.ToString().EndsWith("88") || roll.ToString().EndsWith("99") || roll.ToString().EndsWith("00")) {
return "doubles";
} else {
return "none";
}
}
//My Solution
public String RollSimple() {
int roll = rnd.Next(1, 100000);
string rollString = roll.ToString();
return roll > 10 && rollString[rollString.Length - 1] == rollString[rollString.Length - 2] ?
"doubles" : "none";
}
//My Other Solution
List<string> doubles = new List<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public String RollAnotherSimple() {
int roll = rnd.Next(1, 100000);
string rollString = roll.ToString();
return rollString.Length > 1 && doubles.Contains(rollString.Substring(rollString.Length - 2)) ?
"doubles" : "none";
}
//Dandré's Solution
HashSet<string> doublesHashset = new HashSet<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public String RollSimpleHashSet() {
int roll = rnd.Next(1, 100000);
string rollString = roll.ToString();
return rollString.Length > 1 && doublesHashset.Contains(rollString.Substring(rollString.Length - 2)) ?
"doubles" : "none";
}
//Corak's Solution - hinted by Alexei Levenkov too
public string RollModded() { int roll = rnd.Next(1, 100000); return roll % 100 % 11 == 0 ? "doubles" : "none"; }
//Stian Standahl optimizes modded solution
public string RollOptimizedModded() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? "doubles" : "none"; }
//Gjermund Grøneng's method with constant addition
private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public string RollOptimizedModdedConst() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }
//Gjermund Grøneng's method after optimizing the Random (The fastest!)
public string FastestOptimizedModded() { return (rnd.Next(99999) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }
//Corak's Solution, added on Gjermund Grøneng's
private readonly string[] Lookup = { "doubles", "none", "none", "none", "none", "none", "none", "none", "none", "none", "none" };
public string RollLookupOptimizedModded() { return Lookup[(rnd.Next(99999) + 1) % 100 % 11]; }
//Evk's Solution, large Lookup
private string[] LargeLookup;
private void InitLargeLookup() {
LargeLookup = new string[100000];
for (int i = 0; i < 100000; i++) {
LargeLookup[i] = i % 100 % 11 == 0 ? "doubles" : "none";
}
}
public string RollLargeLookup() { return LargeLookup[rnd.Next(99999) + 1]; }
//Thomas Ayoub's Solution
public string RollLargeLookupParameterlessRandom() { return LargeLookup[rnd.Next() % 100000]; }
//Alois Kraus's Solution
public string RollNumbers() {
int roll = rnd.Next(1, 100000);
int lastDigit = roll % 10;
int secondLastDigit = (roll / 10) % 10;
if (lastDigit == secondLastDigit) {
return "doubles";
} else {
return "none";
}
}
//Ivan Stoev's Solution
public string FastestOptimizedRandomModded() {
return ((int)(rnd.Next() * (99999.0 / int.MaxValue)) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}
//Ivan Stoev's Alternate Solution
public string RollLargeLookupOptimizedRandom() {
return LargeLookup[(int)(rnd.Next() * (99999.0 / int.MaxValue))];
}
//Evk's Solution using FastRandom
public string RollLargeLookupFastRandom() {
return LargeLookup[fastRnd.Next(99999) + 1];
}
//My Own Test, using FastRandom + NextUInt
public string RollLargeLookupFastRandomUInt() {
return LargeLookup[fastRnd.NextUInt() % 99999 + 1];
}
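Since several of the variants above rely on roll % 100 % 11 == 0 being equivalent to "the last two digits are equal", here is a small sketch that checks the arithmetic form against a string-based reference over the whole tested range. DoublesCheck and its method names are assumptions made for this illustration.

```csharp
using System;

static class DoublesCheck
{
    // Arithmetic check used by the "Modded" variants.
    public static bool ByModulo(int roll)
    {
        return roll % 100 % 11 == 0;
    }

    // String-based reference: the last two characters are equal.
    public static bool ByString(int roll)
    {
        string s = roll.ToString();
        return s.Length > 1 && s[s.Length - 1] == s[s.Length - 2];
    }

    // Returns the first roll in [1, 99999] where the two checks disagree, or -1.
    public static int FirstMismatch()
    {
        for (int roll = 1; roll < 100000; roll++)
        {
            if (ByModulo(roll) != ByString(roll)) return roll;
        }
        return -1;
    }
}
```

FirstMismatch() returns -1 for this range. Note that the two checks do disagree for 0 (the modulo form reports doubles, the string form does not), which matters for any variant whose random range includes 0.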
The additional FastRandom class:
//FastRandom's part used for the testing
public class FastRandom {
// The +1 ensures NextDouble doesn't generate 1.0
const double REAL_UNIT_INT = 1.0 / ((double)int.MaxValue + 1.0);
const double REAL_UNIT_UINT = 1.0 / ((double)uint.MaxValue + 1.0);
const uint Y = 842502087, Z = 3579807591, W = 273326509;
uint x, y, z, w;
#region Constructors
/// <summary>
/// Initialises a new instance using time dependent seed.
/// </summary>
public FastRandom() {
// Initialise using the system tick count.
Reinitialise((int)Environment.TickCount);
}
/// <summary>
/// Initialises a new instance using an int value as seed.
/// This constructor signature is provided to maintain compatibility with
/// System.Random
/// </summary>
public FastRandom(int seed) {
Reinitialise(seed);
}
#endregion
#region Public Methods [Reinitialisation]
/// <summary>
/// Reinitialises using an int value as a seed.
/// </summary>
/// <param name="seed"></param>
public void Reinitialise(int seed) {
// The only stipulation stated for the xorshift RNG is that at least one of
// the seeds x,y,z,w is non-zero. We fulfill that requirement by only allowing
// resetting of the x seed
x = (uint)seed;
y = Y;
z = Z;
w = W;
}
#endregion
#region Public Methods [System.Random functionally equivalent methods]
/// <summary>
/// Generates a random int over the range 0 to int.MaxValue-1.
/// MaxValue is not generated in order to remain functionally equivalent to System.Random.Next().
/// This does slightly eat into some of the performance gain over System.Random, but not much.
/// For better performance see:
///
/// Call NextInt() for an int over the range 0 to int.MaxValue.
///
/// Call NextUInt() and cast the result to an int to generate an int over the full Int32 value range
/// including negative values.
/// </summary>
/// <returns></returns>
public int Next() {
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
// Handle the special case where the value int.MaxValue is generated. This is outside of
// the range of permitted values, so we therefore call Next() to try again.
uint rtn = w & 0x7FFFFFFF;
if (rtn == 0x7FFFFFFF)
return Next();
return (int)rtn;
}
/// <summary>
/// Generates a random int over the range 0 to upperBound-1, and not including upperBound.
/// </summary>
/// <param name="upperBound"></param>
/// <returns></returns>
public int Next(int upperBound) {
if (upperBound < 0)
throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=0");
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
// The explicit int cast before the first multiplication gives better performance.
// See comments in NextDouble.
return (int)((REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * upperBound);
}
/// <summary>
/// Generates a random int over the range lowerBound to upperBound-1, and not including upperBound.
/// upperBound must be >= lowerBound. lowerBound may be negative.
/// </summary>
/// <param name="lowerBound"></param>
/// <param name="upperBound"></param>
/// <returns></returns>
public int Next(int lowerBound, int upperBound) {
if (lowerBound > upperBound)
throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=lowerBound");
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
// The explicit int cast before the first multiplication gives better performance.
// See comments in NextDouble.
int range = upperBound - lowerBound;
if (range < 0) { // If range is <0 then an overflow has occurred and we must resort to using long integer arithmetic instead (slower).
// We also must use all 32 bits of precision, instead of the normal 31, which again is slower.
return lowerBound + (int)((REAL_UNIT_UINT * (double)(w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))) * (double)((long)upperBound - (long)lowerBound));
}
// 31 bits of precision will suffice if range<=int.MaxValue. This allows us to cast to an int and gain
// a little more performance.
return lowerBound + (int)((REAL_UNIT_INT * (double)(int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * (double)range);
}
/// <summary>
/// Generates a random double. Values returned are from 0.0 up to but not including 1.0.
/// </summary>
/// <returns></returns>
public double NextDouble() {
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
// Here we can gain a 2x speed improvement by generating a value that can be cast to
// an int instead of the more easily available uint. If we then explicitly cast to an
// int the compiler will then cast the int to a double to perform the multiplication,
// this final cast is a lot faster than casting from a uint to a double. The extra cast
// to an int is very fast (the allocated bits remain the same) and so the overall effect
// of the extra cast is a significant performance improvement.
//
// Also note that the loss of one bit of precision is equivalent to what occurs within
// System.Random.
return (REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))));
}
/// <summary>
/// Fills the provided byte array with random bytes.
/// This method is functionally equivalent to System.Random.NextBytes().
/// </summary>
/// <param name="buffer"></param>
public void NextBytes(byte[] buffer) {
// Fill up the bulk of the buffer in chunks of 4 bytes at a time.
uint x = this.x, y = this.y, z = this.z, w = this.w;
int i = 0;
uint t;
for (int bound = buffer.Length - 3; i < bound; ) {
// Generate 4 bytes.
// Increased performance is achieved by generating 4 random bytes per loop.
// Also note that no mask needs to be applied to zero out the higher order bytes before
// casting because the cast ignores those bytes. Thanks to Stefan Troschütz for pointing this out.
t = (x ^ (x << 11));
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
buffer[i++] = (byte)w;
buffer[i++] = (byte)(w >> 8);
buffer[i++] = (byte)(w >> 16);
buffer[i++] = (byte)(w >> 24);
}
// Fill up any remaining bytes in the buffer.
if (i < buffer.Length) {
// Generate 4 bytes.
t = (x ^ (x << 11));
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
buffer[i++] = (byte)w;
if (i < buffer.Length) {
buffer[i++] = (byte)(w >> 8);
if (i < buffer.Length) {
buffer[i++] = (byte)(w >> 16);
if (i < buffer.Length) {
buffer[i] = (byte)(w >> 24);
}
}
}
}
this.x = x; this.y = y; this.z = z; this.w = w;
}
// /// <summary>
// /// A version of NextBytes that uses a pointer to set 4 bytes of the byte buffer in one operation
// /// thus providing a nice speedup. The loop is also partially unrolled to allow out-of-order-execution,
// /// this results in about a x2 speedup on an AMD Athlon. Thus performance may vary wildly on different CPUs
// /// depending on the number of execution units available.
// ///
// /// Another significant speedup is obtained by setting the 4 bytes by indexing pDWord (e.g. pDWord[i++]=w)
// /// instead of adjusting it dereferencing it (e.g. *pDWord++=w).
// ///
// /// Note that this routine requires the unsafe compilation flag to be specified and so is commented out by default.
// /// </summary>
// /// <param name="buffer"></param>
// public unsafe void NextBytesUnsafe(byte[] buffer)
// {
// if(buffer.Length % 8 != 0)
// throw new ArgumentException("Buffer length must be divisible by 8", "buffer");
//
// uint x=this.x, y=this.y, z=this.z, w=this.w;
//
// fixed(byte* pByte0 = buffer)
// {
// uint* pDWord = (uint*)pByte0;
// for(int i=0, len=buffer.Length>>2; i < len; i+=2)
// {
// uint t=(x^(x<<11));
// x=y; y=z; z=w;
// pDWord[i] = w = (w^(w>>19))^(t^(t>>8));
//
// t=(x^(x<<11));
// x=y; y=z; z=w;
// pDWord[i+1] = w = (w^(w>>19))^(t^(t>>8));
// }
// }
//
// this.x=x; this.y=y; this.z=z; this.w=w;
// }
#endregion
#region Public Methods [Methods not present on System.Random]
/// <summary>
/// Generates a uint. Values returned are over the full range of a uint,
/// uint.MinValue to uint.MaxValue, inclusive.
///
/// This is the fastest method for generating a single random number because the underlying
/// random number generator algorithm generates 32 random bits that can be cast directly to
/// a uint.
/// </summary>
/// <returns></returns>
public uint NextUInt() {
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
return (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)));
}
/// <summary>
/// Generates a random int over the range 0 to int.MaxValue, inclusive.
/// This method differs from Next() only in that the range is 0 to int.MaxValue
/// and not 0 to int.MaxValue-1.
///
/// The slight difference in range means this method is slightly faster than Next()
/// but is not functionally equivalent to System.Random.Next().
/// </summary>
/// <returns></returns>
public int NextInt() {
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
return (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))));
}
// Buffer 32 bits in bitBuffer, return 1 at a time, keep track of how many have been returned
// with bitBufferIdx.
uint bitBuffer;
uint bitMask = 1;
/// <summary>
/// Generates a single random bit.
/// This method's performance is improved by generating 32 bits in one operation and storing them
/// ready for future calls.
/// </summary>
/// <returns></returns>
public bool NextBool() {
if (bitMask == 1) {
// Generate 32 more bits.
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
bitBuffer = w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
// Reset the bitMask that tells us which bit to read next.
bitMask = 0x80000000;
return (bitBuffer & bitMask) == 0;
}
return (bitBuffer & (bitMask >>= 1)) == 0;
}
#endregion
}
The test scenario:
public delegate string RollDelegate();
private void Test() {
List<string> rollMethodNames = new List<string>(){
"Large Lookup Fast Random UInt",
"Large Lookup Fast Random",
"Large Lookup Optimized Random",
"Fastest Optimized Random Modded",
"Numbers",
"Large Lookup Parameterless Random",
"Large Lookup",
"Lookup Optimized Modded",
"Fastest Optimized Modded",
"Optimized Modded Const",
"Optimized Modded",
"Modded",
"Simple",
"Another simple with HashSet",
"Another Simple",
"Option (Compiled) Regex",
"Regex",
"EndsWith",
};
List<RollDelegate> rollMethods = new List<RollDelegate>{
RollLargeLookupFastRandomUInt,
RollLargeLookupFastRandom,
RollLargeLookupOptimizedRandom,
FastestOptimizedRandomModded,
RollNumbers,
RollLargeLookupParameterlessRandom,
RollLargeLookup,
RollLookupOptimizedModded,
FastestOptimizedModded,
RollOptimizedModdedConst,
RollOptimizedModded,
RollModded,
RollSimple,
RollSimpleHashSet,
RollAnotherSimple,
RollOptionRegex,
RollRegex,
RollEndsWith
};
int trial = 10000000;
InitLargeLookup();
for (int k = 0; k < rollMethods.Count; ++k) {
rnd = new Random(10000);
fastRnd = new FastRandom(10000);
logBox.GetTimeLapse();
for (int i = 0; i < trial; ++i)
rollMethods[k]();
logBox.WriteTimedLogLine(rollMethodNames[k] + ": " + logBox.GetTimeLapse());
}
}
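The logBox helper used for timing is not shown in this answer. As a stand-in, a timing loop could be sketched with System.Diagnostics.Stopwatch; Bench and TimeMs are hypothetical names, and the warm-up call is an addition of this sketch, not something the original test does.

```csharp
using System;
using System.Diagnostics;

static class Bench
{
    // Runs action the given number of times and returns the elapsed milliseconds.
    public static long TimeMs(Func<string> action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not part of the measurement
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

A call such as Bench.TimeMs(RollModded, 10000000) would then time one method. Because the warm-up call consumes one random number, the sequence seen by the timed loop is shifted by one compared with the test above.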
The result (Prefer 32-Bit):
[2016-05-30 08:20:54.056 UTC] Large Lookup Fast Random UInt: 219 ms
[2016-05-30 08:20:54.296 UTC] Large Lookup Fast Random: 238 ms
[2016-05-30 08:20:54.524 UTC] Large Lookup Optimized Random: 228 ms
[2016-05-30 08:20:54.810 UTC] Fastest Optimized Random Modded: 286 ms
[2016-05-30 08:20:55.347 UTC] Numbers: 537 ms
[2016-05-30 08:20:55.596 UTC] Large Lookup Parameterless Random: 248 ms
[2016-05-30 08:20:55.916 UTC] Large Lookup: 320 ms
[2016-05-30 08:20:56.231 UTC] Lookup Optimized Modded: 315 ms
[2016-05-30 08:20:56.577 UTC] Fastest Optimized Modded: 345 ms
[2016-05-30 08:20:57.049 UTC] Optimized Modded Const: 472 ms
[2016-05-30 08:20:57.521 UTC] Optimized Modded: 471 ms
[2016-05-30 08:20:58.017 UTC] Modded: 496 ms
[2016-05-30 08:20:59.685 UTC] Simple: 1668 ms
[2016-05-30 08:21:01.824 UTC] Another simple with HashSet: 2138 ms
[2016-05-30 08:21:04.837 UTC] Another Simple: 3013 ms
[2016-05-30 08:21:13.794 UTC] Option (Compiled) Regex: 8956 ms
[2016-05-30 08:21:28.827 UTC] Regex: 15032 ms
[2016-05-30 08:21:53.589 UTC] EndsWith: 24763 ms
The result (Non Prefer 32-Bit):
[2016-05-30 08:16:00.934 UTC] Large Lookup Fast Random UInt: 273 ms
[2016-05-30 08:16:01.230 UTC] Large Lookup Fast Random: 294 ms
[2016-05-30 08:16:01.503 UTC] Large Lookup Optimized Random: 273 ms
[2016-05-30 08:16:01.837 UTC] Fastest Optimized Random Modded: 333 ms
[2016-05-30 08:16:02.245 UTC] Numbers: 408 ms
[2016-05-30 08:16:02.532 UTC] Large Lookup Parameterless Random: 287 ms
[2016-05-30 08:16:02.816 UTC] Large Lookup: 284 ms
[2016-05-30 08:16:03.145 UTC] Lookup Optimized Modded: 329 ms
[2016-05-30 08:16:03.486 UTC] Fastest Optimized Modded: 340 ms
[2016-05-30 08:16:03.824 UTC] Optimized Modded Const: 337 ms
[2016-05-30 08:16:04.154 UTC] Optimized Modded: 330 ms
[2016-05-30 08:16:04.524 UTC] Modded: 370 ms
[2016-05-30 08:16:05.700 UTC] Simple: 1176 ms
[2016-05-30 08:16:07.309 UTC] Another simple with HashSet: 1609 ms
[2016-05-30 08:16:09.774 UTC] Another Simple: 2465 ms
[2016-05-30 08:16:17.450 UTC] Option (Compiled) Regex: 7675 ms
[2016-05-30 08:16:34.090 UTC] Regex: 16640 ms
[2016-05-30 08:16:54.793 UTC] EndsWith: 20702 ms

@StianStandahl's friend here. This solution is the fastest! It is the same as the previous fastest example in @Ian's answer, but the call to the random generator is optimized here.
private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public string FastestOptimizedModded()
{
return (rnd.Next(99999)+1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}

As for the best performance, I believe @Ian already covered that quite nicely. All credit goes to him.
One thing that isn't answered to my satisfaction in the Q/A is why regexes outperform EndsWith in this case. I felt the need to explain the difference so people realize which solution will probably work better in which scenario.
EndsWith
The EndsWith functionality is basically a 'compare' on part of the string in sequential order. Something like this:
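bool EndsWith(string haystack, string needle)
{
    // The needle can only match if the haystack is at least as long.
    bool equal = haystack.Length >= needle.Length;
    // Start of the haystack's tail that is compared against the needle.
    int offset = haystack.Length - needle.Length;
    for (int i = 0; i < needle.Length && equal; ++i)
    {
        equal = haystack[offset + i] == needle[i];
    }
    return equal;
}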
The code is pretty straightforward; we simply take the first character of the needle, see if it matches, then the next, and so on, until we hit the end of the string.
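One related detail worth checking: in .NET, string.EndsWith(string) performs a culture-sensitive comparison by default, which does considerably more work than the character loop above, whereas the overload taking StringComparison.Ordinal compares raw characters. Below is a sketch of the question's second version with a single ToString() call and ordinal comparisons. The class and method names are illustrative only, and this variant was not part of the benchmarks in this thread.

```csharp
using System;

static class EndsWithOrdinal
{
    static readonly string[] Doubles =
        { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };

    public static bool EndsWithDoubles(int roll)
    {
        string s = roll.ToString(); // convert once instead of ten times
        foreach (string d in Doubles)
        {
            // Ordinal comparison: plain character-by-character check.
            if (s.EndsWith(d, StringComparison.Ordinal)) return true;
        }
        return false;
    }
}
```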
Regex
Regexes work differently. Consider looking for the needle "foofoo" in a very large haystack. The obvious implementation is to start at the first character, check if it's an 'f', move to the next character, and so on, until we hit the end of the string. However, we can do much better:
Look closely at the task. If we first look at character 5 of the string (counting from 0) and notice that it's not an 'o' (the last character of the needle), no match can end there; if that character does not occur in the needle at all, we can immediately skip to character 11 and again check if it's an 'o'. That way, we get a nice improvement over our original code of up to a factor of 6 in the best case, and the same performance in the worst case.
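The skip-ahead idea can be sketched with a Horspool-style search (a simplification of Boyer-Moore). This only illustrates the principle and makes no claim about how the .NET Regex engine is implemented; SkipSearch and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class SkipSearch
{
    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int IndexOf(string haystack, string needle)
    {
        int m = needle.Length;
        if (m == 0) return 0;

        // For each character of the needle except the last one, remember how far
        // its right-most occurrence is from the end of the needle.
        var shift = new Dictionary<char, int>();
        for (int i = 0; i < m - 1; i++)
        {
            shift[needle[i]] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= haystack.Length - m)
        {
            // Compare the current window from its last character backwards.
            int j = m - 1;
            while (j >= 0 && haystack[pos + j] == needle[j]) j--;
            if (j < 0) return pos;

            // Shift by the table entry for the character under the window's last
            // position, or by the whole needle length if it never occurs in the needle.
            int s;
            pos += shift.TryGetValue(haystack[pos + m - 1], out s) ? s : m;
        }
        return -1;
    }
}
```

For "foofoo", a haystack character that is neither 'f' nor 'o' lets the search jump six positions at once, which is the best case described above.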
Also note that regexes can become more complex with 'or's, 'and's, etc. Doing forward scans no longer makes a lot of sense if we only need to look at the trailing characters.
This is why regexes usually work with NFAs that are compiled to DFAs. There's a great online tool here: http://hackingoff.com/compilers/regular-expression-to-nfa-dfa that shows what this looks like (for simple regexes).
Internally, you can ask .NET to compile a Regex using Reflection.Emit (RegexOptions.Compiled), and when you use such a regex, you actually evaluate this optimized, compiled state machine.
What you will probably end up with is something that works like this:
bool Matches(string haystack)
{
char _1;
int index = 0;
// match (.)
state0:
if (index >= haystack.Length)
{
goto stateFail;
}
_1 = haystack[index];
++index;
goto state1;
// match \1{1}
state1:
if (index >= haystack.Length)
{
goto stateFail;
}
if (_1 == haystack[index])
{
++index;
goto state2;
}
goto stateFail;
// match \1{2,*}$ -- usually optimized away because it always succeeds
state2:
if (index >= haystack.Length)
{
goto stateSuccess;
}
if (_1 == haystack[index])
{
++index;
goto state2;
}
goto stateSuccess;
stateSuccess:
return true;
stateFail:
return false;
}
So what's faster?
Well, that depends. There's overhead in determining the NFA/DFA from the expression, compiling the program and for each call looking up the program and evaluating it. For very simple cases, an EndsWith beats the Regex. In this case it's the 'OR's in the EndsWith that make it slower than the Regex.
On the other hand, a Regex is usually something you use multiple times, which means that you only have to compile it once, and simply look it up for each call.
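The "compile once, use many times" point can be sketched by holding the Regex in a static readonly field instead of building it on every call. DoublesRegex and its members are illustrative names; the pattern is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class DoublesRegex
{
    // Constructed once; RegexOptions.Compiled trades a higher start-up cost
    // for faster matching on every later call.
    static readonly Regex TrailingRepeat =
        new Regex(@"(.)\1{1,}$", RegexOptions.Compiled);

    public static string Roll(int roll)
    {
        return TrailingRepeat.IsMatch(roll.ToString()) ? "doubles" : "none";
    }
}
```

Whether the compilation cost pays off depends on how many times the regex is evaluated; for a handful of calls the plain constructor without RegexOptions.Compiled may well be the cheaper choice.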

Since the subject has by now moved to micro-optimizations of the Random method, I'll concentrate on the LargeLookup implementations.
First off, the RollLargeLookupParameterlessRandom solution has another issue in addition to the bias. All other implementations check random numbers in the range [1, 99999] inclusive, i.e. 99999 numbers in total, while % 100000 generates the range [0, 99999] inclusive, i.e. 100000 numbers in total.
So let's correct that and at the same time optimize the RollLargeLookup implementation a bit by removing the add operation:
private string[] LargeLookup;
private void InitLargeLookup()
{
LargeLookup = new string[99999];
for (int i = 0; i < LargeLookup.Length; i++)
{
LargeLookup[i] = (i + 1) % 100 % 11 == 0 ? "doubles" : "none";
}
}
public string RollLargeLookup()
{
return LargeLookup[rnd.Next(99999)];
}
public string RollLargeLookupParameterlessRandom()
{
return LargeLookup[rnd.Next() % 99999];
}
Now, can we further optimize the RollLargeLookupParameterlessRandom implementation and at the same time remove the aforementioned bias issue and make it compatible with the other implementations? It turns out that we can. To do that, we again need to know the Random.Next(maxValue) implementation, which is something like this:
return (int)((Next() * (1.0 / int.MaxValue)) * maxValue);
Note that 1.0 / int.MaxValue is a constant evaluated at compile time. The idea is to replace 1.0 with maxValue (also a constant, 99999 in our case), thus eliminating one multiplication. So the resulting function is:
public string RollLargeLookupOptimizedRandom()
{
return LargeLookup[(int)(rnd.Next() * (99999.0 / int.MaxValue))];
}
Interestingly, this not only fixes the RollLargeLookupParameterlessRandom issues but also is a little bit faster.
Actually, this optimization can be applied to any of the other solutions, so the fastest non-lookup implementation would be:
public string FastestOptimizedRandomModded()
{
return ((int)(rnd.Next() * (99999.0 / int.MaxValue)) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}
But before showing the performance tests, let's prove that the result is compatible with the Random.Next(maxValue) implementation:
for (int n = 0; n < int.MaxValue; n++)
{
var n1 = (int)((n * (1.0 / int.MaxValue)) * 99999);
var n2 = (int)(n * (99999.0 / int.MaxValue));
Debug.Assert(n1 == n2);
}
Finally, my benchmarks:
64-bit OS, Release build, Prefer 32-bit = True
Large Lookup Optimized Random: 149 ms
Large Lookup Parameterless Random: 159 ms
Large Lookup: 179 ms
Lookup Optimized Modded: 231 ms
Fastest Optimized Random Modded: 219 ms
Fastest Optimized Modded: 251 ms
Optimized Modded Const: 412 ms
Optimized Modded: 416 ms
Modded: 419 ms
Simple: 1343 ms
Another simple with HashSet: 1805 ms
Another Simple: 2690 ms
Option (Compiled) Regex: 8538 ms
Regex: 14861 ms
EndsWith: 39117 ms
64-bit OS, Release build, Prefer 32-bit = False
Large Lookup Optimized Random: 121 ms
Large Lookup Parameterless Random: 126 ms
Large Lookup: 156 ms
Lookup Optimized Modded: 168 ms
Fastest Optimized Random Modded: 154 ms
Fastest Optimized Modded: 186 ms
Optimized Modded Const: 178 ms
Optimized Modded: 180 ms
Modded: 202 ms
Simple: 795 ms
Another simple with HashSet: 1287 ms
Another Simple: 2178 ms
Option (Compiled) Regex: 7246 ms
Regex: 17090 ms
EndsWith: 36554 ms

A bit more performance can be squeezed out if you pregenerate the whole lookup table for all possible values. This avoids the two modulo divisions in the fastest method and so will be a bit faster:
private string[] LargeLookup;
private void Init() {
LargeLookup = new string[100000];
for (int i = 0; i < 100000; i++) {
LargeLookup[i] = i%100%11 == 0 ? "doubles" : "none";
}
}
And the method itself is then just:
public string RollLargeLookup() {
return LargeLookup[rnd.Next(99999) + 1];
}
While this looks somewhat contrived, such methods are often used. For example, the fastest known poker hand evaluator pregenerates a huge array with hundreds of thousands of entries (with very clever tricks) and then just makes several simple lookups in this array to evaluate the strength of one poker hand over another in no time.
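The same pregeneration idea extends to the triples, quads and quints mentioned in the question: rather than storing "doubles" or "none", the table can store how many identical digits each value ends with. This is a sketch under that assumption; RunLengthTable and its members are hypothetical names.

```csharp
using System;

static class RunLengthTable
{
    // Table[i] = number of identical digits that i ends with, for i in [0, 99999].
    static readonly byte[] Table = Build();

    static byte[] Build()
    {
        var table = new byte[100000];
        for (int i = 0; i < table.Length; i++)
        {
            int value = i, last = i % 10, run = 0;
            // Strip trailing digits while they equal the last digit.
            do { run++; value /= 10; } while (value > 0 && value % 10 == last);
            table[i] = (byte)run;
        }
        return table;
    }

    // One array read per roll; the roll must be in [0, 99999].
    public static int TrailingRun(int roll)
    {
        return Table[roll];
    }
}
```

A byte array keeps the table at roughly 100 KB, and a run length of 2 or more corresponds to what the string table above reports as "doubles".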
You can make it faster still by using alternative random number generators. For example, if you replace System.Random with this FastRandom class implementation (based on the xorshift algorithm), it will be twice as fast.
If you implement both the large lookup table and FastRandom, on my computer it takes 100 ms vs 220 ms for RollLookupOptimizedModded.
Here is the source code of the FastRandom class mentioned in my link above:
public class FastRandom
{
// The +1 ensures NextDouble doesn't generate 1.0
const double REAL_UNIT_INT = 1.0 / ((double)int.MaxValue + 1.0);
const double REAL_UNIT_UINT = 1.0 / ((double)uint.MaxValue + 1.0);
const uint Y = 842502087, Z = 3579807591, W = 273326509;
uint x, y, z, w;
#region Constructors
/// <summary>
/// Initialises a new instance using time dependent seed.
/// </summary>
public FastRandom()
{
// Initialise using the system tick count.
Reinitialise((int)Environment.TickCount);
}
/// <summary>
/// Initialises a new instance using an int value as seed.
/// This constructor signature is provided to maintain compatibility with
/// System.Random
/// </summary>
public FastRandom(int seed)
{
Reinitialise(seed);
}
#endregion
#region Public Methods [Reinitialisation]
/// <summary>
/// Reinitialises using an int value as a seed.
/// </summary>
/// <param name="seed"></param>
public void Reinitialise(int seed)
{
// The only stipulation stated for the xorshift RNG is that at least one of
// the seeds x,y,z,w is non-zero. We fulfill that requirement by only allowing
// resetting of the x seed
x = (uint)seed;
y = Y;
z = Z;
w = W;
}
#endregion
#region Public Methods [System.Random functionally equivalent methods]
/// <summary>
/// Generates a random int over the range 0 to int.MaxValue-1.
/// MaxValue is not generated in order to remain functionally equivalent to System.Random.Next().
/// This does slightly eat into some of the performance gain over System.Random, but not much.
/// For better performance see:
///
/// Call NextInt() for an int over the range 0 to int.MaxValue.
///
/// Call NextUInt() and cast the result to an int to generate an int over the full Int32 value range
/// including negative values.
/// </summary>
/// <returns></returns>
public int Next()
{
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
// Handle the special case where the value int.MaxValue is generated. This is outside of
// the range of permitted values, so we therefore call Next() to try again.
uint rtn = w & 0x7FFFFFFF;
if (rtn == 0x7FFFFFFF)
return Next();
return (int)rtn;
}
/// <summary>
/// Generates a random int over the range 0 to upperBound-1, and not including upperBound.
/// </summary>
/// <param name="upperBound"></param>
/// <returns></returns>
public int Next(int upperBound)
{
if (upperBound < 0)
throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=0");
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
// The explicit int cast before the first multiplication gives better performance.
// See comments in NextDouble.
return (int)((REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * upperBound);
}
/// <summary>
/// Generates a random int over the range lowerBound to upperBound-1, and not including upperBound.
/// upperBound must be >= lowerBound. lowerBound may be negative.
/// </summary>
/// <param name="lowerBound"></param>
/// <param name="upperBound"></param>
/// <returns></returns>
public int Next(int lowerBound, int upperBound)
{
if (lowerBound > upperBound)
throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=lowerBound");
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
// The explicit int cast before the first multiplication gives better performance.
// See comments in NextDouble.
int range = upperBound - lowerBound;
if (range < 0)
{ // If range is <0 then an overflow has occurred and we must resort to using long integer arithmetic instead (slower).
// We also must use all 32 bits of precision, instead of the normal 31, which again is slower.
return lowerBound + (int)((REAL_UNIT_UINT * (double)(w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))) * (double)((long)upperBound - (long)lowerBound));
}
// 31 bits of precision will suffice if range<=int.MaxValue. This allows us to cast to an int and gain
// a little more performance.
return lowerBound + (int)((REAL_UNIT_INT * (double)(int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * (double)range);
}
/// <summary>
/// Generates a random double. Values returned are from 0.0 up to but not including 1.0.
/// </summary>
/// <returns></returns>
public double NextDouble()
{
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
// Here we can gain a 2x speed improvement by generating a value that can be cast to
// an int instead of the more easily available uint. If we then explicitly cast to an
// int the compiler will then cast the int to a double to perform the multiplication,
// this final cast is a lot faster than casting from a uint to a double. The extra cast
// to an int is very fast (the allocated bits remain the same) and so the overall effect
// of the extra cast is a significant performance improvement.
//
// Also note that the loss of one bit of precision is equivalent to what occurs within
// System.Random.
return (REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))));
}
/// <summary>
/// Fills the provided byte array with random bytes.
/// This method is functionally equivalent to System.Random.NextBytes().
/// </summary>
/// <param name="buffer"></param>
public void NextBytes(byte[] buffer)
{
// Fill up the bulk of the buffer in chunks of 4 bytes at a time.
uint x = this.x, y = this.y, z = this.z, w = this.w;
int i = 0;
uint t;
for (int bound = buffer.Length - 3; i < bound;)
{
// Generate 4 bytes.
// Increased performance is achieved by generating 4 random bytes per loop.
// Also note that no mask needs to be applied to zero out the higher order bytes before
// casting because the cast ignores those bytes. Thanks to Stefan Troschütz for pointing this out.
t = (x ^ (x << 11));
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
buffer[i++] = (byte)w;
buffer[i++] = (byte)(w >> 8);
buffer[i++] = (byte)(w >> 16);
buffer[i++] = (byte)(w >> 24);
}
// Fill up any remaining bytes in the buffer.
if (i < buffer.Length)
{
// Generate 4 bytes.
t = (x ^ (x << 11));
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
buffer[i++] = (byte)w;
if (i < buffer.Length)
{
buffer[i++] = (byte)(w >> 8);
if (i < buffer.Length)
{
buffer[i++] = (byte)(w >> 16);
if (i < buffer.Length)
{
buffer[i] = (byte)(w >> 24);
}
}
}
}
this.x = x; this.y = y; this.z = z; this.w = w;
}
// /// <summary>
// /// A version of NextBytes that uses a pointer to set 4 bytes of the byte buffer in one operation
// /// thus providing a nice speedup. The loop is also partially unrolled to allow out-of-order-execution,
// /// this results in about a x2 speedup on an AMD Athlon. Thus performance may vary wildly on different CPUs
// /// depending on the number of execution units available.
// ///
// /// Another significant speedup is obtained by setting the 4 bytes by indexing pDWord (e.g. pDWord[i++]=w)
// /// instead of adjusting it dereferencing it (e.g. *pDWord++=w).
// ///
// /// Note that this routine requires the unsafe compilation flag to be specified and so is commented out by default.
// /// </summary>
// /// <param name="buffer"></param>
// public unsafe void NextBytesUnsafe(byte[] buffer)
// {
// if(buffer.Length % 8 != 0)
// throw new ArgumentException("Buffer length must be divisible by 8", "buffer");
//
// uint x=this.x, y=this.y, z=this.z, w=this.w;
//
// fixed(byte* pByte0 = buffer)
// {
// uint* pDWord = (uint*)pByte0;
// for(int i=0, len=buffer.Length>>2; i < len; i+=2)
// {
// uint t=(x^(x<<11));
// x=y; y=z; z=w;
// pDWord[i] = w = (w^(w>>19))^(t^(t>>8));
//
// t=(x^(x<<11));
// x=y; y=z; z=w;
// pDWord[i+1] = w = (w^(w>>19))^(t^(t>>8));
// }
// }
//
// this.x=x; this.y=y; this.z=z; this.w=w;
// }
#endregion
#region Public Methods [Methods not present on System.Random]
/// <summary>
/// Generates a uint. Values returned are over the full range of a uint,
/// uint.MinValue to uint.MaxValue, inclusive.
///
/// This is the fastest method for generating a single random number because the underlying
/// random number generator algorithm generates 32 random bits that can be cast directly to
/// a uint.
/// </summary>
/// <returns></returns>
public uint NextUInt()
{
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
return (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)));
}
/// <summary>
/// Generates a random int over the range 0 to int.MaxValue, inclusive.
/// This method differs from Next() only in that the range is 0 to int.MaxValue
/// and not 0 to int.MaxValue-1.
///
/// The slight difference in range means this method is slightly faster than Next()
/// but is not functionally equivalent to System.Random.Next().
/// </summary>
/// <returns></returns>
public int NextInt()
{
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
return (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))));
}
// Buffer 32 bits in bitBuffer, return 1 at a time, keep track of which bit to return next
// with bitMask.
uint bitBuffer;
uint bitMask = 1;
/// <summary>
/// Generates a single random bit.
/// This method's performance is improved by generating 32 bits in one operation and storing them
/// ready for future calls.
/// </summary>
/// <returns></returns>
public bool NextBool()
{
if (bitMask == 1)
{
// Generate 32 more bits.
uint t = (x ^ (x << 11));
x = y; y = z; z = w;
bitBuffer = w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
// Reset the bitMask that tells us which bit to read next.
bitMask = 0x80000000;
return (bitBuffer & bitMask) == 0;
}
return (bitBuffer & (bitMask >>= 1)) == 0;
}
#endregion
}
Then you need to initialize it together with your Random:
Random rnd = new Random(10000);
FastRandom fastRnd = new FastRandom(10000);
And the method becomes:
public string RollLargeLookup() {
return LargeLookup[fastRnd.Next(99999) + 1];
}

As several others have already pointed out, string comparisons for numbers are not efficient.
public static String RollNumbers()
{
int roll = rnd.Next(1, 100000);
int lastDigit = roll % 10;
int secondLastDigit = (roll / 10) % 10;
if( lastDigit == secondLastDigit )
{
return "doubles";
}
else
{
return "none";
}
}
That runs in 50ms on my machine vs. the 1200ms of the original approach. Most of the time is spent allocating many small temporary objects. If you can, get rid of strings in the first place. If this is your hot code path, it can help to convert your data structures into something that is more expensive to create but very cheap to query. The lookup tables shown here are a good start.
If you look closely at the LargeLookup implementation, you will find that most of its good performance comes from a cheat: it does not use a string as the key but uses the initial random number, with some calculations, as the index.
If you try my solution, it will most likely run faster, because lookup tables tend to have poor cache locality, which makes memory accesses more expensive.
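Since the question also mentions triples, quads and quints, the same digit arithmetic can be extended without going back to strings. The following is only a sketch: the names RollClassifier, TrailingRepeatCount and Classify are made up for illustration, and it assumes rolls in the range 1 to 99999 as in the question.

```csharp
using System;

static class RollClassifier
{
    // Counts how many times the last decimal digit repeats at the end of n (n >= 0).
    public static int TrailingRepeatCount(int n)
    {
        int last = n % 10;
        int count = 1;
        n /= 10;
        // Keep stripping digits while they equal the last digit.
        while (n > 0 && n % 10 == last)
        {
            count++;
            n /= 10;
        }
        return count;
    }

    // Maps the repeat count to a label; five or more repeats are all reported as "quints".
    public static string Classify(int roll)
    {
        switch (TrailingRepeatCount(roll))
        {
            case 1: return "none";
            case 2: return "doubles";
            case 3: return "triples";
            case 4: return "quads";
            default: return "quints";
        }
    }
}
```

No temporary objects are allocated here, which is the point made above about avoiding strings on a hot path.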

The fastest that I was able to achieve is optimizing the use of Random with the large lookup method:
return LargeLookup[rnd.Next() % 100000];
And it runs 20% faster than the original since it avoids a division (look at the Next() code vs. Next(int maxValue)).
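One thing to keep in mind with the modulo shortcut is that the results are not perfectly uniform, and that index 0 becomes reachable (the original code rolled 1 to 99999). The sketch below only does the counting; the class and method names are hypothetical, and it relies on the documented range of Random.Next(), which is 0 up to but not including int.MaxValue.

```csharp
using System;

static class ModuloBias
{
    // Random.Next() can return int.MaxValue distinct values: 0 .. int.MaxValue - 1.
    // After "% modulus", every residue is hit at least baseHits times, and the
    // residues 0 .. favoredResidues - 1 are hit one extra time.
    public static void Describe(int modulus, out int baseHits, out int favoredResidues)
    {
        baseHits = int.MaxValue / modulus;
        favoredResidues = int.MaxValue % modulus;
    }
}
```

For a modulus of 100000 this gives 21474 hits per residue, with the 83647 smallest residues getting one more, so small numbers are very slightly favored.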
Looking for real fairness, IMHO, I changed the way the methods were tested a bit.
TL;DR: here's the dashboard:
|-----------------Name---------------|--Min--|--Max--|---Avg---|
|------------------------------------|-------|-------|---------|
|RollLargeLookup | 108| 122| 110,2|
|RollLookupOptimizedModded | 141| 156| 145,5|
|RollOptimizedModdedConst | 156| 159| 156,7|
|RollOptimizedModded | 158| 163| 159,8|
|RollNumbers | 197| 214| 200,9|
|RollSimple | 1 242| 1 304| 1 260,8|
|RollSimpleHashSet | 1 635| 1 774| 1 664,6|
|RollAnotherSimple | 2 544| 2 732| 2 603,2|
|RollOptionRegex | 9 137| 9 605| 9 300,6|
|RollRegex | 17 510| 18 873| 17 959 |
|RollEndsWith | 20 725| 22 001| 21 196,1|
I changed a few points:
Pre-computed the numbers to test so each method was tested with the same set of numbers (taking out the random generation war and the bias I introduced);
Ran each method 10 times in a random order;
Introduced a parameter in each function;
Removed the dupes.
I created a class MethodToTest:
public class MethodToTest
{
public delegate string RollDelegate(int number);
public RollDelegate MethodDelegate { get; set; }
public List<long> timeSpent { get; set; }
public MethodToTest()
{
timeSpent = new List<long>();
}
public string TimeStats()
{
return string.Format("Min: {0}ms, Max: {1}ms, Avg: {2}ms", timeSpent.Min(), timeSpent.Max(),
timeSpent.Average());
}
}
Here's the main content:
private static void Test()
{
List<MethodToTest> methodList = new List<MethodToTest>
{
new MethodToTest{ MethodDelegate = RollNumbers},
new MethodToTest{ MethodDelegate = RollLargeLookup},
new MethodToTest{ MethodDelegate = RollLookupOptimizedModded},
new MethodToTest{ MethodDelegate = RollOptimizedModdedConst},
new MethodToTest{ MethodDelegate = RollOptimizedModded},
new MethodToTest{ MethodDelegate = RollSimple},
new MethodToTest{ MethodDelegate = RollSimpleHashSet},
new MethodToTest{ MethodDelegate = RollAnotherSimple},
new MethodToTest{ MethodDelegate = RollOptionRegex},
new MethodToTest{ MethodDelegate = RollRegex},
new MethodToTest{ MethodDelegate = RollEndsWith},
};
InitLargeLookup();
Stopwatch s = new Stopwatch();
Random rnd = new Random();
List<int> Randoms = new List<int>();
const int trial = 10000000;
const int numberOfLoop = 10;
for (int j = 0; j < numberOfLoop; j++)
{
Console.Out.WriteLine("Loop: " + j);
Randoms.Clear();
for (int i = 0; i < trial; ++i)
Randoms.Add(rnd.Next(1, 100000));
// Shuffle order
foreach (MethodToTest method in methodList.OrderBy(m => rnd.Next()))
{
s.Restart();
for (int i = 0; i < trial; ++i)
method.MethodDelegate(Randoms[i]);
method.timeSpent.Add(s.ElapsedMilliseconds);
Console.Out.WriteLine("\tMethod: " +method.MethodDelegate.Method.Name);
}
}
File.WriteAllLines(@"C:\Users\me\Desktop\out.txt", methodList.OrderBy(m => m.timeSpent.Average()).Select(method => method.MethodDelegate.Method.Name + ": " + method.TimeStats()));
}
And here are the functions:
//OP's Solution 2
public static String RollRegex(int number)
{
return Regex.IsMatch(number.ToString(), @"(.)\1{1,}$") ? "doubles" : "none";
}
//Radin Gospodinov's Solution
static readonly Regex OptionRegex = new Regex(@"(.)\1{1,}$", RegexOptions.Compiled);
public static String RollOptionRegex(int number)
{
return OptionRegex.IsMatch(number.ToString()) ? "doubles" : "none";
}
//OP's Solution 1
public static String RollEndsWith(int number)
{
if (number.ToString().EndsWith("11") || number.ToString().EndsWith("22") || number.ToString().EndsWith("33") ||
number.ToString().EndsWith("44") || number.ToString().EndsWith("55") || number.ToString().EndsWith("66") ||
number.ToString().EndsWith("77") || number.ToString().EndsWith("88") || number.ToString().EndsWith("99") ||
number.ToString().EndsWith("00"))
{
return "doubles";
}
return "none";
}
//Ian's Solution
public static String RollSimple(int number)
{
string rollString = number.ToString();
return number > 10 && rollString[rollString.Length - 1] == rollString[rollString.Length - 2] ?
"doubles" : "none";
}
//Ian's Other Solution
static List<string> doubles = new List<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public static String RollAnotherSimple(int number)
{
string rollString = number.ToString();
return rollString.Length > 1 && doubles.Contains(rollString.Substring(rollString.Length - 2)) ?
"doubles" : "none";
}
//Dandré's Solution
static HashSet<string> doublesHashset = new HashSet<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public static String RollSimpleHashSet(int number)
{
string rollString = number.ToString();
return rollString.Length > 1 && doublesHashset.Contains(rollString.Substring(rollString.Length - 2)) ?
"doubles" : "none";
}
//Stian Standahl optimizes modded solution
public static string RollOptimizedModded(int number) { return number % 100 % 11 == 0 ? "doubles" : "none"; }
//Gjermund Grøneng's method with constant addition
private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public static string RollOptimizedModdedConst(int number) { return number % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }
//Corak's Solution, added on Gjermund Grøneng's
private static readonly string[] Lookup = { "doubles", "none", "none", "none", "none", "none", "none", "none", "none", "none", "none" };
public static string RollLookupOptimizedModded(int number) { return Lookup[number % 100 % 11]; }
//Evk's Solution, large Lookup
private static string[] LargeLookup;
private static void InitLargeLookup()
{
LargeLookup = new string[100000];
for (int i = 0; i < 100000; i++)
{
LargeLookup[i] = i % 100 % 11 == 0 ? "doubles" : "none";
}
}
public static string RollLargeLookup(int number) { return LargeLookup[number]; }
//Alois Kraus's Solution
public static string RollNumbers(int number)
{
int lastDigit = number % 10;
int secondLastDigit = (number / 10) % 10;
return lastDigit == secondLastDigit ? "doubles" : "none";
}
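Before trusting any timing, it is worth checking that the arithmetic variants really agree with the string-based ones. This is a small sketch, not part of the benchmark above; the class DoublesCheck and its method names are made up for illustration, and it only covers rolls from 1 to 99999.

```csharp
using System;

static class DoublesCheck
{
    // The "% 100 % 11" trick: the last two digits form 00, 11, 22, ..., 99.
    public static bool ByModulo(int n) => n % 100 % 11 == 0;

    // Compare the last digit with the second-to-last digit.
    public static bool ByDigits(int n) => n % 10 == n / 10 % 10;

    // String version: the last two characters are equal.
    public static bool ByString(int n)
    {
        string s = n.ToString();
        return s.Length > 1 && s[s.Length - 1] == s[s.Length - 2];
    }

    // Returns the first roll in [1, 99999] on which the three tests disagree, or -1 if there is none.
    public static int FirstDisagreement()
    {
        for (int n = 1; n < 100000; n++)
        {
            bool m = ByModulo(n);
            if (m != ByDigits(n) || m != ByString(n)) return n;
        }
        return -1;
    }
}
```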

Related

Comparing bits efficiently ( overlap set of x )

I want to compare a stream of bits of arbitrary length to a mask in c# and return a ratio of how many bits were the same.
The mask to check against is anywhere between 2 bits long to 8k (with 90% of the masks being 5 bits long), the input can be anywhere between 2 bits up to ~ 500k, with an average input string of 12k (but yeah, most of the time it will be comparing 5 bits with the first 5 bits of that 12k)
Now my naive implementation would be something like this:
bool[] mask = new[] { true, true, false, true };
float dendrite(bool[] input) {
int correct = 0;
for ( int i = 0; i<mask.Length; i++ ) {
if ( input[i] == mask[i] )
correct++;
}
return (float)correct/(float)mask.Length;
}
but I expect this is better handled (more efficient) with some kind of binary operator magic?
Anyone got any pointers?
EDIT: the datatype is not fixed at this point in my design, so if ints or bytearrays work better, I'd also be a happy camper, trying to optimize for efficiency here, the faster the computation, the better.
eg if you can make it work like this:
int[] mask = new[] { 1, 1, 0, 1 };
float dendrite(int[] input) {
int correct = 0;
for ( int i = 0; i<mask.Length; i++ ) {
if ( input[i] == mask[i] )
correct++;
}
return (float)correct/(float)mask.Length;
}
or this:
int mask = 13; //1101
float dendrite(int input) {
return // your magic here;
} // would return 0.75 for an input
// of 101 given ( 1100101 in binary,
// matches 3 bits of the 4 bit mask == .75
ANSWER:
I ran the proposed answers against each other. Fredou's and Marten's solutions ran neck and neck, but Fredou submitted the fastest, leanest implementation in the end. Of course, since the average result varies quite wildly between implementations, I might have to revisit this post later on. :) But that's probably just me messing up my test script. (I hope; too late now, going to bed =)
sparse1.Cyclone
1317ms 3467107ticks 10000iterations
result: 0,7851563
sparse1.Marten
288ms 759362ticks 10000iterations
result: 0,05066964
sparse1.Fredou
216ms 568747ticks 10000iterations
result: 0,8925781
sparse1.Marten
296ms 778862ticks 10000iterations
result: 0,05066964
sparse1.Fredou
216ms 568601ticks 10000iterations
result: 0,8925781
sparse1.Marten
300ms 789901ticks 10000iterations
result: 0,05066964
sparse1.Cyclone
1314ms 3457988ticks 10000iterations
result: 0,7851563
sparse1.Fredou
207ms 546606ticks 10000iterations
result: 0,8925781
sparse1.Marten
298ms 786352ticks 10000iterations
result: 0,05066964
sparse1.Cyclone
1301ms 3422611ticks 10000iterations
result: 0,7851563
sparse1.Marten
292ms 769850ticks 10000iterations
result: 0,05066964
sparse1.Cyclone
1305ms 3433320ticks 10000iterations
result: 0,7851563
sparse1.Fredou
209ms 551178ticks 10000iterations
result: 0,8925781
( testscript copied here, if i destroyed yours modifying it lemme know. https://dotnetfiddle.net/h9nFSa )
How about this one? (dotnetfiddle example)
using System;
namespace ConsoleApplication1
{
public class Program
{
public static void Main(string[] args)
{
int a = Convert.ToInt32("0001101", 2);
int b = Convert.ToInt32("1100101", 2);
Console.WriteLine(dendrite(a, 4, b));
}
private static float dendrite(int mask, int len, int input)
{
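// Keep only the lowest len bits of input. uint.MaxValue is used because int.MaxValue
// has just 31 set bits, so shifting it by 32 - len would keep only len - 1 bits.
return 1 - getBitCount(mask ^ (input & (int)(uint.MaxValue >> (32 - len)))) / (float)len;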
}
private static int getBitCount(int bits)
{
bits = bits - ((bits >> 1) & 0x55555555);
bits = (bits & 0x33333333) + ((bits >> 2) & 0x33333333);
return ((bits + (bits >> 4) & 0xf0f0f0f) * 0x1010101) >> 24;
}
}
}
A 64-bit one here (dotnetfiddle):
using System;
namespace ConsoleApplication1
{
public class Program
{
public static void Main(string[] args)
{
// 1
ulong a = Convert.ToUInt64("0000000000000000000000000000000000000000000000000000000000001101", 2);
ulong b = Convert.ToUInt64("1110010101100101011001010110110101100101011001010110010101100101", 2);
Console.WriteLine(dendrite(a, 4, b));
}
private static float dendrite(ulong mask, int len, ulong input)
{
return 1 - getBitCount(mask ^ (input & (ulong.MaxValue >> (64 - len)))) / (float)len;
}
private static ulong getBitCount(ulong bits)
{
bits = bits - ((bits >> 1) & 0x5555555555555555UL);
bits = (bits & 0x3333333333333333UL) + ((bits >> 2) & 0x3333333333333333UL);
return unchecked(((bits + (bits >> 4)) & 0xF0F0F0F0F0F0F0FUL) * 0x101010101010101UL) >> 56;
}
}
}
I came up with this code:
static float dendrite(ulong input, ulong mask)
{
// get bits that are same (0 or 1) in input and mask
ulong samebits = mask & ~(input ^ mask);
// count number of same bits
int correct = cardinality(samebits);
// count number of bits in mask
int inmask = cardinality(mask);
// compute fraction (0.0 to 1.0)
return inmask == 0 ? 0f : correct / (float)inmask;
}
// this is a little hack to count the number of bits set to one in a 64-bit word
static int cardinality(ulong word)
{
const ulong mult = 0x0101010101010101;
const ulong mask1h = (~0UL) / 3 << 1;
const ulong mask2l = (~0UL) / 5;
const ulong mask4l = (~0UL) / 17;
word -= (mask1h & word) >> 1;
word = (word & mask2l) + ((word >> 2) & mask2l);
word += word >> 4;
word &= mask4l;
return (int)((word * mult) >> 56);
}
This will check 64-bits at a time. If you need more than that you can just split the input data into 64-bit words and compare them one by one and compute the average result.
Here's a .NET fiddle with the code and a working test case:
https://dotnetfiddle.net/5hYFtE
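The idea of splitting longer inputs into 64-bit words can be sketched as follows. This is an illustration under stated assumptions, not code from any of the answers: the name MatchRatio is made up, bits are assumed to be stored least-significant-bit first within each word, and BitOperations.PopCount requires .NET Core 3.0 or later. Unlike the cardinality-based version above, it counts every position where input and mask agree (zeros included), which is what the loop in the question does.

```csharp
using System;
using System.Numerics;

static class BitMatch
{
    // Fraction of the first bitLength bits on which input and mask agree.
    // Both arrays must hold at least bitLength bits.
    public static float MatchRatio(ulong[] input, ulong[] mask, int bitLength)
    {
        if (bitLength <= 0) return 0f;
        int fullWords = bitLength / 64;
        int tailBits = bitLength % 64;
        int mismatches = 0;
        // XOR leaves a 1 wherever the two words differ; count those bits.
        for (int i = 0; i < fullWords; i++)
            mismatches += BitOperations.PopCount(input[i] ^ mask[i]);
        if (tailBits > 0)
        {
            // Only the lowest tailBits bits of the last word take part in the comparison.
            ulong tailMask = ulong.MaxValue >> (64 - tailBits);
            mismatches += BitOperations.PopCount((input[fullWords] ^ mask[fullWords]) & tailMask);
        }
        return 1f - mismatches / (float)bitLength;
    }
}
```

With the example from the question (mask 1101, input 1100101, 4 bits compared) this yields 0.75.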
I would change the code to something along these lines:
// hardcoded bitmask
byte mask = 255;
float dendrite(byte input) {
int correct = 0;
// store the XORed result (byte operands are promoted to int in C#)
int xored = input ^ mask;
// loop through each bit
for(int i = 0; i < 8; i++) {
// if the bit is 0 then it was correct
if((xored & (1 << i)) == 0)
correct++;
}
// the mask is 8 bits wide
return correct / 8f;
}
The above uses a mask and input of 8 bits, but of course you could modify this to use a 4 byte integer and so on.
Not sure if this will work as expected, but it might give you some clues on how to proceed.
For example if you only would like to check the first 4 bits you could change the code to something like:
float dendrite(byte input) {
// hardcoded bitmask i.e 1101
byte mask = 13;
// number of bits to check
byte bits = 4;
int correct = 0;
// store the XORed result (byte operands are promoted to int in C#)
int xored = input ^ mask;
// loop through each bit, notice that we are only checking the first 4 bits
for(int i = 0; i < bits; i++) {
// if the bit is 0 then it was correct
if((xored & (1 << i)) == 0)
correct++;
}
return (float)correct/(float)bits;
}
Of course it might be faster to actually use an int instead of a byte.

Optimize the rearranging of bits

I have a core C# function that I am trying to speed up. Suggestions involving safe or unsafe code are equally welcome. Here is the method:
public byte[] Interleave(uint[] vector)
{
var byteVector = new byte[BytesNeeded + 1]; // Extra byte needed when creating a BigInteger, for sign bit.
foreach (var idx in PrecomputedIndices)
{
var bit = (byte)(((vector[idx.iFromUintVector] >> idx.iFromUintBit) & 1U) << idx.iToByteBit);
byteVector[idx.iToByteVector] |= bit;
}
return byteVector;
}
PrecomputedIndices is an array of the following class:
class Indices
{
public readonly int iFromUintVector;
public readonly int iFromUintBit;
public readonly int iToByteVector;
public readonly int iToByteBit;
public Indices(int fromUintVector, int fromUintBit, int toByteVector, int toByteBit)
{
iFromUintVector = fromUintVector;
iFromUintBit = fromUintBit;
iToByteVector = toByteVector;
iToByteBit = toByteBit;
}
}
The purpose of the Interleave method is to copy bits from an array of uints to an array of bytes. I have pre-computed the source and target array index and the source and target bit number and stored them in the Indices objects. No two adjacent bits in the source will be adjacent in the target, so that rules out certain optimizations.
To give you an idea of scale, the problem I am working on has about 4,200 dimensions, so "vector" has 4,200 elements. The values in vector range from zero to twelve, so I only need to use four bits to store their values in the byte array, thus I need 4,200 x 4 = 16,800 bits of data, or 2,100 bytes of output per vector. This method will be called millions of times. It consumes approximately a third of the time in the larger procedure I need to optimize.
UPDATE 1: Changing "Indices" to a struct and shrinking a few of the datatypes so that the object was just eight bytes (an int, a short, and two bytes) reduced the percentage of execution time from 35% to 30%.
These are the crucial parts of my revised implementation, with ideas drawn from the commenters:
Convert object to struct, shrink data types to smaller ints, and rearrange so that the object should fit into a 64-bit value, which is better for a 64-bit machine:
struct Indices
{
/// <summary>
/// Index into source vector of source uint to read.
/// </summary>
public readonly int iFromUintVector;
/// <summary>
/// Index into target vector of target byte to write.
/// </summary>
public readonly short iToByteVector;
/// <summary>
/// Index into source uint of source bit to read.
/// </summary>
public readonly byte iFromUintBit;
/// <summary>
/// Index into target byte of target bit to write.
/// </summary>
public readonly byte iToByteBit;
public Indices(int fromUintVector, byte fromUintBit, short toByteVector, byte toByteBit)
{
iFromUintVector = fromUintVector;
iFromUintBit = fromUintBit;
iToByteVector = toByteVector;
iToByteBit = toByteBit;
}
}
Sort the PrecomputedIndices so that I write each target byte and bit in ascending order, which improves memory cache access:
Comparison<Indices> sortByTargetByteAndBit = (a, b) =>
{
if (a.iToByteVector < b.iToByteVector) return -1;
if (a.iToByteVector > b.iToByteVector) return 1;
if (a.iToByteBit < b.iToByteBit) return -1;
if (a.iToByteBit > b.iToByteBit) return 1;
return 0;
};
Array.Sort(PrecomputedIndices, sortByTargetByteAndBit);
Unroll the loop so that a whole target byte is assembled at once, reducing the number of times I access the target array:
public byte[] Interleave(uint[] vector)
{
var byteVector = new byte[BytesNeeded + 1]; // An extra byte is needed to hold the extra bits and a sign bit for the BigInteger.
var extraBits = Bits - (BytesNeeded << 3);
int iIndex = 0;
var iByte = 0;
for (; iByte < BytesNeeded; iByte++)
{
// Unroll the loop so we compute the bits for a whole byte at a time.
uint bits = 0;
var idx0 = PrecomputedIndices[iIndex];
var idx1 = PrecomputedIndices[iIndex + 1];
var idx2 = PrecomputedIndices[iIndex + 2];
var idx3 = PrecomputedIndices[iIndex + 3];
var idx4 = PrecomputedIndices[iIndex + 4];
var idx5 = PrecomputedIndices[iIndex + 5];
var idx6 = PrecomputedIndices[iIndex + 6];
var idx7 = PrecomputedIndices[iIndex + 7];
bits = (((vector[idx0.iFromUintVector] >> idx0.iFromUintBit) & 1U))
| (((vector[idx1.iFromUintVector] >> idx1.iFromUintBit) & 1U) << 1)
| (((vector[idx2.iFromUintVector] >> idx2.iFromUintBit) & 1U) << 2)
| (((vector[idx3.iFromUintVector] >> idx3.iFromUintBit) & 1U) << 3)
| (((vector[idx4.iFromUintVector] >> idx4.iFromUintBit) & 1U) << 4)
| (((vector[idx5.iFromUintVector] >> idx5.iFromUintBit) & 1U) << 5)
| (((vector[idx6.iFromUintVector] >> idx6.iFromUintBit) & 1U) << 6)
| (((vector[idx7.iFromUintVector] >> idx7.iFromUintBit) & 1U) << 7);
byteVector[iByte] = (Byte)bits;
iIndex += 8;
}
for (; iIndex < PrecomputedIndices.Length; iIndex++)
{
var idx = PrecomputedIndices[iIndex];
var bit = (byte)(((vector[idx.iFromUintVector] >> idx.iFromUintBit) & 1U) << idx.iToByteBit);
byteVector[idx.iToByteVector] |= bit;
}
return byteVector;
}
#1 cuts the function from taking up 35% of the execution time to 30% of the execution time (14% savings).
#2 did not speed the function up, but made #3 possible.
#3 cuts the function from 30% of exec time to 19.6%, another 33% in savings.
Total savings: 44%!!!

Rounding value to nearest power of two

I am looking for the fastest way in C# to round a value to the nearest power of two.
I've discovered that the fastest way to round a value to the next power of two is to use bitwise operators like this:
int ToNextNearest(int x)
{
if (x < 0) { return 0; }
--x;
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
return x + 1;
}
But this gives the next nearest and not the nearest and I would like to only have the nearest power of two.
Here is a simple way to do that.
int ToNearest(int x)
{
return (int)Math.Pow(2, Math.Round(Math.Log(x) / Math.Log(2)));
}
But is there a better optimized version of finding the nearest power of two value ?
Thanks a lot.
Surely the best way is to use your bitwise routine to find the next power of two, then divide that result by two. This gives you the previous power of two. Then a simple comparison will tell you which of the two is closer.
int ToNearest(int x)
{
int next = ToNextNearest(x);
int prev = next >> 1;
return next - x < x - prev ? next : prev;
}
Untested code but you get the idea.
I'm using this:
public static int CeilPower2(int x)
{
if (x < 2) {
return 1;
}
return (int) Math.Pow(2, (int) Math.Log(x-1, 2) + 1);
}
public static int FloorPower2(int x)
{
if (x < 1) {
return 1;
}
return (int) Math.Pow(2, (int) Math.Log(x, 2));
}
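On .NET Core, the fastest way to do this would probably be to use the intrinsic operations (note that, despite its name, the method below rounds up to the next power of two rather than to the nearest one):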
private static int NearestPowerOf2(uint x)
{
return 1 << (sizeof(uint) * 8 - BitOperations.LeadingZeroCount(x - 1));
}
On CPUs supporting the LZCNT instruction, it is just 6 CPU instructions, without branching.
.NET 6 introduced a method for rounding up:
using System.Numerics;
var nearestPowOf2 = BitOperations.RoundUpToPowerOf2(100); //returns 128
How about this:
int ToNearest(int val, int pow)
{
if (pow < 0) return 0;
if (pow == 0) return val;
if ((val & (1 << (pow - 1))) != 0) {
return ((val >> pow) + 1) << pow;
} else {
return (val >> pow) << pow;
}
}
Haven't tested, but I think this could work:
int ToNearest(int value)
{
int num = 0;
for (int i = 0; i < 31; i++)
{
// candidate power of two
int cur = 1 << i;
if (Math.Abs(value - cur) < Math.Abs(value - num))
num = cur;
else if (num != 0) break;
}
return num;
}
This is the full implementation of the suggested solution of @john, with the change that it will round up if the value is exactly in between the next and previous power of two.
public static int RoundToNextPowerOfTwo(int a)
{
int next = CeilToNextPowerOfTwo(a);
int prev = next >> 1;
return next - a <= a - prev ? next : prev;
}
public static int CeilToNextPowerOfTwo(int number)
{
int a = number;
int powOfTwo = 1;
while (a > 1)
{
a = a >> 1;
powOfTwo = powOfTwo << 1;
}
if (powOfTwo != number)
{
powOfTwo = powOfTwo << 1;
}
return powOfTwo;
}
Since C# requires IEEE754 floats there is probably a faster way on any platform that does not emulate the floating point functions:
int ToNearestPowerOf2(int x) =>
1 << (int)(BitConverter.DoubleToInt64Bits(x + x/3) >> 52) - 1023;
Rationale:
x + x/3
nearest power of 2, basically *4/3
(BitConverter.DoubleToInt64Bits(x) >> 52) - 1023
take the floating-point exponent, which is floor(log2(x))
1 << x
exponential function with base 2
The function obviously requires a positive value for x.
0 won't work because its base-2 logarithm is -∞,
and negative values have complex logarithms.
Whether this is the fastest way will probably highly depend on what the JIT optimizer squeezes out of the code, more specifically how it handles the hard pointer cast in DoubleToInt64Bits. This may prevent other optimizations.
You do not have to use any comparison to get the nearest power of 2. Since all powers of two are separated by the same factor the rounding point is always at 3/4 of the next power of 2 (i.e. exactly the topmost 2 bits are set). So multiplication by the reciprocal followed by truncation will do the job.
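The 4/3 scaling can also be done in pure integer arithmetic. The sketch below is an assumption-laden illustration rather than the code from the answer: the class and method names are invented, it uses BitOperations.Log2 (available from .NET Core 3.0) instead of the floating-point exponent, and it returns a ulong so that inputs close to uint.MaxValue can still round up to 2^32. Values exactly halfway between two powers of two, such as 6 or 12, round up.

```csharp
using System;
using System.Numerics;

static class PowerOfTwo
{
    // Nearest power of two to x, for x >= 1.
    public static ulong Nearest(uint x)
    {
        if (x == 0) throw new ArgumentOutOfRangeException(nameof(x));
        // Roughly x * 4/3: everything at or above 3/4 of the next power of two
        // is pushed past that power, everything below stays under it.
        ulong scaled = (ulong)x + x / 3;
        // Truncate to the highest set bit, i.e. the largest power of two <= scaled.
        return 1UL << BitOperations.Log2(scaled);
    }
}
```

For example, 11 scales to 14 and truncates to 8, while 12 scales to 16 and stays at 16.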

C# Random(Long)

I'm trying to generate a number based on a seed in C#. The only problem is that the seed is too big to be an int32. Is there a way I can use a long as the seed?
And yes, the seed MUST be a long.
Here's a C# version of java.util.Random that I ported from the Java specification.
The best thing to do is to write a Java program to generate a load of numbers and check that this C# version generates the same numbers.
public sealed class JavaRng
{
public JavaRng(long seed)
{
_seed = (seed ^ LARGE_PRIME) & ((1L << 48) - 1);
}
public int NextInt(int n)
{
if (n <= 0)
throw new ArgumentOutOfRangeException("n", n, "n must be positive");
if ((n & -n) == n) // i.e., n is a power of 2
return (int)((n * (long)next(31)) >> 31);
int bits, val;
do
{
bits = next(31);
val = bits % n;
} while (bits - val + (n-1) < 0);
return val;
}
private int next(int bits)
{
_seed = (_seed*LARGE_PRIME + SMALL_PRIME) & ((1L << 48) - 1);
// Shift the full 48-bit seed first; casting it to uint before shifting would discard the upper 16 bits.
return (int) (_seed >> (48 - bits));
}
private long _seed;
private const long LARGE_PRIME = 0x5DEECE66DL;
private const long SMALL_PRIME = 0xBL;
}
For anyone seeing this question today, .NET 6 and upwards provides Random.NextInt64, which has the following overloads:
NextInt64()
Returns a non-negative random integer.
NextInt64(Int64)
Returns a non-negative random integer that is less than the specified maximum.
NextInt64(Int64, Int64)
Returns a random integer that is within a specified range.
I'd go for the answer provided here by @Dyppl: Random number in long range, is this the way?
Put this function where it's accessible to the code that needs to generate the random number:
long LongRandom(long min, long max, Random rand)
{
byte[] buf = new byte[8];
rand.NextBytes(buf);
long longRand = BitConverter.ToInt64(buf, 0);
return (Math.Abs(longRand % (max - min)) + min);
}
Then call the function like this:
long r = LongRandom(100000000000000000, 100000000000000050, new Random());
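On .NET 6 or later, the NextInt64 overloads mentioned earlier can replace the byte-buffer and modulo steps. This is a minimal sketch assuming that runtime; the wrapper name LongRange.Next is hypothetical and only there to mirror the LongRandom signature. Note that it addresses generating values in a long range, not seeding with a long.

```csharp
using System;

static class LongRange
{
    // Requires .NET 6 or later, where Random.NextInt64(long, long) exists.
    // min is inclusive, max is exclusive.
    public static long Next(Random rand, long min, long max)
    {
        return rand.NextInt64(min, max);
    }
}
```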

Converting a int to a BCD byte array

I want to convert an int to a byte[2] array using BCD.
The int in question will come from DateTime representing the Year and must be converted to two bytes.
Is there any pre-made function that does this or can you give me a simple way of doing this?
example:
int year = 2010
would output:
byte[2]{0x20, 0x10};
static byte[] Year2Bcd(int year) {
if (year < 0 || year > 9999) throw new ArgumentException();
int bcd = 0;
for (int digit = 0; digit < 4; ++digit) {
int nibble = year % 10;
bcd |= nibble << (digit * 4);
year /= 10;
}
return new byte[] { (byte)((bcd >> 8) & 0xff), (byte)(bcd & 0xff) };
}
Beware that you asked for a big-endian result, that's a bit unusual.
Use this method.
public static byte[] ToBcd(int value){
if(value<0 || value>99999999)
throw new ArgumentOutOfRangeException("value");
byte[] ret=new byte[4];
for(int i=0;i<4;i++){
ret[i]=(byte)(value%10);
value/=10;
ret[i]|=(byte)((value%10)<<4);
value/=10;
}
return ret;
}
This is essentially how it works.
If the value is less than 0 or greater than 99999999, the value won't fit in four bytes. More formally, if the value is less than 0 or is 10^(n*2) or greater, where n is the number of bytes, the value won't fit in n bytes.
For each byte:
Set that byte to the remainder of the value divided by 10. (This will place the last digit in the low nibble [half-byte] of the current byte.)
Divide the value by 10.
Add 16 times the remainder of the value-divided-by-10 to the byte. (This will place the now-last digit in the high nibble of the current byte.)
Divide the value by 10.
(One optimization is to set every byte to 0 beforehand -- which is implicitly done by .NET when it allocates a new array -- and to stop iterating when the value reaches 0. This latter optimization is not done in the code above, for simplicity. Also, if available, some compilers or assemblers offer a divide/remainder routine that allows retrieving the quotient and remainder in one division step, an optimization which is not usually necessary though.)
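The two optimizations mentioned above could be sketched as follows. This is an illustrative variant, not the answer's code: `BcdSketch` and `ToBcdEarlyExit` are hypothetical names, and `Math.DivRem` stands in for the combined divide/remainder routine. The output keeps the same little-endian layout as `ToBcd`.

```csharp
using System;

public static class BcdSketch
{
    // Hypothetical variant of ToBcd: same little-endian layout (lowest two digits in ret[0]),
    // but it stops as soon as no digits remain.
    public static byte[] ToBcdEarlyExit(int value)
    {
        if (value < 0 || value > 99999999)
            throw new ArgumentOutOfRangeException(nameof(value));

        byte[] ret = new byte[4]; // .NET zero-initializes new arrays
        for (int i = 0; i < ret.Length && value > 0; i++)
        {
            // Math.DivRem returns the quotient and writes the remainder to the out parameter.
            value = Math.DivRem(value, 10, out int low);
            value = Math.DivRem(value, 10, out int high);
            ret[i] = (byte)((high << 4) | low);
        }
        return ret;
    }
}
```

For 2010 this yields `{ 0x10, 0x20, 0x00, 0x00 }`, the same bytes the loop above produces.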
Here's a terrible brute-force version. I'm sure there's a better way than this, but it ought to work anyway.
int digitOne = year / 1000;
int digitTwo = (year - digitOne * 1000) / 100;
int digitThree = (year - digitOne * 1000 - digitTwo * 100) / 10;
int digitFour = year - digitOne * 1000 - digitTwo * 100 - digitThree * 10;
// The shift and OR produce ints, so each element needs an explicit cast to byte.
byte[] bcdYear = new byte[] { (byte)(digitOne << 4 | digitTwo), (byte)(digitThree << 4 | digitFour) };
The sad part about it is that fast binary to BCD conversions are built into the x86 microprocessor architecture, if you could get at them!
Here is a slightly cleaner version than Jeffrey's:
static byte[] IntToBCD(int input)
{
if (input > 9999 || input < 0)
throw new ArgumentOutOfRangeException("input");
int thousands = input / 1000;
int hundreds = (input -= thousands * 1000) / 100;
int tens = (input -= hundreds * 100) / 10;
int ones = (input -= tens * 10);
byte[] bcd = new byte[] {
(byte)(thousands << 4 | hundreds),
(byte)(tens << 4 | ones)
};
return bcd;
}
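Maybe a simple parse function containing this loop (id is the int to convert, arr is a byte array large enough to hold the result):
int i = 0;
while (id > 0)
{
int twoDigits = id % 100; // two decimal digits per byte
arr[i] = (byte)(twoDigits % 10 + twoDigits / 10 * 16); // low digit in the low nibble, high digit shifted into the high nibble
id /= 100;
i++;
}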
A more general solution:
private IEnumerable<Byte> GetBytes(Decimal value)
{
Byte currentByte = 0;
Boolean odd = true;
while (value > 0)
{
if (odd)
currentByte = 0;
Decimal rest = value % 10;
value = (value-rest)/10;
currentByte |= (Byte)(odd ? (Byte)rest : (Byte)((Byte)rest << 4));
if(!odd)
yield return currentByte;
odd = !odd;
}
if(!odd)
yield return currentByte;
}
Same version as Peter O. but in VB.NET
Public Shared Function ToBcd(ByVal pValue As Integer) As Byte()
If pValue < 0 OrElse pValue > 99999999 Then Throw New ArgumentOutOfRangeException("value")
Dim ret As Byte() = New Byte(3) {} 'All bytes are init with 0's
For i As Integer = 0 To 3
ret(i) = CByte(pValue Mod 10)
pValue = Math.Floor(pValue / 10.0)
ret(i) = ret(i) Or CByte((pValue Mod 10) << 4)
pValue = Math.Floor(pValue / 10.0)
If pValue = 0 Then Exit For
Next
Return ret
End Function
The trick here is to be aware that simply using pValue /= 10 rounds the result. If the argument is 16, for instance, the low nibble of the byte will be correct, but the division will yield 2 (as 1.6 is rounded up). That is why I use Math.Floor; VB.NET's integer division operator (\) would avoid the rounding as well.
I made a generic routine posted at IntToByteArray that you could use like:
var yearInBytes = ConvertBigIntToBcd(2010, 2);
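// Note: this only splits the binary value into its high and low bytes. The result is BCD
// only if the input is already BCD-coded (e.g. 0x2010), not for the decimal value 2010.
static byte[] IntToBCD(int input) {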
byte[] bcd = new byte[] {
(byte)(input>> 8),
(byte)(input& 0x00FF)
};
return bcd;
}
