Contains is faster than StartsWith? - c#

A consultant came by yesterday and somehow the topic of strings came up. He mentioned that he had noticed that for strings less than a certain length, Contains is actually faster than StartsWith. I had to see it with my own two eyes, so I wrote a little app and sure enough, Contains is faster!
How is this possible?
DateTime start = DateTime.MinValue;
DateTime end = DateTime.MinValue;
string str = "Hello there";
start = DateTime.Now;
for (int i = 0; i < 10000000; i++)
{
str.Contains("H");
}
end = DateTime.Now;
Console.WriteLine("{0}ms using Contains", end.Subtract(start).Milliseconds);
start = DateTime.Now;
for (int i = 0; i < 10000000; i++)
{
str.StartsWith("H");
}
end = DateTime.Now;
Console.WriteLine("{0}ms using StartsWith", end.Subtract(start).Milliseconds);
Outputs:
726ms using Contains
865ms using StartsWith
I've tried it with longer strings too!

Try using Stopwatch instead of DateTime.Now to measure the speed.
Stopwatch vs. using System.DateTime.Now for timing events
I think the key is in the following excerpts from the documentation:
Contains:
This method performs an ordinal
(case-sensitive and
culture-insensitive) comparison.
StartsWith:
This method performs a word
(case-sensitive and culture-sensitive)
comparison using the current culture.
I think the key is the ordinal comparison which amounts to:
An ordinal sort compares strings based
on the numeric value of each Char
object in the string. An ordinal
comparison is automatically
case-sensitive because the lowercase
and uppercase versions of a character
have different code points. However,
if case is not important in your
application, you can specify an
ordinal comparison that ignores case.
This is equivalent to converting the
string to uppercase using the
invariant culture and then performing
an ordinal comparison on the result.
References:
http://msdn.microsoft.com/en-us/library/system.string.aspx
http://msdn.microsoft.com/en-us/library/dy85x1sa.aspx
http://msdn.microsoft.com/en-us/library/baketfxw.aspx
Using Reflector you can see the code for the two:
public bool Contains(string value)
{
return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
}
public bool StartsWith(string value, bool ignoreCase, CultureInfo culture)
{
if (value == null)
{
throw new ArgumentNullException("value");
}
if (this == value)
{
return true;
}
CultureInfo info = (culture == null) ? CultureInfo.CurrentCulture : culture;
return info.CompareInfo.IsPrefix(this, value,
ignoreCase ? CompareOptions.IgnoreCase : CompareOptions.None);
}

I figured it out. It's because StartsWith is culture-sensitive, while Contains is not. That inherently means StartsWith has to do more work.
FWIW, here are my results on Mono with the below (corrected) benchmark:
1988.7906ms using Contains
10174.1019ms using StartsWith
I'd be glad to see people's results on MS, but my main point is that correctly done (and assuming similar optimizations), I think StartsWith has to be slower:
using System;
using System.Diagnostics;
public class ContainsStartsWith
{
public static void Main()
{
string str = "Hello there";
Stopwatch s = new Stopwatch();
s.Start();
for (int i = 0; i < 10000000; i++)
{
str.Contains("H");
}
s.Stop();
Console.WriteLine("{0}ms using Contains", s.Elapsed.TotalMilliseconds);
s.Reset();
s.Start();
for (int i = 0; i < 10000000; i++)
{
str.StartsWith("H");
}
s.Stop();
Console.WriteLine("{0}ms using StartsWith", s.Elapsed.TotalMilliseconds);
}
}
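
If the extra cost comes from the culture-sensitive comparison, as argued above, then asking StartsWith for an ordinal comparison should put the two methods on equal footing. A minimal sketch, assuming the documented `String.StartsWith(string, StringComparison)` and `String.IndexOf(string, StringComparison)` overloads; the class and method names are made up for illustration, and no timings are claimed:

```csharp
using System;

public static class PrefixCheck
{
    // Ordinal prefix test: compares UTF-16 code units and ignores the current culture.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }

    // Same comparison rules as string.Contains(string), spelled out explicitly.
    public static bool ContainsOrdinal(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal) >= 0;
    }
}
```

Swapping `str.StartsWith("H")` for `PrefixCheck.StartsWithOrdinal(str, "H")` in the benchmark loop would then compare like with like.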

StartsWith and Contains behave completely differently when it comes to culture-sensitive issues.
In particular, StartsWith returning true does NOT imply Contains returning true. You should replace one of them with the other only if you really know what you are doing.
using System;
class Program
{
static void Main()
{
var x = "A";
var y = "A\u0640";
Console.WriteLine(x.StartsWith(y)); // True
Console.WriteLine(x.Contains(y)); // False
}
}

I twiddled around in Reflector and found a potential answer:
Contains:
return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
StartsWith:
...
switch (comparisonType)
{
case StringComparison.CurrentCulture:
return CultureInfo.CurrentCulture.CompareInfo.IsPrefix(this, value, CompareOptions.None);
case StringComparison.CurrentCultureIgnoreCase:
return CultureInfo.CurrentCulture.CompareInfo.IsPrefix(this, value, CompareOptions.IgnoreCase);
case StringComparison.InvariantCulture:
return CultureInfo.InvariantCulture.CompareInfo.IsPrefix(this, value, CompareOptions.None);
case StringComparison.InvariantCultureIgnoreCase:
return CultureInfo.InvariantCulture.CompareInfo.IsPrefix(this, value, CompareOptions.IgnoreCase);
case StringComparison.Ordinal:
return ((this.Length >= value.Length) && (nativeCompareOrdinalEx(this, 0, value, 0, value.Length) == 0));
case StringComparison.OrdinalIgnoreCase:
return ((this.Length >= value.Length) && (TextInfo.CompareOrdinalIgnoreCaseEx(this, 0, value, 0, value.Length, value.Length) == 0));
}
throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
And there are some overloads so that the default culture is CurrentCulture.
So first of all, Ordinal will be faster anyway (if the match is close to the beginning), right? And secondly, there's more logic here which could slow things down (although only trivially).

Here is a benchmark of using StartsWith vs Contains.
As you can see, StartsWith using ordinal comparison is pretty good, and you should take note of the memory allocated for each method.
| Method | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------------------------------------- |-------------:|-----------:|-------------:|-------------:|----------:|------:|------:|----------:|
| EnumEqualsMethod | 1,079.67 us | 43.707 us | 114.373 us | 1,059.98 us | 1019.5313 | - | - | 4800000 B |
| EnumEqualsOp | 28.15 us | 0.533 us | 0.547 us | 28.34 us | - | - | - | - |
| ContainsName | 1,572.15 us | 152.347 us | 449.198 us | 1,639.93 us | - | - | - | - |
| ContainsShortName | 1,771.03 us | 103.982 us | 306.592 us | 1,749.32 us | - | - | - | - |
| StartsWithName | 14,511.94 us | 764.825 us | 2,255.103 us | 14,592.07 us | - | - | - | - |
| StartsWithNameOrdinalComp | 1,147.03 us | 32.467 us | 93.674 us | 1,153.34 us | - | - | - | - |
| StartsWithNameOrdinalCompIgnoreCase | 1,519.30 us | 134.951 us | 397.907 us | 1,264.27 us | - | - | - | - |
| StartsWithShortName | 7,140.82 us | 61.513 us | 51.366 us | 7,133.75 us | - | - | - | 4 B |
| StartsWithShortNameOrdinalComp | 970.83 us | 68.742 us | 202.686 us | 1,019.14 us | - | - | - | - |
| StartsWithShortNameOrdinalCompIgnoreCase | 802.22 us | 15.975 us | 32.270 us | 792.46 us | - | - | - | - |
| EqualsSubstringOrdinalCompShortName | 4,578.37 us | 91.567 us | 231.402 us | 4,588.09 us | 679.6875 | - | - | 3200000 B |
| EqualsOpShortNametoCharArray | 1,937.55 us | 53.821 us | 145.508 us | 1,901.96 us | 1695.3125 | - | - | 8000000 B |
Here is my benchmark code
https://gist.github.com/KieranMcCormick/b306c8493084dfc953881a68e0e6d55b

Let's examine what ILSpy says about these two...
public virtual int IndexOf(string source, string value, int startIndex, int count, CompareOptions options)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
if (value == null)
{
throw new ArgumentNullException("value");
}
if (startIndex > source.Length)
{
throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_Index"));
}
if (source.Length == 0)
{
if (value.Length == 0)
{
return 0;
}
return -1;
}
else
{
if (startIndex < 0)
{
throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_Index"));
}
if (count < 0 || startIndex > source.Length - count)
{
throw new ArgumentOutOfRangeException("count", Environment.GetResourceString("ArgumentOutOfRange_Count"));
}
if (options == CompareOptions.OrdinalIgnoreCase)
{
return source.IndexOf(value, startIndex, count, StringComparison.OrdinalIgnoreCase);
}
if ((options & ~(CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreSymbols | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth)) != CompareOptions.None && options != CompareOptions.Ordinal)
{
throw new ArgumentException(Environment.GetResourceString("Argument_InvalidFlag"), "options");
}
return CompareInfo.InternalFindNLSStringEx(this.m_dataHandle, this.m_handleOrigin, this.m_sortName, CompareInfo.GetNativeCompareFlags(options) | 4194304 | ((source.IsAscii() && value.IsAscii()) ? 536870912 : 0), source, count, startIndex, value, value.Length);
}
}
Looks like it considers culture as well, but is defaulted.
public bool StartsWith(string value, StringComparison comparisonType)
{
if (value == null)
{
throw new ArgumentNullException("value");
}
if (comparisonType < StringComparison.CurrentCulture || comparisonType > StringComparison.OrdinalIgnoreCase)
{
throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
}
if (this == value)
{
return true;
}
if (value.Length == 0)
{
return true;
}
switch (comparisonType)
{
case StringComparison.CurrentCulture:
return CultureInfo.CurrentCulture.CompareInfo.IsPrefix(this, value, CompareOptions.None);
case StringComparison.CurrentCultureIgnoreCase:
return CultureInfo.CurrentCulture.CompareInfo.IsPrefix(this, value, CompareOptions.IgnoreCase);
case StringComparison.InvariantCulture:
return CultureInfo.InvariantCulture.CompareInfo.IsPrefix(this, value, CompareOptions.None);
case StringComparison.InvariantCultureIgnoreCase:
return CultureInfo.InvariantCulture.CompareInfo.IsPrefix(this, value, CompareOptions.IgnoreCase);
case StringComparison.Ordinal:
return this.Length >= value.Length && string.nativeCompareOrdinalEx(this, 0, value, 0, value.Length) == 0;
case StringComparison.OrdinalIgnoreCase:
return this.Length >= value.Length && TextInfo.CompareOrdinalIgnoreCaseEx(this, 0, value, 0, value.Length, value.Length) == 0;
default:
throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
}
By contrast, the only difference I see that appears relevant is an extra length check.


How to validate Date Start and Finish Overlap from a list of items

What I have
A list of objects with Id, DateStart and DateFinish.
[
{
Id: 1234567890,
DateStart: new DateTime(),
DateFinish: new DateTime(),
},
...
]
What I need to do
I need to validate if none of the dates overlap each other.
I'm not sure if "overlap" conveys the right meaning here, so here are some examples:
Invalid Entry
[
{
Id: 1,
DateStart: new DateTime().AddHours(1),
DateFinish: new DateTime().AddHours(3),
},
{
Id: 2,
DateStart: new DateTime().AddHours(2),
DateFinish: new DateTime().AddHours(4),
}
]
This list has an overlap because the start time of id 2 falls in the middle of id 1.
A table to show better:
-------------------------------------------------------------
| 1 | 2 | 3 | 4 |
| DateStart1 | | DateFinish1 | |
| | DateStart2 | | DateFinish2 |
-------------------------------------------------------------
*overlap* *overlap*
Other Invalid Examples
-------------------------------------------------------------
| 1 | 2 | 3 | 4 |
| DateStart1 | | | DateFinish1 |
| | DateStart2 | DateFinish2 | |
-------------------------------------------------------------
*overlap* *overlap*
-------------------------------------------------------------
| 1 | 2 | 3 | 4 |
| DateStart1 | | | DateFinish1 | // This would be a full overlap
| DateStart2 | | | DateFinish2 | // And it's also Invalid
-------------------------------------------------------------
*overlap* *overlap*
-------------------------------------------------------------
| 1 | 2 | 3 | 4 |
| | DateStart1 | | DateFinish1 | // Same as first example
| DateStart2 | | DateFinish2 | | // But "inverted"
-------------------------------------------------------------
*overlap* *overlap*
Valid Entry
[
{
Id: 1,
DateStart: new DateTime().AddHours(1),
DateFinish: new DateTime().AddHours(2),
},
{
Id: 2,
DateStart: new DateTime().AddHours(2),
DateFinish: new DateTime().AddHours(4),
}
]
A table to show better:
-------------------------------------------------------------
| 1 | 2 | 3 | 4 |
| DateStart1 | DateFinish1 | | |
| | DateStart2 | | DateFinish2 |
-------------------------------------------------------------
*not overlap*
And you can also have DateStart and DateFinish that are the same value, which means it can start and end at the same time.
-------------------------------------------------------------
| 1 | 2 | 3 | 4 |
| DateStart1 | | | |
| DateFinish1 | | | |
| DateStart2 | | | DateFinish2 |
-------------------------------------------------------------
*not overlap*
What I have done so far:
I'm making a foreach loop, where item is each element, and using a where with the following expression:
myList.Any(
x => x.Id == item.Id
&&
(
(
item.DateStart <= x.DateStart
&&
item.DateFinish > x.DateStart
&&
item.DateFinish <= x.DateFinish
)
||
(
item.DateStart >= x.DateStart
&&
item.DateStart < x.DateFinish
&&
item.DateFinish > x.DateFinish
)
||
(
item.DateStart <= x.DateStart
&&
item.DateFinish >= x.DateFinish
)
)
)
My Question
Is this expression correct? I have tried it with a lot of data and it seems to be wrong sometimes.
I need to be certain that it will cover all edge cases.
If there is a better way of writing all this logic, that would help too, because this code looks too ugly and is hard for other people to understand.
I would use the following code:
static bool IsOverlapping(IEnumerable<Range> list)
{
Range previousRange = null;
foreach (var currentRange in list.OrderBy(x => x.DateStart).ThenBy(x => x.DateFinish))
{
if (currentRange.DateStart > currentRange.DateFinish)
return true;
if (previousRange?.DateFinish > currentRange.DateStart)
return true;
previousRange = currentRange;
}
return false;
}
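
For comparison, the three-branch condition in the question can be reduced to a single pairwise test: two ranges overlap when each one starts before the other finishes. The following is a sketch under the assumption that ranges are half-open (touching endpoints are allowed, as in the question's valid examples); the `Slot` type and the method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type mirroring the question's Id / DateStart / DateFinish shape.
public sealed class Slot
{
    public int Id { get; set; }
    public DateTime DateStart { get; set; }
    public DateTime DateFinish { get; set; }
}

public static class OverlapCheck
{
    // Two half-open ranges [start, finish) overlap exactly when each one
    // starts before the other finishes. Touching endpoints do not count.
    public static bool Overlaps(Slot a, Slot b)
    {
        return a.DateStart < b.DateFinish && b.DateStart < a.DateFinish;
    }

    // Pairwise O(n^2) check; simple, and adequate for small lists.
    public static bool AnyOverlap(IReadOnlyList<Slot> slots)
    {
        for (int i = 0; i < slots.Count; i++)
            for (int j = i + 1; j < slots.Count; j++)
                if (Overlaps(slots[i], slots[j]))
                    return true;
        return false;
    }
}
```

Note that under this test a zero-length range sitting exactly on another range's start or finish is not reported as an overlap, which matches the last table in the question.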
A quick and dirty version. It is not very performant on large sets as is, but it can be improved on.
https://dotnetfiddle.net/Widget/PEn2Lm
static void DetectOverlap(List<Range> l)
{
foreach(var r in l)
{
var overlap = l.Any(x => x.Id != r.Id
&& ((r.Start == x.Start && r.End == x.End)
|| (r.Start >= x.Start && r.Start < x.End)
|| (r.End > x.Start && r.End <= x.End)));
if(overlap)
{
Console.WriteLine("Overlap detected");
throw new Exception("Overlapping range detected");
}
}
Console.WriteLine("Clean ranges");
}
I tried to mirror your cases, but I'd suggest writing unit tests to fully test all of your scenarios.
List<FooBar> bars = new List<FooBar>()
{
new FooBar() //end date is inside 3
{
Start = new DateTime(2001,12,1),
End = new DateTime(2002,5,15),
Id = 1
},
new FooBar() //fine
{
Start = new DateTime(2005,12,1),
End = new DateTime(2006,5,15),
Id = 2
},
new FooBar() //start date is inside 1
{
Start = new DateTime(2002,4,1),
End = new DateTime(2003,5,15),
Id = 3
},
new FooBar() //this one is fine
{
Start = new DateTime(2006,5,15),
End = new DateTime(2007,5,15),
Id = 4
},
new FooBar() //also fine
{
Start = new DateTime(2001,12,1),
End = new DateTime(2001,12,1),
Id = 5
},
};
And then a, to me at least, slightly easier to read / skim code snippet which seems to work perfectly:
var inside = bars.Where(w =>
bars.Where(outer => ((outer.Start < w.Start && outer.End > w.Start)
|| (outer.Start < w.End && outer.End > w.End)
|| (outer.Start == w.Start && outer.End == w.End)) && outer.Id != w.Id).Any()).ToList();
inside.ForEach(e => {
Console.WriteLine($"{e.Id}");
});
For actual use, I'd also test for Any or First, and not ToList, but this gives me the Ids for the console to check.
As for why I used this logic, it might prove faulty but my assumptions:
An overlapped start date is between another input's start and end,
or an item's end date is between another input's start and end dates,
or a start and end date matches another input's values exactly.
With additional tests (as with your provided code), I would expect false positives due to the use of <= and >=.
For example, changing the method to include
bars.Where(outer => ((outer.Start <= w.Start && outer.End >= w.Start)
gives me false positives on 4 and 5

Bug in minimax for tic_tac_toe AI

I have been trying to implement an AI for the computer using minimax with alpha-beta pruning, but I'm facing a bug I cannot identify. The algorithm should calculate all the possible moves of its own and of the other player too, but it isn't playing back the way it should.
Here is my minimax code :
public int minimax(int[] board, char symbol, int alpha, int beta, int depth = 2)
{
int win = util.checkwin(board);
int nsymbol = (symbol == 'X' ? 1 : 2);
int mult = (symbol == compside ? 1 : -1);
if (win != -1)
{
if (win == nsymbol)
return mult;
else if (win != 0)
return (mult * -1);
else
return 0;
}
if (depth == 0)
return 0;
int[] newboard = new int[9];
Array.Copy(board, newboard, 9);
int score, i, pos = -1;
ArrayList emptyboard = new ArrayList();
emptyboard = util.filterboard(newboard);
for (i = 0; i < emptyboard.Count; i++)
{
if (i > 0)
newboard[(int)emptyboard[i - 1]] = 0;
newboard[(int)emptyboard[i]] = nsymbol;
score = minimax(newboard, util.changeside(symbol), alpha, beta, depth - 1);
if (mult == 1)
{
if (score > alpha)
{
alpha = score;
pos = (int)emptyboard[i];
}
if (alpha >= beta)
break;
}
else
{
if (score < beta)
beta = score;
if (alpha >= beta)
break;
}
}
if (depth == origdepth)
return pos;
if (mult == 1)
return alpha;
else
return beta;
}
The details of undefined functions:
util.checkwin(int[] board) = checks whether the board is won, drawn or still incomplete, and returns the winner as 1 or 2 (player X or O), 0 for a draw, and -1 for an incomplete board.
util.filterboard(int[] newboard) = returns an arraylist containing all the positions of empty locations in board given.
util.changeside(char symbol) = simply flips X to O and O to X and returns the result.
I have tried with the depth as 2, which means it will calculate the next 2 moves (whether it is winning and whether the opponent can win). But the results weren't what I expected, and it also occasionally tries to play on a filled location.
Here is an output(depth = 2):
Turn: X
| |
1 | 2 | 3
__|___|__
| |
4 | 5 | 6
__|___|__
| |
7 | 8 | 9
| |
Enter Your Choice:
Turn: O
| |
1 | 2 | 3
__|___|__
| |
X | 5 | 6
__|___|__
| |
7 | 8 | 9
| |
Enter Your Choice: 5
Turn: X
| |
1 | 2 | 3
__|___|__
| |
X | O | 6
__|___|__
| |
7 | 8 | 9
| |
Enter Your Choice:
Turn: O
| |
1 | X | 3
__|___|__
| |
X | O | 6
__|___|__
| |
7 | 8 | 9
| |
Enter Your Choice: 1
Turn: X
| |
O | X | 3
__|___|__
| |
X | O | 6
__|___|__
| |
7 | 8 | 9
| |
Enter Your Choice:
Turn: O
| |
O | X | 3
__|___|__
| |
X | O | 6
__|___|__
| |
7 | X | 9
| |
Enter Your Choice: 9
| |
O | X | 3
__|___|__
| |
X | O | 6
__|___|__
| |
7 | X | O
| |
O Wins
But it still fails to recognize my winning move.
All the other functions have been tested when played user against a user and they are all working fine. I would appreciate some help.
I am happy to provide my full code, if necessary and anything else required.
A couple of observations.
1) The if (depth == 0) return 0; should be changed to something like
if (depth == 0) return EvaluatePosition();,
because currently your algorithm will return 0 (score, corresponding to a draw) whenever it reaches depth zero (while the actual position at zero depth might not be equal - for instance, one of the sides can have huge advantage). EvaluatePosition() function should reflect the current board position (it should say something like "X has an advantage", "O is losing", "The position is more or less equal" etc, represented as a number). Note, that this will matter only if depth == 0 condition is triggered, otherwise it is irrelevant.
2) Do you really need this emptyboard stuff? You can iterate over all squares of the newboard and once you find an empty square, copy the original board, make the move on this empty square and call minimax with the copied and updated board. In pseudocode it will look something like this:
for square in board.squares:
if square is empty:
board_copy = Copy(board)
board_copy.MakeMove(square)
score = minimax(board_copy, /*other arguments*/)
/*the rest of minimax function*/
3) The if (alpha >= beta) break; piece is present in both branches (for mult == 1 and mult != 1), so you can put it after the if-else block to reduce code repetition.
4) Check if your algorithm is correct without alpha-beta pruning. The outcomes of plain minimax and alpha-beta pruning minimax should be the same, but plain minimax is easier to understand, code and debug. After your plain minimax is working properly, add enhancements like alpha-beta pruning and others.
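
To illustrate point 4, here is a sketch of plain minimax without pruning and without a depth limit, which is cheap enough for tic-tac-toe. It is not the asker's code: the class and method names are hypothetical, `CheckWin` is a stand-in for `util.checkwin` written to the contract described in the question (1 or 2 for a winner, 0 for a draw, -1 for an incomplete board), and players are passed as 1 (X) or 2 (O) instead of chars. Each move is made on the board and undone afterwards, so no stale marks are left behind between iterations.

```csharp
using System;

public static class PlainMinimax
{
    // Board cells: 0 = empty, 1 = X, 2 = O (same encoding as the question).
    static readonly int[][] Lines =
    {
        new[] {0, 1, 2}, new[] {3, 4, 5}, new[] {6, 7, 8},
        new[] {0, 3, 6}, new[] {1, 4, 7}, new[] {2, 5, 8},
        new[] {0, 4, 8}, new[] {2, 4, 6}
    };

    // Returns 1 or 2 for a winner, 0 for a draw, -1 if the game is still open.
    public static int CheckWin(int[] board)
    {
        foreach (var l in Lines)
            if (board[l[0]] != 0 && board[l[0]] == board[l[1]] && board[l[1]] == board[l[2]])
                return board[l[0]];
        return Array.IndexOf(board, 0) < 0 ? 0 : -1;
    }

    // Score from the point of view of 'me': +1 win, 0 draw, -1 loss.
    // 'toMove' is the player whose turn it is. Searches to the end of the game.
    public static int Score(int[] board, int toMove, int me)
    {
        int win = CheckWin(board);
        if (win != -1)
            return win == 0 ? 0 : (win == me ? 1 : -1);

        bool maximizing = toMove == me;
        int best = maximizing ? int.MinValue : int.MaxValue;
        for (int i = 0; i < 9; i++)
        {
            if (board[i] != 0) continue;
            board[i] = toMove;                     // make the move
            int s = Score(board, 3 - toMove, me);  // the other side replies
            board[i] = 0;                          // undo the move
            best = maximizing ? Math.Max(best, s) : Math.Min(best, s);
        }
        return best;
    }

    // Picks the empty square with the best score for 'me'; -1 if the board is full.
    public static int BestMove(int[] board, int me)
    {
        int bestPos = -1, bestScore = int.MinValue;
        for (int i = 0; i < 9; i++)
        {
            if (board[i] != 0) continue;
            board[i] = me;
            int s = Score(board, 3 - me, me);
            board[i] = 0;
            if (s > bestScore) { bestScore = s; bestPos = i; }
        }
        return bestPos;
    }
}
```

Once a version like this plays correctly, alpha-beta pruning can be added and its results compared against it move by move.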

Logarithmic distribution of profits among game winners

I have a game which, when it's finished, has a table of players and their scores.
On the other hand, I have a virtual pot of money that I want to distribute among these winners. I'm looking for a SQL query or a piece of C# code to do so.
The descending sorted table looks like this:
UserId | Name | Score | Position | % of winnings | abs. winnings $
00579 | John | 754 | 1 | ? | 500 $
98983 | Sam | 733 | 2 | ? | ?
29837 | Rick | 654 | 3 | ? | ? <- there are 2 3rd places
21123 | Hank | 654 | 3 | ? | ? <- there are 2 3rd places
99821 | Buck | 521 | 5 | ? | ? <- there is no 4th, because of the 2 3rd places
92831 | Joe | 439 | 6 | ? | ? <- there are 2 6th places
99281 | Jack | 439 | 6 | ? | ? <- there are 2 6th places
12345 | Hal | 412 | 8 | ? | ?
98112 | Mick | 381 | 9 | ? | ?
and so on, until position 50
98484 | Sue | 142 | 50 | ? | 5 $
Be aware of the double 3rd and 6th places.
Now I want to distribute the total amount of (virtual) money ($ 10,000) among the first 50 positions. (It would be nice if the number of positions to distribute among, which is now 50, could be a variable.)
The max and min amount (for nr 1 and nr 50) are fixed at 500 and 5.
Does anyone have a good idea for a SQL query or piece of C# code to fill the columns with % of winnings and absolute winnings $ correctly?
I prefer to have a distribution that looks a bit logarithmic like this: (which makes that the higher positions get relatively more than the lower ones).
.
|.
| .
| .
| .
| .
| .
| .
| .
| .
I haven't done SQL since 1994, but I like C# :-). The following might suit; adjust the parameters of DistributeWinPot.GetWinAmounts(...) as required:
private class DistributeWinPot {
private static double[] GetWinAmounts(int[] psns, double TotWinAmounts, double HighWeight, double LowWeight) {
double[] retval = new double[psns.Length];
double fac = -Math.Log(HighWeight / LowWeight) / (psns.Length - 1), sum = 0;
for (int i = 0; i < psns.Length; i++) {
sum += retval[i] = (i == 0 || psns[i] > psns[i - 1] ? HighWeight * Math.Exp(fac * (i - 1)) : retval[i - 1]);
}
double scaling = TotWinAmounts / sum;
for (int i = 0; i < psns.Length; i++) {
retval[i] *= scaling;
}
return retval;
}
public static void main(string[] args) {
// set up dummy data, positions in an int array
int[] psns = new int[50];
for (int i = 0; i < psns.Length; i++) {
psns[i] = i+1;
}
psns[3] = 3;
psns[6] = 6;
double[] WinAmounts = GetWinAmounts(psns, 10000, 500, 5);
for (int i = 0; i < psns.Length; i++) {
System.Diagnostics.Trace.WriteLine((i + 1) + "," + psns[i] + "," + string.Format("{0:F2}", WinAmounts[i]));
}
}
}
Output from that code was:
1,1,894.70
2,2,814.44
3,3,741.38
4,3,741.38
5,5,614.34
6,6,559.24
7,6,559.24
8,8,463.41
9,9,421.84
10,10,384.00
11,11,349.55
12,12,318.20
13,13,289.65
14,14,263.67
15,15,240.02
16,16,218.49
17,17,198.89
18,18,181.05
19,19,164.81
20,20,150.03
21,21,136.57
22,22,124.32
23,23,113.17
24,24,103.02
25,25,93.77
26,26,85.36
27,27,77.71
28,28,70.74
29,29,64.39
30,30,58.61
31,31,53.36
32,32,48.57
33,33,44.21
34,34,40.25
35,35,36.64
36,36,33.35
37,37,30.36
38,38,27.64
39,39,25.16
40,40,22.90
41,41,20.85
42,42,18.98
43,43,17.27
44,44,15.72
45,45,14.31
46,46,13.03
47,47,11.86
48,48,10.80
49,49,9.83
50,50,8.95
Then how about this?
Select userid, log(score),
10000 * log(score) /
(Select Sum(log(score))
From TableName
Where score >=
(Select Min(score)
from (Select top 50 score
From TableName
Order By score desc) z))
From TableName
Order By score desc

c# recursive function help understanding how it works?

I need help understanding how a function works: it is a recursive function with yield return, but I can't figure out how it works. It is used to calculate an (approximate) cumulative density function over a set of data.
Thanks a lot to everyone.
/// <summary>
/// Approximates the cumulative density through a recursive procedure
/// estimating counts of regions at different resolutions.
/// </summary>
/// <param name="data">Source collection of integer values</param>
/// <param name="maximum">The largest integer in the resulting cdf (it has to be a power of 2...</param>
/// <returns>A list of counts, where entry i is the number of records less than i</returns>
public static IEnumerable<int> FUNCT(IEnumerable<int> data, int max)
{
if (max == 1)
{
yield return data.Count();
}
else
{
var t = data.Where(x => x < max / 2);
var f = data.Where(x => x > max / 2);
foreach (var value in FUNCT(t, max / 2))
yield return value;
var count = t.Count();
f = f.Select(x => x - max / 2);
foreach (var value in FUNCT(f, max / 2))
yield return value + count;
}
}
In essence, IEnumerable functions that use yield return function slightly differently from traditional recursive functions. As a base case, suppose you have:
IEnumerable<int> F(int n)
{
if (n == 1)
{
yield return 1;
yield return 2;
yield break; // stop here; without this, execution would fall through to the loops below
}
// Enter loop 1
foreach (var v in F(n - 1))
yield return v;
// End loop 1
int sum = 5;
// Enter loop 2
foreach (var v in F(n - 1))
yield return v + sum;
// End loop 2
// implied yield break;
}
void Main()
{
foreach (var v in F(2))
Console.Write(v);
// implied return
}
F takes the basic form of the original FUNCT. If we call F(2), then walking through the yields:
F(2)
| F(1)
| | yield return 1
| yield return 1
Console.Write(1);
| | yield return 2
| yield return 2
Console.Write(2)
| | RETURNS
| sum = 5;
| F(1)
| | yield return 1
| yield return 1 + 5
Console.Write(6)
| | yield return 2
| yield return 2 + 5
Console.Write(7)
| | RETURNS
| RETURNS
RETURNS
And 1267 is printed. Note that the yield return statement yields control to the caller, but that the next iteration causes the function to continue where it had previously yielded.
The CDF method adds some additional complexity, but not much. The recursion splits the collection into two pieces and computes the CDF of each piece, until max=1. Then the function counts the number of elements and yields it, with each yield propagating recursively to the enclosing loop.
To walk through FUNCT, suppose you run with data=[0,1,0,1,2,3,2,1] and max=4. Then running through the method, using the same Main function above as a driver, yields:
FUNCT([0,1,0,1,2,3,2,1], 4)
| max/2 = 2
| t = [0,1,0,1,1]
| f = [3] // (note: per my comment to the original question,
| // should be [2,3,2] to get true CDF. The 2s are
| // ignored since the method uses > max/2 rather than
| // >= max/2.)
| FUNCT(t,max/2) = FUNCT([0,1,0,1,1], 2)
| | max/2 = 1
| | t = [0,0]
| | f = [] // or [1,1,1]
| | FUNCT(t, max/2) = FUNCT([0,0], 1)
| | | max = 1
| | | yield return data.count = [0,0].count = 2
| | yield return 2
| yield return 2
Console.Write(2)
| | | RETURNS
| | count = t.count = 2
| | FUNCT(f, max/2) = FUNCT([], 1)
| | | max = 1
| | | yield return data.count = [].count = 0
| | yield return 0 + count = 2
| yield return 2
Console.Write(2)
| | | RETURNS
| | RETURNS
| count = t.Count() = 5
| f = f - max/2 = f - 2 = [1]
| FUNCT(f, max/2) = FUNCT([1], 2)
| | max = 2
| | max/2 = 1
| | t = []
| | f = [] // or [1]
| | FUNCT(t, max/2) = FUNCT([], 1)
| | | max = 1
| | | yield return data.count = [].count = 0
| | yield return 0
| yield return 0 + count = 5
Console.Write(5)
| | | RETURNS
| | count = t.count = [].count = 0
| | f = f - max/2 = []
| | FUNCT(f, max/2) = FUNCT([], 1)
| | | max = 1
| | | yield return data.count = [].count = 0
| | yield return 0 + count = 0 + 0 = 0
| yield return 0 + count = 0 + 5 = 5
Console.Write(5)
| | RETURNS
| RETURNS
RETURNS
So this returns the values (2,2,5,5). (using >= would yield the values (2,5,7,8) -- note that these are the exact values of a scaled CDF for non-negative integral data, rather than an approximation).
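
The `>=` variant mentioned above can be sketched as follows. This is an illustration rather than the original author's code: the class and method names are hypothetical, `max` is assumed to be a power of two, and every value is assumed to lie in the range 0 to max - 1. The two halves are materialized with `ToList()` so that each is enumerated only once.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CdfSketch
{
    // Same recursion as FUNCT, but the upper half uses >= so that values
    // equal to max / 2 are kept instead of being dropped.
    public static IEnumerable<int> Cdf(IEnumerable<int> data, int max)
    {
        if (max == 1)
        {
            // End of the recursion: everything left falls into this one bucket.
            yield return data.Count();
            yield break;
        }

        int half = max / 2;
        var lower = data.Where(x => x < half).ToList();
        // Shift the upper half down so it can be handled by the same recursion.
        var upper = data.Where(x => x >= half).Select(x => x - half).ToList();

        foreach (var value in Cdf(lower, half))
            yield return value;

        // Every count from the upper half sits on top of all lower-half elements.
        foreach (var value in Cdf(upper, half))
            yield return value + lower.Count;
    }
}
```

Calling `CdfSketch.Cdf(new[] { 0, 1, 0, 1, 2, 3, 2, 1 }, 4)` yields 2, 5, 7, 8, matching the values given above.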
Interesting question. Assuming you understand how yield works, the comments on the function (in your question) are very helpful. I've commented the code as I understand it which might help:
public static IEnumerable<int> FUNCT(IEnumerable<int> data, int max)
{
if (max == 1)
{
// Effectively the end of the recursion.
yield return data.Count();
}
else
{
// Split the data into two sets
var t = data.Where(x => x < max / 2);
var f = data.Where(x => x > max / 2);
// In the set of smaller numbers, recurse to split it again
foreach (var value in FUNCT(t, max / 2))
yield return value;
// For the set of smaller numbers, get the count.
var count = t.Count();
// Shift the larger numbers so they are in the smaller half.
// This allows the recursive function to reach an end.
f = f.Select(x => x - max / 2);
// Recurse but add the count of smaller numbers. We already know there
// are at least 'count' values which are less than max / 2.
// Recurse to find out how many more there are.
foreach (var value in FUNCT(f, max / 2))
yield return value + count;
}
}

How to Convert decimal number to time or vice versa

Here is an example:
If the value is 8.30, it should become 8 hours 30 minutes.
If the value is 8 hours 20 minutes, it should become 8.20.
Is this possible? If yes, how?
When people talk about decimal hours, they usually mean 0.1 = 6 minutes.
So, the correct formula to convert 8.3 would be:
8 hours + 3 * 6 minutes = 8:18
To convert 8:20 to decimal it would be:
8 + 20/60 = 8.333333 (probably round to 8.3)
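
The decimal-hours reading above (0.1 hour = 6 minutes) could be sketched in C# like this. The class and method names are hypothetical, values are assumed to be non-negative, and the result is rounded to the nearest whole minute:

```csharp
using System;

public static class DecimalHours
{
    // 8.3m becomes 08:18:00, because 0.3 h * 60 = 18 min.
    public static TimeSpan ToTimeSpan(decimal hours)
    {
        int totalMinutes = (int)Math.Round(hours * 60m);
        return new TimeSpan(totalMinutes / 60, totalMinutes % 60, 0);
    }

    // 08:20:00 becomes 8.33 when rounded to two decimals.
    public static decimal FromTimeSpan(TimeSpan time, int decimals = 2)
    {
        return Math.Round((decimal)time.TotalMinutes / 60m, decimals);
    }
}
```

Using `decimal` rather than `double` for the arithmetic keeps values such as 8.3 exact, so the minute count does not depend on floating-point rounding.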
If it will always be separated with . and you only want it for display, then simply use this:
var ar="8.30".Split(new[]{'.'});
Console.Write("{0} hours {1} minutes",ar[0], ar[1]);
PS: Here we are sure to have two elements in array, but please check length of array ar before using ar[1]
My approach would look something like this. (This is Ruby, so you'll have to convert it yourself, but the logic is what's important here.)
def zeropad(number)
return ((number.to_f < 10) ? "0" : "") + number.round.to_s
end
def decimal_to_time(value)
t = value.split(".") #returns an array of ["hour", "minutes"]
hours, minutes = t[0], t[1]
minutes = zeropad( (minutes.to_f / 10**minutes.length) * 60 ) # parse the minutes into a time value
return (minutes.to_i == 0) ? hours : hours + ":" + minutes
end
def findTime(value)
value =~ /^\d+\.\d+/ ? decimal_to_time(value) : value
end
Where findTime("5.015") gives you the appropriate time value.
I've tested this across the following tests and they all pass.
| entered_time | expected_results|
| "5.6" | "5:36" |
| "5.9" | "5:54" |
| "5.09" | "5:05" |
| "5.0" | "5" |
| "5.00" | "5" |
| "5.015" | "5:01" |
| "6.03" | "6:02" |
| "5.30" | "5:18" |
| "4.2" | "4:12" |
| "8.3" | "8:18" |
| "8.33" | "8:20" |
| "105.5" | "105:30" |
| "16.7" | "16:42" |
| "Abc" | "Abc" |
| "5:36" | "5:36" |
| "5:44" | "5:44" |
Here's a couple of extension methods (for DateTime and Decimal) that do the job:
public static class DecimalToTimeConverters
{
public static DateTime ToDateTime(this decimal value)
{
string[] parts = value.ToString().Split(new char[] { '.' });
int hours = Convert.ToInt32(parts[0]);
int minutes = Convert.ToInt32(parts[1]);
if ((hours > 23) || (hours < 0))
{
throw new ArgumentOutOfRangeException("value", "decimal value must be no greater than 23.59 and no less than 0");
}
if ((minutes > 59) || (minutes < 0))
{
throw new ArgumentOutOfRangeException("value", "decimal value must be no greater than 23.59 and no less than 0");
}
DateTime d = new DateTime(1, 1, 1, hours, minutes, 0);
return d;
}
public static Decimal ToDecimal(this DateTime datetime)
{
Decimal d = new decimal();
d = datetime.Hour;
d = d + Convert.ToDecimal((datetime.Minute * 0.01));
return d;
}
}
I tested this very quickly in an ASP.net webpage (I had a web project open at the time) using the following in a new blank page, and it seemed to work a treat:
protected void Page_Load(object sender, EventArgs e)
{
Response.Clear();
Decimal d = new decimal();
d = 3.45M;
Response.Write(d.ToDateTime().ToString());
Response.Write("<br />");
DateTime d2 = new DateTime(2009, 1, 1, 4, 55, 0);
Response.Write(d2.ToDecimal().ToString());
}
As per Rob but substitute
string[] parts = value.ToString().Split(new char[] { '.' });
int hours = Convert.ToInt32(parts[0]);
int minutes = Convert.ToInt32(parts[1]);
as
int hours = (int)value;
int minutes = (int)((value - hours) * 100);
no strings or reliance on current culture (the assumption that the '.' is the decimal point)
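
Put together, the string-free approach for the "8.30 means 8 hours 30 minutes" reading might look like the sketch below. The class and method names are hypothetical, and a `TimeSpan` is returned instead of a `DateTime` so that no dummy date is needed:

```csharp
using System;

public static class ClockDecimal
{
    // Treats the fraction as literal minutes: 8.30m becomes 8 h 30 min.
    public static TimeSpan ToTimeSpan(decimal value)
    {
        int hours = (int)decimal.Truncate(value);
        int minutes = (int)decimal.Round((value - hours) * 100m);
        if (value < 0 || minutes > 59)
            throw new ArgumentOutOfRangeException(nameof(value));
        return new TimeSpan(hours, minutes, 0);
    }

    // 8 h 20 min becomes 8.20m.
    public static decimal FromTimeSpan(TimeSpan time)
    {
        return (int)time.TotalHours + time.Minutes / 100m;
    }
}
```

With this reading, a value such as 8.75 is rejected, because 75 is not a valid number of minutes.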
How can I parse the txtDuration.Text Value into a decimal value?
if (txtDuration.Text)
{
var duration = int.Parse(txtDuration.Text);
var timespan = Boolean.Parse(hdfToggleDuration.Value) ? new TimeSpan (0, 0, duration, 0) : new TimeSpan (0, duration, 0, 0);
DateTime end = start.Add(timespan);
}
