How can multiple IndexOf be faster than raw iteration? - c#

string s = "abcabcabcabcabc";
var foundIndexes = new List<int>();
The question came from the discussion here. I was simply wondering how this:
for (int i = s.IndexOf('a'); i > -1; i = s.IndexOf('a', i + 1))
    foundIndexes.Add(i);
can be better than this:
for (int i = 0; i < s.Length; i++)
    if (s[i] == 'a') foundIndexes.Add(i);
EDIT: Where does the performance gain come from?

I did not observe that using IndexOf was any faster than direct looping. Honestly, I don't see how it could be, because each character needs to be checked in both cases. My initial results were:
Found by loop, 974 ms
Found by IndexOf 1144 ms
Edit: After several more runs I noticed that you must run a release build (i.e. with optimizations) to see the result above. Without optimizations, the for loop is indeed slower.
The benchmark code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Text;
using System.IO;
using System.Diagnostics;
namespace Test
{
public class Program
{
public static void Main(string[] args)
{
const string target = "abbbdbsdbsbbdbsabdbsabaababababafhdfhffadfd";
// Jit methods
TimeMethod(FoundIndexesLoop, target, 1);
TimeMethod(FoundIndexesIndexOf, target, 1);
Console.WriteLine("Found by loop, {0} ms", TimeMethod(FoundIndexesLoop, target, 2000000));
Console.WriteLine("Found by IndexOf {0} ms", TimeMethod(FoundIndexesIndexOf, target, 2000000));
}
private static long TimeMethod(Func<string, List<int>> method, string input, int reps)
{
var stopwatch = Stopwatch.StartNew();
List<int> result = null;
for(int i = 0; i < reps; i++)
{
result = method(input);
}
stopwatch.Stop();
TextWriter.Null.Write(result);
return stopwatch.ElapsedMilliseconds;
}
private static List<int> FoundIndexesIndexOf(string s)
{
List<int> indexes = new List<int>();
for (int i = s.IndexOf('a'); i > -1; i = s.IndexOf('a', i + 1))
{
// for loop end when i=-1 ('a' not found)
indexes.Add(i);
}
return indexes;
}
private static List<int> FoundIndexesLoop(string s)
{
var indexes = new List<int>();
for (int i = 0; i < s.Length; i++)
{
if (s[i] == 'a')
indexes.Add(i);
}
return indexes;
}
}
}

IndexOf(char value, int startIndex) is marked with the following attribute: [TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")].
Also, the implementation of this method is most likely optimized in many other ways, probably using unsafe code, or using more "native" techniques, say, using the native FindNLSString Win32 function.

Related

Why can't I find Sum() of this HashSet. says "Arithmetic operation resulted in an overflow."

I was trying to solve Project Euler problem 125.
This is my solution in Python (just for understanding the logic):
lim = 10**8
total=0
found= set([])
for start in xrange(1,int(lim**0.5)):
    s=start**2
    for i in xrange(start+1,int(lim**0.5)):
        s += i**2
        if s>lim:
            break
        if str(s) == str(s)[::-1]:
            found.add(s)
print sum(found)
The same code written in C# is as follows:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
public static bool isPalindrome(string s)
{
string temp = "";
for (int i=s.Length-1;i>=0;i-=1){temp+=s[i];}
return (temp == s);
}
static void Main(string[] args)
{
int lim = Convert.ToInt32(Math.Pow(10,8));
var found = new HashSet<int>();
for (int start = 1; start < Math.Sqrt(lim); start += 1)
{
int s = start *start;
for (int i = start + 1; start < Math.Sqrt(lim); i += 1)
{
s += i * i;
if (s > lim) { break; }
if (isPalindrome(s.ToString()))
{ found.Add(s); }
}
}
Console.WriteLine(found.Sum());
}
}
}
The code runs fine until it throws an exception at Console.WriteLine(found.Sum()); (line 31). Why can't I compute Sum() of the set found?
The sum is 2,906,969,179.
That is 759,485,532 greater than int.MaxValue (2,147,483,647), so summing as int overflows.
Change int to long, as in var found = new HashSet<long>();, to handle the value.
uint is also large enough for this particular value, but Enumerable.Sum has no uint overload, so I would recommend using long.
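The overflow described above can be reproduced in isolation. The following is a minimal sketch, not the asker's program: the class name SumOverflowSketch and the sample values are made up for illustration. It relies on Enumerable.Sum over int using checked arithmetic, and shows that projecting each element to long before summing avoids the exception even if the set itself stays a HashSet<int>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SumOverflowSketch
{
    // Sums ints as long, so the total may exceed int.MaxValue.
    public static long SumAsLong(IEnumerable<int> values)
    {
        return values.Sum(v => (long)v);
    }

    // Returns true if the plain int Sum() throws OverflowException.
    public static bool IntSumOverflows(IEnumerable<int> values)
    {
        try
        {
            values.Sum();
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    static void Main()
    {
        // Hypothetical sample data: two values whose total does not fit in an int.
        var found = new HashSet<int> { int.MaxValue, 1 };

        Console.WriteLine(IntSumOverflows(found)); // True
        Console.WriteLine(SumAsLong(found));       // 2147483648
    }
}
```

Declaring the set as HashSet<long> from the start, as the answer suggests, has the same effect and keeps the call site as a plain found.Sum().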

Concat error in C#

I have a quick sort program using lists.
The error is in the quick sort function return statement.
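System.Collections.Generic.List<int> does not contain a definition for Concat and the best extension method overload System.Linq.Enumerable.Concat<TSource>(System.Collections.Generic.IEnumerable<TSource>, System.Collections.Generic.IEnumerable<TSource>) has some invalid arguments.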
The code is as follows.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Enter the no. of elements: ");
int n = Convert.ToInt32(Console.ReadLine());
List<int> unsorted = new List<int>();
Console.WriteLine("Enter the elements: ");
for (int i = 0; i < n; i++)
{
unsorted.Add(Convert.ToInt32(Console.ReadLine()));
}
List<int> sorted = quicksort(unsorted);
foreach (int entry in sorted)
{
Console.Write(entry + "\t");
}
return;
} //end of main.
public static List<int> quicksort(List<int> given)
{
if (given.Count == 1)
return given;
int mid = given.Count / 2;
List<int> less = new List<int>();
List<int> big = new List<int>();
for (int a = 0; a < given.Count; a++)
{
if (given[a] < mid)
{
less.Add(given[a]);
}
else
big.Add(given[a]);
}
return (quicksort(less).Concat(given[mid]).Concat(quicksort(big)));
}
}//end of class.
}//end of namespace.
You can't Concat an int into an IEnumerable<int>. You could instead wrap it in an array and Concat that to your other lists:
return quicksort(less)
.Concat(new[] { given[mid] })
.Concat(quicksort(big))
.ToList();
As the error is trying to tell you, the Concat() method takes a collection of items to concatenate, not a single int.
I think adding given[mid] to the resulting list is a mistake, since the else branch already puts it into big, so it would appear in your result twice.
Also, you need to test against given[mid], not mid.
So you should change your if statement to:
if (given[a] < given[mid])
less.Add(given[a]);
else if (given[a] > given[mid])
big.Add(given[a]);
This assumes that all numbers are unique. If the value given[mid] occurs more than once, the duplicates go into neither list and are lost from the result.
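Putting the points from these answers together, here is a minimal sketch of a list-based quicksort that also copes with duplicates and empty sublists. It is one possible arrangement, not the asker's final code: the class name QuickSortSketch and the third list, equal, are additions for illustration. Values equal to the pivot go into their own list, and the base case is Count <= 1, because after the fix less or greater can be empty and given[mid] would throw on an empty list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QuickSortSketch
{
    public static List<int> QuickSort(List<int> given)
    {
        // Empty and single-element lists are already sorted.
        if (given.Count <= 1)
            return given;

        int pivot = given[given.Count / 2];
        var less = new List<int>();
        var equal = new List<int>();   // keeps every copy of the pivot value
        var greater = new List<int>();

        foreach (int value in given)
        {
            if (value < pivot)
                less.Add(value);
            else if (value > pivot)
                greater.Add(value);
            else
                equal.Add(value);
        }

        // Concat takes sequences, so all three parts are lists here.
        return QuickSort(less).Concat(equal).Concat(QuickSort(greater)).ToList();
    }

    static void Main()
    {
        var sorted = QuickSort(new List<int> { 5, 3, 8, 3, 1, 5 });
        Console.WriteLine(string.Join(" ", sorted)); // 1 3 3 5 5 8
    }
}
```

Because equal always contains at least the pivot, both recursive calls receive strictly shorter lists, so the recursion terminates.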

c# parallel arrays data from text file

Here's the problem: Index was outside the bounds of the array. Assignment: Write a program that determines the number of students who can still enroll in a given class. Design your solution using parallel arrays. Test your solution by retrieving the following data from a text file. Define an exception class for this problem if the current enrollment exceeds the maximum enrollment by more than three. Halt the program and display a message indicating which course is over-enrolled.
Here's the original code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
private static string[] classes = { "CS150", "CS250", "CS270", "CS300", "CS350" };
private static int[] currentEnrolled = { 18, 11, 9, 4, 20 };
private static int[] maxEnrollment = { 20, 20, 20, 20, 20 };
private static int currentEnrollment()
{
int enrolled = 0;
foreach (int i in currentEnrolled)
{
enrolled += i;
}
return enrolled;
}
private static void listClasses()
{
foreach (string i in classes)
{
Console.WriteLine("Class: {0}", i);
}
}
private static void ClassStatus()
{
for (int i = 0; i < currentEnrolled.Length; i++)
{
Console.WriteLine("Class: {0}, Max: {1}, Current: {2}, remaining: {3}", classes[i], maxEnrollment[i], currentEnrolled[i], maxEnrollment[i] - currentEnrolled[i]);
}
}
static void Main(string[] args)
{
Console.WriteLine("Currently Enrolled: {0}", currentEnrollment());
ClassStatus();
Console.ReadKey(false);
}
}
}
Now, I've been editing the above code to take a text file instead, however I get an error. Here's what I'm working with:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
private static string[] classes = new string[900];
private static int[] currentEnrolled = new int[900];
private static int[] maxEnrollment = new int[900];
private static int currentEnrollment()
{
int enrolled = 0;
foreach (int i in currentEnrolled)
{
enrolled += i;
}
return enrolled;
}
private static void listClasses()
{
foreach (string i in classes)
{
Console.WriteLine("Class: {0}", i);
}
}
private static void ClassStatus()
{
for (int i = 0; i < currentEnrolled.Length; i++)
{
Console.WriteLine("Class: {0}, Max: {1}, Current: {2}, remaining: {3}", classes[i], maxEnrollment[i], currentEnrolled[i], maxEnrollment[i] - currentEnrolled[i]);
}
}
static void Main(string[] args)
{
string[] lines = File.ReadAllLines("classes.txt");
int i = 0;
foreach (string line in File.ReadAllLines("classes.txt"))
{
string[] parts = line.Split(',');
while (i < 900 && i < parts.Length)
{
classes[i] = parts[1];
currentEnrolled[i] = int.Parse(parts[2]);
maxEnrollment[i] = int.Parse(parts[3]);
}
i++;
}
Console.WriteLine("Currently Enrolled: {0}", currentEnrollment());
ClassStatus();
Console.ReadKey(false);
}
}
}
Some of the components used in the above code were taken from this article: Splitting data from a text file into parallel arrays
Text file looks like this:
CS150,18,20
CS250,11,20
CS270,32,25
CS300,4,20
CS350,20,20
Any assistance will be appreciated. And yes, this is an assignment. Programming is most definitely not my strong suit.
There seem to be multiple problems with your while loop.
First, parts.Length will always be 3, since each line has 2 commas and you split on them. So the condition i < 900 && i < parts.Length does not really make sense: it amounts to i < 900 and i < 3, so it will always stop at 3. The intent is not really clear here. I think you meant to loop over each of the 900 values, but if so, foreach already does that for you.
Next, since there are 3 parts and C# arrays are 0-based, it should be parts[0], parts[1] and parts[2]. That's what is causing your 'out of range' exception.
Finally, i++; should be in your while loop. If you leave it outside, you will loop forever as the index will never increase.
Basically, it should be something like this :
while (i < 900)
{
classes[i] = parts[0];
currentEnrolled[i] = int.Parse(parts[1]);
maxEnrollment[i] = int.Parse(parts[2]);
i++;
}
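Even so, the 900 is not really clear, since you don't have 900 values per line (remember you're in a foreach). Placed inside the foreach, this while loop would copy the same line into every remaining slot, so an if (i < 900) guard is probably closer to the intent. In my opinion you might as well scratch all that and redo it carefully.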
What you need to do, is the following :
Read the file and store all the lines
Foreach line do:
Split the line in 3 parts
Store each separate part
Write results
For the "custom exception" part, you can add:
For each index up to the length of currentEnrolled do:
If currentEnrolled at the current index is greater than maxEnrollment at the current index (by more than three, per the assignment) do:
Throw a new exception with the class name at the current index
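The steps above can be sketched as follows. This is only an illustration under a few assumptions: the names OverEnrolledException, EnrollmentSketch, Load, Check and Tolerance are hypothetical, and the lines are supplied as an in-memory array so the example runs without a file (in the real program they would come from File.ReadAllLines("classes.txt")). A separate counter tracks how many entries were actually filled, so the 900-slot arrays are never read past the loaded data.

```csharp
using System;

// Hypothetical exception type for the assignment's over-enrollment rule.
class OverEnrolledException : Exception
{
    public OverEnrolledException(string course)
        : base("Course " + course + " is over-enrolled.") { }
}

static class EnrollmentSketch
{
    const int Tolerance = 3; // "by more than three", per the assignment text

    // Fills the parallel arrays from "name,current,max" lines; returns the number of entries stored.
    public static int Load(string[] lines, string[] classes, int[] current, int[] max)
    {
        int count = 0;
        foreach (string line in lines)
        {
            if (count >= classes.Length) break;   // arrays are full
            string[] parts = line.Split(',');
            if (parts.Length != 3) continue;      // skip malformed lines
            classes[count] = parts[0];
            current[count] = int.Parse(parts[1]);
            max[count] = int.Parse(parts[2]);
            count++;
        }
        return count;
    }

    // Throws for the first course whose enrollment exceeds its maximum by more than Tolerance.
    public static void Check(int count, string[] classes, int[] current, int[] max)
    {
        for (int i = 0; i < count; i++)
        {
            if (current[i] > max[i] + Tolerance)
                throw new OverEnrolledException(classes[i]);
        }
    }

    static void Main()
    {
        // Same content as the sample text file in the question.
        string[] lines = { "CS150,18,20", "CS250,11,20", "CS270,32,25", "CS300,4,20", "CS350,20,20" };
        var classes = new string[900];
        var current = new int[900];
        var max = new int[900];

        int count = Load(lines, classes, current, max);
        try
        {
            Check(count, classes, current, max);
            for (int i = 0; i < count; i++)
                Console.WriteLine("{0}: {1} seats remaining", classes[i], max[i] - current[i]);
        }
        catch (OverEnrolledException e)
        {
            Console.WriteLine(e.Message); // Course CS270 is over-enrolled.
        }
    }
}
```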

Changing one element of string in hashtable in C#

I have to write a program that uses a Hashtable whose keys and values are entered by the user. The program has to output all the keys, but if a key starts with a lowercase 'a', I have to make it start with an uppercase 'A'. I have a problem with that last step.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
Hashtable hashtable = new Hashtable();
for (int i = 0; i < 10; i++)
{
Console.WriteLine("Vnesete kluc");
string kluc = Console.ReadLine();
Console.WriteLine("Vnesete podatok");
string podatok = Console.ReadLine();
hashtable.Add(kluc, podatok);
}
foreach (string klucevi in hashtable.Keys)
{
if (klucevi[0] == 'a')
{
klucevi[0] = 'A';
}
Console.WriteLine(klucevi);
}
}
}
}
I'm getting an error on the line where I'm converting the first element of the string if it's 'a' to 'A'.
You can't change a key in place. The simplest approach is to check the key before you add it to the collection:
for (int i = 0; i < 10; i++)
{
Console.WriteLine("Vnesete kluc");
string kluc = Console.ReadLine();
if (kluc.StartsWith("a"))
kluc = "A" + kluc.Substring(1);
Console.WriteLine("Vnesete podatok");
string podatok = Console.ReadLine();
hashtable.Add(kluc, podatok);
}
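If the table is already populated, a key can still be "renamed" by removing the entry and adding it back under the new key. The sketch below illustrates that idea; the class and method names (RenameKeysSketch, CapitalizeLeadingA) are hypothetical. The keys to change are collected first, because a Hashtable must not be modified while it is being enumerated.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class RenameKeysSketch
{
    // Replaces every key starting with 'a' by the same key starting with 'A'.
    public static void CapitalizeLeadingA(Hashtable table)
    {
        // Collect first: modifying the table inside the foreach would throw.
        var toRename = new List<string>();
        foreach (string key in table.Keys)
        {
            if (key.Length > 0 && key[0] == 'a')
                toRename.Add(key);
        }

        foreach (string oldKey in toRename)
        {
            string newKey = "A" + oldKey.Substring(1);
            object value = table[oldKey];
            table.Remove(oldKey);
            table[newKey] = value; // overwrites an existing entry with the same new key
        }
    }

    static void Main()
    {
        var table = new Hashtable();
        table.Add("apple", "1");
        table.Add("banana", "2");

        CapitalizeLeadingA(table);

        Console.WriteLine(table.ContainsKey("Apple")); // True
        Console.WriteLine(table.ContainsKey("apple")); // False
    }
}
```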
Your problem has nothing to do with hash tables. You have a compilation error, because in .NET strings are immutable.
Secondly, and this is unrelated, a foreach loop variable cannot be assigned to.
So, instead of
foreach (string klucevi in *whatever*)
{
if (klucevi[0] == 'a')
{
klucevi[0] = 'A';
}
Console.WriteLine(klucevi);
}
use
foreach (string klucevi in *whatever*)
{
var temp = klucevi;
if (temp[0] == 'a')
{
StringBuilder sb = new StringBuilder(temp);
sb[0] = 'A';
temp = sb.ToString();
}
Console.WriteLine(temp);
}
Don't forget to include a using System.Text; directive.
UPDATE:
The answer above shows you a generic way to modify a string in .NET, not just to replace one character.
Furthermore, some people have raised concerns about the efficiency of the approach. They are mistaken. More information in this excellent article on Strings Undocumented.
UPDATE 2:
I like being challenged. While it is totally irrelevant for the question at hand, a discussion has emerged about the efficiency of using a StringBuilder compared to using "A" + temp.Substring(1). Because I like facts, and I assume some readers would agree, I ran a little benchmark.
I ran the tests on a 64 bit Windows 7 box with .NET 4.5 as both a 32 and a 64 bit process. It turns out that the StringBuilder approach is always faster than the alternative, by about 20%. Memory usage is approximately the same. YMMV.
For those who care to repeat the test, here's the source code:
using System;
using System.Diagnostics;
using System.Text;
static class Program
{
static void Main(string[] args)
{
for (int length = 50; length <= 51200; length = length * 2)
{
string input = new string(' ', length);
// warm up
PerformTest(input, 1);
// actual test
PerformTest(input, 100000);
}
}
static void PerformTest(string input, int iterations)
{
GC.Collect();
GC.WaitForFullGCComplete();
int gcRuns = GC.CollectionCount(0);
Stopwatch sw = Stopwatch.StartNew();
for (int i = iterations; i > 0; i--)
{
StringBuilder sb = new StringBuilder(input);
sb[0] = 'A';
input = sb.ToString();
}
long ticksWithStringBuilder = sw.ElapsedTicks;
int gcRunsWithStringBuilder = GC.CollectionCount(0) - gcRuns;
GC.Collect();
GC.WaitForFullGCComplete();
gcRuns = GC.CollectionCount(0);
sw = Stopwatch.StartNew();
for (int i = iterations; i > 0; i--)
{
input = "A" + input.Substring(1);
}
long ticksWithConcatSubstring = sw.ElapsedTicks;
int gcRunsWithConcatSubstring = GC.CollectionCount(0) - gcRuns;
if (iterations > 1)
{
Console.WriteLine(
"String length: {0, 5} With StringBuilder {1, 9} ticks {2, 5} GC runs, alternative {3, 9} ticks {4, 5} GC Runs, speed ratio {5:0.00}",
input.Length,
ticksWithStringBuilder, gcRunsWithStringBuilder,
ticksWithConcatSubstring, gcRunsWithConcatSubstring,
((double)ticksWithStringBuilder) / ((double)ticksWithConcatSubstring));
}
}
}

c# ref for speed

I fully understand the ref keyword in .NET.
Since it uses the same variable, would using ref instead of making a copy increase speed?
I found the bottleneck to be in password generation.
Here is my code:
protected internal string GetSecurePasswordString(string legalChars, int length)
{
Random myRandom = new Random();
string myString = "";
for (int i = 0; i < length; i++)
{
int charPos = myRandom.Next(0, legalChars.Length - 1);
myString = myString + legalChars[charPos].ToString();
}
return myString;
}
Is it better to put ref before legalChars?
Passing a string by value does not copy the string. It only copies the reference to the string. There's no performance benefit to passing the string by reference instead of by value.
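A small sketch may make the difference concrete. The names here (RefSketch, Echo, TryRebind, Rebind) are made up for illustration. Echo shows that the object arriving in a by-value parameter is the caller's own string object, not a copy of its characters. The other two methods show what ref actually changes: whether the method can make the caller's variable point at a different string.

```csharp
using System;

static class RefSketch
{
    // By value: the parameter holds a copy of the reference, so the same object comes back.
    public static string Echo(string s)
    {
        return s;
    }

    // By value: reassigning the parameter does not affect the caller's variable.
    public static void TryRebind(string s)
    {
        s = "changed";
    }

    // By ref: reassigning the parameter rebinds the caller's variable.
    public static void Rebind(ref string s)
    {
        s = "changed";
    }

    static void Main()
    {
        string big = new string('x', 100000);
        Console.WriteLine(object.ReferenceEquals(big, Echo(big))); // True

        string a = "original";
        TryRebind(a);
        Console.WriteLine(a); // original
        Rebind(ref a);
        Console.WriteLine(a); // changed
    }
}
```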
No, you shouldn't pass the string reference by reference.
However, you are creating several strings pointlessly. If you're creating long passwords, that could be why it's a bottleneck. Here's a faster implementation:
protected internal string GetSecurePasswordString(string legalChars, int length)
{
Random myRandom = new Random();
char[] chars = new char[length];
for (int i = 0; i < length; i++)
{
int charPos = myRandom.Next(0, legalChars.Length - 1);
chars[i] = legalChars[charPos];
}
return new string(chars);
}
However, it still has three big flaws:
It creates a new instance of Random each time. If you call this method twice in quick succession, you'll get the same password twice. Bad idea.
The upper bound specified in a Random.Next() call is exclusive - so you'll never use the last character of legalChars.
It uses System.Random, which is not meant to be in any way cryptographically secure. Given that this is meant to be for a "secure password" you should consider using something like System.Security.Cryptography.RandomNumberGenerator. It's more work to do so because the API is harder, but you'll end up with a more secure system (if you do it properly).
You might also want to consider using SecureString, if you get really paranoid.
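As a rough illustration of how the three flaws could be addressed together, here is a sketch built on System.Security.Cryptography.RandomNumberGenerator. It is one possible approach under stated assumptions, not a vetted security implementation: the class name PasswordSketch and the restriction to at most 256 legal characters are choices made for this example. It draws one random byte per attempt and rejects bytes that would bias the result, so every character in legalChars, including the last one, is equally likely.

```csharp
using System;
using System.Security.Cryptography;

static class PasswordSketch
{
    public static string Generate(string legalChars, int length)
    {
        if (string.IsNullOrEmpty(legalChars) || legalChars.Length > 256)
            throw new ArgumentException("legalChars must contain 1 to 256 characters.");
        if (length < 0)
            throw new ArgumentOutOfRangeException("length");

        // Bytes at or above this limit are rejected to avoid modulo bias.
        int limit = 256 - (256 % legalChars.Length);

        var chars = new char[length];
        var buffer = new byte[1];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            int i = 0;
            while (i < length)
            {
                rng.GetBytes(buffer);
                if (buffer[0] >= limit)
                    continue; // try again with a fresh byte
                chars[i] = legalChars[buffer[0] % legalChars.Length];
                i++;
            }
        }
        return new string(chars);
    }

    static void Main()
    {
        // The output differs on every run.
        Console.WriteLine(Generate("abcdefghijklmnopqrstuvwxyz0123456789", 16));
    }
}
```

Drawing a single byte at a time keeps the rejection logic easy to follow. Filling a larger buffer per call would reduce the number of calls into the generator at the cost of slightly more bookkeeping.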
Strings in .NET are immutable, so every modifying operation on a string results in the creation (and later garbage collection) of a new string. No performance gain would be achieved by using ref in this case. Instead, use StringBuilder.
A word about the general performance gain of passing a string by reference ("ref") instead of by value:
There is a performance gain, but it is very small!
Consider the program below, where a function is called 10,000,000 times with a string argument by value and by reference. The average times measured were:
ByValue: 249 milliseconds
ByReference: 226 milliseconds
In general "ref" is a little faster, but often it's not worth worrying about it.
Here is my code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
namespace StringPerformanceTest
{
class Program
{
static void Main(string[] args)
{
const int n = 10000000;
int k;
string time, s1;
Stopwatch sw;
// List for testing ("1", "2", "3" ...)
List<string> list = new List<string>(n);
for (int i = 0; i < n; i++)
list.Add(i.ToString());
// Test ByVal
k = 0;
sw = Stopwatch.StartNew();
foreach (string s in list)
{
s1 = s;
if (StringTestSubVal(s1)) k++;
}
time = GetElapsedString(sw);
Console.WriteLine("ByVal: " + time);
Console.WriteLine("123 found " + k + " times.");
// Test ByRef
k = 0;
sw = Stopwatch.StartNew();
foreach (string s in list)
{
s1 = s;
if (StringTestSubRef(ref s1)) k++;
}
time = GetElapsedString(sw);
Console.WriteLine("Time ByRef: " + time);
Console.WriteLine("123 found " + k + " times.");
}
static bool StringTestSubVal(string s)
{
if (s == "123")
return true;
else
return false;
}
static bool StringTestSubRef(ref string s)
{
if (s == "123")
return true;
else
return false;
}
static string GetElapsedString(Stopwatch sw)
{
if (sw.IsRunning) sw.Stop();
TimeSpan ts = sw.Elapsed;
return String.Format("{0:00}:{1:00}:{2:00}.{3:000}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds);
}
}
}
