I am trying to check the time complexity of the below simple program.
The program replaces spaces in a string with '%20'.
The loop to count spaces (O(1) time)
foreach (char k in s)
{
if (k == ' ')
{
spaces_cnt++;
}
}
The loop to replace the spaces (O(n) where n is size of string)
char[] c = new char[s.Length + spaces_cnt * 3];
int i = 0;
int j = 0;
while (i<s.Length)
{
if (s[i] != ' ')
{
c[j] = s[i];
j++;
i++;
}
else
{
c[j] = '%';
c[j + 1] = '2';
c[j + 2] = '0';
j = j + 3;
i++;
}
}
So I am guessing it is a "O(n) + O(1)" solution. Please correct me if I am wrong.
The loop to count spaces takes O(n), not O(1), since you’re iterating over – and performing a check on – each of the n characters in your string.
As you stated, the replacement loop takes O(n). Two O(n) operations performed sequentially have a combined complexity of O(n) (constant factors are discarded in Big-O notation).
P.S. You know that you can achieve the equivalent of all your code using a single line?
s = s.Replace(" ", "%20");
Looks like you're trying to encode a string. If that's the case, you can use the UrlPathEncode() method. If you're only trying to encode spaces, use Replace() (as mentioned by Douglas).
Related
Say we have the following strings that we pass as parameters to the function below:
string sString = "S104";
string sString2 = "AS105";
string sString3 = "ASRVT106";
I want to be able to extract the numbers from the string to place them in an int variable. Is there a quicker and/or more efficient way of removing the letters from the strings than the following code?: (*These strings will be populated dynamically at runtime - they are not assigned values at construction.)
Code:
public GetID(string sCustomTag = null)
{
m_sCustomTag = sCustomTag;
try {
m_lID = Convert.ToInt32(m_sCustomTag); }
catch{
try{
int iSubIndex = 0;
char[] subString = sCustomTag.ToCharArray();
//ITERATE THROUGH THE CHAR ARRAY
for (int i = 0; i < subString.Count(); i++)
{
for (int j = 0; j < 10; j++)
{
if (subString[i] == j)
{
iSubIndex = i;
goto createID;
}
}
}
createID: m_lID = Convert.ToInt32(m_sCustomTag.Substring(iSubIndex));
}
//IF NONE OF THAT WORKS...
catch(Exception e)
{
m_lID = 00000;
throw e;
}
}
}
}
I've done things like this before, but I'm not sure if there's a more efficient way to do it. If it was just going to be a single letter at the beginning, I could just set the subStringIndex to 1 every time, but the users can essentially put in whatever they want. Generally, they will be formatted to a LETTER-then-NUMBER format, but if they don't, or they want to put in multiple letters like sString2 or sString3, then I need to be able to compensate for that. Furthermore, if the user puts in some whacked-out, non-traditional format like string sString 4 = S51A24;, is there a way to just remove any and all letters from the string?
I've looked about, and can't find anything on MSDN or Google. Any help or links to it are greatly appreciated!
You can use a regular expression. It's not necessarily faster, but it's more concise.
string sString = "S104";
string sString2 = "AS105";
string sString3 = "ASRVT106";
var re = new Regex(#"\d+");
Console.WriteLine(re.Match(sString).Value); // 104
Console.WriteLine(re.Match(sString2).Value); // 105
Console.WriteLine(re.Match(sString3).Value); // 106
You can use a Regex, but it's probably faster to just do:
public int ExtractInteger(string str)
{
var sb = new StringBuilder();
for (int i = 0; i < str.Length; i++)
if(Char.IsDigit(str[i])) sb.Append(str[i]);
return int.Parse(sb.ToString());
}
You can simplify further with some LINQ at the expense of a small performance penalty:
public int ExtractInteger(string str)
{
return int.Parse(new String(str.Where(c=>Char.IsDigit(c)).ToArray()));
}
Now, if you only want to parse the first sequence of consecutive digits, do this instead:
public int ExtractInteger(string str)
{
return int.Parse(new String(str.SkipWhile(c=>!Char.IsDigit(c)).TakeWhile(c=>Char.IsDigit(c)).ToArray()));
}
Fastest is to parse the string without removing anything:
var s = "S51A24";
int m_lID = 0;
for (int i = 0; i < s.Length; i++)
{
int d = s[i] - '0';
if ((uint)d < 10)
m_lID = m_lID * 10 + d;
}
Debug.Print(m_lID + ""); // 5124
string removeLetters(string s)
{
for (int i = 0; i < s.Length; i++)
{
char c = s[i];
if (IsEnglishLetter(c))
{
s = s.Remove(i, 1);
}
}
return s;
}
bool IsEnglishLetter(char c)
{
return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
While you asked "what's the fastest way to remove characters..." what you really appear to be asking is "how do I create an integer by extracting only the digits from the string"?
Going with this assumption, your first call to Convert.ToInt32 will be slow for the case where you have other than digits because of the exception throwing.
Let's try another approach. Let's think about each of the cases.
The string always starts with a series of digits (e.g. 123ABC => 123)
The string always ends with a series of digits (e.g. ABC123 => 123)
A string has a series of contiguous digits in the middle (e.g. AB123C ==> 123)
The digits are possibly noncontiguous (e.g. A77C12 => 7712)
Case 4 is the "safest" assumption (after all, it is a superset of Case 1, 2 and 3. So, we need an algorithm for that. As a bonus I'll provide algorithms specialized to the other cases.
The Main Algorithm, All Cases
Using in-place unsafe iteration of the characters of the string, which uses fixed, we can extract digits and convert them to a single number without the data copy in ToCharArray(). We can also avoid the allocations of, say, a StringBuilder implementation and a possibly slow regex solution.
NOTE: This is valid C# code though it's using pointers. It does look like C++, but I assure you it's C#.
public static unsafe int GetNumberForwardFullScan(string s)
{
int value = 0;
fixed (char* pString = s)
{
var pChar = pString;
for (int i = 0; i != s.Length; i++, pChar++)
{
// this just means if the char is not between 0-9, we exit the loop (i.e. stop calculating the integer)
if (*pChar < '0' || *pChar > '9')
continue;
// running recalculation of the integer
value = value * 10 + *pChar - '0';
}
}
return value;
}
Running this against any of the inputs: "AS106RVT", "ASRVT106", "106ASRVT", or "1AS0RVT6" results in pulling out 1, 0, 6 and calculating on each digit as
0*10 + 1 == 1
1*10 + 0 == 10
10*10 + 6 == 106
Case 1 Only Algorithm (Digits at Start of String)
This algorithm is identical to the one above, but instead of continue we can break as soon as we reach a non-digit. This would be much faster if we can assume all the inputs start with digits and the strings are long.
Case 2 Only Algorithm (Digits at End of String)
This is almost the same as Case 1 Only except you have to
iterate from the end of the string to the beginning (aka backwards) stopping on the first non-digit
change the calculation to sum up powers of ten.
Both of those are a bit tricky, so here's what that looks like
public static unsafe int GetNumberBackward(string s)
{
int value = 0;
fixed (char* pString = s)
{
char* pChar = pString + s.Length - 1;
for (int i = 0; i != -1; i++, pChar--)
{
if (*pChar < '0' || *pChar > '9')
break;
value = (*pChar - '0') * (int)Math.Pow(10, i) + value;
}
}
return value;
}
So each of the iteration of the calculation looks like
6*100 + 0 == 6
0*101 + 6 == 6
1*102 + 6 == 106
While I used Math.Pow in these examples, you can find integer only versions that might be faster.
Cases 1-3 Only (i.e. All Digits Contiguous Somewhere in the String
This algorithm says to
Scan all non-digits
Then scan only digits
First non-digit after that, stop
It would look like
public static unsafe int GetContiguousDigits(string s)
{
int value = 0;
fixed (char* pString = s)
{
var pChar = pString;
// skip non-digits
int i = 0;
for (; i != s.Length; i++, pChar++)
if (*pChar >= '0' && *pChar <= '9')
break;
for (; i != s.Length; i++, pChar++)
{
if (*pChar < '0' || *pChar > '9')
break;
value = value * 10 + *pChar - '0';
}
}
return value;
}
For a string that may have zero or more hyphens in it, I need to extract all the different possibilities with and without hyphens.
For example, the string "A-B" would result in "A-B" and "AB" (two possibilities).
The string "A-B-C" would result in "A-B-C", "AB-C", "A-BC" and "ABC" (four possibilities).
The string "A-B-C-D" would result in "A-B-C-D", "AB-C-D", "A-BC-D", "A-B-CD", "AB-CD", "ABC-D", "A-BCD" and "ABCD" (eight possibilities).
...etc, etc.
I've experimented with some nested loops but haven't been able to get anywhere near the desired result. I suspect I need something recursive unless there is some simple solution I am overlooking.
NB. This is to build a SQL query (shame that SQL Server does't have MySQL's REGEXP pattern matching).
Here is one attempt I was working on. This might work if I do this recursively.
string keyword = "A-B-C-D";
List<int> hyphens = new List<int>();
int pos = keyword.IndexOf('-');
while (pos != -1)
{
hyphens.Add(pos);
pos = keyword.IndexOf('-', pos + 1);
}
for (int i = 0; i < hyphens.Count(); i++)
{
string result = keyword.Substring(0, hyphens[i]) + keyword.Substring(hyphens[i] + 1);
Response.Write("<p>" + result);
}
A B C D are words of varying length.
Take a look at your sample cases. Have you noticed a pattern?
With 1 hyphen there are 2 possibilities.
With 2 hyphens there are 4 possibilities.
With 3 hyphens there are 8 possibilities.
The number of possibilities is 2n.
This is literally exponential growth, so if there are too many hyphens in the string, it will quickly become infeasible to print them all. (With just 30 hyphens there are over a billion combinations!)
That said, for smaller numbers of hyphens it might be interesting to generate a list. To do this, you can think of each hyphen as a bit in a binary number. If the bit is 1, the hyphen is present, otherwise it is not. So this suggests a fairly straightforward solution:
Split the original string on the hyphens
Let n = the number of hyphens
Count from 2n - 1 down to 0. Treat this counter as a bitmask.
For each count begin building a string starting with the first part.
Concatenate each of the remaining parts to the string in order, preceded by a hyphen only if the corresponding bit in the bitmask is set.
Add the resulting string to the output and continue until the counter is exhausted.
Translated to code we have:
public static IEnumerable<string> EnumerateHyphenatedStrings(string s)
{
string[] parts = s.Split('-');
int n = parts.Length - 1;
if (n > 30) throw new Exception("too many hyphens");
for (int m = (1 << n) - 1; m >= 0; m--)
{
StringBuilder sb = new StringBuilder(parts[0]);
for (int i = 1; i <= n; i++)
{
if ((m & (1 << (i - 1))) > 0) sb.Append('-');
sb.Append(parts[i]);
}
yield return sb.ToString();
}
}
Fiddle: https://dotnetfiddle.net/ne3N8f
You should be able to track each hyphen position, and basically say its either there or not there. Loop through all the combinations, and you got all your strings. I found the easiest way to track it was using a binary, since its easy to add those with Convert.ToInt32
I came up with this:
string keyword = "A-B-C-D";
string[] keywordSplit = keyword.Split('-');
int combinations = Convert.ToInt32(Math.Pow(2.0, keywordSplit.Length - 1.0));
List<string> results = new List<string>();
for (int j = 0; j < combinations; j++)
{
string result = "";
string hyphenAdded = Convert.ToString(j, 2).PadLeft(keywordSplit.Length - 1, '0');
// Generate string
for (int i = 0; i < keywordSplit.Length; i++)
{
result += keywordSplit[i] +
((i < keywordSplit.Length - 1) && (hyphenAdded[i].Equals('1')) ? "-" : "");
}
results.Add(result);
}
This works for me:
Func<IEnumerable<string>, IEnumerable<string>> expand = null;
expand = xs =>
{
if (xs != null && xs.Any())
{
var head = xs.First();
if (xs.Skip(1).Any())
{
return expand(xs.Skip(1)).SelectMany(tail => new []
{
head + tail,
head + "-" + tail
});
}
else
{
return new [] { head };
}
}
else
{
return Enumerable.Empty<string>();
}
};
var keyword = "A-B-C-D";
var parts = keyword.Split('-');
var results = expand(parts);
I get:
ABCD
A-BCD
AB-CD
A-B-CD
ABC-D
A-BC-D
AB-C-D
A-B-C-D
I've tested this code and it is working as specified in the question. I stored the strings in a List<string>.
string str = "AB-C-D-EF-G-HI";
string[] splitted = str.Split('-');
List<string> finalList = new List<string>();
string temp = "";
for (int i = 0; i < splitted.Length; i++)
{
temp += splitted[i];
}
finalList.Add(temp);
temp = "";
for (int diff = 0; diff < splitted.Length-1; diff++)
{
for (int start = 1, limit = start + diff; limit < splitted.Length; start++, limit++)
{
int i = 0;
while (i < start)
{
temp += splitted[i++];
}
while (i <= limit)
{
temp += "-";
temp += splitted[i++];
}
while (i < splitted.Length)
{
temp += splitted[i++];
}
finalList.Add(temp);
temp = "";
}
}
I'm not sure your question is entirely well defined (i.e. could you have something like A-BCD-EF-G-H?). For "fully" hyphenated strings (A-B-C-D-...-Z), something like this should do:
string toParse = "A-B-C-D";
char[] toParseChars = toPase.toCharArray();
string result = "";
string binary;
for(int i = 0; i < (int)Math.pow(2, toParse.Length/2); i++) { // Number of subsets of an n-elt set is 2^n
binary = Convert.ToString(i, 2);
while (binary.Length < toParse.Length/2) {
binary = "0" + binary;
}
char[] binChars = binary.ToCharArray();
for (int k = 0; k < binChars.Length; k++) {
result += toParseChars[k*2].ToString();
if (binChars[k] == '1') {
result += "-";
}
}
result += toParseChars[toParseChars.Length-1];
Console.WriteLine(result);
}
The idea here is that we want to create a binary word for each possible hyphen. So, if we have A-B-C-D (three hyphens), we create binary words 000, 001, 010, 011, 100, 101, 110, and 111. Note that if we have n hyphens, we need 2^n binary words.
Then each word maps to the output you desire by inserting the hyphen where we have a '1' in our word (000 -> ABCD, 001 -> ABC-D, 010 -> AB-CD, etc). I didn't test the code above, but this is at least one way to solve the problem for fully hyphenated words.
Disclaimer: I didn't actually test the code
I want to create a new variable everytime a loop is run, for example:
for (char c = 'a'; c <= 'z'; c++;)
{
countA++;
// I want to create a new variable(countB, countC, etc.) every time
// the loop is run
//if you couldn't tell already, this loop counts letters in a string
}
As suggested by Paolo Costa, you could use an array:
string str = "lowercase string";
int[] counts = new int['z' - 'a' + 1];
for (int i = 0; i < str.Length; i++)
{
char ch = str[i];
if (ch < 'a' || ch > 'z')
{
continue;
}
counts[ch - 'a']++;
}
Note that in C# (and in .NET), chars are more similar to int than to string. They are a number (technically an integral type) that can have values between 0 and 65535, and that can be implicitly converted to int. Here I'm playing on it :-) a is 97 and z is 122. 'z' - 'a' + 1 is 122 - 97 + 1 == 26, and that is the size of the array you need. ch - 'a' tranforms the characters between a and z in numbers between 0 and 26. This because clearly 'a' - 'a' is 0, 'b' - 'a' is 1 and so on.
In order to count the characters you can use a single variable of Dictionary type as below.
Outside the loop
IDictionary<char, int> count = new Dictionary<char, int>();
Inside the loop increase the values as per the character which is found inside the loop.
...
...
count['a']++;
...
...
count['b']++;
I am guessing you wish to count small letter frequencies.
You could use :
int[] freq = new int[26]; // only 26 characters you are counting
for(char c = 'a'; c<='z'; c++)
{
freq[c-97] = data.Count(x => x == c); // data is a string of characters
}
First I want to go through what's wrong with your code.
You have no string that you're iterating through, using the postfix increment operator on a char only helps you iterate through a range of characters, but your aim isn't to iterate through a-z, but each char of a string.
string text = "This is a random string that I'll iterate through " +
"to find out how many instances of a character it contains";
Dictionary<char, int> Counter = new Dictionary<char, int>();
foreach (char c in text)
{
if (!Counter.ContainsKey(c))
{
Counter.Add(c, 0);
}
Counter[c] += 1;
}
foreach (var kv in Counter)
{
Console.WriteLine ("The character {0} occured {1} times", kv.Key, kv.Value);
}
So, I have to make a program that converts a decimal number to binary and prints it, but without using Convert. I got to a point where I can print out the number, but it's in reverse(for example: 12 comes out as 0011 instead of 1100), anyone has an idea how to fix that ? Here's my code:
Console.Write("Number = ");
int n = int.Parse(Console.ReadLine());
string counter = " ";
do
{
if (n % 2 == 0)
{
counter = "0";
}
else if (n % 2 != 0)
{
counter = "1";
}
Console.Write(counter);
n = n / 2;
}
while (n >= 1);
simple solution would be to add them at the beginning:
Console.Write("Number = ");
int n = int.Parse(Console.ReadLine());
string counter = "";
while (n >= 1)
{
counter = (n % 2) + counter;
n = n / 2;
}
Console.Write(counter);
You actually don't even need the if statement
Instead of write them inmediately you may insert them in a StringBuidler
var sb = new StringBuilder();
....
sb.Insert(0, counter);
And then use that StringBuilder
var result = sb.ToString();
You can reverse the String when you finish calculating it
You are generated the digits in the reverse order because you are starting with the least significiant digits when you use % 2 to determine the bit value. What you are doing is not bad though, as it is convenient way to determine the bits. All you have to do is reverse the output by collecting it until you have generated all of the bits then outputing everything in reverse order. ( I did not try to compile and run, may have a typo)
One easy solution is
System.Text.StringBuilder reversi = new System.Text.StringBuilder();
Then in your code replace
Console.Write(counter);
with
reversi.Append(counter);
Finally add the end of your loop, add code like this
string s = reversi.ToString();
for (int ii = s.Length-1; ii >= 0; --ii)
{
Console.Write(s[ii]);
}
There are better ways to do this, but this is easy to understand why it fixes your code -- It looks like you are trying to learn C#.
If I'm not mistaken
int value = 8;
string binary = Convert.ToString(value, 2);
I have this code,
void Generate(List<string> comb, string prefix, string remaining)
{
int currentDigit = Int32.Parse(remaining.Substring(0, 1));
if (remaining.Length == 1)
{
for (int i = 0; i < dictionary[currentDigit].Length; i++)
{
comb.Add(prefix + dictionary[currentDigit][i]);
}
}
else
{
for (int i = 0; i < dictionary[currentDigit].Length; i++)
{
Generate(comb, prefix + dictionary[currentDigit][i], remaining.Substring(1));
}
}
}
What is the time complexity of the above code?
Is it Generate is O(n) and that itself is being executed n times so O(n^2)?
dictionary is len = 10 and has phone keypads stored it in. 2 = "abc" etc.
The initial call to this code will be like
Generate(new List(), "", "12345");
Thanks.
Assume dictionary size is m and input string size is n (remaining) this will be:
T(1) = m + constant;
T(n) = m T(n-1) + O(n) ==> T(n) = O(m^n)
In fact in each running of else part, you will run m times, function of O(n).