Generating every character combination up to a certain word length - c#

I am doing a security presentation for my Computer and Information Security course in a few weeks time, and in this presentation I will be demonstrating the pros and cons of different attacks (dictionary, rainbow and bruteforce). I am do the dictionary and rainbow attacks fine but I need to generate the bruteforce attack on the fly. I need to find an algorithm that will let me cycle though every combination of letter, symbol and number up to a certain character length.
So as an example, for a character length of 12, the first and last few generations will be:
a
ab
abc
abcd
...
...
zzzzzzzzzzzx
zzzzzzzzzzzy
zzzzzzzzzzzz
But it will also use numbers and symbols, so it's quite hard for me to explain... but I think you get the idea. Using only symbols from the ASCII table is fine.
I can kind of picture using an ASCII function to do this with a counter, but I just can't work it out in my head. If anyone could provide some source code (I'll probably be using C#) or even some pseudo code that I can program a function from that'd be great.
Thank you in advance. :)

A recursive function will let you run through all combinations of ValidChars:
int maxlength = 12;
string ValidChars;
private void Dive(string prefix, int level)
{
level += 1;
foreach (char c in ValidChars)
{
Console.WriteLine(prefix + c);
if (level < maxlength)
{
Dive(prefix + c, level);
}
}
}
Assign the set of valid characters to ValidChars, the maximum length of string you want to maxlength, then call Dive("", 0); and away you go.

You need to generate all combinations of characters from a set of valid characters ; let's call this set validChars. Basically, each set of combinations of length N is a cartesian product of validChars with itself, N times. That's pretty easy to do using Linq:
char[] validChars = ...;
var combinationsOfLength1 =
from c1 in validChars
select new[] { c1 };
var combinationsOfLength2 =
from c1 in validChars
from c2 in validChars
select new[] { c1, c2 };
...
var combinationsOfLength12 =
from c1 in validChars
from c2 in validChars
...
from c12 in validChars
select new[] { c1, c2 ... c12 };
var allCombinations =
combinationsOfLength1
.Concat(combinationsOfLength2)
...
.Concat(combinationsOfLength12);
Obviously, you don't want to manually write the code for each length, especially if you don't know in advance the maximum length...
Eric Lippert has an article about generating the cartesian product of an arbitrary number of sequences. Using the CartesianProduct extension method provided by the article, you can generate all combinations of length N as follows:
var combinationsOfLengthN = Enumerable.Repeat(validChars, N).CartesianProduct();
Since you want all combinations from length 1 to MAX, you can do something like that:
var allCombinations =
Enumerable
.Range(1, MAX)
.SelectMany(N => Enumerable.Repeat(validChars, N).CartesianProduct());
allCombinations is an IEnumerable<IEnumerable<char>>, if you want to get the results as a sequence of strings, you just need to add a projection:
var allCombinations =
Enumerable
.Range(1, MAX)
.SelectMany(N => Enumerable.Repeat(validChars, N).CartesianProduct())
.Select(combination => new string(combination.ToArray()));
Note that it's certainly not the most efficient solution, but at least it's short and readable...

You can try this code, that use recursion to print all possible strings of 0 to stringsLenght chars lenght, composed by all combination of chars from firstRangeChar to lastRangeChar.
class BruteWriter
{
static void Main(string[] args)
{
var bw = new BruteWriter();
bw.WriteBruteStrings("");
}
private void WriteBruteStrings(string prefix)
{
Console.WriteLine(prefix);
if (prefix.Length == stringsLenght)
return;
for (char c = firstRangeChar; c <= lastRangeChar; c++)
WriteBruteStrings(prefix + c);
}
char firstRangeChar='A';
char lastRangeChar='z';
int stringsLenght=10;
}
This look to be faster than the solution of #dthorpe.I've compared the algorthms using this code:
class BruteWriter
{
static void Main(string[] args)
{
var st = new Stopwatch();
var bw = new BruteWriter();
st.Start();
bw.WriteBruteStrings("");
Console.WriteLine("First method: " + st.ElapsedMilliseconds);
for (char c = bw.firstRangeChar; c <= bw.lastRangeChar; c++)
bw.ValidChars += c;
st.Start();
bw.Dive("", 0);
Console.WriteLine("Second method: " + st.ElapsedMilliseconds);
Console.ReadLine();
}
private void WriteBruteStrings(string prefix)
{
if (prefix.Length == stringsLenght)
return;
for (char c = firstRangeChar; c <= lastRangeChar; c++)
WriteBruteStrings(prefix + c);
}
char firstRangeChar='A';
char lastRangeChar='R';
int stringsLenght=5;
int maxlength = 5;
string ValidChars;
private void Dive(string prefix, int level)
{
level += 1;
foreach (char c in ValidChars)
{
if (level <= maxlength)
{
Dive(prefix + c, level);
}
}
}
}
and, on my pc, I get these results:
First method: 247
Second method: 910

public void BruteStrings(int maxlength)
{
for(var i=1;i<i<=maxlength;i++)
BruteStrings(Enumerable.Repeat((byte)0,i));
}
public void BruteStrings(byte[] bytes)
{
Console.WriteLine(bytes
.Cast<char>()
.Aggregate(new StringBuilder(),
(sb,c) => sb.Append(c))
.ToString());
if(bytes.All(b=>b.MaxValue)) return;
bytes.Increment();
BruteStrings(bytes);
}
public static void Increment(this byte[] bytes)
{
bytes.Last() += 1;
if(bytes.Last == byte.MinValue)
{
var lastByte = bytes.Last()
bytes = bytes.Take(bytes.Count() - 1).ToArray().Increment();
bytes = bytes.Concat(new[]{lastByte});
}
}

Another alternative i did, that return a string.
I did not care about the performance of the thing since it was not for a real world scenario.
private void BruteForcePass(int maxLength)
{
var tempPass = "";
while (tempPass.Length <= maxLength)
{
tempPass = GetNextString(tempPass);//Use char from 32 to 256
//Do what you want
}
}
private string GetNextString(string initialString, int minChar= 32, int maxChar = 256)
{
char nextChar;
if (initialString.Length == 0)
{
nextChar = (char)minChar;//the initialString Length will increase
}
else if (initialString.Last() == (char)maxChar)
{
nextChar = (char)minChar;
var tempString = initialString.Substring(0, initialString.Length -1);//we need to increment the char just before the last one
initialString = GetNextString(tempString, minChar, maxChar);
}
else
{
nextChar = (char)(initialString.Last() + 1);//Get the lash Char and increment it;
initialString= initialString.Remove(initialString.Length - 1);//Remove the last char.
}
return initialString + nextChar;
}

Related

How to skip over characters already looped through in a string

I have a function findChar() that loops through a string to find occurrences of characters in a string Ex: "Cincinnati" (2 Cs, 2 i's, etc) but once it finds another 'C' or 'i' it will return the values again
public static int findChar(string name, char c)
{
int count = 0;
for (int i = 0; i < name.Length; i++)
{
if (name[i] == c || name[i] == Char.ToUpper(c) || name[i] == Char.ToLower(c))
{
count++;
}
}
return count;
}
static void Main(string[] args)
{
string name = "Cincinnati";
char c = ' ' ;
int count = 0;
for (int i = 0; i < name.Length; i++)
{
c = name[i];
count = findChar(name, c);
Console.WriteLine(count);
}
}
My Output looks like this:
2
3
3
2
3
3
3
1
1
3
And I need it to be like:
2
3
3
1
1
Option 1: keep track of the letters you already processed, and ignore it if you already did
Option 2: use System.Linq's GroupBy to get the count
public static void Main()
{
var name = "Cincinatti";
var letters = name.ToLower().GroupBy(letter => letter);
foreach (var group in letters) {
Console.WriteLine(group.Count());
}
}
There are many ways to solve a problem like this. First let's discuss a problem it looks like you've already run into, capitalization. Lower case and upper case versions of the same letter are classified as different characters. The easiest way to combat this is to either convert the string to lowercase or uppercase so each duplicate letter can also be classified as a duplicate character. You can do this either by using the String.ToLower() method or the String.ToUpper() method depending on which one you want to use.
The way to solve this that is the closest to what you have is to just create a list, add letters to it as you process them, then use the list to check what letters you've processed already. It would look something like this:
static void Main(string[] args)
{
string name = "Cincinnati";
char c = ' ' ;
int count = 0;
var countedLetters = new List<string>();
for (int i = 0; i < name.Length; i++)
{
c = name[i];
char cLower = char.ToLower(c);
if(countedLetters.Contains(cLower))
{
continue;
}
countedLetters.Add(cLower);
count = findChar(name, c);
Console.WriteLine(count);
}
}
Although, you can usually use System.Linq's Enumerable extension methods to do things like this pretty easily.
Not deviating too much from what you have, another solution using System.Linq would be to just get the distinct characters and loop through that instead of the entire string. When doing this, we need to convert the entire string to either upper or lower case in order for linq to return the expected result. This would like something like this:
static void Main(string[] args)
{
string name = "Cincinnati";
string nameLower = name.ToLower();
int count = 0;
foreach(char c in nameLower.Distinct())
{
count = findChar(name, c);
Console.WriteLine(count);
}
}
Then finally, you can simplify this a ton by leaning heavily into the linq route. GroupBy is very useful for this because it's entire purpose is to group duplicates together. There are many ways to implement this and two have already be provided, so I will just provide a third.
public static void Main()
{
string name = "Cincinatti";
int[] counts = name.ToLower()
.GroupBy(letter => letter)
.Select(group => group.Count())
.ToArray();
Console.WriteLine(string.Join("\n", counts));
}
you can do group by list, sort optional (i left it commented out) and then select count
var word="Cincinnati";
var groups = word.ToLower().GroupBy(n => {return n;})
.Select(n => new
{
CharachterName = n.Key,
CharachterCount = n.Count()
});
// .OrderBy(n => n.CharachterName);
Console.WriteLine(JsonConvert.SerializeObject(groups.Select(i=>i.CharachterCount)));

How to go out of bounds back to the start of an array?

I'm making a Caesar cipher, currently I have a shift of three that I want to encrypt the message with. If any letter has "x, y or z" it will give me an out of bounds array error (because the shift is 3).
How can I pass the error by going back to the start of the array, but ending with the remainder of the shift?
This is my code currently:
using System;
using System.Text;
//caesar cipher
namespace Easy47
{
class Program
{
static void Main(string[] args)
{
const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var input = Input(alphabet);
Encrypt(3, input, alphabet);
Console.WriteLine();
}
private static void Encrypt(int shift, string input, string alphabet)
{
var message = new StringBuilder();
foreach (char letter in input)
{
for (int i = 0; i < alphabet.Length; i++)
{
if (letter == alphabet[i])
{
message.Append(alphabet[i + shift]);
}
}
}
Console.WriteLine("\n" + message);
}
private static string Input(string alphabet)
{
Console.Write("Input your string\n\n> ");
string input = Console.ReadLine().ToUpper();
return input;
}
}
}
You use the modulo operator:
var i = 255
var z = i % 200 // z == 55
in your case here:
for (int i = 0; i < alphabet.Length; i++)
{
if (letter == alphabet[i])
{
message.Append(alphabet[ (i + shift) % alphabet.Length]);
}
}
If after your addition of shift the index is bigger then alphabet.Length it will start at 0 again.
See C# Ref Modulo Operator
Unrelated, but your loop is not very efficient. A message of "ZZZZZ" would go 5 times through your full alphabet to get translated. You should use a Dictionary as lookup. You can create it at the start before you translate the message and then your lookup is very fast - thats what dictionarys excel at. O(1) lookups :o)
If you know a little about linq, this should be understandable:
// needs: using System.Linq;
private static void Encrypt(int shift, string input, string alphabet)
{
var message = new StringBuilder();
// create a string that is shifted by shift characters
// skip: skips n characters, take: takes n characters
// string.Join reassables the string from the enumerable of chars
var moved = string.Join("",alphabet.Skip(shift))+string.Join("",alphabet.Take(shift));
// the select iterates through your alphabet, c is the character you currently handle,
// i is the index it is at inside of alphabet
// the rest is a fancy way of creating a dictionary for
// a->d
// b->e
// etc using alphabet and the shifted lookup-string we created above.
var lookup = alphabet
.Select( (c,i)=> new {Orig=c,Chiff=moved[i]})
.ToDictionary(k => k.Orig, v => v.Chiff);
foreach (char letter in input)
{
// if the letter is not inside your alphabet, you might want to add
// it "as-is" in a else-branch. (Numbers or dates or .-,?! f.e.)
if (lookup.ContainsKey(letter))
{
message.Append(lookup[letter]);
}
}
Console.WriteLine("\n" + message);
}

Add separator to string at every N characters?

I have a string which contains binary digits. How to separate string after each 8 digit?
Suppose the string is:
string x = "111111110000000011111111000000001111111100000000";
I want to add a separator like ,(comma) after each 8 character.
output should be :
"11111111,00000000,11111111,00000000,11111111,00000000,"
Then I want to send it to a list<> last 8 char 1st then the previous 8 chars(excepting ,) and so on.
How can I do this?
Regex.Replace(myString, ".{8}", "$0,");
If you want an array of eight-character strings, then the following is probably easier:
Regex.Split(myString, "(?<=^(.{8})+)");
which will split the string only at points where a multiple of eight characters precede it.
Try this:
var s = "111111110000000011111111000000001111111100000000";
var list = Enumerable
.Range(0, s.Length/8)
.Select(i => s.Substring(i*8, 8));
var res = string.Join(",", list);
There's another Regex approach:
var str = "111111110000000011111111000000001111111100000000";
# for .NET 4
var res = String.Join(",",Regex.Matches(str, #"\d{8}").Cast<Match>());
# for .NET 3.5
var res = String.Join(",", Regex.Matches(str, #"\d{8}")
.OfType<Match>()
.Select(m => m.Value).ToArray());
...or old school:
public static List<string> splitter(string in, out string csv)
{
if (in.length % 8 != 0) throw new ArgumentException("in");
var lst = new List<string>(in/8);
for (int i=0; i < in.length / 8; i++) lst.Add(in.Substring(i*8,8));
csv = string.Join(",", lst); //This we want in input order (I believe)
lst.Reverse(); //As we want list in reverse order (I believe)
return lst;
}
Ugly but less garbage:
private string InsertStrings(string s, int insertEvery, char insert)
{
char[] ins = s.ToCharArray();
int length = s.Length + (s.Length / insertEvery);
if (ins.Length % insertEvery == 0)
{
length--;
}
var outs = new char[length];
long di = 0;
long si = 0;
while (si < s.Length - insertEvery)
{
Array.Copy(ins, si, outs, di, insertEvery);
si += insertEvery;
di += insertEvery;
outs[di] = insert;
di ++;
}
Array.Copy(ins, si, outs, di, ins.Length - si);
return new string(outs);
}
String overload:
private string InsertStrings(string s, int insertEvery, string insert)
{
char[] ins = s.ToCharArray();
char[] inserts = insert.ToCharArray();
int insertLength = inserts.Length;
int length = s.Length + (s.Length / insertEvery) * insert.Length;
if (ins.Length % insertEvery == 0)
{
length -= insert.Length;
}
var outs = new char[length];
long di = 0;
long si = 0;
while (si < s.Length - insertEvery)
{
Array.Copy(ins, si, outs, di, insertEvery);
si += insertEvery;
di += insertEvery;
Array.Copy(inserts, 0, outs, di, insertLength);
di += insertLength;
}
Array.Copy(ins, si, outs, di, ins.Length - si);
return new string(outs);
}
If I understand your last requirement correctly (it's not clear to me if you need the intermediate comma-delimited string or not), you could do this:
var enumerable = "111111110000000011111111000000001111111100000000".Batch(8).Reverse();
By utilizing morelinq.
Here my two little cents too. An implementation using StringBuilder:
public static string AddChunkSeparator (string str, int chunk_len, char separator)
{
if (str == null || str.Length < chunk_len) {
return str;
}
StringBuilder builder = new StringBuilder();
for (var index = 0; index < str.Length; index += chunk_len) {
builder.Append(str, index, chunk_len);
builder.Append(separator);
}
return builder.ToString();
}
You can call it like this:
string data = "111111110000000011111111000000001111111100000000";
string output = AddChunkSeparator(data, 8, ',');
One way using LINQ:
string data = "111111110000000011111111000000001111111100000000";
const int separateOnLength = 8;
string separated = new string(
data.Select((x,i) => i > 0 && i % separateOnLength == 0 ? new [] { ',', x } : new [] { x })
.SelectMany(x => x)
.ToArray()
);
I did it using Pattern & Matcher as following way:
fun addAnyCharacter(input: String, insertion: String, interval: Int): String {
val pattern = Pattern.compile("(.{$interval})", Pattern.DOTALL)
val matcher = pattern.matcher(input)
return matcher.replaceAll("$1$insertion")
}
Where:
input indicates Input string. Check results section.
insertion indicates Insert string between those characters. For example comma (,), start(*), hash(#).
interval indicates at which interval you want to add insertion character.
input indicates Input string. Check results section. Check results section; here I've added insertion at every 4th character.
Results:
I/P: 1234XXXXXXXX5678 O/P: 1234 XXXX XXXX 5678
I/P: 1234567812345678 O/P: 1234 5678 1234 5678
I/P: ABCDEFGHIJKLMNOP O/P: ABCD EFGH IJKL MNOP
Hope this helps.
As of .Net 6, you can simply use the IEnumerable.Chunk method (Which splits elements of a sequence into chunks) then reconcatenate the chunks using String.Join.
var text = "...";
string.Join(',', text.Chunk(size: 6).Select(x => new string(x)));
This is much faster without copying array (this version inserts space every 3 digits but you can adjust it to your needs)
public string GetString(double valueField)
{
char[] ins = valueField.ToString().ToCharArray();
int length = ins.Length + (ins.Length / 3);
if (ins.Length % 3 == 0)
{
length--;
}
char[] outs = new char[length];
int i = length - 1;
int j = ins.Length - 1;
int k = 0;
do
{
if (k == 3)
{
outs[i--] = ' ';
k = 0;
}
else
{
outs[i--] = ins[j--];
k++;
}
}
while (i >= 0);
return new string(outs);
}
For every 1 character, you could do this one-liner:
string.Join(".", "1234".ToArray()) //result: 1.2.3.4
If you intend to create your own function to acheive this without using regex or pattern matching methods, you can create a simple function like this:
String formatString(String key, String seperator, int afterEvery){
String formattedKey = "";
for(int i=0; i<key.length(); i++){
formattedKey += key.substring(i,i+1);
if((i+1)%afterEvery==0)
formattedKey += seperator;
}
if(formattedKey.endsWith("-"))
formattedKey = formattedKey.substring(0,formattedKey.length()-1);
return formattedKey;
}
Calling the mothod like this
formatString("ABCDEFGHIJKLMNOPQRST", "-", 4)
Would result in the return string as this
ABCD-EFGH-IJKL-MNOP-QRST
A little late to the party, but here's a simplified LINQ expression to break an input string x into groups of n separated by another string sep:
string sep = ",";
int n = 8;
string result = String.Join(sep, x.InSetsOf(n).Select(g => new String(g.ToArray())));
A quick rundown of what's happening here:
x is being treated as an IEnumerable<char>, which is where the InSetsOf extension method comes in.
InSetsOf(n) groups characters into an IEnumerable of IEnumerable -- each entry in the outer grouping contains an inner group of n characters.
Inside the Select method, each group of n characters is turned back into a string by using the String() constructor that takes an array of chars.
The result of Select is now an IEnumerable<string>, which is passed into String.Join to interleave the sep string, just like any other example.
I am more than late with my answer but you can use this one:
static string PutLineBreak(string str, int split)
{
for (int a = 1; a <= str.Length; a++)
{
if (a % split == 0)
str = str.Insert(a, "\n");
}
return str;
}

How to get continuous characters in C#?

I've a
List<String> MyList=new List<string>();
I need to fill the list MyList with n values.
if the value of n is 2 then the list MyList will contain
"A","B"
if 10 then
"A","B","C"....."J"
if 30 then
"A"....."Z","AA","AB",AC","AD"
if 1000 then
"A",....."Z","AA","AB"......"AZ","BA","BB"......."BZ"........"YZ","AAA",AAB".....
and so on
I do not know how to do this.
Please help me to do this using any method Using LINQ or LAMBDA Expression
Edit 2:
This is probably the easiest way to implement it. I tested it, it works fine. You could generate a infinite number of strings.
public IEnumerable<string> GenerateStrings()
{
foreach(string character in Alphabet())
{
yield return character;
}
foreach (string prefix in GenerateStrings())
{
foreach(string suffix in Alphabet())
{
yield return prefix + suffix;
}
}
}
public IEnumerable<string> Alphabet()
{
for(int i = 0; i < 26; i++)
{
yield return ((char)('A' + i)).ToString();
}
}
Stuff I wrote before:
You could also write a little recursive function which returns any string by a certain index. This may not be optimal performance wise, because there are some repetitive divisions, but it may be fast enough for your purpose.
It is quite short and easy:
string GetString(int index)
{
if (index < 26)
{
return ((char)('A' + index)).ToString();
}
return GetString(index / 26 - 1) + GetString(index % 26);
}
usage (may also be put into another method:
List<string> strings = Enumerable.Range(0, 1000)
.Select(x => GetString(x))
.ToList();
This is working code, just wrote a test for it.
Edit: eg, the "full linq way" application of GetString:
public void IEnumerale<string> GenerateStrings()
{
int index = 0;
// generate "infinit" number of values ...
while (true)
{
// ignoring index == int.MaxValue
yield return GetString(index++);
}
}
List<string> strings = GenerateStrings().Take(1000).ToList();
I did something similar in SQL a while back.
Translated to C# this is a function to create a code from a number:
public static string GetCode(int id) {
string code, chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (id <= chars.Length) {
code = chars.Substring(id - 1, 1);
} else {
id--;
int value = chars.Length, adder = 0;
while (id >= value * (chars.Length + 1) + adder) {
adder += value;
value *= chars.Length;
}
code = chars.Substring((id - adder) / value - 1, 1);
id = ((id - adder) % value);
while (value > 1) {
value /= chars.Length;
code += chars.Substring(id / value, 1);
id = id % value;
}
}
return code;
}
Then you just get numbers from 1 and up, and translate into codes:
var codes = Enumerable.Range(1, 1000).Select(n => GetCode(n));
The limit of the function is currently "ZZZZZZ" or 321272406. (After that you get a division by zero.)
Note that this function uses all combinations and returns "A".."Z", "AA".."ZZ", "AAA"..."ZZZ" rather than starting at "AB" and "ABC".
This is similar to this question (but not quite enough to mark it as a duplicate, and it's a hard problem to search for anyway).
Use any of the working IEnumerable<string> answers (or at least, any which cover the range you need), and then if you need to create a list with a certain number of elements, just use:
List<string> list = GenerateSequence().Take(count).ToList();
This code is working fine, but I'm not sure if it's "LINQ enough".
char[] validChars = Enumerable.Range(0, 26).Select(i => (char)('A' + i)).ToArray();
List<string> result = new List<string>();
List<string> generator = validChars.Select(ch => ch.ToString()).ToList();
int n = 1000;
while (result.Count < n)
{
result.AddRange(generator);
generator = generator.Take((n - result.Count) / validChars.Length + 1)
.SelectMany(s => validChars.Select(ch => s + ch))
.ToList();
}
var output = result.Take(n);
Try the below .. I am using Cross Join and a Union to build the source and then filtering the record by using the Take extension method
char[] charArray = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();
List<String> MyList = new List<string>();
int n = 1000;
(from value1 in charArray
select new
{
newString = value1.ToString()
})
.Union
(
(from value1 in charArray
from value2 in charArray
select new
{
newString = string.Concat(value1, value2)
})
)
.Union
(
(from value1 in charArray
from value2 in charArray
from value3 in charArray
select new
{
newString = string.Concat(value1, value2, value3)
})
)
.Take(n)
.ToList()
.ForEach(i => MyList.Add(i.newString));
Hope this will give you some idea of using a combination of Linq,Lambda and Extension method.
#DannyChen this is it. your code with a little changes..
char[] validChars = Enumerable.Range(0, 26).Select(i => (char)('A' + i)).ToArray();
int n = 30;
int pointer = 0;
int pointerSec = 0;
int Deg = 0;
string prefix = string.Empty;
string prefixMore = string.Empty;
List<string> result = new List<string>();
while (n > 0)
{
result.AddRange(validChars.Skip(pointer).Select(ch => prefix + ch).Take(n));
if (pointer == 26)
{
pointer = -1;
Deg += 1;
prefixMore = "" + validChars[pointerSec];
pointerSec++;
n++;
}
else
{
if (Deg == 0)
{
prefix = "" + validChars[pointer];
}
else
{
prefix = prefixMore + validChars[pointer];
}
}
n--;
pointer++;
}
it is 100% correct.

How to make next step of a string. C#

The question is complicated but I will explain it in details.
The goal is to make a function which will return next "step" of the given string.
For example
String.Step("a"); // = "b"
String.Step("b"); // = "c"
String.Step("g"); // = "h"
String.Step("z"); // = "A"
String.Step("A"); // = "B"
String.Step("B"); // = "C"
String.Step("G"); // = "H"
Until here its quite easy, But taking in mind that input IS string it can contain more than 1 characters and the function must behave like this.
String.Step("Z"); // = "aa";
String.Step("aa"); // = "ab";
String.Step("ag"); // = "ah";
String.Step("az"); // = "aA";
String.Step("aA"); // = "aB";
String.Step("aZ"); // = "ba";
String.Step("ZZ"); // = "aaa";
and so on...
This doesn't exactly need to extend the base String class.
I tried to work it out by each characters ASCII values but got stuck with strings containing 2 characters.
I would really appreciate if someone can provide full code of the function.
Thanks in advance.
EDIT
*I'm sorry I forgot to mention earlier that the function "reparse" the self generated string when its length reaches n.
continuation of this function will be smth like this. for example n = 3
String.Step("aaa"); // = "aab";
String.Step("aaZ"); // = "aba";
String.Step("aba"); // = "abb";
String.Step("abb"); // = "abc";
String.Step("abZ"); // = "aca";
.....
String.Step("zzZ"); // = "zAa";
String.Step("zAa"); // = "zAb";
........
I'm sorry I didn't mention it earlier, after reading some answers I realised that the problem was in question.
Without this the function will always produce character "a" n times after the end of the step.
NOTE: This answer is incorrect, as "aa" should follow after "Z"... (see comments below)
Here is an algorithm that might work:
each "string" represents a number to a given base (here: twice the count of letters in the alphabet).
The next step can thus be computed by parsing the "number"-string back into a int, adding 1 and then formatting it back to the base.
Example:
"a" == 1 -> step("a") == step(1) == 1 + 1 == 2 == "b"
Now your problem is reduced to parsing the string as a number to a given base and reformatting it. A quick googling suggests this page: http://everything2.com/title/convert+any+number+to+decimal
How to implement this?
a lookup table for letters to their corresponding number: a=1, b=2, c=3, ... Y = ?, Z = 0
to parse a string to number, read the characters in reverse order, looking up the numbers and adding them up:
"ab" -> 2*BASE^0 + 1*BASE^1
with BASE being the number of "digits" (2 count of letters in alphabet, is that 48?)
EDIT: This link looks even more promising: http://www.citidel.org/bitstream/10117/20/12/convexp.html
Quite collection of approaches, here is mine:-
The Function:
private static string IncrementString(string s)
{
byte[] vals = System.Text.Encoding.ASCII.GetBytes(s);
for (var i = vals.Length - 1; i >= 0; i--)
{
if (vals[i] < 90)
{
vals[i] += 1;
break;
}
if (vals[i] == 90)
{
if (i != 0)
{
vals[i] = 97;
continue;
}
else
{
return new String('a', vals.Length + 1);
}
}
if (vals[i] < 122)
{
vals[i] += 1;
break;
}
vals[i] = 65;
break;
}
return System.Text.Encoding.ASCII.GetString(vals);
}
The Tests
Console.WriteLine(IncrementString("a") == "b");
Console.WriteLine(IncrementString("z") == "A");
Console.WriteLine(IncrementString("Z") == "aa");
Console.WriteLine(IncrementString("aa") == "ab");
Console.WriteLine(IncrementString("az") == "aA");
Console.WriteLine(IncrementString("aZ") == "ba");
Console.WriteLine(IncrementString("zZ") == "Aa");
Console.WriteLine(IncrementString("Za") == "Zb");
Console.WriteLine(IncrementString("ZZ") == "aaa");
public static class StringStep
{
public static string Next(string str)
{
string result = String.Empty;
int index = str.Length - 1;
bool carry;
do
{
result = Increment(str[index--], out carry) + result;
}
while (carry && index >= 0);
if (index >= 0) result = str.Substring(0, index+1) + result;
if (carry) result = "a" + result;
return result;
}
private static char Increment(char value, out bool carry)
{
carry = false;
if (value >= 'a' && value < 'z' || value >= 'A' && value < 'Z')
{
return (char)((int)value + 1);
}
if (value == 'z') return 'A';
if (value == 'Z')
{
carry = true;
return 'a';
}
throw new Exception(String.Format("Invalid character value: {0}", value));
}
}
Split the input string into columns and process each, right-to-left, like you would if it was basic arithmetic. Apply whatever code you've got that works with a single column to each column. When you get a Z, you 'increment' the next-left column using the same algorithm. If there's no next-left column, stick in an 'a'.
I'm sorry the question is stated partly.
I edited the question so that it meets the requirements, without the edit the function would end up with a n times by step by step increasing each word from lowercase a to uppercase z without "re-parsing" it.
Please consider re-reading the question, including the edited part
This is what I came up with. I'm not relying on ASCII int conversion, and am rather using an array of characters. This should do precisely what you're looking for.
public static string Step(this string s)
{
char[] stepChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();
char[] str = s.ToCharArray();
int idx = s.Length - 1;
char lastChar = str[idx];
for (int i=0; i<stepChars.Length; i++)
{
if (stepChars[i] == lastChar)
{
if (i == stepChars.Length - 1)
{
str[idx] = stepChars[0];
if (str.Length > 1)
{
string tmp = Step(new string(str.Take(str.Length - 1).ToArray()));
str = (tmp + str[idx]).ToCharArray();
}
else
str = new char[] { stepChars[0], str[idx] };
}
else
str[idx] = stepChars[i + 1];
break;
}
}
return new string(str);
}
This is a special case of a numeral system. It has the base of 52. If you write some parser and output logic you can do any kind of arithmetics an obviously the +1 (++) here.
The digits are "a"-"z" and "A" to "Z" where "a" is zero and "Z" is 51
So you have to write a parser who takes the string and builds an int or long from it. This function is called StringToInt() and is implemented straight forward (transform char to number (0..51) multiply with 52 and take the next char)
And you need the reverse function IntToString which is also implementet straight forward (modulo the int with 52 and transform result to digit, divide the int by 52 and repeat this until int is null)
With this functions you can do stuff like this:
IntToString( StringToInt("ZZ") +1 ) // Will be "aaa"
You need to account for A) the fact that capital letters have a lower decimal value in the Ascii table than lower case ones. B) The table is not continuous A-Z-a-z - there are characters inbetween Z and a.
public static string stepChar(string str)
{
return stepChar(str, str.Length - 1);
}
public static string stepChar(string str, int charPos)
{
return stepChar(Encoding.ASCII.GetBytes(str), charPos);
}
public static string stepChar(byte[] strBytes, int charPos)
{
//Escape case
if (charPos < 0)
{
//just prepend with a and return
return "a" + Encoding.ASCII.GetString(strBytes);
}
else
{
strBytes[charPos]++;
if (strBytes[charPos] == 91)
{
//Z -> a plus increment previous char
strBytes[charPos] = 97;
return stepChar(strBytes, charPos - 1); }
else
{
if (strBytes[charPos] == 123)
{
//z -> A
strBytes[charPos] = 65;
}
return Encoding.ASCII.GetString(strBytes);
}
}
}
You'll probably want some checking in place to ensure that the input string only contains chars A-Za-z
Edit Tidied up code and added new overload to remove redundant byte[] -> string -> byte[] conversion
Proof http://geekcubed.org/random/strIncr.png
This is a lot like how Excel columns would work if they were unbounded. You could change 52 to reference chars.Length for easier modification.
static class AlphaInt {
private static string chars =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
public static string StepNext(string input) {
return IntToAlpha(AlphaToInt(input) + 1);
}
public static string IntToAlpha(int num) {
if(num-- <= 0) return "a";
if(num % 52 == num) return chars.Substring(num, 1);
return IntToAlpha(num / 52) + IntToAlpha(num % 52 + 1);
}
public static int AlphaToInt(string str) {
int num = 0;
for(int i = 0; i < str.Length; i++) {
num += (chars.IndexOf(str.Substring(i, 1)) + 1)
* (int)Math.Pow(52, str.Length - i - 1);
}
return num;
}
}
LetterToNum should be be a Function that maps "a" to 0 and "Z" to 51.
NumToLetter the inverse.
long x = "aazeiZa".Aggregate((x,y) => (x*52) + LetterToNum(y)) + 1;
string s = "";
do { // assertion: x > 0
var c = x % 52;
s = NumToLetter() + s;
x = (x - c) / 52;
} while (x > 0)
// s now should contain the result

Categories

Resources