How to skip over characters already looped through in a string - c#

I have a function findChar() that loops through a string and counts the occurrences of a character, e.g. "Cincinnati" (2 C's, 3 i's, etc.). The problem is that when the loop reaches another 'c' or 'i', it prints the count for that letter again.
public static int findChar(string name, char c)
{
int count = 0;
for (int i = 0; i < name.Length; i++)
{
if (name[i] == c || name[i] == Char.ToUpper(c) || name[i] == Char.ToLower(c))
{
count++;
}
}
return count;
}
static void Main(string[] args)
{
string name = "Cincinnati";
char c = ' ' ;
int count = 0;
for (int i = 0; i < name.Length; i++)
{
c = name[i];
count = findChar(name, c);
Console.WriteLine(count);
}
}
My Output looks like this:
2
3
3
2
3
3
3
1
1
3
And I need it to be like:
2
3
3
1
1

Option 1: keep track of the letters you have already processed, and skip a letter if it is already in that set.
Option 2: use System.Linq's GroupBy to get the counts:
public static void Main()
{
var name = "Cincinnati";
var letters = name.ToLower().GroupBy(letter => letter);
foreach (var group in letters) {
Console.WriteLine(group.Count());
}
}

There are many ways to solve a problem like this. First let's discuss a problem it looks like you've already run into, capitalization. Lower case and upper case versions of the same letter are classified as different characters. The easiest way to combat this is to either convert the string to lowercase or uppercase so each duplicate letter can also be classified as a duplicate character. You can do this either by using the String.ToLower() method or the String.ToUpper() method depending on which one you want to use.
The way to solve this that is the closest to what you have is to just create a list, add letters to it as you process them, then use the list to check what letters you've processed already. It would look something like this:
static void Main(string[] args)
{
string name = "Cincinnati";
char c = ' ' ;
int count = 0;
var countedLetters = new List<char>(); // holds the lower-case letters already counted
for (int i = 0; i < name.Length; i++)
{
c = name[i];
char cLower = char.ToLower(c);
if(countedLetters.Contains(cLower))
{
continue;
}
countedLetters.Add(cLower);
count = findChar(name, c);
Console.WriteLine(count);
}
}
Although, you can usually use System.Linq's Enumerable extension methods to do things like this pretty easily.
Not deviating too much from what you have, another solution using System.Linq would be to just get the distinct characters and loop through those instead of the entire string. When doing this, we need to convert the entire string to either upper or lower case in order for LINQ to return the expected result. This would look something like this:
static void Main(string[] args)
{
string name = "Cincinnati";
string nameLower = name.ToLower();
int count = 0;
foreach(char c in nameLower.Distinct())
{
count = findChar(name, c);
Console.WriteLine(count);
}
}
Then finally, you can simplify this a ton by leaning heavily into the LINQ route. GroupBy is very useful for this because its entire purpose is to group duplicates together. There are many ways to implement this and two have already been provided, so I will just provide a third.
public static void Main()
{
string name = "Cincinnati";
int[] counts = name.ToLower()
.GroupBy(letter => letter)
.Select(group => group.Count())
.ToArray();
Console.WriteLine(string.Join("\n", counts));
}
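If LINQ is not an option, the same counts can be produced in a single pass with a dictionary. This is only a minimal sketch: the names LetterCounter and CountInOrder are made up for illustration, and it assumes you want the counts in order of first appearance, ignoring case.

```csharp
using System;
using System.Collections.Generic;

public static class LetterCounter
{
    // Returns one count per distinct letter (case-insensitive),
    // in the order in which each letter first appears.
    public static List<int> CountInOrder(string text)
    {
        var counts = new Dictionary<char, int>();
        var order = new List<char>(); // remembers first-appearance order

        foreach (char raw in text)
        {
            char key = char.ToLowerInvariant(raw);
            if (counts.ContainsKey(key))
            {
                counts[key]++;
            }
            else
            {
                counts[key] = 1;
                order.Add(key);
            }
        }

        var result = new List<int>();
        foreach (char key in order)
        {
            result.Add(counts[key]);
        }
        return result;
    }
}
```

Calling LetterCounter.CountInOrder("Cincinnati") and printing each element gives 2, 3, 3, 1, 1, which is the output the question asks for.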

You can group the characters, optionally sort them (I left that commented out), and then select the counts:
var word = "Cincinnati";
var groups = word.ToLower().GroupBy(n => n)
.Select(n => new
{
CharacterName = n.Key,
CharacterCount = n.Count()
});
// .OrderBy(n => n.CharacterName);
Console.WriteLine(JsonConvert.SerializeObject(groups.Select(i => i.CharacterCount)));

Related

how do i check whether a number contains all the digits of another number?

I'm trying to do something where I have two numbers (let's say 123 and 5321), and I want to check whether the second number contains all the digits of the first number. If there is any way this could be done using if and for loops, that would help me so much, but any help is greatly appreciated! Oh, and it doesn't matter how many of the same digit a number has (33 and 503 still counts).
If we don't count number of digits (e.g. 22 appears in 123 even if 22 has two 2 when 123 has just one 2):
int first = 123;
int second = 5321;
// If second contains first
bool contains = !first
.ToString()
.Except(second.ToString())
.Any();
If number of digits matters (i.e. 22 doesn't appear in 123):
var dict = second
.ToString()
.GroupBy(d => d)
.ToDictionary(chunk => chunk.Key,
chunk => chunk.Count());
// If second contains first
bool contains = first
.ToString()
.GroupBy(d => d)
.All(chunk => dict.TryGetValue(chunk.Key, out var count) && count >= chunk.Count());
Edit: good old if and for loop solution:
string firstSt = first.ToString();
string secondSt = second.ToString();
// contains unless we find a counter example
bool contains = true;
for (int i = 0; i < firstSt.Length; ++i) {
char toFind = firstSt[i];
bool found = false;
for (int j = 0; j < secondSt.Length; ++j) {
if (toFind == secondSt[j]) {
found = true;
break;
}
}
// firstSt[i] is not found within secondSt
if (!found) {
contains = false;
break;
}
}
Approach with string and Contains()
int i1 = 5321, i2 = 123;
bool result = i1.ToString().All(i2.ToString().Contains); //false, 5 missing

Char/String comparison

I'm trying to add a suggestion feature to the search function in my program, e.g. I type janw doe in the search section and it outputs NO MATCH - did you mean jane doe? I'm not sure what the problem is; maybe it has something to do with char/string comparison. I've tried comparing both variables as type char, e.g. char temp --> temp.Contains ... etc., but an error appears (char does not contain a definition for Contains). Would love any help on this! 8)
if (found == false)
{
Console.WriteLine("\n\nMATCH NOT FOUND");
int charMatch = 0, charCount = 0;
string[] checkArray = new string[26];
//construction site /////////////////////////////////////////////////////////////////////////////////////////////////////////////
for (int controlLoop = 0; controlLoop < contPeople.Length; controlLoop++)
{
foreach (char i in userContChange)
{
charCount = charCount + 1;
}
for (int i = 0; i < userContChange.Length; )
{
string temp = contPeople[controlLoop].name;
string check=Convert.ToString(userContChange[i]);
if (temp.Contains(check))
{
charMatch = charMatch + 1;
}
}
int half = charCount / 2;
if (charMatch >= half)
{
checkArray[controlLoop] = contPeople[controlLoop].name;
}
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////
Console.WriteLine("Did you mean: ");
for (int a = 0; a < checkArray.Length; a++)
{
Console.WriteLine(checkArray[a]);
}
///////////////////////////////////////////////////////////////////////////////////////////////////
A string is made up of many characters. A character is a primitive, so it doesn't "contain" any other items. A string is basically an array of characters.
For comparing string and characters:
char a = 'A';
String alan = "Alan";
Debug.Assert(alan[0] == a);
Or, if you have a single-character string, I suppose:
char a = 'A';
String alan = "A";
Debug.Assert(alan == a.ToString());
All of these asserts are true
But, the main reason I wanted to comment on your question, is to suggest an alternative approach for suggesting "Did you mean?". There's an algorithm called Levenshtein Distance which calculates the "number of single character edits" required to convert one string to another. It can be used as a measure of how close two strings are. You may want to look into how this algorithm works because it could help you.
Here's an applet that I found which demonstrates: Approximate String Matching with k-differences
Also the wikipedia link Levenshtein distance
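To make the Levenshtein suggestion concrete, here is a minimal C# sketch of the standard dynamic-programming formulation, plus a helper that picks the closest candidate. The names Suggest, Distance and Closest, and the maxDistance cut-off, are assumptions made for this illustration and do not come from the question's code.

```csharp
using System;

public static class Suggest
{
    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    public static int Distance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];

        // Turning a prefix into the empty string costs one deletion per character.
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1,   // deletion
                             d[i, j - 1] + 1),  // insertion
                    d[i - 1, j - 1] + cost);    // substitution or match
            }
        }
        return d[a.Length, b.Length];
    }

    // Returns the candidate with the smallest distance to input,
    // or null if no candidate is within maxDistance edits.
    public static string Closest(string input, string[] candidates, int maxDistance)
    {
        string best = null;
        int bestDistance = maxDistance + 1;
        foreach (string candidate in candidates)
        {
            int distance = Distance(input, candidate);
            if (distance < bestDistance)
            {
                best = candidate;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

With this sketch, Suggest.Distance("janw doe", "jane doe") is 1, so Suggest.Closest("janw doe", names, 2) would return "jane doe" if it is among the names. The comparison is case-sensitive as written; lower-casing both strings first is a reasonable variation.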
The char type cannot have .Contains() because it is a value type holding a single character.
In your case (if I understand correctly), you probably need .Equals() or the == operator.
Note: in C#, the == operator on string compares values, not references, so either .Equals() or == works for comparing strings.
Hope this helps!
The char type doesn't have a Contains() method, but you can use it like this: 'a'.ToString().Contains(...)
If performance is not a concern, here is another simple way:
var input = "janw doe";
var people = new string[] { "abc", "123", "jane", "jane doe" };
var found = Array.IndexOf(people, input); // BinarySearch would need a sorted array; FirstOrDefault(), FindIndex, a search engine... also work
if (found < 0)//not found
{
var i = input.ToArray();
var target = "";
//most similar
//target = people.OrderByDescending(p => p.ToArray().Intersect(i).Count()).FirstOrDefault();
//as you code:
foreach (var p in people)
{
var count = p.ToArray().Intersect(i).Count();
if (count > input.Length / 2)
{
target = p;
break;
}
}
if (!string.IsNullOrWhiteSpace(target))
{
Console.WriteLine(target);
}
}

Find and replace text in a string using C#

Anyone know how I would find & replace text in a string? Basically I have two strings:
string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDABQODxIPDRQSERIXFhQYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f3//";
string secondS = "abcdefg2wBDABQODxIPDRQSERIXFh/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/abcdefg";
I want to search firstS to see if it contains any sequence of characters that's in secondS and then replace it. It also needs to be replaced with the number of replaced characters in square brackets:
[NUMBER-OF-CHARACTERS-REPLACED]
For example, because firstS and secondS both contain "2wBDABQODxIPDRQSERIXFh" and "/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/" they would need to be replaced. So then firstS becomes:
string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/[22]QYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39[61]f3//";
Hope that makes sense. I think I could do this with Regex, but I don't like the inefficiency of it. Does anyone know of another, faster way?
Does anyone know of another, faster way?
Yes, this problem actually has a proper name. It is called the Longest Common Substring, and it has a reasonably fast solution.
Here is an implementation on ideone. It finds and replaces all common substrings of ten characters or longer.
// Adapted from the Wikipedia article linked above. Returns the first longest
// common substring found; ties are ignored so the result is always a real substring of s.
private static string FindLcs(string s, string t) {
    var L = new int[s.Length, t.Length];
    var z = 0;    // length of the longest common substring found so far
    var ret = ""; // the substring itself
    for (var i = 0; i != s.Length; i++) {
        for (var j = 0; j != t.Length; j++) {
            if (s[i] == t[j]) {
                // Extend the common run that ends at the previous pair of characters.
                L[i,j] = (i == 0 || j == 0) ? 1 : L[i-1,j-1] + 1;
                if (L[i,j] > z) {
                    z = L[i,j];
                    ret = s.Substring(i-z+1, z);
                }
            } else {
                L[i,j] = 0;
            }
        }
    }
    return ret;
}
// With the LCS in hand, building the answer is easy
public static string CutLcs(string s, string t) {
for (;;) {
var lcs = FindLcs(s, t);
if (lcs.Length < 10) break;
s = s.Replace(lcs, string.Format("[{0}]", lcs.Length));
}
return s;
}
You need to be very careful to distinguish between "longest common substring" and "longest common subsequence".
For Substring: http://en.wikipedia.org/wiki/Longest_common_substring_problem
For SubSequence: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
I would also suggest watching a few videos on YouTube on these two topics:
http://www.youtube.com/results?search_query=longest+common+substring&oq=longest+common+substring&gs_l=youtube.3..0.3834.10362.0.10546.28.17.2.9.9.2.225.1425.11j3j3.17.0...0.0...1ac.lSrzx8rr1kQ
http://www.youtube.com/results?search_query=longest+common+subsequence&oq=longest+common+s&gs_l=youtube.3.0.0l6.2968.7905.0.9132.20.14.2.4.4.0.224.2038.5j2j7.14.0...0.0...1ac.4CYZ1x50zpc
You can find C# implementations of longest common subsequence here:
http://www.alexandre-gomes.com/?p=177
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_subsequence
I had a similar issue, but for word occurrences, so I hope this can help. I used a SortedDictionary, which is backed by a binary search tree.
/* Application counts the number of occurrences of each word in a string
and stores them in a generic sorted dictionary. */
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
public class SortedDictionaryTest
{
public static void Main( string[] args )
{
// create sorted dictionary
SortedDictionary< string, int > dictionary = CollectWords();
// display sorted dictionary content
DisplayDictionary( dictionary );
}
// create sorted dictionary
private static SortedDictionary< string, int > CollectWords()
{
// create a new sorted dictionary
SortedDictionary< string, int > dictionary =
new SortedDictionary< string, int >();
Console.WriteLine( "Enter a string: " ); // prompt for user input
string input = Console.ReadLine();
// split input text into tokens
string[] words = Regex.Split( input, @"\s+" );
// processing input words
foreach ( var word in words )
{
string wordKey = word.ToLower(); // get word in lowercase
// if the dictionary contains the word
if ( dictionary.ContainsKey( wordKey ) )
{
++dictionary[ wordKey ];
}
else
// add new word with a count of 1 to the dictionary
dictionary.Add( wordKey, 1 );
}
return dictionary;
}
// display dictionary content
private static void DisplayDictionary< K, V >(
SortedDictionary< K, V > dictionary )
{
Console.WriteLine( "\nSorted dictionary contains:\n{0,-12}{1,-12}",
"Key:", "Value:" );
/* generate output for each key in the sorted dictionary
by iterating through the Keys property with a foreach statement*/
foreach ( K key in dictionary.Keys )
Console.WriteLine( "{0,-12}{1,-12}", key, dictionary[ key ] );
Console.WriteLine( "\nsize: {0}", dictionary.Count );
}
}
This is probably dog slow, but if you're willing to incur some technical debt and need something now for prototyping, you could use LINQ.
string firstS = "123abc";
string secondS = "456cdeabc123";
int minLength = 3;
var result =
from subStrCount in Enumerable.Range(0, firstS.Length)
where firstS.Length - subStrCount >= minLength
let subStr = firstS.Substring(subStrCount, minLength)
where secondS.Contains(subStr)
select secondS.Replace(subStr, "[" + subStr.Length + "]");
Results in
456cdeabc[3]
456cde[3]123

Regex to find first capital letter occurrence in a string

I want to find the index of first capital letter occurrence in a string.
E.g. -
String x = "soHaM";
Index should return 2 for this string. The regex should ignore all other capital letters after the first one is found. If there are no capital letters found then it should return 0. Please help.
I'm pretty sure all you need is the regex \p{Lu} (rather than [A-Z]):
public static class Find
{
// Apparently the regex below works for non-ASCII uppercase
// characters (so, better than A-Z).
static readonly Regex CapitalLetter = new Regex(@"\p{Lu}");
public static int FirstCapitalLetter(string input)
{
Match match = CapitalLetter.Match(input);
// I would go with -1 here, personally.
return match.Success ? match.Index : 0;
}
}
Did you try this?
Just for fun, a LINQ solution:
string x = "soHaM";
var index = from ch in x.ToArray()
where Char.IsUpper(ch)
select x.IndexOf(ch);
This returns IEnumerable<Int32>. If you want the index of the first upper case character, simply call index.First() or retrieve only the first instance in the LINQ:
string x = "soHaM";
var index = (from ch in x.ToArray()
where Char.IsUpper(ch)
select x.IndexOf(ch)).First();
EDIT
As suggested in the comments, here is another LINQ method (possibly more performant than my initial suggestion):
string x = "soHaM";
int firstIndex = x.Select((c, index) => new { Char = c, Index = index }).First(c => Char.IsUpper(c.Char)).Index;
No need for Regex:
int firstUpper = -1;
for(int i = 0; i < x.Length; i++)
{
if(Char.IsUpper(x[i]))
{
firstUpper = i;
break;
}
}
http://msdn.microsoft.com/en-us/library/system.char.isupper.aspx
For the sake of completeness, here's my LINQ approach (although it's not the right tool here, even if the OP could use it):
int firstUpperCharIndex = -1;
var upperChars = x.Select((c, index) => new { Char = c, Index = index })
.Where(c => Char.IsUpper(c.Char));
if(upperChars.Any())
firstUpperCharIndex = upperChars.First().Index;
First, your logic fails: if the method returns 0, that would mean the first char in the string is upper case. I would recommend that -1 means not found, or that you throw an exception.
Also, using regular expressions just because you can is not always the best choice; they are generally slower and harder to read, making your code much harder to work with.
Anyway, here is my contribution:
public static int FindFirstUpper(string text)
{
for (int i = 0; i < text.Length; i++)
if (Char.IsUpper(text[i]))
return i;
return -1;
}
Using Linq:
using System.Linq;
string word = "soHaMH";
var capChars = word.Where(c => char.IsUpper(c)).Select(c => c);
char capChar = capChars.FirstOrDefault();
int index = word.IndexOf(capChar);
Using Regex:
using System.Text.RegularExpressions;
string word = "soHaMH";
Match match = Regex.Match(word, "[A-Z]");
int index = match.Success ? match.Index : -1; // -1 when there is no capital letter
Using loop
int i = 0;
for(i = 0; i < mystring.Length; i++)
{
if(Char.IsUpper(mystring, i))
break;
}
i is the value you should be looking at (it equals mystring.Length if there is no uppercase character).

Generating every character combination up to a certain word length

I am doing a security presentation for my Computer and Information Security course in a few weeks' time, and in this presentation I will be demonstrating the pros and cons of different attacks (dictionary, rainbow and bruteforce). I can do the dictionary and rainbow attacks fine, but I need to generate the bruteforce attack on the fly. I need to find an algorithm that will let me cycle through every combination of letter, symbol and number up to a certain character length.
So as an example, for a character length of 12, the first and last few generations will be:
a
ab
abc
abcd
...
...
zzzzzzzzzzzx
zzzzzzzzzzzy
zzzzzzzzzzzz
But it will also use numbers and symbols, so it's quite hard for me to explain... but I think you get the idea. Using only symbols from the ASCII table is fine.
I can kind of picture using an ASCII function to do this with a counter, but I just can't work it out in my head. If anyone could provide some source code (I'll probably be using C#) or even some pseudo code that I can program a function from that'd be great.
Thank you in advance. :)
A recursive function will let you run through all combinations of ValidChars:
int maxlength = 12;
string ValidChars;
private void Dive(string prefix, int level)
{
level += 1;
foreach (char c in ValidChars)
{
Console.WriteLine(prefix + c);
if (level < maxlength)
{
Dive(prefix + c, level);
}
}
}
Assign the set of valid characters to ValidChars, the maximum length of string you want to maxlength, then call Dive("", 0); and away you go.
You need to generate all combinations of characters from a set of valid characters ; let's call this set validChars. Basically, each set of combinations of length N is a cartesian product of validChars with itself, N times. That's pretty easy to do using Linq:
char[] validChars = ...;
var combinationsOfLength1 =
from c1 in validChars
select new[] { c1 };
var combinationsOfLength2 =
from c1 in validChars
from c2 in validChars
select new[] { c1, c2 };
...
var combinationsOfLength12 =
from c1 in validChars
from c2 in validChars
...
from c12 in validChars
select new[] { c1, c2 ... c12 };
var allCombinations =
combinationsOfLength1
.Concat(combinationsOfLength2)
...
.Concat(combinationsOfLength12);
Obviously, you don't want to manually write the code for each length, especially if you don't know in advance the maximum length...
Eric Lippert has an article about generating the cartesian product of an arbitrary number of sequences. Using the CartesianProduct extension method provided by the article, you can generate all combinations of length N as follows:
var combinationsOfLengthN = Enumerable.Repeat(validChars, N).CartesianProduct();
Since you want all combinations from length 1 to MAX, you can do something like that:
var allCombinations =
Enumerable
.Range(1, MAX)
.SelectMany(N => Enumerable.Repeat(validChars, N).CartesianProduct());
allCombinations is an IEnumerable<IEnumerable<char>>, if you want to get the results as a sequence of strings, you just need to add a projection:
var allCombinations =
Enumerable
.Range(1, MAX)
.SelectMany(N => Enumerable.Repeat(validChars, N).CartesianProduct())
.Select(combination => new string(combination.ToArray()));
Note that it's certainly not the most efficient solution, but at least it's short and readable...
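For reference, here is a sketch of what such a CartesianProduct extension method can look like, together with the projection to strings. It follows the Aggregate-based approach of the article as I remember it, so treat it as an approximation rather than a verbatim copy; the class names SequenceExtensions and BruteForce and the method AllStrings are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceExtensions
{
    // Cartesian product of an arbitrary number of sequences.
    // Starts from a single empty combination and extends every
    // combination built so far with each item of the next sequence.
    public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
        this IEnumerable<IEnumerable<T>> sequences)
    {
        IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
        return sequences.Aggregate(
            emptyProduct,
            (accumulator, sequence) =>
                from prefix in accumulator
                from item in sequence
                select prefix.Concat(new[] { item }));
    }
}

public static class BruteForce
{
    // Lazily yields every string of length 1..maxLength over validChars.
    public static IEnumerable<string> AllStrings(char[] validChars, int maxLength)
    {
        return Enumerable
            .Range(1, maxLength)
            // The explicit type argument only makes the element type obvious.
            .SelectMany(n => Enumerable.Repeat<IEnumerable<char>>(validChars, n).CartesianProduct())
            .Select(combination => new string(combination.ToArray()));
    }
}
```

For example, BruteForce.AllStrings(new[] { 'a', 'b' }, 2) yields a, b, aa, ab, ba, bb. Because everything is lazy, nothing is generated until the sequence is enumerated, which matters when the character set and maximum length are large.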
You can try this code, which uses recursion to print all possible strings from 0 to stringsLenght characters long, composed of every combination of characters from firstRangeChar to lastRangeChar.
class BruteWriter
{
static void Main(string[] args)
{
var bw = new BruteWriter();
bw.WriteBruteStrings("");
}
private void WriteBruteStrings(string prefix)
{
Console.WriteLine(prefix);
if (prefix.Length == stringsLenght)
return;
for (char c = firstRangeChar; c <= lastRangeChar; c++)
WriteBruteStrings(prefix + c);
}
char firstRangeChar='A';
char lastRangeChar='z';
int stringsLenght=10;
}
This looks to be faster than the solution of @dthorpe. I've compared the algorithms using this code:
class BruteWriter
{
static void Main(string[] args)
{
var st = new Stopwatch();
var bw = new BruteWriter();
st.Start();
bw.WriteBruteStrings("");
Console.WriteLine("First method: " + st.ElapsedMilliseconds);
for (char c = bw.firstRangeChar; c <= bw.lastRangeChar; c++)
bw.ValidChars += c;
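st.Restart(); // Start() does not reset a running Stopwatch, so with Start() the second figure also includes the first method's time (the figures quoted below appear to predate this fix)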
bw.Dive("", 0);
Console.WriteLine("Second method: " + st.ElapsedMilliseconds);
Console.ReadLine();
}
private void WriteBruteStrings(string prefix)
{
if (prefix.Length == stringsLenght)
return;
for (char c = firstRangeChar; c <= lastRangeChar; c++)
WriteBruteStrings(prefix + c);
}
char firstRangeChar='A';
char lastRangeChar='R';
int stringsLenght=5;
int maxlength = 5;
string ValidChars;
private void Dive(string prefix, int level)
{
level += 1;
foreach (char c in ValidChars)
{
if (level <= maxlength)
{
Dive(prefix + c, level);
}
}
}
}
and, on my pc, I get these results:
First method: 247
Second method: 910
// Requires: using System; using System.Linq;
public static void BruteStrings(int maxlength)
{
    for (var i = 1; i <= maxlength; i++)
        BruteStrings(new byte[i]); // start each length from all-zero bytes
}
public static void BruteStrings(byte[] bytes)
{
    // Loop instead of recursing: 256^n steps would overflow the call stack.
    while (true)
    {
        Console.WriteLine(new string(bytes.Select(b => (char)b).ToArray()));
        if (bytes.All(b => b == byte.MaxValue)) return;
        Increment(bytes);
    }
}
// Treats the array as a base-256 counter and adds one, carrying to the left.
public static void Increment(byte[] bytes)
{
    for (var i = bytes.Length - 1; i >= 0; i--)
    {
        // ++ wraps from 255 to 0 in the default unchecked context;
        // a non-zero result means there is no carry to propagate.
        if (++bytes[i] != byte.MinValue) return;
    }
}
Another alternative I wrote, which returns a string.
I did not care about performance, since it was not for a real-world scenario.
private void BruteForcePass(int maxLength)
{
var tempPass = "";
while (tempPass.Length <= maxLength)
{
tempPass = GetNextString(tempPass);//Use char from 32 to 256
//Do what you want
}
}
private string GetNextString(string initialString, int minChar= 32, int maxChar = 256)
{
char nextChar;
if (initialString.Length == 0)
{
nextChar = (char)minChar;//the initialString Length will increase
}
else if (initialString.Last() == (char)maxChar)
{
nextChar = (char)minChar;
var tempString = initialString.Substring(0, initialString.Length -1);//we need to increment the char just before the last one
initialString = GetNextString(tempString, minChar, maxChar);
}
else
{
nextChar = (char)(initialString.Last() + 1);//Get the last char and increment it;
initialString= initialString.Remove(initialString.Length - 1);//Remove the last char.
}
return initialString + nextChar;
}
