I am trying to match user input against a pattern such as "ran,om", where "ran" and "om" must match those exact characters in order, and "," can match any characters. The program should find the matching words in an ArrayList. For example, given the dictionary {rammm, random, ranom}, random will match, but ranom will not.
I have written the following code, but it only finds words that contain any of the characters in the user input:
for (int i = 0; i < userinput.Length; i++)
{
foreach (string line in dictionary)
if (line[i] == userinput[i])
{
Matching.Add(line);
}
foreach (string line in FirstCom)
Console.WriteLine(line);
}
Can anyone help me figure out what to do next? (P.S. No regex will be used in this program.)
How about this:
public static bool IsMatch(string pattern, string line)
{
var patternSplit = pattern.Split(',');
if (!line.StartsWith(patternSplit[0])) return false;
if(patternSplit.Count() > 2){
for (var i = 1; i < patternSplit.Count() - 1; i++)
{
if (!line.Contains(patternSplit[i])) return false;
}
}
if (!line.EndsWith(patternSplit[patternSplit.Count() - 1])) return false;
return true;
}
static void Main(string[] args)
{
var matchingData = "quick brown fox jumped over a lazy dog";
var failingData = "I am batman";
var pattern = "qu,pe,ov,og";
// Print the tested text first, then the pattern, so the message reads correctly
if (IsMatch(pattern, matchingData)) Console.WriteLine("{0} matches pattern {1}", matchingData, pattern);
if (!IsMatch(pattern, failingData)) Console.WriteLine("{0} does not match {1}", failingData, pattern);
Console.ReadKey();
}
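Note that the Contains check above does not enforce the order of the middle pieces. A minimal sketch of an order-aware variant without regex follows. It assumes that "," stands for one or more arbitrary characters (which is what the random / ranom example in the question suggests); the class name WildcardMatcher and the method name IsOrderedMatch are hypothetical.

```csharp
using System;

public static class WildcardMatcher
{
    // Assumption: ',' in the pattern stands for ONE OR MORE arbitrary characters.
    public static bool IsOrderedMatch(string pattern, string line)
    {
        string[] parts = pattern.Split(',');
        if (parts.Length == 1) return line == pattern; // no wildcard at all

        string first = parts[0];
        string last = parts[parts.Length - 1];
        if (!line.StartsWith(first, StringComparison.Ordinal)) return false;

        // pos is the index just past the most recently matched literal piece
        int pos = first.Length;
        for (int i = 1; i < parts.Length - 1; i++)
        {
            // Skip at least one character for the wildcard, then take the earliest hit
            if (pos + 1 > line.Length) return false;
            int found = line.IndexOf(parts[i], pos + 1, StringComparison.Ordinal);
            if (found < 0) return false;
            pos = found + parts[i].Length;
        }

        // The last piece must be a suffix that starts after at least one wildcard character
        int lastStart = line.Length - last.Length;
        return lastStart >= pos + 1 && line.EndsWith(last, StringComparison.Ordinal);
    }
}
```

With this sketch, IsOrderedMatch("ran,om", "random") is true, while "ranom" and "rammm" are rejected.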
Problem: I want to write a method that takes a message/index pair like this:
("Hello, I am *Name1, how are you doing *Name2?", 2)
The index refers to the asterisk delimited name in the message. So if the index is 1, it should refer to *Name1, if it's 2 it should refer to *Name2.
The method should return just the name with the asterisk (*Name2).
I have attempted to play around with substrings, taking the first delimited * and ending when we reach a character that isn't a letter, number, underscore or hyphen, but the logic just isn't setting in.
I know this is similar to a few problems on SO but I can't find anything this specific. Any help is appreciated.
This is what's left of my very vague attempt so far. Based on this thread:
public string GetIndexedNames(string message, int index)
{
int strStart = message.IndexOf("#") + "#".Length;
int strEnd = message.LastIndexOf(" ");
String result = message.Substring(strStart, strEnd - strStart);
}
If you want to do it the old school way, then something like:
public static void Main(string[] args)
{
string message = "Hello, I am *Name1, how are you doing *Name2?";
string name1 = GetIndexedNames(message, "*", 1);
string name2 = GetIndexedNames(message, "*", 2);
Console.WriteLine(message);
Console.WriteLine(name1);
Console.WriteLine(name2);
Console.ReadLine();
}
public static string GetIndexedNames(string message, string singleCharDelimiter, int index)
{
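// Characters allowed in a name (letters, digits, underscore, hyphen)
string valid = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-";
string[] parts = message.Split(singleCharDelimiter.ToCharArray());
// parts[index] must exist, so the array needs more than index elements
if (parts.Length > index)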
{
StringBuilder sb = new StringBuilder();
for(int i = 0; i < parts[index].Length; i++)
{
string character = parts[index].Substring(i, 1);
if (valid.Contains(character))
{
sb.Append(character);
}
else
{
return sb.ToString();
}
}
return sb.ToString();
}
return "";
}
You can try using regular expressions to match the names. Assuming that a name is a sequence of word characters (letters, digits, or underscores):
using System.Linq;
using System.Text.RegularExpressions;
...
// Either name with asterisk *Name or null
// index is 1-based
private static string ObtainName(string source, int index) => Regex
.Matches(source, @"\*\w+")
.Cast<Match>()
.Select(match => match.Value)
.Distinct() // in case the same name repeats several times
.ElementAtOrDefault(index - 1);
Demo:
string name = ObtainName(
"Hello, I am *Name1, how are you doing *Name2?", 2);
Console.Write(name);
Outcome:
*Name2
Perhaps not the most elegant solution, but if you want to use IndexOf, use a loop:
public static string GetIndexedNames(string message, int index, char marker='*')
{
int lastFound = -1; // start before the string so a marker at index 0 is also found
for (int i = 0; i < index; i++) {
lastFound = message.IndexOf(marker, lastFound+1);
if (lastFound == -1) return null;
}
var space = message.IndexOf(' ', lastFound);
return space == -1 ? message.Substring(lastFound) : message.Substring(lastFound, space - lastFound);
}
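The IndexOf loop above cuts the name at the next space, so trailing punctuation such as "," or "?" stays attached. A minimal sketch that instead stops at the first character that is not a letter, digit, underscore, or hyphen (the rule described in the question) could look like this. The names NameFinder, GetIndexedName, and IsNameChar are hypothetical, and the index is assumed to be 1-based.

```csharp
using System;

public static class NameFinder
{
    // Returns the index-th (1-based) marker-prefixed name, including the marker,
    // or null when there are fewer than index markers in the message.
    public static string GetIndexedName(string message, int index, char marker = '*')
    {
        int seen = 0;
        for (int i = 0; i < message.Length; i++)
        {
            if (message[i] != marker) continue;
            seen++;
            if (seen != index) continue;

            // Extend the name while the characters are valid name characters
            int end = i + 1;
            while (end < message.Length && IsNameChar(message[end])) end++;
            return message.Substring(i, end - i);
        }
        return null;
    }

    private static bool IsNameChar(char c) =>
        char.IsLetterOrDigit(c) || c == '_' || c == '-';
}
```

For the message in the question, index 2 yields *Name2 without the trailing question mark.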
I am a beginner in C#. I am trying to check whether the first letter of a postal code matches any element of a char array. If it does not match any element of the char array, the result should be false.
Below is the approach:
string firstLetter= "KLMN";
char[] postalLetters = firstLetter.ToCharArray();
string PostalCode = "N2L0G6";
bool firstPostalLetterMatch = true;
foreach(char ch in firstLetter)
{
if (PostalCode.Substring(0, 1) != postalLetters.ToString())
{
firstPostalLetterMatch = false;
}
}
if(firstPostalLetterMatch == false)
{
Console.WriteLine("Error");
}
else
{
Console.WriteLine("No Error");
}
For example, if the postal code is N2L0G6, the first letter is N. The bool should be true, since N is in the char array.
A. Linq
using System.Linq;
...
bool firstPostalLetterMatch =
postalLetters.Any( l => l == PostalCode[0] ); // BTW. This is case sensitive
This reverses the question a bit. It says: are there any letters in our collection of good letters that match the first letter of the tested postal code.
B. foreach
With foreach you want to find any match and then you can stop looking.
bool firstPostalLetterMatch = false;
foreach(char ch in postalLetters)
{
if (PostalCode[0] == ch)
{
firstPostalLetterMatch = true;
break; // Match found, we no longer have to search
}
}
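As a third option, the same check can be sketched with string.IndexOf, which avoids both LINQ and an explicit loop. The helper name StartsWithAllowedLetter is hypothetical, and upper-casing the first letter is an assumption added here so that "n2l0g6" is treated like "N2L0G6"; drop it if the comparison should stay case sensitive.

```csharp
using System;

public static class PostalCheck
{
    // True when the first character of postalCode occurs in allowedLetters.
    public static bool StartsWithAllowedLetter(string postalCode, string allowedLetters)
    {
        // An empty or null postal code has no first letter to test
        if (string.IsNullOrEmpty(postalCode)) return false;

        // Assumption: the allowed letters are stored in upper case
        char first = char.ToUpperInvariant(postalCode[0]);
        return allowedLetters.IndexOf(first) >= 0;
    }
}
```

Here StartsWithAllowedLetter("N2L0G6", "KLMN") returns true, and a code starting with A returns false.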
You probably want something like this:
bool firstPostalLetterMatch = false;
char postCodeFirstLetter = PostalCode.ToCharArray()[0];
foreach(char ch in firstLetter)
{
if (postCodeFirstLetter == ch)
{
firstPostalLetterMatch = true;
break;
}
}
Since you said you're a beginner here's an easy to follow method for achieving what you need.
using System;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
string firstLetter = "KLMN";
int firstLength = firstLetter.Length;
int i = 0;
string PostalCode = "N2L0G6";
while (i < firstLength)
{
if (PostalCode[0].ToString().Contains(firstLetter[i]))
{
Console.WriteLine(firstLetter[i] + " matches first letter of " + PostalCode);
}
else
{
Console.WriteLine(firstLetter[i] + " does not match the first letter of " + PostalCode);
}
i++;
}
Console.Read();
}
}
}
This outputs
K does not match the first letter of N2L0G6
L does not match the first letter of N2L0G6
M does not match the first letter of N2L0G6
N matches first letter of N2L0G6
I have this snippet in my method:
MatchCollection words = Regex.Matches("dog cat fun toy", @"\w\w\w.\w?");
foreach (Match match in words)
{
Console.WriteLine(match);
}
I expected to see something like this:
dog c
cat f
fun t
But program came up with just that:
dog c
fun t
As I understood, it skipped second occurrence because part of it was in previous occurrence. But I still want to see it. How should I correct my snippet?
You may try something like this:
var regX = new Regex(@"\w\w\w.\w?");
string pattern = "dog cat fun toy";
int i = 0;
while (i < pattern.Length)
{
var m = regX.Match(pattern, i);
if (!m.Success) break;
Console.WriteLine(m.Value);
i = m.Index + 1;
}
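Another way to get overlapping matches is to wrap the pattern in a lookahead with a capturing group. The lookahead itself consumes no characters, so the engine tries every position in turn, and the captured text is read from group 1. This is a sketch only; the class and method names (OverlapDemo, OverlappingMatches) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Returns every match of pattern, including matches that overlap each other.
    public static List<string> OverlappingMatches(string input, string pattern)
    {
        var results = new List<string>();

        // (?=(...)) matches an empty string at each position where pattern
        // would match, and captures the would-be match in group 1
        var lookahead = new Regex("(?=(" + pattern + "))");

        foreach (Match m in lookahead.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

Calling OverlappingMatches("dog cat fun toy", @"\w\w\w.\w?") gives the three entries "dog c", "cat f", and "fun t".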
It is not a universal solution, but for your particular case the following snippet can do the job:
string _input = "dog cat fun toy";
string[] _arr = _input.Split(' ');
string _out = String.Empty;
for (int i = 0; i < _arr.Length-1; i++)
{
if (_arr[i].Length == 3) { _out+=_arr[i]+" "+_arr[i+1].Substring(0,1)+";";}
}
where string _out contains all matches separated by ";" (or any other char). Alternatively, you can send the output to Console:
string _input = "dog cat fun toy";
string[] _arr = _input.Split(' ');
for (int i = 0; i < _arr.Length-1; i++)
{
if (_arr[i].Length == 3) {Console.WriteLine(_arr[i]+" "+_arr[i+1].Substring(0,1));}
}
Hope this may help.
This is the code:
StringBuilder sb = new StringBuilder();
Regex rgx = new Regex("[^a-zA-Z0-9 -]");
var words = Regex.Split(textBox1.Text, @"(?=(?<=[^\s])\s+\w)");
for (int i = 0; i < words.Length; i++)
{
words[i] = rgx.Replace(words[i], "");
}
When I call Regex.Split(), the resulting words also include strings with extra characters in them, for example:
Daniel>
or
Hello:
or
\r\nNew
or
hello---------------------------
I need to get only the words, without all the signs.
So I tried to use this loop, but I end up with many entries in words that are ""
and some entries that contain only ------------------------
I can't use these as strings later in my code.
You don't need a regex to clear non-letters. This will remove every character that is not a Unicode letter.
public string RemoveNonUnicodeLetters(string input)
{
StringBuilder sb = new StringBuilder();
foreach(char c in input)
{
if(Char.IsLetter(c))
sb.Append(c);
}
return sb.ToString();
}
Alternatively, if you only want to allow Latin letters, you can use this
public string RemoveNonLatinLetters(string input)
{
StringBuilder sb = new StringBuilder();
foreach(char c in input)
{
if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
sb.Append(c);
}
return sb.ToString();
}
Benchmark vs Regex
public static string RemoveNonUnicodeLetters(string input)
{
StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
if (Char.IsLetter(c))
sb.Append(c);
}
return sb.ToString();
}
static readonly Regex nonUnicodeRx = new Regex("\\P{L}");
public static string RemoveNonUnicodeLetters2(string input)
{
return nonUnicodeRx.Replace(input, "");
}
static void Main(string[] args)
{
Stopwatch sw = new Stopwatch();
StringBuilder sb = new StringBuilder();
//generate guids as input
for (int j = 0; j < 1000; j++)
{
sb.Append(Guid.NewGuid().ToString());
}
string input = sb.ToString();
sw.Start();
for (int i = 0; i < 1000; i++)
{
RemoveNonUnicodeLetters(input);
}
sw.Stop();
Console.WriteLine("SM: " + sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0; i < 1000; i++)
{
RemoveNonUnicodeLetters2(input);
}
sw.Stop();
Console.WriteLine("RX: " + sw.ElapsedMilliseconds);
}
Output (SM = String Manipulation, RX = Regex)
SM: 581
RX: 9882
SM: 545
RX: 9557
SM: 664
RX: 10196
keyboardP’s solution is decent – do consider it. But as I’ve argued in the comments, regular expressions are actually the correct tool for the job, you’re just making it unnecessarily complicated. The actual solution is a one-liner:
var result = Regex.Replace(input, "\\P{L}", "");
\P{…} specifies a Unicode character class we do not want to match (the opposite of \p{…}). L is the Unicode character class for letters.
Of course it makes sense to encapsulate this into a method, as keyboardP did. To avoid recompiling the regular expression over again, you should also consider pulling the regex creation out of the actual code (although this probably won’t give a big impact on performance):
static readonly Regex nonUnicodeRx = new Regex("\\P{L}");
public static string RemoveNonUnicodeLetters(string input) {
return nonUnicodeRx.Replace(input, "");
}
To help Konrad and keyboardP resolve their differences, I ran a benchmark test using their code. It turns out that keyboardP's code is about 10x faster than Konrad's code.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string input = "asdf234!##*advfk234098awfdasdfq9823fna943";
DateTime start = DateTime.Now;
for (int i = 0; i < 100000; i++)
{
RemoveNonUnicodeLetters(input);
}
Console.WriteLine(DateTime.Now.Subtract(start).TotalSeconds);
start = DateTime.Now;
for (int i = 0; i < 100000; i++)
{
RemoveNonUnicodeLetters2(input);
}
Console.WriteLine(DateTime.Now.Subtract(start).TotalSeconds);
}
public static string RemoveNonUnicodeLetters(string input)
{
StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
if (Char.IsLetter(c))
sb.Append(c);
}
return sb.ToString();
}
public static string RemoveNonUnicodeLetters2(string input)
{
var result = Regex.Replace(input, "\\P{L}", "");
return result;
}
}
}
I got
0.12
1.2
as output
UPDATE:
To see if it is the Regex compilation that is slowing down the Regex method, I put the regex in a static variable that is only constructed once.
static Regex rex = new Regex("\\P{L}");
public static string RemoveNonUnicodeLetters2(string input)
{
var result = rex.Replace(input,m => "");
return result;
}
But this had no effect on the runtime.
Rather than describing what I want (it's difficult to explain), Let me provide an example of what I need to accomplish in C# using a regular expression:
"HelloWorld" should be transformed to "Hello World"
"HelloWORld" should be transformed to "Hello WO Rld" //Two consecutive letters in capital should be treated as one word
"helloworld" should be transformed to "helloworld"
EDIT:
"HellOWORLd" should be transformed to "Hell OW OR Ld"
Every 2-consecutive capital letters should be considered one word.
Is this possible?
This is fully working C# code, not just the regex:
Console.WriteLine(
Regex.Replace(
"HelloWORld",
"(?<!^)(?<wordstart>[A-Z]{1,2})",
" ${wordstart}", RegexOptions.Compiled));
And it prints:
Hello WO Rld
Update
To make this more Unicode/international aware, consider replacing [A-Z] with \p{Lu} (meaning a Unicode code point that represents an uppercase letter). The result for the current input would be the same. So here is a slightly more compelling example:
Console.WriteLine(Regex.Replace(
@"ÉclaireürfØÑJßå",
@"(?<!^)(?<wordstart>\p{Lu}{1,2})",
@" ${wordstart}",
RegexOptions.Compiled));
The regular expression engine is not a transformative thing by nature, but rather a pattern matching (and replacing) engine. People often mistake the replace part of Regex, thinking that it can do more than it's designed to.
Back to your question, though... Regex cannot do what you want, instead, you should write your own parser to do this. With C#, if you're familiar with the language, this task is somewhat trivial.
It's a case of "You're using the wrong tool for the job".
Here are regular expressions that detect what you are looking for:
([A-Z]\w*?)[A-Z]
This matches one uppercase letter from A to Z, followed by as few alphanumerics as possible, up to the next uppercase letter.
([A-Z]{2}\w*?)[A-Z]
This matches exactly two uppercase letters from A to Z, followed by as few alphanumerics as possible, up to the next uppercase letter.
Regex is a matching engine: you can parse the input string and use Regex.IsMatch to find candidate matches, then insert spaces into the output string.
string f(string input)
{
//'lowerUPPER' -> 'lower UPPER'
var x = Regex.Replace(input, "([a-z])([A-Z])","$1 $2");
//'UPPER' -> 'UP PE R'
return Regex.Replace(x, "([A-Z]{2})","$1 ");
}
class Program
{
static void Main(string[] args)
{
Print(Parse("HelloWorld"));
Print(Parse("HelloWORld"));
Print(Parse("helloworld"));
Print(Parse("HellOWORLd"));
Console.ReadLine();
}
static void Print(IEnumerable<string> input)
{
foreach (var s in input)
{
Console.Write(s);
Console.Write(' ');
}
Console.WriteLine();
}
static IEnumerable<string> Parse(string input)
{
var sb = new StringBuilder();
for (int i = 0; i < input.Length; i++)
{
if (!char.IsUpper(input[i]))
{
sb.Append(input[i]);
continue;
}
if (sb.Length > 0)
{
yield return sb.ToString();
sb.Clear();
}
sb.Append(input[i]);
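// Check the bound first so an uppercase letter at the very end does not throw
if (i + 1 < input.Length && char.IsUpper(input[i + 1]))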
{
sb.Append(input[++i]);
yield return sb.ToString();
sb.Clear();
}
}
if (sb.Length > 0)
{
yield return sb.ToString();
}
}
}
I don't think a regular expression is needed in this case.
Try this:
static void Main(string[] args)
{
var input = "HellOWORLd";
var i = 0;
var x = 4;
var len = input.Length;
var output = new List<string>();
while (x <= len)
{
output.Add(SubStr(input, i, x));
i = x;
x += 2;
}
var ret = output.ToArray(); //["Hell","OW", "OR", "Ld"]
Console.ReadLine();
}
static string SubStr(string str, int start, int end)
{
var len = str.Length;
if (start >= 0 && end <= len)
{
var ret = new StringBuilder();
for (int i = 0; i < len; i++)
{
if (i == start)
{
do
{
ret.Append(str[i]);
i++;
} while (i != end);
}
}
return ret.ToString();
}
return null;
}