I want to know when a string does not contain two other strings. For example:
string firstString = "pineapple";
string secondString = "mango";
string compareString = "The wheels on the bus go round and round";
So, I want to know when the first string and the second string are both absent from compareString.
How can I do that?
This should do the trick for you.
For one word:
if (!compareString.Contains("One"))
For two words (true unless both words are present):
if (!(compareString.Contains("One") && compareString.Contains("Two")))
You should put all your words into some kind of Collection or List and then call it like this:
var searchFor = new List<string>();
searchFor.Add("pineapple");
searchFor.Add("mango");
bool containsAnySearchString = searchFor.Any(word => compareString.Contains(word));
If you need a case-insensitive or culture-independent search, call it like this:
bool containsAnySearchString =
searchFor.Any(word => compareString.IndexOf
(word, StringComparison.InvariantCultureIgnoreCase) >= 0);
So you can utilize short-circuiting:
bool containsBoth = compareString.Contains(firstString) &&
compareString.Contains(secondString);
Use the String.Contains method:
var result =
!(compareString.Contains(firstString) || compareString.Contains(secondString));
bool isFirst = compareString.Contains(firstString);
bool isSecond = compareString.Contains(secondString );
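The answers above mix two different conditions: "contains neither string" and "does not contain both strings". A minimal sketch that separates the two, assuming a hypothetical helper class ContainsChecks (the class and method names are illustrative, not from the original answers):

```csharp
public static class ContainsChecks
{
    // True only when text contains neither a nor b.
    public static bool ContainsNeither(string text, string a, string b)
    {
        return !text.Contains(a) && !text.Contains(b);
    }

    // True when at least one of a and b is missing from text.
    public static bool MissingAtLeastOne(string text, string a, string b)
    {
        return !(text.Contains(a) && text.Contains(b));
    }
}
```

With the strings from the question, ContainsChecks.ContainsNeither(compareString, firstString, secondString) is the check that matches "neither word is present". Both methods are case-sensitive substring checks.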
An option with a regex, if you want to discriminate between Mango and Mangosteen:
var reg = new Regex(@"\b(pineapple|mango)\b",
RegexOptions.IgnoreCase | RegexOptions.Multiline);
if (!reg.Match(compareString).Success)
...
The accepted answer, and most of the others, give a wrong result when a search word is embedded in an unrelated word, such as "low" in "follow". Those are separate words, but .Contains and IndexOf report a match anyway.
Word Boundary
What is needed is a rule that a word must stand alone and not sit inside another word. The most direct way to handle that is a regular expression with a word boundary \b rule on each side of the word.
Tests And Example
string first = "name";
var second = "low";
var sentance = "Follow your surname";
var ignorableWords = new List<string> { first, second };
The following are two tests culled from other answers (to show the failure) and then the suggested answer.
// To work, there must be *NO* words that match.
ignorableWords.Any(word => sentance.Contains(word)); // Returns True (wrong)
ignorableWords.Any(word => // Returns True (wrong)
sentance.IndexOf(word,
StringComparison.InvariantCultureIgnoreCase) >= 0);
// Only one that returns False
ignorableWords.Any(word =>
Regex.IsMatch(sentance, $@"\b{word}\b", RegexOptions.IgnoreCase));
Summary
.Any(word => Regex.IsMatch(sentance, $@"\b{word}\b", RegexOptions.IgnoreCase))
One to many words to check against.
No internal word failures
Case is ignored.
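The summary above can be wrapped in a small reusable method. This is only a sketch: the WordSearch class and ContainsNoneOf method are hypothetical names, and Regex.Escape is added on the assumption that search words may contain regex metacharacters. The \b rule behaves as expected only for words that begin and end with a letter, digit or underscore.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordSearch
{
    // True when none of the words occurs in text as a standalone word.
    // The comparison ignores case.
    public static bool ContainsNoneOf(string text, IEnumerable<string> words)
    {
        return !words.Any(word =>
            Regex.IsMatch(
                text,
                @"\b" + Regex.Escape(word) + @"\b", // escape, then isolate the word
                RegexOptions.IgnoreCase));
    }
}
```

For example, WordSearch.ContainsNoneOf("Follow your surname", new[] { "name", "low" }) should return true, because "low" and "name" only appear inside longer words.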
Related
I have a fairly simple problem, but I want to solve it in the best way possible. I have a string in this format: <some letters><some numbers>, e.g. q1 or qwe12. I want to get two strings from it (I may or may not convert the number part to an integer afterwards). The first is the "string part" of the given string, e.g. qwe, and the second is the "number part", e.g. 12. The letters and numbers are never mixed, as in qw1e2.
Of course, I know that I can use a StringBuilder with a for loop and check whether each character is a digit or a letter. Easy. But I don't think that is a very clear solution, so I am asking: is there a built-in method or something similar that does this in 1-3 lines, or at least without a loop?
You can use a regular expression with named groups to identify the different parts of the string you are interested in.
For example:
string input = "qew123";
var match = Regex.Match(input, "(?<letters>[a-zA-Z]+)(?<numbers>[0-9]+)");
if (match.Success)
{
Console.WriteLine(match.Groups["letters"]);
Console.WriteLine(match.Groups["numbers"]);
}
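A sketch of the same named-group idea packaged as a method that also converts the number part. The LetterNumberSplitter class and TrySplit method are hypothetical names, and the ^ and $ anchors are an added assumption so that mixed input such as qw1e2 is rejected rather than partially matched.

```csharp
using System.Text.RegularExpressions;

public static class LetterNumberSplitter
{
    // Splits "qwe12" into letters = "qwe" and number = 12.
    // Returns false when input is not letters followed by digits,
    // or when the digits do not fit in an int.
    public static bool TrySplit(string input, out string letters, out int number)
    {
        letters = null;
        number = 0;

        var match = Regex.Match(input, @"^(?<letters>[a-zA-Z]+)(?<numbers>[0-9]+)$");
        if (!match.Success)
        {
            return false;
        }

        letters = match.Groups["letters"].Value;
        return int.TryParse(match.Groups["numbers"].Value, out number);
    }
}
```

Calling LetterNumberSplitter.TrySplit("qwe12", out var letters, out var number) should give "qwe" and 12.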
You can try Linq as an alternative to regular expressions:
string source = "qwe12";
string letters = string.Concat(source.TakeWhile(c => c < '0' || c > '9'));
string digits = string.Concat(source.SkipWhile(c => c < '0' || c > '9'));
You can use the Where() extension method from the System.Linq library (https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where) to keep only the chars that are digits, then convert the resulting IEnumerable to an array of chars, which can be used to create a new string:
string source = "qwe12";
string stringPart = new string(source.Where(c => !Char.IsDigit(c)).ToArray());
string numberPart = new string(source.Where(Char.IsDigit).ToArray());
MessageBox.Show($"String part: '{stringPart}', Number part: '{numberPart}'");
Source:
https://stackoverflow.com/a/15669520/8133067
If possible, add a space between the letters and numbers (q 3, zet 64 etc.) and use string.Split.
Otherwise, use the for loop; it isn't that hard.
You can test as part of an aggregation:
var z = "qwe12345";
var b = z.Aggregate(new []{"", ""}, (acc, s) => {
if (Char.IsDigit(s)) {
acc[1] += s;
} else {
acc[0] += s;
}
return acc;
});
Assert.Equal(new [] {"qwe", "12345"}, b);
I am trying to see if my string starts with a string in an array of strings I've created. Here is my code:
string x = "Table a";
string y = "a table";
string[] arr = new string[] { "table", "chair", "plate" };
if (arr.Contains(x.ToLower())){
// this should be true
}
if (arr.Contains(y.ToLower())){
// this should be false
}
How can I make it so my if statement comes up true? I'd like to match just the beginning of string x against the contents of the array, ignoring the case and the characters that follow. I thought I needed regex to do this, but I could be mistaken. I'm a bit of a newbie with regex.
It seems you want to check if your string contains an element from your list, so this should be what you are looking for:
if (arr.Any(c => x.ToLower().Contains(c)))
Or simpler:
if (arr.Any(x.ToLower().Contains))
Or based on your comments you may use this:
if (arr.Any(x.ToLower().Split(' ')[0].Contains))
Because you said you want regex...
You can define a regex as var regex = new Regex("(table|plate|fork)");
and check with if (regex.IsMatch(myString)) { ... }
But for the issue at hand you don't have to use Regex, because you are searching for an exact substring. You can use
(as @S.Akbari mentioned): if (arr.Any(c => x.ToLower().Contains(c))) { ... }
Enumerable.Contains matches exact values (and there is no built-in comparison that checks for "starts with"), so you need Any, which takes a predicate that receives each array element and performs the check. The first step is to turn "contains" around, so that the given string is checked for containing an element from the array:
var myString = "some string";
if (arr.Any(arrayItem => myString.Contains(arrayItem))) ...
Now, you are actually asking for "string starts with given word" and not just contains, so you need StartsWith (which, unlike Contains, conveniently lets you specify case sensitivity; see "Case insensitive 'Contains(string)'"):
if (arr.Any(arrayItem => myString.StartsWith(
arrayItem, StringComparison.CurrentCultureIgnoreCase))) ...
Note that this code will accept "tableAAA bob" - if you really need to break on word boundary regular expression may be better choice. Building regular expressions dynamically is trivial as long as you properly escape all the values.
Regex should be
beginning of string - ^
properly escaped word you are searching for - Escape Special Character in Regex
word break - \b
if (arr.Any(arrayItem => Regex.IsMatch(myString,
String.Format(@"^{0}\b", Regex.Escape(arrayItem)),
RegexOptions.IgnoreCase))) ...
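The three regex parts listed above (start of string, escaped word, word break) can also be combined into a single pattern for the whole array. A minimal sketch, assuming hypothetical names PrefixMatcher and StartsWithAnyWord; the empty-list guard is an added assumption, since an empty alternation would otherwise match any text that starts with a word character.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixMatcher
{
    // True when text begins with one of the words (ignoring case)
    // and that word ends at a word boundary.
    public static bool StartsWithAnyWord(string text, IEnumerable<string> words)
    {
        // Escape each word so characters such as '.' or '+' are taken literally.
        List<string> escaped = words.Select(w => Regex.Escape(w)).ToList();
        if (escaped.Count == 0)
        {
            return false;
        }

        // ^ = beginning of string, (?:a|b|c) = any of the words, \b = word break.
        string pattern = "^(?:" + string.Join("|", escaped) + @")\b";
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase);
    }
}
```

With the array from the question, "Table a" should match, while "a table" and "tableAAA bob" should not.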
You can do something like the following using TypeScript. Instead of startsWith you can also use a contains or equality check.
public namesList: Array<string> = ['name1','name2','name3','name4','name5'];
// SomeString = 'name1, Hello there';
private isNamePresent(SomeString : string):boolean{
if (this.namesList.find(name => SomeString.startsWith(name)))
return true;
return false;
}
I think I understand what you are trying to say here, although there is still some ambiguity. Are you trying to see if one word in your string (which is a sentence) exists in your array?
@Amy is correct, this might not have to do with Regex at all.
I think this segment of code will do what you want in Java (which can easily be translated to C#):
Java:
// Assumes x is a String and arr is a String[] defined elsewhere.
x = x.toLowerCase();
String[] words = x.split("\\s+");
for (String word : words) {
    for (String element : arr) {
        if (element.equals(word)) {
            return true;
        }
    }
}
return false;
You can also use a Set to store the elements in your array, which can make look up more efficient.
Java:
// Assumes x is a String and arr is a String[] defined elsewhere.
x = x.toLowerCase();
String[] words = x.split("\\s+");
Set<String> set = new HashSet<>(Arrays.asList(arr));
for (String word : words) {
    if (set.contains(word)) {
        return true;
    }
}
return false;
Edit: (12/22, 11:05am)
I rewrote my solution in C#, thanks to reminders by @Amy and @JohnyL. Since the author only wants to match the first word of the string, this edited code should work :)
C#:
static bool Contains(string x, string[] arr)
{
    // Compare only the first word of x, lower-cased, against the array.
    string[] words = x.ToLower().Split(' ');
    var set = new HashSet<string>(arr);
    return set.Contains(words[0]);
}
Sorry my question was so vague but here is the solution thanks to some help from a few people that answered.
var regex = new Regex("^(table|chair|plate) *.*");
if (regex.IsMatch(x.ToLower())){}
I have several lists, each containing about 2000-3000 words:
var list1 = new List<string> {"able", "adorable", "adventurous", ...};
Then, to check whether string inputStr = "do, dream"; contains any value from the list, I split it with string[] words = inputStr.Split(' '); and, inside foreach (string word in words), test if (list1.Any(word.Contains)).
I'm not sure whether it is because I use a list or because Contains is the wrong method for this case, but the results include words that are not equal to the words in the input string and merely contain them as a substring. For example, for the words "do" and "dream":
(do) adorable, doubt, fully, do, doh, freedom, down, double
(dream) dreamily, dream
I'm not sure how to avoid this; maybe a Dictionary or SortedDictionary would be better if the list is the problem. I get the same result with var val1 = list1.FirstOrDefault(stringToCheck => stringToCheck.Contains(word)); Either search returns every list word that contains an input word as a substring, but the desired result is only the words that are equal:
(do) do
(dream) dream
IndexOf() method will get you the index of any equivalent strings within the collection.
You could also do something like this with LINQ:
list.Any(x => x == "testString");
To find the words that contain your "word", use Where from LINQ:
// (do) adorable, doubt, fully, do, doh, freedom, down, double
var result = list1.Where(word => word.Contains("do"));
But if you're trying to get only the words that match fully:
var result = list1.Where(word => word.Equals("do"));
Combining this with your input list :
var result = input.SelectMany(x => list1.Where(w => w.Equals(x)));
EDIT:
Here you can check it online
You can get it done with a single Linq line:
List<string> list1 = new List<string> { "able", "adorable", "adventurous" };
string inputstr = "the adorable adventurous cat";
var found_words = inputstr.Split(' ').Where(word => list1.Contains(word)).ToList();
// found_words[0] = "adorable"
// found_words[1] = "adventurous"
if (list1.Contains(word))
Will only match whole exact strings in list.
But in that case, you should make list1 a HashSet instead, that will have much better performance.
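A sketch of the HashSet suggestion applied to the input from the question. The ExactWordLookup class and FindExactMatches method are hypothetical names; splitting on commas and other punctuation as well as spaces, and ignoring case, are assumptions added because the sample input is "do, dream".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExactWordLookup
{
    // Returns the words of input that are equal to an entry in wordList.
    // Whole words only; the comparison ignores case.
    public static List<string> FindExactMatches(string input, IEnumerable<string> wordList)
    {
        // A HashSet gives near constant-time lookups instead of scanning the list.
        var known = new HashSet<string>(wordList, StringComparer.OrdinalIgnoreCase);
        var separators = new[] { ' ', ',', '.', ';' };

        return input
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => known.Contains(word))
            .ToList();
    }
}
```

For the list in the question, FindExactMatches("do, dream", list1) should return only "do" and "dream", not "adorable" or "dreamily". If the list is reused for many inputs, building the HashSet once and keeping it would avoid repeating that work.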
Linq is still your best bet. Assuming you want case sensitivity but don't want to observe hanging whitespace:
public string Foo(string input, List<string> list)
{
return list.FirstOrDefault(t => t.Trim() == input.Trim());
}
I personally prefer comparing strings with == rather than Equals most of the time, though for string comparisons you may want to specify the culture as necessary.
I've got a long string in the format of:
WORD_1#WORD_3#WORD_5#CAT_DOG_FISH#WORD_2#WORD_3#CAT_DOG_FISH_2#WORD_7
I'm trying to dynamically match a string so I can return its position within the string.
I know the string will start with CAT_DOG_ but the FISH is dynamic and could be anything. It's also important not to match on the CAT_DOG_FISH_2(int)
Basically, I need to get back a match on any word starting with [CAT_DOG_] but not ending in [_(int)]
I've tried a few different things and I don't seem to be getting anywhere, so any help is appreciated.
Once I have the regex to match, I'll be able to get the index of the match, then work out where the next # (delimiter) is, which gives me the start and end position of the word. I can then use Substring to return the full word.
I hope that makes sense?
Personally I avoid Regex whenever possible as I find them hard to read and maintain unless you use them a lot, so here is a non-regex solution:
string words = "WORD_1#WORD_3#WORD_5#CAT_DOG_FISH#WORD_2#WORD_3#CAT_DOG_FISH_2#WORD_7";
var result = words.Split('#')
.Select((w,p) => new { WholeWord = w, SplitWord = w.Split('_'), Position = p, Dynamic = w.Split('_').Last() })
.FirstOrDefault(
x => x.SplitWord.Length == 3 &&
x.SplitWord[0] == "CAT" &&
x.SplitWord[1] == "DOG");
That gives you the whole word, the dynamic part and the position. It does assume the dynamic part doesn't have underscores.
You can use the following regex:
\bCAT_DOG_[a-zA-Z]+(?!_\d)\b
See demo
Or (if the FISH is really anything, but not _ or #):
\bCAT_DOG_[^_#]+(?!_\d)\b
See demo
The word boundaries \b with the look-ahead (?!_\d) (meaning that there must be no _ and a digit) help us return only the required strings. The [^_#] character class matches any character but a _ or #.
You can get the indices using LINQ:
var s = "WORD_1#WORD_3#WORD_5#CAT_DOG_FISH#WORD_2#WORD_3#CAT_DOG_FISH_2#WORD_7";
var rx1 = new Regex(@"\bCAT_DOG_[^_#]+(?!_\d)\b");
var indices = rx1.Matches(s).Cast<Match>().Select(p => p.Index).ToList();
Values can be obtained like this:
var values = rx1.Matches(s).Cast<Match>().Select(p => p.Value).ToList();
Or together:
var values = rx1.Matches(s).OfType<Match>().Select(p => new { p.Index, p.Value }).ToList();
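As a quick check of the pattern against the sample string, the lookup can be wrapped in a method that returns each match with its position. This is a sketch only; the TokenFinder class and Find method are hypothetical names, and the pattern is the second regex from this answer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenFinder
{
    // CAT_DOG_ followed by characters other than '_' and '#',
    // not followed by an underscore and a digit.
    private static readonly Regex CatDog = new Regex(@"\bCAT_DOG_[^_#]+(?!_\d)\b");

    // Returns (index, value) pairs for every matching token in input.
    public static List<KeyValuePair<int, string>> Find(string input)
    {
        return CatDog.Matches(input)
            .Cast<Match>()
            .Select(m => new KeyValuePair<int, string>(m.Index, m.Value))
            .ToList();
    }
}
```

For the sample string, this should yield a single pair: index 21 and value CAT_DOG_FISH. CAT_DOG_FISH_2 is skipped because the look-ahead and the word boundary cannot both be satisfied there.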
Thanks for the help, guys. Since I know the int the string will end with, I've settled on this:
int i = 0;
string[] words = textBox1.Text.Split('#');
foreach (string word in words)
{
if (word.StartsWith("CAT_DOG_") && (!word.EndsWith(i.ToString())) )
{
//process here
MessageBox.Show("match is: " + word);
}
}
Thanks to Eser for pointing me towards String.Split()
I have 2 strings which both are some kind of reference number (have a prefix and digits).
string a = "R&D123";
string b = "R&D 123";
string a and string b are two different user inputs, and I'm trying to check whether the two strings match.
I know I can use String.Compare() to check if two strings are the same, but like in the example above, they could be different strings but are technically the same thing.
Because they are both user inputs (from different users), there can be several different formats.
"R&D123"
"R&D 123" //with space in between
"R.D.123 " //using period or other character
"r&d123" //different case
"RD123" //no special character
...etc
Is there a way I can somehow "normalize" the two strings first then compare them??
I know an easy-to-understand way is to use string.Replace() to replace special characters and spaces with an empty string, and string.ToLower() so I don't have to worry about case. But the problem with this method is that if I have many special characters, I'll be calling .Replace() quite a few times, and that's not ideal.
Another problem is that R&D is not the only prefix I need to worry about; there are others such as A.P., K-D, etc. I'm not sure if this makes a difference :/
Any help is appreciated, thanks!
If you want to keep just the letters and digits, you can do it with LINQ:
var array1 = a.Where(x => char.IsLetterOrDigit(x)).ToArray();
var array2 = b.Where(x => char.IsLetterOrDigit(x)).ToArray();
var normalizedStr1 = new String(array1).ToLower();
var normalizedStr2 = new String(array2).ToLower();
String.Compare(normalizedStr1,normalizedStr2);
This might not be the prettiest way to do it, but it should be fast:
static void Main(string[] args)
{
string sampleResult = NormalizeAlphaNumeric("Hello wordl 3242348&&))&)*^&#R&#&R#)R##)R##R#R##");
}
public static string NormalizeAlphaNumeric(string someValue)
{
var sb = new StringBuilder(someValue.Length);
foreach (var ch in someValue)
{
if(char.IsLetterOrDigit(ch))
{
sb.Append(ch);
}
}
return sb.ToString().ToLower();
}
Try this:
string s2 = Regex.Replace(s, @"[^a-zA-Z0-9]+", String.Empty);
It removes all the special characters and gives you the normalized string.
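Putting the normalize-then-compare idea together for the reference numbers in the question, a minimal sketch might look like this. The ReferenceNormalizer class and its method names are hypothetical, and upper-casing with ToUpperInvariant followed by an ordinal comparison is one possible choice, not the only one.

```csharp
using System;
using System.Linq;

public static class ReferenceNormalizer
{
    // Keeps only letters and digits and upper-cases the result,
    // so "R&D 123", "r.d.123 " and "RD123" all become "RD123".
    public static string Normalize(string value)
    {
        char[] kept = value.Where(c => char.IsLetterOrDigit(c)).ToArray();
        return new string(kept).ToUpperInvariant();
    }

    // True when both inputs normalize to the same reference.
    public static bool SameReference(string a, string b)
    {
        return string.Equals(Normalize(a), Normalize(b), StringComparison.Ordinal);
    }
}
```

With the inputs from the question, ReferenceNormalizer.SameReference("R&D123", "R&D 123") should return true. Note that this treats every separator as insignificant, so prefixes that differ only by punctuation, such as K-D and KD, are considered equal.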