Regex Contains Multiple words - c#

I am trying to search titles for entries that match every word of a search term.
My example looks like this:
string exampleTitle = "apple orange banana";
string term1 = "app bana";
string term2 = "bana app";
string pattern1 = @term1.Replace(" ", "*.*") + "*"; //output:app*.*bana*
string pattern2 = @term2.Replace(" ", "*.*") + "*"; //output:bana*.*app*
//now test
bool isMatch1 = Regex.IsMatch(exampleTitle, pattern1); // true
//now test
bool isMatch2 = Regex.IsMatch(exampleTitle, pattern2); // false
So pattern2 does not match, because banana comes after apple. However, I need the result to be true whenever all of the words in the search term match, regardless of their order.

Regular expressions can be tricky here. Use this approach instead:
String exampleTitle = "apple orange banana";
String terms = "app bana";
Boolean found = true;
// let's clean things up for malformed input with RemoveEmptyEntries
foreach (String term in terms.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries))
found &= exampleTitle.Contains(term);
Using LINQ instead:
// let's clean things up for malformed input with RemoveEmptyEntries
String[] terms = terms_list.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
Boolean found = terms.All(term => exampleTitle.Contains(term));
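If the search should also ignore case, the same idea can be wrapped in a small helper. This is only a sketch: the class and method names (TitleSearch, ContainsAllTerms) are made up for illustration, and it assumes an ordinal, case-insensitive comparison is what you want.

```csharp
using System;
using System.Linq;

public static class TitleSearch
{
    // Hypothetical helper: true when every space-separated term occurs
    // somewhere in the title, in any order, ignoring case.
    public static bool ContainsAllTerms(string title, string terms)
    {
        return terms
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .All(term => title.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0);
    }

    public static void Main()
    {
        string title = "Apple Orange Banana";
        Console.WriteLine(ContainsAllTerms(title, "app bana")); // True
        Console.WriteLine(ContainsAllTerms(title, "bana APP")); // True
        Console.WriteLine(ContainsAllTerms(title, "app kiwi")); // False
    }
}
```

IndexOf with a StringComparison is used here because it is available on every .NET version.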

You can use the regular expression (?=.*app)(?=.*bana) instead:
string pattern1 = "(?=.*"+term1.Replace(" ", ")(?=.*") + ")"; //output:(?=.*app)(?=.*bana)
string pattern2 = "(?=.*" + term2.Replace(" ", ")(?=.*") + ")"; //output:(?=.*bana)(?=.*app)
You can limit backtracking and forward search with this:
string pattern1 = "(?=(?>.*?"+term1.Replace(" ", "))(?=(?>.*?") + "))"; //output:(?=(?>.*?app))(?=(?>.*?bana))
string pattern2 = "(?=(?>.*?" + term2.Replace(" ", "))(?=(?>.*?") + "))"; //output:(?=(?>.*?bana))(?=(?>.*?app))
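One thing to watch with the string-concatenation approach: a term containing regex metacharacters (for example "c++" or "a.b") changes the meaning of the pattern. A minimal sketch that builds the same lookahead pattern but escapes each term first is shown here; the names TermPattern and Build are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TermPattern
{
    // Builds one lookahead per term, e.g. "app bana" -> (?=.*app)(?=.*bana).
    // Regex.Escape keeps characters such as '+' or '.' in a term literal.
    public static string Build(string terms)
    {
        var parts = terms
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => "(?=.*" + Regex.Escape(t) + ")");
        return string.Concat(parts);
    }

    public static void Main()
    {
        string title = "apple orange banana";
        Console.WriteLine(Build("bana app"));                       // (?=.*bana)(?=.*app)
        Console.WriteLine(Regex.IsMatch(title, Build("app bana"))); // True
        Console.WriteLine(Regex.IsMatch(title, Build("bana app"))); // True
        Console.WriteLine(Regex.IsMatch(title, Build("app kiwi"))); // False
    }
}
```

Because every term gets its own lookahead anchored at the same position, the order of the terms no longer matters.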

I need to true when matching all of words in search term without any order
This could be more clearly expressed as:
bool isMatch = Regex.IsMatch(exampleTitle, ".*app.*") && Regex.IsMatch(exampleTitle, ".*bana.*");
As noted in the other answer, there are non-regex ways to do substring matching that may be more appropriate.

Related

Split and Append "AND" between values

How can I split the value below and append AND between the values?
I cannot split on spaces because there are spaces within the names.
"\"Mark John\" \"Tina Roy\""
as
"\"Mark John\" AND \"Tina Roy\""
In the end it should look like -
"Mark John" AND "Tina Roy"
Any help is appreciated.
string operatorValue = " AND ";
if (!string.IsNullOrEmpty(operatorValue))
{
foreach (string searchVal in SearchRequest.Text.Split(' '))
{
if (!string.IsNullOrEmpty(searchVal))
searchValue += searchVal + operatorValue;
}
}
int index = searchValue.LastIndexOf(operatorValue);
if (index != -1)
{
outputSearchValue = searchValue.Substring(0, index);
}
Try
var result = str.Replace("\" \"","\" AND \"");
If you have more than one name, or there is a possibility that you could have more than one whitespace character between two names, you could opt for Regex.
var result = Regex.Replace(str,"\"\\s+\"","\" AND \"");
Example,
var str = "\"Mark John\" \"Tina Roy\" \"Anu Viswan\"";
var result = Regex.Replace(str,"\"\\s+\"","\" AND \"");
Output
"Mark John" AND "Tina Roy" AND "Anu Viswan"
Or use Regular Expressions:
var test = "\"John Smith\" \"Bill jones\" \"Bob Norman\"";
Console.WriteLine(Regex.Replace(test, "\" \"", "\" AND \""));
Instead of splitting, replace each quote-space-quote sequence with the same sequence containing AND, so that the quotes are kept:
var test = "\"Mark John\" \"Tina Roy\"";
var new_string = test.Replace("\" \"", "\" AND \"");
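Another option, if you would rather extract the quoted phrases than patch the gaps between them, is to match each phrase and join the matches. This is a rough sketch under the assumption that every value is wrapped in double quotes and contains no embedded quotes; anything outside the quotes is dropped. QuotedTerms and JoinWithAnd are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedTerms
{
    // Finds every "..." phrase (quotes included) and joins them with " AND ".
    public static string JoinWithAnd(string input)
    {
        var phrases = Regex.Matches(input, "\"[^\"]*\"")
            .Cast<Match>()
            .Select(m => m.Value);
        return string.Join(" AND ", phrases);
    }

    public static void Main()
    {
        // Several spaces between the phrases are handled as well.
        Console.WriteLine(JoinWithAnd("\"Mark John\"   \"Tina Roy\""));
        // "Mark John" AND "Tina Roy"
    }
}
```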

Extract some numbers and decimals from a string

I have a string:
" a.1.2.3 #4567 "
and I want to reduce that to just "1.2.3".
Currently using Substring() and Remove(), but that breaks if there ends up being more numbers after the pound sign.
What's the best way to go about doing this? I've read a bunch of questions on regex & string.split, but I can't get anything I try to work in VB.net. Would I have to do a match then replace using the match result?
Any help would be much appreciated.
This should work:
string input = " a.1.2.3 #4567 ";
int poundIndex = input.IndexOf("#");
if(poundIndex >= 0)
{
string relevantPart = input.Substring(0, poundIndex).Trim();
IEnumerable<Char> numPart = relevantPart.SkipWhile(c => !Char.IsDigit(c));
string result = new string(numPart.ToArray());
}
Try this...
string[] splited = input.Split('#');
string output = splited[0].Trim().Substring(2); // 2 skips the leading "a." once the surrounding blanks are trimmed
Here is regex way of doing it
string input = " a.1.2.3 #4567 ";
Regex regex = new Regex(@"(\d\.)+\d");
var match = regex.Match(input);
if(match.Success)
{
string output = match.Groups[0].Value;//"1.2.3"
//Or, equivalently:
//string output = match.Value;//"1.2.3"
}
If the pound sign is the most relevant bit, rely on Split. Sample VB.NET code:
Dim inputString As String = " a.1.2.3 #4567 "
If (inputString.Contains("#")) Then
Dim firstBit As String = inputString.Split("#")(0).Trim()
Dim headingToRemove As String = "a."
Dim result As String = firstBit.Substring(headingToRemove.Length, firstBit.Length - headingToRemove.Length)
End If
Since this is a multi-language question, here is the translation to C#:
string inputString = " a.1.2.3 #4567 ";
if (inputString.Contains("#"))
{
string firstBit = inputString.Split('#')[0].Trim();
string headingToRemove = "a.";
string result = firstBit.Substring(headingToRemove.Length, firstBit.Length - headingToRemove.Length);
}
I guess another way is an unrolled pattern (the spaces inside it require RegexOptions.IgnorePatternWhitespace):
\d+ (?: \. \d+ )+
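For completeness, here is roughly how that unrolled pattern could be used from C#. Treat it as a sketch: VersionExtractor and ExtractDottedNumber are made-up names, and returning null when nothing matches is just one possible choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VersionExtractor
{
    // Returns the first run of digits separated by dots, or null if there is none.
    // IgnorePatternWhitespace lets the pattern keep its readable spacing.
    public static string ExtractDottedNumber(string input)
    {
        Match m = Regex.Match(
            input,
            @"\d+ (?: \. \d+ )+",
            RegexOptions.IgnorePatternWhitespace);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractDottedNumber(" a.1.2.3 #4567 "));  // 1.2.3
        Console.WriteLine(ExtractDottedNumber(" a.10.20 #4567 "));  // 10.20
    }
}
```

Unlike (\d\.)+\d, this form also copes with multi-digit parts such as 10.20, and the number after the pound sign is ignored because it contains no dot.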

C# replace part of string but only exact matches possible?

I have a string beginString = "apple|fruitsapple|turnip";
What I want to do is replace just apple with mango, not fruitsapple.
string fixedString = beginString.Replace("apple","mango"); This doesn't work because it replaces both apple and fruitsapple.
Any ideas?
beginString = "|" + beginString + "|";
fixedString = beginString.Replace("|apple|","|mango|");
fixedString = fixedString.Substring(1, fixedString.Length - 2); // drop the two helper pipes again
A plain Replace cannot do this, because it treats the whole value as one string and replaces every occurrence of the substring. Either split on | as shown here, or keep the values in a list, then compare each item for equality and replace only the exact matches.
string[] words = beginString.Split('|');
Now do the replacement on the items in words. This works for any scenario.
A variation on the other answers, in LINQ style:
string fixedString = string.Join("|",
beginString
.Split('|')
.Select(s => s != "apple" ? s : "mango"));
Closest I can get. I was going to suggest a regular expression, but that won't always work the way you want. You have to split the string first and then rebuild it.
string searchString = "apple";
string newString = "mango";
string beginString = "apple|fruitsapple|turnip";
string[] array = beginString.Split('|');
// Use an indexed loop: a foreach variable cannot be assigned to,
// and strings are immutable, so the new value must be stored back.
for (int i = 0; i < array.Length; i++)
{
    if (array[i] == searchString)
        array[i] = newString;
}
string recreated = string.Join("|", array);
string newstr = Regex.Replace("apple|fruitsapple|turnip", @"\bapple\b", "mango");
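Note that \b treats any non-word character as a boundary, so an item such as apple-pie would also be changed. If only whole pipe-delimited items should match, lookarounds on the delimiter are one way to do it. The following is a sketch, not the only possible approach; PipeList and ReplaceItem are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PipeList
{
    // Replaces only whole items of a pipe-delimited list.
    // The lookbehind/lookahead check the delimiters without consuming them,
    // so adjacent matches ("apple|apple") are all replaced.
    public static string ReplaceItem(string list, string search, string replacement)
    {
        string pattern = @"(?<=^|\|)" + Regex.Escape(search) + @"(?=\||$)";
        // A MatchEvaluator is used so '$' in the replacement is taken literally.
        return Regex.Replace(list, pattern, m => replacement);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceItem("apple|fruitsapple|turnip", "apple", "mango"));
        // mango|fruitsapple|turnip
        Console.WriteLine(ReplaceItem("apple-pie|apple", "apple", "mango"));
        // apple-pie|mango
    }
}
```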

Using LINQ to strip a suffix from a string if it contains a suffix in a list?

How can I strip a suffix from a string, and return it, using C#/LINQ? Example:
string[] suffixes = { "Plural", "Singular", "Something", "SomethingElse" };
string myString = "DeleteItemMessagePlural";
string stringWithoutSuffix = myString.???; // what do I do here?
// stringWithoutSuffix == "DeleteItemMessage"
var firstMatchingSuffix = suffixes.Where(myString.EndsWith).FirstOrDefault();
if (firstMatchingSuffix != null)
myString = myString.Substring(0, myString.LastIndexOf(firstMatchingSuffix));
You need to build a regular expression from the list:
var regex = new Regex("(" + String.Join("|", suffixes.Select(Regex.Escape)) + ")$");
string stringWithoutSuffix = regex.Replace(myString, "");
// Assuming there is exactly one matching suffix (this will check that)
var suffixToStrip = suffixes.Single(x => myString.EndsWith(x));
// Replace the matching one:
var stringWithoutSuffix = Regex.Replace(myString, "(" + Regex.Escape(suffixToStrip) + ")$", "");
OR, since you know the length of the matching suffix:
// Assuming there is exactly one matching suffix (this will check that)
int trim = suffixes.Single(x => myString.EndsWith(x)).Length;
// Remove the matching one:
var stringWithoutSuffix = myString.Substring(0, myString.Length - trim);
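If more than one suffix could match (say the list contained both "Else" and "SomethingElse", which is a hypothetical case not in the question), Single would throw. A possible variation that picks the longest matching suffix and leaves the string alone when nothing matches is sketched here; SuffixStripper and StripSuffix are illustrative names, and ordinal comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SuffixStripper
{
    // Removes the longest suffix from the list that the value ends with.
    // Returns the value unchanged when no suffix matches.
    public static string StripSuffix(string value, IEnumerable<string> suffixes)
    {
        string match = suffixes
            .Where(s => s.Length > 0 && value.EndsWith(s, StringComparison.Ordinal))
            .OrderByDescending(s => s.Length)
            .FirstOrDefault();
        return match == null
            ? value
            : value.Substring(0, value.Length - match.Length);
    }

    public static void Main()
    {
        string[] suffixes = { "Plural", "Singular", "Something", "SomethingElse" };
        Console.WriteLine(StripSuffix("DeleteItemMessagePlural", suffixes)); // DeleteItemMessage
        Console.WriteLine(StripSuffix("DeleteItemMessage", suffixes));       // DeleteItemMessage
    }
}
```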

Remove 2 or more blank spaces from a string in c#..?

What is an efficient way to replace runs of 2 or more spaces in a string with a single space?
I mean, if the string is "a____b", the output must be "a_b".
You can use a regular expression to replace multiple spaces:
s = Regex.Replace(s, " {2,}", " ");
Something like below maybe:
var b=a.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
var noMultipleSpaces = string.Join(" ",b);
string tempo = "this is    a      string   with     spaces";
RegexOptions options = RegexOptions.None;
Regex regex = new Regex(@"[ ]{2,}", options);
tempo = regex.Replace(tempo, @" ");
You can use this method and pass your string value as the argument.
You also have to import one namespace: using System.Text.RegularExpressions;
public static string RemoveMultipleWhiteSpace(string str)
{
    // A.
    // Create the Regex. \s+ matches any run of whitespace (spaces, tabs, newlines).
    Regex r = new Regex(@"\s+");
    // B.
    // Replace each run with a single space.
    string s3 = r.Replace(str, @" ");
    return s3;
}
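The patterns in these answers are not interchangeable: " {2,}" only touches runs of the space character, while \s+ also rewrites tabs and line breaks. A small side-by-side sketch (the class and method names are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceCollapser
{
    // Collapses runs of two or more space characters; tabs and newlines are kept.
    public static string CollapseSpaces(string s)
    {
        return Regex.Replace(s, " {2,}", " ");
    }

    // Collapses every run of whitespace (spaces, tabs, newlines) to one space.
    public static string CollapseWhitespace(string s)
    {
        return Regex.Replace(s, @"\s+", " ");
    }

    public static void Main()
    {
        Console.WriteLine(CollapseSpaces("a    b"));       // a b
        Console.WriteLine(CollapseWhitespace("a \t b"));   // a b
        // CollapseSpaces leaves "a \t b" unchanged: there is no run of two spaces.
        Console.WriteLine(CollapseSpaces("a \t b") == "a \t b"); // True
    }
}
```

Which one is appropriate depends on whether tabs and line breaks should survive.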
