Case insensitive LIKE condition in LINQ (with Regular Expression)

Case insensitive LIKE condition in LINQ (with Regular Expression) - c#

I have following code that works if the search text and the items in the list are of same case (lower case / upper case). If there is a mixed casing it is not working,. How can we make it case insensitive search.
var text = "c";
var myStrings = new List<string>() { "Aa", "ACB", "cc" };
var regEx = new System.Text.RegularExpressions.Regex(text);
var results = myStrings
.Where<string>(item => regEx.IsMatch(item))
.ToList<string>();
EDIT :
I need to pass that string with Case Insensitive to the method how can i do that ...
public ActionResult GetItems(string text)
{
ContextObject contextObject = new ContextObject();
TransactionHistory transactionhistory = new TransactionHistory();
System.Text.RegularExpressions.Regex regEx = new System.Text.RegularExpressions.Regex(text, RegexOptions.IgnoreCase);
var items = transactionhistory.GetItems(contextObject, text);
return Json(items, JsonRequestBehavior.AllowGet);
}

Try to declare your regex like this
Regex regEx = new Regex(text, RegexOptions.IgnoreCase);

you need to use the overload which takes RegexOptions.IgnoreCase
Example
RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Compiled;
System.Text.RegularExpressions.Regex regEx = new System.Text.RegularExpressions.Regex(text, options);
EDIT:
RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Compiled;
var text = "c";
var myStrings = new List<string>() { "Aa", "ACB", "cc" };
var regEx = new System.Text.RegularExpressions.Regex(text, options);
var results = myStrings
.Where<string>(item => regEx.IsMatch(item))
.ToList<string>();
//you will have 2 items in results
foreach(string s in results)
{
GetItems(s);
}

Based in your code, why using a regex? I would use a regex only with complex text patterns. In this case is way easier to use string.IndexOf() like in
var text = "c";
var myStrings = new List<string>() { "Aa", "ACB", "cc" };
var results = myStrings
.Where(item => item.IndexOf(text, StringComparison.CurrentCultureIgnoreCase) >= 0)
.ToList();
I have al removed the explicit use of string in where and toList as it is applied by default.

Related

How to highlight only results of PrefixQuery in Lucene and not whole words?

I'm fairly new to Lucene and perhaps doing something really wrong, so please correct me if it is the case. Being searching for the answer for a few days now and not sure where to go from here.
The goal is to use Lucene.NET to search for user names with partial search (like StartsWith) and highlight only the found parts. For instance if I search for abc in a list of ['a', 'ab', 'abc', 'abcd', 'abcde'] it should return just the last three in a form of ['<b>abc</b>', '<b>abc</b>d', '<b>abc</b>de']
Here is how I approached this.
First the index creation:
using var indexDir = FSDirectory.Open(Path.Combine(IndexDirectory, IndexName));
using var standardAnalyzer = new StandardAnalyzer(CurrentVersion);
var indexConfig = new IndexWriterConfig(CurrentVersion, standardAnalyzer);
indexConfig.OpenMode = OpenMode.CREATE_OR_APPEND;
using var indexWriter = new IndexWriter(indexDir, indexConfig);
if (indexWriter.NumDocs == 0)
{
//fill the index with Documents
}
The documents are created like this:
static Document BuildClientDocument(int id, string surname, string name)
{
var document = new Document()
{
new StringField("Id", id.ToString(), Field.Store.YES),
new TextField("Surname", surname, Field.Store.YES),
new TextField("Surname_sort", surname.ToLower(), Field.Store.NO),
new TextField("Name", name, Field.Store.YES),
new TextField("Name_sort", name.ToLower(), Field.Store.NO),
};
return document;
}
The search is done like this:
using var multiReader = new MultiReader(indexWriter.GetReader(true)); //the plan was to use multiple indexes per entity types
var indexSearcher = new IndexSearcher(multiReader);
var queryString = "abc"; //just as a sample
var queryWords = queryString.SplitWords();
var query = new BooleanQuery();
queryWords
.Process((word, index) =>
{
var boolean = new BooleanQuery()
{
{ new PrefixQuery(new Term("Surname", word)) { Boost = 100 }, Occur.SHOULD }, //surnames are most important to match
{ new PrefixQuery(new Term("Name", word)) { Boost = 50 }, Occur.SHOULD }, //names are less important
};
boolean.Boost = (queryWords.Count() - index); //first words in a search query are more important than others
query.Add(boolean, Occur.MUST);
})
;
var topDocs = indexSearcher.Search(query, 50, new Sort( //sort by relevance and then in lexicographical order
SortField.FIELD_SCORE,
new SortField("Surname_sort", SortFieldType.STRING),
new SortField("Name_sort", SortFieldType.STRING)
));
And highlighting:
var htmlFormatter = new SimpleHTMLFormatter();
var queryScorer = new QueryScorer(query);
var highlighter = new Highlighter(htmlFormatter, queryScorer);
foreach (var found in topDocs.ScoreDocs)
{
var document = indexSearcher.Doc(found.Doc);
var surname = document.Get("Surname"); //just for simplicity
var surnameFragment = highlighter.GetBestFragment(standardAnalyzer, "Surname", surname);
Console.WriteLine(surnameFragment);
}
The problem is that the highlighter returns results like this:
<b>abc</b>
<b>abcd</b>
<b>abcde</b>
<b>abcdef</b>
So it "highlights" entire words even though I was searching for partials.
Explain returned NON-MATCH all the way so not sure if it's helpful here.
Is it possible to highlight only the parts which were searched for? Like in my example.

While searching a bit more on this I came to a conclusion that to make such highlighting work one needs to tweak index generation methods and split indices by parts so offsets would be properly calculated. Or else highlighting will highlight only surrounding words (fragments) entirely.
So based on this I've managed to build a simple highlighter of my own.
public class Highlighter
{
private const string TempStartToken = "\x02";
private const string TempEndToken = "\x03";
private const string SearchPatternTemplate = $"[{TempStartToken}{TempEndToken}]*{{0}}";
private const string ReplacePattern = $"{TempStartToken}$&{TempEndToken}";
private readonly ConcurrentDictionary<HighlightKey, Regex> _regexPatternsCache = new();
private static string GetHighlightTypeTemplate(HighlightType highlightType) =>
highlightType switch
{
HighlightType.Starts => "^{0}",
HighlightType.Contains => "{0}",
HighlightType.Ends => "{0}$",
HighlightType.Equals => "^{0}$",
_ => throw new ArgumentException($"Unsupported {nameof(HighlightType)}: '{highlightType}'", nameof(highlightType)),
};
public string Highlight(string text, IReadOnlySet<string> words, string startToken, string endToken, HighlightType highlightType)
{
foreach (var word in words)
{
var key = new HighlightKey
{
Word = word,
HighlightType = highlightType,
};
var regex = _regexPatternsCache.GetOrAdd(key, _ =>
{
var parts = word.Select(w => string.Format(SearchPatternTemplate, Regex.Escape(w.ToString())));
var pattern = string.Concat(parts);
var highlightPattern = string.Format(GetHighlightTypeTemplate(highlightType), pattern);
return new Regex(highlightPattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
});
text = regex.Replace(text, ReplacePattern);
}
return text
.Replace(TempStartToken, startToken)
.Replace(TempEndToken, endToken)
;
}
private record HighlightKey
{
public string Word { get; init; }
public HighlightType HighlightType { get; init; }
}
}
public enum HighlightType
{
Starts,
Contains,
Ends,
Equals,
}
Use it like this:
var queries = new[] { "abc" }.ToHashSet();
var search = "a ab abc abcd abcde";
var highlighter = new Highlighter();
var outputs = search
.Split((string[])null, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Select(w => highlighter.Highlight(w, queries, "<b>", "</b>", HighlightType.Starts))
;
var result = string.Join(" ", outputs).Dump();
Util.RawHtml(result).Dump();
Output looks like this:
a ab <b>abc</b> <b>abc</b>d <b>abc</b>de
a ab abc abcd abcde
I'm open to any other better solutions.

How to split and take multiple strings from a url in c#?

I have a string looking something like this:
/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag
I want a list of string with "Orgrimmar", "Stormwind" and "Undercity". How is this possible so that it splits AFTER Query and between & and + in order so that we avoid getting a string like this "Orgrimmar+l%C3%A4n=01&Stormwind".
Let us assume that we don't know the name of the strings.. :)
Updated, i still don't seem to get it to work. I have added a list of counties that i can use to validate this. However i still find it hard in this case. countyList is used to validate that the counties/cities in the url matches a pre-existing Collection.
var countyQuery = Request.Url.Query;
var counties = this._locationService.GetAllCounties();
List<string> countyList = new List<string>();
List<string> selectedCountiesList = new List<string>();
foreach (var i in counties)
{
countyList.Add(i.Name);
}
Regex r = new Regex(#"&(.+?)\+");
MatchCollection mc = r.Matches(countyQuery);
foreach (Match curMatch in mc)
{
if (countyList.Contains(curMatch.Groups[1].Value))
{
selectedCountiesList.Add(curMatch.Groups[1].Value);
}
}
return selectedCountiesList;
Changed url to be/?Gender=&Age=&Query=&county=13&county=08&county=01&Page=1
where 13, 08, 01 and so on is Id of the counties
The final solution was:
var selectedCountyQuery = Request.QueryString
//CountySearch = "county"
[QueryStringParameters.CountySearch];
List countyList = new List();
List<string> selectedCounties = new List<string>();
if (!string.IsNullOrEmpty(selectedCountyQuery))
{
var selectedCountiesArray = selectedCountyQuery.Split(new[]{ ',' });
foreach (var selectedCounty in selectedCountiesArray)
{
selectedCounties.Add(selectedCounty);
}
}
return selectedCounties;

You can get all parameter and value with Substring() and Split() method.
Example :
var URL = "controller/method?var1=&var2=&var3=dsgdf";
var ParameterPart = URL.Split("?")[1];
var ParametersArray = ParameterPart.Split("&");
//output : ["var1=","var2=","var3=dsgdf"];
foreach(var Parameter in ParametersArray)
{
var ParameterName= Parameter.Split("=")[0];
var ParameterValue= Parameter.Split("=")[1];
}

You can use a regex and extract the matches:
Regex r = new Regex(#"&(.+?)\+");
MatchCollection mc = r.Matches(s);
Then you can itterate your desired strings (in this case wow cities) like:
foreach(Match curMatch in mc)
{
Console.WriteLine(curMatch.Groups[1].Value);
}

string[] numbers ={ "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag"};
string sPattern = #"(?<=&Orgrimmar)+";
foreach (string s in numbers){
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern)){
System.Console.WriteLine(" - valid");}
else{System.Console.WriteLine(" - invalid");}
Output: valid
string[] numbers ={ "/Gender=&Age=&Query=Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag"};
Output: invalid
Further to check two parameters:
string[] numbers ={ "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag"};
string sPattern = #"(?<=&Orgrimmar)+";
string sPattern2 = #"(?<=&Stormwind)+";
foreach (string s in numbers){
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern) && System.Text.RegularExpressions.Regex.IsMatch(s, sPattern2))
...

Check a pattern in a string then convert it to upper case

I was not clear with my previous question
I have a list: new List<string> { "lts", "mts", "cwts", "rotc" };
Now I wan't to check a pattern in string that starts or ends with a forward slash like this: "cTws/Rotc/lTs" or "SomethingcTws cWtS/Rotc rOtc".
and convert to upper case only the string that starts/ends with a forward slash based on the list that I have.
So the output should be: "CWTS/ROTC/LTS", "SomethingcTws CWTS/ROTC rOtc"
I modified Sachin's answer:
List<string> replacementValues = new List<string>
{
"cwts",
"mts",
"rotc",
"lts"
};
string pattern = string.Format(#"\G({0})/?", string.Join("|", replacementValues.Select(x => Regex.Escape(x))));
Regex regExp = new Regex(pattern, RegexOptions.IgnoreCase);
string value = "Cwts/Rotc Somethingcwts1 Cwts/Rotc/lTs";
string result = regExp.Replace(value, s => s.Value.ToUpper());
Result: CWTS/ROTC Somethingcwts1 Cwts/Rotc/lTs
The desired output should be: CWTS/ROTC Somethingcwts1 CWTS/ROTC/LTS

So instead of using Regex, which I'm not really good with, I'm doing split by space then split by "/" then rejoin the strings
string val = "Somethingrotc1 cWts/rOtC/lTs Cwts";
List<string> replacementValues = new List<string>
{
"lts", "mts",
"cwts", "rotc"
};
string[] tokens = val.Split(new char[] { ' ' }, StringSplitOptions.None);
string result = string.Join(" ", tokens.Select(t =>
{
// Now split by "/"
string[] ts = t.Split(new char[] { '/' }, StringSplitOptions.None);
if (ts.Length > 1)
{
t = string.Join("/", ts.Select(x => replacementValues.Contains(x.ToLower()) ? x.ToUpper() : x));
}
return t;
}));
Output: Somethingrotc1 CWTS/ROTC/LTS Cwts

You want to change the specific words in the string to Upper case. Then you can use Regex to achieve it.
string value = "Somethingg1 Cwts/Rotc/Lts Cwts";
var replacementValues = new Dictionary<string, string>()
{
{"Cwts","CWTS"},
{"Rotc","ROTC"},
{"Lts","LTC"}
};
var regExpression = new Regex(String.Join("|", replacementValues.Keys.Select(x => Regex.Escape(x))));
var outputString = regExpression.Replace(value, s => replacementValues[s.Value]);

Get everything before the second given character using Regular Expression

I have a list of strings like
[ABC].[XXX].sdfnwoaenwaf
[ABC].[XXX].sdfnwoaenwaf
[ABC].[XX1].sdfnwoaenwaf
[ABC].[XX1].sdfnwoaenwaf
[AB1].[XX3].sdfnwoaenwaf
[AB2].[XX1].sdfnwoaenwaf
How can I get everything before the second dot? e.g. [ABC].[XXX] for the first one.

(.*)\. is the regex you need. You may test on http://regexhero.net/tester/

You could try a regex or use the LastIndexOf method of the String class or the Split method of the String class.
Regex
var toMatch = "[ABC].[XXX].sdfnwoaenwaf";
var pattern = new Regex("(.*?\\..*?)(:?\\.)");
var beforeSecondDot = pattern.Match(toMatch);
var stringBeforeSecondDot = beforeSecondDot.Groups[1].Value;
String.LastIndexOf
var toMatch = "[ABC].[XXX].sdfnwoaenwaf";
var indexOfLastDot = toMatch.LastIndexOf('.');
var beforeSecondDot = toMatch.Substring(0, indexOfLastDot);
String.Split
var toMatch = "[ABC].[XXX].sdfnwoaenwaf";
var parts = toMatch.Split('.');
var beforeSecondDot = String.Join(".", parts.Take(2));

string x = "[AB2].[XX1].sdfnwoaenwaf";
Regex regex = new Regex("([^.]+\\.[^.]+)\\.");
Match match = regex.Match(x);
if (match.Success)
{
Console.WriteLine(match.Groups[1].Value);
}
Output:
[AB2].[XX1]

List<string> finalStrings = new List<string>();
List<string> strings = new List<string>();
strings.Add("[ABC].[XXX].sdfnwoaenwaf");
strings.Add("[ABC].[XXX].sdfnwoaenwaf");
strings.Add("[ABC].[XX1].sdfnwoaenwaf");
strings.Add("[ABC].[XX1].sdfnwoaenwaf");
strings.Add("[ABC].[XX3].sdfnwoaenwaf");
strings.Add("[ABC].[XX1].sdfnwoaenwaf");
foreach (var item in strings)
{
Regex rex = new Regex("(.*)\\.");
string[] y = rex.Split(item);
finalStrings.Add(y[1]);
}

Searching the first few characters of every word within a string in C#

I am new to programming languages. I have a requirement where I have to return a record based on a search string.
For example, take the following three records and a search string of "Cal":
University of California
Pascal Institute
California University
I've tried String.Contains, but all three are returned. If I use String.StartsWith, I get only record #3. My requirement is to return #1 and #3 in the result.
Thank you for your help.

If you're using .NET 3.5 or higher, I'd recommend using the LINQ extension methods. Check out String.Split and Enumerable.Any. Something like:
string myString = "University of California";
bool included = myString.Split(' ').Any(w => w.StartsWith("Cal"));
Split divides myString at the space characters and returns an array of strings. Any works on the array, returning true if any of the strings starts with "Cal".
If you don't want to or can't use Any, then you'll have to manually loop through the words.
string myString = "University of California";
bool included = false;
foreach (string word in myString.Split(' '))
{
if (word.StartsWith("Cal"))
{
included = true;
break;
}
}

I like this for simplicity:
if(str.StartsWith("Cal") || str.Contains(" Cal")){
//do something
}

You can try:
foreach(var str in stringInQuestion.Split(' '))
{
if(str.StartsWith("Cal"))
{
//do something
}
}

You can use Regular expressions to find the matches. Here is an example
//array of strings to check
String[] strs = {"University of California", "Pascal Institute", "California University"};
//create the regular expression to look for
Regex regex = new Regex(#"Cal\w*");
//create a list to hold the matches
List<String> myMatches = new List<String>();
//loop through the strings
foreach (String s in strs)
{ //check for a match
if (regex.Match(s).Success)
{ //add to the list
myMatches.Add(s);
}
}
//loop through the list and present the matches one at a time in a message box
foreach (String matchItem in myMatches)
{
MessageBox.Show(matchItem + " was a match");
}

string univOfCal = "University of California";
string pascalInst = "Pascal Institute";
string calUniv = "California University";
string[] arrayofStrings = new string[]
{
univOfCal, pascalInst, calUniv
};
string wordToMatch = "Cal";
foreach (string i in arrayofStrings)
{
if (i.Contains(wordToMatch)){
Console.Write(i + "\n");
}
}
Console.ReadLine();
}

var strings = new List<string> { "University of California", "Pascal Institute", "California University" };
var matches = strings.Where(s => s.Split(' ').Any(x => x.StartsWith("Cal")));
foreach (var match in matches)
{
Console.WriteLine(match);
}
Output:
University of California
California University

This is actually a good use case for regular expressions.
string[] words =
{
"University of California",
"Pascal Institute",
"California University"
}
var expr = #"\bcal";
var opts = RegexOptions.IgnoreCase;
var matches = words.Where(x =>
Regex.IsMatch(x, expr, opts)).ToArray();
The "\b" matches any word boundary (punctuation, space, etc...).

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Case insensitive LIKE condition in LINQ (with Regular Expression) - c#

Try to declare your regex like this Regex regEx = new Regex(text, RegexOptions.IgnoreCase);

Related

How to highlight only results of PrefixQuery in Lucene and not whole words?

How to split and take multiple strings from a url in c#?

Check a pattern in a string then convert it to upper case

Get everything before the second given character using Regular Expression

Searching the first few characters of every word within a string in C#

Categories

Resources