Comparing Multiple Strings using .Contains - c#

I am trying to compare a string to see if it contains a curse word. I assumed that I could do this using str.Contains("" || "") although I quickly realized I cannot use || with two strings. What would I use in place of this?
str.Contains("123" || "abc");
I expected it to see if it contains 123 or abc but the code segment does not work as it cannot compare two strings.

var str = "testabc123";
var str2 = "helloworld";
var bannedWords = new List<string>
{
"test",
"ok",
"123"
};
var res = bannedWords.Any(x => str.Contains(x)); //true
var res2 = bannedWords.Any(x => str2.Contains(x)); //false
You can do something like this. Create a list with the swear words, then you can check if the string contains any word in the list.
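Note that string.Contains is case-sensitive, so "ABC" would not be caught by a banned word "abc". A minimal sketch of a case-insensitive variant follows, assuming ordinal comparison is good enough for the word list; the class and method names (WordFilter, ContainsAny) are made up for illustration and are not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordFilter
{
    // Hypothetical helper: true if text contains any of the words, ignoring case.
    public static bool ContainsAny(string text, IEnumerable<string> words)
    {
        // IndexOf with a StringComparison argument is available on both
        // .NET Framework and .NET Core, unlike Contains(string, StringComparison).
        return words.Any(w => text.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

Calling WordFilter.ContainsAny(str, bannedWords) then replaces the bannedWords.Any(...) expression above.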

Try the following approach
using System;
using System.Collections.Generic;
public class Program
{
private static readonly List<string> curseWords = new List<string>() {"123", "abc"};
public static void Main()
{
string input = "text to be checked with word abc";
if(isContainCurseWord(input)){
Console.WriteLine("Input contains at least one curse word");
}else{
Console.WriteLine("Input does not contain any curse words");
}
}
public static bool isContainCurseWord(string text){
foreach(string curse in curseWords){
if(text.Contains(curse)){
return true;
}
}
return false;
}
}

Try -
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var input = "some random string with abc and 123";
var words = new List<String>() {"123", "abc"};
var foundAny = words.Any(word => input.Contains(word));
Console.WriteLine("Contains: {0}", foundAny);
}
}

Try -
var array = new List<String>() {"123", "abc"};
var found = array.Contains("abc");
Console.WriteLine("Contains: {0}", found);

Related

How to highlight only results of PrefixQuery in Lucene and not whole words?

I'm fairly new to Lucene and perhaps doing something really wrong, so please correct me if that is the case. I've been searching for the answer for a few days now and I'm not sure where to go from here.
The goal is to use Lucene.NET to search for user names with partial search (like StartsWith) and highlight only the found parts. For instance, if I search for abc in a list of ['a', 'ab', 'abc', 'abcd', 'abcde'], it should return just the last three in the form ['<b>abc</b>', '<b>abc</b>d', '<b>abc</b>de'].
Here is how I approached this.
First the index creation:
using var indexDir = FSDirectory.Open(Path.Combine(IndexDirectory, IndexName));
using var standardAnalyzer = new StandardAnalyzer(CurrentVersion);
var indexConfig = new IndexWriterConfig(CurrentVersion, standardAnalyzer);
indexConfig.OpenMode = OpenMode.CREATE_OR_APPEND;
using var indexWriter = new IndexWriter(indexDir, indexConfig);
if (indexWriter.NumDocs == 0)
{
//fill the index with Documents
}
The documents are created like this:
static Document BuildClientDocument(int id, string surname, string name)
{
var document = new Document()
{
new StringField("Id", id.ToString(), Field.Store.YES),
new TextField("Surname", surname, Field.Store.YES),
new TextField("Surname_sort", surname.ToLower(), Field.Store.NO),
new TextField("Name", name, Field.Store.YES),
new TextField("Name_sort", name.ToLower(), Field.Store.NO),
};
return document;
}
The search is done like this:
using var multiReader = new MultiReader(indexWriter.GetReader(true)); //the plan was to use multiple indexes per entity types
var indexSearcher = new IndexSearcher(multiReader);
var queryString = "abc"; //just as a sample
var queryWords = queryString.SplitWords();
var query = new BooleanQuery();
queryWords
.Process((word, index) =>
{
var boolean = new BooleanQuery()
{
{ new PrefixQuery(new Term("Surname", word)) { Boost = 100 }, Occur.SHOULD }, //surnames are most important to match
{ new PrefixQuery(new Term("Name", word)) { Boost = 50 }, Occur.SHOULD }, //names are less important
};
boolean.Boost = (queryWords.Count() - index); //first words in a search query are more important than others
query.Add(boolean, Occur.MUST);
})
;
var topDocs = indexSearcher.Search(query, 50, new Sort( //sort by relevance and then in lexicographical order
SortField.FIELD_SCORE,
new SortField("Surname_sort", SortFieldType.STRING),
new SortField("Name_sort", SortFieldType.STRING)
));
And highlighting:
var htmlFormatter = new SimpleHTMLFormatter();
var queryScorer = new QueryScorer(query);
var highlighter = new Highlighter(htmlFormatter, queryScorer);
foreach (var found in topDocs.ScoreDocs)
{
var document = indexSearcher.Doc(found.Doc);
var surname = document.Get("Surname"); //just for simplicity
var surnameFragment = highlighter.GetBestFragment(standardAnalyzer, "Surname", surname);
Console.WriteLine(surnameFragment);
}
The problem is that the highlighter returns results like this:
<b>abc</b>
<b>abcd</b>
<b>abcde</b>
<b>abcdef</b>
So it "highlights" entire words even though I was searching for partials.
Explain returned NON-MATCH all the way so not sure if it's helpful here.
Is it possible to highlight only the parts which were searched for? Like in my example.
While searching a bit more on this, I came to the conclusion that to make such highlighting work, one needs to tweak the index generation and split the indexed terms into parts so that offsets are calculated properly. Otherwise, the highlighter highlights the surrounding words (fragments) in their entirety.
So based on this I've managed to build a simple highlighter of my own.
public class Highlighter
{
private const string TempStartToken = "\x02";
private const string TempEndToken = "\x03";
private const string SearchPatternTemplate = $"[{TempStartToken}{TempEndToken}]*{{0}}";
private const string ReplacePattern = $"{TempStartToken}$&{TempEndToken}";
private readonly ConcurrentDictionary<HighlightKey, Regex> _regexPatternsCache = new();
private static string GetHighlightTypeTemplate(HighlightType highlightType) =>
highlightType switch
{
HighlightType.Starts => "^{0}",
HighlightType.Contains => "{0}",
HighlightType.Ends => "{0}$",
HighlightType.Equals => "^{0}$",
_ => throw new ArgumentException($"Unsupported {nameof(HighlightType)}: '{highlightType}'", nameof(highlightType)),
};
public string Highlight(string text, IReadOnlySet<string> words, string startToken, string endToken, HighlightType highlightType)
{
foreach (var word in words)
{
var key = new HighlightKey
{
Word = word,
HighlightType = highlightType,
};
var regex = _regexPatternsCache.GetOrAdd(key, _ =>
{
var parts = word.Select(w => string.Format(SearchPatternTemplate, Regex.Escape(w.ToString())));
var pattern = string.Concat(parts);
var highlightPattern = string.Format(GetHighlightTypeTemplate(highlightType), pattern);
return new Regex(highlightPattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
});
text = regex.Replace(text, ReplacePattern);
}
return text
.Replace(TempStartToken, startToken)
.Replace(TempEndToken, endToken)
;
}
private record HighlightKey
{
public string Word { get; init; }
public HighlightType HighlightType { get; init; }
}
}
public enum HighlightType
{
Starts,
Contains,
Ends,
Equals,
}
Use it like this:
var queries = new[] { "abc" }.ToHashSet();
var search = "a ab abc abcd abcde";
var highlighter = new Highlighter();
var outputs = search
.Split((string[])null, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Select(w => highlighter.Highlight(w, queries, "<b>", "</b>", HighlightType.Starts))
;
var result = string.Join(" ", outputs).Dump();
Util.RawHtml(result).Dump();
Output looks like this:
a ab <b>abc</b> <b>abc</b>d <b>abc</b>de
a ab abc abcd abcde
I'm open to any other better solutions.

How to check if a string matches multiple strings and return value based on match

Apologies if the title doesn't make much sense; English isn't my native language.
What I am trying to do:
1. I have a list of strings
2. I want to check each of those strings against another list of strings
3. Depending which string they contain, the output will be different
In code, it looks like this:
public static Hashtable Matches = new Hashtable
{
{"first_match", "One"},
{"second_match", "Two"},
{"third_match", "Three"},
{"fourth_match", "Four!"},
{"fifth_match", "Five"}
};
Now, I have a list of strings like this:
001_first_match
010_second_match
011_third_match
And I want to check if each string in the list exists in the hashtable (or maybe other data type appropriate for this situation, suggestions appreciated) and based on that, to take the value for the key.
For example: 001_first_match is in the hashtable with first_match key. If found, then I want to take the One value of it and use it.
I can't use ContainsKey because the list of strings isn't 100% exact as the keys. The key is contained within the string, but there's extra data in the string.
I hope it's not too confusing what I want to do.
Try the following LINQ:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
using System.IO;
namespace ConsoleApplication58
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
string[] inputs = { "001_first_match", "010_second_match", "011_third_match" };
foreach (string input in inputs)
{
var results = Matches.Keys.Cast<string>().Where(x => input.Contains(x)).FirstOrDefault();
Console.WriteLine("Input '{0}' found in HashTable : {1}", input, (results == null) ? "False" : "True, key = '" + results + "', Value = '" + Matches[results] + "'");
}
Console.ReadLine();
}
public static Hashtable Matches = new Hashtable
{
{"first_match", "One"},
{"second_match", "Two"},
{"third_match", "Three"},
{"fourth_match", "Four!"},
{"fifth_match", "Five"}
};
}
}
You can use Linq to do this by enumerating over the hashtable, casting each item to DictionaryEntry, and seeing if any element of the list of strings contains the key from the hashtable:
using System;
using System.Linq;
using System.Collections;
using System.Collections.Generic;
namespace Demo
{
class Program
{
public static void Main(string[] args)
{
var Matches = new Hashtable
{
{"first_match", "One"},
{"second_match", "Two"},
{"third_match", "Three"},
{"fourth_match", "Four!"},
{"fifth_match", "Five"}
};
var Targets = new List<string>
{
"001_first_match",
"010_second_match",
"011_third_match"
};
var matches =
Matches.Cast<DictionaryEntry>()
.Where(x => Targets.Any(s => s.Contains((string)x.Key)))
.Select(v => v.Value);
Console.WriteLine(string.Join("\n", matches)); // Outputs "One", "Two" and "Three", in no guaranteed order (Hashtable is unordered).
}
}
}
using System;
using NUnit.Framework;
using System.Collections.Generic;
namespace StackOverflow
{
public class StringMatch
{
public Dictionary<string, string> Matches = new Dictionary<string, string>
{
{ "first_match", "One" },
{ "second_match", "Two" },
{ "third_match", "Three" },
{ "fourth_match", "Four!" },
{ "fifth_match", "Five" }
};
public List<string> Strings = new List<string>
{
"001_first_match",
"010_second_match",
"011_third_match"
};
[Test]
public void FindMatches()
{
foreach (var item in Strings)
{
foreach (var match in Matches)
{
if (item.Contains(match.Key))
{
Console.WriteLine(match.Value);
break;
}
}
}
}
}
}
This can also be done with a two-dimensional array; I hope it helps.
public string test()
{
string result="";
string[,] Hashtable = new string[2,2]
{
{"first_match", "One"},
{"second_match", "Two"},
};
string match = "001_first_match";
for (int i = 0; i < Hashtable.GetLength(0); i++)
{
string test1= Hashtable[i, 0];
if (match.Contains(test1)) { result = Hashtable[i, 1]; }
}
return result;
}
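The answers above mostly print the match. If the value is needed by the caller, the same idea can be wrapped in a small lookup that returns it. This is only a sketch under the assumption that the first key found is the one wanted; KeyLookup and FindValue are illustrative names, not from the question:

```csharp
using System;
using System.Collections.Generic;

public static class KeyLookup
{
    // Hypothetical helper: returns the value of the first key that occurs
    // inside input, or null when no key is contained in it.
    public static string FindValue(string input, IReadOnlyDictionary<string, string> matches)
    {
        foreach (var pair in matches)
        {
            if (input.Contains(pair.Key))
            {
                return pair.Value;
            }
        }
        return null;
    }
}
```

With the dictionary from the question, KeyLookup.FindValue("001_first_match", Matches) would give back the value stored under first_match. A generic Dictionary<string, string> is assumed here instead of Hashtable to avoid casts.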

How to extract a substring from one delimiter to another in C#?

My input is going to be as follows:
abc#gmail.com,def#yahoo.com;xyz#gmail.com;ghi#hotmail.com and so on
Now I want my output to be:
abc
def
xyz
ghi
The following is my code:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main(string[] args)
{
string str;
string[] newstr,newstr2;
Console.WriteLine("Enter the email addresses: ");
str=Console.ReadLine();
newstr=Regex.Split(str,",|;|#");
foreach (string s in newstr)
{
Console.WriteLine(s);
}
}
}
My output right now is:
abc
gmail.com
def
yahoo.com
xyz
gmail.com
ghi
hotmail.com
Any kind of help would be greatly appreciated. Thanks.
You shouldn't use regex for the split, and you should not split by #. Instead, use the following code:
using System;
public class Program
{
public static void Main(string[] args)
{
string str;
string[] newstr;
Console.WriteLine("Enter the email addresses: ");
str = Console.ReadLine();
newstr = str.Split(new char[] { ',', ';' }); // Split to get a temporary array of addresses
foreach (string s in newstr)
{
Console.WriteLine(s.Substring(0, s.IndexOf('#'))); // Extract the sender from the email addresses
}
}
}
Edit:
Or, with LINQ:
using System;
using System.Linq;
public class Program
{
public static void Main(string[] args)
{
string str;
string[] newstr;
Console.WriteLine("Enter the email addresses: ");
str = Console.ReadLine();
newstr = str.Split(new char[] { ',', ';' }) // Split to get an array of addresses to work with
.Select(s => s.Substring(0, s.IndexOf('#'))).ToArray(); // Extract the sender from the email addresses
foreach (string s in newstr)
{
Console.WriteLine(s);
}
}
}
Another approach without Regex:
string input = "abc#gmail.com,def#yahoo.com;xy#gmail.com; ghi#hotmail.com";
var result = input.Split(',', ';').Select(x => x.Split('#').First().Trim());
First split the addresses by , and ;, then select the part before the # by splitting again. The Trim() removes the space left in front of the last entry.
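The split-then-select idea can be made a little more defensive. The sketch below skips empty entries and entries that have no delimiter or nothing before it, instead of throwing or returning the whole entry. It assumes '#' is the delimiter only because the sample input uses it; LocalParts and Extract are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocalParts
{
    // Hypothetical helper: returns the text before the delimiter for each entry.
    public static List<string> Extract(string input, char delimiter = '#')
    {
        return input
            .Split(new[] { ',', ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(entry => entry.Trim())
            // Keep only entries with at least one character before the delimiter.
            .Where(entry => entry.IndexOf(delimiter) > 0)
            .Select(entry => entry.Substring(0, entry.IndexOf(delimiter)))
            .ToList();
    }
}
```

Whether malformed entries should be skipped silently, as here, or reported is a design choice that depends on where the input comes from.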
You can use this email regex:
var regex = new Regex(#"(?<name>\w+([-+.']\w+)*)#\w+([-.]\w+)*\.\w+([-.]\w+)*");
var results =
regex.Matches("abc#gmail.com,def#yahoo.com;xyz#gmail.com;ghi#hotmail.com")
.Cast<Match>()
.Select(m => m.Groups["name"].Value)
.ToList();
Perhaps using this might help
str.Substring(0, str.LastIndexOf(" ")<0?0:str.LastIndexOf(" "));
Since an email address is a weird thing with a complex definition, I would never assume that anything containing a # is an email address.
My best attempt would be to convert the string to a MailAddress, in case it looks like an email address but isn't one because of some invalid character, etc.
string input = "abc#gmail.com,ghi#hotmail.com;notme; #op this is not a mail!";
var result = input
.Split(',', ';') // Split
.Select(x =>
{
string adr = "";
try
{ // Create an MailAddress, MailAddress has no TryParse.
adr = new MailAddress(x).User;
}
catch
{
return new { isValid = false, mail = adr };
}
return new { isValid = true, mail = adr };
})
.Where(x => x.isValid)
.Select(x => x.mail);
Actually, in a regular expression, to capture a substring you need to wrap the expected content in ( and ).
The code below should work:
string str22 = "abc#gmail.com;def#yahoo.com,xyz#gmail.com;fah#yao.com,h347.2162#yahoo.com.hk";// ghi#hotmail.com";
List<string> ret = new List<string>();
string regExp = @"(.*?)#.*?[,;]{1}|(.*)#";
MatchCollection matches = Regex.Matches(str22, regExp, RegexOptions.IgnoreCase);
foreach (Match match in matches)
{
if (match.Success)
{
int pvt = 1;
while (string.IsNullOrEmpty(match.Groups[pvt].Value))
{
pvt++;
}
MessageBox.Show(match.Groups[pvt].Value);
}
}
return;
The regular expression is:
(.*?)#.*?[,;]{1}|(.*)#
(.*?)#.*?[,;]{1} captures the substring before #; the ? makes the match lazy, so it stops at the first possible match.
The last email does not end with , or ;, so an alternative is added with | to capture the last name as the substring before #.
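As an alternative to two capture groups, a lookahead can match only the part before the #, so no group bookkeeping is needed. This is a sketch that swaps in a different pattern from the one above, under the assumption that names never contain a comma, semicolon, whitespace or #; NameRegex and Extract are illustrative names:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NameRegex
{
    // One or more characters that are not a separator, whitespace or '#',
    // followed by a '#' that is checked but not consumed (lookahead).
    private static readonly Regex BeforeHash = new Regex(@"[^,;\s#]+(?=#)");

    public static List<string> Extract(string input)
    {
        return BeforeHash.Matches(input)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }
}
```

Because the domain part is never followed by a #, it fails the lookahead and is skipped without extra handling for the last entry.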

Need to split string with regex

I need to split a string in C# .NET.
The output I am getting: i:0#.f|membership|sdp950452#abctechnologies.com or i:0#.f|membership|tss954652#abctechnologies.com
I need to remove i:0#.f|membership| and #abctechnologies.com from the string. The output I need is sdp950452 or tss954652.
Another string I am getting is Pawar, Jaywardhan, and I need it to be jaywardhan pawar.
Thanks,
Jay
Here is a code example showing how you can do the first part with Regex and the second with Split and Replace:
using System;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
public class Program
{
public static void Main()
{
//First part
string first = "i:0#.f|membership|sdp950452#abctechnologies.com";
string second = "i:0#.f|membership|tss954652#abctechnologies.com";
string pattern = @"\|[A-Za-z0-9]+\#";
Regex reg = new Regex(pattern);
Match m1 = reg.Match(first);
Match m2 = reg.Match(second);
string result1 = m1.Value.Replace("|",string.Empty).Replace("#",string.Empty);
string result2 = m2.Value.Replace("|", string.Empty).Replace("#", string.Empty);
Console.WriteLine(result1);
Console.WriteLine(result2);
//Second part
string inputString = "Pawar, Jaywardhan";
string a = inputString.ToLower();
var b = a.Split(' ');
var result3 = b[1] + " " + b[0].Replace(",",string.Empty);
}
}
}
Using LINQ to reduce the number of lines:
using System.Linq;
using System;
public class Program
{
public static void Main()
{
//Extract email
string a = "i:0#.f|membership|sdp950452#abctechnologies.com";
string s = a.Split('|').Where(splitted => splitted.Contains("#")).FirstOrDefault().Split('#').First();
Console.WriteLine(s);
//Format Name
string name = "Pawar, Jaywardhan";
string formatted = String.Join(" ",name.Split(',').Reverse()).ToLower().TrimStart().TrimEnd();
Console.WriteLine(formatted);
}
}
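Both parts can also be done with plain index arithmetic, which avoids a NullReferenceException when no segment contains the delimiter. A minimal sketch, assuming the login always sits between the last | and the following #, and that names come as "Surname, Firstname"; ClaimParser, ExtractLogin and FlipName are illustrative names:

```csharp
using System;

public static class ClaimParser
{
    // Hypothetical helper: text between the last '|' and the next '#',
    // or null when there is no '#' after that point.
    public static string ExtractLogin(string claim)
    {
        int start = claim.LastIndexOf('|') + 1;   // 0 when there is no '|'
        int end = claim.IndexOf('#', start);
        return end < 0 ? null : claim.Substring(start, end - start);
    }

    // Hypothetical helper: "Pawar, Jaywardhan" becomes "jaywardhan pawar".
    // Input without exactly one comma is only trimmed and lower-cased.
    public static string FlipName(string name)
    {
        string[] parts = name.Split(',');
        if (parts.Length != 2)
        {
            return name.Trim().ToLowerInvariant();
        }
        return (parts[1].Trim() + " " + parts[0].Trim()).ToLowerInvariant();
    }
}
```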

C#: How can I cut a String based on a Value?

I have this string:
value1*value2*value3*value4
How would I cut the string into multiple strings?
string1 = value1;
string2 = value2;
etc...
My way (and probably not a very good way):
I take an array with all the indexes of the "*" character and after that, I call the subString method to get what I need.
string valueString = "value1*value2*value3*value4";
var strings = valueString.Split('*');
string string1 = strings[0];
string string2 = strings[1];
...
More info here.
Try this
string string1 = "value1*value2*value3*value4";
var myStrings = string1.Split('*');
string s = "value1*value2*value3*value4";
string[] array = s.Split('*');
Simply:
string[] parts = myString.Split('*');
parts will be an array of strings (string[]).
You can just use the Split() method of the String object, like so:
String temp = "value1*value2*value3*value4";
var result = temp.Split(new char[] {'*'});
The result variable is a string[] with the four values.
If you want to shine in society, you can also use dynamic code :
using System;
using System.Dynamic;
namespace ConsoleApplication1
{
class DynamicParts : System.Dynamic.DynamicObject
{
private string[] m_Values;
public DynamicParts(string values)
{
this.m_Values = values.Split('*');
}
public override bool TryGetMember(GetMemberBinder binder, out object result)
{
var index = Convert.ToInt32(binder.Name.Replace("Value", ""));
result = m_Values[index - 1];
return true;
}
public static void Main()
{
dynamic d = new DynamicParts("value1*value2*value3*value4");
Console.WriteLine(d.Value1);
Console.WriteLine(d.Value2);
Console.WriteLine(d.Value3);
Console.WriteLine(d.Value4);
Console.ReadLine();
}
}
}
