C# finding percent of matched words between two strings

I have a single string that I want to compare against a list of strings to find the best match.
For example,
string search = "Orange Black Red One Five";
The list of strings could contain the following:
l[0] = "Orange Seven Three Black"
l[1] = " Nine Eight Seven Six"
l[2] = " Black Blue Purple Red Five Four Nine Ten"
l[0] contains 2 matches
l[1] contains 0 matches
l[2] contains 3 matches
so the program would choose l[2] as the best match, with a 60% match.
How would I compare two strings like this?

var s = search.Split(new string[] { " "}, StringSplitOptions.RemoveEmptyEntries);
var res1 = (from string part in l
select new
{
list = part,
count = part.Split(new char[] {' '}).Sum(p => s.Contains(p) ? 1 : 0)
}).OrderByDescending(p=> p.count).First();
Console.Write(res1.count);

Split the strings into arrays.
Determine the number of matches.
Divide.
...
Profit!
Code:
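// Returns the fraction (0.0 to 1.0) of the words in a that also appear in b.
double Compare(string a, string b)
{
// RemoveEmptyEntries ignores leading, trailing, or doubled spaces.
var aWords = a.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
var bWords = b.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
double matches = aWords.Count(x => bWords.Contains(x));
return matches / aWords.Length;
}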
Edit: Or, if you just want to get the match count...
int Matches(string a, string b)
{
var aWords = a.Split(' ');
var bWords = b.Split(' ');
return aWords.Count(x => bWords.Contains(x));
}
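Putting the steps above together for the example in the question, here is a minimal sketch that picks the best match and reports its percentage. The class name WordMatch and the method MatchPercent are made up for illustration, and two choices are assumptions that the question does not state: words are compared case-insensitively, and repeated words are counted once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WordMatch
{
    // Splits on any whitespace and drops empty entries,
    // so leading or doubled spaces do not produce empty "words".
    static string[] Words(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    // Percentage of the distinct words in 'search' that also occur in 'candidate'.
    // Assumption: case-insensitive comparison, duplicates counted once.
    public static double MatchPercent(string search, string candidate)
    {
        var searchWords = new HashSet<string>(Words(search), StringComparer.OrdinalIgnoreCase);
        if (searchWords.Count == 0) return 0.0; // avoid dividing by zero
        var candidateWords = new HashSet<string>(Words(candidate), StringComparer.OrdinalIgnoreCase);
        int matches = searchWords.Count(w => candidateWords.Contains(w));
        return 100.0 * matches / searchWords.Count;
    }

    public static void Main()
    {
        string search = "Orange Black Red One Five";
        var l = new List<string>
        {
            "Orange Seven Three Black",
            " Nine Eight Seven Six",
            " Black Blue Purple Red Five Four Nine Ten"
        };

        // Order the candidates by score and take the highest one.
        string best = l.OrderByDescending(c => MatchPercent(search, c)).First();
        Console.WriteLine("{0} ({1}%)", best.Trim(), MatchPercent(search, best));
    }
}
```

With the sample data this selects the third string, which shares 3 of the 5 search words (60%). Using HashSet also avoids rescanning an array for every word, which may matter for long strings.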

Related

How to count 2 or 3 letter words in a string using asp c#

How can I count the 2 or 3 letter words of a string using ASP.NET C#? For example:
string value="This is my string value";
and output should look like this
2 letter words = 2
3 letter words = 0
4 letter words = 1
Please help, Thanks in advance.
You can try something like this:
split sentence by space to get array of words
group them by length of word (and order by that length)
iterate through every group and write letter count and number of words with that letter count
code
using System.Linq;
using System.Diagnostics;
...
var words = value.Split(' ');
var groupedByLength = words.GroupBy(w => w.Length).OrderBy(x => x.Key);
foreach (var grp in groupedByLength)
{
Debug.WriteLine(string.Format("{0} letter words: {1}", grp.Key, grp.Count()));
}
First of all, you need to decide what counts as a word. A naive approach is to split the string on spaces, but then punctuation such as commas stays attached to the words and is counted in their length. Another approach is to use the following regex
\b\w+?\b
and collect all the matches.
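Collecting the matches could look like the sketch below. The class name WordExtractor and the method GetWords are invented for this example. Note that \w also matches digits and underscores, so "words" here means runs of word characters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class WordExtractor
{
    // Returns every run of word characters found in the text.
    // Punctuation and whitespace are never part of a match.
    public static string[] GetWords(string text)
    {
        return Regex.Matches(text, @"\b\w+?\b")
                    .Cast<Match>()          // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string[] words = GetWords("This is my string, value.");
        Console.WriteLine(string.Join("|", words)); // This|is|my|string|value
    }
}
```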
Now that you have all the words in a words array, we can write a LINQ query:
var query = words.Where(x => x.Length >= 2 && x.Length <= 4)
.GroupBy(x => x.Length)
.Select(x => new { CharCount = x.Key, WordCount = x.Count() });
Then you can print the query out like this:
query.ToList().ForEach(Console.WriteLine);
This prints:
{ CharCount = 4, WordCount = 1 }
{ CharCount = 2, WordCount = 2 }
You can write some code yourself to produce a more formatted output.
If I understood your question correctly, you can do it using a dictionary.
First, split the string on spaces:
string value = "This is my string value";
string[] words = value.Split(' ');
Then loop through the array of words and use the length of each word as the dictionary key. Note that I've used a string as the key, but you can modify this to suit your needs.
Dictionary<string, int> latteWords = new Dictionary<string,int>();
for(int i=0;i<words.Length;i++)
{
string key = words[i].Length + " letter word";
if (latteWords.ContainsKey(key))
latteWords[key] += 1;
else
latteWords.Add(key, 1);
}
And the output would be
foreach(var ind in latteWords)
{
Console.WriteLine(ind.Key + " = " + ind.Value);
}
Modify this as you wish.

How to find 1 in my string but ignore -1 C#

I have a string
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
I want to find all the 1's in my string but not the -1's. So in my string there is only one 1. I use string.Contains("1"), but this also finds the 1 inside -1. So how do I do this?
You can use regular expression:
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
// if at least one "1", but not "-1"
if (Regex.IsMatch(test1, "(?<!-)1")) {
...
}
The pattern matches a 1 that is not preceded by -. To find all the 1s:
var matches = Regex
.Matches(test1, "(?<!-)1")
.OfType<Match>()
.ToArray(); // if you want an array
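One caveat: a pattern that only excludes a preceding minus sign will also match the 1 inside numbers such as 10 or 21. If the goal is to find only values that are exactly 1, a stricter variant can add digit checks on both sides. This is a sketch under that assumption, and the names StandaloneOne and FindStandaloneOnes are made up here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class StandaloneOne
{
    // Returns the index of every "1" that is not preceded by '-' or a digit
    // and is not followed by a digit, i.e. a 1 that stands alone as a number.
    public static int[] FindStandaloneOnes(string text)
    {
        return Regex.Matches(text, @"(?<![-\d])1(?!\d)")
                    .Cast<Match>()
                    .Select(m => m.Index)
                    .ToArray();
    }

    public static void Main()
    {
        string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
        Console.WriteLine(FindStandaloneOnes(test1).Length);           // 1

        // Here 10, -1 and 21 are skipped; only the last line counts.
        Console.WriteLine(FindStandaloneOnes("10\n-1\n21\n1").Length); // 1
    }
}
```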
Try this simple solution:
Note: you can easily convert this to an extension method.
static List<int> FindIndexSpecial(string search, char find, char ignoreIfPreceededBy)
{
// Map each Character with its Index in the String
var characterIndexMapping = search.Select((x, y) => new { character = x, index = y }).ToList();
// Check the Indexes of the excluded Character
var excludeIndexes = characterIndexMapping.Where(x => x.character == ignoreIfPreceededBy).Select(x => x.index).ToList();
// Return only the indexes that match 'find' and are not preceded by the excluded character
return (from t in characterIndexMapping
where t.character == find && !excludeIndexes.Contains(t.index - 1)
select t.index).ToList();
}
Usage :
static void Main(string[] args)
{
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
var matches = FindIndexSpecial(test1, '1', '-');
foreach (int index in matches)
{
Console.WriteLine(index);
}
Console.ReadKey();
}
You could use String.Split and Enumerable.Contains or Enumerable.Where:
string[] lines = test1.Split(new[] {Environment.NewLine, "\r"}, StringSplitOptions.RemoveEmptyEntries);
bool contains1 = lines.Contains("1");
string[] allOnes = lines.Where(l => l == "1").ToArray();
String.Contains searches for substrings in a given string instance. Enumerable.Contains checks whether at least one string in the string[] equals the given value.

Regex pattern to grab all the numbers in between square brackets?

I'm trying to create a regex pattern to grab all the numbers from a given string which are in between square brackets and separated by commas. The output should be like so,
Number1 = 45
Number2 = 66
And so on... All I have so far is a pattern that greedily grabs everything in between square brackets.
string input3;
//string pattern = @"\b\w+es\b";
string pattern = @"\[(.*?)\]";
//Regex regex = new Regex("[*]");
Console.WriteLine("Enter string to search: ");
input3 = Console.ReadLine();
//Console.WriteLine(input3);
List<string> substrings = new List<string>();
int count = 1;
foreach (Match match in Regex.Matches(input3, pattern)) {
string substring = string.Format("Number{0} = '{1}'",count,match);
count++;
Console.WriteLine(substring);
substrings.Add(substring);
}
string[] subStringArray = substrings.ToArray();
}
Should I just create two patterns, the greedy one and then a second pattern to search the greedy output for all numbers separated by commas? Or would it be more efficient to just create a single pattern?
You said that the numbers in your string are
in between square brackets and separated by commas.
I guess the input is something like this:
[1,2,3,4,5,6]
So you can use this regex to get numbers
var numbers = Regex.Match("[1,2,3,4,5,6]", @"\[(?<numbers>[\d,]+)\]").Groups["numbers"].Value;
And then split by , to get a collection of numbers
var collectionOfNumbers = numbers.Split(',');
To display the result in the form Number1 = 45, let's use a little bit of LINQ.
C# 6 syntax:
var strings = numbers.Split(',').Select((number, i) => $"Number{i + 1} = {number}");
Console.WriteLine(string.Join("\n", strings));
C# <= 5 syntax:
var strings = numbers.Split(',').Select((number, i) => string.Format("Number{0} = {1}", i+1, number));
Console.WriteLine(string.Join("\n", strings));
And this is the output:
Number1 = 1
Number2 = 2
Number3 = 3
Number4 = 4
Number5 = 5
Number6 = 6
Another example, with the input Foo Bar [45,66]:
C# 6 syntax:
var numbers = Regex.Match("Foo Bar [45,66]", @"\[(?<numbers>[\d,]+)\]").Groups["numbers"].Value;
var strings = numbers.Split(',').Select((number, i) => $"Number{i + 1} = {number}");
Console.WriteLine(string.Join("\n", strings));
C# <= 5 syntax:
var numbers = Regex.Match("Foo Bar [45,66]", @"\[(?<numbers>[\d,]+)\]").Groups["numbers"].Value;
var strings = numbers.Split(',').Select((number, i) => string.Format("Number{0} = {1}", i+1, number));
Console.WriteLine(string.Join("\n", strings));
The output is:
Number1 = 45
Number2 = 66
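On the question of one pattern versus two: .NET keeps every capture of a repeated group, so a single pattern can return each number directly through Group.Captures. The following is a sketch, not the only way to do it. The names BracketNumbers and NumbersInBrackets are invented, and it assumes the brackets contain only digits and commas with no spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BracketNumbers
{
    // Returns every number found inside [n,n,...] groups, in order of appearance.
    // The named group "n" appears twice in the pattern; .NET collects all of its
    // captures, including those produced by the repeated part.
    public static List<string> NumbersInBrackets(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\[(?<n>\d+)(?:,(?<n>\d+))*\]"))
        {
            foreach (Capture c in m.Groups["n"].Captures)
            {
                result.Add(c.Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        List<string> numbers = NumbersInBrackets("Foo [45,66] Bar [7]");
        for (int i = 0; i < numbers.Count; i++)
        {
            Console.WriteLine("Number{0} = {1}", i + 1, numbers[i]);
        }
    }
}
```

Unlike Regex.Match followed by Split, this also handles several bracketed lists in the same input.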

Get all numbers within a string c# regex

I have a string which can have values like
"Barcode : X 2
4688000000000"
"Barcode : X 10
1234567890123"
etc.
I want to retrieve the quantity value (i.e., 2, 10) and the barcode value (i.e., 4688000000000, 1234567890123)
I tried the following code -
string[] QtyandBarcode = Regex.Split(NPDVariableMap.NPDUIBarcode.DisplayText, @"\D+");
But when I execute, I'm getting QtyandBarcode as a string array having 3 values -
""
"2"
"4688000000000"
How do I prevent the empty string from being stored?
You could do this:
MatchCollection m = Regex.Matches(NPDVariableMap.NPDUIBarcode.DisplayText, @"\d+");
var x = (from Match a in m select a.Value).ToArray();
string[] QtyandBarcode = Regex.Split(NPDVariableMap.NPDUIBarcode.DisplayText, @"\D+").Where(s => !string.IsNullOrEmpty(s)).ToArray();
now you can
string qty = QtyandBarcode[0];
string barcode= QtyandBarcode[1];
What about this simple approach.
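// Split on spaces and line breaks; assuming the format shown, quantity and barcode are the last two tokens.
string[] parts = NPDVariableMap.NPDUIBarcode.DisplayText.Split(new[] { ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
string qty = parts[parts.Length - 2];
string barcode = parts[parts.Length - 1];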
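Another option is to name the two values in the pattern itself, which avoids relying on array positions. This sketch assumes the text always has the shape shown in the question (an X, the quantity, then the barcode on the next line). BarcodeParser and TryParse are hypothetical names, and the pattern has only been checked against the two sample inputs.

```csharp
using System;
using System.Text.RegularExpressions;

static class BarcodeParser
{
    // Extracts the quantity after "X" and the digit run that follows it.
    // Returns false when the text does not have the expected shape.
    public static bool TryParse(string text, out string qty, out string barcode)
    {
        Match m = Regex.Match(text, @"X\s+(?<qty>\d+)\s+(?<barcode>\d+)");
        qty = m.Success ? m.Groups["qty"].Value : null;
        barcode = m.Success ? m.Groups["barcode"].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        string qty, barcode;
        if (TryParse("Barcode : X 10\r\n1234567890123", out qty, out barcode))
        {
            Console.WriteLine("{0} x {1}", qty, barcode); // 10 x 1234567890123
        }
    }
}
```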

How to keep the delimiters of Regex.Split?

I'd like to split a string using the Split function in the Regex class. The problem is that it removes the delimiters and I'd like to keep them. Preferably as separate elements in the splitee.
According to other discussions that I've found, there are only inconvenient ways to achieve that.
Any suggestions?
Just put the pattern into a capture-group, and the matches will also be included in the result.
string[] result = Regex.Split("123.456.789", @"(\.)");
Result:
{ "123", ".", "456", ".", "789" }
This also works for many other languages:
JavaScript: "123.456.789".split(/(\.)/g)
Python: re.split(r"(\.)", "123.456.789")
Perl: split(/(\.)/g, "123.456.789")
(Not Java though)
Use Matches to find the separators in the string, then get the values and the separators.
Example:
string input = "asdf,asdf;asdf.asdf,asdf,asdf";
var values = new List<string>();
int pos = 0;
foreach (Match m in Regex.Matches(input, "[,.;]")) {
values.Add(input.Substring(pos, m.Index - pos));
values.Add(m.Value);
pos = m.Index + m.Length;
}
values.Add(input.Substring(pos));
Say that input is "abc1defg2hi3jkl" and regex is to pick out digits.
String input = "abc1defg2hi3jkl";
var parts = Regex.Matches(input, @"\d+|\D+")
.Cast<Match>()
.Select(m => m.Value)
.ToList();
Parts would be: abc 1 defg 2 hi 3 jkl
For Java:
Arrays.stream("123.456.789".split("(?<=\\.)|(?=\\.)+"))
.forEach((p) -> {
System.out.println(p);
});
outputs:
123
.
456
.
789
Inspired by this post (How to split string but keep delimiters in java?)
Add them back:
string[] Parts = "A,B,C,D,E".Split(',');
string[] Parts2 = new string[Parts.Length * 2 - 1];
for (int i = 0; i < Parts.Length; i++)
{
Parts2[i * 2] = Parts[i];
if (i < Parts.Length - 1)
Parts2[i * 2 + 1] = ",";
}
For C#:
Split a paragraph into sentences, keeping the delimiters.
A sentence ends at . or ? or ! followed by one space (requiring the space prevents a split inside things like an email address).
string data="first. second! third? ";
Regex delimiter = new Regex("(?<=[.?!] )"); //there is a space between ] and )
string[] afterRegex=delimiter.Split(data);
Result
first.
second!
third?
