Looking for next match (integer) in list of strings


I have a problem finding the next integer match in a list of strings. There are some other aspects to consider:
each string contains irrelevant leading and trailing characters
numbers are formatted as "D6", for example 000042
there are gaps in the numbers
the list is not sorted, but it could be if there is a fast way to ignore the leading characters
Example:
abc-000001.file
aaac-000002.file
ab-002010.file
abbc-000003.file
abbbc-000004.file
abcd-000008.file
abc-000010.file
x-902010.file
The user input is 7 => next matching string would be abcd-000008.file
My attempt is :
int userInput = 0;
int counter = 0;
string found = String.Empty;
bool run = true;
while (run)
{
for (int i = 0; i < strList.Count; i++)
{
if(strList[i].Contains((userInput + counter).ToString("D6")))
{
found = strList[i];
run = false;
break;
}
}
counter++;
}
It's bad because it's slow and it can turn into an infinite loop when no number greater than or equal to the input exists. But I really don't know how to do this (fast).

You can parse the numbers from the strings with a Regex and create a sorted collection, which you can then search with a Where clause:
var strings = new[] { "abc-000001.file", "x-000004.file"};
var regP = "\\d{6}"; // simplest option in my example, maybe something more complicated will be needed
var reg = new Regex(regP);
var collection = strings
.Select(s =>
{
var num = reg.Match(s).Value; // assumes every string contains six consecutive digits
return new { num = int.Parse(num), str = s};
})
.OrderBy(arg => arg.num)
.ToList();
var userInput = 2;
var res = collection
.Where(arg => arg.num >= userInput)
.FirstOrDefault()?.str; // x-000004.file
P.S.
How should 9002010, 0000010 and 0002010 be treated, given that they have 7 characters? As [9002010, 10, 2010] or as [900201, 1, 201]?

If you don't want regex, you can do something like that:
List<string> strings = new List<string>
{
"abc-000001.file",
"aaac-000002.file",
"ab-0002010.file",
"abbc-000003.file",
"abbbc-000004.file",
"abcd-000008.file"
};
int input = 7;
var converted = strings.Select(s => new { value = Int32.Parse(s.Split('-', '.')[1]), str = s })
.OrderBy(c => c.value);
string result = converted.FirstOrDefault(v => v.value >= input)?.str;
Console.WriteLine(result);
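If only a single lookup is needed, sorting can be skipped altogether. The following is a minimal sketch, not taken from either answer: it scans the list once and keeps the smallest number that is still greater than or equal to the input. The sample list and the assumption that the number is the first run of six digits are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Sample data based on the question (assumed to use six-digit numbers throughout).
var strList = new List<string>
{
    "abc-000001.file", "aaac-000002.file", "ab-002010.file",
    "abbc-000003.file", "abcd-000008.file", "abc-000010.file"
};
int userInput = 7;

// Assumption: the relevant number is the first run of six digits in the name.
var sixDigits = new Regex(@"\d{6}");

string found = string.Empty;   // stays empty when no number is >= userInput
int bestNumber = int.MaxValue;

foreach (var s in strList)
{
    var m = sixDigits.Match(s);
    if (!m.Success) continue;  // skip names without a number

    int number = int.Parse(m.Value);
    // Keep the smallest number that is still >= the input.
    if (number >= userInput && number < bestNumber)
    {
        bestNumber = number;
        found = s;
    }
}

Console.WriteLine(found); // abcd-000008.file
```

This runs in a single pass and terminates even when nothing matches, which avoids the infinite loop from the question. For many lookups against the same list, the sorted collection from the answers above is the better fit.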

Related

In a LINQ with a select can I compare forward to the next row and decide what to select?

What I implemented with a for loop is this:
phraseSources2 = new List<PhraseSource2>();
for (int i = 0; i < phraseSources.Count; i++)
{
var ps = phraseSources[i];
if (i != phraseSources.Count - 1)
{
var psNext = phraseSources[i + 1];
if (psNext != null &&
ps.Kanji == psNext.Kanji &&
ps.Kana == psNext.Kana &&
ps.English.Length <= psNext.English.Length)
{
i++;
ps = phraseSources[i];
}
} else
{
ps = phraseSources[i];
}
phraseSources2.Add(new PhraseSource2()
{
Kanji = ps.Kanji,
Kana = ps.Kana,
Furigana = ps.Furigana,
English = ps.English,
});
}
Previously I had been using LINQ
phraseSources2 = (List<Data1.Model.PhraseSource2>)phraseSources
.Select(x => new PhraseSource2()
{
Kanji = x.Kanji,
Kana = x.Kana,
Furigana = x.Furigana,
English = x.English,
}).ToList();
I know LINQ can do a lot but can it look forward at the next row when doing a select?

If I understand your problem correctly, I wouldn't "look forward" but use GroupBy instead: group by Kanji and Kana, then select the longest English as the value in the PhraseSource2 object.
Something like this:
var phraseSource2 = phraseSources
.GroupBy(x => new {Kanji = x.Kanji, Kana = x.Kana})
.Select(g => new PhraseSource2 {
Kanji = g.Key.Kanji,
Kana = g.Key.Kana,
Furigana = g.First().Furigana,
English = g.OrderByDescending(x => x.English.Length).First().English
});

If the source collection can be accessed by index, then you can use the overload of Select that also gives you the current index.
var source = new[] { 'a', 'b', 'c' };
var result = source.Select((x, i) => new { Current = x, Next = source.Length > i+1 ? source[i+1] : ' '});
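When the source cannot be indexed (for example a lazy IEnumerable), a similar current/next view can be built by zipping the sequence with itself shifted by one. This is a small sketch on the same sample array, not part of the original answer:

```csharp
using System;
using System.Linq;

var source = new[] { 'a', 'b', 'c' };

// Zip stops at the shorter sequence, so the last element ('c') never appears as Current.
var pairs = source
    .Zip(source.Skip(1), (current, next) => new { Current = current, Next = next })
    .ToList();

foreach (var p in pairs)
    Console.WriteLine($"{p.Current} -> {p.Next}");
// a -> b
// b -> c
```

Unlike the indexed overload, this yields one pair fewer than there are elements, so the last element has to be handled separately if it must be kept.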

All you have to do is just set up a variable inside a query where you can easily retrieve next or previous value like this:
phraseSources2 = phraseSources
    .Select((x, i) =>
    {
        // Kanji of the following element, or null when x is the last one
        var nextKanji = phraseSources.Skip(i + 1).FirstOrDefault()?.Kanji;
        return new PhraseSource2()
        {
            Kanji = nextKanji,
            Kana = x.Kana,
            Furigana = x.Furigana,
            English = x.English,
        };
    }).ToList();
If you want to check some conditions before, you can do it like this:
phraseSources2 = phraseSources
    .Where((x, i) =>
    {
        // English of the following element, or null when x is the last one
        var nextEnglish = phraseSources.Skip(i + 1).FirstOrDefault()?.English;
        return nextEnglish != null && x.English.Length < nextEnglish.Length;
    })
    .Select(x =>
        new PhraseSource2()
        {
            Kanji = x.Kanji,
            Kana = x.Kana,
            Furigana = x.Furigana,
            English = x.English,
        }).ToList();

There is no built-in method, but there are third-party libraries that offer this functionality. MoreLinq is a respected, free .NET library whose WindowLeft extension method processes a sequence into a series of subsequences, each a window over the original. You could use it to process your phraseSources in pairs and discard every pair whose two phrases have the same Kanji and Kana while the second has the longer (or equally long) English text. Finally, select the first phrase of each surviving pair.
using static MoreLinq.Extensions.WindowLeftExtension;
var phraseSources2 = phraseSources
.WindowLeft(size: 2)
.Where(phrases => // phrases is of type IList<PhraseSource2>
{
if (phrases.Count == 2) // All have size 2 except from the last
{
var ps = phrases[0];
var psNext = phrases[1];
return ps.Kanji != psNext.Kanji || ps.Kana != psNext.Kana ||
ps.English.Length > psNext.English.Length;
}
else // The last is a single phrase
{
return true;
}
})
.Select(window => window[0]) // Select the first phrase
.ToList();

How to separate string with comma plus 8 digits

I want to split a long string (that contains only numbers) into a string array of numbers, each with 8 digits after the decimal point.
for example:
input:
string str = "45.00019821162.206580920.032150970.03215097244.0031982274.245303020.014716900.046867870.000198351974.613444580.391664580.438532450.00020199 3499.19734739 0.706802871.145335320.000202002543.362378010.513759201.659094520.000202102.391733720.000483371.65957789"
output:
string[] Arr=
"
45.00019821 162.20658092 234.03215097 123123.03215097
255.00019822 74.24530302 23422.01471690 1.04686787
12.00019835 1974.61344458 234.39166458 123212.43853245
532.00020199 3499.19734739 878.70680287 1.14533532
1234.00020200 2543.36237801 23.51375920 1.65909452
12221.00020210 2.39173372 0.00048337 1.65957789"
EDIT:
I tried using
String.Format("{0:0.00000000}", str);
or some SubString such as:
public static string GetSubstring(string input, int count, char delimiter)
{
return string.Join(delimiter.ToString(), input.Split(delimiter).Take(count));
}
with no success.

You can split the string using Regex:
var strRegex = @"(?<num>\d+\.\d{8})";
var myRegex = new Regex(strRegex, RegexOptions.None);
foreach (Match myMatch in myRegex.Matches(str))
{
var part = myMatch.Groups["num"].Value;
// convert 'part' to double and store it wherever you want...
}
More compact version:
var myRegex = new Regex(@"(?<num>\d*\.\d{8})", RegexOptions.None);
var myNumbers = myRegex.Matches(str).Cast<Match>()
.Select(m => m.Groups["num"].Value)
.Select(v => Convert.ToDouble(v, CultureInfo.InvariantCulture));
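The reason the pattern works is that it consumes exactly eight decimals per match, so the next match starts at the integer part of the following number. A short self-contained check, using a shortened sample made from the first three numbers of the question's input:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened sample: the first three numbers from the question, run together.
var sample = "45.00019821162.206580920.03215097";

var numbers = Regex.Matches(sample, @"\d+\.\d{8}")
    .Cast<Match>()              // needed on older frameworks where MatchCollection is non-generic
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(" ", numbers));
// 45.00019821 162.20658092 0.03215097
```

Any whitespace between numbers, as in the middle of the original input, is simply skipped because it matches neither the digits nor the decimal point.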

The input string str can be converted to the desired output as follows.
static IEnumerable<string> NumberParts(string iString)
{
IEnumerable<char> iSeq = iString;
while (iSeq.Count() > 0)
{
var Result = new String(iSeq.TakeWhile(Char.IsDigit).ToArray());
iSeq = iSeq.SkipWhile(Char.IsDigit);
Result += new String(iSeq.Take(1).ToArray());
iSeq = iSeq.Skip(1);
Result += new String(iSeq.Take(8).ToArray());
iSeq = iSeq.Skip(8);
yield return Result;
}
}
The parsing method above can be called as follows.
var Parts = NumberParts(str).ToArray();
var Result = String.Join(" ", Parts);

This would be the classical for-loop version of it, (no magic involved):
// split by separator
string[] allparts = str.Split('.');
// Container for the resulting numbers
List<string> numbers = new List<string>();
// Handle the first number separately
string start = allparts[0];
string decimalPart ="";
for (int i = 1; i < allparts.Length; i++)
{
decimalPart = allparts[i].Substring(0, 8);
numbers.Add(start + "." + decimalPart);
// overwrite the start with the next number
start = allparts[i].Substring(8, allparts[i].Length - 8);
}
EDIT:
Here would be a LINQ Version yielding the same result:
// split by separator
string[] allparts = str.Split('.');
IEnumerable<string> allInteger = allparts.Select(x => x.Length > 8 ? x.Substring(8, x.Length - 8) : x);
IEnumerable<string> allDecimals = allparts.Skip(1).Select(x => x.Substring(0,8));
string [] allWholeNumbers = allInteger.Zip(allDecimals, (i, d) => i + "." + d).ToArray();

The shortest way without regex:
var splitted = ("00000000" + str.Replace(" ", "")).Split('.');
var result = splitted
.Zip(splitted.Skip(1), (f, s) =>
string.Concat(f.Skip(8).Concat(".").Concat(s.Take(8))))
.ToList();

c# ordering strings with different formats

I have Licence plate numbers which I return to UI and I want them ordered in asc order:
So let's say the input is as below:
1/12/13/2
1/12/11/3
1/12/12/2
1/12/12/1
My expected output is:
1/12/11/3
1/12/12/1
1/12/12/2
1/12/13/2
My current code which is working to do this is:
var orderedData = allLicenceNumbers
.OrderBy(x => x.LicenceNumber.Length)
.ThenBy(x => x.LicenceNumber)
.ToList();
However for another input sample as below:
4/032/004/2
4/032/004/9
4/032/004/3/A
4/032/004/3/B
4/032/004/11
I am getting the data returned as:
4/032/004/2
4/032/004/9
4/032/004/11
4/032/004/3/A
4/032/004/3/B
when what I need is:
4/032/004/2
4/032/004/3/A
4/032/004/3/B
4/032/004/9
4/032/004/11
Is there a better way I can order this simply to give correct result in both sample inputs or will I need to write a custom sort?
EDIT
It won't always be the same element of the string.
This could be example input:
2/3/5/1/A
1/4/6/7
1/3/8/9/B
1/3/8/9/A
1/5/6/7
Expected output would be:
1/3/8/9/A
1/3/8/9/B
1/4/6/7
1/5/6/7
2/3/5/1/A

You should split your numbers and compare each part with each other. Compare numbers by value and strings lexicographically.
var licenceNumbers = new[]
{
"4/032/004/2",
"4/032/004/9",
"4/032/004/3",
"4/032/004/3/A",
"4/032/004/3/B",
"4/032/004/11"
};
var ordered = licenceNumbers
.Select(n => n.Split(new[] { '/' }))
.OrderBy(t => t, new LicenceNumberComparer())
.Select(t => String.Join("/", t));
Using the following comparer:
public class LicenceNumberComparer: IComparer<string[]>
{
public int Compare(string[] a, string[] b)
{
var len = Math.Min(a.Length, b.Length);
for(var i = 0; i < len; i++)
{
var aIsNum = int.TryParse(a[i], out int aNum);
var bIsNum = int.TryParse(b[i], out int bNum);
if (aIsNum && bIsNum)
{
if (aNum != bNum)
{
return aNum - bNum;
}
}
else
{
var strCompare = String.Compare(a[i], b[i]);
if (strCompare != 0)
{
return strCompare;
}
}
}
return a.Length - b.Length;
}
}
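The same part-by-part idea can be checked against the sample from the question's EDIT. This is a compact sketch rather than the comparer above: ComparePlates is a hypothetical local helper, and it uses ordinal text comparison instead of the culture-sensitive String.Compare, which is an assumption about what is wanted for the letter parts.

```csharp
using System;

// Sample from the question's EDIT.
var plates = new[] { "2/3/5/1/A", "1/4/6/7", "1/3/8/9/B", "1/3/8/9/A", "1/5/6/7" };

// Hypothetical helper: compares two plates part by part.
int ComparePlates(string x, string y)
{
    var a = x.Split('/');
    var b = y.Split('/');
    for (var i = 0; i < Math.Min(a.Length, b.Length); i++)
    {
        // Non-short-circuit & so that both out variables are always assigned.
        bool bothNumeric = int.TryParse(a[i], out var aNum) & int.TryParse(b[i], out var bNum);
        int result = bothNumeric
            ? aNum.CompareTo(bNum)                  // both numeric: compare by value
            : string.CompareOrdinal(a[i], b[i]);    // otherwise: compare as text
        if (result != 0) return result;
    }
    return a.Length.CompareTo(b.Length);            // on a tie, the shorter plate sorts first
}

Array.Sort(plates, ComparePlates);
Console.WriteLine(string.Join(Environment.NewLine, plates));
// 1/3/8/9/A
// 1/3/8/9/B
// 1/4/6/7
// 1/5/6/7
// 2/3/5/1/A
```

Using CompareTo instead of subtraction avoids overflow for very large or negative values, which matters little for plate numbers but costs nothing.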

If we can assume that
a number plate consists of several (one or more) parts separated by '/', e.g. 4, 032, 004, 2
each part is no longer than some constant value (3 in the code below)
each part consists of either digits (e.g. 4, 032) or non-digits (e.g. A, B)
then we can just PadLeft each digit part with 0, so that we compare not "3" and "11" (which gives "3" > "11") but the padded "003" < "011":
var source = new string[] {
"4/032/004/2",
"4/032/004/9",
"4/032/004/3/A",
"4/032/004/3/B",
"4/032/004/11",
};
var ordered = source
.OrderBy(item => string.Concat(item
.Split('/') // for each part
.Select(part => part.All(char.IsDigit) // we either
? part.PadLeft(3, '0') // Pad digit parts e.g. 3 -> 003, 11 -> 011
: part))); // ..or leave it as is
Console.WriteLine(string.Join(Environment.NewLine, ordered));
Outcome:
4/032/004/2
4/032/004/3/A
4/032/004/3/B
4/032/004/9
4/032/004/11

You seem to be wanting to sort on the fourth element of the string (delimited by /) in numeric rather than string mode.. ?
You can make a lambda more involved/multi-statement by putting it like any other method code block, in { }
var orderedData = allLicenceNumbers
    .OrderBy(x =>
    {
        var t = x.Split('/');
        if (t.Length < 4)
            return -1;
        else
        {
            int o = -1;
            int.TryParse(t[3], out o); // o becomes 0 when the element is not a number
            return o;
        }
    })
    .ToList();
If you're after sorting on more elements of the string, you might want to look at some alternative logic, perhaps if the first part of the string will always be in the form N/NNN/NNN/??/?, then do:
var orderedData = allLicenceNumbers
    .OrderBy(w => w.Remove(9)) // the first 9 characters are always in the form N/NNN/NNN
    .ThenBy(x => // then there's maybe a number that should be parsed
    {
        var t = x.Split('/');
        if (t.Length < 4)
            return -1;
        else
        {
            int o = -1;
            int.TryParse(t[3], out o);
            return o;
        }
    })
    .ThenBy(y => y.Substring(y.LastIndexOf('/'))) // then there's maybe A or B..
    .ToList();
Ultimately, it seems that more and more outliers will be thrown into the mix, so you're just going to have to keep inventing rules to sort with..
Either that or change your strings to standardize everything (into an NNN/NNN/NNN/NNN/NNA format, for example), and then sort as strings..
var orderedData = allLicenceNumbers
.OrderBy(x =>
{
var t = x.Split('/');
for(int i = 0; i < t.Length; i++) //make all elements in the form NNN
{
t[i] = "000" + t[i];
t[i] = t[i].Substring(t[i].Length - 3);
}
return string.Join("/", t);
}
)
.ToList();
Mmm.. nasty!

How to find maximum number of repeated string in a string in a list of string in c#

If we have a list of strings, how can we use LINQ to find the strings that contain the maximum number of occurrences of a given symbol?
List <string> mylist=new List <string>();
mylist.Add("%1");
mylist.Add("%136%250%3"); //s0
mylist.Add("%1%5%20%1%10%50%8%3"); // s1
mylist.Add("%4%255%20%1%14%50%8%4"); // s2
string symbol="%";
List <string> List_has_MAX_num_of_symbol= mylist.OrderByDescending(s => s.Length ==max_num_of(symbol)).ToList();
//the result should be a list of s1 + s2 since they have **8** repeated '%'
I tried
var longest = mylist.Where(s => s.Length == mylist.Max(m => m.Length)) ;
This gives me only one string, not both.

Here's a very simple solution, but not exactly efficient. Every element has the Count operation performed twice...
List<string> mylist = new List<string>();
mylist.Add("%1");
mylist.Add("%136%250%3"); //s0
mylist.Add("%1%5%20%1%10%50%8%3"); // s1
mylist.Add("%4%255%20%1%14%50%8%4"); // s2
char symbol = '%';
var maxRepeat = mylist.Max(item => item.Count(c => c == symbol));
var longest = mylist.Where(item => item.Count(c => c == symbol) == maxRepeat);
It will return 2 strings:
"%1%5%20%1%10%50%8%3"
"%4%255%20%1%14%50%8%4"

Here is an implementation that depends upon SortedDictionary<,> to get what you're after.
var mylist = new List<string> {"%1", "%136%250%3", "%1%5%20%1%10%50%8%3", "%4%255%20%1%14%50%8%4"};
var mappedValues = new SortedDictionary<int, IList<string>>();
mylist.ForEach(str =>
{
var count = str.Count(c => c == '%');
if (mappedValues.ContainsKey(count))
{
mappedValues[count].Add(str);
}
else
{
mappedValues[count] = new List<string> { str };
}
});
// output to validate output
foreach (var str in mappedValues.Last().Value)
{
Console.WriteLine(str);
}
Here's one using LINQ that gets the result you're after.
var result = (from str in mylist
group str by str.Count(c => c == '%')
into g
orderby g.Key // groups arrive in first-seen order, so sort them before taking the last one
let max = (from gKey in g select g.Key).Max()
select new
{
Count = max,
List = (from str2 in g select str2)
}).LastOrDefault();

OK, here's my answer:
char symbol = '%';
var recs = mylist.Select(s => new { Str = s, Count = s.Count(c => c == symbol) });
var maxCount = recs.Max(x => x.Count);
var longest = recs.Where(x => x.Count == maxCount).Select(x => x.Str).ToList();
It is complicated because it has three lines (the char symbol = '%'; line excluded), but it counts each string only once. EZI's answer has only two lines, but it is complicated because it counts each string twice. If you really want a one-liner, here it is:
var longest = mylist.Where(x => x.Count(c => c == symbol) == mylist.Max(y => y.Count(c => c == symbol))).ToList();
but it counts each string many times. You can choose whatever complexity you want.
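Another way to count each string only once is to group by the count and take the group with the largest key. The following is a sketch of that idea using the question's data, not a replacement for the answers above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var mylist = new List<string>
{
    "%1", "%136%250%3", "%1%5%20%1%10%50%8%3", "%4%255%20%1%14%50%8%4"
};
char symbol = '%';

var longest = mylist
    .GroupBy(s => s.Count(c => c == symbol))   // key = number of symbols, computed once per string
    .OrderByDescending(g => g.Key)             // largest count first
    .First()                                   // throws on an empty list
    .ToList();

longest.ForEach(Console.WriteLine);
// %1%5%20%1%10%50%8%3
// %4%255%20%1%14%50%8%4
```

GroupBy keeps the elements of each group in their original order, so ties come back in the order they appear in the list.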

We can't assume that the % is always going to be the most repeated character in your list. First, we have to determine what character appears the most in an individual string for each string.
Once we have the character and its maximum occurrence, we can apply Linq to the List<string> and grab the strings in which that character occurs exactly that many times.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List <string> mylist=new List <string>();
mylist.Add("%1");
mylist.Add("%136%250%3");
mylist.Add("%1%5%20%1%10%50%8%3");
mylist.Add("%4%255%20%1%14%50%8%4");
// Determine what character appears most in a single string in the list
char maxCharacter = ' ';
int maxCount = 0;
foreach (string item in mylist)
{
// Get the max occurrence of each character
int max = item.Max(m => item.Count(c => c == m));
if (max > maxCount)
{
maxCount = max;
// Store the character whose occurrence equals the max
maxCharacter = item.Select(c => c).Where(c => item.Count(i => i == c) == max).First();
}
}
// Print the strings containing the max character
mylist.Where(item => item.Count(c => c == maxCharacter) == maxCount)
.ToList().ForEach(Console.WriteLine);
}
}
Results:
%1%5%20%1%10%50%8%3
%4%255%20%1%14%50%8%4

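var newList = mylist.MaxBy(x => x.Count(y => y.Equals('%'))).ToList();
This should work, assuming the MaxBy extension from MoreLinq, which returns every element that ties for the maximum. The built-in Enumerable.MaxBy added in .NET 6 returns only a single element, so it does not fit here. Please correct the syntax if it is wrong anywhere, and report back if it works for you.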

How can I split an eight character string into four variables?

I have a string like this:
var id = "01020304";
Is there a simple way I can split this into four variables called pr, pa, fo and it? Each variable needs to hold two characters of the string. I'm looking for an elegant solution, if one exists.

You can use Substring:
pr = id.Substring(0, 2);
pa = id.Substring(2, 2);
fo = id.Substring(4, 2);
it = id.Substring(6, 2);

Since you have four distinct variables, you can't get much more elegant than using Substring:
var pr = id.Substring(0, 2);
var pa = id.Substring(2, 2);
var fo = id.Substring(4, 2);
var it = id.Substring(6);
Were you looking for an array of four 2-character substrings, you could get fancier:
var parts = new string[4];
for (int i = 0 ; i != parts.Length ; i ++) {
parts[i] = id.Substring(2*i, 2);
}
EDIT: The same can be done with a LINQ expression:
var parts = Enumerable
.Range(0, id.Length/2)
.Select(i => id.Substring(2*i, 2))
.ToArray();

Try regular expression.
var id = "01020304";
string pat = "(?:(\\d{2}))";
var result = Regex.Split(id, pat).Where(p=>p!=string.Empty);
foreach (var t in result)
{
Console.WriteLine(t);
}

If you are always going to have an input of 8 characters and always require 4 variables, you can simply split the string with the Substring(...) method:
var id = "01020304";
string pr = id.Substring(0, 2);
string pa = id.Substring(2, 2);
string fo = id.Substring(4, 2);
string it = id.Substring(6, 2);
Otherwise, you can employ a method of running through a for loop and splitting off two characters at a time.

Here is a solution with a loop, which would support strings of this format, any length:
string id = "01020304";
int length = id.Length;
string[] holder = new string[length/2]; // Substring returns strings, so the array must hold strings
for (int i = 0; i < length/2; i++) {
holder[i] = id.Substring(i*2, 2);
}

Here is a version using Linq.
Interestingly enough, this is not so easy to achieve with just the built-in operators. The version below just uses the Linq extension methods that come with the .NET framework:
var result = "01020304".ToCharArray().
Select((c,i) => new { idx = i % 2 == 1 ? i - 1 : i, ch = c }).
GroupBy(e => e.idx,
(k,g) => new String(g.Select(e => e.ch).ToArray()));
If you use the morelinq Extensions the query can be simplified to
var result = "01020304".ToCharArray().
Batch(2).Select(ca => new String(ca.ToArray()));
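On .NET 6 or later, the built-in Enumerable.Chunk method covers the same case without an extra library. A minimal sketch, assuming that runtime is available:

```csharp
using System;
using System.Linq;

var id = "01020304";

// Enumerable.Chunk (available from .NET 6) yields char[] pieces of up to two characters.
var parts = id.Chunk(2).Select(pair => new string(pair)).ToArray();

Console.WriteLine(string.Join(",", parts)); // 01,02,03,04
```

If the input has an odd length, the last piece simply holds a single character instead of throwing, which differs from the Substring-based versions above.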
