C# RegEx Pattern to Split a String into 2-Character Substrings

I am trying to figure out a regex that splits a string into 2-character substrings.
Let's say we have the following string:
string str = "Idno1";
string pattern = @"\w{2}";
Using the pattern above will get me "Id" and "no", but it will skip the "1" since it doesn't match the pattern. I would like the following results:
string str = "Idno1"; // ==> "Id" "no" "1 "
string str2 = "Id n o 2"; // ==> "Id", " n", " o", " 2"

LINQ can simplify the code.
The idea: with chunkSize = 2, as you require, pick the characters at indexes 0, 2, 4, 6, ... to mark the start of each chunk, then Skip to each start, Take chunkSize characters, and Join them into a string.
public static IEnumerable<string> ProperFormat(string s)
{
var chunkSize = 2;
return s.Where((x,i) => i % chunkSize == 0)
.Select((x,i) => s.Skip(i * chunkSize).Take(chunkSize))
.Select(x=> string.Join("", x));
}
With the input, I have the output
Idno1 -->
Id
no
1
Id n o 2 -->
Id
n
o
2

LINQ is a better fit in this case. You can use this method, which splits a string into chunks of arbitrary size:
public static IEnumerable<string> SplitInChunks(string s, int size = 2)
{
return s.Select((c, i) => new {c, id = i / size})
.GroupBy(x => x.id, x => x.c)
.Select(g => new string(g.ToArray()));
}
But if you are bound to regex, use this code:
public static IEnumerable<string> SplitInChunksWithRegex(string s, int size = 2)
{
var regex = new Regex($".{{1,{size}}}");
return regex.Matches(s).Cast<Match>().Select(m => m.Value);
}
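For comparison, here is a minimal sketch of the same chunking done with a plain loop and Substring, with no LINQ and no regex. The names ChunkDemo and SplitInChunksPlain are made up for this illustration and do not come from the answers above.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkDemo
{
    // Hypothetical helper: yields consecutive pieces of at most `size` characters.
    public static IEnumerable<string> SplitInChunksPlain(string s, int size = 2)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        for (int i = 0; i < s.Length; i += size)
        {
            // The last piece may be shorter than `size`.
            yield return s.Substring(i, Math.Min(size, s.Length - i));
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitInChunksPlain("Idno1")));    // Id|no|1
        Console.WriteLine(string.Join("|", SplitInChunksPlain("Id n o 2"))); // Id| n| o| 2
    }
}
```

One difference from the regex version: `.` does not match a newline unless RegexOptions.Singleline is set, so the regex skips newline characters, whereas Substring keeps every character.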

Related

Retrieving Numeric value before a decimal in a string value

I am working on a routine in C#
I have a list of alphanumeric sheet numbers that I would like to retrieve the numbers before the decimal to use in my routine.
FP10.01-->10
M1.01-->1
PP8.01-->8
If possible, how can something like this be achieved as either a string or integer?
You could use a regex:
Regex r = new Regex("([0-9]+)[.]");
string s = "FP10.01";
var result = Convert.ToInt32(r.Match(s).Groups[1].ToString()); //10
string input = "FP10.01";
string[] _input = input.Split('.');
string num = find(_input[0]);
public string find(string input)
{
char[] _input = input.ToArray();
int number;
string result = null;
foreach (var item in _input)
{
if (int.TryParse(item.ToString(), out number) == true)
{
result = result + number;
}
}
return result;
}
To accumulate the resulting elements into a list, you can do something like:
List<string> myList = new List<string>(){ "FP10.01","M1.01", "PP8.01"};
List<int> resultSet =
myList.Select(e =>
Regex.Replace(e.Substring(0, e.IndexOf('.')), @"[^\d]", string.Empty))
.Select(int.Parse)
.ToList();
This will take each element in myList and in turn, take a substring of each element from index 0 until before the . and then replace all the non-numeric data with string.Empty and then finally parse the string element into an int and store it into a list.
Another variant would be:
List<int> resultSet =
myList.Select(e => e.Substring(0, e.IndexOf('.')))
.Select(e => string.Join(string.Empty, e.Where(char.IsDigit)))
.Select(int.Parse)
.ToList();
Or, if you want the elements to be strings, you could do:
List<string> resultSet =
myList.Select(e => e.Substring(0, e.IndexOf('.')))
.Select(e => string.Join(string.Empty, e.Where(char.IsDigit)))
.ToList();
To retrieve a single element of type string then you can create a helper function as such:
public static string GetValueBeforeDot(string input){
return input.Substring(0, input.IndexOf('.'))
.Where(char.IsDigit)
.Aggregate(string.Empty, (e, a) => e + a);
}
To retrieve a single element of type int then the helper function should be:
public static int GetValueBeforeDot(string input){
return int.Parse(input.Substring(0, input.IndexOf('.'))
.Where(char.IsDigit)
.Aggregate(string.Empty, (e, a) => e + a));
}
This approach removes alphabetic characters by replacing them with an empty string. Splitting on the '.' character then leaves a two-element array, with the digits before the decimal point at index 0 and the digits after it at index 1.
string input = "FP10.01";
var result = Regex.Replace(input, @"([A-Za-z]+)", string.Empty).Split('.');
var beforeDecimalNumbers = result[0]; // 10
var afterDecimalNumbers = result[1]; // 01

Split String to array and Sort Array

I am trying to sort a string split by commas, but it is not behaving as expected.
var classes = "10,7,8,9";
Console.Write(string.Join(",", classes.Split(',').OrderBy(x => x)));
Console.ReadKey();
and output is
10,7,8,9
But I want the expected output to be like:
7,8,9,10
Classes can have a section along with them. like 7a,7b
and I want to achieve it on one line of code.
You can use Regex like this
var classes = "10,7,8,9";
Regex number = new Regex(@"^\d+");
Console.Write(string.Join(",", classes.Split(',').OrderBy(x => Convert.ToInt32(number.Match(x).Value)).ThenBy(x => number.Replace(x, ""))));
Console.ReadKey();
CODE:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Collections;
namespace Rextester
{
public class Program
{
public static void Main(string[] args)
{
var l = new List<string> { "1D", "25B", "30A", "9C" };
l.Sort((b, a) =>
{
var x = int.Parse(Regex.Replace(a, "[^0-9]", ""));
var y = int.Parse(Regex.Replace(b, "[^0-9]", ""));
if (x != y) return y - x;
return -1 * string.Compare(a, b);
});
foreach (var item in l) Console.WriteLine(item);
}
}
}
OUTPUT:
1D
9C
25B
30A
ONLINE COMPILE:
http://rextester.com/CKKQK66159
Use the following using-directive:
using System.Text.RegularExpressions;
And try the following:
var input = "7,7a,8,9a,9c,9d,10";
var sorted = from sp in input.Split(',')
let reg = Regex.Match(sp, @"(?<num>[0-9]+)(?<char>[a-z]*)", RegexOptions.IgnoreCase | RegexOptions.Compiled)
let number = int.Parse(reg.Groups["num"].ToString())
orderby number ascending, // sort first by number
reg.Groups["char"].ToString() ascending // then by letter
select sp;
var result = string.Join(",", sorted);
Console.WriteLine(result);
//output (tested): 7,7a,8,9a,9c,9d,10
It uses regex to determine the numeric and alphabetic part of the input string.
The regex pattern uses named groups, which are noted as follows: (?<group_name> regex_expr ).
The time complexity of the code above is O(n log(n)), in case you are worried about big collections of numbers.
More information about named Regex groups.
More information about LINQ.
... and about the orderby-clause.
All on one line, also supports '4a' etc.
Edit: on testing this, a string such as 1,2,111,3 would display as 1,111,2,3, because the values are compared as strings, so it may not be quite what you're looking for.
string str = "1,2,3,4a,4b,5,6,4c";
str.Split(',').OrderBy(x => x).ToList().ForEach(x=> Console.WriteLine(x));
Console.ReadKey();
Here is my implementation:
IEnumerable<Tuple<string, string[]>> splittedItems =
items.Select(i => new Tuple<string, string[]>(i, Regex.Split(i, "([0-9]+)")));
List<string> orderedItems = splittedItems.OrderBy(t => Convert.ToInt16(t.Item2[1]))
.ThenBy(t => t.Item2.Length > 2 ? t.Item2[2] : "1")
.Select(t => t.Item1).ToList();
Split each input into its number and its non-numeric characters.
Store the split strings alongside their parent string.
Order by the number.
Then order by the non-numeric characters.
Take the parent string again after sorting.
The result is as required: { "10", "7", "8b", "8a", "9" } is sorted to { "7", "8a", "8b", "9", "10" }.
You are sorting strings (alphabetically), so yes, "10" comes before "7".
My solution converts "10,7b,8,7a,9b" to "7a,7b,8,9b,10" (first sort by the integer prefix, then by the substring itself).
Auxiliary method to parse the prefix of a string:
private static int IntPrefix(string s)
=> s
.TakeWhile(ch => ch >= '0' && ch <= '9')
.Aggregate(0, (a, c) => 10 * a + (c - '0'));
Sorting the substrings by the integer prefix then by the string itself:
classes.Split(',') // string[]
.Select(s => new { s, i = IntPrefix(s) }) // IEnumerable<{ s: string, i: int }>
.OrderBy(si => si.i) // IEnumerable<{ s: string, i: int }>
.ThenBy(si => si.s) // IEnumerable<{ s: string, i: int }>
.Select(si => si.s) // IEnumerable<string>
One liner (with string.Join):
var result = string.Join(",", classes.Split(',').Select(s => new {s, i = IntPrefix(s)}).OrderBy(si => si.i).ThenBy(si => si.s).Select(si => si.s));

How to find 1 in my string but ignore -1 C#

I have a string
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
I want to find all the 1's in my string but not the -1's. So in my string there is only one 1. I use string.Contains("1"), but this also matches the 1 inside -1. So how do I do this?
You can use regular expression:
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
// if at least one "1", but not "-1"
if (Regex.IsMatch(test1, "(?<!-)1")) {
...
}
The pattern matches a 1 that is not preceded by -. To find all the 1s:
var matches = Regex
.Matches(test1, "(?<!-)1")
.OfType<Match>()
.ToArray(); // if you want an array
Try this simple solution:
Note: you can easily convert this to an extension method.
static List<int> FindIndexSpecial(string search, char find, char ignoreIfPreceededBy)
{
// Map each Character with its Index in the String
var characterIndexMapping = search.Select((x, y) => new { character = x, index = y }).ToList();
// Check the Indexes of the excluded Character
var excludeIndexes = characterIndexMapping.Where(x => x.character == ignoreIfPreceededBy).Select(x => x.index).ToList();
// Return only the indexes that match 'find' and are not preceded by the excluded character
return (from t in characterIndexMapping
where t.character == find && !excludeIndexes.Contains(t.index - 1)
select t.index).ToList();
}
Usage :
static void Main(string[] args)
{
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
var matches = FindIndexSpecial(test1, '1', '-');
foreach (int index in matches)
{
Console.WriteLine(index);
}
Console.ReadKey();
}
You could use String.Split and Enumerable.Contains or Enumerable.Where:
string[] lines = test1.Split(new[] {Environment.NewLine, "\r"}, StringSplitOptions.RemoveEmptyEntries);
bool contains1 = lines.Contains("1");
string[] allOnes = lines.Where(l => l == "1").ToArray();
String.Contains searches for sub-strings in a given string instance. Enumerable.Contains looks if there's at least one string in the string[] which equals it.

How to find maximum number of repeated string in a string in a list of string in c#

If we have a list of strings, how can we use LINQ to find the strings that contain the maximum number of occurrences of a given symbol?
List <string> mylist=new List <string>();
mylist.Add("%1");
mylist.Add("%136%250%3"); //s0
mylist.Add("%1%5%20%1%10%50%8%3"); // s1
mylist.Add("%4%255%20%1%14%50%8%4"); // s2
string symbol="%";
List <string> List_has_MAX_num_of_symbol= mylist.OrderByDescending(s => s.Length ==max_num_of(symbol)).ToList();
//the result should be a list of s1 + s2 since they have **8** repeated '%'
I tried
var longest = mylist.Where(s => s.Length == mylist.Max(m => m.Length)) ;
This gives me only one string, not both.
Here's a very simple solution, but not exactly efficient. Every element has the Count operation performed twice...
List<string> mylist = new List<string>();
mylist.Add("%1");
mylist.Add("%136%250%3"); //s0
mylist.Add("%1%5%20%1%10%50%8%3"); // s1
mylist.Add("%4%255%20%1%14%50%8%4"); // s2
char symbol = '%';
var maxRepeat = mylist.Max(item => item.Count(c => c == symbol));
var longest = mylist.Where(item => item.Count(c => c == symbol) == maxRepeat);
It will return 2 strings:
"%1%5%20%1%10%50%8%3"
"%4%255%20%1%14%50%8%4"
Here is an implementation that depends upon SortedDictionary<,> to get what you're after.
var mylist = new List<string> {"%1", "%136%250%3", "%1%5%20%1%10%50%8%3", "%4%255%20%1%14%50%8%4"};
var mappedValues = new SortedDictionary<int, IList<string>>();
mylist.ForEach(str =>
{
var count = str.Count(c => c == '%');
if (mappedValues.ContainsKey(count))
{
mappedValues[count].Add(str);
}
else
{
mappedValues[count] = new List<string> { str };
}
});
// output to validate output
foreach (var str in mappedValues.Last().Value)
{
Console.WriteLine(str);
}
Here's one using LINQ that gets the result you're after.
var result = (from str in mylist
group str by str.Count(c => c == '%')
into g
orderby g.Key // ascending, so the last group has the highest count
select new
{
Count = g.Key,
List = (from str2 in g select str2)
}).LastOrDefault();
OK, here's my answer:
char symbol = '%';
var recs = mylist.Select(s => new { Str = s, Count = s.Count(c => c == symbol) });
var maxCount = recs.Max(x => x.Count);
var longest = recs.Where(x => x.Count == maxCount).Select(x => x.Str).ToList();
It is complicated because it has three lines (the char symbol = '%'; line excluded), but it counts each string only once. EZI's answer has only two lines, but it is complicated because it counts each string twice. If you really want a one-liner, here it is:
var longest = mylist.Where(x => x.Count(c => c == symbol) == mylist.Max(y => y.Count(c => c == symbol))).ToList();
but it counts each string many times. You can choose whatever complexity you want.
We can't assume that the % is always going to be the most repeated character in your list. First, we have to determine what character appears the most in an individual string for each string.
Once we have the character and its maximum occurrence, we can apply LINQ to the List<string> and grab the strings in which that character occurs exactly that many times.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List <string> mylist=new List <string>();
mylist.Add("%1");
mylist.Add("%136%250%3");
mylist.Add("%1%5%20%1%10%50%8%3");
mylist.Add("%4%255%20%1%14%50%8%4");
// Determine what character appears most in a single string in the list
char maxCharacter = ' ';
int maxCount = 0;
foreach (string item in mylist)
{
// Get the max occurrence of each character
int max = item.Max(m => item.Count(c => c == m));
if (max > maxCount)
{
maxCount = max;
// Store the character whose occurrence equals the max
maxCharacter = item.Select(c => c).Where(c => item.Count(i => i == c) == max).First();
}
}
// Print the strings containing the max character
mylist.Where(item => item.Count(c => c == maxCharacter) == maxCount)
.ToList().ForEach(Console.WriteLine);
}
}
Results:
%1%5%20%1%10%50%8%3
%4%255%20%1%14%50%8%4
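var newList = mylist.MaxBy(x => x.Count(y => y.Equals('%'))).ToList();
This should work if the MaxBy in use returns every element that ties for the maximum, which is an assumption about the helper library; the MaxBy built into .NET 6 and later returns a single element. Please correct the syntax if it is wrong anywhere.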

Given collection of strings, count number of times each word appears in List<T>

Input 1: List<string>, e.g:
"hello", "world", "stack", "overflow".
Input 2: List<Foo> (two properties, string a, string b), e.g:
Foo 1:
a: "Hello there!"
b: string.Empty
Foo 2:
a: "I love Stack Overflow"
b: "It's the best site ever!"
So I want to end up with a Dictionary<string,int>: the word, and the number of times it appears in the List<Foo>, in either the a or the b field.
Current first-pass/top of my head code, which is far too slow:
var occurences = new Dictionary<string, int>();
foreach (var word in uniqueWords /* input1 */)
{
var aOccurances = foos.Count(x => !string.IsNullOrEmpty(x.a) && x.a.Contains(word));
var bOccurances = foos.Count(x => !string.IsNullOrEmpty(x.b) && x.b.Contains(word));
occurences.Add(word, aOccurances + bOccurances);
}
Roughly:
Build a dictionary (occurrences) from the first input, optionally with a case-insensitive comparer.
For each Foo in the second input, use RegEx to split a and b into words.
For each word, check if the key exists in occurrences. If it exists, increment and update the value in the dictionary.
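The steps above can be sketched as follows. This is only an illustration under assumptions: Foo is given two string properties A and B, the pattern [A-Za-z']+ is a guess at what counts as a word, and the names WordCountDemo and CountWords are invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Assumed shape of Foo: two string properties.
public class Foo
{
    public string A { get; set; }
    public string B { get; set; }
}

public static class WordCountDemo
{
    // Hypothetical helper implementing the three steps.
    public static Dictionary<string, int> CountWords(IEnumerable<string> words, IEnumerable<Foo> foos)
    {
        // Step 1: seed the dictionary from the first input, ignoring case.
        var occurrences = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (var word in words) occurrences[word] = 0;

        foreach (var foo in foos)
        {
            // Step 2: pull the words out of both fields (null is treated as empty).
            foreach (var text in new[] { foo.A ?? string.Empty, foo.B ?? string.Empty })
            {
                foreach (Match m in Regex.Matches(text, "[A-Za-z']+"))
                {
                    // Step 3: count only words that are already keys in the dictionary.
                    if (occurrences.TryGetValue(m.Value, out var n)) occurrences[m.Value] = n + 1;
                }
            }
        }
        return occurrences;
    }

    public static void Main()
    {
        var words = new List<string> { "hello", "world", "stack", "overflow" };
        var foos = new List<Foo>
        {
            new Foo { A = "Hello there!", B = string.Empty },
            new Foo { A = "I love Stack Overflow", B = "It's the best site ever!" }
        };
        var counts = CountWords(words, foos);
        foreach (var word in words) Console.WriteLine($"{word}: {counts[word]}");
    }
}
```

Unlike the Contains-based loop in the question, this counts whole-word matches only and counts a word every time it appears, while reading each Foo just once.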
You could try concatenating the two strings, a + b, then using a regex to pull all the words out into a collection, and finally indexing that with a group-by query.
For example
void Main()
{
var a = "Hello there!";
var b = "It's the best site ever!";
var ab = a + " " + b;
var matches = Regex.Matches(ab, "[A-Za-z]+");
var occurences = from x in matches.OfType<System.Text.RegularExpressions.Match>()
let word = x.Value.ToLowerInvariant()
group word by word into g
select new { Word = g.Key, Count = g.Count() };
var result = occurences.ToDictionary(x => x.Word, x => x.Count);
Console.WriteLine(result);
}
Example with some changes suggested...
Edit. Just reread the requirement....kinda strange but hey...
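void Main()
{
// Foo is assumed to expose string properties A and B, as GetCount uses below
var counts = GetCount(new[] {
new Foo { A = "Hello there!", B = string.Empty },
new Foo { A = "I love Stack Overflow", B = "It's the best site ever!" }
});
Console.WriteLine(counts);
}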
public IDictionary<string, int> GetCount(IEnumerable<Foo> inputs)
{
var allWords = from input in inputs
let matchesA = Regex.Matches(input.A, "[A-Za-z']+").OfType<System.Text.RegularExpressions.Match>()
let matchesB = Regex.Matches(input.B, "[A-Za-z']+").OfType<System.Text.RegularExpressions.Match>()
from x in matchesA.Concat(matchesB)
select x.Value;
var occurences = allWords.GroupBy(x => x, (x, y) => new{Key = x, Count = y.Count()}, StringComparer.OrdinalIgnoreCase);
var result = occurences.ToDictionary(x => x.Key, x => x.Count, StringComparer.OrdinalIgnoreCase);
return result;
}
