C# Regex return numbers in brackets - c#

I have a string like this:
numbers(23,54)
The input format is like this:
numbers([integer1],[integer2])
How can I get the numbers "23" and "54" using a regular expression? Or is there a better way to get them?

You can avoid regular expressions, since your input has a consistent format:
string input = "numbers(23,54)";
var numbers = input.Replace("numbers(", "")
.Replace(")", "")
.Split(',')
.Select(s => Int32.Parse(s));
Or even (if you aren't afraid of magic numbers):
input.Substring(8, input.Length - 9).Split(',').Select(s => Int32.Parse(s))
UPDATE: Here is a Regex version as well:
var numbers = Regex.Matches(input, @"\d+")
.Cast<Match>()
.Select(m => Int32.Parse(m.Value));

Yes, use (\d+) to capture the numbers correctly.
This is the correct way
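If the input should be validated as well as parsed, one option is an anchored pattern with two capture groups. The following is only a sketch: the helper name TryParseNumbers is hypothetical, and it assumes the input is exactly numbers(a,b) with non-negative integers and no whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answers above): returns false unless
// the whole input has the form numbers(<digits>,<digits>).
static bool TryParseNumbers(string input, out int first, out int second)
{
    first = 0;
    second = 0;
    Match match = Regex.Match(input, @"^numbers\(([0-9]+),([0-9]+)\)$");
    // Groups[0] is the whole match; Groups[1] and Groups[2] hold the integers.
    return match.Success
        && int.TryParse(match.Groups[1].Value, out first)
        && int.TryParse(match.Groups[2].Value, out second);
}

if (TryParseNumbers("numbers(23,54)", out int a, out int b))
{
    Console.WriteLine($"{a}, {b}"); // prints "23, 54"
}
```

Unlike the bare \d+ version, this rejects strings that merely contain digits somewhere, at the cost of being tied to the exact prefix.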

Related

c# Regex quotes

How can I parse the string "\"bcd ef\" a 'x y'" and capture all the text, both inside the quotes (" or ') and outside them, using regular expressions? I tried the pattern "(\\\"|')(.*?)(\\\"|')", but got only "bcd ef" and 'x y'. The result should be:
"bcd ef"
a
'x y'
string pattern ="(\\\"|')(.*?)(\\\"|')";
Regex regex = new Regex(pattern);
Two options are string.Split() or Regex.Split(). string.Split() is much faster but Regex.Split() is more powerful.
string.Split() version:
var parts = input.Split(new []{'"', '\''})
.Where(p => !string.IsNullOrEmpty(p))
.Select(p => p.Trim())
.ToList();
Regex.Split() version:
var input = "\"bcd ef\" a 'x y'";
var parts = Regex.Split(input, "[\"']")
.Where(p => !string.IsNullOrEmpty(p))
.Select(p => p.Trim())
.ToList();
As long as you want to split by single characters, the regex version is simply slower. So there's no reason to use it.
Docs:
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.split
https://learn.microsoft.com/en-us/dotnet/api/system.string.split
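Both Split versions above discard the quote characters. If the quotes need to stay attached, as in the expected result from the question, matching tokens instead of splitting may be closer to what is wanted. This is a sketch only: the pattern and the variable names are assumptions, and it does not handle escaped quotes inside a quoted run.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string quotedInput = "\"bcd ef\" a 'x y'";

// Three alternatives, tried left to right at each position:
//   "[^"]*"    a double-quoted run
//   '[^']*'    a single-quoted run
//   [^\s"']+   a run of unquoted, non-whitespace characters
string tokenPattern = "\"[^\"]*\"|'[^']*'|[^\\s\"']+";

var tokens = Regex.Matches(quotedInput, tokenPattern)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();

// Prints each token on its own line: "bcd ef", then a, then 'x y'
Console.WriteLine(string.Join(Environment.NewLine, tokens));
```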

What's the best way to acquire a list of strings that match a string inside a string, by looking through a string list?

Basically I have a string array that I am using to match inside a single string:
string[] matches = { "{A}", "{B}", "{CC}" };
Then I check whether any of these occur inside my string:
string text = "Lorem Ipsum is {CC} simply dummy text {A} of the {CC} printing and typesetting industry {B}.";
In which case, the resulting array I want to gather should be:
string[] allmatches = { "{CC}", "{A}", "{CC}", "{B}" };
Is there an easy way to do this using LINQ or maybe Regex?
Construct the regex by first escaping each element in matches with Regex.Escape (via Select), then joining the results with |. After that, get the Matches of the regex against text and Select the Values:
var regex = string.Join("|", matches.Select(Regex.Escape));
var result = Regex.Matches(text, regex)
.Cast<Match>()
.Select(x => x.Value).ToArray();
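For reference, here is a self-contained sketch of the same idea. The variable names are illustrative, and the ordering by length is an added precaution rather than part of the answer above: regex alternatives are tried left to right, so if one token were a prefix of another, the shorter one could win unless the longer one comes first.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string[] needles = { "{A}", "{B}", "{CC}" };
string haystack = "Lorem Ipsum is {CC} simply dummy text {A} of the {CC} printing and typesetting industry {B}.";

// Escape each token so metacharacters such as "{" are matched literally,
// and list longer tokens first (alternatives are tried left to right).
string alternation = string.Join("|",
    needles.OrderByDescending(n => n.Length).Select(n => Regex.Escape(n)));

string[] found = Regex.Matches(haystack, alternation)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", found)); // {CC}, {A}, {CC}, {B}
```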
Assuming that {A}..{Z} are the only matches required, we can try combining Regex and LINQ, e.g.
string text =
@"Lorem Ipsum is {C} simply dummy text {A} of the {C} printing and typesetting industry {B}.";
string[] allmatches = Regex
.Matches(text, @"\{[A-Z]\}")
.Cast<Match>()
.Select(m => m.Value)
//.Where(item => matches.Contains(item)) // uncomment to validate matches
.ToArray();
Let's have a look:
Console.Write(string.Join(", ", allmatches));
Outcome:
{C}, {A}, {C}, {B}
Edit: uncomment .Where(...) if you want only the matches that are in matches[].
Edit 2: If a match doesn't necessarily contain exactly one letter, change the pattern:
.Matches(text, @"\{[A-Z]+\}") // one or more capital letters
.Matches(text, @"\{[a-zA-Z]+\}") // one or more English letters
.Matches(text, @"\{\p{L}+\}") // one or more Unicode letters
.Matches(text, @"\{[^}{]+\}") // one or more characters except "{" and "}"

How to split a string every time the character changes?

I'd like to turn a string such as abbbbcc into an array like this: [a,bbbb,cc] in C#. I have tried the regex from this Java question like so:
var test = "aabbbbcc";
var split = new Regex("(?<=(.))(?!\\1)").Split(test);
but this results in the sequence [a,a,bbbb,b,cc,c] for me. How can I achieve the same result in C#?
Here is a LINQ solution that uses Aggregate:
var input = "aabbaaabbcc";
var result = input
.Aggregate(" ", (seed, next) => seed + (seed.Last() == next ? "" : " ") + next)
.Trim()
.Split(' ');
It aggregates each character based on the last one read, then if it encounters a new character, it appends a space to the accumulating string. Then, I just split it all at the end using the normal String.Split.
Result:
["aa", "bb", "aaa", "bb", "cc"]
I don't know how to get it done with split. But this may be a good alternative:
//using System.Linq;
var test = "aabbbbcc";
var matches = Regex.Matches(test, "(.)\\1*");
var split = matches.Cast<Match>().Select(match => match.Value).ToList();
There are several things going on here that are producing the output you're seeing:
1. The regex combines a positive lookbehind and a negative lookahead to find each position where the preceding character does not match the one following it.
2. It creates a capture group for every match, and these are then fed into the Split method as delimiters. The capture group is required by the negative lookahead, specifically the \1 backreference, which basically means "the value of the first capture group in the pattern", so it cannot be omitted.
3. Regex.Split, given one or more capture groups in the pattern that identifies the splitting delimiters, will include the captured text in the result for every individual split.
Number 3 is why your string array looks odd: Split splits after the last a of the run, which gives split[0]. This is followed by the captured delimiter at split[1], and so on.
There is no way to override this behaviour on calling Split.
Either compensation as per Gusman's answer or projecting the results of a Matches call as per Ruard's answer will get you what you want.
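A small sketch that makes the interleaving visible. It uses the string abbbbcc from the question; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string sample = "abbbbcc";

// The pattern matches the empty position after each run; group 1 captures
// the character just before that position.
string[] pieces = Regex.Split(sample, @"(?<=(.))(?!\1)");

// Regex.Split adds the text captured by group 1 after each piece, and the
// match at the very end of the input leaves a trailing empty string.
Console.WriteLine(string.Join("|", pieces)); // a|a|bbbb|b|cc|c|
```

The pieces at even indices are the runs and the ones at odd indices are the captured characters, which is why filtering by index, or switching to Matches, gives the intended result.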
To be honest I don't exactly understand how that regex works, but you can "repair" the output very easily:
Regex reg = new Regex("(?<=(.))(?!\\1)", RegexOptions.Singleline);
var res = reg.Split("aaabbcddeee").Where((value, index) => index % 2 == 0 && value != "").ToArray();
You could do this easily with LINQ, but I don't think its runtime will be as good as the regex.
A whole lot easier to read though.
var myString = "aaabbccccdeee";
var splits = myString.ToCharArray()
.GroupBy(chr => chr)
.Select(grp => new string(grp.Key, grp.Count()));
This returns the values ['aaa', 'bb', 'cccc', 'd', 'eee'].
However, this won't work if you have a string like "aabbaa": GroupBy groups all equal characters regardless of position, so you'll just get ["aaaa","bb"] as a result instead of ["aa","bb","aa"].
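A plain loop that only compares adjacent characters avoids that limitation. This is a minimal sketch; the helper name SplitIntoRuns is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits a string into runs of identical adjacent characters.
static List<string> SplitIntoRuns(string text)
{
    var runs = new List<string>();
    var current = new StringBuilder();
    foreach (char c in text)
    {
        // A different character ends the current run.
        if (current.Length > 0 && current[current.Length - 1] != c)
        {
            runs.Add(current.ToString());
            current.Clear();
        }
        current.Append(c);
    }
    if (current.Length > 0)
    {
        runs.Add(current.ToString()); // flush the final run
    }
    return runs;
}

Console.WriteLine(string.Join(", ", SplitIntoRuns("aabbaa"))); // aa, bb, aa
```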

Trying to find word in string but getting enumeration yields no result

I have one long string in which I want to find words starting with Emp, but only directly after a dot, and for each match extract the part after the dot.
Below is my string:
Value.EmployeeRFID,Value.EmployeeRFID1,Value.EmkhasisGFTR,Value.EmployeeGHID,Value.EmployeeFCKJ
From the input above I want to extract only the part after the dot (for example EmployeeRFID, EmployeeRFID1, etc.) and add it to the list below:
var list= new List<string>();
Expected output in above list variable:
[0]: EmployeeRFID, [1]: EmployeeRFID1, [2]: EmployeeGHID, [3]: EmployeeFCKJ
This is how I am trying it with LINQ, but I am getting "Enumeration yielded no results":
string str="Value.EmployeeRFID,Value.EmployeeRFID1,Value.EmkhasisGFTR,Value.EmployeeGHID,Value.EmployeeFCKJ";
var data = str.Where(t => t.ToString().StartsWith("Emp")).Select(t => t.ToString()); // Enumeration yielded no results
Try regular expressions:
string source = "Value.EmployeeRFID,...,Value.EmployeeGHID,Value.EmployeeFCKJ";
string pattern = @"(?<=\.)Emp\w*";
string[] result = Regex.Matches(source, pattern, RegexOptions.IgnoreCase)
.OfType<Match>()
.Select(match => match.Value)
.ToArray();
Test:
// EmployeeRFID
// EmployeeRFID1
// EmployeeGHID
// EmployeeFCKJ
Console.Write(string.Join(Environment.NewLine, result));
A string passed to LINQ extension methods is an IEnumerable<char>, so your t is only one character. You probably meant to do something like this:
var data= str.Split('.')
.Where(t => t.StartsWith("Emp")).Select(t => t.Split(',').First())
.ToList();
But regular expressions as suggested by Dmitry seem to be a better approach for string parsing than LINQ.
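For comparison, a LINQ-only variant can split on the commas first and then look at what follows the dot in each item. This is a sketch under the assumption that every item has the form Value.Name; the variable names are illustrative.

```csharp
using System;
using System.Linq;

string csv = "Value.EmployeeRFID,Value.EmployeeRFID1,Value.EmkhasisGFTR,Value.EmployeeGHID,Value.EmployeeFCKJ";

var employeeFields = csv
    .Split(',')                                            // one entry per "Value.Name" item
    .Select(item => item.Substring(item.IndexOf('.') + 1)) // keep the part after the first dot
    .Where(name => name.StartsWith("Emp", StringComparison.Ordinal))
    .ToList();

Console.WriteLine(string.Join(", ", employeeFields));
// EmployeeRFID, EmployeeRFID1, EmployeeGHID, EmployeeFCKJ
```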

How do I split a string by a character, but only when it is not contained within parentheses?

Input: ((Why,Heck),(Ask,Me),(Bla,No))
How can I split this data into a string array:
Element1 (Why,Heck)
Element2 (Ask,Me)
Element3 (Bla,No)
I tried String.Split and String.TrimEnd/TrimStart, but had no luck: the result is always wrong.
Would it be better with Regex?
var input = "((Why,Heck),(Ask,Me),(Bla,No))";
var result = Regex.Matches(input, @"\([^\(\)]+?\)")
.Cast<Match>()
.Select(m => m.Value)
.ToList();
Another, non-regex approach that should work:
string[] result = str.Split(new[]{"),"}, StringSplitOptions.None)
.Select(s => string.Format("({0})", s.Trim('(', ')')))
.ToArray();
You could also:
1. Remove all parentheses to simplify your splits.
2. Split by ','.
3. Read the returned array in groups of two, using a for loop or a similar recursive algorithm: take indices 0 and 1, then 2 and 3, etc.
4. Reconstruct each group with parentheses.
Or you could just use regular expressions.
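The manual steps above can be sketched roughly as follows. It assumes every group holds exactly two items, as in the sample input, and the variable names are illustrative.

```csharp
using System;
using System.Linq;

string nested = "((Why,Heck),(Ask,Me),(Bla,No))";

// Drop every parenthesis, then split on commas.
string[] words = nested.Replace("(", "").Replace(")", "").Split(',');

// Walk the array two items at a time and wrap each pair in parentheses again.
var groups = Enumerable.Range(0, words.Length / 2)
    .Select(i => $"({words[2 * i]},{words[2 * i + 1]})")
    .ToList();

Console.WriteLine(string.Join(" ", groups)); // (Why,Heck) (Ask,Me) (Bla,No)
```

This breaks as soon as a group has a different number of items, which is one reason the regex version is usually the more robust choice here.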
