Splitting text and integers into array/list - c#

I'm trying to find a way to split a string by its letters and numbers, but I've had no luck.
An example:
I have a string "AAAA000343BBB343"
I need to split it either into 2 values, "AAAA000343" and "BBB343", or into 4 values: "AAAA", "000343", "BBB", "343".
Any help would be much appreciated
Thanks

Here is a RegEx approach to split your string into 4 values
string input = "AAAA000343BBB343";
string[] result = Regex.Matches(input, @"[a-zA-Z]+|\d+")
.Cast<Match>()
.Select(x => x.Value)
.ToArray(); //"AAAA" "000343" "BBB" "343"

You can use a regex.
For "AAAA000343" and "BBB343":
var regex = new Regex(@"[a-zA-Z]+\d+");
var result = regex
.Matches("AAAA000343BBB343")
.Cast<Match>()
.Select(x => x.Value);
// result outputs: "AAAA000343" and "BBB343"
For the 4 values "AAAA", "000343", "BBB", "343", see @fubo's answer.

Try this:
var numAlpha = new Regex("(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)");
var match = numAlpha.Match("codename123");
var Character = match.Groups["Alpha"].Value;
var Integer = match.Groups["Numeric"].Value;
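The named-group snippet above calls Match once, so it yields only the first letters/digits pair. Here is a minimal sketch, assuming you want every pair from the question's input. The names AlphaNumericSplitter and SplitPairs are hypothetical, and `+` is used instead of `*` so that a pair needs at least one letter and one digit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlphaNumericSplitter
{
    // Hypothetical helper: returns every (letters, digits) pair found in the input.
    public static List<(string Alpha, string Numeric)> SplitPairs(string input)
    {
        return Regex.Matches(input, @"(?<Alpha>[a-zA-Z]+)(?<Numeric>[0-9]+)")
            .Cast<Match>()
            .Select(m => (m.Groups["Alpha"].Value, m.Groups["Numeric"].Value))
            .ToList();
    }

    public static void Main()
    {
        // Prints one pair per line for the question's sample string.
        foreach (var (alpha, numeric) in SplitPairs("AAAA000343BBB343"))
        {
            Console.WriteLine($"{alpha} {numeric}");
        }
    }
}
```

Joining Alpha and Numeric gives the 2-value split, and reading them separately gives the 4-value split.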

Related

C# Returning words in a string that contains another string

Hello, I'm a beginner at C#. I want to know how I can return the words in a string that contain another string.
For example:
string s1 = "This is a string";
string s2 = "is";
I know I can use the following code to return the whole string:
if (s1.Contains(s2))
{
Console.WriteLine(s1);
}
else
{
Console.WriteLine("{0} does not contain {1}",s1,s2);
}
But how can I return only the words that contain the second string?
So:
Result: This is
Thank you in advance.
A very simple and quick solution, assuming you define a 'word' as a piece of your string delimited by a space:
var containingWords = s1.Split(' ').Where(word => word.Contains(s2));
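// Alternative: print the first occurrence of s2 within s1.
string s1 = "This is a string";
string s2 = "is";
int _index = s1.IndexOf(s2);
if (_index > -1)
{
// Substring takes (startIndex, length), so the second argument is s2.Length.
Console.WriteLine(s1.Substring(_index, s2.Length));
}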
First you have to split the string into words. Assuming that a word is a sequence of letters or apostrophes, you can do it with the help of regular expressions:
var words = Regex
.Matches(s1, @"[\p{L}']+")
.Cast<Match>()
.Select(match => match.Value);
However, you don't want all the words; you have to filter them, say, with the help of LINQ's Where:
string[] words = Regex
.Matches(s1, @"[\p{L}']+")
.Cast<Match>()
.Select(match => match.Value)
.Where(word => word.Contains(s2))
.ToArray();
You can Join all the words found into a single string:
string result = string.Join(" ", Regex
.Matches(s1, @"[\p{L}']+")
.Cast<Match>()
.Select(match => match.Value)
.Where(word => word.Contains(s2)));
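The split-and-filter idea from the answers above can be put together as a small runnable program. This is only a sketch: the names WordFilter and WordsContaining are hypothetical, words are assumed to be separated by spaces, and string.Contains compares case-sensitively.

```csharp
using System;
using System.Linq;

public static class WordFilter
{
    // Hypothetical helper: returns the space-separated words of text that contain fragment.
    public static string[] WordsContaining(string text, string fragment)
    {
        return text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => word.Contains(fragment)) // case-sensitive match
            .ToArray();
    }

    public static void Main()
    {
        string[] words = WordsContaining("This is a string", "is");
        Console.WriteLine(string.Join(" ", words)); // prints "This is"
    }
}
```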

extracting strings between 2 chars - all occurrences

I would like to do something like this:
My string example: "something;123:somethingelse;156:somethingelse2;589:somethingelse3"
I would like to get an array with values extracted from the string example. These values lie between ";" and ":": 123, 156, 589
I have tried this, but I do not know how to iterate to get all occurrences:
string str = stringExample.Split(';', ':')[1];
string[i] arr = str;
Thank you for helping me.
LINQ is your friend here, something like this would do:
str.Split(';').Select(s => s.Split(':')[0]).Skip(1)
I would work with named groups:
string stringExample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
Regex r = new Regex(";(?<digit>[0-9]+):");
foreach (Match item in r.Matches(stringExample))
{
var digit = item.Groups["digit"].Value;
}
You can use a regular expression like this:
Regex r = new Regex(@";(\d+):");
string s = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
foreach(Match m in r.Matches(s))
Console.WriteLine(m.Groups[1]);
;(\d+): matches one or more digits standing between ; and :, and Groups[1] selects the content inside the parentheses, i.e. the digits.
Output:
123
156
589
To get these strings into an array use:
string[] numberStrings = r.Matches(s).OfType<Match>()
.Select(m => m.Groups[1].Value)
.ToArray();
If you want to extract all 3 numbers, you could use this approach:
string stringExample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
string[] allTokens = stringExample.Split(';', ':'); // remove [1] since you want the whole array
string[] allNumbers = allTokens.Where(str => str.All(Char.IsDigit)).ToArray();
Result is:
allNumbers {string[3]} string[]
[0] "123" string
[1] "156" string
[2] "589" string
This sounds like a perfect case for a regular expression.
var sample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
var regex = new Regex(@"(?<=;)(\d+)(?=:)");
var matches = regex.Matches(sample);
var array = matches.Cast<Match>().Select(m => m.Value).ToArray();
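If the values are needed as integers rather than strings, the capture-group approach can be extended with a parse step. This is a sketch under assumptions: NumberExtractor and ExtractNumbers are hypothetical names, [0-9] is used instead of \d so that only ASCII digits are accepted, and every number is assumed to fit in an int (int.Parse throws an OverflowException otherwise).

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Hypothetical helper: returns every run of ASCII digits that sits between ';' and ':'.
    public static int[] ExtractNumbers(string input)
    {
        return Regex.Matches(input, @";([0-9]+):")
            .Cast<Match>()
            .Select(m => int.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture))
            .ToArray();
    }

    public static void Main()
    {
        string sample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
        Console.WriteLine(string.Join(", ", ExtractNumbers(sample))); // prints "123, 156, 589"
    }
}
```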

Using Regex, how to find repeating patterns between 2 characters?

How can I use regex to find anything between 2 ASCII codes?
ASCII code STX (\u0002) and ETX (\u0003)
Example string "STX,T1,ETXSTX,1,1,1,1,1,1,ETXSTX,A,1,0,B,ERRETX"
Using Regex on the above my matches should be
,T1,
,1,1,1,1,1,1,
,A,1,0,B,ERR
I did a bit of googling and tried the following pattern, but it didn't find anything:
@"^\u0002.*\u0003$"
UPDATE: Thank you all, some great answers below and all seem to work!
You could use Regex.Split.
var input = (char)2 + ",T1," + (char)3 + (char)2 + ",1,1,1,1,1,1," + (char)3 + (char)2 + ",A,1,0,B,ERR" + (char)3;
var result = Regex.Split(input, "\u0002|\u0003").Where(r => !String.IsNullOrEmpty(r));
You may use a non-regex solution, too (based on Wyatt's answer):
var result = input.Split(new[] {'\u0002', '\u0003'}) // split with the known char delimiters
.Where(p => !string.IsNullOrEmpty(p)) // Only take non-empty ones
.ToList();
A Regex solution I suggested in comments:
var res = Regex.Matches(input, "(?s)\u0002(.*?)\u0003")
.OfType<Match>()
.Select(p => p.Groups[1].Value)
.ToList();
var s = "STX,T1,ETXSTX,1,1,1,1,1,1,ETXSTX,A,1,0,B,ERRETX";
s = s.Replace("STX", "\u0002");
s = s.Replace("ETX", "\u0003");
var result1 = Regex.Split(s, @"[\u0002\u0003]").Where(a => a != String.Empty).ToList();
result1.ForEach(a=>Console.WriteLine(a));
Console.WriteLine("------------ OR WITHOUT REGEX ---------------");
var result2 = s.Split(new char[] { '\u0002','\u0003' }, StringSplitOptions.RemoveEmptyEntries).ToList();
result2.ForEach(a => Console.WriteLine(a));
output:
,T1,
,1,1,1,1,1,1,
,A,1,0,B,ERR
------------ OR WITHOUT REGEX ---------------
,T1,
,1,1,1,1,1,1,
,A,1,0,B,ERR
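One note on the pattern from the question: when the input really contains the control characters, the anchored, greedy pattern ^\u0002.*\u0003$ can match at most once, spanning from the first STX to the last ETX, so it cannot return the three frames separately. The sketch below instead matches each frame by stopping at the first ETX. The names FrameExtractor and ExtractFrames are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FrameExtractor
{
    // Hypothetical helper: returns the text of every STX...ETX frame in the input.
    public static List<string> ExtractFrames(string input)
    {
        // [^\u0003]* cannot run past an ETX, so each frame becomes its own match.
        return Regex.Matches(input, @"\u0002([^\u0003]*)\u0003")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        const string Stx = "\u0002";
        const string Etx = "\u0003";
        string input = Stx + ",T1," + Etx + Stx + ",1,1,1,1,1,1," + Etx + Stx + ",A,1,0,B,ERR" + Etx;

        // Prints each frame body on its own line.
        foreach (string frame in ExtractFrames(input))
        {
            Console.WriteLine(frame);
        }
    }
}
```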

How do I split a string by a character, but only when it is not contained within parentheses?

Input: ((Why,Heck),(Ask,Me),(Bla,No))
How can I split this data into a string array:
Element1 (Why,Heck)
Element2 (Ask,Me)
Element3 (Bla,No)
I tried String.Split and String.TrimEnd/TrimStart, but with no luck: the result is always wrong.
Would it be better with Regex?
var input = "((Why,Heck),(Ask,Me),(Bla,No))";
var result = Regex.Matches(input, @"\([^\(\)]+?\)")
.Cast<Match>()
.Select(m => m.Value)
.ToList();
Another, non-regex approach which should work:
string[] result = str.Split(new[]{"),"}, StringSplitOptions.None)
.Select(s => string.Format("({0})", s.Trim('(', ')')))
.ToArray();
You could also:
Remove all parentheses to simplify your splits.
Split by ','.
Read the returned array in groups of two, using a for loop or similar: take indices 0 and 1, 2 and 3, etc.
Reconstruct each group with parentheses.
Or you could just use regular expressions.
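The steps above can be sketched as follows. The names PairSplitter and SplitIntoPairs are hypothetical, and the sketch assumes every group holds exactly two comma-separated items with no further nesting; a trailing unpaired item would be ignored.

```csharp
using System;
using System.Collections.Generic;

public static class PairSplitter
{
    // Hypothetical helper implementing the listed steps without a regex.
    public static string[] SplitIntoPairs(string input)
    {
        // Step 1: remove all parentheses.
        string flat = input.Replace("(", "").Replace(")", "");

        // Step 2: split by ','.
        string[] parts = flat.Split(',');

        // Steps 3 and 4: read the items two at a time and rebuild "(a,b)".
        var result = new List<string>();
        for (int i = 0; i + 1 < parts.Length; i += 2)
        {
            result.Add("(" + parts[i] + "," + parts[i + 1] + ")");
        }
        return result.ToArray();
    }

    public static void Main()
    {
        foreach (string pair in SplitIntoPairs("((Why,Heck),(Ask,Me),(Bla,No))"))
        {
            Console.WriteLine(pair);
        }
    }
}
```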

C# Regex return numbers in brackets

I have a string like this:
numbers(23,54)
The input format is like this:
numbers([integer1],[integer2])
How can I get the numbers "23" and "54" using a regular expression? Or is there a better way to get them?
You can avoid regular expressions, since your input has a consistent format:
string input = "numbers(23,54)";
var numbers = input.Replace("numbers(", "")
.Replace(")", "")
.Split(',')
.Select(s => Int32.Parse(s));
Or even (if you're not afraid of magic numbers):
input.Substring(8, input.Length - 9).Split(',').Select(s => Int32.Parse(s))
UPDATE: Here is a Regex version as well:
var numbers = Regex.Matches(input, @"\d+")
.Cast<Match>()
.Select(m => Int32.Parse(m.Value));
Yes, use (\d+) to capture the numbers; that is the right way to get them.
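If the input should also be checked against the numbers([integer1],[integer2]) format, one option is an anchored pattern with two capture groups. This is only a sketch: NumbersParser and TryParseNumbers are hypothetical names, and it assumes non-negative integers written with ASCII digits.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumbersParser
{
    // Hypothetical helper: returns true and fills both values only when the
    // whole input has the form numbers(<digits>,<digits>) and both fit in an int.
    public static bool TryParseNumbers(string input, out int first, out int second)
    {
        first = 0;
        second = 0;
        Match m = Regex.Match(input, @"^numbers\(([0-9]+),([0-9]+)\)$");
        return m.Success
            && int.TryParse(m.Groups[1].Value, NumberStyles.None, CultureInfo.InvariantCulture, out first)
            && int.TryParse(m.Groups[2].Value, NumberStyles.None, CultureInfo.InvariantCulture, out second);
    }

    public static void Main()
    {
        if (TryParseNumbers("numbers(23,54)", out int a, out int b))
        {
            Console.WriteLine($"{a} {b}"); // prints "23 54"
        }
    }
}
```

Unlike matching \d+ anywhere in the string, this rejects input such as "numbers(23)" or "values(23,54)" instead of silently returning whatever digits it finds.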