Getting a value from a string using regular expressions? - c#

I have a string "Page 1 of 15".
I need to get the value 15 as this value could be any number between 1 and 100. I'm trying to figure out:
Whether regular expressions are best suited here. Since the string format will never change, maybe just split the string by spaces? Or is there a better solution?
How to get the value using a regular expression.

Regular expression you can use: Page \d+ of (\d+)
Regex re = new Regex(@"Page \d+ of (\d+)");
string input = "Page 1 of 15";
Match match = re.Match(input);
string pages = match.Groups[1].Value;
Analysis of expression: between ( and ) you capture a group. The \d stands for digit, the + for one or more digits. Note that it is important to be exact, so copy the spaces as well.
The code is a tad verbose, but I figured it would be easier to understand this way. With a split you just need var pages = input.Split(' ')[3];, which looks easier but is error-prone. The regex is easily extended to grab other parts of the string in one go.
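A slightly more defensive variant of the regex approach above, as a sketch only: it checks match.Success before reading the group and converts the captured text with int.TryParse. The names PageParser and TryGetTotalPages are made up for illustration, and anchoring the pattern with ^ and $ assumes the input contains nothing but the "Page x of y" text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageParser
{
    // Anchored pattern: the whole string must look like "Page <n> of <m>".
    private static readonly Regex PagePattern = new Regex(@"^Page \d+ of (\d+)$");

    // Returns true and sets totalPages when the input matches; false otherwise.
    public static bool TryGetTotalPages(string input, out int totalPages)
    {
        totalPages = 0;
        Match match = PagePattern.Match(input);
        return match.Success && int.TryParse(match.Groups[1].Value, out totalPages);
    }
}
```

With this, PageParser.TryGetTotalPages("Page 1 of 15", out int pages) returns true and sets pages to 15, while an input that does not follow the format simply returns false instead of throwing.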

var myString = "Page 1 of 15";
var number = myString.Substring(myString.LastIndexOf(' ') + 1);
If there is a risk of whitespace at the end of the string, trim it first with TrimEnd:
var trimmed = myString.TrimEnd();
var number = trimmed.Substring(trimmed.LastIndexOf(' ') + 1);

I think a simple String.Replace() is the best, most readable solution for something so simple.
string myString = "Page 1 of 15";
string pageNumber = myString.Replace("Page 1 of ", "");
EDIT:
The above solution assumes that the string will never be Page 2 of 15. If the first page number can change, you'll need to use String.Split() instead.
string myString = "Page 1 of 15";
string pageNumber = myString.Split(new string[] {"of"},
StringSplitOptions.None).Last().Trim();

If the string format will never change, you can try this:
string input = "Page 1 of 15";
string[] splits = input.Split(' ');
int totalPages = int.Parse(splits[splits.Length - 1]);

If this is the only case you will ever have to handle, just split the string by spaces and parse the fourth part to an integer. Something like this will work:
string input = "Page 1 of 15";
string[] splitInput = input.Split(' ');
int numberOfPages = int.Parse(splitInput[3]);

In C# it should look like this (using the Regex class):
Regex r = new Regex(@"Page \d+ of (\d+)");
var str = "Page 1 of 15";
var matches = r.Matches(str);
Your result will be in: matches[0].Groups[1]

You don't need a regex for this. Just get the index of the last space:
string var = "1 of 100";
string secondvar = string.Empty;
int startindex = var.LastIndexOf(" ");
if (startindex > -1)
{
secondvar = var.Substring(startindex +1);
}

Related

Extract specific number from string with fixed pattern in C#

This might sound like a very basic question, but it's one that's given me quite a lot of trouble in C#.
Assume I have, for example, the following Strings known as my chosenTarget.titles:
2008/SD128934 - Wordz aaaaand more words (1233-26-21)
20998/AD1234 - Wordz and less words (1263-21-21)
208/ASD12345 - Wordz and more words (1833-21-21)
Now as you can see, all three Strings are different in some ways.
What I need is to extract a very specific part of these Strings, but getting the subtleties right is what confuses me, and I was wondering if some of you knew better than I.
What I know is that the Strings will always come in the following pattern:
yearNumber + "/" + aFewLetters + theDesiredNumber + " - " + descriptiveText + " (" + someDate + ")"
In the above example, what I would want to return to me would be:
128934
1234
12345
I need to extract theDesiredNumber.
Now, I'm not (that) lazy so I have made a few attempts myself:
var a = chosenTarget.title.Substring(chosenTarget.title.IndexOf("/") + 1);
What this has done is sliced out yearNumber and the /, leaving me with aFewLetters before theDesiredNumber.
I have a hard time properly removing the rest however, and I was wondering if any of you could aid me in the matter?
It sounds as if you only need to extract the number behind the first / which ends at -. You could use a combination of string methods and LINQ:
int startIndex = str.IndexOf("/");
string number = null;
if (startIndex >= 0 )
{
int endIndex = str.IndexOf(" - ", startIndex);
if (endIndex >= 0)
{
startIndex++;
string token = str.Substring(startIndex, endIndex - startIndex); // SD128934
number = String.Concat(token.Where(char.IsDigit)); // 128934
}
}
Another mainly LINQ approach using String.Split:
number = String.Concat(
str.Split(new[] { " - " }, StringSplitOptions.None)[0]
.Split('/')
.Last()
.Where(char.IsDigit));
Try this:
int indexSlash = chosenTarget.title.IndexOf("/");
int indexDash = chosenTarget.title.IndexOf("-");
// "out" is a reserved keyword in C#, so the result needs a different name
string digits = new string(chosenTarget.title.Substring(indexSlash, indexDash - indexSlash).Where(c => Char.IsDigit(c)).ToArray());
You can use a regex:
var pattern = @"[0-9]+/[A-Za-z]+([0-9]+)"; // group 1 captures theDesiredNumber
var matcher = new Regex(pattern);
var result = matcher.Matches(yourEntireSetOfLinesInAString);
Or you can loop every line and use Match instead of Matches. In this case you don't need to build a "matcher" in every iteration but build it outside the loop
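Building the matcher once outside the loop, as suggested above, might look like the sketch below. TitleParser and ExtractNumbers are hypothetical names, and the pattern is an assumption derived from the format described in the question (digits, a slash, some letters, then the wanted number).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TitleParser
{
    // Built once and reused for every line: digits, '/', letters, then the captured number.
    private static readonly Regex NumberPattern = new Regex(@"^\d+/[A-Za-z]+(\d+)");

    // Returns the captured number for each title that matches; non-matching titles are skipped.
    public static List<string> ExtractNumbers(IEnumerable<string> titles)
    {
        var numbers = new List<string>();
        foreach (string title in titles)
        {
            Match match = NumberPattern.Match(title);
            if (match.Success)
            {
                numbers.Add(match.Groups[1].Value);
            }
        }
        return numbers;
    }
}
```

For the three sample titles from the question this returns 128934, 1234 and 12345, in that order.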
Regex is your friend:
(new [] {"2008/SD128934 - Wordz aaaaand more words (1233-26-21)",
"20998/AD1234 - Wordz and less words (1263-21-21)",
"208/ASD12345 - Wordz and more words (1833-21-21)"})
.Select(x => new Regex(@"\d+/[A-Z]+(\d+)").Match(x).Groups[1].Value)
The pattern you had recognized is very important, here is the solution:
const string pattern = @"\d+\/[a-zA-Z]+(\d+).*$";
string s1 = @"2008/SD128934 - Wordz aaaaand more words(1233-26-21)";
string s2 = @"20998/AD1234 - Wordz and less words(1263-21-21)";
string s3 = @"208/ASD12345 - Wordz and more words(1833-21-21)";
var strings = new List<string> { s1, s2, s3 };
var desiredNumber = string.Empty;
foreach (var s in strings)
{
var match = Regex.Match(s, pattern);
if (match.Success)
{
desiredNumber = match.Groups[1].Value;
}
}
I would use a RegEx for this, the string you're looking for is in Match.Groups[1]
string composite = "2008/SD128934 - Wordz aaaaand more words (1233-26-21)";
Match m = Regex.Match(composite, @"^\d+\/[a-zA-Z]+(\d+)");
if (m.Success) Console.WriteLine(m.Groups[1]);
The breakdown of the RegEx is as follows
"^\d+\/[a-zA-Z]+(\d+)"
^ - Indicates that it's the beginning of the string
\d+ - One or more digits (the year number; the samples show it is not always four digits long)
\/ - /
[a-zA-Z]+ - One or more letters
(\d+) - One or more digits (the parentheses indicate that this part is captured as a group, in this case group 1)

How to extract specific number in a string surrounded by numbers and text C#

I am trying to extract a specific number from a string in the format "Q23-00000012-A14". I only want the value of the 8-digit middle segment (00000012), that is, 12.
string rx = "Q23-00000012-A14";
string numb = Regex.Replace(rx, @"\D", "");
txtResult.Text = numb;
But I'm getting the result 230000001214. I only want to get the 12 and disregard the rest. Can someone guide me?
If your strings are always in this format (the numbers are delimited by "-"), I suggest using string.Split():
string rx = "Q23-00000012-A14";
string numb = int.Parse(rx.Split('-')[1]).ToString(); // this will get 12 for you
txtResult.Text = numb;
It's an easier way than using regex.
Edit: when you use rx.Split('-'), it breaks the string into an array of the substrings before and after each '-'.
So in this case:
rx.Split('-')[0] = "Q23"
rx.Split('-')[1] = "00000012"
rx.Split('-')[2] = "A14"
So you shouldn't use Replace. Use Match instead.
string pattern = @"[A-Z]\d+-(\d+)-[A-Z]\d+";
var regex = new Regex(pattern);
var match = regex.Match("Q23-00000012-A14");
if (match.Success)
{
String eightNumberString = match.Groups[1].Value; // Contains "00000012"
int yourvalueAsInt = Convert.ToInt32(eightNumberString) ; // Contains 12
}
Why don't you simply use the Substring or Split function?
string rx = "Q23-00000012-A14";
// substring: the 8-digit segment starts at index 4
int numb = int.Parse(rx.Substring(4, 8));
// or split
// int numb = int.Parse(rx.Split('-')[1]);
txtResult.Text = numb.ToString();
(I think the Split method is the better way, because if the length of the 'Q23' prefix changes the method still works.)
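If the input might not always be well formed, the Split approach can be wrapped in a small check. This is only a sketch: CodeParser and TryGetMiddleNumber are invented names, and requiring exactly three '-' separated segments is an assumption based on the sample string.

```csharp
using System;

public static class CodeParser
{
    // Splits a code such as "Q23-00000012-A14" on '-' and parses the middle segment.
    public static bool TryGetMiddleNumber(string code, out int value)
    {
        value = 0;
        string[] parts = code.Split('-');
        // Expect exactly three segments; anything else is treated as malformed.
        return parts.Length == 3 && int.TryParse(parts[1], out value);
    }
}
```

CodeParser.TryGetMiddleNumber("Q23-00000012-A14", out int n) returns true with n equal to 12, because int.TryParse ignores the leading zeros.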

How to replace the text between two characters in c#

I am a bit confused about writing the regex for finding the text between the two delimiters { } and replacing it with other text in C#. How do I replace it?
I tried this.
StreamReader sr = new StreamReader(@"C:\abc.txt");
string line;
line = sr.ReadLine();
while (line != null)
{
if (line.StartsWith("<"))
{
if (line.IndexOf('{') == 29)
{
string s = line;
int start = s.IndexOf("{");
int end = s.IndexOf("}");
string result = s.Substring(start+1, end - start - 1);
}
}
//write the line to console window
Console.WriteLine(line);
//Read the next line
line = sr.ReadLine();
}
//close the file
sr.Close();
Console.ReadLine();
I want to replace the found text (result) with another text.
Use Regex with pattern: \{([^\}]+)\}
Regex yourRegex = new Regex(@"\{([^\}]+)\}");
string result = yourRegex.Replace(yourString, "anyReplacement");
string s = "data{value here} data";
int start = s.IndexOf("{");
int end = s.IndexOf("}", start);
string result = s.Substring(start+1, end - start - 1);
s = s.Replace(result, "your replacement value");
To get the string between the braces to be replaced, use the Regex pattern
string errString = "This {match here} uses 3 other {match here} to {match here} the {match here}ation";
string toReplace = Regex.Match(errString, @"\{([^\}]+)\}").Groups[1].Value;
Console.WriteLine(toReplace); // prints 'match here'
To then replace the text found you can simply use the Replace method as follows:
string correctString = errString.Replace(toReplace, "document");
Explanation of the Regex pattern:
\{ # Escaped curly brace, means "starts with a '{' character"
( # Parentheses in a regex mean "put (capture) the stuff
# in between into the Groups array"
[^\}] # Any character that is not a '}' character
+ # One or more occurrences of the aforementioned "non '}' char"
) # Close the capturing group
\} # "Ends with a '}' character"
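As a sketch of doing the match and the replacement in one step, Regex.Replace can be given a callback, which replaces the content of every {...} pair while keeping the braces. BraceReplacer and ReplaceInsideBraces are hypothetical names, not part of any answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceReplacer
{
    // Same idea as the pattern explained above: '{', one or more non-'}' characters, '}'.
    private static readonly Regex BracePattern = new Regex(@"\{([^}]+)\}");

    // Replaces the text inside every {...} pair while keeping the braces themselves.
    public static string ReplaceInsideBraces(string input, string replacement)
    {
        // A callback is used so that a '$' in the replacement is inserted literally
        // instead of being interpreted as a substitution such as $1.
        return BracePattern.Replace(input, match => "{" + replacement + "}");
    }
}
```

Unlike String.Replace on the captured text, this only touches text that actually sits between braces, so the same words appearing elsewhere in the string are left alone.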
The following regular expression will match the criteria you specified:
string pattern = @"^(\<.{27})(\{[^}]*\})(.*)";
The following would perform a replace:
string result = Regex.Replace(input, pattern, "$1 REPLACE $3");
For the input: "<012345678901234567890123456{sdfsdfsdf}sadfsdf" this gives the output "<012345678901234567890123456 REPLACE sadfsdf"
You need two calls to Substring(), rather than one: One to get textBefore, the other to get textAfter, and then you concatenate those with your replacement.
int start = s.IndexOf("{");
int end = s.IndexOf("}");
// I skip the check that end is valid to avoid clutter
string textBefore = s.Substring(0, start);
string textAfter = s.Substring(end+1);
string replacedText = textBefore + newText + textAfter;
If you want to keep the braces, you need a small adjustment:
int start = s.IndexOf("{");
int end = s.IndexOf("}");
string textBefore = s.Substring(0, start + 1); // include the opening brace
string textAfter = s.Substring(end);
string replacedText = textBefore + newText + textAfter;
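For completeness, here is a sketch of the two-Substring approach with the validity checks that were skipped above. SubstringReplacer and ReplaceFirstBraced are made-up names, and returning the input unchanged when no complete pair of braces exists is just one possible choice.

```csharp
using System;

public static class SubstringReplacer
{
    // Replaces the first {...} span (braces included) with newText.
    // Returns the input unchanged when no complete pair is found.
    public static string ReplaceFirstBraced(string s, string newText)
    {
        int start = s.IndexOf('{');
        if (start < 0) return s;

        // Search for the closing brace only after the opening one.
        int end = s.IndexOf('}', start + 1);
        if (end < 0) return s;

        return s.Substring(0, start) + newText + s.Substring(end + 1);
    }
}
```

Searching for the closing brace from start + 1 avoids a negative length when a stray '}' appears before the first '{'.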
The simplest way is to use the Split method if you want to avoid any regex. This is one approach:
string s = "sometext {getthis}";
string result= s.Split(new char[] { '{', '}' })[1];
You can use the Regex expression that some others have already posted, or you can use a more advanced Regex that uses balancing groups to make sure the opening { is balanced by a closing }.
That expression is then (?<BRACE>\{)([^\}]*)(?<-BRACE>\})
You can test this expression online at RegexHero.
You simply match your input string with this Regex pattern, then use the replace methods of Regex, for instance:
var result = Regex.Replace(input, @"(?<BRACE>\{)([^\}]*)(?<-BRACE>\})", textToReplaceWith);
For more C# Regex Replace examples, see http://www.dotnetperls.com/regex-replace.

Regex split and replace

I need to replace a word that starts with %.
For example Welcome to home | %brand %productName
I'm hoping to split on words beginning with %, which would give me { brand, productName }.
My regex is less than average, so I would appreciate help with this.
The following code might help you:
string[] splits = "Welcome to home | %brand %productName".Split(' ');
List<string> lstdata = new List<string>();
for (int i = 0; i < splits.Length; i++)
{
if (splits[i].StartsWith("%"))
lstdata.Add(splits[i].Replace("%", ""));
}
Nothing wrong with string.split approach, mind you, but here's a regex approach:
string input = @"Welcome to home | %brand %productName";
string pattern = @"%\S+";
var matches = Regex.Matches(input, pattern);
string result = string.Empty;
for (int i = 0; i < matches.Count; i++)
{
result += "match " + i + ",value:" + matches[i].Value + "\n";
}
Console.WriteLine(result);
Try this:
(?<=%)\w+
This looks for any combination of word characters immediately preceded by a percent symbol.
Now, if you're doing search and replace on these matches, you'll probably want to remove the % sign as well, so you'd need to remove the lookbehind group and just have this:
%\w+
But in doing so, your replacement code would need to trim off the % sign from each match to get the word by itself.
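One way to avoid trimming the % by hand is to capture the word in a group and look it up in the replacement callback. The sketch below assumes the replacement values live in a dictionary keyed by the word without the %; TokenReplacer, Fill and the sample values in the usage note are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenReplacer
{
    // '%' followed by one or more word characters; group 1 is the name without the '%'.
    private static readonly Regex TokenPattern = new Regex(@"%(\w+)");

    // Replaces each %name token with values[name]; unknown tokens are left untouched.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        return TokenPattern.Replace(template, match =>
        {
            string name = match.Groups[1].Value;
            string value;
            return values.TryGetValue(name, out value) ? value : match.Value;
        });
    }
}
```

With a dictionary mapping brand to "Acme" and productName to "Widget", the template "Welcome to home | %brand %productName" becomes "Welcome to home | Acme Widget".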

Quantity of specific strings inside a string

I'm working in .NET C# and I have a string text = "Whatever text FFF you can FFF imagine";
What I need is to get the number of times "FFF" appears in the string text.
How can I accomplish that?
Thank you.
You can use regular expressions for this and just about anything you want:
string s = "Whatever text FFF you can FFF imagine";
Console.WriteLine(Regex.Matches(s, Regex.Escape("FFF")).Count);
Here are 2 approaches. Note that the regex should use the word boundary \b metacharacter to avoid incorrectly matching occurrences within other words. The solutions posted so far do not do this, which would incorrectly count "FFF" in "fooFFFbar" as a match.
string text = "Whatever text FFF you can FFF imagine fooFFFbar";
// use word boundary to avoid counting occurrences in the middle of a word
string wordToMatch = "FFF";
string pattern = @"\b" + Regex.Escape(wordToMatch) + @"\b";
int regexCount = Regex.Matches(text, pattern).Count;
Console.WriteLine(regexCount);
// split approach
int count = text.Split(' ').Count(word => word == "FFF");
Console.WriteLine(count);
int count = Regex.Matches(text, "FFF").Count;
Use the System.Text.RegularExpressions.Regex for this:
string p = "Whatever text FFF you can FFF imagine";
var regex = new System.Text.RegularExpressions.Regex("FFF");
var instances = regex.Matches(p).Count;
// instances will now equal 2
Here's an alternative to the regular expressions:
string s = "Whatever text FFF you can FFF imagine FFF";
// Split returns the pieces between the "FFF" occurrences, so we need to subtract one
int count = s.Split(new string[] { "FFF" }, StringSplitOptions.None).Count() - 1;
You could easily tweak this to use several different strings if necessary.
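Another regex-free option is a plain IndexOf loop, sketched below. SubstringCounter and CountOccurrences are hypothetical names; the method counts non-overlapping, case-sensitive occurrences and, like the Split version, does not care about word boundaries.

```csharp
using System;

public static class SubstringCounter
{
    // Counts non-overlapping, case-sensitive occurrences of value in text.
    public static int CountOccurrences(string text, string value)
    {
        if (string.IsNullOrEmpty(value)) return 0;

        int count = 0;
        int index = 0;
        while ((index = text.IndexOf(value, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += value.Length; // skip past this match so matches do not overlap
        }
        return count;
    }
}
```

For "Whatever text FFF you can FFF imagine" and "FFF" this gives 2. Because it is a plain substring search, "fooFFFbar" would also count as one occurrence.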
