getting number from string - c#

How can I get the number from the following string:
###E[shouldbesomenumber][space or endofline]
The [] are here only for illustration; they are not present in the real string.
I am on .NET 2.0.
Thanks.

I would suggest that you use regular expressions or string operations to isolate just the numeric part, and then call int.Parse, int.TryParse, decimal.Parse, decimal.TryParse etc depending on the type of number you need to parse.
The regular expression might look something like:
@"###E(-?\d+) ?$"
You'll need to change it for non-integers, of course. Sample code:
using System;
using System.Text.RegularExpressions;
class Test
{
    static void Main(string[] arg)
    {
        // Group 1 captures the (optionally negative) integer after "###E".
        Regex regex = new Regex(@"###E(-?\d+) ?$");
        string text = "###E123 ";
        Match match = regex.Match(text);
        if (match.Success)
        {
            string group = match.Groups[1].Value;
            int parsed = int.Parse(group);
            Console.WriteLine(parsed);
        }
    }
}
Note that this could still fail with a number which exceeds the range of int. (Another reason to use int.TryParse...)
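As a rough sketch of the int.TryParse variant hinted at above: the class name NumberExtractor and the method name TryExtractNumber are made up for illustration, and the pattern is the one from the sample. It sticks to C# 2.0 features, on the assumption that it has to build for .NET 2.0.

```csharp
using System;
using System.Text.RegularExpressions;

static class NumberExtractor
{
    // Hypothetical helper (not part of the original answer).
    // Returns false when the pattern is absent or the digits do not fit in an int.
    public static bool TryExtractNumber(string text, out int value)
    {
        value = 0;
        Match match = Regex.Match(text, @"###E(-?\d+) ?$");
        if (!match.Success)
        {
            return false;
        }
        // int.TryParse reports overflow by returning false instead of throwing.
        return int.TryParse(match.Groups[1].Value, out value);
    }
}
```

A caller would declare `int n;` and test `NumberExtractor.TryExtractNumber(text, out n)` before using n.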

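static string ExtractNumber(string text)
{
    const string prefix = "###E";
    // Find the whitespace character that ends the number, if there is one.
    int index = text.IndexOfAny(new char[] { ' ', '\r', '\n' });
    // No terminator: the number runs to the end of the string.
    if (index < 0)
    {
        index = text.Length;
    }
    return text.Substring(prefix.Length, index - prefix.Length);
}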
Now that your number is extracted you can parse it or use it as it is.

Related

Split string on Nth occurrence of char

I have a lot of strings like this:
29/10/2018 14:50:09402325 671
I want to split these strings so they look like this:
29/10/2018 14:50
09402325 671
These will then be added to a data set and analysed later.
The issue I am having is if I use this code:
string[] words = emaildata.Split(':');
it splits them twice; I only want to split it once on the second occurrence of the :.
How can I do that?
You can use LastIndexOf() and some subsequent Substring() calls:
string input = "29/10/2018 14:50:09402325 671";
int index = input.LastIndexOf(':');
string firstPart = input.Substring(0, index);
string secondPart = input.Substring(index + 1);
However, another thing to ask yourself is whether you need to make it this complicated at all. It looks like this data will always be the same length up to that second : instance, right? Why not just split at a known index (i.e. without finding the : first):
string firstPart = input.Substring(0, 16);
string secondPart = input.Substring(17);
You can reverse the string, call the regular Split method limited to a single split, and then reverse the two results back.
And with a regex: https://dotnetfiddle.net/Nfiwmv
using System;
using System.Text.RegularExpressions;
public class Program {
    public static void Main() {
        string input = "29/10/2018 14:50:09402325 671";
        // Greedy (.*) runs up to the last ':'; ([^:]+) takes everything after it.
        Regex rx = new Regex(@"(.*):([^:]+)",
            RegexOptions.Compiled | RegexOptions.IgnoreCase);
        MatchCollection matches = rx.Matches(input);
        if ( matches.Count >= 1 ) {
            var m = matches[0].Groups;
            Console.WriteLine(m[1]);
            Console.WriteLine(m[2]);
        }
    }
}
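The reverse-and-split idea mentioned above could be sketched roughly as follows. The names LastSplit, Reverse and SplitOnLast are hypothetical, and the code assumes plain text without surrogate pairs, since it reverses char by char.

```csharp
using System;

static class LastSplit
{
    // Reverses the characters of a string.
    static string Reverse(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Splits once at the last occurrence of 'separator'.
    // Returns a single-element array when the separator is absent.
    public static string[] SplitOnLast(string input, char separator)
    {
        // In the reversed string the last separator comes first,
        // so a split limited to two parts cuts exactly there.
        string[] parts = Reverse(input).Split(new char[] { separator }, 2);
        if (parts.Length < 2)
        {
            return new string[] { input };
        }
        return new string[] { Reverse(parts[1]), Reverse(parts[0]) };
    }
}
```

Compared with LastIndexOf plus Substring this does more work (three reversals), so it is mostly of interest as an illustration of the idea.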

Compare if input string is in correct format

Check whether an input string entered by the user is in a format like IIPIII, where each I is any single digit and P is a character.
For example, the input 32P125 is valid, whereas N23P33 is invalid.
I tried using string.Length and string.IndexOf("P"), but how do I validate that the other positions are digits?
I'm sure someone can offer a more succinct answer but pattern matching is the way to go.
using System.Text.RegularExpressions;
string test = "32P125";
// Two digits, followed by any upper-case letter, followed by three digits.
// ^ and $ anchor the pattern so that the whole string has to match.
Regex regex = new Regex(@"^\d{2}[A-Z]\d{3}$", RegexOptions.ECMAScript);
Match match = regex.Match(test);
if (match.Success)
{
    // Valid string
}
else
{
    // Invalid string
}
Considering that 'P' has to be matched literally:
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string st1 = "32P125";
        string st2 = "N23P33";
        // ^ and $ make sure nothing else surrounds the six characters.
        Regex rg = new Regex(@"^\d{2}P\d{3}$");
        // If 'P' is not to be matched literally, replace the line above with the one below
        // Regex rg = new Regex(@"^\d{2}[A-Za-z]\d{3}$");
        Console.WriteLine(rg.IsMatch(st1));
        Console.WriteLine(rg.IsMatch(st2));
    }
}
OUTPUT
True
False
It can be encapsulated in one simple if:
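string testString = "12P123";
if (
    // guard against strings of the wrong length before indexing into them
    testString.Length == 6 &&
    // check that the third character is a letter
    Char.IsLetter(testString[2]) &&
    // if the checks above succeed, the code proceeds to the next condition (short-circuiting)
    // remove the third character; what is left should parse as a number
    // (note: int.TryParse also accepts a leading sign or surrounding whitespace)
    int.TryParse(testString.Remove(2, 1), out int i)
) {...}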
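If the format has to be exactly two digits, a literal P and three digits, an explicit character-by-character check is one alternative that avoids the looseness of int.TryParse. This is only a sketch: the names FormatCheck and IsValid are hypothetical, and it assumes that P must be the upper-case letter itself and that only the ASCII digits 0-9 count.

```csharp
using System;

static class FormatCheck
{
    // Hypothetical helper: true only for strings shaped like 32P125.
    public static bool IsValid(string s)
    {
        if (s == null || s.Length != 6)
        {
            return false;
        }
        for (int i = 0; i < s.Length; i++)
        {
            if (i == 2)
            {
                // Position 2 must be the literal letter P.
                if (s[i] != 'P') return false;
            }
            else if (s[i] < '0' || s[i] > '9')
            {
                // Every other position must be an ASCII digit.
                return false;
            }
        }
        return true;
    }
}
```

To accept any letter in the middle position, the comparison with 'P' could be swapped for Char.IsLetter.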
I would encourage the use of MaskedTextProvider over Regex.
Not only does it look cleaner, it is also less error-prone.
Sample code would look like the following:
// requires: using System.ComponentModel;
string Num = "12P123";
// 0 = required digit; P is not a mask character, so it is treated as a literal.
// (# would also accept a space or a +/- sign.)
MaskedTextProvider prov = new MaskedTextProvider("00P000");
prov.Set(Num);
var isValid = prov.MaskFull;
if (isValid) {
    string result = prov.ToDisplayString();
    Console.WriteLine(result);
}
Use simple Regular expression for this kind of stuff.

Validate and pass only valid characters against regex expression in c#

I am working on a solution where I need to validate a string and pass on only its valid characters in C#.
E.g. my regular expression is: "^\\S(|(.|\\n)*\\S)\\Z"
and the text I want to validate is below
127 Finchfield Lane
Now I know it's invalid. But how do I remove whatever is invalid according to the regex and pass the string on only if it validates successfully against the regex?
If I understand you correctly, you are looking for Regex.IsMatch:
if(Regex.IsMatch(str, "^\\S(|(.|\\n)*\\S)\\Z"))
{
// do something with the valid string
}
else
{
// strip invalid characters from the string
}
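The "strip invalid characters" branch depends on what counts as invalid. The pattern from the question only demands that the first and last characters are non-whitespace, so one plausible reading (an assumption, not something the question confirms) is that trimming is enough. A sketch with the hypothetical names AddressCleaner and Normalize:

```csharp
using System;
using System.Text.RegularExpressions;

static class AddressCleaner
{
    // Pattern taken from the question: no leading or trailing whitespace.
    const string Pattern = "^\\S(|(.|\\n)*\\S)\\Z";

    // Returns the input if it is already valid, a trimmed copy if trimming
    // makes it valid, and null when nothing valid remains (e.g. only spaces).
    public static string Normalize(string input)
    {
        if (Regex.IsMatch(input, Pattern))
        {
            return input;
        }
        string trimmed = input.Trim();
        return Regex.IsMatch(trimmed, Pattern) ? trimmed : null;
    }
}
```

Re-checking after Trim keeps the regex as the single source of truth for what is valid.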
using System;
using System.Text.RegularExpressions;
namespace PatternMatching
{
class Program
{
static void Main()
{
string pattern = @"(\d+) (\w+)";
string[] strings = { "123 ABC", "ABC 123", "CD 45678", "57998 DAC" };
foreach (var s in strings)
{
Match result = Regex.Match(s, pattern);
if (result.Success)
{
Console.WriteLine("Match: {0}", result.Value);
}
}
Console.ReadKey();
}
}
}
This seems to do what you require. Hope I haven't misunderstood.
To validate the string against the regex you can use Regex.IsMatch.
Regex.IsMatch(str, pattern); // returns true if the string is valid
If you only want the matched value, you can use Match.Value:
Match match = new Regex(@"\d+").Match(str);
string value = match.Value; // contains only the matched text; everything else is left out

Substitute only one group when dealing with an unknown number of capturing groups

Assuming I have this input:
/green/blah/agriculture/apple/blah/
I'm only trying to capture and replace the occurrence of apple (need to replace it with orange), so I have this regex
var regex = new Regex("^/(?:green|red){1}(?:/.*)+(apple){1}(?:/.*)");
So I'm grouping sections of the input, but as non-capturing, and only capturing the one I'm concerned with. According to this $` will retrieve everything before the match in the input string, and $' will get everything after, so theoretically the following should work:
"$`Orange$'"
But it only retrieves the match ("apple").
Is it possible to do this with just substitutions and NOT match evaluators and looping through groups?
The issue is that apple can occur anywhere in that url scheme, hence an unknown number of capture groups.
Thanks.
To achieve what you want, I slightly changed your regex (see the updated version at the end of the answer).
What I am doing here is turning the parts before and after apple into capturing groups. Doing this, I can use them as follows:
String replacement = "$1Orange$2";
string result = Regex.Replace(text, regex.ToString(), replacement);
I am using groups 1 and 2, and in the middle (where I expect 'apple') I insert Orange.
A complete example looks like this:
using System;
using System.Text.RegularExpressions;
public class Test
{
public static void Main()
{
String text = "/green/blah/agriculture/apple/blah/hallo/apple";
var regex = new Regex("^(/(?:green|red)/(?:[^/]+/)*?)apple(/.*)");
String replacement = "$1Orange$2";
string result = Regex.Replace(text, regex.ToString(), replacement);
Console.WriteLine(result);
}
}
See the updated regex; I needed to change it again to handle input like this:
/green/blah/agriculture/apple/blah/hallo/apple/green/blah/agriculture/apple/blah/hallo/apple
With the earlier regex, the last apple was matched instead of the first, which is not what was intended. I changed the regex to this:
var regex = new Regex("^(/(?:green|red)/(?:[^/]+/)*?)apple(/.*)");
I updated the code above accordingly.
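Since the question asks for a solution without extra capturing groups, another option in .NET is to move the surrounding context into lookarounds, so that the match itself is only apple. This is a sketch rather than part of the answer above: the names AppleReplacer and ReplaceFirstApple are hypothetical, and it relies on .NET allowing variable-length lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

static class AppleReplacer
{
    // Lookbehind: start of string, /green/ or /red/, then any number of
    // complete path segments. Lookahead: apple must be a whole segment,
    // i.e. followed by '/' or the end of the string.
    static readonly Regex AppleSegment = new Regex(
        @"(?<=^/(?:green|red)/(?:[^/]+/)*)apple(?=/|$)");

    // Replaces only the first matching apple segment.
    public static string ReplaceFirstApple(string path)
    {
        return AppleSegment.Replace(path, "Orange", 1);
    }
}
```

Because nothing but apple is consumed, the replacement string needs no group references at all.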
If you really only want to replace the first occurrence of apple and don't care about the URL structure, you can use one of the following methods.
First, simply use apple as the regex and use the Replace overload that takes a maximum number of replacements.
using System;
using System.Text.RegularExpressions;
public class Test
{
public static void Main()
{
String text = "/green/blah/agriculture/apple/blah/hallo/apple/green/blah/agriculture/apple/blah/hallo/apple";
var regex = new Regex(Regex.Escape("apple"));
String replacement = "Orange";
string result = regex.Replace(text, replacement.ToString(), 1);
Console.WriteLine(result);
}
}
The second is to use IndexOf and Substring, which can be much quicker than using the regex classes.
See the following example:
class Program
{
    static void Main(string[] args)
    {
        string search = "apple";
        string text = "/green/blah/agriculture/apple/blah/hallo/apple/green/blah/agriculture/apple/blah/hallo/apple";
        int idx = text.IndexOf(search);
        // -1 means the search text does not occur at all.
        if (idx != -1)
        {
            int endIdx = idx + search.Length;
            string first = text.Substring(0, idx);
            // Substring(endIdx) is also fine when the match ends the string.
            string second = text.Substring(endIdx);
            string result = first + "Orange" + second;
            Console.WriteLine(result);
        }
    }
}

Replace regular expression with regular expression

Consider two regular expressions:
var regex_A = @"Main\.(.+)\.Value";
var regex_B = "M_(.+)_Sp";
I want to be able to replace a string using regex_A as input, and regex_B as the replacement string. But also the other way around. And without supplying additional information like a format string per regex.
Specifically I want to create a replaced_B string from an input_A string. So:
var input_A = "Main.Rotating.Value";
var replaced_B = input_A.RegEx_Awesome_Replace(regex_A, regex_B);
Assert.AreEqual("M_Rotating_Sp", replaced_B);
And this should also work in reverse (that's the reason I can't use a simple string.Format for regex_B), because I don't want to supply a format string for every regular expression (I'm lazy).
var input_B = "M_Skew_Sp";
var replaced_A = input_B.RegEx_Awesome_Replace(regex_B, regex_A);
Assert.AreEqual("Main.Skew.Value", replaced_A);
I have no clue if this exists, or how to call it. Google search finds me all kinds of other regex replaces... not this one.
Update:
So basically I need a way to convert a regular expression to a format string.
var regex_A_format = Regex2Format(regex_A);
Assert.AreEqual("Main.$1.Value", regex_A_format);
and
var regex_B_format = Regex2Format(regex_B);
Assert.AreEqual("M_$1_Sp", regex_B_format);
So what should the RegEx_Awesome_Replace and/or Regex2Format function look like?
Update 2:
I guess the RegEx_Awesome_Replace should look something like (using some code from answers below):
public static class StringExtenstions
{
public static string RegExAwesomeReplace(this string inputString,string searchPattern,string replacePattern)
{
return Regex.Replace(inputString, searchPattern, Regex2Format(replacePattern));
}
}
Which would leave the Regex2Format as an open question.
There is no defined way for one regex to refer to a match found in another regex. Regexes are not format strings.
What you can do is to use Tuples of a format string together with its regex. e.g.
var a = new Tuple<Regex,string>(new Regex(@"(?<=Main\.).+(?=\.Value)"), @"Main.{0}.Value");
var b = new Tuple<Regex,string>(new Regex(@"(?<=M_).+(?=_Sp)"), @"M_{0}_Sp");
Then you can pass these objects to a common replacement method in any order, like this:
private string RegEx_Awesome_Replace(string input, Tuple<Regex,string> toFind, Tuple<Regex,string> replaceWith)
{
return string.Format(replaceWith.Item2, toFind.Item1.Match(input).Value);
}
You will notice that I have used zero-width positive lookahead and lookbehind assertions in my regexes, to ensure that Value contains exactly the text that I want to replace.
You may also want to add error handling for cases where no match is found; it may help to read about Regex.Match.
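One way the error handling could look, as a sketch: the class name SafeReplace and the choice of ArgumentException are assumptions, and returning the input unchanged would be just as defensible.

```csharp
using System;
using System.Text.RegularExpressions;

static class SafeReplace
{
    // Same idea as the replacement method above, but it checks Match.Success
    // instead of formatting an empty string when the input does not match.
    public static string Replace(
        string input,
        Tuple<Regex, string> toFind,
        Tuple<Regex, string> replaceWith)
    {
        Match match = toFind.Item1.Match(input);
        if (!match.Success)
        {
            throw new ArgumentException(
                "Input does not match the expected pattern: " + input, "input");
        }
        return string.Format(replaceWith.Item2, match.Value);
    }
}
```

Throwing makes a mismatch visible at the call site, whereas a failed Match has an empty Value and would otherwise silently produce something like M__Sp.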
Since you have already reduced your problem to where you need to change a Regex into a string format (implementing Regex2Format) I will focus my answer just on that part. Note that my answer is incomplete because it doesn't address the full breadth of parsing regex capturing groups, however it works for simple cases.
The first thing needed is a Regex that will match Regex capture groups. There is a negative lookbehind so that escaped bracket symbols are not matched. There are other cases that break this regex, e.g. a non-capturing group, wildcard symbols, or things between square brackets.
private static readonly Regex CaptureGroupMatcher = new Regex(@"(?<!\\)\([^\)]+\)");
The implementation of Regex2Format here basically writes everything outside of capture groups into the output string, and replaces the capture group value by {x}.
static string Regex2Format(string pattern)
{
var targetBuilder = new StringBuilder();
int previousEndIndex = 0;
int formatIndex = 0;
foreach (Match match in CaptureGroupMatcher.Matches(pattern))
{
var group = match.Groups[0];
int endIndex = group.Index;
AppendPart(pattern, previousEndIndex, endIndex, targetBuilder);
targetBuilder.Append('{');
targetBuilder.Append(formatIndex++);
targetBuilder.Append('}');
previousEndIndex = group.Index + group.Length;
}
AppendPart(pattern, previousEndIndex, pattern.Length, targetBuilder);
return targetBuilder.ToString();
}
This helper function writes pattern string values into the output, it currently writes everything except \ characters used to escape something.
static void AppendPart(string pattern, int previousEndIndex, int endIndex, StringBuilder targetBuilder)
{
for (int i = previousEndIndex; i < endIndex; i++)
{
char c = pattern[i];
if (c == '\\' && i < pattern.Length - 1 && pattern[i + 1] != '\\')
{
//backslash not followed by another backslash - it's an escape char
}
else
{
targetBuilder.Append(c);
}
}
}
Test cases
static void Test()
{
var cases = new Dictionary<string, string>
{
{ @"Main\.(.+)\.Value", @"Main.{0}.Value" },
{ @"M_(.+)_Sp(.*)", "M_{0}_Sp{1}" },
{ @"M_\(.+)_Sp", @"M_(.+)_Sp" },
};
foreach (var kvp in cases)
{
if (Regex2Format(kvp.Key) != kvp.Value)
{
Console.WriteLine("Test failed for {0} - expected {1} but got {2}", kvp.Key, kvp.Value, Regex2Format(kvp.Key));
}
}
}
To wrap up, here is the usage:
private static string AwesomeRegexReplace(string input, string sourcePattern, string targetPattern)
{
var targetFormat = Regex2Format(targetPattern);
return Regex.Replace(input, sourcePattern, match =>
{
var args = match.Groups.OfType<Group>().Skip(1).Select(g => g.Value).ToArray<object>();
return string.Format(targetFormat, args);
});
}
Something like this might work
var replaced_B = Regex.Replace(input_A, @"Main\.(.+)\.Value", @"M_$1_Sp");
Are you looking for something like this?
public static class StringExtenstions
{
public static string RegExAwesomeReplace(this string inputString,string searchPattern,string replacePattern)
{
Match searchMatch = Regex.Match(inputString,searchPattern);
Match replaceMatch = Regex.Match(inputString, replacePattern);
if (!searchMatch.Success || !replaceMatch.Success)
{
return inputString;
}
return inputString.Replace(searchMatch.Value, replaceMatch.Value);
}
}
The string extension method returns the string with the value matched by the search pattern replaced by the value matched by the replace pattern.
This is how you call it:
input_A.RegExAwesomeReplace(regex_A, regex_B);
