Compare if input string is in correct format - c#

Check if input string entered by user is in format like IIPIII, where I is integer number, any one digit number can be used on place of I and P is a character.
Example if input is 32P125 it is valid string else N23P33 is invalid.
I tried using string.Length or string.IndexOf("P") but how to validate other integer values?

I'm sure someone can offer a more succinct answer but pattern matching is the way to go.
using System.Text.RegularExpressions;
string test = "32P125";
// 2 integers followed by any upper cased letter, followed by 3 integers.
Regex regex = new Regex(#"\d{2}[A-Z]\d{3}", RegexOptions.ECMAScript);
Match match = regex.Match(test);
if (match.Success)
{
//// Valid string
}
else
{
//// Invalid string
}

Considering that 'P' has to matched literally -
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string st1 = "32P125";
string st2 = "N23P33";
Regex rg = new Regex(#"\d{2}P\d{3}");
// If 'P' is not to be matched literally, reeplace above line with below one
// Regex rg = new Regex(#"\d{2}[A-Za-z]\d{3}");
Console.WriteLine(rg.IsMatch(st1));
Console.WriteLine(rg.IsMatch(st2));
}
}
OUTPUT
True
False

It can be encapsulated in one simple if:
string testString = "12P123";
if(
// check if third number is letter
Char.IsLetter(testString[2]) &&
// if above succeeds, code proceeds to second condition (short-circuiting)
// remove third character, after that it should be a valid number (only digits)
int.TryParse(testString.Remove(2, 1), out int i)
) {...}

I would encourage the usage of MaskedTextProvided over Regex.
Not only is this looking cleaner but it's also less error prone.
Sample code would look like the following:
string Num = "12P123";
MaskedTextProvider prov = new MaskedTextProvider("##P###");
prov.Set(Num);
var isValid = prov.MaskFull;
if(isValid){
string result = prov.ToDisplayString();
Console.WriteLine(result);
}

Use simple Regular expression for this kind of stuff.

Related

Find pattern to solve regex in one step

I have a problem to find the pattern that solves the problem in onestep.
The string looks like this:
Text1
Text1$Text2$Text3
Text1$Text2$Text3$Text4$Text5$Text6 etc.
What i want to get is: Take up to 4x Text. If there are more than "4xText" take only the last sign.
Example:
Text1$Text2$Text3$Text4$Text5$Text6 -> Text1$Text2$Text3$Text4&56
My current solution is:
First pattern:
^([^\$]*)\$?([^\$]*)\$?([^\$]*)\$?([^\$]*)\$?
After this i will do a substitution with the first pattern
New string: Text5$Text6
second pattern is:
([^\$])\b
result: 56
combine both and get the result:
Text1$Text2$Text3$Text4$56
For me it is not clear why i cant easily put the second pattern after the first pattern into one pattern. Is there something like an anchor that tells the engine to start the pattern from here like it would do if is would be the only pattern ?
You might use an alternation with a positive lookbehind and then concatenate the matches.
(?<=^(?:[^$]+\$){0,3})[^$]+\$?|[^$](?=\$|$)
Explanation
(?<= Positive lookbehind, assert what is on the left is
^(?:[^$]+\$){0,3} Match 0-3 times any char except $ followed by an optional $
) Close lookbehind
[^$]+\$? Match 1+ times any char except $, then match an optional $
| Or
[^$] Match any char except $
(?=\$|$) Positive lookahead, assert what is directly to the right is either $ or the end of the string
.NET regex demo | C# demo
Example
string pattern = #"(?<=^(?:[^$]*\$){0,3})[^$]*\$?|[^$](?=\$|$)";
string[] strings = {
"Text1",
"Text1$Text2$Text3",
"Text1$Text2$Text3$Text4$Text5$Text6"
};
Regex regex = new Regex(pattern);
foreach (String s in strings) {
Console.WriteLine(string.Join("", from Match match in regex.Matches(s) select match.Value));
}
Output
Text1
Text1$Text2$Text3
Text1$Text2$Text3$Text4$56
I strongly believe regular expression isn't the way to do that. Mostly because of the readability.
You may consider using simple algorithm like this one to reach your goal:
using System;
public class Program
{
public static void Main()
{
var input = "Text1$Text2$Text3$Text4$Text5$Text6";
var parts = input.Split('$');
var result = "";
for(var i=0; i<parts.Length; i++){
result += (i <= 4 ? parts[i] + "$" : parts[i].Substring(4));
}
Console.WriteLine(result);
}
}
There are also linq alternatives :
using System;
using System.Linq;
public class Program
{
public static void Main()
{
var input = "Text1$Text2$Text3$Text4$Text5$Text6";
var parts = input.Split('$');
var first4 = parts.Take(4);
var remainings = parts.Skip(4);
var result2 = string.Join("$", first4) + "$" + string.Join("", remainings.Select( r=>r.Substring(4)));
Console.WriteLine(result2);
}
}
It has to be adjusted to the actual needs but the idea is there
Try this code:
var texts = new string[] {"Text1", "Text1$Text2$Text3", "Text1$Text2$Text3$Text4$Text5$Text6" };
var parsed = texts
.Select(s => Regex.Replace(s,
#"(Text\d{1,3}(?:\$Text\d{1,3}){0,3})((?:\$Text\d{1,3})*)",
(match) => match.Groups[1].Value +"$"+ match.Groups[2].Value.Replace("Text", "").Replace("$", "")
)).ToArray();
// parsed is now: string[3] { "Text1$", "Text1$Text2$Text3$", "Text1$Text2$Text3$Text4$56" }
Explanation:
solution uses regex pattern: (Text\d{1,3}(?:\$Text\d{1,3}){0,3})((?:\$Text\d{1,3})*)
(...) - first capturing group
(?:...) - non-capturing group
Text\d{1,3}(?:\$Text\d{1,3} - match Text literally, then match \d{1,3}, which is 1 up to three digits, \$ matches $ literally
Rest is just repetition of it. Basically, first group captures first four pieces, second group captures the rest, if any.
We also use MatchEvaluator here which is delegate type defined as:
public delegate string MatchEvaluator(Match match);
We define such method:
(match) => match.Groups[1].Value +"$"+ match.Groups[2].Value.Replace("Text", "").Replace("$", "")
We use it to evaluate match, so takee first capturing group and concatenate with second, removing unnecessary text.
It's not clear to me whether your goal can be achieved using exclusively regex. If nothing else, the fact that you want to introduce a new character '&' into the output adds to the challenge, since just plain matching would never be able to accomplish that. Possibly using the Replace() method? I'm not sure that would work though...using only a replacement pattern and not a MatchEvaluator, I don't see a way to recognize but still exclude the "$Text" portion from the fifth instance and later.
But, if you are willing to mix regex with a small amount of post-processing, you can definitely do it:
static readonly Regex regex1 = new Regex(#"(Text\d(?:\$Text\d){0,3})(?:\$Text(\d))*", RegexOptions.Compiled);
static void Main(string[] args)
{
for (int i = 1; i <= 6; i++)
{
string text = string.Join("$", Enumerable.Range(1, i).Select(j => $"Text{j}"));
WriteLine(KeepFour(text));
}
}
private static string KeepFour(string text)
{
Match match = regex1.Match(text);
if (!match.Success)
{
return "[NO MATCH]";
}
StringBuilder result = new StringBuilder();
result.Append(match.Groups[1].Value);
if (match.Groups[2].Captures.Count > 0)
{
result.Append("&");
// Have to iterate (join), because we don't want the whole match,
// just the captured text.
result.Append(JoinCaptures(match.Groups[2]));
}
return result.ToString();
}
private static string JoinCaptures(Group group)
{
return string.Join("", group.Captures.Cast<Capture>().Select(c => c.Value));
}
The above breaks your requirement into three different capture groups in a regex. Then it extracts the captured text, composing the result based on the results.

Compare String with Regex in C#

I have a list of string taken from a file. Some of this strings are in the format "Q" + number + "null" (e.g. Q98null, Q1null, Q24null, etc)
With a foreach loop I must check if a string is just like the one shown before.
I use this right now
string a = "Q9null" //just for testing
if(a.Contains("Q") && a.Contains("null"))
MessageBox.Show("ok");
but I'd like to know if there is a better way to do this with regex.
Thank you!
Your method will produce a lot of false positives - for example, it would recognize some invalid strings, such as "nullQ" or "Questionable nullability".
A regex to test for the match would be "^Q\\d+null$". The structure is very simple: it says that the target string must start in a Q, then one or more decimal digits should come, and then there should be null at the end.
Console.WriteLine(Regex.IsMatch("Q123null", "^Q\\d+null$")); // Prints True
Console.WriteLine(Regex.IsMatch("nullQ", "^Q\\d+null$")); // Prints False
Demo.
public static bool Check(string s)
{
Regex regex = new Regex(#"^Q\d+null$");
Match match = regex.Match(s);
return match.Success;
}
Apply the above method in your code:
string a = "Q9null" //just for testing
if(Check(a))
MessageBox.Show("ok");
First way: Using the Regex
Use this Regex ^Q\d+null$
Second way: Using the SubString
string s = "Q1123null";
string First,Second,Third;
First = s[0].ToString();
Second = s.Substring(1,s.Length-5);
Third = s.Substring(s.Length-4);
Console.WriteLine (First);
Console.WriteLine (Second);
Console.WriteLine (Third);
then you can check everything after this...

C# Regex Validation Rule using Regex.Match()

I've written a Regular expression which should validate a string using the following rules:
The first four characters must be alphanumeric.
The alpha characters are followed by 6 or 7 numeric values for a total length of 10 or 11.
So the string should look like this if its valid:
CCCCNNNNNN or CCCCNNNNNNN
C being any character and N being a number.
My expression is written: #"^[0-9A-Za-z]{3}[0-9A-Za-z-]\d{0,21}$";
My regex match code looks like this:
var cc1 = "FOOBAR"; // should fail.
var cc2 = "AAAA1111111111"; // should succeed
var regex = #"^[0-9A-Za-z]{3}[0-9A-Za-z-]\d{0,21}$";
Match match = Regex.Match( cc1, regex, RegexOptions.IgnoreCase );
if ( cc1 != string.Empty && match.Success )
{
//"The Number must start with 4 letters and contain no numbers.",
Error = SeverityType.Error
}
I'm hoping someone can take a look at my expression and offer some feedback on improvements to produce a valid match.
Also, am I use .Match() correctly? If Match.Success is true, then does that mean that the string is valid?
The regex for 4 alphanumeric characters follows by 6 to 7 decimal digits is:
var regex = #"^\w{4}\d{6,7}$";
See: Regular Expression Language - Quick Reference
The Regex.Match Method returns a Match object. The Success Property indicates whether the match is successful or not.
var match = Regex.Match(input, regex, RegexOptions.IgnoreCase);
if (!match.Success)
{
// does not match
}
The following code demonstrates the regex usage:
var cc1 = "FOOBAR"; // should fail.
var cc2 = "AAAA1111111"; // should succeed
var r = new Regex(#"^[0-9a-zA-Z]{4}\d{6,7}$");
if (!r.IsMatch(cc2))
{
Console.WriteLine("cc2 doesn't match");
}
if (!r.IsMatch(cc1))
{
Console.WriteLine("cc1 doesn't match");
}
The output will be cc1 doesn't match.
The following code is using a regular expression and checks 4 different patterns to test it, see output below:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
var p1 = "aaaa999999";
CheckMatch(p1);
p1 = "aaaa9999999";
CheckMatch(p1);
p1 = "aaaa99999999";
CheckMatch(p1);
p1 = "aaa999999";
CheckMatch(p1);
}
public static void CheckMatch(string p1)
{
var reg = new Regex(#"^\w{4}\d{6,7}$");
if (!reg.IsMatch(p1))
{
Console.WriteLine($"{p1} doesn't match");
}
else
{
Console.WriteLine($"{p1} matches");
}
}
}
Output:
aaaa999999 matches
aaaa9999999 matches
aaaa99999999 doesn't match
aaa999999 doesn't match
Try it as DotNetFiddle
Your conditions give:
The first four characters must be alphanumeric: [A-Za-z\d]{4}
Followed by 6 or 7 numeric values: \d{6,7}
Put it together and anchor it:
^[A-Za-z\d]{4}\d{6,7}\z
Altho that depends a bit how you define "alphanumeric". Also if you are using ignore case flag then you can remove the A-Z range from the expression.
Try the following pattern:
#"^[A-za-z\d]{4}\d{6,7}$"

getting number from string

How I can get number from following string:
###E[shouldbesomenumber][space or endofline]
[] here only for Illustration not present in the real string.
I am in .net 2.0.
Thanks.
I would suggest that you use regular expressions or string operations to isolate just the numeric part, and then call int.Parse, int.TryParse, decimal.Parse, decimal.TryParse etc depending on the type of number you need to parse.
The regular expression might look something like:
#"###E(-?\d+) ?$";
You'll need to change it for non-integers, of course. Sample code:
using System;
using System.Text.RegularExpressions;
class Test
{
static void Main(string[] arg)
{
Regex regex = new Regex(#"###E(-?\d+) ?$");
string text = "###E123 ";
Match match = regex.Match(text);
if (match.Success)
{
string group = match.Groups[1].Value;
int parsed = int.Parse(group);
Console.WriteLine(parsed);
}
}
}
Note that this could still fail with a number which exceeds the range of int. (Another reason to use int.TryParse...)
static string ExtractNumber(string text)
{
const string prefix = "###E";
int index = text.IndexOfAny(new []{' ', '\r', '\n'});
string number = text.Substring(prefix.Length, index - prefix.Length);
return number;
}
Now that your number is extracted you can parse it or use it as it is.

Finding numbers in a string using C# / Regex

I need to have the ability to parse out a series of numbers in a string in C# / Regex. The numbers can be one or more digits long and are always at the end of the string and after the word "ID" for example:
"Test 123 Test - ID 589"
In this case I need to be able to pick out 589.
Any suggestions? Some of the code That I have used picks out all the numbers which is not what I want to do.
Thanks
I would use the pattern #"ID (\d+)$"
using System.Text.RegularExpressions;
var s = "Test 123 Test - ID 589";
var match = Regex.Match(s, #"ID (\d+)$");
int? id = null;
if (match.Success) {
id = int.Parse(match.Groups[1].Value);
}
string resultString = null;
try {
Regex regexObj = new Regex(#"ID (?<digit>\d+)$");
resultString = regexObj.Match(subjectString).Groups["digit"].Value;
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
foo.Substring(foo.IndexOf("ID")+3)
This is the most specific pattern, in case not every line ends with a number you're interested in:
#"\bID\s+(\d+)$"
Note: Target numbers will be in capture group 1.
However, based on your description, you could just use this:
#"\d+$"
It simply looks for a string of numeric digits at the very end of each line. This is what I'd go with.

Categories

Resources