How can I include a hyphen in my regex? - c#

I have this string: FOO_KEK_-150915
My current regex that is not working: FOO_([A-Z_])-150915
What is wrong with my regex? I'm trying to find files that start with "FOO" and end with that number.

[A-Z_] matches exactly one character. So it would only match e.g. FOO_K-150915 or even FOO__-150915.
In order to match multiple characters, you need to specify the quantity, for example using +:
FOO_([A-Z_]+)-150915

FOO_([A-Z_]+)-150915
^^
You need to add a quantifier such as *, + or {1,4}; otherwise the character class matches just one character and your regex will fail.
See demo.
https://regex101.com/r/vV1wW6/33
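A minimal C# sketch of the quantifier fix described above. The class and method names (FooFileNames, MiddlePart) are hypothetical, and the ^ and $ anchors are an added assumption so that the whole file name has to match:

```csharp
using System.Text.RegularExpressions;

public static class FooFileNames
{
    // [A-Z_]+ means one or more uppercase letters or underscores;
    // the "+" is the quantifier the original pattern was missing.
    // The ^ and $ anchors are an assumption, not part of the answers above.
    private static readonly Regex Pattern = new Regex(@"^FOO_([A-Z_]+)-150915$");

    // Returns the captured middle part, or an empty string when the name does not match.
    public static string MiddlePart(string name)
    {
        Match m = Pattern.Match(name);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Called with "FOO_KEK_-150915", MiddlePart should return the captured group "KEK_". The hyphen itself needs no escaping here because it sits outside the character class.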

Related

C# RegEx to match specific strings

I need to match (using regex) strings that can be like this:
required: custodian_{number 1 - 9}_{fieldType either txt or ssn}
optional: _{fieldLength 1-999}
So for example:
custodian_1_ssn_1 is valid
custodian_1_ssn_1_255 is valid
custodian or custodian_ or custodian_1 or custodian_1_ or custodian_1_ssn or custodian_1_ssn_ or custodian_1_ssn_1_ are not valid
Currently I am working with this:
(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9]?[0-9]?[0-9]?)?
as my regex and my api is working to pick up:
custodian_1_txt_1
custodian_1_ssn_1
custodian_1_txt_1_255 <---- not matching the last "5"
any thoughts?
You may use pattern:
^custodian(?:_[a-z0-9]+)+$
^ Assert position beginning of line.
custodian Match literal substring custodian.
(?:_[a-z0-9]+)+ Non capturing group. Multiple sequence of _ followed by alphanumerics.
$ Assert position end of line.
You can check the correct matches here.
Obviously you can modify the pattern to add substring signer in non capturing group as:
^(?:custodian|signer)(?:_[a-z0-9]+)+$.
I suggest using \d for digits instead of [0-9]. Try this:
(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9]?\d*)?
I just added \d* to the end of your pattern so that it matches all of the trailing digits.
You could use anchors to assert the start (^) and the end ($) of the string. For the last part, make at least the first [1-9] mandatory, or else the pattern would also match a trailing underscore:
^(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9][0-9]?[0-9]?)?$
If you're only interested in the last digits, this super generic regex will do:
(?:.+)_(\d+)
If you do need to match the whole string, this worked:
^(?:custodian|signer)_\d+_(?:txt|ssn)(?:_\d+)?_(\d+)$
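As a rough check of the anchored pattern suggested above (the one ending in (_[1-9][0-9]?[0-9]?)?$), the following sketch runs it against the examples from the question. The names CustodianFields and IsValid are hypothetical:

```csharp
using System.Text.RegularExpressions;

public static class CustodianFields
{
    // Anchored pattern from the answer above: ^ and $ force the whole
    // string to match, and the optional last group must start with 1-9,
    // so a dangling underscore is rejected.
    private static readonly Regex Pattern = new Regex(
        @"^(?:custodian|signer)_[1-9]?[0-9]_(?:txt|ssn)_[1-9][0-9]?(_[1-9][0-9]?[0-9]?)?$");

    public static bool IsValid(string fieldName)
    {
        return Pattern.IsMatch(fieldName);
    }
}
```

With this pattern, custodian_1_ssn_1 and custodian_1_txt_1_255 are accepted, while custodian_1_ssn and custodian_1_ssn_1_ are rejected.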

Regex - find every occurrence of integer surrounded by space and comma

I have the following string:
"121 fd412 4151 3213, 421, 423 41241 fdsfsd"
And I need to get 3213 and 421, because they both have a space in front of them and a comma behind.
The result will be set inside the string array...How can I do that?
"\\d+" catches every integer.
"\s\\d+(,)" throws some memory errors.
EDIT.
space to the left (<-) of the number, comma to the right (->)
EDIT 2.
string mainString = "Tests run: 5816, 8346, 28364 iansufbiausbfbabsbo3 4";
MatchCollection c = Regex.Matches(mainString, @"\d+(?=\,)");
var myList = new List<String>();
foreach(Match match in c)
{
myList.Add(match.Value);
}
Console.Write(myList[1]);
Console.ReadKey();
Your regex syntax is incorrect for wanting to match both digits, if you want them as separate results, you could do:
@"\s(\d+),\s(\d+)\s"
Live Demo
Edit
@"\s(\d+),"
Live Demo
\s\\d+(,):
\s is not properly escaped, should be \\s, same as for \\d
\\d matches single digit, you need \\d+ - one or more consecutive digits
(,) captures comma, do you really need this? seems like you need to capture a number, so \\s(\\d+),
you said "because they both have space behind them, and a comma in front", so probably ,\\s(\\d+)
How about this expression :
" \d+," // expression without the quotes
it should find what you need.
You can read about how to work with regular expressions on MSDN.
Hope it helps
Another solution
\s(\d+), // or maybe you'll need a double slash \\
Output:
3213
421
Demo
I think you mean you're looking for something like ,<space><digit> not ,<digit><space>
If so, try this:
, (\d+) //you might need to add another backslash as the others have noted
Well, based on your new edit
\s(\d+),
Test it here
It's all you need, only the numbers
\d+(?=\,)
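To get the numbers into a string array, as the question asks, here is a small sketch built on the \s(\d+), pattern from the answers above. The names CommaNumbers and Extract are hypothetical:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class CommaNumbers
{
    // \s    : one whitespace character to the left of the number
    // (\d+) : the number itself, captured as group 1
    // ,     : a comma immediately to the right
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, @"\s(\d+),")
                    .Cast<Match>()                    // MatchCollection -> IEnumerable<Match>
                    .Select(m => m.Groups[1].Value)   // keep only the digits
                    .ToArray();
    }
}
```

For the string in the question, Extract should return 3213 and 421. Because the pattern is written as a verbatim string (@"..."), the backslashes do not need to be doubled.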

Replace with wildcards

I need some advice. Suppose I have the following string: Read Variable
I want to find all pieces of text like this in a string and turn each of them into the following: Variable = MessageBox.Show. Some additional examples:
"Read Dog" --> "Dog = MessageBox.Show"
"Read Cat" --> "Cat = MessageBox.Show"
Can you help me? I need some quick advice on using Regex in C#. I think it is a job involving wildcards, but I do not know how to use them very well... Also, I need this for a school project tomorrow... Thanks!
Edit: This is what I have done so far and it does not work: Regex.Replace(String, "Read ", " = Messagebox.Show").
You can do this
string ns = Regex.Replace(yourString, @"Read\s+(.*?)(?:\s|$)", "$1 = MessageBox.Show");
\s+ matches 1 to many space characters
(.*?)(?:\s|$) matches 0 to many characters till the first space (i.e \s) or till the end of the string is reached(i.e $)
$1 represents the first captured group i.e (.*?)
You might want to clarify your question... but here goes:
If you want to match the next word after "Read " in regex, use Read (\w*) where \w is the word character class and * is the greedy match operator.
If you want to match everything after "Read " in regex, use Read (.*)$ where . will match all characters and $ means end of line.
With either regex, you can use a replace of $1 = MessageBox.Show as $1 will reference the first matched group (which was denoted by the parenthesis).
Complete code:
replacedString = Regex.Replace(inStr, @"Read (.*)$", "$1 = MessageBox.Show");
The problem with your attempt is that it cannot know that the replacement string should be inserted after your variable. Let's assume that valid variable names contain letters, digits and underscores (which can be conveniently matched with \w). That means any other character ends the variable name. Then you could match the variable name, capture it (using parentheses) and put it in the replacement string with $1:
output = Regex.Replace(input, @"Read\s+(\w+)", "$1 = MessageBox.Show");
Note that \s+ matches one or more arbitrary whitespace characters. \w+ matches one or more letters, digits and underscores. If you want to restrict variable names to letters only, this is the place to change it:
output = Regex.Replace(input, @"Read\s+([a-zA-Z]+)", "$1 = MessageBox.Show");
Here is a good tutorial.
Finally, note that in C# it is advisable to write regular expressions as verbatim strings (@"..."). Otherwise, you will have to double-escape everything so that the backslashes get through to the regex engine, and that really lessens the readability of the regex.
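Putting the capture-group approach together, a minimal sketch might look like this. The names ReadRewriter and Rewrite are hypothetical, and it assumes variable names consist of \w characters only:

```csharp
using System.Text.RegularExpressions;

public static class ReadRewriter
{
    // Read\s+ : the literal word "Read" followed by one or more whitespace characters
    // (\w+)   : the variable name, captured as group 1
    // $1 in the replacement string inserts whatever group 1 captured.
    public static string Rewrite(string input)
    {
        return Regex.Replace(input, @"Read\s+(\w+)", "$1 = MessageBox.Show");
    }
}
```

Rewrite("Read Dog") should give "Dog = MessageBox.Show", and every occurrence in a longer string is replaced, since Regex.Replace handles all matches.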

Retrieve a Digit from a String using Regex

What I am trying to do is fairly simple, although I am running into difficulty. I have a string that is a URL in the format http://www.somedomain.com?id=someid, and what I want to retrieve is the someid part. I figure I can use a regular expression, but I'm not very good with them. This is what I tried:
Match match = Regex.Match(theString, #"*.?id=(/d.)");
I get a regex exception saying there was an error parsing the regex. The way I am reading this is "any number of characters", then the literal "?id=", followed by "any number of digits". I put the digits in a group so I could pull them out. I'm not sure what is wrong with this. If anyone could tell me what I'm doing wrong I would appreciate it, thanks!
No need for Regex. Just use built-in utilities.
string query = new Uri("http://www.somedomain.com?id=someid").Query;
var dict = HttpUtility.ParseQueryString(query);
var value = dict["id"];
You've got a couple of errors in your regex. Try this:
Match match = Regex.Match(theString, #".*\?id=(\d+)");
Specifically, I:
changed *. to .* (dot matches all non-newline chars and * means zero or more of the preceding)
added a backslash to escape the ?, because the question mark is a special character in regular expressions. It means zero or one of the preceding.
changed /d. to \d+ (you had the slash going the wrong way, and you used a dot, which was explained above, where you need +, which means one or more of the preceding)
Try
var match = Regex.Match(theString, @".*\?id=(\d+)");
The error is probably due to preceding *. The * character in regex matches zero or more occurrences of previous character; so it cannot be the first character.
Probably a typo, but shortcut for digit is \d, not /d
. matches any character, you need to match one or more digits - so use a +
? is a special character, so it needs to be escaped.
So it becomes:
Match match = Regex.Match(theString, #".*\?id=(\d+)");
That being said, regex is not the best tool for this; use a proper query string parser or things will eventually become difficult to manage.
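For completeness, here is a sketch of the corrected regex in use. It assumes the id is numeric, as the \d+ in the answers implies. The names QueryId and ExtractId are hypothetical, and the leading .* is left out because Regex.Match already searches anywhere in the input:

```csharp
using System.Text.RegularExpressions;

public static class QueryId
{
    // \?    : a literal question mark (escaped, since ? is a quantifier)
    // id=   : the literal parameter name
    // (\d+) : one or more digits, captured as group 1
    public static string ExtractId(string url)
    {
        Match m = Regex.Match(url, @"\?id=(\d+)");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

This sketch returns an empty string when there is no numeric id. It only handles id as the first query parameter, which is one reason a real query string parser is the sturdier choice.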

What is the regex for *abc?

I am trying to use Regex to find out if a string matches *abc - in other words, it starts with anything but finishes with "abc"?
What is the regex expression for this?
I tried *abc but "Regex.Matches" returns true for xxabcd, which is not what I want.
abc$
You need the $ to match the end of the string.
.*abc$
should do.
So you have a few "fish" here, but here's how to fish.
An online expression library and .NET-based tester: RegEx Library
An online Ruby-based tester (faster than the .NET one) Rubular
A Windows app for testing expressions (most fully-featured, but no zero-width look-aheads or look-behinds) RegEx Coach
Try this instead:
.*abc$
The $ matches the end of the line.
^.*abc$
Will capture any line ending in abc.
It depends on what exactly you're looking for. If you're trying to match whole lines, like:
a line with words and spacesabc
you could do:
^.*abc$
Where ^ matches the beginning of a line and $ the end.
But if you're matching words in a line, e.g.
trying to match thisabc and thisabc but not thisabcd
You will have to do something like:
\w*abc(?!\w)
This means: match any number of consecutive word characters, followed by abc, and then anything but a word character (e.g. whitespace or the end of the line).
If you want a string of exactly 4 characters ending in abc, use /^.abc$/
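The two cases discussed above (whole string versus words inside a line) can be sketched in C# as follows. The names EndsWithAbc, WholeString and WordsEndingInAbc are hypothetical:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class EndsWithAbc
{
    // True when the input ends in "abc". The $ anchors the match at the end,
    // so "xxabcd" is rejected. A leading .* is not needed for a yes/no test.
    public static bool WholeString(string s)
    {
        return Regex.IsMatch(s, @"abc$");
    }

    // Finds words ending in "abc" inside a longer line.
    // (?!\w) is a negative lookahead: no word character may follow "abc".
    public static string[] WordsEndingInAbc(string line)
    {
        return Regex.Matches(line, @"\w*abc(?!\w)")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

WholeString("xxabc") should be true and WholeString("xxabcd") false. For the sample sentence above, WordsEndingInAbc should find thisabc twice and skip thisabcd.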
