Get a number from a string after trimming leading zeros using Regex in C#

I have a file name like this (the file is stored on a server):
Offer_2018-06-05_PROSP000033998_20180413165327.02155000.NML.050618.1040.67648.0
The file name format is given above. I need to get the number out of
PROSP000033998
and remove the leading zeros (33998) using Regex in C#. Other values can appear in place of PROSP, so I want to use a regex to get the number rather than splitting the string. I tried (0|[1-9]\d*), but I'm not sure whether it is correct, as I got 2018 as the output:
Regex regexLetterOfOffer = new Regex(@"0|[1-9]\d*");
Match match = regexLetterOfOffer.Match(fileInfo.Name);
if (match.Success)
{
    Console.WriteLine(match.Value);
}

A generalized regular expression for alphabetical characters, possibly followed by zeros, then capturing digits with an underscore afterwards could be
[A-Z]0*([1-9]\d*)(?=_)
That is:
Regex regexLetterOfOffer = new Regex(@"[A-Z]0*([1-9]\d*)(?=_)");
Match match = regexLetterOfOffer.Match("Offer_2018-06-05_PROSP000033998_20180413165327.02155000.NML.050618.1040.67648.0");
if (match.Success)
{
    Console.WriteLine(match.Groups[1].Value);
}
This will match similar strings whose digit sequences start with something other than PROSP.

Putting (0|[1-9]\d*) into https://java-regex-tester.appspot.com/ shows that it does match the number you want; it just also matches every other number in the string. The Match method only returns the first match, which is 2018 in this case. To match only the part you're after, you could use PROSP0*([1-9]\d*) as the regex. The parentheses () around the last part make it a capturing group, which you can retrieve using the Groups property of the Match object:
Console.WriteLine(match.Groups[1].Value);
(Group 0 is the whole match, hence we want group 1.)
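To see both points side by side, here is a minimal sketch. The class and method names (OfferFileName, AllNumbers, NumberAfterPrefix) are made up for illustration; the patterns are the ones from the question and the answers above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OfferFileName
{
    // Hypothetical helper: returns every match of the question's pattern,
    // which shows why Match() alone yields the first number in the string.
    public static string[] AllNumbers(string fileName) =>
        Regex.Matches(fileName, @"0|[1-9]\d*")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    // Hypothetical helper: an uppercase letter, optional zeros, then the
    // captured digits that are followed by an underscore.
    public static string NumberAfterPrefix(string fileName)
    {
        Match match = Regex.Match(fileName, @"[A-Z]0*([1-9]\d*)(?=_)");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

Calling OfferFileName.NumberAfterPrefix on the file name from the question should give 33998, while the first element of AllNumbers is 2018.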

Related

c# regex matches example to extract result

I am trying to extract a character/digit that sits between single quotes in a string, but I seem to be failing to write the correct pattern.
Test string (the only value that changes is the single character/digit in single quotes):
[+] Random session part: 'm'
I am using the following pattern, but it returns an empty result:
var line = "[+] Random session part: 'm'";
Regex pattern = new Regex(@"(?<=\')(.*?)(?=\')");
Match match = pattern.Match(line);
Debug.Log($"{match.Groups["postfix"].Value}");
int postFix = int.Parse(match.Groups["postfix"].Value);
What am I missing?
Your regex is overly complicated, and you are looking for a group named 'postfix' in your match, while your regex does not define such a named group.
A simpler regex would be:
'(.)'
This looks for a single character between two single quotes, and has that character wrapped in a capture group. Put a breakpoint after your match row, and you can explore the matched object.
You can explore the regex above with your match here:
https://regexr.com/77b0m
BTW: Your code tries to parse the string "m" into an int. This will throw an error, so you should probably handle that case with int.TryParse.
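As a rough sketch of that suggestion (SessionPart and TryGetQuotedDigit are hypothetical names, not from the question), the simpler pattern and int.TryParse can be combined like this:

```csharp
using System.Text.RegularExpressions;

public static class SessionPart
{
    // Hypothetical helper: returns true and the parsed number when the quoted
    // character is a digit; returns false for a letter such as 'm' or when
    // there is no quoted character at all.
    public static bool TryGetQuotedDigit(string line, out int value)
    {
        value = 0;
        Match match = Regex.Match(line, "'(.)'");
        return match.Success && int.TryParse(match.Groups[1].Value, out value);
    }
}
```

With the test string from the question the method returns false instead of throwing, because 'm' is not a number.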
You can use this regex:
'(.)' // match a single character between single quotes
or
(?<=\')(.*?)(?=\') // lookarounds around a non-greedy match

Is it possible to use the backreference of match result as a match of another capture group?

I have the following string fragments where I want to match the contents of the key attribute and replace all its occurrences with ***:
name="prefix1 - key_string suffix1" displayName="prefix1 - key_string suffix2" key="key_string" name2="prefix2 - key_string suffix1" desc="prefix1 - key_string suffix1"
I can easily match the attribute value key_string with something like (?<=key=")\b([^"]+) and replace it with *** so it will read like key="***", but can't seem to figure out how to replace other key_string occurrences using backreference.
Is it possible or do I need to split this into 2 regex passes: 1 to get the match result and another to replace the occurrences of match result?
It is not possible to find a string between double quotes after key= string and then replace all occurrences of the found match in the whole input string using a single Regex.Replace operation.
This would imply saving the value to some buffer, then seeking back to the beginning of the string and re-scanning the whole input. This is not possible, since the regular expression engine searches from left to right (by default) or from right to left (with the RegexOptions.RightToLeft option) but never rewinds to the position where the scan started.
The closest pattern would be (?<=key=\"([^\"]+)\".*)\1|(?=.*key=\"([^\"]+)\".*)\2 (see its demo online), but it is useless because the found match itself will remain: it is the "pivot" for all matches (if it is removed first, the lookarounds will not match, and it cannot be removed later because the regex index will be long past the match).
So, use a two-step approach like
var match = Regex.Match(text, @"\bkey=""([^""]+)""").Groups[1].Value;
if (!string.IsNullOrEmpty(match))
{
    text = text.Replace(match, ""); // If you just want to remove the found match anywhere inside the string
}
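Since the question asks for *** rather than removal, a minimal sketch of the same two-step idea with that replacement might look as follows. KeyMasker and MaskKey are hypothetical names used only for illustration.

```csharp
using System.Text.RegularExpressions;

public static class KeyMasker
{
    // Hypothetical helper: step 1 finds the value of the key attribute,
    // step 2 replaces every occurrence of that value with "***".
    public static string MaskKey(string text)
    {
        Match match = Regex.Match(text, @"\bkey=""([^""]+)""");
        if (!match.Success)
        {
            return text; // no key attribute, nothing to mask
        }
        return text.Replace(match.Groups[1].Value, "***");
    }
}
```

Note that string.Replace treats the key value as literal text, so no regex escaping is needed in the second step.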

How can I filter out certain combinations?

I'm trying to filter the input of a TextBox using a Regex. I need up to three numbers before the decimal point and I need two after it. This can be in any form.
I've tried changing the regex commands around, but it creates errors and single inputs won't be valid. I'm using a TextBox in WPF to collect the data.
bool containsLetter = Regex.IsMatch(units.Text, "^[0-9]{1,3}([.] [0-9] {1,3})?$");
if (containsLetter == true)
{
MessageBox.Show("error");
}
return containsLetter;
I want the regex filter to accept these types of inputs:
111.11,
11.11,
1.11,
1.01,
100,
10,
1,
As it has been mentioned in the comment, spaces are characters that will be interpreted literally in your regex pattern.
Therefore in this part of your regex:
([.] [0-9] {1,3})
a space is expected between . and [0-9],
the same goes for after [0-9] where the regex would match 1 to 3 spaces.
That being said, for readability you have several ways to construct your regex.
1) Put the comments outside the regex:
string myregex = @"\s"       // Match any whitespace once
               + @"\n"       // Match one newline character
               + @"[a-zA-Z]"; // Match any letter
2) Add comments within your regex by using the syntax (?#comment)
needle(?# this will find a needle)
3) Activate free-spacing mode within your regex:
nee # this will find a nee...
dle # ...dle (the split means nothing when white-space is ignored)
doc: https://www.regular-expressions.info/freespacing.html
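In .NET, free-spacing mode is enabled with RegexOptions.IgnorePatternWhitespace. The sketch below applies it to the pattern from the question with the stray spaces removed. It assumes the fractional part must have exactly two digits, which is how I read "two after it"; UnitsInput and IsValid are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class UnitsInput
{
    // Assumption: one to three digits, optionally followed by a dot and
    // exactly two digits. Whitespace and # comments inside the pattern are
    // ignored because of RegexOptions.IgnorePatternWhitespace.
    private static readonly Regex Pattern = new Regex(@"
        ^[0-9]{1,3}      # one to three digits before the decimal point
        (                # optional fractional part
          [.][0-9]{2}    # a literal dot followed by exactly two digits
        )?
        $",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: true when the input has the expected shape.
    public static bool IsValid(string input) => Pattern.IsMatch(input);
}
```

Under that assumption the inputs listed in the question (111.11, 11.11, 1.11, 1.01, 100, 10, 1) should all be accepted, while something like 1.1 or 1111 is rejected.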

Regex to remove certain repeating characters but ignore others [duplicate]

I'm trying to find a regexp that only matches strings if they don't contain a dot, e.g. it matches stackoverflow, 42abc47 or a-bc-31_4 but doesn't match: .swp, stack.overflow or test..
^[^.]*$
or
^[^.]+$
Depending on whether you want to match empty string. Some applications may implicitly supply the ^ and $, in which case they'd be unnecessary. For example: the HTML5 input element's pattern attribute.
You can find a lot more great information on the regular-expressions.info site.
Use a regex that doesn't have any dots:
^[^.]*$
That is zero or more characters that are not dots in the whole string. Some regex libraries I have used in the past had ways of getting an exact match. In that case you don't need the ^ and $. Having a language in your question would help.
By the way, you don't have to use a regex. In Java you could say:
!someString.contains(".");
Validation requirement: the first character must be a letter, and a dot '.' is not allowed anywhere in the target string.
// The input string we are using
string input = "1A_aaA";
// The regular expression we use to match
Regex r1 = new Regex("^[A-Za-z][^.]*$"); //[\t\0x0020] tab and spaces.
// Match the input and write results
Match match = r1.Match(input);
if (match.Success)
{
    Console.WriteLine("Valid: {0}", match.Value);
}
else
{
    Console.WriteLine("Not Match");
}

Extending [^,]+, Regular Expression in C#

Duplicate
Regex for variable declaration and initialization in c#
I was looking for a Regular Expression to parse CSV values, and I came across this Regular Expression
[^,]+
It does the job by splitting the words on every occurrence of a ",". What I want to know is this: say I have the string
value_name v1,v2,v3,v4,...
Now I want a regular expression to find me the words v1,v2,v3,v4..
I tried ->
^value_name\s+([^,]+)*
But it didn't work for me. Can you tell me what I am doing wrong? I remember working on regular expressions and their state machine implementation. Doesn't it work in the same way?
If a string starts with value_name followed by one or more whitespace characters, go to the next state. In that state, read a word until a "," comes. Then do it again! And each word will be grouped!
Am I wrong in understanding it?
You could use a Regex similar to those proposed:
(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?
The first group is non-capturing and would match the start of the line and the value_name.
To ensure that the Regex is still valid over all matches, we make that group optional by using the '?' modifier (meaning match at most once).
The second group is capturing and would match your vXX data.
The third group is non-capturing and would match the ,, and any whitespace before and after it.
Again, we make it optional by using the '?' modifier, otherwise the last 'vXX' group would not match unless we ended the string with a final ','.
In your trials, the Regex wouldn't match multiple times: you have to remember that if you want a Regex to match multiple occurrences in a string, the whole Regex needs to match every single occurrence in the string, so you have to build your Regex not only to match the start of the string 'value_name', but also to match every occurrence of 'vXX' in it.
In C#, you could list all matches and groups using code like this:
Regex r = new Regex(@"(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?");
Match m = r.Match(subjectString);
while (m.Success) {
    for (int i = 1; i < m.Groups.Count; i++) {
        Group g = m.Groups[i];
        if (g.Success) {
            // matched text: g.Value
            // match start: g.Index
            // match length: g.Length
        }
    }
    m = m.NextMatch();
}
I would expect it to get only v1 in the group, because the first comma is "blocking" it from grabbing the rest of the fields. How you handle this is going to depend on the methods you use on the regular expression, but it may make sense to make two passes: first grab all the fields separated by commas and then break things up on spaces. Perhaps ^value_name\s+(?:([^,]+),?)* instead.
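In .NET specifically, a repeated group keeps every value it captured in Group.Captures, so the last pattern above can return all fields from a single match. A minimal sketch, with ValueList and Parse as hypothetical names:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class ValueList
{
    // Hypothetical helper: returns the comma-separated values that follow
    // "value_name", or an empty array when the line does not match.
    public static string[] Parse(string line)
    {
        Match match = Regex.Match(line, @"^value_name\s+(?:([^,]+),?)*");
        if (!match.Success)
        {
            return new string[0];
        }
        // Groups[1].Value only holds the last capture; Captures holds all of them.
        return match.Groups[1].Captures
                    .Cast<Capture>()
                    .Select(c => c.Value.Trim())
                    .ToArray();
    }
}
```

For the input "value_name v1,v2,v3,v4" this should yield the four values v1, v2, v3 and v4. Most other regex engines only keep the last capture of a repeated group, so this approach does not carry over to them.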
Oh yeah, lists....
/(?:^value_name\s+|,\s*)([^,]+)/g will theoretically grab them, but you will have to use RegExp.exec() in a loop to get the capture, rather than the whole match.
I wish pre-matches worked in JS :(.
Otherwise, go with Logan's idea: /^value_name\s+([^,]+(?:,\s*[^,]+)*)$/ followed by .split(/,\s*/);
