I have a universal regex code where it uses Groups[1] value to extract the result. It's easy to extract SN and Ref by just giving a sn=(.*?)\. pattern. But it's so difficult to get for example, PKSC and V928. I have to use Groups[1] because users who use this application can choose their own value to display. It can be NC339 or PKXC.
//var source = "SN=1395939213.#variable/OGT84/PKXC/Undetermined.Thank You#{customer}"
//sometimes like this
var source = "SN=8029758034.Ref=BFO7Y95B3KN5#resolved/NC339/V928/ClearenceBBF.Brief#{supervisor}/verified"
var value = Regex.Match(source, pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline).Groups[1].Value
You can use
^(?:[^/]*/){2}([^/]+)
See the regex demo.
Details
^ - start of a string
(?:[^/]*/){2} - two occurrences of any chars other than / and then a /
([^/]+) - Group 1: one or more chars other than /.
Try following :
string pattern = #"SN=(?'sn'\d+).*\{(?'supervisor'[^}]+)";
string input = "SN=1395939213.#variable/OGT84/PKXC/Undetermined.Thank You#{customer}";
Match match = Regex.Match(input, pattern);
string sn = match.Groups["sn"].Value;
string supervisor = match.Groups["supervisor"].Value;
Related
im trying to extract a substring with regex but im having some troubles...
The string is build from a columns of strings and i need the the 4th column only
string stringToExtractFrom = "289 120 00001110 ??
4Control#SimApi##QAEAAV01#ABV01##Z = ??4Control#SimApi##QAEAAV01#ABV01##Z
(public: class SimApi::Control & __thiscall SimApi::Control::operator=(class
SimApi::Control const &))"
string pattern = #"\s+\d+\s+\d+\s+\S+\s(.*)\=";
RegexOptions options = RegexOptions.Multiline;
Regex regX = new Regex(pattern, options);
Match m = regX.Match(stringToExtractFrom);
while (m.Success)
{
Group g = m.Groups[1];
defData += g+"\n";
m = m.NextMatch();
}
this is the wanted string: ??
4Control#SimApi##QAEAAV01#ABV01##Z
with the string below it worked when i got the substring i want as a group
1 0 00002E00 ??0ADOFactory#SimApiEx##QAE#ABV01##Z =
??0ADOFactory#SimApiEx##QAE#ABV01##Z (public: __thiscall
SimApiEx::ADOFactory::ADOFactory(class SimApiEx::ADOFactory const &))
If the second string works for you and the first one does not, you might first match 1+ digits and use \S+ for the third part. Then use a negated character class to capture matching not an equals sign:
\d+\s+\d+\s+\S+\s+([^=]+) =
See a .NET regex demo | C# Demo
Using C#, i am stuck while trying to extract a specific string while limiting the string to be matched. Here is my input string:
NPS_CNTY01_10112018_Adult_Submittal.txt
I would like to extract 01 after CNTY and ingnore anything after 01.
So far i have the regex to be:
(?!NPS_CNTY)\d{2}
But the above regex gets many other digit matches from the input string. One approach i was thinking was to limit the input to 9 characters to eventually get 01. But somehow not able to achieve that. Any help is appreciated.
I would like to add that the only variable data in this input string is:
NPS_CNTY[two digit county code excluding this bracket]_[date in MMDDYYYY format excluding the brackets]_Adult_Submittal.txt.
Also please limit solutions to regex's.
The (?!NPS_CNTY)\d{2} pattern matches a location that is not immediately followed with NPS_CNTY and then matches 2 digits. The lookahead always returns true since two digits cannot start a NPS_CNTY char sequence, it is redundant.
You may use a positive lookbehind like this to get 01:
var m = Regex.Match(s, #"(?<=NPS_CNTY)\d+");
var result = "";
if (m.Success)
{
result = m.Value;
}
See the .NET regex demo
Here, (?<=NPS_CNTY), a positive lookbehind, matches a location that is immediately preceded with NPS_CNTY and then \d+ matches 1 or more digits.
An equivalent solution using capturing mechanism is
var m = Regex.Match(s, #"NPS_CNTY(\d+)");
var result = "";
if (m.Success)
{
result = m.Groups[1].Value;
}
If the string always start with NPS_CNTY and you have to extract 2 digits then you don't need a regular expression. Just use Substring() method:
string text = #"NPS_CNTY01_01141980_Adult_Submittal.txt";
string digits = text.Substring(8, 2);
EDIT:
In case you need to match N digits after NPS_CNTY you can use the following code:
string text = #"NPS_CNTY012_01141980_Adult_Submittal.txt";
string digits = text.Replace("NPS_CNTY", string.Empty)
.Split("_", StringSplitOptions.RemoveEmptyEntries)
.FirstOrDefault();
I have this string (it's from EDI data):
ISA*ESA?ISA*ESA?
The * indicates it could be any character and can be of any length.
? indicates any single character.
Only the ISA and ESA are guaranteed not to change.
I need this split into two strings which could look like this: "ISA~this is date~ESA|" and
"ISA~this is more data~ESA|"
How do I do this in c#?
I can't use string.split, because it doesn't really have a delimeter.
You can use Regex.Split for accomplishing this
string splitStr = "|", inputStr = "ISA~this is date~ESA|ISA~this is more data~ESA|";
var regex = new Regex($#"(?<=ESA){Regex.Escape(splitStr)}(?=ISA)", RegexOptions.Compiled);
var items = regex.Split(inputStr);
foreach (var item in items) {
Console.WriteLine(item);
}
Output:
ISA~this is date~ESA
ISA~this is more data~ESA|
Note that if your string between the ISA and ESA have the same pattern that we are looking for, then you will have to find some smart way around it.
To explain the Regex a bit:
(?<=ESA) Look-behind assertion. This portion is not captured but still matched
(?=ISA) Look-ahead assertion. This portion is not captured but still matched
Using these look-around assertions you can find the correct | character for splitting
Simply use the
int x = whateverString.indexOf("?ISA"); // replace ? with the actual character here
and then just use the substring from 0 to that indexOf, indexOf to length.
Edit:
If ? is not known,
can we just use the regex Pattern and Matcher.
Matcher matcher = Patter.compile("ISA.*ESA").match(whateverString);
if(matcher.find()) {
matcher.find();
int x = matcher.start();
}
Here x would give that start index of that match.
Edit: I mistakenly saw it as java one, for C#
string pattern = #"ISA.*ESA";
Regex myRegex = new Regex(pattern, RegexOptions.IgnoreCase);
Match m = myRegex.Match(whateverString); // m is the first match
while (m.Success)
{
Console.writeLine(m.value);
m = m.NextMatch(); // more matches
}
RegEx will probably be the best for this. See this link
Mask would be
ISA(?<data1>.*?)ESA.ISA(?<data2>.*?)ESA.
This will give you 2 groups with data you need
Match match = Regex.Match(input, #"ISA(?<data1>.*?)ESA.ISA(?<data2>.*?)ESA.",RegexOptions.IgnoreCase);
if (match.Success)
{
var data1 = match.Groups["data1"].Value;
var data2 = match.Groups["data2"].Value;
}
Use Regex.Matches If you need multiple matches found, and specify different RegexOptions if needed.
It's kinda hacky but you could do...
string x = "ISA*ESA?ISA*ESA?";
x = x.Replace("*","~"); // OR SOME OTHER DELIMITER
string[] y = x.Split('~');
Not perfect in all situations, but it could solve your problem simply.
You could split by "ISA" and "ESA" and then put the parts back together.
string input = "ISA~this is date~ESA|ISA~this is more data~ESA|";
string start = "ISA",
end = "ESA";
var splitedInput = input.Split(new[] { start, end }, StringSplitOptions.None);
var firstPart = $"{start}{splitedInput[1]}{end}{splitedInput[2]}";
var secondPart = $"{start}{splitedInput[3]}{end}{splitedInput[4]}";
firstPart = "ISA~this is date~ESA|"
secondPart = "ISA~this is more data~ESA|";
Use a Regex like ISA(.+?)ESA and select the first group
string input = "ISA~mycontent+ESA";
Match match = Regex.Match(input, #"ISA(.+?)ESA",RegexOptions.IgnoreCase);
if (match.Success)
{
string key = match.Groups[1].Value;
}
Instead of "splitting" by a string, I would instead describe your question as "grouping" by a string. This can easily be done using a regular expression:
Regular expression: ^(ISA.*?(?=ESA)ESA.)(ISA.*?(?=ESA)ESA.)$
Explanation:
^ - asserts position at start of the string
( - start capturing group
ISA - match string ISA exactly
.*?(?=ESA) - match any character 0 or more times, positive lookahead on the
string ESA (basically match any character until the string ESA is found)
ESA - match string ESA exactly
. - match any character
) - end capturing group
repeat one more time...
$ - asserts position at end of the string
Try it on Regex101
Example:
string input = "ISA~this is date~ESA|ISA~this is more data~ESA|";
Regex regex = new Regex(#"^(ISA.*?(?=ESA)ESA.)(ISA.*?(?=ESA)ESA.)$",
RegexOptions.Compiled);
Match match = regex.Match(input);
if (match.Success)
{
string firstValue = match.Groups[1].Value; // "ISA~this is date~ESA|"
string secondValue = match.Groups[2].Value; // "ISA~this is more data~ESA|"
}
There are two answers to the question "How to split a string by another string".
var matches = input.Split(new [] { "ISA" }, StringSplitOptions.RemoveEmptyEntries);
and
var matches = Regex.Split(input, "ISA").ToList();
However, the first removes empty entries, while the second does not.
I already tried two days to solve the Problem, that I have a MatchCollection. In the patter is a Group and I want to have a list with the Solutions of the Group (there were two or more Solutions).
string input = "<tr><td>Mi, 09.09.15</td><td>1</td><td>PK</td><td>E</td><td>123</td><td></td></tr><tr><td>Mi, 09.09.15</td><td>2</td><td>ER</td><td>ER</td><td>234</td><td></td></tr>";
string Patter2 = "^<tr>$?<td>$?[D-M][i-r],[' '][0-3][1-9].[0-1][1-9].[0-9][0-9]$?</td>$?<td>$?([1-9][0-2]?)$?</td>$?";
Regex r2 = new Regex(Patter2);
MatchCollection mc2 = r2.Matches(input);
foreach (Match match in mc2)
{
GroupCollection groups = match.Groups;
string s = groups[1].Value;
Datum2.Text = s;
}
But only the last match (2) appears in the TextBox "Datum2".
I know that I have to use e.g. a listbox, but the Groups[1].Value is a string...
Thanks for your help and time.
Dieter
First thing you need to correct in the code is Datum2.Text = s; would overwrite the text in Datum2 if it were more than one match.
Now, about your regex,
^ forces a match at the begging of the line, so there is really only 1 match. If you remove it, it'll match twice.
I can't seem to understand what was intended with $? all over the pattern (just take them out).
[' '] matches "either a quote, a space or a quote (no need to repeat characters in a character class.
All dots in [0-3][1-9].[0-1][1-9].[0-9][0-9] need to be escaped. A dot matches any character otherwise.
[0-1][1-9] matches all months except "10". The second character shoud be [0-9] (or \d).
Code:
string input = "<tr><td>Mi, 09.09.15</td><td>1</td><td>PK</td><td>E</td><td>123</td><td></td></tr><tr><td>Mi, 09.09.15</td><td>2</td><td>ER</td><td>ER</td><td>234</td><td></td></tr>";
string Patter2 = "<tr><td>[D-M][i-r],[' ][0-3][0-9]\\.[0-1][0-9]\\.[0-9][0-9]</td><td>([1-9][0-2]?)</td>";
Regex r2 = new Regex(Patter2);
MatchCollection mc2 = r2.Matches(input);
string s= "";
foreach (Match match in mc2)
{
GroupCollection groups = match.Groups;
s = s + " " + groups[1].Value;
}
Datum2.Text = s;
Output:
1 2
DEMO
You should know that regex is not the tool to parse HTML. It'll work for simple cases, but for real cases do consider using HTML Agility Pack
I am really struggling with Regular Expressions and can't seem to extract the number from this string
"id":143331539043251,
I've tried with this ... but I'm getting compilation errors
var regex = new Regex(#""id:"\d+,");
Note that the full string contains other numbers I don't want. I want numbers between id: and the ending ,
Try this code:
var match = Regex.Match(input, #"\""id\"":(?<num>\d+)");
var yourNumber = match.Groups["num"].Value;
Then use extracted number yourNumber as a string or parse it to number type.
If all you need is the digits, just match on that:
[0-9]+
Note that I am not using \d as that would match on any digit (such as Arabic numerals) in the .NET regex engine.
Update, following comments on the question and on this answer - the following regex will match the pattern and place the matched numbers in a capturing group:
#"""id"":([0-9]+),"
Used as:
Regex.Match(#"""id"":143331539043251,", #"""id"":([0-9]+),").Groups[1].Value
Which returns 143331539043251.
If you are open to using LINQ try the following (c#):
string stringVariable = "123cccccbb---556876---==";
var f = (from a in stringVariable.ToCharArray() where Char.IsDigit(a) == true select a);
var number = String.Join("", f);