I'm looking to apply a regular expression to an input string.
Regular expression: (.*)\\(.*)_(.*)_(.*)-([0-9]{4}).*
Test entries:
Parkman\L9\B137598_00_T-3298-B
Parkman\L9\B137598_00_T-3298
The result should be B137598_00_T-3298 for both test entries. The problem is that if I append 4 more digits to a test entry, the result becomes, for example, B137598_00_T-3298-5555.
What I need is for anything after the 3298 to be ignored.
What changes can I make to achieve that?
You can use a single capture group with a bit more specific pattern:
\w\\\w+\\((?:[^\W_]+_){2}[^\W_]+-[0-9]{4})\b
The pattern matches:
\w Match a single word char
\\\w+\\ Match 1+ word chars between backslashes
( Capture group 1
(?:[^\W_]+_){2} Repeat 2 times word chars without _ followed by a single _
[^\W_]+ Match 1+ word chars without _
-[0-9]{4} Match - and 4 digits
) Close group 1
\b A word boundary
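As a minimal sketch of how the capture-group pattern above could be applied in C#: the class name `PartNumberDemo` and the helper `Extract` are hypothetical names chosen for illustration, and the third input is the "extra 4 digits" case from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PartNumberDemo
{
    // Pattern from the answer above; group 1 holds the text up to the 4 digits.
    private static readonly Regex Pattern =
        new Regex(@"\w\\\w+\\((?:[^\W_]+_){2}[^\W_]+-[0-9]{4})\b");

    // Hypothetical helper: returns group 1, or an empty string when nothing matches.
    public static string Extract(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // All three inputs should yield B137598_00_T-3298.
        Console.WriteLine(Extract(@"Parkman\L9\B137598_00_T-3298-B"));
        Console.WriteLine(Extract(@"Parkman\L9\B137598_00_T-3298"));
        Console.WriteLine(Extract(@"Parkman\L9\B137598_00_T-3298-5555"));
    }
}
```

Reading group 1 rather than the whole match keeps the leading folder part out of the result.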
Or a bit broader pattern with a match only, where \w also matches an underscore, and asserting \ to the left:
(?<=\\)\w+-[0-9]{4}\b
C# code:
using System;
using System.Text.RegularExpressions;

// Verbatim strings (@"...") keep the backslashes literal.
string s1 = @"Parkman\L9\B137598_00_T-3298-B";
string s2 = @"Parkman\L9\B137598_00_T-3298";
string pattern = @"\w+_[0-9]{2}_T-[0-9]{4}";
var match = Regex.Matches(s1, pattern);
Console.WriteLine("s1: {0}", match[0]);
match = Regex.Matches(s2, pattern);
Console.WriteLine("s2: {0}", match[0]);
The result:
s1: B137598_00_T-3298
s2: B137598_00_T-3298
Related
I have a regex question. I have a string that looks like:
Slot:0 Module:No module in slot
What I need is a regex that will get the values after Slot and Module. Slot will always be a number, but I have a problem with Module (it can be several words separated by spaces). I tried:
var pattern = "(?<=:)[a-zA-Z0-9]+";
foreach (string config in backplaneConfig)
{
List<string> values = Regex.Matches(config, pattern).Cast<Match>().Select(x => x.Value).ToList();
modulesInfo.Add(new ModuleIdentyfication { ModuleSlot = Convert.ToInt32(values.First()), ModuleType = values.Last() });
}
The slot part works, but the module part works only if it is a single word with no spaces; in my example it gives me only "No". Is there a way to do that?
You may use a regex to capture the necessary details in the input string:
var pattern = @"Slot:(\d+)\s*Module:(.+)";
foreach (string config in backplaneConfig)
{
var values = Regex.Match(config, pattern);
if (values.Success)
{
modulesInfo.Add(new ModuleIdentyfication { ModuleSlot = Convert.ToInt32(values.Groups[1].Value), ModuleType = values.Groups[2].Value });
}
}
Group 1 is the ModuleSlot and Group 2 is the ModuleType.
Details
Slot: - literal text
(\d+) - Capturing group 1: one or more digits
\s* - 0+ whitespaces
Module: - literal text
(.+) - Capturing group 2: the rest of the string to the end.
The simplest way would be to add a space to your character class:
var pattern = "(?<=:)[a-zA-Z0-9 ]+";
But the best solution would probably be the answer from @Wiktor Stribiżew.
Another option is to match either 1+ digits followed by a word boundary or match a repeating pattern using your character class but starting with [a-zA-Z]
(?<=:)(?:\d+\b|[a-zA-Z][a-zA-Z0-9]*(?: [a-zA-Z0-9]+)*)
(?<=:) Assert a : on the left
(?: Non capturing group
\d+\b Match 1+ digits followed by a word boundary
| Or
[a-zA-Z][a-zA-Z0-9]* Start a match with a-zA-Z
(?: [a-zA-Z0-9]+)* Optionally repeat a space and what is listed in the character class
) Close non-capturing group
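The alternation above can be tried out with a short sketch like the following. The class name `SlotModuleDemo` and the helper `ValuesAfterColon` are hypothetical; the pattern is copied unchanged from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SlotModuleDemo
{
    // Alternation pattern from the answer above, copied unchanged.
    private static readonly Regex ValuePattern = new Regex(
        @"(?<=:)(?:\d+\b|[a-zA-Z][a-zA-Z0-9]*(?: [a-zA-Z0-9]+)*)");

    // Hypothetical helper: returns every value that directly follows a colon.
    public static string[] ValuesAfterColon(string config)
    {
        return ValuePattern.Matches(config)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        string[] values = ValuesAfterColon("Slot:0 Module:No module in slot");
        // Prints: 0 | No module in slot
        Console.WriteLine(string.Join(" | ", values));
    }
}
```

Because the digit branch ends at a word boundary, the slot number stops before the space, while the second branch continues across single spaces and so picks up the whole module description.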
Please replace this:
// regular exp.
(\d+)\s*(.+)
You don't need regex for such simple parsing. Try the following:
var str = "Slot:0 Module:No module in slot";
var parts = str.Split(new string[] { "Slot:", "Module:" }, StringSplitOptions.RemoveEmptyEntries)
    .Select(s => s.Trim());
I'm trying to replace the pipe delimiter character inside quotes with a space. The issue is that I get too many false positives because some strings are empty. I only want to replace the pipe if there is text between the quotes. The regex pattern I'm using is from another Stack Overflow post, as my regex skills are lacking.
data sample:
"Hello"|"Green | Blue"|123.45|""|""|""|5|45
Code I'm using:
internal class Program
{
public static void Main()
{
string pattern = @"(?: (?<= "")|\G(?!^))(\s*[^"" |\s]+(?:\s +[^""|\s]+)*)\s*\|\s*(?=[^""] * "")";
string substitution = @"\1 \2";
string input = @"""20190430|""Test Text""|""""|""""|""Manual""|""""|""Machine""|""""|""""|10.00|""""|0.00|||0.00||5600.00||||""A+""|""""|40.00||""""|""Vision Service |Troubleshoot""|57|""Y""|838|""Yellow Maroon""|850||""FL""||||0.00|||||||||||""""||""""||""""|||""""||||||""""||""""|""""||""""|""""||||||""""|""""|""""||||||||1||""";
RegexOptions options = RegexOptions.Multiline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(input, substitution);
Console.WriteLine("Result:" + result);
Console.ReadKey();
}
}
It replaces the 'Green | Blue' pipe just fine. But it also replaces the pipes between quotes later in the line, which breaks the file because columns get removed.
Update: I replaced the code with an actual sample of the file I'm processing. The regex finds the match but doesn't replace the pipe. I'm missing something.
If there should be text between the double quotes and the text should be on both sides of the pipe, you might use:
(?<=")(\s*[^"\s|]+)\s*\|\s*([^\s"|]+\s*)(?=")
In the replacement use $1 $2
Explanation
(?<=") Positive lookbehind, assert what is on the left is "
(\s*[^"\s|]+) Capture in group 1 matching 0+ times a whitespace char, 1+ times not ", | or a whitespace char
\s*\|\s* Match a | between 0+ times a whitespace char
([^\s"|]+\s*) Capture in group 2 matching 1+ times not ", | or a whitespace char and match 0+ times a whitespace char
(?=") Positive lookahead, assert what is on the right is "
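A minimal sketch of this first pattern in C#, run against the sample line from the question. The class name `PipeInQuotesDemo` and the helper `ReplaceQuotedPipe` are hypothetical; inside a verbatim string each `"` of the pattern is written as `""`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PipeInQuotesDemo
{
    // Pattern from the answer above: a pipe with text on both sides, between quotes.
    private static readonly Regex PipeBetweenWords = new Regex(
        @"(?<="")(\s*[^""\s|]+)\s*\|\s*([^\s""|]+\s*)(?="")");

    // Hypothetical helper: joins the two captured sides with a single space.
    public static string ReplaceQuotedPipe(string line)
    {
        return PipeBetweenWords.Replace(line, "$1 $2");
    }

    public static void Main()
    {
        string input = @"""Hello""|""Green | Blue""|123.45|""""|""""|""""|5|45";
        // Prints: "Hello"|"Green Blue"|123.45|""|""|""|5|45
        Console.WriteLine(ReplaceQuotedPipe(input));
    }
}
```

The empty `""` fields are left alone because group 1 requires at least one character that is not a quote, pipe or whitespace right after the opening quote.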
Edit
If you want to replace multiple pipes with a space between the double quotes you could make use of the \G anchor to assert the position at the end of previous match.
In the replacement use the first capturing group followed by a space $1
(?:(?<=")|\G(?!^))(\s*[^"|\s]+(?:\s+[^"|\s]+)*)\s*\|\s*(?=[^"]*")
Explanation
(?: Non capturing group
(?<=") Assert what is on the left is "
| Or
\G(?!^) Assert position at the end of the previous match
) Close non capturing group
( Capture group 1
\s*[^"|\s]+ Match 0+ times a whitespace char, followed by 1+ times not a ", | or whitespace char
(?:\s+[^"|\s]+)* Repeat 0+ times matching 1+ whitespace chars followed by 1+ times not a ", | or whitespace char
) Close capturing group 1
\s*\|\s* Match a | between 0+ times a whitespace char
(?=[^"]*") Assert what is on the right is a "
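The \G variant can be sketched in C# as follows. The class name `MultiPipeDemo`, the helper `ReplacePipesInQuotes` and the two short input lines are assumptions made for illustration; the second line is cut down from the file sample in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultiPipeDemo
{
    // Pattern from the edit above; \G lets a match continue where the previous one ended.
    private static readonly Regex PipeInsideQuotes = new Regex(
        @"(?:(?<="")|\G(?!^))(\s*[^""|\s]+(?:\s+[^""|\s]+)*)\s*\|\s*(?=[^""]*"")");

    // Hypothetical helper: replaces each pipe inside quoted text with one space.
    public static string ReplacePipesInQuotes(string line)
    {
        // .NET replacement strings reference groups as $1 (not \1).
        return PipeInsideQuotes.Replace(line, "$1 ");
    }

    public static void Main()
    {
        // Prints: "a b c"
        Console.WriteLine(ReplacePipesInQuotes(@"""a | b | c"""));
        // Prints: "A+"|""|"Vision Service Troubleshoot"|57
        Console.WriteLine(ReplacePipesInQuotes(@"""A+""|""""|""Vision Service |Troubleshoot""|57"));
    }
}
```

The lookarounds only inspect the neighbouring quotes, so this sketch assumes every quote in the line is balanced; a stray opening quote in front of an unquoted field can cause an unintended replacement.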
My guess is that we might also want to keep only one space in our text, and this expression,
"([^"]+?)\s+\|\s+([^"]+?)"
with a replacement of $1 $2 might work.
Example
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"""([^""]+?)\s+\|\s+([^""]+?)""";
string substitution = "$1 $2"; // .NET replacement strings reference groups as $1, not \1
string input = #"""Hello""|""Green | Blue""|123.45|""""|""""|""""|5|45";
RegexOptions options = RegexOptions.Multiline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(input, substitution);
}
}
For an input that contains all of the following strings:
a1.aaa[SUBSCRIBED]
a1.bbb
a1.ccc
b1.ddd
d1.ddd[SUBSCRIBED]
I want to get the output:
bbb
ccc
which means: all the words that come after "a1." and do not contain the substring "[SUBSCRIBED]".
Why regex? The following is crystal clear:
var result = strings
.Where(s => s.StartsWith("a1.") && !s.Contains("[SUBSCRIBED]"))
.Select(s => s.Substring(3));
Tim's answer makes sense. However, if you insist on a regex, I would venture that it would look like this:
^a1\.(.*)(?<!\[SUBSCRIBED\])$
^a1 means the string starts with a1
\.(.*) matches the literal dot and then captures any number of characters
the negative lookbehind (?<!\[SUBSCRIBED\])$ rejects text ending with [SUBSCRIBED]
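A minimal sketch of the lookbehind pattern applied to the sample list from the question. The class name `SubscribedFilterDemo` and the helper `Filter` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SubscribedFilterDemo
{
    // Lookbehind pattern from the answer above; group 1 is the text after "a1.".
    private static readonly Regex NotSubscribed =
        new Regex(@"^a1\.(.*)(?<!\[SUBSCRIBED\])$");

    // Hypothetical helper: keeps group 1 of every entry that matches.
    public static List<string> Filter(IEnumerable<string> entries)
    {
        return entries
            .Select(entry => NotSubscribed.Match(entry))
            .Where(m => m.Success)
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        var input = new[]
        {
            "a1.aaa[SUBSCRIBED]", "a1.bbb", "a1.ccc", "b1.ddd", "d1.ddd[SUBSCRIBED]"
        };
        // Prints: bbb, ccc
        Console.WriteLine(string.Join(", ", Filter(input)));
    }
}
```

Note that the lookbehind only rejects entries that end with [SUBSCRIBED]; an entry with that marker in the middle would still match this pattern.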
You may use
^a1\.(?!.*\[SUBSCRIBED])(.*)
Details
^ - start of string
a1\. - a literal a1. substring
(?!.*\[SUBSCRIBED]) - a negative lookahead that fails the match if a [SUBSCRIBED] substring is present after any 0+ chars (other than newline if the RegexOptions.Singleline option is not used)
(.*) - Group 1: the rest of the line up to the end (if you use RegexOptions.Singleline option, . will match newlines as well).
C# code:
var result = string.Empty;
var m = Regex.Match(s, @"^a1\.(?!.*\[SUBSCRIBED])(.*)");
if (m.Success)
{
result = m.Groups[1].Value;
}
I need to replace all special characters in a string except the following (which includes alphabetic characters):
:)
:P
;)
:D
:(
This is what I have now:
string input = "Hi there!!! :)";
string output = Regex.Replace(input, "[^0-9a-zA-Z]+", "");
This replaces all special characters. How can I modify this to not replace mentioned characters (emojis) but replace any other special character?
You may use a known technique: match and capture what you need and match only what you want to remove, and replace with the backreference to Group 1:
(:(?:[D()P])|;\))|[^0-9a-zA-Z\s]
Replace with $1. Note I added \s to the character class, but in case you do not need spaces, remove it.
Pattern explanation:
(:(?:[D()P])|;\)) - Group 1 (what we need to keep):
:(?:[D()P]) - a : followed with either D, (, ) or P
| - or
;\) - a ;) substring
(here, you may extend the capture group with more |-separated branches).
| - or ...
[^0-9a-zA-Z\s] - match any char other than ASCII digits, letters (and whitespace, but as I mentioned, you may remove \s if you do not need to keep spaces).
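The capture-and-keep technique described above could look like this in C#. The class name `EmoticonDemo` and the helper `StripSpecial` are hypothetical; the pattern is the one from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmoticonDemo
{
    // Group 1 captures the emoticons to keep; the second branch matches what to drop.
    private static readonly Regex KeepOrStrip =
        new Regex(@"(:(?:[D()P])|;\))|[^0-9a-zA-Z\s]");

    // Hypothetical helper: $1 is empty whenever the second branch matched,
    // so those characters are removed while captured emoticons are written back.
    public static string StripSpecial(string input)
    {
        return KeepOrStrip.Replace(input, "$1");
    }

    public static void Main()
    {
        // Prints: Hi there :)
        Console.WriteLine(StripSpecial("Hi there!!! :)"));
        // Prints: Wow :D ;) cool :(
        Console.WriteLine(StripSpecial("Wow!! :D ;) #cool :("));
    }
}
```

The order of the branches matters: the emoticon branch comes first, so a `:` that starts an emoticon is captured before the catch-all class gets a chance to remove it.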
I would use a regex to match all emojis and select them out of the text:
string input = "Hi there!!! :)";
string output = string.Concat(Regex.Matches(input, "[;:][DP)(]+").Cast<Match>().Select(x => x.Value));
Pattern [;:][DP)(]+
[;:] starts with : or ;
[DP)(] followed by D, P, ) or (
+ one or more
Note that | is a literal character inside a character class, so it should not be used as a separator there.
Assume that i have the following sentence
select PathSquares from tblPathFinding where RouteId=470
and StartingSquareId=267 and ExitSquareId=13
Now I want to replace the words that follow = and keep the rest of the sentence.
Let's say I want to replace the word following = with %.
Words are separated by a space character.
So this sentence would become
select PathSquares from tblPathFinding where RouteId=%
and StartingSquareId=% and ExitSquareId=%
Which regex can I use to achieve this?
.NET 4.5, C#
Use a lookbehind to match all the non-space or word characters that come right after the = symbol. Replacing the matched characters with % will give you the desired output.
@"(?<==)\S+"
OR
@"(?<==)\w+"
Replacement string:
%
string str = @"select PathSquares from tblPathFinding where RouteId=470
and StartingSquareId=267 and ExitSquareId=13";
string result = Regex.Replace(str, @"(?<==)\S+", "%");
Console.WriteLine(result);
Explanation:
(?<==) Asserts that the match must be preceded by an = symbol.
\S+ If so, match the following one or more non-space characters (\w+ would match one or more word characters instead).