Usually I make my Regex patterns by myself, but I can't figure this one out:
I need a Regex.Replace that replaces "'Number'/'Number'" with "'Number'von'Number'".
Example: "2/5" should become "2von5".
The problem is that I can't just replace "/" with "von" because there are other "/" characters that are needed.
You can replace (?<=\d)/(?=\d) with von, using lookaround.
Another option is to replace (\d)/(\d) with $1von$2 (though that would fail on 1/2/3).
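A minimal C# sketch of the lookaround variant, assuming the input is an ordinary string; the class and method names (VonDemo, SlashToVon) are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class VonDemo
{
    // Replaces a slash only when it sits directly between two digits.
    // The lookarounds do not consume the digits, so chained cases still work.
    public static string SlashToVon(string input) =>
        Regex.Replace(input, @"(?<=\d)/(?=\d)", "von");

    public static void Main()
    {
        Console.WriteLine(SlashToVon("2/5"));        // 2von5
        Console.WriteLine(SlashToVon("a/b 1/2/3"));  // a/b 1von2von3
    }
}
```

Slashes that are not surrounded by digits (such as the one in "a/b") are left alone.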
Related
Say I have a regex matching a hexadecimal 32 bit number:
([0-9a-fA-F]{1,8})
When I construct a regex where I need to match this multiple times, e.g.
(?<from>[0-9a-fA-F]{1,8})\s*:\s*(?<to>[0-9a-fA-F]{1,8})
Do I have to repeat the subexpression definition every time, or is there a way to "name and reuse" it?
I'd imagine something like (warning, invented syntax!)
(?<from>{hexnum=[0-9a-fA-F]{1,8}})\s*:\s*(?<to>{=hexnum})
where hexnum= would define the subexpression "hexnum", and {=hexnum} would reuse it.
Since I already learnt it matters: I'm using .NET's System.Text.RegularExpressions.Regex, but a general answer would be interesting, too.
RegEx Subroutines
When you want to use a sub-expression multiple times without rewriting it, you can group it then call it as a subroutine. Subroutines may be called by name, index, or relative position.
Subroutines are supported by PCRE, Perl, Ruby, PHP, Delphi, R, and others. Unfortunately, the .NET Framework is lacking, but there are some PCRE libraries for .NET that you can use instead (such as https://github.com/ltrzesniewski/pcre-net).
Syntax
Here's how subroutines work: let's say you have a sub-expression [abc] that you want to repeat three times in a row.
Standard RegEx
Any: [abc][abc][abc]
Subroutine, by Name
Perl: (?'name'[abc])(?&name)(?&name)
PCRE: (?P<name>[abc])(?P>name)(?P>name)
Ruby: (?<name>[abc])\g<name>\g<name>
Subroutine, by Index
Perl/PCRE: ([abc])(?1)(?1)
Ruby: ([abc])\g<1>\g<1>
Subroutine, by Relative Position
Perl: ([abc])(?-1)(?-1)
PCRE: ([abc])(?-1)(?-1)
Ruby: ([abc])\g<-1>\g<-1>
Subroutine, Predefined
This defines a subroutine without executing it.
Perl/PCRE: (?(DEFINE)(?'name'[abc]))(?P>name)(?P>name)(?P>name)
Examples
Matches a valid IPv4 address string, from 0.0.0.0 to 255.255.255.255:
((?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))\.(?1)\.(?1)\.(?1)
Without subroutines:
((?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))\.((?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))\.((?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))\.((?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))
And to solve the original posted problem:
(?<from>(?P<hexnum>[0-9a-fA-F]{1,8}))\s*:\s*(?<to>(?P>hexnum))
More Info
http://regular-expressions.info/subroutine.html
http://regex101.com/
Why not do something like this? It is not really shorter, but it is a bit more maintainable.
String.Format(@"(?<from>{0})\s*:\s*(?<to>{0})", "[0-9a-fA-F]{1,8}");
If you want more self-documenting code, I would assign the number regex string to a properly named const variable.
.NET regex does not support pattern recursion. In Ruby and PHP/PCRE you can use (?<from>(?<hex>[0-9a-fA-F]{1,8}))\s*:\s*(?<to>(\g<hex>)) (where hex is a "technical" named capturing group whose name should not occur in the main pattern), but in .NET you can simply define the block(s) as separate variables and then use them to build a dynamic pattern.
Starting with C#6, you may use an interpolated string literal that looks very much like a PCRE/Onigmo subpattern recursion, but is actually cleaner and has no potential bottleneck when the group is named identically to the "technical" capturing group:
C# demo:
using System;
using System.Text.RegularExpressions;
public class Test
{
    public static void Main()
    {
        var block = "[0-9a-fA-F]{1,8}";
        var pattern = $@"(?<from>{block})\s*:\s*(?<to>{block})";
        Console.WriteLine(Regex.IsMatch("12345678 :87654321", pattern));
    }
}
The $@"..." is a verbatim interpolated string literal, where escape sequences are treated as combinations of a literal backslash and a char after it. Make sure to define literal { with {{ and } with }} (e.g. $@"(?:{block}){{5}}" to repeat a block 5 times).
For older C# versions, use string.Format:
var pattern = string.Format(@"(?<from>{0})\s*:\s*(?<to>{0})", block);
as is suggested in Mattias's answer.
If I am understanding your question correctly, you want to reuse certain patterns to construct a bigger pattern?
string f = @"fc\d+/";
string e = @"\d+";
Regex regexObj = new Regex(f+e);
Other than this, using backreferences will only help if you are trying to match the exact same string that you have previously matched somewhere in your regex.
e.g.
/\b([a-z])\w+\1\b/
This will only match "text" and "spaces" in the following text:
This is a sample text which is not the title since it does not end with 2 spaces.
There is no such predefined class. I think you can simplify it using ignore-case option, e.g.:
(?i)(?<from>[0-9a-z]{1,8})\s*:\s*(?<to>[0-9a-z]{1,8})
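To reuse what a named capture group matched, use the backreference syntax \k<name> or \k'name'. Note that a backreference matches the same text again rather than re-applying the pattern, so this only fits if both numbers must be identical.
In that case the answer is: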
(?<from>[0-9a-fA-F]{1,8})\s*:\s*\k<from>
More info: http://www.regular-expressions.info/named.html
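To see what the backreference actually does, here is a small sketch, assuming .NET's Regex class; the class name BackreferenceDemo and the sample inputs are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // \k<from> must match the exact text captured by the "from" group,
    // not just any text that fits the group's pattern.
    public const string Pattern = @"(?<from>[0-9a-fA-F]{1,8})\s*:\s*\k<from>";

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("1A2B : 1A2B", Pattern)); // True
        Console.WriteLine(Regex.IsMatch("1A2B : 3C4D", Pattern)); // False
    }
}
```

So \k<from> is the right tool for "the same value twice", while building the pattern from a shared string is the way to get "the same kind of value twice".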
When I copy a list of items from one list, some CSS styles are added to them. I want to remove or replace them.
The following code is meant to strip the style attribute generically, but it does not give the expected result.
flag.Text = flag.Text.Replace("style=[\"'](.*)[\"']", "");
The text is not being replaced. How should I do this? Or should I use the Contains method?
You probably want to try using Regex.Replace (and using Multiline option, just in case) instead of string.Replace:
RegexOptions options = RegexOptions.Multiline;
flag.Text = Regex.Replace(flag.Text, "style=[\"'](.*)[\"']", "", options);
What you show above uses string.Replace. It looks for an exact match of the literal text rather than treating it as a pattern. If you want to replace text that matches a pattern, use Regex.Replace instead.
I think you are trying to replace the style by using the string.Replace() method, but it cannot do a regex-style replace. I think you need to take a look at Regex.Replace.
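A hedged sketch of how this could look in practice. It assumes the text is a simple HTML fragment with quoted style attributes; the lazy quantifier and the \1 backreference to the opening quote are my additions (the original pattern used a greedy .*), and StyleDemo / StripStyle are made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class StyleDemo
{
    // Removes style="..." or style='...' together with leading whitespace.
    // .*? is lazy, so it stops at the first closing quote of the same kind.
    public static string StripStyle(string html) =>
        Regex.Replace(html, @"\s*style=([""']).*?\1", "", RegexOptions.IgnoreCase);

    public static void Main()
    {
        // Prints: <li class="x">Item</li>
        Console.WriteLine(StripStyle("<li style=\"color:red\" class=\"x\">Item</li>"));
    }
}
```

For anything beyond simple fragments, a real HTML parser is likely to be more robust than a regex.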
I am using C# and need to rename a lot of files. They all follow the same naming convention, like AA-A0000-(1+)-A_words-sdsd_morewords. They all follow this pattern, but the A0000 and (1+) sections change from file to file. How can I say: if a string follows that pattern, then run my custom function on it?
In other words: if the file name starts with two letters, a hyphen, then a letter followed by 4 digits, another hyphen, a number, and then another hyphen, how can I change the file name?
As the commenters have pointed out, Regular expressions are your answer. In .NET, this uses the Regex class. There are a number of tutorials for regular expressions that you can look at; the .NET version is documented at https://msdn.microsoft.com/en-us/library/az24scfc.aspx.
Depending on how the different sections of the file name change in your example above, you can alter your regular expression to fit. So for instance,
Regex.Replace(fileName, @"[a-z ]+-A(\d{4}-\(\d+)", "BB-B$1", RegexOptions.IgnoreCase);
Will match AA-A0000-(1+)..., AA-A3456-(72+)..., C D-A3456-(72+)..., etc, and replace the A's (and "C D") with B's. See https://dotnetfiddle.net/hFpUkW for an example of this in action.
You can use regex.
If your filenames look, for example, like this:
aB-C0101-2-some text that contains-Numbers_01987etc.ext
then the pattern to match it would be:
[a-zA-Z]{2}-[a-zA-Z]\d{4}-\d-[\s0-9a-zA-Z_-]+\.[a-zA-Z]{3}
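To tie this back to the question ("if the name follows the pattern, run my function"), here is a minimal sketch that uses the pattern above with Regex.IsMatch. The ^ and $ anchors are my addition so the whole name has to match, and FileNameDemo / FollowsConvention are made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameDemo
{
    // Pattern from above, anchored so that the entire file name must match.
    private static readonly Regex NamePattern = new Regex(
        @"^[a-zA-Z]{2}-[a-zA-Z]\d{4}-\d-[\s0-9a-zA-Z_-]+\.[a-zA-Z]{3}$");

    public static bool FollowsConvention(string fileName) =>
        NamePattern.IsMatch(fileName);

    public static void Main()
    {
        string[] names =
        {
            "aB-C0101-2-some text that contains-Numbers_01987etc.ext",
            "notes.txt"
        };
        foreach (var name in names)
        {
            if (FollowsConvention(name))
            {
                // Call your own rename function here instead of printing.
                Console.WriteLine("would rename: " + name);
            }
        }
    }
}
```

Only the first name is reported; "notes.txt" does not follow the convention and is skipped.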
Here are some additional resources:
tutorial: http://www.regular-expressions.info/tutorial.html
to test a regex online (there are a lot more):
http://www.regexr.com/
http://www.regexplanet.com/
example use of Regex.Replace() method in C#:
http://www.dotnetperls.com/regex-replace
Background
I am trying to do some regex matching and replacing, but for some reason the replacement isn't correct in .NET.
Regex pattern - "^.*?/rebate/?$"
Input string - "/my-tax/rebate"
Replacement string - "/new-path/rebate"
Basically, if the word 'rebate' is seen in a string, the input string needs to be replaced entirely by the replacement string.
Problem
If I create a regex with the pattern and execute
patternMatch.Pattern.Replace("/my-tax/rebate", "/new-path/rebate")
I get /my-tax/new-path/rebate, which isn't correct.
But, if I execute -
new Regex(@"^.*?/rebate/?$").Replace("/my-tax/rebate", "/new-path/rebate"),
the result is correct - /new-path/rebate
Why is that?
patternMatch is an object with two properties - one Pattern (which is the Regex Pattern) and another one is TargetPath (which is the replacement string). In this example, I am only using the pattern property.
patternMatch.Pattern on debugging is
Here are the results during run time-
You are simply using the function incorrectly. I'm not sure how you are getting /my-tax/new-path/rebate, since it gives me an error on ideone.com (maybe you have a regex named Pattern?).
Anyway, you shouldn't have any issues with using the function like this:
patternMatch.Replace("/my-tax/rebate", "/new-path/rebate");
ideone demo
A number of points in your question are incorrect. The regex is replacing correctly.
Per @XiaoguangQiao's comment, what is patternMatch.Pattern.Replace? Your example...
var patternMatch = new Regex("^.*?/rebate/?$");
patternMatch.Pattern.Replace("/my-tax/rebate", "/new-path/rebate");
...errors with the message...
'System.Text.RegularExpressions.Regex' does not contain a definition for 'Pattern' and no extension method 'Pattern' accepting a first argument of type 'System.Text.RegularExpressions.Regex' could be found
...when I throw it into a quick LINQPad 4 query (set to C# Statement(s)).
pattern is a private string field of System.Text.RegularExpressions.Regex; and patternMatch.Replace("/my-tax/rebate", "/new-path/rebate") - which I expect is what you meant - yields the correct result ("/new-path/rebate") rather than the incorrect result you said you get ("/my-tax/new-path/rebate").
Otherwise your pattern(s) (i.e. with and without the extra / that @rene pointed out) is fine for the input ("/my-tax/rebate") and replacement ("/new-path/rebate") you initially outline - insofar as they match and yield the result you want. You can check this outside your code in quick fiddles with the extra / and without the extra /.
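For reference, a self-contained sketch of the call that yields the expected result, using the pattern and strings from the question; RebateDemo and Rewrite are made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RebateDemo
{
    // The pattern from the question; it matches the whole input string.
    private static readonly Regex RebatePattern = new Regex(@"^.*?/rebate/?$");

    // Instance Replace: the first argument is the input, the second the replacement.
    public static string Rewrite(string path) =>
        RebatePattern.Replace(path, "/new-path/rebate");

    public static void Main()
    {
        Console.WriteLine(Rewrite("/my-tax/rebate"));  // /new-path/rebate
        Console.WriteLine(Rewrite("/my-tax/other"));   // /my-tax/other
    }
}
```

Because the pattern is anchored with ^ and $, the entire input is replaced when it matches and left untouched when it does not.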
Use String.Replace Method.
str.Replace("rebate", "new-path/rebate")
http://msdn.microsoft.com/en-us/library/fk49wtc1%28v=vs.110%29.aspx
I need some help with a problem, please.
I have a base64 string named "image" that looks like this:
data:image/pjpeg;base64,iVBORw0KGgoAAAANSUhE...
I need to replace the part "data:image/pjpeg;base64," with "".
I tried it this way:
imageSrc = image.Replace("data:image/(png|jpg|gif|jpeg|pjpeg|x-png);base64,", "");
But it doesn't work.
Does anybody have an idea?
Thanks a lot.
You should use the static Replace method on the Regex class.
imageSrc = Regex.Replace(image, "data:image/(png|jpg|gif|jpeg|pjpeg|x-png);base64,", "");
Well, for starters your code is doing String.Replace instead of Regex.Replace.
imageSrc = Regex.Replace(image, "data:image/(png|jpg|gif|jpeg|pjpeg|x-png);base64,", "");
But Regex is rather heavy for this use case; why not just take everything after the comma?
imageSrc = image.Substring(image.IndexOf(",") + 1);
You are just using String.Replace, but you should use Regex.Replace for regular expressions.
But why not just use Substring?
imageSrc = image.Substring(image.IndexOf(',') + 1);
Since you know that your string is always starting with data:image/..., you don't need regular expressions at all.
Keep it simple and just take the substring after the first ,.
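A small sketch comparing both approaches, assuming the input always starts with a data URI prefix. The ^ anchor and the non-capturing group in the regex are my additions, and DataUriDemo and its method names are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class DataUriDemo
{
    // Regex variant: strips a known image prefix at the start of the string.
    public static string StripWithRegex(string image) =>
        Regex.Replace(image, @"^data:image/(?:png|jpg|gif|jpeg|pjpeg|x-png);base64,", "");

    // Substring variant: keeps everything after the first comma.
    // If there is no comma, IndexOf returns -1 and the whole string is returned.
    public static string StripWithSubstring(string image) =>
        image.Substring(image.IndexOf(',') + 1);

    public static void Main()
    {
        const string sample = "data:image/pjpeg;base64,iVBORw0KGgo";
        Console.WriteLine(StripWithRegex(sample));      // iVBORw0KGgo
        Console.WriteLine(StripWithSubstring(sample));  // iVBORw0KGgo
    }
}
```

The regex variant only strips the listed image types, while the substring variant strips whatever comes before the first comma.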
String.Replace() has no overload with regexp. Use Regex.Replace() instead.
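The problem is in how you call it: you have to use the Regex class and pass the input string as the first argument. You can also add ?: to make the group of image alternatives non-capturing, since you do not need the captured value:
imageSrc = Regex.Replace(image, "data:image/(?:png|jpg|gif|jpeg|pjpeg|x-png);base64,", "");
It should work.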