Need Regex to match [#URL^Url Description^#] - c#

I need regex to find this text
[#URL^Url Description^#]
in a string and replace it with
Url Description
"Url Description" can be a set of characters in any language.
Any Regex Experts out there to help me?
Thanks.

It might be a bit confusing, but you can use the following:
string str = @"[#URL^Url Description^#]";
var regex = new Regex(@"^[^^]+\^([^^]+)\^[^^]+$");
var result = regex.Replace(str, @"$1");
The first ^ means the beginning of the string;
The [^^]+ means anything not a caret character;
The \^ is a literal caret;
The $ is the end of the string.
Basically, it captures all the characters between the carets (^) and replaces the whole string with that captured text; put $1 between <a> tags in the replacement string if you need a link.
You can also replace the last line with this:
var result = regex.Replace(str, "<a href=\"" + link + "\">$1</a>");
Where link is the variable containing the link you want to insert.
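If the token can appear anywhere inside a longer string, an unanchored pattern may be easier to work with. The following is only a sketch: the class name UrlTokenDemo, the method ToLink and the example.com address are made up for illustration, and the pattern is a variant of the one above rather than the answer's original.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlTokenDemo
{
    // Literal "[#URL^", then a capture of everything up to the next caret,
    // then the literal "^#]". No ^/$ anchors, so the token may sit mid-string.
    private static readonly Regex Token = new Regex(@"\[#URL\^([^^]+)\^#\]");

    // Hypothetical helper: wraps each captured description in an <a> tag.
    public static string ToLink(string text, string link)
    {
        // A MatchEvaluator is used so a '$' inside the link is never read as a substitution.
        return Token.Replace(text, m =>
            "<a href=\"" + link + "\">" + m.Groups[1].Value + "</a>");
    }

    public static void Main()
    {
        Console.WriteLine(ToLink("See [#URL^Url Description^#] for details.", "http://www.example.com"));
        // See <a href="http://www.example.com">Url Description</a> for details.
    }
}
```

Because [^^]+ accepts any character except a caret, the description can be in any language.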

Why don't you use String.Replace()? A regex would work, but it looks like the format is well defined and regexes are harder to read.
string url = "[#URL^blah^#]";
string url_html = url.Replace("[#URL^", "<a href=\"http://www.somewhere.net\">")
.Replace("^#]", "</a>");

Related

Regex without taking care of escape codes

I want to validate a string like this (netsh cmd output):
"\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n"
with this pattern:
"URL .+https:\/\/\+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+"
So, I intend to detect this kind of strings (xxxxx is something(+)):
xxxxxURLxxxxxhttps://+:443/SomeWebSite/xxxxxYesxxxxxYesxxxxxSDDLxxxx
I wrote this code in C# to do it but my expression still doesn't work:
string output = "\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n";
output = output.Replace(Environment.NewLine, ""); //==> output2=="R‚servations d'URLÿ:-----------
Regex testUrlOpened = new Regex(output, RegexOptions.Singleline);
MessageBox.Show(testUrlOpened.IsMatch(@"URL").ToString()); // ==> False
MessageBox.Show(testUrlOpened.IsMatch(@".+URL.+").ToString()); // ==> False
MessageBox.Show(testUrlOpened.IsMatch(@"URL .+https:\/\/\+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+").ToString()); // ==> False
So I suppose that I've another issue with regex in c#...
May be encoding issue?
Start by removing the escape codes you expect in the string. Depending on your scenario, it might be better to remove all of them (see the C# escape codes).
output = output.Replace("\n", "").Replace("\r", "").Replace("\t", "");
Now that you have a single-line string, you can do the regex matching:
.+URL.+https:\/\/.+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+
Notice the following:
1- ^ and $ match the exact beginning and end of the string. If the target text sits somewhere inside the line, using them will cause the match to fail.
2- You need to escape the characters that have a special meaning in a regex.
3- To match "any character except a newline, one or more times", use .+
I hope this helps.
You can use Regex.Unescape to unescape the string (this applies when the text holds literal backslash sequences, as in a verbatim string), and then do your regex match. RegexOptions.Singleline lets . span the restored line breaks, and the pattern has to use "Yes" in the same case as the input:
var output = @"\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n";
output = Regex.Unescape(output);
var foundUrl = Regex.IsMatch(output, @"URL .+ https://\+:443/SomeWebSite/.+Yes.+Yes.+SDDL.+", RegexOptions.Singleline);
+ means one or more of the preceding pattern. If we put (.|\n), which matches any character including a newline, in front of each +, the pattern works without having to remove or account for escape codes.
^(.|\n)+URL(.|\n)+https://(.|\n)+:443/SomeWebSite/(.|\n)+Yes(.|\n)+Yes(.|\n)+SDDL(.|\n)+$
EDIT: The risk of doing this instead of sanitizing your string first is that you may get false positives, because any characters can separate the matches. All this regex does is ensure that the following strings appear somewhere in the input, in this order:
"URL", "https://", ":443/SomeWebSite/", "Yes", "Yes", "SDDL"
So simple. The last issue was that I had swapped the two arguments: the regular expression goes in the Regex constructor and the input string goes to the IsMatch method... :(
So the final code is:
string output = "\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n";
output = output.Replace(Environment.NewLine, ""); //==> output2=="R‚servations d'URLÿ:-----------
Regex testUrlOpened = new Regex(@"URL .+https:\/\/\+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+", RegexOptions.Singleline);
MessageBox.Show(testUrlOpened.IsMatch(output).ToString()); // ==> True!!!
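As a side note, RegexOptions.Singleline already makes . match line breaks, so stripping Environment.NewLine first should not be strictly necessary. Here is a small sketch of the same check wrapped in a helper; the class name ReservationCheck, the method IsReserved and the simplified English sample text are assumptions for illustration, not real netsh output.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReservationCheck
{
    // The pattern goes into the Regex constructor.
    // Singleline makes '.' match '\n' as well, so the input can keep its line breaks.
    // '/' has no special meaning in .NET regexes, so it is left unescaped here.
    private static readonly Regex Reserved = new Regex(
        @"URL .+https://\+:443/SomeWebSite/.+Yes.+Yes.+SDDL.+",
        RegexOptions.Singleline);

    // The text to inspect goes into IsMatch.
    public static bool IsReserved(string netshOutput)
    {
        return Reserved.IsMatch(netshOutput);
    }

    public static void Main()
    {
        // Simplified, made-up sample in the same layout as the question's output.
        string sample = "\r\n URL reserved : https://+:443/SomeWebSite/ \r\n"
                      + " Listen: Yes\r\n Delegate: Yes\r\n SDDL: D:(A;;GA;;;NS) \r\n";
        Console.WriteLine(IsReserved(sample)); // True
        Console.WriteLine(IsReserved("URL"));  // False
    }
}
```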
Regex taking decimal number only without using escape character.
^[0-9]+([.][0-9]+)?$

How to replace two first characters before underscore with regex?

For example, I have these strings:
HU_husnummer
HU_Adrs
How can I replace HU? with MI?
So it will be MI_husnummer and MI_Adrs.
I am not very good at regex but I would like to solve it with regex.
EDIT:
The sample code I have now and that still does not work is:
string test = Regex.Replace("[HU_husnummer] int NOT NULL","^HU","MI");
Judging by your comments, you actually need
string test = Regex.Replace("[HU_husnummer] int NOT NULL",@"^\[HU","[MI");
In case your input string really starts with HU, remove the \[ from the regex pattern.
The regex is @"^\[HU" (note the verbatim string literal notation used for the regex pattern):
^ - matches the start of string
\[ - matches a literal [ (since it is a special regex metacharacter denoting a beginning of a character class)
HU - matches HU literally.
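If the input holds several such names, one per line, with or without the leading bracket, a single call can cover both cases. This is just a sketch under those assumptions; the class PrefixRename and the method Rename are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixRename
{
    // Hypothetical helper: swaps a leading "HU" (optionally after '[') for "MI".
    public static string Rename(string input)
    {
        // Multiline lets ^ match at the start of every line, not just the string.
        // (\[?) captures an optional '[' so it can be put back via ${1}.
        // (?=_) only accepts "HU" when an underscore follows; it is not consumed.
        return Regex.Replace(input, @"^(\[?)HU(?=_)", "${1}MI", RegexOptions.Multiline);
    }

    public static void Main()
    {
        Console.WriteLine(Rename("HU_husnummer\nHU_Adrs"));
        // MI_husnummer
        // MI_Adrs
        Console.WriteLine(Rename("[HU_husnummer] int NOT NULL"));
        // [MI_husnummer] int NOT NULL
    }
}
```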
String varString="HU_husnummer ";
varString=varString.Replace("HU_","MI_");
Links
https://msdn.microsoft.com/en-us/library/system.string.replace(v=vs.110).aspx
http://www.dotnetperls.com/replace
Using Substring:
var abc = "HU_husnummer";
var result = "MI" + abc.Substring(2);
Or using Regex.Replace:
string result = Regex.Replace(abc, "^HU", "MI");

Regex to match only numbers , no apostrophes

I want to match only numbers in the following string
String : "40’000"
Match : "40000"
Basically, I am trying to ignore the apostrophe.
I am using C#, in case it matters.
I can't use any other C# methods; I need to use only Regex.
Replace like this; it removes every character except digits:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d]", "");
Since you said you just want to pick up the numbers, how about doing it without regex?
var s = "40’000";
var result = new string(s.Where(char.IsDigit).ToArray());
Console.WriteLine(result); // 40000
I suggest using the regex to find the non-digit characters rather than the digits, and then replacing them with an empty string.
A simple (?=\S)\D should be enough; the (?=\S) lookahead makes the pattern skip whitespace, such as a space after the number.
Replace like this; it removes every character except digits and dots:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d.]", "");
Don't complicate your life, use Regex.Replace
string s = "40'000";
string replaced = Regex.Replace(s, @"\D", "");
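One thing to keep in mind: in .NET, \d matches any Unicode decimal digit (not only 0-9) unless RegexOptions.ECMAScript is set, so \D leaves digits from other scripts in place. If only ASCII digits should survive, an explicit range may be safer. A small sketch, with the class DigitFilter and the method AsciiDigitsOnly as made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitFilter
{
    // Hypothetical helper: removes everything that is not an ASCII digit 0-9.
    public static string AsciiDigitsOnly(string input)
    {
        return Regex.Replace(input, "[^0-9]", "");
    }

    public static void Main()
    {
        Console.WriteLine(AsciiDigitsOnly("40\u2019000")); // 40000

        // "\u0664\u0660" are the Arabic-Indic digits four and zero.
        string mixed = "\u0664\u0660'000";
        Console.WriteLine(AsciiDigitsOnly(mixed));                  // 000
        Console.WriteLine(Regex.Replace(mixed, @"\D", "").Length);  // 5 (both Arabic-Indic digits are kept)
    }
}
```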

Regex pattern for text between 2 strings

I am trying to extract all of the text (shown as xxxx) in the follow pattern:
Session["xxxx"]
using c#
This may also be Request.Querystring["xxxx"], so I am trying to build the expression dynamically. When I do so, I get all sorts of problems with unescaped characters or no matches :(
an example might be:
string patternstart = "Session[";
string patternend = "]";
string regexexpr = @"\\" + patternstart + @"(.*?)\\" + patternend ;
string sText = "Text to be searched containing Session[\"xxxx\"] the result would be xxxx";
MatchCollection matches = Regex.Matches(sText, @regexexpr);
Can anyone help with this as I am stumped (as I always seem to be with RegEx :) )
With some little modifications to your code.
string patternstart = Regex.Escape("Session[");
string patternend = Regex.Escape("]");
string regexexpr = patternstart + #"(.*?)" + patternend;
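To show how this plays out end to end, here is a minimal sketch that builds the pattern from arbitrary delimiters and collects the captured text. The class TextBetween and the method Find are hypothetical names; the delimiters in Main include the quote characters so that only xxxx is captured.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TextBetween
{
    // Hypothetical helper: returns every piece of text found between start and end.
    public static List<string> Find(string text, string start, string end)
    {
        // Regex.Escape turns metacharacters such as '[' and '.' into literals,
        // so the delimiters can be passed in as plain strings.
        string pattern = Regex.Escape(start) + "(.*?)" + Regex.Escape(end);

        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern))
        {
            results.Add(m.Groups[1].Value); // group 1 is the lazy (.*?) capture
        }
        return results;
    }

    public static void Main()
    {
        string sText = "Text containing Session[\"xxxx\"] and Request.Querystring[\"yyyy\"]";
        Console.WriteLine(string.Join(",", Find(sText, "Session[\"", "\"]")));             // xxxx
        Console.WriteLine(string.Join(",", Find(sText, "Request.Querystring[\"", "\"]"))); // yyyy
    }
}
```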
The pattern you construct in your example looks something like this:
\\Session[(.*?)\\]
There are a couple of problems with this. First, it assumes the text starts with a literal backslash. Second, it wraps the entire (.*?) in a character class, which means it will match any single open parenthesis, period, asterisk, question mark, close parenthesis or backslash. You need to escape the brackets in your pattern if you want to match a literal [.
You could use a pattern like this:
Session\[(.*?)]
For example:
string regexexpr = @"Session\[(.*?)]";
string sText = "Text to be searched containing Session[\"xxxx\"] the result would be xxxx";
MatchCollection matches = Regex.Matches(sText, @regexexpr);
Console.WriteLine(matches[0].Groups[1].Value); // "xxxx"
The characters [ and ] have a special meaning in regular expressions: they define a character class, where one of the contained characters must match. To work around this, simply 'escape' them with a leading \ character (in a verbatim string, because \[ is not a valid escape sequence in a regular C# string):
string patternstart = @"Session\[";
string patternend = @"\]";
An example "final string" could then be:
Session\["(.*)"\]
However, you could easily write your RegEx to handle Session, Querystring, etc. automatically if you require (without also matching every other array you throw at it), and avoid having to build up the string in the first place:
(Querystring|Session|Form)\["(.*)"\]
and then take the second capture group.
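A possible way to use that alternation in C#, sketched with named groups so that nobody has to count parentheses. The names KeyFinder, FindKeys, source and key are made up for illustration, and [^"]* is used in place of (.*) on the assumption that keys never contain a double quote, which also keeps two indexers on one line from merging into a single match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyFinder
{
    // In a verbatim string, "" stands for one double quote.
    private static readonly Regex Indexer = new Regex(
        @"(?<source>Querystring|Session|Form)\[""(?<key>[^""]*)""\]");

    // Hypothetical helper: lists (collection name, key) pairs in order of appearance.
    public static List<(string Source, string Key)> FindKeys(string text)
    {
        var found = new List<(string Source, string Key)>();
        foreach (Match m in Indexer.Matches(text))
        {
            found.Add((m.Groups["source"].Value, m.Groups["key"].Value));
        }
        return found;
    }

    public static void Main()
    {
        string code = "var a = Session[\"user\"]; var b = Request.Querystring[\"page id\"];";
        foreach (var item in FindKeys(code))
        {
            Console.WriteLine(item.Source + " -> " + item.Key);
        }
        // Session -> user
        // Querystring -> page id
    }
}
```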

.NET regex replace using backreference

I have a fairly long string that contains sub strings with the following format:
project[1]/someword[1]
project[1]/someotherword[1]
There will be about 10 or so instances of this pattern in the string.
What I want to do is to be able to replace the second integer in square brackets with a different one. So the string would look like this for instance:
project[1]/someword[2]
project[1]/someotherword[2]
I'm thinking that regular expressions are what I need here. I came up with the regex:
project\[1\]/.*\[([0-9])\]
Which should capture the group [0-9] so I can replace it with something else. I'm looking at MSDN Regex.Replace() but I'm not seeing how to replace part of a string that is captured with a value of your choosing. Any advice on how to accomplish this would be appreciated. Thanks much.
Edit: After working with @Tharwen, I have changed my approach a bit. Here is the new code I am working with:
String yourString = @"<element w:xpath=""/project[1]/someword[1]""/> <anothernode></anothernode> <another element w:xpath=""/project[1]/someotherword[1]""/>";
int yourNumber = 2;
string anotherString = string.Empty;
anotherString = Regex.Replace(yourString, @"(?<=project\[1\]/.*\[)\d(?=\]"")", yourNumber.ToString());
Matched groups are replaced using the $1, $2 syntax, as follows:
csharp> Regex.Replace("Meaning of life is 42", @"([^\d]*)(\d+)", "$1($2)");
"Meaning of life is (42)"
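Applied to the question's strings, the same substitution syntax could look like the sketch below. It assumes the element name consists of word characters (hence \w+), and the names IndexReplace and SetSecondIndex are invented for the example. The ${1} form is used because a digit written directly after $1 would be read as part of the group number.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IndexReplace
{
    // Hypothetical helper: sets the second bracketed number to newIndex.
    public static string SetSecondIndex(string input, int newIndex)
    {
        // Group 1: everything up to and including the second '['.
        // Group 2: the closing ']'. The old number between them is dropped.
        string replacement = "${1}" + newIndex + "${2}";
        return Regex.Replace(input, @"(project\[1\]/\w+\[)\d+(\])", replacement);
    }

    public static void Main()
    {
        string s = "project[1]/someword[1] and project[1]/someotherword[1]";
        Console.WriteLine(SetSecondIndex(s, 2));
        // project[1]/someword[2] and project[1]/someotherword[2]
    }
}
```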
If you are new to regular expressions in .NET I recommend http://www.ultrapico.com/Expresso.htm
Also http://www.regular-expressions.info/dotnet.html has some good stuff for quick reference.
I've adapted yours to use a lookbehind and a lookahead so that it only matches a digit which is preceded by 'project[1]/xxxxx[' and followed by ']"':
(?<=project\[1\]/.*\[)\d(?=\]")
Then, you can use:
String yourString = @"<element w:xpath=""/project[1]/someword[1]""/>";
int yourNumber = 2;
yourString = Regex.Replace(yourString, @"(?<=project\[1\]/.*\[)\d(?=\]"")", yourNumber.ToString());
I think maybe you were confused because Regex.Replace has lots of overloads which do slightly different things. I've used this one.
If you want to process the value of a captured group before replacing it, you'll have to separate the different parts of the string, make your modifications and put them back together.
string test = "project[1]/someword[1]\nproject[1]/someotherword[1]\n";
string result = string.Empty;
foreach (Match match in Regex.Matches(test, @"(project\[1\]/.*\[)([0-9])(\]\n)"))
{
result += match.Groups[1].Value;
result += (int.Parse(match.Groups[2].Value) + 1).ToString();
result += match.Groups[3].Value;
}
If you just want to replace text verbatim, it's easier: Regex.Replace(test, @"abc(.*)cba", @"cba$1abc").
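When the new value has to be computed from the captured one, Regex.Replace also has overloads that take a MatchEvaluator delegate, which saves the manual reassembly and leaves the text between matches untouched. A rough sketch, again assuming the element name is made of word characters; IndexBump and Increment are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IndexBump
{
    // Hypothetical helper: adds 1 to the second bracketed number of each path.
    public static string Increment(string input)
    {
        return Regex.Replace(
            input,
            @"(project\[1\]/\w+\[)(\d+)(\])",
            // Called once per match; its return value replaces the whole match.
            m => m.Groups[1].Value
               + (int.Parse(m.Groups[2].Value) + 1)
               + m.Groups[3].Value);
    }

    public static void Main()
    {
        string test = "project[1]/someword[1]\nproject[1]/someotherword[1]\n";
        Console.Write(Increment(test));
        // project[1]/someword[2]
        // project[1]/someotherword[2]
    }
}
```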
You can also use String.Replace(String, String), for example:
yourString = yourString.Replace("someword[1]", "someword[2]");
