C# and Regex: How do I make a match?

Aloha,
I need to match page.aspx?cmd=Show&id= but not match http://random address/page.aspx?cmd=Show&id=
Right now I'm just doing a simple string replace with strContent.Replace("page.aspx?cmd=Show&id=", "page.aspx/Show/"); but to avoid the above I think I need to move to regular expressions.
Edit: Sorry, I was a little short on details. The current replace only seemed to work because I was not taking into account that the addresses being modified should be relative addresses only, not absolute ones. I need to catch all relative addresses that match page.aspx?cmd=Show&id= and convert that string into page.aspx/Show/.
The .StartsWith and ^ won't work because I am looking in a string that is a full HTML page. So I need to be able to convert:
<p>Hello, please click here.</p>
into
<p>Hello, please click here.</p>
but not convert
<p>Hello, please click here.</p>
I used an A tag for my example but I need to convert IMG tags as well.
Thanks!

I would suggest using ParseQueryString rather than a regular expression because this will work even if the parameters are in a different order.
Otherwise you can use string.StartsWith to test if there is a match.
If you want to use a regular expression you need to use ^ and also escape all the special characters in your string:
Regex regex = new Regex(@"^page\.aspx\?cmd=Show&id=");
If you don't want to escape the characters yourself you can get Regex.Escape to do it for you:
string urlToMatch = "page.aspx?cmd=Show&id=";
Regex regex = new Regex("^" + Regex.Escape(urlToMatch));

You can just use a ^ to look for start of line.
... Or you can be cool and use a negative lookbehind (?<!http://.*?/). ;)
See http://www.geekzilla.co.uk/View6BD88331-350D-429F-AB49-D18E90E0E705.htm
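To make the lookbehind idea concrete, here is a minimal sketch of how it could be applied to a whole HTML page. It assumes that absolute addresses start with http:// or https:// and that attribute values are quoted; the class name, method name, and the id value 5 in the usage note are made up for illustration. .NET allows variable-length lookbehind, which is what makes this pattern possible.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RelativeLinkRewriter
{
    // Negative lookbehind: reject the match if an absolute URL prefix
    // (http:// or https:// followed by non-quote, non-space characters)
    // ends immediately before "page.aspx".
    private static readonly Regex RelativeShowLink = new Regex(
        @"(?<!https?://[^""'\s]*)page\.aspx\?cmd=Show&id=",
        RegexOptions.IgnoreCase);

    // Rewrites relative page.aspx?cmd=Show&id= links in href or src
    // attributes alike and leaves absolute links untouched.
    public static string Rewrite(string html)
    {
        return RelativeShowLink.Replace(html, "page.aspx/Show/");
    }
}
```

Calling RelativeLinkRewriter.Rewrite on a page containing href="page.aspx?cmd=Show&id=5" turns that attribute into href="page.aspx/Show/5", while href="http://example.com/page.aspx?cmd=Show&id=5" is left alone. If the page encodes the ampersand as &amp;, the pattern would need to be adjusted accordingly.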

Related

how can I use unnamed Regex groups in C# inside my regex?

Hey, so my current regex is @"(into)(to)add\s[^\s]{1,}\1|\2[^\s]{1,}". I want the input to be something like "add word into/to category". The regex in general works fine except for the \1|\2 part. I tried using groups and all sorts of solutions, but I just can't figure out how to make it so that the input can be either into or to.
Can anyone help me out? (this is in C# and using the Regex class)
If I have understood you correctly, you don't need back references to (unnamed) groups; you can use a simple alternation, like this:
@"add \w+ (into|to) \w+"
That will match either into or to in the search string.
Edit:
Let's get a little more 'advanced', using the optional quantifier '?':
@"add \w+ (in)?to \w+"
This will match 'in' zero or one time, followed by 'to', so it will match into as well as to, exactly as the first regex does.
Edit2:
I have a feeling you want to use a variable inside your regex. You can of course do that like this:
string search = "into|to";
Regex regex = new Regex(@"add \w+ (" + search + @") \w+");
From your given example I think you're looking for a regex like add\s\w+\s(into|to)\s\w+. Your current regex matches only strings starting with "intoto", which is probably not what you want.
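As a rough sketch of how the alternation could be used to actually pull out the word and the category: the class name, method name, and the group names word and category are my own choices, not from the question, and I've assumed the whole input should be exactly one command, hence the ^ and $ anchors.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AddCommandParser
{
    // (?:into|to) is a non-capturing alternation, so only the two
    // named groups end up in the match's Groups collection.
    private static readonly Regex AddCommand = new Regex(
        @"^add\s+(?<word>\S+)\s+(?:into|to)\s+(?<category>\S+)$");

    // Returns true and fills word/category when the input has the
    // form "add <word> into <category>" or "add <word> to <category>".
    public static bool TryParse(string input, out string word, out string category)
    {
        Match m = AddCommand.Match(input);
        word = m.Success ? m.Groups["word"].Value : null;
        category = m.Success ? m.Groups["category"].Value : null;
        return m.Success;
    }
}
```

Named groups are used here only so the captures don't depend on counting parentheses; plain unnamed groups read through m.Groups[1] and m.Groups[2] would work just as well.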

Regex to match string not working .*/text.*

I just want to match this string with a regex:
/profil
How can I do this? I tried it this way:
.*/profil.*
But my software doesn't find any matches in the text.
All you need is this:
@"(\/profil)"
If all you need is to match "/profil", there is no need for the ".*".
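Remember that the quantifiers are greedy.
The . matches any character except a newline, and * lets it repeat as many times as possible. The engine does backtrack, so .*/profil.* can still match a line that contains /profil, but the surrounding .* only adds work and stretches the match over the rest of the line.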
It seems like you're trying to put a wildcard around /profil. This is not needed with regular expressions. You should just be able to use /profil as the full expression and match your string.
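For what it's worth, a minimal C# sketch of the "just use /profil" suggestion could look like this. The class and method names are invented for the example, and I'm assuming the asker's tool behaves like .NET's Regex.IsMatch, which searches anywhere in the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProfilMatcher
{
    // IsMatch scans the whole input for the pattern, so no leading
    // or trailing .* is required to find "/profil" inside a longer string.
    public static bool ContainsProfil(string text)
    {
        return Regex.IsMatch(text, "/profil");
    }
}
```

If the tool in question only accepts patterns that must cover the entire input, the .* wrappers would be needed after all, so it is worth checking how that particular software applies the expression.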

Find special character in string and change it using Regular expression in C#

I am trying to find the '&' character in my string and switch it with "and" string using regular expression, but I am obviously doing it wrong.
This is the part of the code that checks whether there is a '&' symbol and, if there is one, should change it to "and".
if (Regex.IsMatch(toCheck, @"[^&]"))
    return Regex.Replace(toCheck, @"[^&]", "and");
What I am getting as the outcome is a string that contains only '&' symbols.
Can someone help me with this regular expression thing, it is a bit confusing to me. Thanks!
I would just do it using regular string functions:
return toCheck.Replace("&","and");
If you really want to do it with a regex, your pattern is a bit wrong: [^&] actually matches any single character that is not &. Remove the ^ and you'll be fine. It's not even necessary to put it between brackets, as & is not a special character in regex. Just remember not to use regex for trivial things like searching for one character and replacing it; that is using a sledgehammer to crack a nut.
You don't need to check with IsMatch before replacing; Replace simply leaves the string unchanged if there are no occurrences.
Also, a simple string replace is enough; you do not need regexes to solve this:
Console.WriteLine("Hello&World&Mars".Replace("&", " and "));
This is enough
[^&] means all characters that are not equal to &.
^ inverts the selection
This is how you check whether a string contains the '&' character:
if (Regex.IsMatch(toCheck, @"[&]"))
    // do whatever you want to
There are many topics discussing regex that you can visit. Here is a link for you where you can learn regular expressions briefly.
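To see the difference between the patterns discussed above side by side, here is a small sketch. The class and method names are just for illustration, and "Tom&Jerry" in the usage note is a made-up input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmpersandReplacer
{
    // Plain string replace: the simplest option for a fixed character.
    public static string WithStringReplace(string s)
    {
        return s.Replace("&", "and");
    }

    // Regex equivalent: & has no special meaning, so no escaping or brackets.
    public static string WithRegex(string s)
    {
        return Regex.Replace(s, "&", "and");
    }

    // The pattern from the question: [^&] matches every character
    // EXCEPT &, so everything else gets replaced instead.
    public static string WithNegatedClass(string s)
    {
        return Regex.Replace(s, "[^&]", "and");
    }
}
```

With "Tom&Jerry" as input, the first two methods both give "TomandJerry", whereas the negated class keeps the & and replaces every other character.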

Regular expression with URL extraction

I am using C# for this project, and basically what I need is a way to turn plain text into HTML. I found a regular expression (I think on Stack Overflow, actually) for converting links in the text to anchor links in HTML. It looks like this:
Regex regx = new Regex(@"https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?", RegexOptions.IgnoreCase);
MatchCollection matches = regx.Matches(input);
foreach (Match match in matches)
{
    output = output.Replace(match.Value, String.Format("<a href='{0}'>{0}</a>", match.Value));
}
It works great, however I found a flaw in that it doesn't consider a dash (-) as part of the URL, so when it hits the first dash it closes the anchor tag.
So I obviously need to include the dash somehow in the regular expression, but the problem is that I have absolutely no clue about RegEx and it just looks like Russian to me.
Does anyone have an idea what small edit I need to make to the regular expression so that it accepts a dash as an allowed character in the URL?
Try this: @"https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?"
I added a dash to the second character class (the part in square brackets) to match dashes in the part of the URL that is not the domain name.
I use this one which supports the ftp and file schemes as well as http:
@"\b((https?|ftp|file)://|(www|ftp)\.)[-A-Z0-9+&@#/%?=~_|$!:,.;\(\)]*[A-Z0-9+&@#/%=~_|$]"
It will recognise a URL that contain parameters delimited by & like this:
http://www.cbsnews.com/video/watch/?id=7400904n&tag=re1.channel
The original is at Extract URLs from a text (Regex). I modified it slightly to recognise a URL that contains parentheses like this:
http://msdn.microsoft.com/en-us/library/ms686722(v=VS.85).aspx
You need to specify RegexOptions.IgnoreCase with this regex, though of course you could simplify by replacing A-Z with \w.
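One possible way to put the dash-aware pattern to work is to let Regex.Replace build the anchors in a single pass, using the $0 substitution for the whole match. This is only a sketch: the class and method names are made up, the input is assumed to be plain text, and HTML-encoding of the surrounding text is left out.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Linkifier
{
    // The pattern from the answer above, with the dash added to the
    // second character class so paths like /my-page are matched whole.
    private static readonly Regex UrlPattern = new Regex(
        @"https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?",
        RegexOptions.IgnoreCase);

    // Wraps every URL found in the input in an anchor tag.
    // $0 in the replacement string stands for the entire match.
    public static string Linkify(string input)
    {
        return UrlPattern.Replace(input, "<a href=\"$0\">$0</a>");
    }
}
```

Doing the replacement in one Regex.Replace call also sidesteps a problem with calling string.Replace inside a loop over the matches: if the same URL occurs twice in the text, the loop would wrap it a second time inside the anchor that was already created.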

How to ignore words in string using Regular Expressions

I need a piece of regex that can be used to do a NOT match. Take for example the following URLs:
http://www.site.com/layout/default.aspx
http://www.site.com/default.aspx
http://www.site.com/layout.aspx
The regex should NOT match any URL string that contains the directory "layout":
http://www.site.com/layout/default.aspx
and instead should match on
http://www.site.com/default.aspx
http://www.site.com/layout.aspx
How can I do this using .NET regex?
Use negative lookahead:
^(?!.*/layout/)
You have to anchor to the start of the string or you'll get false positives, i.e. (?!/layout/) alone won't work, because it succeeds at any position where /layout/ does not begin.
If you need to squeeze everything in one regex, try negative lookahead, something along the lines of this:
(?!layout)
Just match /layout/ and invert the result, whatever language you use.
E.g. with PHP:
if(!preg_match('#/layout/#i', $url)) {
// does not match layout
}
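Python:
import re
if not re.search('/layout/', url):
    pass  # do whatever here
re is Python's regular expression module; re.search is used because re.match only looks at the start of the string.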
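Since the question asked about .NET specifically, here is a short sketch of the anchored negative lookahead in C#, run against the three URLs from the question. The class and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LayoutFilter
{
    // ^ pins the check to the start of the string; the lookahead then
    // fails if "/layout/" appears anywhere after that point.
    private static readonly Regex NotInLayoutDir = new Regex(@"^(?!.*/layout/)");

    // True when the URL does not contain the /layout/ directory.
    public static bool IsAllowed(string url)
    {
        return NotInLayoutDir.IsMatch(url);
    }
}
```

IsAllowed returns false for http://www.site.com/layout/default.aspx and true for the other two URLs, including http://www.site.com/layout.aspx, because the lookahead requires the slash after "layout".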
