Regular expression to find URLs within a string [duplicate] - c#

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
C# code to linkify urls in a string
I'm sure this is a stupid question, but I can't find a decent answer anywhere. I need a good URL regular expression for C#. It needs to find all URLs in a string so that I can wrap each one in HTML to make it clickable.
What is the best expression to use for this?
Once I have the expression, what is the best way to replace these URLs with their properly formatted counterparts?
Thanks in advance!

I am using this right now:
text = Regex.Replace(text,
@"((http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)",
"<a target='_blank' href='$1'>$1</a>");

Use this code:
protected string MakeLink(string txt)
{
// Matches http:// URLs only; widen the scheme part if https or ftp is also needed.
Regex regx = new Regex("http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?", RegexOptions.IgnoreCase);
// Replace in a single pass ($0 is the whole match). Calling txt.Replace once per
// match would wrap a URL twice if it occurs more than once in the text.
return regx.Replace(txt, "<a href='$0'>$0</a>");
}
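The two answers above differ mainly in how the replacement is applied. As a minimal sketch of the single-pass approach, the class below uses a match evaluator so that each URL is wrapped once and is HTML-encoded before it goes into the attribute. The `Linkifier` class, the `Linkify` method, and the deliberately loose pattern are illustrative assumptions, not code from either answer, and the pattern is not a complete URL grammar.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class Linkifier
{
    // Hypothetical, intentionally simple pattern: a scheme followed by any run of
    // characters that are not whitespace, quotes, or angle brackets.
    private static readonly Regex UrlPattern = new Regex(
        @"\b(?:https?|ftp)://[^\s<>""']+",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string Linkify(string text)
    {
        // One pass over the input: every match is replaced exactly once,
        // so a URL that appears twice is not wrapped twice.
        return UrlPattern.Replace(text, m =>
        {
            string url = m.Value;
            // Trailing sentence punctuation is usually not part of the URL.
            string trimmed = url.TrimEnd('.', ',', ';', ':', '!', '?', ')');
            string tail = url.Substring(trimmed.Length);
            // Encode &, quotes and angle brackets before emitting HTML.
            string encoded = WebUtility.HtmlEncode(trimmed);
            return "<a href=\"" + encoded + "\">" + encoded + "</a>" + tail;
        });
    }
}
```

Calling `Linkifier.Linkify(text)` on plain text returns the same text with each detected URL wrapped in an anchor. The surrounding text is left as it is, so it should be encoded separately if it can contain markup.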

Related

mvc regex replace array of pattern on one sentence [duplicate]

This question already has answers here:
Why \b does not match word using .net regex
(2 answers)
Closed 5 years ago.
I want to replace certain words in a sentence using Regex.Replace.
For this, I create an array of patterns:
string[] words = {"abc","132","qwe","bold","test"};
and to do the replacement, I loop over it:
foreach (string item in words){
output = Regex.Replace(output,@"\b" + item + "\b", " ");
}
but this doesn't work.
Does anyone have an idea?
Explanation
The same approach works without problems in VB.NET.
I am a beginner in C#.
It looks like you forgot the @ (verbatim string) prefix on the second \b. Without it, "\b" in a C# string literal is the backspace character, not the regex word boundary.
Either add the prefix:
output = Regex.Replace(output, @"\b" + item + @"\b", " ");
or double the backslash so that it is escaped in the regular string:
output = Regex.Replace(output, @"\b" + item + "\\b", " ");
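Once the word boundaries are written correctly, the loop can also be folded into one pattern. The sketch below is one possible way to do that: it joins the words into a single alternation and escapes each word first. The `WordRemover` class and `RemoveWords` method are hypothetical names introduced for illustration.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class WordRemover
{
    public static string RemoveWords(string input, string[] words)
    {
        // Regex.Escape makes each word match literally, even if it contains
        // regex metacharacters. Both \b anchors live in verbatim strings.
        string alternation = string.Join("|", words.Select(w => Regex.Escape(w)));
        string pattern = @"\b(?:" + alternation + @")\b";

        // Same replacement as in the question: each whole word becomes one space.
        return Regex.Replace(input, pattern, " ");
    }
}
```

With the array from the question, `WordRemover.RemoveWords(output, words)` replaces whole-word occurrences only, so a longer word such as "abcd" is left untouched.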

Building dynamic Regex in C# [duplicate]

This question already has answers here:
What characters need to be escaped in .NET Regex?
(4 answers)
Closed 7 years ago.
I use a dynamically built regex. The problem occurs when symbol = "aaaa (1)", because the regex engine parses the parentheses as regex syntax, but I want the value to be treated literally.
Regex regex = new Regex(@"(^" + "/(" + symbol + @" \(\d+\)$)|" + symbol);
You need to escape the special characters:
var escapedSymbol = Regex.Escape(symbol);
Regex regex = new Regex(@"(^" + "/(" + escapedSymbol + @" \(\d+\)$)|" + escapedSymbol);
Reference: MSDN
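To see the effect of Regex.Escape in isolation, here is a small sketch under simplifying assumptions: it uses a reduced pattern (the symbol followed by a space and a number in parentheses) rather than the exact expression from the question. The `EscapeDemo` class and `IsNumberedSymbol` method are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // True when input is exactly the symbol followed by " (n)", where n is one or more digits.
    public static bool IsNumberedSymbol(string symbol, string input)
    {
        // Regex.Escape turns "aaaa (1)" into a pattern that matches those
        // characters literally instead of opening a capture group.
        string pattern = "^" + Regex.Escape(symbol) + @" \(\d+\)$";
        return Regex.IsMatch(input, pattern);
    }
}
```

Without the escape, the parentheses in the symbol would form a group and the literal "(" and ")" characters in the input would no longer match.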

How do I fix open <'s without closing >'s with C#?

I'm using C# with the .NET 4.5 version of the HTML Agility Pack. I have to be able to import a large number of different HTML documents and always be able to load them into the .NET XmlDocument.
My current issue is that I am seeing html similar to this:
<p class="s18">(4) if qual. ch ild <17 f or</p>
I need to convert that "<" to anything else but I need to preserve all of the other <'s and >'s. I'd like to use as few lines of code as possible and hope that someone can show me how the Html Agility Pack (already being used in my project for other things) can be leveraged to solve this problem.
EDIT: If Html Agility Pack doesn't satisfy the need then I'd appreciate a C# method which will eliminate or close any open flags while preserving any valid tags.
EDIT 2: Removed, no longer relevant.
EDIT 3: I've partially solved this problem but there is a bug that I'd appreciate help resolving.
My method is below. This method successfully removes the '<' and '>' characters from this HTML.
<p>yo hi</p><p> Gee I love 1<'s</p><td name=\"\" /><p>bazinga ></p>
The problem is that the Regex.Matches() method does not seem to find all matches. It finds a match and then looks for the next one starting where the first match ended. Because of this, the '<' character in " Gee I love 2<'s" is skipped in the following HTML.
<p>yo hi</p><p> Gee I love 1<'s<p> Gee I love 2<'s<p> Gee I love 3<'s</p></p></p><td name=\"\" /><p>bazinga ></p>
In my opinion, " Gee I love 2<'s" should be a match, but Regex.Matches() skips it, presumably because matches do not overlap and the search position has already moved to the end of the previous match.
private static string RemovePartialTags(string input)
{
    Regex regex = new Regex(@"<[^<>/]+>(.*?)<[^<>]+>");
    string output = regex.Replace(input, delegate(Match m)
    {
        string v = m.Value;
        Regex reg = new Regex(@"<[^<>]+>");
        MatchCollection matches = reg.Matches(v);
        int locEndTag = v.IndexOf(matches[1].Value);
        List<string> tokens = new List<string>
        {
            v.Substring(0, matches[0].Length),
            v.Substring(matches[0].Length, locEndTag - matches[0].Length)
                .Replace(@"<", string.Empty)
                .Replace(@">", string.Empty)
        };
        tokens.Add(v.Substring(tokens[0].Length + (locEndTag - matches[0].Length)));
        return tokens[0] + tokens[1] + tokens[2];
    });
    return output;
}
Thank you in advance!
I solved my problem by using the same method as above, but with a modified regular expression:
@"<[^<>/]+>(.*?)[<](.*?)<[^<>]+>"
Method:
private static string RemovePartialTags(string input)
{
    Regex regex = new Regex(@"<[^<>/]+>(.*?)[<](.*?)<[^<>]+>");
    string output = regex.Replace(input, delegate(Match m)
    {
        string v = m.Value;
        Regex reg = new Regex(@"<[^<>]+>");
        MatchCollection matches = reg.Matches(v);
        int locEndTag = v.IndexOf(matches[1].Value);
        List<string> tokens = new List<string>
        {
            v.Substring(0, matches[0].Length),
            v.Substring(matches[0].Length, locEndTag - matches[0].Length)
                .Replace(@"<", string.Empty)
                .Replace(@">", string.Empty)
        };
        tokens.Add(v.Substring(tokens[0].Length + (locEndTag - matches[0].Length)));
        return tokens[0] + tokens[1] + tokens[2];
    });
    return output;
}
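A different and much narrower technique than the method above is to escape only those '<' characters that cannot begin a tag, using a negative lookahead. The sketch below assumes that a real tag always starts with '<' followed by a letter, '/', '!' or '?'. It does not handle a stray '>' or a stray '<' that happens to be followed by a letter, and the `StrayBracketFixer` class is a hypothetical name.

```csharp
using System.Text.RegularExpressions;

public static class StrayBracketFixer
{
    // A '<' that is not followed by a letter, '/', '!' or '?' cannot start a tag,
    // a closing tag, a comment/doctype, or a processing instruction.
    private static readonly Regex StrayLessThan = new Regex(@"<(?![A-Za-z/!?])");

    public static string EscapeStrayLessThan(string html)
    {
        // Replace the stray character with its entity so XML parsers accept it.
        return StrayLessThan.Replace(html, "&lt;");
    }
}
```

Because the lookahead consumes no characters, every stray '<' is examined independently, so nested cases such as the "Gee I love 2<'s" example are not skipped.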

How to remove spaces and newlines in a string [duplicate]

This question already has answers here:
Fastest way to remove white spaces in string
(13 answers)
Closed 9 years ago.
Sorry, I am not very experienced with C# and ASP.NET; I hope I can make myself understood.
I have this code:
string content = ClearHTMLTags(HttpUtility.HtmlDecode(e.Body));
content=content.Replace("\r\n", "");
content=content.Trim();
((Post)sender).Description = content + "...";
I want to make sure that the string content contains neither spaces (Trim) nor carriage return/line feed pairs. I am using the code above, but it does not work well.
Any suggestions?
Thank you very much
Fabry
You can remove all whitespace with this regex:
content = Regex.Replace(content, @"\s+", string.Empty);
See MSDN for what counts as a whitespace character.
By the way, Trim does not remove all spaces; it only removes whitespace at the beginning and at the end of the string. If you want to remove all spaces and carriage returns, use the regex above.
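This should do it:
// Regular (non-verbatim) literals are used so that \t, \n and \r denote the actual
// control characters; with the @ prefix they would be literal backslash sequences.
String text = "hdjhsjhsdcj/sjksdc\t\r\n asdf";
string[] charactersToReplace = new string[] { "\t", "\n", "\r", " " };
foreach (string s in charactersToReplace)
{
text = text.Replace(s, "");
}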
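A simple change: you only missed the @ symbol.
string content = ClearHTMLTags(HttpUtility.HtmlDecode(e.Body));
content=content.Replace(@"\r\n", "");
content=content.Trim();
((Post)sender).Description = content + "...";
Note that @"\r\n" is a verbatim string that matches the four literal characters \r\n, not an actual carriage return and line feed, so this only helps if the text contains those escape sequences as literal text.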
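The answers above mix two different goals: removing every whitespace character, and merely tidying line breaks while keeping words apart. As a hedged sketch, the hypothetical `WhitespaceStripper` class below shows one method for each; neither is taken from the answers.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class WhitespaceStripper
{
    // Removes every whitespace character (spaces, tabs, \r, \n, ...) without a regex.
    public static string RemoveAllWhitespace(string input)
    {
        return new string(input.Where(c => !char.IsWhiteSpace(c)).ToArray());
    }

    // Collapses each run of whitespace to a single space and trims both ends,
    // which keeps words separated while dropping line breaks.
    public static string CollapseWhitespace(string input)
    {
        return Regex.Replace(input, @"\s+", " ").Trim();
    }
}
```

For a post description, `CollapseWhitespace` is probably closer to what is wanted, since `RemoveAllWhitespace` also glues the words together.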

how to replace a string ignoring case? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Is there an alternative to string.Replace that is case-insensitive?
I'm looking for an alternative to:
string str = "data ... ";
string replace = "data";
str = str.Replace(replace, "new value");
str = str.Replace(replace.ToLower(),"new value");
If possible, without using regex.
Thanks in advance.
Without Regex, I don't know.
With Regex, it would look something like this:
// str is the text to find; Regex.Escape makes it match literally.
var regex = new Regex( Regex.Escape( str ), RegexOptions.IgnoreCase );
var newSentence = regex.Replace( sentence, "new value" );
I found an interesting article with sample code that appears to run faster than Regex.Replace: http://www.codeproject.com/KB/string/fastestcscaseinsstringrep.aspx
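For the regex-free variant the question asks about, one possible approach is to scan with String.IndexOf and an ordinal, case-insensitive comparison. The sketch below is an illustration only; it is not the code from the linked article, and the `CaseInsensitive` class and `ReplaceIgnoreCase` method are hypothetical names.

```csharp
using System;
using System.Text;

public static class CaseInsensitive
{
    public static string ReplaceIgnoreCase(string input, string search, string replacement)
    {
        // An empty search string would never advance the scan position.
        if (string.IsNullOrEmpty(search)) return input;

        var result = new StringBuilder();
        int start = 0;
        int index;
        // Find each occurrence regardless of case, copying the text in between.
        while ((index = input.IndexOf(search, start, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            result.Append(input, start, index - start);
            result.Append(replacement);
            start = index + search.Length;
        }
        // Copy whatever follows the last occurrence.
        result.Append(input, start, input.Length - start);
        return result.ToString();
    }
}
```

With the variables from the question, the call would be `str = CaseInsensitive.ReplaceIgnoreCase(str, replace, "new value");`. An ordinal comparison is used so that the matched length always equals `search.Length`.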
