This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 4 years ago.
I am trying to get my regex to work, to no avail.
All I want to do is find the image tags in an HTML string so I can replace them.
This is what I think should work:
var regex = new Regex(@"<img.*>");
return regex.Replace(content, "<p><i><b>(See Image Online)</b></i></p>");
And it does work partially, but it seems to be stripping out more than just the image tag.
This is an example of what I want to match:
<img src="
NCAMAAAAsYgRbAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5c
cllPAAAABJQTFRF3NSmzMewPxIG//ncJEJsldTou1jHgAAAARBJREFUeNrs2EEK
gCAQBVDLuv+V20dENbMY831wKz4Y/VHb/5RGQ0NDQ0NDQ0NDQ0NDQ0NDQ
0NDQ0NDQ0NDQ0NDQ0NDQ0NDQ0PzMWtyaGhoaGhoaGhoaGhoaGhoxtb0QGho
aGhoaGhoaGhoaGhoaMbRLEvv50VTQ9OTQ5OpyZ01GpM2g0bfmDQaL7S+ofFC6x
v3ZpxJiywakzbvd9r3RWPS9I2+MWk0+kbf0Hih9Y17U0nTHibrDDQ0NDQ0NDQ0
NDQ0NDQ0NTXbRSL/AK72o6GhoaGhoRlL8951vwsNDQ0NDQ1NDc0WyHtDTEhD
Q0NDQ0NTS5MdGhoaGhoaGhoaGhoaGhoaGhoaGhoaGposzSHAAErMwwQ2HwRQ
AAAAAElFTkSuQmCC" alt="beastie.png">
You need either
new Regex(@"<img.*?>");
if supported, or if not,
new Regex(@"<img[^>]*>");
Your problem is that .* is greedy: your regular expression does not stop at the first ">" it finds but runs on to the LAST one.
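To make the difference concrete, here is a minimal sketch comparing the three patterns. The class name ImgTagDemo and the sample HTML are made up for illustration. One thing worth knowing for an image tag that spans several lines, like the one in the question: in .NET, "." does not match a newline unless RegexOptions.Singleline is set, whereas [^>] does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImgTagDemo
{
    public const string Placeholder = "<p><i><b>(See Image Online)</b></i></p>";

    // Greedy: .* extends to the LAST '>' on the line, swallowing later tags.
    public static string ReplaceGreedy(string html) =>
        Regex.Replace(html, @"<img.*>", Placeholder);

    // Lazy: .*? stops at the FIRST '>', but '.' still does not cross '\n' by default.
    public static string ReplaceLazy(string html) =>
        Regex.Replace(html, @"<img.*?>", Placeholder);

    // Negated class: [^>]* stops at the first '>' and also spans line breaks.
    public static string ReplaceNegated(string html) =>
        Regex.Replace(html, @"<img[^>]*>", Placeholder);

    public static void Main()
    {
        // Hypothetical one-line input, not the poster's actual content.
        string html = "<p>before</p><img src=\"a.png\" alt=\"a\"><p>after</p>";
        Console.WriteLine(ReplaceGreedy(html));  // loses the trailing <p>after</p>
        Console.WriteLine(ReplaceLazy(html));    // keeps it
        Console.WriteLine(ReplaceNegated(html)); // keeps it
    }
}
```

If the tag can contain line breaks, the negated-class version is probably the safer of the two fixes.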
This question already has answers here:
What is the best way to parse html in C#? [closed]
(15 answers)
Closed 2 years ago.
I would prefer to use a regex rather than an HTML parser.
What is the best way to extract a base64 image from an HTML string like this:
"<p>This is test </p>
<p><img src=\"....+tzPaXLlstlSjpcxKPEqV/zH//2Q==\"></p>"
I need this line so I can access the base64 image:
/9j/4AAQSkZJRgABAQAAAQABAAD/4gKgSUNDX1BST0ZJTEUA....+tzPaXLlstlSjpcxKPEqV/zH//2Q==
If there is an adequate HTML parser for this use case as suggested by others in the comments, go for that...
But if that doesn't work, regular expressions to the rescue! This uses a positive lookbehind assertion and matches everything up to the first double quote. It should work; adjust it if it doesn't...
var val = "<p>This is test </p><p><img src=\"....+tzPaXLlstlSjpcxKPEqV/zH//2Q==";
var match = Regex.Match(val, "(?<=data:image/jpeg;base64,)[^\"]*");
Console.WriteLine(match.Value);
// output: /9j/4AAQSkZJRgABAQAAAQABAAD/4gKgSUNDX1BST0ZJTEUA....+tzPaXLlstlSjpcxKPEqV/zH//2Q==
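As an alternative to the lookbehind, a named capture group can do the same job without hard-coding the image type. This is only a sketch: the class name Base64ImageDemo, the group name payload, and the tiny sample payload are assumptions for illustration, not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Base64ImageDemo
{
    // Group "payload" holds only the base64 text; the image subtype is left open.
    private static readonly Regex DataUri =
        new Regex("src=\"data:image/[^;\"]+;base64,(?<payload>[^\"]*)\"");

    // Returns the payload of the first embedded image, or null when there is none.
    public static string ExtractPayload(string html)
    {
        Match m = DataUri.Match(html);
        return m.Success ? m.Groups["payload"].Value : null;
    }

    public static void Main()
    {
        // "QUJD" is a made-up stand-in payload (base64 for "ABC").
        string html = "<p>This is test </p><p><img src=\"data:image/png;base64,QUJD\"></p>";
        Console.WriteLine(ExtractPayload(html)); // QUJD
    }
}
```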
This question already has answers here:
What special characters must be escaped in regular expressions?
(13 answers)
Closed 3 years ago.
I have a question about regex. I'm trying to scrape a stock price, and the element surrounding it looks like this:
<span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="52">17.89</span>
As you can see, there are multiple parentheses in that element, so when I try to narrow it down with the following code, the parentheses cause my script not to display any results.
MatchCollection list = Regex.Matches(data, "<span class=\"Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)\" (.+?)</span>", RegexOptions.Singleline);
I know you can escape quotation marks using \, but is there a way to escape parentheses?
Double-escaping the parentheses solved it: in a regular C# string literal, write \\( and \\) so that the regex engine receives \( and \).
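If hand-escaping gets tedious, Regex.Escape can escape the literal part of the pattern for you. The sketch below assumes the markup from the question; the class name EscapeDemo and the group name price are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // The literal opening of the tag, exactly as it appears in the page.
    private const string OpenTag =
        "<span class=\"Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)\"";

    // Unescaped: ( and ) form groups, so the literal parentheses never match.
    public static bool MatchesUnescaped(string data) =>
        Regex.IsMatch(data, OpenTag + " (.+?)</span>", RegexOptions.Singleline);

    // Regex.Escape turns ( ) . and spaces into their escaped forms.
    public static string ExtractPrice(string data)
    {
        string pattern = Regex.Escape(OpenTag) + "[^>]*>(?<price>[^<]+)</span>";
        Match m = Regex.Match(data, pattern);
        return m.Success ? m.Groups["price"].Value : null;
    }

    public static void Main()
    {
        string data = OpenTag + " data-reactid=\"52\">17.89</span>";
        Console.WriteLine(MatchesUnescaped(data)); // False
        Console.WriteLine(ExtractPrice(data));     // 17.89
    }
}
```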
This question already has answers here:
Capturing parts of string using regular expression in R
(3 answers)
Closed 5 years ago.
I need to parse a text file for certain information. I am using a regular expression to do so. My question is, is it possible to match an expression but only capture a relevant part, negating the need to strip the unnecessary characters after capture?
Of course. Search for "regex capturing groups" or check this link: https://msdn.microsoft.com/en-us/library/bs2twtah(v=vs.110).aspx#named_matched_subexpression
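A minimal sketch of a named capturing group in C#, since the question does not show its input: the "Temperature: 21.5 C" line, the class name CaptureDemo, and the group name value are all invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // The pattern matches the whole entry, but only the number is captured.
    private static readonly Regex Reading =
        new Regex(@"Temperature:\s*(?<value>-?[0-9]+(?:\.[0-9]+)?)\s*C");

    // Returns just the captured number, or null when the line does not match.
    public static string ExtractValue(string line)
    {
        Match m = Reading.Match(line);
        return m.Success ? m.Groups["value"].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractValue("Temperature: 21.5 C")); // 21.5
    }
}
```

The surrounding text takes part in the match, so nothing needs to be stripped after the capture.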
This question already has answers here:
C# Substring Alternative - Return rest of line within string after character
(5 answers)
Closed 6 years ago.
I have a URL like this:
http://EddyFox.com/x/xynua
I need to fetch the substring after /x/, whatever that string is.
A more complex example I ran into is:
http://EddyFox.com/x//x/
Here the result should be /x/
It can be achieved with Substring, but we need to do it with a regular expression.
This should do it:
string s = "http://EddyFox.com/x/xynua";
// Presumably you don't want the /x/ itself in the match, so capture what follows it.
Console.WriteLine(Regex.Match(s, "/x/(.*)").Groups[1].Value );
This is probably even better:
Console.WriteLine(Regex.Match(s, "(?<=/x/)(.*)").Value );
The output is
xynua
Have a look at this post: Regex to match after specific characters. SO is full of regex posts, so the probability is very high that a regex question has already been asked before. :)
The regex /x/(.*) will capture everything following the /x/
And where is the problem?
var r = new Regex("/x/(\\S*)");
var matches = r.Matches(myUrl);
This regex matches everything from /x/ up to the first occurrence of whitespace.
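For reference, here is a small sketch that runs the capture-group and lookbehind patterns from the answers above against both URLs in the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AfterMarkerDemo
{
    // Capture group: everything after the first "/x/" up to the end of the line.
    public static string ViaGroup(string url) =>
        Regex.Match(url, "/x/(.*)").Groups[1].Value;

    // Lookbehind: the same text, returned as the whole match.
    public static string ViaLookbehind(string url) =>
        Regex.Match(url, "(?<=/x/).*").Value;

    public static void Main()
    {
        Console.WriteLine(ViaGroup("http://EddyFox.com/x/xynua"));    // xynua
        Console.WriteLine(ViaGroup("http://EddyFox.com/x//x/"));      // /x/
        Console.WriteLine(ViaLookbehind("http://EddyFox.com/x//x/")); // /x/
    }
}
```

Both variants return an empty string when the URL contains no /x/ at all.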
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
RegEx match open tags except XHTML self-contained tags
string regex = "<Name[.\\s]*>[.]*s[.]*</Name>";
string source = "<Name xmlns=\"http://xml.web.asdf.com\">Session</Name>";
bool hit = System.Text.RegularExpressions.Regex.IsMatch(
source,
regex,
System.Text.RegularExpressions.RegexOptions.IgnoreCase
);
Why is hit false? I'm trying to find any Name XML field that has an 's' in the name. I don't understand what could be wrong.
Thanks!
You are using . inside a character class, where it means a literal dot. I think you meant it in the sense of "any character", so use .* rather than [.]*:
string regex = "<Name(.|\\s)*>.*s.*</Name>";
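A quick sketch to check the patterns against the source string from the question. The Tight pattern is not from the original answer; it is a hypothetical stricter variant that uses negated classes so the match cannot run past the end of the tag or into a neighbouring element.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameFieldDemo
{
    // The question's pattern: [.] matches only a literal dot.
    public const string Original = "<Name[.\\s]*>[.]*s[.]*</Name>";

    // The pattern suggested in the answer above.
    public const string Suggested = "<Name(.|\\s)*>.*s.*</Name>";

    // Hypothetical stricter variant (an assumption, not from the answer).
    public const string Tight = "<Name[^>]*>[^<]*s[^<]*</Name>";

    public static bool IsHit(string pattern, string source) =>
        Regex.IsMatch(source, pattern, RegexOptions.IgnoreCase);

    public static void Main()
    {
        string source = "<Name xmlns=\"http://xml.web.asdf.com\">Session</Name>";
        Console.WriteLine(IsHit(Original, source));  // False
        Console.WriteLine(IsHit(Suggested, source)); // True
        Console.WriteLine(IsHit(Tight, source));     // True
    }
}
```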
With XPath, this could be as easy as /Name[contains(.,'s')]