Ignore nested single quotes inside of single quotes - c#

I'm working with matching an entire string within single quotes. The problem is, these strings are dynamically generated and I need to ignore all other single quotes within the first set of quotes. I've come across other solutions that are similar but I can't seem to tweak them to my needs.
Here is what I've worked with so far:
'(?:''|[^'])*'
I would like to match essentially everything within the first and last single quotes between content: and ;
Some example text:
#bottom {
content: 'Here we have an embedded unescaped 'single' that is generated at runtime. {Let's ignore it
please'
;
}
This is the playground I've been working in:
https://regex101.com/r/ITHciu/2
Any help would be greatly appreciated.

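If you absolutely have to use a regex for this and you are certain that ; will never appear inside the string you are searching for, you could try this: '[^;]*'\s*;$. It matches everything from a ' up to a ' that is followed by optional whitespace and a ; at the end of a line. In .NET, pass RegexOptions.Multiline so that $ matches at the end of each line rather than only at the end of the input.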
Edit: if you need the stuff between the ' and ';, you could use a group '([^;]*)'\s*;$.
However, a much cleaner solution would be to write a small parser that reads the string character by character. It's a fun exercise if you have a little more time.
If nothing else, you could use that regex to correct the invalid syntax in your files, and tell the people writing them by hand what the valid syntax should be.
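A minimal C# sketch of the capture-group regex from this answer. The class and method names are made up for illustration. RegexOptions.Multiline and the optional \r before $ are assumptions added here, so that $ matches at the end of the line holding the ; and Windows line endings are tolerated.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContentExtractor
{
    // Group 1 captures everything between the first ' and the last '
    // that precedes a line-ending ';'. Assumes the value never contains ';'.
    private static readonly Regex ContentValue =
        new Regex(@"'([^;]*)'\s*;\r?$", RegexOptions.Multiline);

    // Returns the quoted value, or null when nothing matches.
    public static string ExtractContent(string css)
    {
        Match m = ContentValue.Match(css);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Because [^;] also matches line breaks, the captured value may span several lines, as it does for the sample text in the question.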

Related

How do I see if a string contains another string with quotes in it?

I am trying to see if a large string contains this line of HTML:
<label ng-class="choiceCaptionClass" class="ng-binding choice-caption">Was this information helpful?</label>
As you can see, this snippet has quotations in multiple places and it's causing problems when I do something like this:
Assert.IsTrue(responseContent.Contains("<label ng-class="choiceCaptionClass" class="ng - binding choice - caption">Was this information helpful?</label>"));
I've tried both of these ways of defining the string:
@"<label ng-class=""choiceCaptionClass"" class=""ng - binding choice - caption"">Was this information helpful?</label>"
and
"<label ng-class=\"choiceCaptionClass\" class=\"ng - binding choice - caption\">Was this information helpful?</label>"
But in each case the Contains() method looks for the literal string with either the double quotes or the backslashes. Is there another way I could define this string so I can correctly search for it?
Escaping the double-quotes with backslashes is the proper thing to do.
The reason your search may be failing is that the strings don't actually match. For example, in your version with backslashes, you have spaces around some of the dashes but your HTML string does not.
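To illustrate, here is a small sketch; the QuoteLiterals class and its constant names are made up for this example. Both ways of writing the literal produce the same string at runtime, so the escaping itself is not what makes Contains() fail.

```csharp
using System;

public static class QuoteLiterals
{
    // Regular literal: \" is the escape sequence for a single " character.
    public const string Escaped =
        "<label ng-class=\"choiceCaptionClass\" class=\"ng-binding choice-caption\">Was this information helpful?</label>";

    // Verbatim literal: "" is the escape sequence for a single " character.
    public const string Verbatim =
        @"<label ng-class=""choiceCaptionClass"" class=""ng-binding choice-caption"">Was this information helpful?</label>";

    // The variant from the question, with spaces around the dashes.
    public const string Spaced =
        "<label ng-class=\"choiceCaptionClass\" class=\"ng - binding choice - caption\">Was this information helpful?</label>";
}
```

Escaped and Verbatim compare equal, and a page containing the original HTML contains both of them. Spaced is a different string, so a Contains() check for it fails.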
Try using regular expressions. I made this one for you but you can test your own regex here.
var regex = new Regex(@"<label\s+ng-class\s*=\s*""choiceCaptionClass""\s+class\s*=\s*""ng-binding choice-caption""\s*>\s*Was this information helpful\?\s*</label>", RegexOptions.IgnoreCase);
Assert.IsTrue(regex.IsMatch(responseContent));
If this does not work, use the tester tool to figure out which part of the pattern fails to match.
Hope this helps!

Matching and replacing function expressions

I need to do some very light parsing of C# (actually transpiled Razor code) to replace a list of function calls with textual replacements.
If given a set containing {"Foo.myFunc" : "\"def\"" } it should replace this code:
var res = "abc" + Foo.myFunc(foo, Bar.otherFunc( Baz.funk()));
with this:
var res = "abc" + "def";
I don't care about the nested expressions.
This seems fairly trivial and I think I should be able to avoid building an entire C# parser using something like this for every member of the mapping set:
find expression start (e.g. Foo.myFunc)
Push()/Pop() parentheses on a Stack until Count == 0.
Mark this as expression stop
replace everything from expression start until expression stop
But maybe I don't need to ... Is there a (possibly built-in) .NET library that can do this for me? Counting is not possible in the family of languages that RE is in, but maybe the extended regex syntax in C# can handle this somehow using back references?
edit:
As the comments on this answer demonstrate, simply counting brackets will not be sufficient in general, as something like trollMe("(") will throw off such algorithms. Only true parsing would then suffice, I guess (?).
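The counting steps listed in the question can be sketched as a small scanner. This is an illustration under stated assumptions, not a C# parser: the class and method names are hypothetical, only the first call is replaced, the name must be followed directly by (, and only regular and verbatim string literals are skipped. Comments, char literals and interpolated strings are not handled.

```csharp
using System;

public static class CallReplacer
{
    // Replaces the first call to funcName, arguments included, with replacement.
    public static string ReplaceCall(string code, string funcName, string replacement)
    {
        int start = code.IndexOf(funcName + "(", StringComparison.Ordinal);
        if (start < 0) return code;
        int end = FindCallEnd(code, start + funcName.Length);
        if (end < 0) return code; // unbalanced parentheses: leave the code untouched
        return code.Substring(0, start) + replacement + code.Substring(end);
    }

    // Returns the index just past the ')' closing the '(' at openIndex, or -1.
    private static int FindCallEnd(string code, int openIndex)
    {
        int depth = 0;
        for (int i = openIndex; i < code.Length; i++)
        {
            char c = code[i];
            if (c == '"')
            {
                bool verbatim = i > 0 && code[i - 1] == '@';
                i = SkipString(code, i, verbatim); // jump to the closing quote
                if (i < 0) return -1;
            }
            else if (c == '(') depth++;
            else if (c == ')' && --depth == 0) return i + 1;
        }
        return -1;
    }

    // Returns the index of the quote closing the literal opened at quoteIndex, or -1.
    private static int SkipString(string code, int quoteIndex, bool verbatim)
    {
        for (int i = quoteIndex + 1; i < code.Length; i++)
        {
            if (verbatim)
            {
                if (code[i] != '"') continue;
                if (i + 1 < code.Length && code[i + 1] == '"') i++; // "" is an escaped quote
                else return i;
            }
            else if (code[i] == '\\') i++; // skip the escaped character
            else if (code[i] == '"') return i;
        }
        return -1;
    }
}
```

Skipping string literals is what keeps an argument such as "(" from unbalancing the count.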
The trick for a normal string will be:
(?>"(\\"|[^"])*")
A verbatim string:
(?>@"(""|[^"])*")
Maybe this can help, but I'm not sure that this will work in all cases:
<func>(?=\()((?>/\*.*?\*/)|(?>@"(""|[^"])*")|(?>"(\\"|[^"])*")|\r?\n|[^()"]|(?<open>\()|(?<-open>\)))+?(?(open)(?!))
Replace <func> with your function name.
Needless to say, trollMe("\"(", "((", @"abc""de((f") works as expected.
DEMO
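A sketch of how the balancing-group pattern above might be used from C#. The class and method names are hypothetical. The pattern is the one from this answer written as a verbatim string (so every " is doubled), with the function name escaped and prepended.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedCallRegex
{
    // Matches a balanced (...) argument list, skipping /* */ comments,
    // verbatim strings and regular strings along the way.
    private const string ArgumentList =
        @"(?=\()((?>/\*.*?\*/)|(?>@""(""""|[^""])*"")|(?>""(\\""|[^""])*"")|\r?\n|[^()""]|(?<open>\()|(?<-open>\)))+?(?(open)(?!))";

    // Replaces every call to funcName, arguments included, with replacement.
    public static string ReplaceCalls(string code, string funcName, string replacement)
    {
        var regex = new Regex(Regex.Escape(funcName) + ArgumentList);
        // A MatchEvaluator keeps '$' in the replacement from being read as a substitution.
        return regex.Replace(code, _ => replacement);
    }
}
```

Regex.Escape matters here because names such as Foo.myFunc contain a dot, which would otherwise match any character.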

Regex to identify C# functions

I need to find all functions in my VS solution with a certain attribute and insert a line of code at the end and at the beginning of each one. For identifying the functions, I've got as far as
\[attribute\]\r?\n(.*)void(.*)\r?\n.*\{\r?\n([^\{\}]*)\}
But that only works on functions that don't contain any other blocks of code delimited by braces. If I set the last capturing group to [\s\S] (all characters), it simply selects all text from the start of the first function to the end of the last one. Is there a way to get around this and select just one whole function?
I am afraid balancing constructs alone are not enough, since the method body may contain an unbalanced number of braces (inside string literals or comments, for example). You can still try this regex, which handles most of the caveats:
\[attribute\](?<signature>[^{]*)(?<body>(?:\{[^}]*\}|//.*\r?\n|"[^"]*"|[\S\s])*?\{(?:\{[^}]*\}|//.*\r?\n|"[^"]*"|[\S\s])*?)\}
See demo on RegexStorm
The regex will ignore all { and } in the string literals and //-like comments, and will consume {...} blocks. The only thing it does not support is /*...*/ multiline comments. Please let me know if you also need to account for them.
The bad news is that you can't do that by the Search-And-Replace feature because it doesn't support balancing groups. You can write a separate program in C# that does it for you.
The construct to get the matching closing brace is:
(?=\{)(?:(?<open>\{)|(?<-open>\})|[^\{\}])+?(?(open)(?!))
This matches a block of {...}. But as @DmitryBychenko mentioned, it doesn't respect comments or strings.
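A short sketch of the brace-matching construct in use; the class and method names are hypothetical. As noted above, braces inside strings or comments are counted too, so this only suits code where they do not occur.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceBlocks
{
    // Balancing group: push on '{', pop on '}', succeed once the stack is empty.
    private static readonly Regex Block = new Regex(
        @"(?=\{)(?:(?<open>\{)|(?<-open>\})|[^\{\}])+?(?(open)(?!))");

    // Returns the first balanced {...} block in the text, or null if there is none.
    public static string FirstBlock(string code)
    {
        Match m = Block.Match(code);
        return m.Success ? m.Value : null;
    }
}
```

The lazy quantifier +? makes the match stop at the first point where every opened brace has been closed, which is why it does not run on to the end of the last function.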

C# Regex replace round brackets and contents at end of string

Been struggling for an hour to get this working.
Have string of following format:
"blabla(arbitrarycontent)sfsf (arbytrarycontent)"
and also
"blabla (arbytrarycontent)"
I need to ditch the "(arbitrarycontent)", including the brackets, if it occurs at the end of the string.
So the first example the result should be "blabla(arbitrarycontent)sfsf".
For the second it should be "blabla".
Have tried all sorts of Regex patterns like below but unsuccessful.
\(.*\)$
Using .NET 4.0
Thx for any help
Simply forbid the part between the parentheses to contain parentheses. That makes sure that you only match the last pair:
\([^()]*\)$
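A small sketch of this pattern with Regex.Replace; the class and method names are made up. The leading \s* is an addition to the pattern above, on the assumption that the space in front of the removed group should go as well, as the expected results in the question suggest.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingParens
{
    // [^()]* cannot cross another parenthesis, so only the last pair can match at $.
    private static readonly Regex Trailing = new Regex(@"\s*\([^()]*\)$");

    // Removes a parenthesised group, and the whitespace before it, at the end of the string.
    public static string Strip(string text)
    {
        return Trailing.Replace(text, "");
    }
}
```

A string that does not end with a parenthesised group is returned unchanged.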

Regex between, from the last to specific end

Today I want to extract text from a string.
The text must run from the last slash through .partX.rar or .rar.
First I tried to find the end of the name and then the beginning. After I had those two patterns I merged them, but I got empty results.
String:
http://hosting.xyz/1234/15-game.part4.rar.html
http://hosting.xyz/1234/16-game.rar.html
Regex:
Begin:(([^/]*)$) - start from last /
End:(.*(?=.part[0-9]+.rar|.rar)) stop before .partX.rar or .rar
As you can see, if I merge those patterns I don't get any result.
What is more, "End" matches only up to .partX instead of including .partX.rar
All I want is:
15-game.part4.rar and 16-game.rar
What I tried:
(([^/]*)$)(.*(?=.part[0-9]+.rar|.rar))
(([^/]*)$)
(.*(?=.part[0-9]+.rar|.rar))
I tried also
/[a-zA-Z0-9]+
but I do not know how to match symbols. This could be the easiest way, but it matches only letters and digits, not - or _.
If only I could match symbols.
You don't really need a regex for this as you can merely split the url on / and then grab the part of the file name that you need. Since you didn't mention a language, here's an implementation in Perl:
use strict;
use warnings;
my $str1="http://hosting.xyz/1234/15-game.part4.rar.html";
my $str2="http://hosting.xyz/1234/16-game.rar.html";
my $file1=(split(/\//,$str1))[-1]; #last element of the resulting array from splitting on slash
my $file2=(split(/\//,$str2))[-1];
foreach($file1,$file2)
{
s/\.html$//; #for each file name, if it ends in ".html", get rid of that ending.
print "$_\n";
}
The output is:
15-game.part4.rar
16-game.rar
Nothing could be simpler! :-)
Use this:
new Regex(@"^.*/(.*)\.html$")
You'll find your filename in the first captured group (don't have a c# compiler at hand, so can't give you working sample, but you have a working regex now! :-) )
See a demo here: http://rubular.com/r/UxFNtJenyF
I'm not a C# coder so can't write full code here but I think you'll need support of negative lookahead here like this:
new Regex(@"/(?!.*/)(.+?)\.html$");
Matched Group # 1 will have your string i.e. "16-game.rar" OR "15-game.part4.rar"
Use two regexes:
first substitute .*/ with nothing;
then substitute \.html with nothing.
Job done!
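Since the answers above stop short of compiled C#, here is a minimal sketch with hypothetical class and method names. It shows the single-regex approach and a regex-free variant that mirrors the Perl answer. Verbatim strings (@"...") are used because \. is not a valid escape sequence in a regular C# string literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RarLinks
{
    // Regex version: group 1 is everything between the last '/' and a trailing ".html".
    public static string FileNameByRegex(string url)
    {
        Match m = Regex.Match(url, @"^.*/(.*)\.html$");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Regex-free version: take the last path segment, then drop a trailing ".html".
    public static string FileNameBySplit(string url)
    {
        const string suffix = ".html";
        string name = url.Substring(url.LastIndexOf('/') + 1);
        return name.EndsWith(suffix, StringComparison.Ordinal)
            ? name.Substring(0, name.Length - suffix.Length)
            : name;
    }
}
```

The greedy ^.* is what pushes the match to the last slash, so no explicit lookahead is needed in the first method.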
