Regex for single quote and double quote - C#

I just want to know the regex for a single-quoted or double-quoted word, specifically something like this:
opening quote (starts with) + word + closing quote (ends with)
(singlequote)word(/singlequote) sample -> 'asdasdasdass'
(doublequote)word(/doublequote) sample -> "asdasdasdass"
in C# WinForms. Thanks.
Update:
I want to replace the regex in this line:
string hoveredWord = r.GetFragment("[a-zA-Z]").Text;
thanks!

The following regex matches a single quote + some text + a single quote ('asdasdasdass'):
Regex regexString = new Regex(@"'[^']*'");
The following regex matches a double quote + some text + a double quote ("asdasdasdass"):
Regex regexString = new Regex(@"""[^""]*""");
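As a rough sketch of how the two patterns above might be used, the class below collects every quoted fragment from an input string. The names QuotedText and FindAll are made up for illustration and are not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; only the two patterns come from the answer above.
public static class QuotedText
{
    // 'some text' : a single quote, any run of non-quote characters, a single quote.
    public static readonly Regex SingleQuoted = new Regex(@"'[^']*'");

    // "some text" : inside a verbatim string, "" stands for one double quote.
    public static readonly Regex DoubleQuoted = new Regex(@"""[^""]*""");

    // Returns every quoted fragment (quotes included) found in the input.
    public static List<string> FindAll(Regex regex, string input)
    {
        var results = new List<string>();
        foreach (Match m in regex.Matches(input))
        {
            results.Add(m.Value);
        }
        return results;
    }
}
```

For example, QuotedText.FindAll(QuotedText.SingleQuoted, "x = 'asdasdasdass';") should return a list containing the single item 'asdasdasdass' (with its quotes).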

Regex pattern:
Former example: ^\'\w+\'
Latter example: ^\"\w+\"
^ : matches the beginning of the string
\' or \" : matches the single quote or double quote
\w+ : matches one or more word characters (letters, digits and underscore)

Related

Combine multiple regex expressions into a single line

I am trying to get the following Regex expressions into a single line.
filename = Regex.Replace(filename, @"[^a-z0-9\s-!/\-_\.\*\(\)']", ""); // Remove all invalid chars
filename = Regex.Replace(filename, @"\s+", " ").Trim(); // Convert multiple spaces into one space
filename = Regex.Replace(filename, @"\s", "_"); // Replace spaces with underscores
You can use
filename = Regex.Replace(filename, @"[^a-z0-9\s!/_.*()'-]|^\s+|\s+$|(\s+)", m => m.Groups[1].Success ? "_" : "");
The regex matches
[^a-z0-9\s!/_.*()'-]| - any char but lowercase ASCII letters, digits, whitespace, !, /, _, ., *, (, ), ' and -, or
^\s+| - start of string and then one or more whitespaces, or
\s+$| - one or more whitespaces and then end of string, or
(\s+) - Group 1: one or more whitespaces in any other context.
If Group 1 matches, the replacement is a _ char, else, the replacement is an empty string (the match is removed).
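A minimal sketch of the combined replacement wrapped in a helper, so that its behaviour can be checked on a few inputs. FileNameCleaner and Clean are illustrative names only; the pattern and the match evaluator are the ones from the answer above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the single-pass replacement.
public static class FileNameCleaner
{
    static readonly Regex Cleanup =
        new Regex(@"[^a-z0-9\s!/_.*()'-]|^\s+|\s+$|(\s+)");

    public static string Clean(string filename)
    {
        // Group 1 only participates for whitespace that is neither leading nor trailing.
        // Such whitespace becomes "_"; every other match is removed.
        return Cleanup.Replace(filename, m => m.Groups[1].Success ? "_" : "");
    }
}
```

With this sketch, Clean("  my file (1)$.txt ") gives "my_file_(1).txt". One difference from the three-step version is worth knowing: when an invalid character sits between two spaces, as in "a $ b", the single pass yields "a__b", because each whitespace run is replaced before the invalid character is removed, whereas the three-step version would yield "a_b".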
See the regex demo.

Getting the substring after a character in C# using regex

I have the following input string:
string val = "[01/02/70]\nhello world ";
I want to get all the words after the last ] character.
Example output for a sample string above:
\nhello world
In C#, use Substring() with IndexOf:
val = val.Substring(val.IndexOf(']') + 1);
If you have multiple ] symbols, and you want to get all of the string after the last one, use LastIndexOf:
string val = "[01/02/70]\nhello [01/02/80] world ";
val = val.Substring(val.LastIndexOf(']') + 1); // => " world "
If you are a fan of Regex, you might want to use a Regex.Replace like
string val = "[01/02/70]\nhello [01/02/80] world ";
val = Regex.Replace(val, @"^.*\]", string.Empty, RegexOptions.Singleline); // => " world "
See demo
Notes on REGEX:
RegexOptions.Singleline makes . match a linebreak
^ - matches beginning of string
.* - matches 0 or more characters but as many as possible (greedy matching)
\] - matches literal ] (as it is a special regex metacharacter, it must be escaped).
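The two approaches from this answer can be put side by side as a small sketch. The class and method names are made up for illustration; the calls themselves are the ones shown above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helpers comparing the string-based and regex-based approaches.
public static class AfterLastBracket
{
    public static string WithLastIndexOf(string val)
    {
        // LastIndexOf returns -1 when there is no ']', so Substring(0)
        // then returns the whole string unchanged.
        return val.Substring(val.LastIndexOf(']') + 1);
    }

    public static string WithRegex(string val)
    {
        // Greedy .* with Singleline runs up to the last ']' even across newlines.
        return Regex.Replace(val, @"^.*\]", string.Empty, RegexOptions.Singleline);
    }
}
```

Both methods return " world " for "[01/02/70]\nhello [01/02/80] world ", and "\nhello world " for the string in the question.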
You need to use a lookbehind assertion. You also have to enable the DOTALL modifier so that the pattern matches the newline character in between. Note that this pattern matches everything after the first ] in the string.
"(?s)(?<=\\]).*"
(?s) - DOTALL modifier.
(?<=\\]) - lookbehind which asserts that the match must be preceded by a close bracket
.* - Matches any character zero or more times.
or
"(?s)(?<=\\])[\\s\\S]*"
Try this if you don't want to match the following newline character.
@"(?<=\][\n\r]*).*"

I think my regular expression pattern in C# is incorrect

I'm checking to see if my regular expression matches my string.
I have a filename that looks like somename_something.txt and I want to match it against somename_*.txt, but my code fails when I pass something that should match. Here is my code.
string pattern = "somename_*.txt";
Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
using (ZipFile zipFile = ZipFile.Read(fullPath))
{
foreach (ZipEntry e in zipFile)
{
Match m = r.Match("somename_something.txt");
if (!m.Success)
{
throw new FileNotFoundException("A filename with format: " + pattern + " not found.");
}
}
}
The asterisk applies to the underscore (zero or more underscores), and that throws the match off.
Try:
somename_(\w+)\.txt
The (\w+) group captures one or more word characters at this position.
You can see it match here: https://regex101.com/r/qS8wA5/1
In General
The regex given in this code applies the * to the _, meaning zero or more underscores, instead of what you intended. The * denotes zero or more of the previous item. Instead try
^somename_(.*)\.txt$
This matches exactly the first part "somename_".
Then anything (.*)
And finally the end ".txt". The backslash escapes the 'dot'.
More Specific
If you only want letters, and not numbers or symbols, in the middle part of the match, use:
^somename_[a-z]*\.txt$
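To see the difference in practice, here is a small sketch that holds the original wildcard-style pattern next to the anchored one. FileNamePattern and TryGetMiddlePart are illustrative names, not from the question.

```csharp
using System.Text.RegularExpressions;

// Hypothetical comparison of the pattern from the question and the corrected one.
public static class FileNamePattern
{
    // As written in the question: '*' quantifies '_' and '.' matches any character.
    public static readonly Regex Original =
        new Regex("somename_*.txt", RegexOptions.IgnoreCase);

    // Anchored pattern with an escaped dot and a capture group for the middle part.
    public static readonly Regex Fixed =
        new Regex(@"^somename_(.*)\.txt$", RegexOptions.IgnoreCase);

    // Returns true and the text between "somename_" and ".txt" when the name matches.
    public static bool TryGetMiddlePart(string fileName, out string middle)
    {
        Match m = Fixed.Match(fileName);
        middle = m.Success ? m.Groups[1].Value : string.Empty;
        return m.Success;
    }
}
```

Original.IsMatch("somename_something.txt") is false, while Original.IsMatch("somenameXtxt") is true, which is the opposite of what a wildcard would do. Fixed matches the first name and captures "something" in group 1.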
As written, your regular expression
somename_*.txt
matches (in a case-insensitive manner):
the literal text somename, followed by
zero or more underscore characters (_), followed by
any single character (other than newline), followed by
the literal text txt
And it will match that anywhere in the source text. You probably want to write something like
Regex myPattern = new Regex( @"
^ # anchor the match to start-of-text, followed by
somename # the literal 'somename', followed by
_ # a literal underscore character, followed by
.* # zero or more of any character (except newline), followed by
\. # a literal period/fullstop, followed by
txt # the literal text 'txt'
$ # with the match anchored at end-of-text
" , RegexOptions.IgnoreCase|RegexOptions.IgnorePatternWhitespace
) ;
Hi I think the pattern should be
string pattern = "somename_.*\\.txt";
Regards

How can I escape these quotes in regex expressions?

I have a string that looks like this:
"ruf": "the text I want",
"puf":
I want to extract the text inside the quotes.
I tried this:
string cg="?<=\"ruf\":\")(.*?)(?=\",puf";
Regex g = new Regex(cg);
It didn't work.
Try the regex below:
(?<="ruf":\s\")[^"]*
Online demo
String literals for use in programs:
C#
@"(?<=""ruf"":\s\"")[^""]*"
output:
the text I want
Pattern description:
(?<= - look behind to see if there is:
"ruf": - the literal text "ruf":
\s - a whitespace character (\n, \r, \t, \f, or " ")
\" - a literal "
) - end of look-behind
[^"]* - any character except " (0 or more times, matching as many as possible)
Debuggex Demo
EDIT
Can you add puf? It is a long text which has multiple quotes in it.
If you want everything up to the point where "puf" is found, try the regex below:
(?<="ruf":\s\")[\s\S]*(?=",\s*"puf")
Online demo
String literals for use in programs:
C#
@"(?<=""ruf"":\s\"")[\s\S]*(?="",\s*""puf"")"
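A short sketch of both patterns from this answer in use. RufExtractor and its method names are hypothetical; the sample inputs in the note below are assumptions modelled on the text in the question.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper built around the two lookbehind patterns above.
public static class RufExtractor
{
    // Stops at the next double quote after "ruf": "
    static readonly Regex UpToNextQuote =
        new Regex(@"(?<=""ruf"":\s\"")[^""]*");

    // Runs up to the quote and comma that precede "puf", so embedded quotes survive.
    static readonly Regex UpToPuf =
        new Regex(@"(?<=""ruf"":\s\"")[\s\S]*(?="",\s*""puf"")");

    public static string ValueUpToNextQuote(string input)
    {
        return UpToNextQuote.Match(input).Value;
    }

    public static string ValueUpToPuf(string input)
    {
        return UpToPuf.Match(input).Value;
    }
}
```

For the text in the question both methods return the text I want. If the value itself contains quotes, for example "ruf": "he said "hi" to me", followed by "puf":, only ValueUpToPuf returns the full value; ValueUpToNextQuote stops at the first embedded quote.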
You could try the regex below with the s modifier:
/(?<=\"ruf\": \")[^\"]*(?=\",.*?\"puf\":)/s
DEMO
With the s modifier, the dot also matches newline characters.
Do it like this:
var myRegex = new Regex(@"(?s)(?<=""ruf"": "")[^""]*(?="",\s*""puf"")");
string resultString = myRegex.Match(yourString).Value;
Console.WriteLine(resultString);

Using \b in C# regular expressions doesn't work?

I am wondering why the following regex does not match.
string query = "\"1 2\" 3";
string pattern = string.Format(#"\b{0}\b", Regex.Escape("\"1 2\""));
string repl = Regex.Replace(query, pattern, "", RegexOptions.CultureInvariant);
Note that if I remove the word boundary characters (\b) from pattern, it matches fine. Is there something about '\b' that might be tripping this up?
A quote is not a word character, so \b will not match next to it here. \b matches only at a transition between a word character and a non-word character. There is no word character before the opening quote, so there is no such transition and the pattern does not match.
From your comment, you are trying to remove word characters from a string. The most straightforward way to do that would be to replace \w with an empty string:
string repl = Regex.Replace(query, @"\w", "", RegexOptions.CultureInvariant);
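The explanation above can be checked with a small sketch. The first method reproduces the pattern from the question. The second swaps \b for the lookarounds (?<!\w) and (?!\w), which is one possible workaround and an assumption on my part, not something suggested in the thread. All names are illustrative.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helpers; only WithWordBoundaries mirrors the code in the question.
public static class QuotedPhraseRemover
{
    public static string WithWordBoundaries(string query, string phrase)
    {
        // \b needs a word character on exactly one side, which a quote next to
        // a space or the string start cannot provide.
        string pattern = string.Format(@"\b{0}\b", Regex.Escape(phrase));
        return Regex.Replace(query, pattern, "", RegexOptions.CultureInvariant);
    }

    public static string WithLookarounds(string query, string phrase)
    {
        // Assumed alternative: only require that no word character touches the phrase.
        string pattern = string.Format(@"(?<!\w){0}(?!\w)", Regex.Escape(phrase));
        return Regex.Replace(query, pattern, "", RegexOptions.CultureInvariant);
    }
}
```

With query set to "\"1 2\" 3" and phrase set to "\"1 2\"", WithWordBoundaries returns the query unchanged, while WithLookarounds returns " 3". Whether the lookaround form is the right choice depends on what should count as a boundary for your input.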
You are expecting a whitespace, but it isn't finding one.
Replace
string query = "\"1 2\" 3";
with
string query = "\" 1 2 \" 3";
and you'll see what I mean.
