What would be the regex to find (PoundSomenumberSemiColonPound) (aka #Number;#)? I used this, but it is not working:
string st = Regex.Replace(string1, @"(#([\d]);#)", string.Empty);
You're looking for #\d+;#.
\d matches a single numeric character
+ matches one or more of the preceding character.
(\x23\d+\x3B\x23)
# and / are both commonly used as delimiters around patterns, hence the trouble. Try the above. (When I run into trouble with specific characters, I usually fall back to their hex equivalents; asciitable.com has a good reference.)
EDIT Forgot to group for replacement.
EDITv2 The below worked for me:
String string1 = "sdlfkjsld#132;#sdfsdfsdf#1;#sdfsdfsf#34d;#sdfs";
String string2 = System.Text.RegularExpressions.Regex.Replace(string1, @"(\x23\d+\x3B\x23)", String.Empty);
Console.WriteLine("from: {0}\r\n to: {1}", string1, string2);
Output:
from: sdlfkjsld#132;#sdfsdfsdf#1;#sdfsdfsf#34d;#sdfs
to: sdlfkjsldsdfsdfsdfsdfsdfsf#34d;#sdfs
Press any key to continue . . .
You don't need a character class when using \d, and as SLaks points out, you need + to match one or more digits. Also, since you're not capturing anything, the parentheses are redundant too, so something like this should do it:
string st = Regex.Replace(string1, @"#\d+;#", string.Empty);
You may need to escape the # symbols, since they're usually interpreted as comment markers, in addition to @SLaks's comment about using + to allow multiple digits.
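To make the escaping question concrete, here is a minimal sketch (the class and method names are made up for illustration). It assumes the standard .NET behaviour that an unescaped # only starts a comment when RegexOptions.IgnorePatternWhitespace is set, so the plain pattern needs no escaping, while the verbose variant does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PoundMarkerDemo
{
    // Plain pattern: '#' is an ordinary character here, so no escaping is needed.
    public static string StripMarkers(string input)
    {
        return Regex.Replace(input, @"#\d+;#", string.Empty);
    }

    // Verbose pattern: with IgnorePatternWhitespace an unescaped '#' begins a
    // comment, so each literal '#' has to be written as '\#'.
    public static string StripMarkersVerbose(string input)
    {
        return Regex.Replace(input, @"\# \d+ ; \#", string.Empty,
            RegexOptions.IgnorePatternWhitespace);
    }

    public static void Main()
    {
        string sample = "sdlfkjsld#132;#sdfsdfsdf#1;#sdfsdfsf#34d;#sdfs";
        Console.WriteLine(StripMarkers(sample));
        Console.WriteLine(StripMarkersVerbose(sample));
    }
}
```

Both calls should leave #34d;# in place, because the d between the digits and the semicolon prevents a match.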
I have the following string:
"121 fd412 4151 3213, 421, 423 41241 fdsfsd"
And I need to get 3213 and 421, because they both have a space in front of them and a comma behind them.
The result will be set inside the string array...How can I do that?
"\\d+" catches every integer.
"\s\\d+(,)" throws some memory errors.
EDIT.
space to the left (<-) of the number, comma to the right (->)
EDIT 2.
string mainString = "Tests run: 5816, 8346, 28364 iansufbiausbfbabsbo3 4";
MatchCollection c = Regex.Matches(mainString, @"\d+(?=\,)");
var myList = new List<String>();
foreach (Match match in c)
{
    myList.Add(match.Value);
}
Console.Write(myList[1]);
Console.ReadKey();
Your regex syntax is incorrect if you want to match both numbers. If you want them as separate results, you could do:
@"\s(\d+),\s(\d+)\s"
Live Demo
Edit
@"\s(\d+),"
Live Demo
\s\\d+(,):
\s is not properly escaped, should be \\s, same as for \\d
\\d matches single digit, you need \\d+ - one or more consecutive digits
(,) captures comma, do you really need this? seems like you need to capture a number, so \\s(\\d+),
you said "because they both have space behind them, and a comma in front", so probably ,\\s(\\d+)
How about this expression:
" \d+," // expression without the quotes
It should find what you need.
You can read up on how to work with regular expressions on MSDN.
Hope it helps
Another solution
\s(\d+), // or maybe you'll need a double slash \\
Output:
3213
421
Demo
I think you mean you're looking for something like ,<space><digit> not ,<digit><space>
If so, try this:
, (\d+) //you might need to add another backslash as the others have noted
Well, based on your new edit
\s(\d+),
Test it here
It's all you need, only the numbers
\d+(?=\,)
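Pulling the suggestions together, the following sketch collects the matches into a string array, as the question asks. It combines a lookbehind for the leading space with the lookahead for the trailing comma, so each match value is the number alone; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Returns every run of digits that has whitespace directly before it
    // and a comma directly after it. The lookarounds keep both out of the match.
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, @"(?<=\s)\d+(?=,)")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string[] numbers = Extract("121 fd412 4151 3213, 421, 423 41241 fdsfsd");
        Console.WriteLine(string.Join(" ", numbers)); // prints "3213 421"
    }
}
```

The capture-group form \s(\d+), works as well; in that case read match.Groups[1].Value instead of match.Value.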
I need some advice. Suppose I have the following string: Read Variable
I want to find all pieces of text like this in a string and turn each of them into the following: Variable = MessageBox.Show. Some additional examples:
"Read Dog" --> "Dog = MessageBox.Show"
"Read Cat" --> "Cat = MessageBox.Show"
Can you help me? I need some quick advice on using Regex in C#. I think it is a job involving wildcards, but I do not know how to use them very well... Also, I need this for a school project tomorrow... Thanks!
Edit: This is what I have done so far and it does not work: Regex.Replace(String, "Read ", " = Messagebox.Show").
You can do this
string ns = Regex.Replace(yourString, @"Read\s+(.*?)(?:\s|$)", "$1 = MessageBox.Show");
\s+ matches 1 to many space characters
(.*?)(?:\s|$) matches 0 to many characters till the first space (i.e \s) or till the end of the string is reached(i.e $)
$1 represents the first captured group i.e (.*?)
You might want to clarify your question... but here goes:
If you want to match the next word after "Read " in regex, use Read (\w*) where \w is the word character class and * is the greedy match operator.
If you want to match everything after "Read " in regex, use Read (.*)$ where . will match all characters and $ means end of line.
With either regex, you can use a replace of $1 = MessageBox.Show as $1 will reference the first matched group (which was denoted by the parenthesis).
Complete code:
replacedString = Regex.Replace(inStr, @"Read (.*)$", "$1 = MessageBox.Show");
The problem with your attempt is that it cannot know that the replacement string should be inserted after your variable. Let's assume that valid variable names contain letters, digits and underscores (which can be conveniently matched with \w). That means any other character ends the variable name. Then you could match the variable name, capture it (using parentheses) and put it in the replacement string with $1:
output = Regex.Replace(input, @"Read\s+(\w+)", "$1 = MessageBox.Show");
Note that \s+ matches one or more arbitrary whitespace characters. \w+ matches one or more letters, digits and underscores. If you want to restrict variable names to letters only, this is the place to change it:
output = Regex.Replace(input, @"Read\s+([a-zA-Z]+)", "$1 = MessageBox.Show");
Here is a good tutorial.
Finally, note that in C# it is advisable to write regular expressions as verbatim strings (@"..."). Otherwise, you will have to double-escape everything so that the backslashes get through to the regex engine, and that really lessens the readability of the regex.
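As a small runnable sketch of the capture-and-substitute idea from this thread: the class and method names are hypothetical, and the leading \b is an added assumption so that a word merely ending in "Read" is not rewritten.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReadRewriter
{
    // Captures the identifier that follows "Read" and reuses it via $1.
    // \b is an assumption added here, not part of the answers above.
    public static string Rewrite(string source)
    {
        return Regex.Replace(source, @"\bRead\s+(\w+)", "$1 = MessageBox.Show");
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite("Read Dog")); // prints "Dog = MessageBox.Show"
        Console.WriteLine(Rewrite("Read Cat")); // prints "Cat = MessageBox.Show"
    }
}
```

Because Regex.Replace rewrites every match, a string containing several "Read X" statements is converted in a single call.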
Is there a way to use wildcards to define the following:
I would like a "\"" to come before and after a comma when the comma does not already have a "\"" before it or after it.
I am a little unsure how to do the negation.
EDIT Sample data:
"col1,col2,col3"
should become
"\"col1\",\"col2\",\"col3\""
where "\"" just means a quote string
Use the "negative look behind" assertion:
(?<!\\),
Can't give you a better answer without having sample input/output.
Try (?<!\"),(?!\"), which uses zero-width assertions.
I'm busy now; I'll explain later, sorry about that.
Replace everything that matches the following: ^(\\\"),^(\\\") with: \",\"
It means anything but a backslash followed by a quote, followed by a comma, followed by anything but a backslash followed by a quote.
Use regular expressions or a simple replace:
string s = "col1,col2\",\"col3";
// remove all existing quotes, then replace every comma with a quoted comma
string r = s.Replace("\"", "").Replace(",", "\",\"");
// r = "col1\",\"col2\",\"col3"
But this does not produce what your sample data shows:
"col1,col2,col3" should become "col1\",\"col2\",\"col3\""
This isn't following your rule (look at the trailing \" !). Maybe you want to wrap all columns, so you can add a \" at the beginning and the end, too. (Assuming the separator is always just a comma, with no surrounding spaces.)
I know this thread is a bit old but for the new visitors this can also be done:
string sample = "col1,col2,col3";
string result = sample.Replace("\"", "");
result = "\"" + result.Replace(",", "\",\"") + "\"";
Hope it helps!
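For comparison, here is a sketch of the lookaround approach suggested earlier in this thread, assuming the goal is the fully quoted form from the sample data. The class and method names are hypothetical, and wrapping the ends is an assumption taken from the sample output rather than from the stated rule.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColumnQuoter
{
    public static string QuoteColumns(string line)
    {
        // Replace only commas that have no quote directly before or after them.
        string inner = Regex.Replace(line, @"(?<!""),(?!"")", "\",\"");

        // Assumption: the first and last column should be wrapped as well.
        if (!inner.StartsWith("\"", StringComparison.Ordinal)) inner = "\"" + inner;
        if (!inner.EndsWith("\"", StringComparison.Ordinal)) inner += "\"";
        return inner;
    }

    public static void Main()
    {
        Console.WriteLine(QuoteColumns("col1,col2,col3")); // prints "col1","col2","col3"
    }
}
```

A comma that has a quote on only one side is left untouched by this pattern, so input that is quoted inconsistently would need the remove-then-requote approach shown above instead.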
I'm after a regex for C# which will turn this:
"*one*" *two** two and a bit "three four"
into this:
"*one*" "*two**" two and a bit "three four"
I.e., a quoted string should be unchanged whether it contains one or many words.
Any words with asterisks to be wrapped in double quotes.
Any unquoted words with no asterisks to be unchanged.
Nice to haves:
If multiple asterisks could be merged into one in the same step that would be better.
Noise words - eg and, a, the - which are not part of a quoted string should be dumped.
Thanks for any help / advice.
Julio
The following regex will do what you're looking for:
\*+ # Match 1 or more *
(
\w+ # Capture character string
)
\*+ # Match 1 or more *
If you use this in conjunction with this replace statement, every word matched by (\w+) will be wrapped in "*word*":
string s = "\"one\" *two** two and a bit \"three four\"";
Regex r = new Regex(@"\*+(\w+)\*+");
var output = r.Replace(s, @"""*$1*""");
Note: This will leave the below string unquoted:
*two two*
If you wish to match those strings as well, use this regex:
\*+([^*]+)\*+
EDIT: updated code.
This solution works for your request, as well as the nice to have items:
string text = @"test the ""one"" and a *two** two and a the bit ""three four"" a";
string result = Regex.Replace(text, @"\*+(.*?)\*+", @"""*$1*""");
string noiseWordsPattern = @"(?<!"") # match if double quote prefix is absent
    \b # word boundary to prevent partial word matches
    (and|a|the) # noise words
    \b # word boundary
    (?!"") # match if double quote suffix is absent
    ";
// to use the commented pattern use RegexOptions.IgnorePatternWhitespace
result = Regex.Replace(result, noiseWordsPattern, "", RegexOptions.IgnorePatternWhitespace);
// or use this one line version instead
// result = Regex.Replace(result, @"(?<!"")\b(and|a|the)\b(?!"")", "");
// remove extra spaces resulting from noise words replacement
result = Regex.Replace(result, @"\s+", " ");
Console.WriteLine("Original: {0}", text);
Console.WriteLine("Result: {0}", result);
Output:
Original: test the "one" and a *two** two and a the bit "three four" a
Result: test "one" "*two*" two bit "three four"
The 2nd regex replacement for noise words causes potential duplicate of blank spaces. To remedy this side effect I added the 3rd regex replacement to clean it up.
Something like this. ArgumentReplacer is a callback that is called for each match. The return value is substituted into the returned string.
static void Main() {
    string text = "\"one\" *two** and a bit \"three *** four\"";
    string finderRegex = @"
        (""[^""]*"")            # quoted
      | ([^\s""*]*\*[^\s""]*)   # with asterisks
      | ([^\s""]+)              # without asterisks
    ";
    Console.WriteLine(Regex.Replace(text, finderRegex, ArgumentReplacer,
        RegexOptions.IgnorePatternWhitespace));
}
public static String ArgumentReplacer(Match theMatch) {
    // Don't touch quoted arguments, and arguments with no asterisks
    if (theMatch.Groups[2].Value.Length == 0)
        return theMatch.Value;
    // Quote arguments with asterisks, and replace sequences of such
    // by a single one.
    return String.Format("\"{0}\"",
        Regex.Replace(theMatch.Value, @"\*\*+", "*"));
}
Alternatives to the left in the pattern have priority over those to the right. This is why I just needed to write "[^\s""]+" in the last alternative.
The quotes, on the other hand, are only matched if they occur at the beginning of the argument. They will not be detected if they occur in the middle of the argument, and we must stop before those if they occur.
Given that you wish to match pairs of quotes, I don't think your language is regular, and therefore I don't think regex is a good solution. To quote:
Some people, when confronted with a problem, think “I know, I'll use
regular expressions.”
Now they have two problems.
See "When not to use Regex in C# (or Java, C++ etc)"
I've decided to follow the advice of a couple of responses and go with a parser solution. I've tried the regexes contributed so far and they seem to fail in some cases. That's probably an indication that regexes aren't the appropriate solution to this problem. Thanks for all responses.
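For readers wondering what such a parser might look like, here is one possible sketch, not the code the asker ended up with. It walks the input once, copies quoted phrases through unchanged, wraps bare words containing asterisks in quotes (collapsing runs of asterisks), and drops noise words. The class name, method name, and noise-word list are all assumptions taken from the examples in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QueryNormalizer
{
    // Hypothetical noise-word list, based on the examples in the question.
    private static readonly HashSet<string> NoiseWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "and", "a", "the" };

    public static string Normalize(string input)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < input.Length)
        {
            if (char.IsWhiteSpace(input[i])) { i++; continue; }

            int start = i;
            if (input[i] == '"')
            {
                // Quoted phrase: copy through the closing quote (or end of input).
                int close = input.IndexOf('"', i + 1);
                i = close < 0 ? input.Length : close + 1;
                tokens.Add(input.Substring(start, i - start));
                continue;
            }

            // Bare word: read up to the next whitespace or quote.
            while (i < input.Length && !char.IsWhiteSpace(input[i]) && input[i] != '"') i++;
            string word = input.Substring(start, i - start);

            if (word.Contains("*"))
                tokens.Add("\"" + Regex.Replace(word, @"\*+", "*") + "\""); // collapse and quote
            else if (!NoiseWords.Contains(word))
                tokens.Add(word); // ordinary word: keep unless it is a noise word
        }
        return string.Join(" ", tokens);
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("\"*one*\" *two** two and a bit \"three four\""));
        // prints: "*one*" "*two*" two bit "three four"
    }
}
```

Unlike the single-regex attempts, the quoted and unquoted cases are handled by separate branches, which makes edge cases such as an unterminated quote easy to reason about.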
Given $displayHeight = "800";, replace whatever number is at 800 with int value y_res.
resultString = Regex.Replace(
    im_cfg_contents,
    @"\$displayHeight[\s]*=[\s]*""(.*)"";",
    Convert.ToString(y_res));
In Python I'd use re.sub and it would work. In .NET it replaces the whole line, not the matched group.
What is a quick fix?
Building on a couple of the answers already posted. The Zero-width assertion allows you to do a regular expression match without placing those characters in the match. By placing the first part of the string in a group we've separated it from the digits that you want to be replaced. Then by using a zero-width lookbehind assertion in that group we allow the regular expression to proceed as normal but omit the characters in that group in the match. Similarly, we've placed the last part of the string in a group, and used a zero-width lookahead assertion. Grouping Constructs on MSDN shows the groups as well as the assertions.
resultString = Regex.Replace(
    im_cfg_contents,
    @"(?<=\$displayHeight[\s]*=[\s]*"")(.*)(?="";)",
    Convert.ToString(y_res));
Another approach would be to use the following code. The modification to the regular expression is just placing the first part in a group and the last part in a group. Then in the replace string, we add back in the first and third groups. Not quite as nice as the first approach, but not quite as bad as writing out the $displayHeight part. Substitutions on MSDN shows how the $ characters work.
resultString = Regex.Replace(
    im_cfg_contents,
    @"(\$displayHeight[\s]*=[\s]*"")(.*)("";)",
    "${1}" + Convert.ToString(y_res) + "${3}");
Try this:
resultString = Regex.Replace(
    im_cfg_contents,
    @"\$displayHeight[\s]*=[\s]*""(.*)"";",
    @"$$displayHeight = """ + Convert.ToString(y_res) + @""";");
It replaces the whole string because you've matched the whole string - nothing about this statement tells C# to replace just the matched group, it will find and store that matched group sure, but it's still matching the whole string overall.
You can either change your replacer to:
@"$$displayHeight = """ + Convert.ToString(y_res) + @""";"
..or you can change your pattern to just match the digits, i.e.:
@"[0-9]+"
..or you could see if C# regex supports lookarounds (I'm not sure if it does offhand) and change your match accordingly.
You could also try this, though I think it is a little slower than my other method:
resultString = Regex.Replace(
im_cfg_contents,
"(?<=\\$displayHeight[\\s]*=[\\s]*\").*(?=\";)",
Convert.ToString(y_res));
Check this pattern out
(?<=(\$displayHeight\s*=\s*"))\d+(?=";)
A word about "lookaround".
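The two main approaches from this thread can be compared side by side in the sketch below. The class and method names are hypothetical, and the value is matched with \d+ rather than the question's (.*), on the assumption that it is always numeric.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DisplayHeightEditor
{
    // Approach 1: capture the text around the number and put it back.
    // ${1} is written with braces so the digits that follow are not read
    // as part of the group number.
    public static string SetWithGroups(string config, int yRes)
    {
        return Regex.Replace(config,
            @"(\$displayHeight\s*=\s*"")\d+("";)",
            "${1}" + yRes + "${2}");
    }

    // Approach 2: lookarounds keep the surrounding text out of the match,
    // so only the digits themselves are replaced.
    public static string SetWithLookarounds(string config, int yRes)
    {
        return Regex.Replace(config,
            @"(?<=\$displayHeight\s*=\s*"")\d+(?="";)",
            yRes.ToString());
    }

    public static void Main()
    {
        string config = "$displayHeight = \"800\";";
        Console.WriteLine(SetWithGroups(config, 1080));      // prints $displayHeight = "1080";
        Console.WriteLine(SetWithLookarounds(config, 1080)); // prints $displayHeight = "1080";
    }
}
```

The lookbehind in the second method contains \s*, which works because the .NET regex engine allows variable-length lookbehind; many other engines do not.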