Regex.Matches c# double quotes - c#

I have the code below, which works for single quotes: it finds all the words between the single quotes.
How would I modify the regex to work with double quotes?
keywords comes from a form post, so for example:
keywords = 'peace "this world" would be "and then" some'
// Match all quoted fields
MatchCollection col = Regex.Matches(keywords, @"'(.*?)'");
// Copy groups to a string[] array
string[] fields = new string[col.Count];
for (int i = 0; i < fields.Length; i++)
{
    fields[i] = col[i].Groups[1].Value; // (Index 1 is the first group)
}

Replace the ' with \" and drop the verbatim-string prefix (@), so the pattern is written as a regular string literal with escaped quotes:
MatchCollection col = Regex.Matches(keywords, "\\\"(.*?)\\\"");

Exactly the same, but with double quotes in place of single quotes. Double quotes aren't special in a regex pattern. But I usually add something to make sure I'm not spanning across multiple quoted strings in a single match, and to accommodate double-double-quote escapes:
string pattern = @"""([^""]|"""")*""";
// or (same thing):
string pattern = "\"([^\"]|\"\")*\"";
Which translates to the literal string
"([^"]|"")*"

Use this regex:
"(.*?)"
or
"([^"]*)"
In C#:
var pattern = "\"(.*?)\"";
or
var pattern = "\"([^\"]*)\"";
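As a minimal sketch of how the second pattern slots into the question's code, the capture group can be copied out with LINQ instead of a manual loop. The class and method names (QuotedFields, Extract) are made up for illustration and are not part of any answer above.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedFields
{
    // Hypothetical helper: returns the text found between each pair of double quotes.
    public static string[] Extract(string keywords)
    {
        // [^"]* cannot run past a closing quote, so each match stays inside one quoted phrase.
        return Regex.Matches(keywords, "\"([^\"]*)\"")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value) // group 1 is the text without the quotes
                    .ToArray();
    }
}
```

For the input from the question, peace "this world" would be "and then" some, Extract should return the two strings this world and and then.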

Do you want to match " or ' ?
in which case you might want to do something like this:
[Test]
public void Test()
{
    string input = "peace \"this world\" would be 'and then' some";
    MatchCollection matches = Regex.Matches(input, @"(?<=([\'\""])).*?(?=\1)");
    Assert.AreEqual("this world", matches[0].Value);
    Assert.AreEqual("and then", matches[1].Value);
}

Related

Replace only 'n' occurrences of a substring in a string in C#

I have an input string like this:
abbdabab
How can I replace only the 2nd, 3rd and subsequent occurrences of the substring "ab" with an arbitrary string such as "x", one occurrence at a time, keeping the rest of the original string intact? For this example:
1st output - xbdabab
2nd output - abbdxab
3rd output - abbdabx
and so on...
I have tried using Regex like this:
int occCount = Regex.Matches("abbdabab", "ab").Count;
if (occCount > 1)
{
    for (int i = 1; i <= occCount; i++)
    {
        Regex regReplace = new Regex("ab");
        string modifiedValue = regReplace.Replace("abbdabab", "x", i);
        //decodedMessages.Add(modifiedValue);
    }
}
Here I get the 1st output when the counter i is 1, but not the subsequent results. Is there an overload of Replace that could achieve this? Or can anyone point out where I might have gone wrong?
You can try IndexOf instead of regular expressions:
string source = "abbdabab";
string toFind = "ab";
string toSet = "X";
for (int index = source.IndexOf(toFind);
     index >= 0;
     index = source.IndexOf(toFind, index + 1)) {
    string result = source.Substring(0, index) +
                    toSet +
                    source.Substring(index + toFind.Length);
    Console.WriteLine(result);
}
Outcome:
Xbdabab
abbdXab
abbdabX
You can use a StringBuilder:
string s = "abbdabab";
var matches = Regex.Matches(s, "ab");
StringBuilder sb = new StringBuilder(s);
var m = matches[0]; // 0 for first output, 1 for second output, and so on
sb.Remove(m.Index, m.Length);
sb.Insert(m.Index, "x");
var result = sb.ToString();
Console.WriteLine(result);
You may use a dynamically built regex to be used with regex.Replace directly:
var s = "abbdabab";
var idx = 1; // First = 1, Second = 2
var search = "ab";
var repl = "x";
var pat = new Regex($@"(?s)((?:{search}.*?){{{idx-1}}}.*?){search}"); // ((?:ab.*?){0}.*?)ab
Console.WriteLine(pat.Replace(s, $"${{1}}{repl}", 1));
The pattern will look like ((?:ab.*?){0}.*?)ab and will match
(?s) - RegexOptions.Singleline to make . also match newlines
((?:ab.*?){0}.*?) - Group 1 (later, this value will be put back into the result with ${1} backreference)
(?:ab.*?){0} - 0 occurrences of ab followed with any 0+ chars as few as possible
.*? - any 0+ chars as few as possible
ab - the search string/pattern.
The last argument to pat.Replace is 1, so only the first match of the whole pattern is replaced.
If search is literal text rather than a pattern, escape it first, e.g. var search = Regex.Escape("a+b");.
If repl can contain $, add repl = repl.Replace("$", "$$");.
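The pattern and the two caveats above can be folded into one helper. This is only a sketch under the assumption that search is always meant as literal text; the names NthReplace and ReplaceNthLiteral are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NthReplace
{
    // Hypothetical helper: replaces only the n-th (1-based) occurrence of the literal text 'search'.
    public static string ReplaceNthLiteral(string s, string search, string repl, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        string esc = Regex.Escape(search);          // treat 'search' as plain text, not as a pattern
        string safeRepl = repl.Replace("$", "$$");  // keep any '$' in the replacement literal

        // Same shape as the pattern above: group 1 captures everything before the n-th occurrence.
        var pat = new Regex($@"(?s)((?:{esc}.*?){{{n - 1}}}.*?){esc}");

        // Put group 1 back, then the replacement; count = 1 limits this to the first match.
        return pat.Replace(s, "${1}" + safeRepl, 1);
    }
}
```

With the question's input, ReplaceNthLiteral("abbdabab", "ab", "x", 2) should give abbdxab, and an input with fewer than n occurrences is returned unchanged.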

Regex replace the nth element from string and adding custom HTML Tags

I have a list of words "this", "be able", "it" that I want to find inside a paragraph so I can replace preserving their capitalization.
Given this paragraph:
This is my text and this is why I want to match it! As this is just a
text, I would like to be able to solve it. This is the final phrase of
this paragraph.
"this" is found 5 times, and if I decide to replace the 4th one ("This") I still want to keep the capital T. You will see that this is not really a replace but more of an adding problem, as the actual replacement would be from This to <strong>This</strong>,
so my final paragraph would be:
This is my text and this is why I want to match it! As this is just a text, I would like to be able to solve it. <strong>This</strong> is the final phrase of this paragraph.
My code so far:
List<string> words = new List<string>(new string[] { "this", "be able", "it"});
var paragraph = "This is my text and this is why I want to match it! As this is just a text, I would like to be able to solve it. This is the final phrase of this paragraph.";
//List<string>
for (int w = 0; w < words.Count; w++)
{
    var foudItems = Regex.Matches(paragraph, @"\b" + words[w] + "\\b", RegexOptions.IgnoreCase);
    if (foudItems.Count != 0)
    {
        Random rnd = new Random();
        int rndWord = rnd.Next(0, foudItems.Count);
        Regex.Replace(paragraph, @"\b" + words[w] + "\\b", "<strong>" + foudItems[rndWord] + "</strong>");
        Console.WriteLine(paragraph);
    }
    //Regex.Replace()
    Console.WriteLine(foudItems[0] + " " + foudItems[1]);
}
The main problem is that I don't know how to replace only the n'th word using regex. Another issue would be the complicated approach in solving this so I'm open to new suggestions.
If you want to replace the nth occurrence of something, you can use a MatchEvaluator delegate that checks the current occurrence index and returns the matched value unmodified when the index is not the one you want to replace. To track the current index, you can capture a local variable:
int occurrenceToReplace = 4;
int index = 0;
MatchEvaluator evaluator = m => (++index == occurrenceToReplace)
    ? $"<strong>{m.Value}</strong>"
    : m.Value;
text = Regex.Replace(text, @"\bthis\b", evaluator, RegexOptions.IgnoreCase);
Now back to your problem: you can write a method that wraps the nth occurrence of a given word in an HTML tag:
private static string MakeStrong(string text, string word, int occurrence)
{
    int index = 0;
    MatchEvaluator evaluator = m => (++index == occurrence)
        ? $"<strong>{m.Value}</strong>"
        : m.Value;
    return Regex.Replace(text, $@"\b{word}\b", evaluator, RegexOptions.IgnoreCase);
}
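For reference, here is a self-contained variant of the same idea that can be compiled on its own. It is a sketch, not the answer's exact code: the names Highlighter and WrapNth are hypothetical, and it additionally passes the word through Regex.Escape on the assumption that the words are plain text rather than patterns.

```csharp
using System.Text.RegularExpressions;

public static class Highlighter
{
    // Hypothetical helper: wraps the n-th (1-based) whole-word match of 'word' in <strong> tags,
    // keeping the matched text's original capitalization.
    public static string WrapNth(string text, string word, int occurrence)
    {
        int index = 0; // captured by the lambda; counts the matches seen so far
        return Regex.Replace(
            text,
            $@"\b{Regex.Escape(word)}\b",
            m => ++index == occurrence ? $"<strong>{m.Value}</strong>" : m.Value,
            RegexOptions.IgnoreCase);
    }
}
```

Calling WrapNth on the question's paragraph with "this" and 4 should wrap the capitalized This that starts the last sentence and leave the other four occurrences untouched.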
And if you want to randomly replace one of the occurrences of each word, just use this method in a loop:
string[] words = { "this", "be able", "it"};
var paragraph = @"This is my text and this is why I want to match it! As this is just
a text, I would like to be able to solve it. This is the final phrase of this paragraph.";
var random = new Random();
foreach(var word in words)
{
    int count = Regex.Matches(paragraph, $@"\b{word}\b", RegexOptions.IgnoreCase).Count;
    // MakeStrong counts occurrences from 1, so pick a value in [1, count]
    int occurrence = random.Next(1, count + 1);
    paragraph = MakeStrong(paragraph, word, occurrence);
}
Sample output:
This is my text and this is why I want to match
it! As this is just a text, I would like to
be able to solve it. This is the final phrase of this
paragraph.
If you want to keep the regex side quite simple, you can use this algorithm:
List<string> words = new List<string>(new string[] { "this", "be able", "it" });
var paragraph = "This is my text and this is why I want to match it! As this is just a text, I would like to be able to solve it. This is the final phrase of this paragraph.";
//List<string>
foreach (string word in words)
{
    var foundItems = Regex.Matches(paragraph, @"\b" + word + @"\b", RegexOptions.IgnoreCase);
    if (foundItems.Count != 0)
    {
        var count = 0;
        var toReplace = 3;
        foreach (Match foundItem in foundItems)
        {
            count++;
            if (count != toReplace)
                continue;
            // Anchor on the match's character offset so only this occurrence is rewritten
            var regex = $"(^.{{{foundItem.Index}}}){foundItem.Value}(.*)";
            paragraph = Regex.Replace(paragraph, regex, $"$1<strong>{foundItem.Value}</strong>$2");
        }
        Console.WriteLine(paragraph);
    }
    // Guard the debug output: a word such as "be able" has only one match
    if (foundItems.Count > 1)
        Console.WriteLine(foundItems[0] + " " + foundItems[1]);
}

Change in string some part, but without one part - where are numbers

For example, I have a string like this:
ex250-r-ninja-08-10r_
How could I change it to this?
ex250 r ninja 08-10r_
As you can see, I change every - to a space, except where I have an XX-XX part (digits on both sides). How could I do such a string replacement in C#? (The string can also have a different length.)
This is what I do for -:
string correctString = errString.Replace("-", " ");
But how do I leave the - in place where the number pattern XX-XX occurs?
You can use regular expressions to only perform substitutions in certain cases. In this case, you want to perform a substitution if either side of the dash is a non-digit. That's not quite as simple as it might be, but you can use:
string ReplaceSomeHyphens(string input)
{
    string result = Regex.Replace(input, @"(\D)-", "${1} ");
    result = Regex.Replace(result, @"-(\D)", " ${1}");
    return result;
}
It's possible that there's a more cunning way to do this in a single regular expression, but I suspect that it would be more complicated too :)
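One possible single-expression version, offered as a sketch rather than as the answerer's method, uses lookarounds so that the neighbouring characters are tested but not consumed. The class name HyphenFixer is hypothetical, and equivalence with the two-pass version has only been checked by hand on the question's sample.

```csharp
using System.Text.RegularExpressions;

public static class HyphenFixer
{
    // Replaces a hyphen with a space when the character before it or the
    // character after it is a non-digit; a hyphen between two digits is kept.
    public static string ReplaceSomeHyphens(string input)
    {
        // (?<=\D)-  : hyphen preceded by a non-digit
        // -(?=\D)   : hyphen followed by a non-digit
        return Regex.Replace(input, @"(?<=\D)-|-(?=\D)", " ");
    }
}
```

For ex250-r-ninja-08-10r_ this should produce ex250 r ninja 08-10r_, because only the hyphen in 08-10 has a digit on both sides.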
A very uncool approach using a StringBuilder. It replaces each - with a space unless both the two characters before it and the two characters after it are digits.
// Needs using System.Linq; (for All) and using System.Text; (for StringBuilder)
StringBuilder sb = new StringBuilder();
for (int i = 0; i < text.Length; i++)
{
    bool replace = false;
    char c = text[i];
    if (c == '-')
    {
        // Too close to either end to have two characters on both sides
        if (i < 2 || i >= text.Length - 2) replace = true;
        else
        {
            bool leftDigit = text.Substring(i - 2, 2).All(Char.IsDigit);
            bool rightDigit = text.Substring(i + 1, 2).All(Char.IsDigit);
            replace = !leftDigit || !rightDigit;
        }
    }
    if (replace)
        sb.Append(' ');
    else
        sb.Append(c);
}
Since you say you won't have hyphens at the start of your string, you need to capture every occurrence of - that is preceded by a group of characters containing at least one letter followed by zero or more digits. To achieve this, use a positive lookbehind in your regex.
string strRegex = @"(?<=[a-z]+[0-9]*)-";
Regex myRegex = new Regex(strRegex, RegexOptions.IgnoreCase | RegexOptions.Multiline);
string strTargetString = @"ex250-r-ninja-08-10r_";
string strReplace = @" ";
return myRegex.Replace(strTargetString, strReplace);
Result:
ex250 r ninja 08-10r_

How to split a string so that the delimiters remain at the end of each result?

I have several delimiters. For example {del1, del2, del3 }.
Suppose I have text : Text1 del1 text2 del2 text3 del3
I want to split string in such way:
Text1 del1
text2 del2
text3 del3
I need to get an array of strings, where every element of the array is texti deli.
How can I do this in C# ?
String.Split allows multiple split delimiters. I don't know if that fits your question, though.
Example:
String text = "Test;Test1:Test2#Test3";
var split = text.Split(';', ':', '#');
//split contains an array of "Test", "Test1", "Test2", "Test3"
Edit: you can use a regex to keep the delimiters.
String text = "Test;Test1:Test2#Test3";
var split = Regex.Split(text, @"(?<=[;:#])");
// contains "Test;", "Test1:", "Test2#","Test3"
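The same lookbehind idea can be extended from single characters to the multi-character delimiters in the question. The following is a rough sketch: the names DelimiterSplitter and SplitKeeping are hypothetical, at least one delimiter is assumed, and trimming plus dropping empty pieces is an added assumption about the desired output.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterSplitter
{
    // Hypothetical helper: splits right after each delimiter so the delimiter
    // stays at the end of the piece it terminates.
    public static string[] SplitKeeping(string input, params string[] delimiters)
    {
        // Builds (?<=del1|del2|del3): a zero-width position directly after any delimiter.
        string alternatives = string.Join("|", delimiters.Select(d => Regex.Escape(d)));
        string pattern = "(?<=" + alternatives + ")";

        return Regex.Split(input, pattern)
                    .Select(p => p.Trim())      // drop the space left at the start of each piece
                    .Where(p => p.Length > 0)   // drop an empty piece after a trailing delimiter
                    .ToArray();
    }
}
```

For the text in the question, SplitKeeping("Text1 del1 text2 del2 text3 del3", "del1", "del2", "del3") should return the three pieces Text1 del1, text2 del2 and text3 del3.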
This should do the trick:
const string input = "text1-text2;text3-text4-text5;text6--";
const string matcher = "(-|;)";
// Because the pattern is a capturing group, Regex.Split also returns the delimiters as separate elements
string[] substrings = Regex.Split(input, matcher);
StringBuilder builder = new StringBuilder();
foreach (string entry in substrings)
{
    builder.Append(entry);
}
Console.Out.WriteLine(builder.ToString());
Note that you will receive empty strings in your substrings array for the matches of the two '-'s at the end; you can ignore those values or handle them however you like.
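You could use a regex. For a string like this "text1;text2|text3^" you could use this:
(.*;|.*\||.*\^)
Just add more alternative patterns for each delimiter. Note that .* is greedy, so if the same delimiter occurs more than once, everything up to its last occurrence comes back as a single match.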
If you want to keep the delimiter when splitting the string, you can use the following:
string[] delimiters = { "del1", "del2", "del3" };
string input = "text1del1text2del2text3del3";
string[] parts = input.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
for (int index = 0; index < parts.Length; index++)
{
    string part = parts[index];
    string temp = input.Substring(input.IndexOf(part) + part.Length);
    foreach (string delimiter in delimiters)
    {
        if (temp.IndexOf(delimiter) == 0)
        {
            parts[index] += delimiter;
            break;
        }
    }
}
parts will then be:
[0] "text1del1"
[1] "text2del2"
[2] "text3del3"
As @Matt Burland suggested, use Regex:
List<string> values = new List<string>();
string s = "abc123;def456-hijk,";
Regex r = new Regex(@"(.*;|.*-|.*,)");
foreach (Match m in r.Matches(s))
    values.Add(m.Value);

Replace placeholders in order

I have a part of a URL like this:
/home/{value1}/something/{anotherValue}
Now I want to replace each placeholder in braces with values from a string array.
I tried this RegEx pattern: \{[a-zA-Z_]\} but it doesn't work.
Later (in C#) I want to replace the first match with the first value of the array, the second with the second, and so on.
Update: The /'s can't be used to separate. Only the placeholders {...} should be replaced.
Example: /home/before{value1}/and/{anotherValue}
String array: {"Tag", "1"}
Result: /home/beforeTag/and/1
I hoped it could work like this:
string input = @"/home/before{value1}/and/{anotherValue}";
string pattern = @"\{[a-zA-Z_]\}";
string[] values = {"Tag", "1"};
MatchCollection mc = Regex.Match(input, pattern);
for(int i, ...)
{
mc.Replace(values[i];
}
string result = mc.GetResult;
Edit:
Thank you Devendra D. Chavan and ipr101,
both solutions are great!
You can try this code fragment:
// Begin with '{' followed by any number of word-like characters and then end with '}'
var pattern = @"{\w*}";
var regex = new Regex(pattern);
var replacementArray = new [] {"abc", "cde", "def"};
var sourceString = @"/home/{value1}/something/{anotherValue}";
var matchCollection = regex.Matches(sourceString);
for (int i = 0; i < matchCollection.Count && i < replacementArray.Length; i++)
{
    sourceString = sourceString.Replace(matchCollection[i].Value, replacementArray[i]);
}
[a-zA-Z_] describes a character class, which matches a single character. For whole words, you'll have to add * at the end (any number of characters within a-zA-Z_).
Then, to have 'value1' captured, you'll need to add digit support: [a-zA-Z0-9_]*, which can be summarized as \w*
So try this one: {\w*}
But for replacing in C#, string.Split('/') might be easier, as Fredrik proposed.
You could use a delegate, something like this:
string[] strings = {"dog", "cat"};
int counter = -1;
string input = @"/home/{value1}/something/{anotherValue}";
Regex reg = new Regex(@"\{([a-zA-Z0-9]*)\}");
string result = reg.Replace(input, delegate(Match m) {
    counter++;
    // Return the value without braces, as in the expected result /home/beforeTag/and/1
    return strings[counter];
});
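A slightly more defensive take on the delegate approach is sketched below: it leaves a placeholder in place when the array runs out of values instead of throwing. The names PlaceholderFiller and Fill are hypothetical, and matching anything between braces with [^}]* is an assumption about what a placeholder may contain.

```csharp
using System.Text.RegularExpressions;

public static class PlaceholderFiller
{
    // Hypothetical helper: replaces the 1st {...} with values[0], the 2nd with values[1], and so on.
    public static string Fill(string template, params string[] values)
    {
        int next = 0; // captured by the lambda; index of the next value to use
        return Regex.Replace(
            template,
            @"\{[^}]*\}",
            // When the values are used up, keep the remaining placeholders as they are.
            m => next < values.Length ? values[next++] : m.Value);
    }
}
```

With the question's example, Fill("/home/before{value1}/and/{anotherValue}", "Tag", "1") should return /home/beforeTag/and/1.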
My two cents:
// input string
string txt = "/home/{value1}/something/{anotherValue}";
// template replacements
string[] str_array = { "one", "two" };
// regex to match a template
Regex regex = new Regex("{[^}]*}");
// replace the first template occurrence for each element in array
foreach (string s in str_array)
{
    txt = regex.Replace(txt, s, 1);
}
Console.Write(txt);
