Replace Multiple References of a pattern with Regex - c#

I have a string which is in the following form
$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#
Basically this string is made up of the following parts
$ = Delimiter Start
(Some Text)
# = Delimiter End
(all of this n times)
I would now like to replace each of these sections with some meaningful text. Therefore I need to extract these sections, do something based on the text inside each section and then replace the section with the result. So the resulting string should look something like this:
12V, 0603, +20%, -20%
The commas and everything else that is not contained within the section stays as it is, the sections get replaced by meaningful values.
For the question: Can you help me with a Regex pattern that finds out where these sections are so I can replace them?

You need to use the Regex.Replace method and use a MatchEvaluator delegate to decide what the replacement value should be.
The pattern you need can be $ then anything except #, then #. We put the middle bit in brackets so it is stored as a separate group in the result.
\$([^#]+)#
The full thing can be something like this (the actual replacement logic is up to you):
string value = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";
string result = Regex.Replace(value, @"\$([^#]+)#", m =>
{
    // This method takes the matching value and needs to return the correct replacement
    // m.Value is e.g. "$KL\U#", m.Groups[1].Value is the bit in ()s between $ and #
    switch (m.Groups[1].Value)
    {
        case @"KL\U":
            return "12V";
        case @"AS\gehaeuse":
            return "0603";
        case @"KL\tol_plus":
            return "+20%";
        case @"KL\tol_minus":
            return "-20%";
        default:
            return m.Groups[1].Value;
    }
});
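When the list of sections grows, the switch can be swapped for a lookup table. The following is only a sketch: the `replacements` dictionary and its values are assumptions made up for illustration (taken from the expected output in the question), and unknown sections are left untouched rather than stripped of their delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical lookup table; real values would come from your own data source.
var replacements = new Dictionary<string, string>
{
    [@"KL\U"] = "12V",
    [@"AS\gehaeuse"] = "0603",
    [@"KL\tol_plus"] = "+20%",
    [@"KL\tol_minus"] = "-20%",
};

string value = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";

// For each $...# section, look up the text between the delimiters.
// If the key is unknown, return the whole match so the section stays as it was.
string result = Regex.Replace(value, @"\$([^#]+)#", m =>
    replacements.TryGetValue(m.Groups[1].Value, out var found) ? found : m.Value);

Console.WriteLine(result); // 12V, 0603, +20%, -20%
```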

As far as matching the pattern, you're wanting:
\$[^#]+#
The rest of your question isn't very clear. If you need to replace the original string with some meaningful values, just loop through your matches:
var str = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";
foreach (Match match in Regex.Matches(str, @"\$[^#]+#"))
{
    str = str.Replace(match.ToString(), "something meaningful");
}
Beyond that, you'll have to provide more context.
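One thing to keep in mind with the loop above: string.Replace substitutes every occurrence of the matched text, not just the one at the match position. If the position of each section matters, a sketch like the following collects the index and inner text of every section first. The capturing group and the `sections` list are additions for illustration, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string str = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";

// Collect (start index, inner text) for every $...# section.
var sections = new List<(int Index, string Name)>();
foreach (Match match in Regex.Matches(str, @"\$([^#]+)#"))
{
    sections.Add((match.Index, match.Groups[1].Value));
}

foreach (var (index, name) in sections)
{
    Console.WriteLine($"{index}: {name}");
}
```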

Are you sure you don't just want plain string manipulation?
var str = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";
string ReturnManipulatedString(string str)
{
    var list = str.Split('$');
    // list[0] is whatever precedes the first '$' and is copied unchanged
    string newValues = list[0];
    for (int i = 1; i < list.Length; i++)
    {
        var temp = list[i].Split('#');
        newValues += ManipulateStuff(temp[0]);
        if (temp.Length > 1)
            newValues += temp[1];
    }
    return newValues;
}

Related

Replacing a portion of a string with an exact matching

I just want to replace a portion of a string only if matches the given text.
My use case is as follows:
var text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
string result = text.Replace("wd:response", "response");
/*
* expecting the below text
<response><wd:response-data></wd:response-data></response>
*
*/
I followed the following answers:
Way to have String.Replace only hit "whole words"
Regular expression for exact match of a string
But I failed to achieve what I want.
Please share your thoughts/solutions.
Sample on
https://dotnetfiddle.net/pMkO8Q
In general, you should really be parsing and manipulating XML as XML, using functions that know how XML works and what's legal in the language. Regex and other naive text manipulation will often lead you into trouble.
That said, for a very simple solution to this specific problem, you can do this with two replaces:
var text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
text = text.Replace("wd:response>", "response>").Replace("wd:response ", "response ");
(Note the spaces at the end of the parameters to the second replace.)
Alternatively use a regex similar to "wd:response\s*>"
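A minimal sketch of that regex variant, assuming the closing bracket and any whitespace before it should be kept as they are. The capturing group and the `$1` back-reference are additions for that purpose.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "<wd:response><wd:response-data></wd:response-data></wd:response >";

// Match "wd:response" only when it is followed by optional whitespace and '>',
// so "wd:response-data" is not touched. $1 puts the whitespace and '>' back.
string result = Regex.Replace(text, @"wd:response(\s*>)", "response$1");

Console.WriteLine(result); // <response><wd:response-data></wd:response-data></response >
```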
The easiest way to achieve the result from your .NET Fiddle is to use Replace as below.
string result = text.Replace("wd:response>", "response>");
But the proper way to achieve this is to parse the text as XML.
You can capture the string wd:response in a capturing group and replace it using Regex.Replace with a MatchEvaluator, like this.
Regex explanation - <[/]?(wd:response)\s*>
Match < literally
Match / optionally, hence the ?
Match the string wd:response and place it in a capturing group enclosed with ()
Match zero or more whitespace characters with \s*
Match > literally
public class Program
{
    public static void Main(string[] args)
    {
        string text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
        string replacePattern = "response";
        string pattern = @"<[/]?(wd:response)\s*>";
        string replacedPattern = Regex.Replace(text, pattern, match =>
        {
            // Extract the first group
            Group group = match.Groups[1];
            // Replace the group value with the replacePattern
            return string.Format("{0}{1}{2}", match.Value.Substring(0, group.Index - match.Index), replacePattern, match.Value.Substring(group.Index - match.Index + group.Length));
        });
        Console.WriteLine(replacedPattern);
    }
}
Outputting:
<response><wd:response-data></wd:response-data></response >

C# split by regex

I have a small problem that I don't know how to name, so I will do my best to explain it.
String text = "Random text over here boyz, I dunno what to do";
For example, I want to extract only over here boyz: I want to split on the word text and on the comma (,), and get all the text between those two strings. Any ideas?
Thank you,
Sagi.
From your comments I get that from this string:
foo bar id="baz" qux
You want to obtain the value baz, because it is in the id="{text}" pattern.
For that you can use a regular expression:
string result = Regex.Match(text, "id=\"(.*?)\"").Groups[1].Value;
Note that this will match any character. Also note that this will yield false positives, like fooid="bar", and that this won't match unquoted values.
So all in all, for parsing HTML, you should not use regular expressions. Try HtmlAgilityPack and an XPath expression.
There is a Split overload that can receive multiple string separators:
var rrr = text.Split(new string[] { ",", "text" }, StringSplitOptions.None);
If you would like to extract only the text between these two strings using regex, you can do something like this:
var pattern = @"text(.*),";
var a = new Regex(pattern).Match(text);
var result = a.Groups[1].Value;
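A hedged variation on the pattern above: with a greedy `(.*)` the match runs to the last comma in the text, and the captured value keeps its leading space. The sketch below uses a lazy quantifier and `\s*` on both sides, assuming the text up to the first comma, without surrounding whitespace, is what is wanted.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "Random text over here boyz, I dunno what to do";

// "text", optional whitespace, then as little as possible up to the first comma.
Match m = Regex.Match(text, @"text\s*(.*?)\s*,");
string result = m.Success ? m.Groups[1].Value : string.Empty;

Console.WriteLine(result); // over here boyz
```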
You can use Regex class:
https://msdn.microsoft.com/pl-pl/library/ze12yx1d%28v=vs.110%29.aspx
But first of all (as was already said) you need to decide how you will identify the string that you want.
In the first case you can use
string stringResult;
if (text.Contains("over here boyz"))
    stringResult = "over here boyz";
else
    stringResult = string.Empty;
but the second case can be solved with this code
String text = "Random text over here boyz, I dunno what to do";
// Second case, with the whitespace around the separators removed
var result = Regex.Split(text, " *text *| *, *");
foreach (var x in result)
{
Console.WriteLine(x);
}
// Second case, keeping the whitespace around the separators
result = Regex.Split(text, "text|,");
foreach (var x in result)
{
Console.WriteLine(x);
}
You can practice writing regexes with tools such as http://www.regexbuddy.com/ or http://www.regexr.com/

Regex within a regex?

Truth is, I'm having a hard time writing a regex string to parse something in the form of
[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]
This markup would be parsed so that I can dynamically build tabs as demonstrated here. Initially I tried a regex pattern like \[\[\[tab name=(?'name'.*?) content=(?'content'.*?)\]\]\]
But I realized I couldn't get the tab as a whole and build upon a query without doing a regex.replace. Is it possible to take the entire tab leading up to the pipe symbol as a group and then parse that group down from the sub key/value pairs?
This is the current regex string I'm working with \[\[\[(?'tab'tab name=(?'name'.*?) content=(?'content'.*?))\]\]\]
And here is my code for performing the regex. Any guidance would be appreciated.
public override string BeforeParse(string markupText)
{
if (CompiledRegex.IsMatch(markupText))
{
// Replaces the [[[code lang=sql|xxx]]]
// with the HTML tags (surrounded with {{{roadkillinternal}}.
// As the code is HTML encoded, it doesn't get butchered by the HTML cleaner.
MatchCollection matches = CompiledRegex.Matches(markupText);
foreach (Match match in matches)
{
string tabname = match.Groups["name"].Value;
string tabcontent = HttpUtility.HtmlEncode(match.Groups["content"].Value);
markupText = markupText.Replace(match.Groups["content"].Value, tabcontent);
markupText = Regex.Replace(markupText, RegexString, ReplacementPattern, CompiledRegex.Options);
}
}
return markupText;
}
Is this what you want?
string input = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
Regex r = new Regex(@"tab name=([a-z0-9]+) content=([a-z0-9]+)(\||])");
foreach (Match m in r.Matches(input))
{
Console.WriteLine("{0} : {1}", m.Groups[1].Value, m.Groups[2].Value);
}
http://regexr.com/3boot
Maybe string.Split would be better in this case? For example, something like this:
string str = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
foreach (var entry in str.Trim('[', ']').Split('|'))
{
    var eqBlocks = entry.Split('=');
    // eqBlocks[1] is e.g. "dog content"; cut off the trailing " content"
    var tabName = eqBlocks[1].Substring(0, eqBlocks[1].Length - " content".Length);
    var content = eqBlocks[2];
}
Ugly code, but should work.
Try this:
Starts with a word boundary and followed only by allowed characters.
/\b[\w =]*/g
https://regex101.com/r/cI7jS7/1
Just distill the regex pattern down to the individual tab patterns, such as name=??? content=???, and match only that. Each Match (two in your example) then holds the data to be extracted.
string text = @"[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
string pattern = @"name=(?<Name>[^\s]+)\scontent=(?<Content>[^\s|\]]+)";
var result = Regex.Matches(text, pattern)
.OfType<Match>()
.Select(mt => new
{
Name = mt.Groups["Name"].Value,
Content = mt.Groups["Content"].Value,
});
The result is an enumerable of anonymous objects, one per tab, which can be bound directly to the control.
Note that in the set notation [^\s|\]] the pipe | is treated as a literal in the set and not as an "or". The bracket ] does have to be escaped, though, to be treated as a literal. So the set reads: "any character that is not (^) whitespace, a pipe, or a closing bracket".
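The two-stage idea from the question (grab the whole block first, then parse each tab) can also be sketched directly. This assumes that tab contents never contain `|` or `]]]`; the group names `body`, `name` and `content` and the `tabs` list are illustrative choices, not taken from the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "[[[tab name=dog content=cat|tab name=dog2 content=cat2]]]";
var tabs = new List<(string Name, string Content)>();

// Stage 1: capture everything between [[[ and ]]].
Match outer = Regex.Match(input, @"\[\[\[(?<body>.*?)\]\]\]");
if (outer.Success)
{
    // Stage 2: split the body on the pipe and parse each tab on its own.
    foreach (string part in outer.Groups["body"].Value.Split('|'))
    {
        Match inner = Regex.Match(part, @"^tab name=(?<name>.*?) content=(?<content>.*)$");
        if (inner.Success)
        {
            tabs.Add((inner.Groups["name"].Value, inner.Groups["content"].Value));
        }
    }
}

foreach (var tab in tabs)
{
    Console.WriteLine($"{tab.Name} : {tab.Content}");
}
```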

replacing characters in a single field of a comma-separated list

I have string in my c# code
a,b,c,d,"e,f",g,h
I want to replace "e,f" with "e f" i.e. ',' which is inside inverted comma should be replaced by space.
I tried using string.split but it is not working for me.
OK, I can't be bothered to think of a regex approach so I am going to offer an old fashioned loop approach which will work:
string DoReplace(string input)
{
bool isInner = false;//flag to detect if we are in the inner string or not
string result = "";//result to return
foreach(char c in input)//loop each character in the input string
{
if(isInner && c == ',')//if we are in an inner string and it is a comma, append space
result += " ";
else//otherwise append the character
result += c;
if(c == '"')//if we have hit an inner quote, toggle the flag
isInner = !isInner;
}
return result;
}
NOTE: This solution assumes that there can only be one level of inner quotes, for example you cannot have "a,b,c,"d,e,"f,g",h",i,j" - because that's just plain madness!
For the scenario where you only need to match one pair of letters, the following regex will work:
string source = "a,b,c,d,\"e,f\",g,h";
string pattern = "\"([\\w]),([\\w])\"";
string replace = "\"$1 $2\"";
string result = Regex.Replace(source, pattern, replace);
Console.WriteLine(result); // a,b,c,d,"e f",g,h
Breaking apart the pattern, it matches any "X,X" sequence where X is a single word character (letter, digit, or underscore), and replaces it with the very same sequence, with a space between the two characters instead of a comma.
You could easily extend this if you needed to to have it match more than one letter, etc, as needed.
For the case where you can have multiple letters separated by commas within quotes that need to be replaced, the following can do it for you. Sample text is a,b,c,d,"e,f,a",g,h:
string source = "a,b,c,d,\"e,f,a\",g,h";
string pattern = "\"([ ,\\w]+),([ ,\\w]+)\"";
string replace = "\"$1 $2\"";
string result = source;
while (Regex.IsMatch(result, pattern)) {
result = Regex.Replace(result, pattern, replace);
}
Console.WriteLine(result); // a,b,c,d,"e f a",g,h
This does something similar compared to the first one, but just removes any comma that is sandwiched by letters surrounded by quotes, and repeats it until all cases are removed.
Here's a somewhat fragile but simple solution:
string.Join("\"", line.Split('"').Select((s, i) => i % 2 == 0 ? s : s.Replace(",", " ")))
It's fragile because it doesn't handle flavors of CSV that escape double-quotes inside double-quotes.
Use the following code:
string str = "a,b,c,d,\"e,f\",g,h";
string[] str2 = str.Split('\"');
var str3 = str2.Select(p => ((p.StartsWith(",") || p.EndsWith(",")) ? p : p.Replace(',', ' '))).ToList();
// Join with the quote character so the quotes removed by Split are restored
str = string.Join("\"", str3);
Use Split() and Join():
string input = "a,b,c,d,\"e,f\",g,h";
string[] pieces = input.Split('"');
for ( int i = 1; i < pieces.Length; i += 2 )
{
pieces[i] = string.Join(" ", pieces[i].Split(','));
}
string output = string.Join("\"", pieces);
Console.WriteLine(output);
// output: a,b,c,d,"e f",g,h
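For comparison, the same result can be sketched with Regex.Replace and a MatchEvaluator that only touches the quoted fields. Like the Split-based answers, this assumes quotes are balanced and that there are no escaped quotes inside a quoted field.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "a,b,c,d,\"e,f\",g,h";

// Match each "..." field and replace the commas inside it with spaces.
string output = Regex.Replace(input, "\"[^\"]*\"", m => m.Value.Replace(',', ' '));

Console.WriteLine(output); // a,b,c,d,"e f",g,h
```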

How to rewrite a string by pattern

I have a string, where the "special areas" are enclosed in curly braces:
{intIncG}/{intIncD}/02-{yy}
I need to iterate through all of these elements in between {} and replace them based on their content. What is the best code structure to do it in C#?
I can't just do a replace, since I need to know the index of each "special area" {} in order to replace it with the correct value.
Regex rgx = new Regex(@"(\{[^\}]*\})");
string output = rgx.Replace(input, new MatchEvaluator(DoStuff));
static string DoStuff(Match match)
{
    // Here you have access to match.Index and match.Value, so you can do something different for Match1, Match2, etc.
    // You can easily strip the braces off the value:
    string value = match.Value.Substring(1, match.Value.Length - 2);
    // Then call a function which takes value and index and returns the string to be substituted.
    // GetReplacement is a placeholder for your own logic.
    return GetReplacement(value, match.Index);
}
string.Replace will do just fine.
var updatedString = myString.Replace("{intIncG}", "something");
Do once for every different string.
Update:
Since you need the index of { in order to produce the replacement string (as you commented), you can use Regex.Matches to find the indices of { - each Match object in the Matches collection will include the index in the string.
Use Regex.Replace:
Replaces all occurrences of a character pattern defined by a regular expression with a specified replacement character string.
from msdn
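A small runnable sketch of that approach, using the overload that takes a MatchEvaluator. The replacement format below is made up purely to show that both the running position and the placeholder name are available inside the callback; real code would compute the actual value there.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "{intIncG}/{intIncD}/02-{yy}";
int position = 0; // counts the placeholders as they are visited, left to right

string output = Regex.Replace(input, @"\{([^}]*)\}", m =>
{
    position++;
    string name = m.Groups[1].Value; // text between the braces
    // Illustrative replacement only: position and name in square brackets.
    return $"[{position}:{name}]";
});

Console.WriteLine(output); // [1:intIncG]/[2:intIncD]/02-[3:yy]
```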
You can define a function and join its output, so you only need to traverse the parts once rather than once per replace rule.
private IEnumerable<string> Traverse(string input)
{
int index = 0;
string[] parts = input.Split(new[] {'/'});
foreach(var part in parts)
{
index++;
string retVal = string.Empty;
switch(part)
{
case "{intIncG}":
retVal = "a"; // or something based on index!
break;
case "{intIncD}":
retVal = "b"; // or something based on index!
break;
...
}
yield return retVal;
}
}
string replaced = string.Join("/", Traverse(inputString));
