C# Regex - Match and replace, Auto Increment - c#

I have been struggling with a problem and any help would be appreciated.
Problem: I have a paragraph and I want to replace a variable which appears several times (Variable = #Variable). This is the easy part; the part I am having difficulty with is replacing the variable with different values.
Each occurrence needs a different value. For instance, I have a function that does a calculation for each variable. What I have so far is below:
private string SetVariables(string input, string pattern)
{
    Regex rx = new Regex(pattern);
    MatchCollection matches = rx.Matches(input);
    int i = 1;
    if (matches.Count > 0)
    {
        foreach (Match match in matches)
        {
            // The string returned by Replace is discarded, so input never changes
            rx.Replace(match.ToString(), getReplacementNumber(i));
            i++;
        }
    }
    return input;
}
I am able to replace each variable with the number returned from the getReplacementNumber(i) function, but how do I put it back into my original input with the replaced values, in the same order found in the match collection?
Thanks in advance!
Marcus

Use the overload of Replace that takes a MatchEvaluator as its second parameter.
string result = rx.Replace(input, match => { return getReplacementNumber(i++); });
I'm assuming here that getReplacementNumber(int i) returns a string. If not, you will have to convert the result to a string.
See it working online: ideone
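For completeness, here is a minimal self-contained sketch of the same idea. GetReplacementNumber below is only a stand-in (it multiplies the counter by 10), since the asker's real function isn't shown, and the class name is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AutoIncrementReplace
{
    // Hypothetical stand-in for the asker's getReplacementNumber(i).
    static string GetReplacementNumber(int i) => (i * 10).ToString();

    public static string SetVariables(string input, string pattern)
    {
        var rx = new Regex(pattern);
        int i = 1;
        // The MatchEvaluator runs once per match, in the order the matches
        // appear in the input, so the counter advances with each occurrence.
        return rx.Replace(input, match => GetReplacementNumber(i++));
    }
}
```

Calling AutoIncrementReplace.SetVariables("a=#Variable, b=#Variable, c=#Variable", "#Variable") should give "a=10, b=20, c=30" with this stand-in.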

Related

Check array for string that starts with given one (ignoring case)

I am trying to see if my string starts with a string in an array of strings I've created. Here is my code:
string x = "Table a";
string y = "a table";
string[] arr = new string[] { "table", "chair", "plate" };
if (arr.Contains(x.ToLower())){
// this should be true
}
if (arr.Contains(y.ToLower())){
// this should be false
}
How can I make it so my if statement comes up true? I'd like to match just the beginning of string x against the contents of the array, ignoring case and any following characters. I thought I needed regex to do this but I could be mistaken. I'm a bit of a newbie with regex.
It seems you want to check if your string contains an element from your list, so this should be what you are looking for:
if (arr.Any(c => x.ToLower().Contains(c)))
Or simpler:
if (arr.Any(x.ToLower().Contains))
Or based on your comments you may use this:
if (arr.Any(x.ToLower().Split(' ')[0].Contains))
Because you said you want regex, you can define one with var regex = new Regex("(table|plate|fork)");
and check it with if (regex.IsMatch(myString)) { ... }
For the issue at hand, though, you don't have to use Regex, as you are searching for an exact substring. You can use
(as @S.Akbari mentioned): if (arr.Any(c => x.ToLower().Contains(c))) { ... }
Enumerable.Contains matches exact values (and there is no built-in comparer that checks for "starts with"), so you need Any, which takes a predicate that receives each array element as a parameter and performs the check. The first step is to turn "contains" the other way around, so that the given string contains an element from the array:
var myString = "some string";
if (arr.Any(arrayItem => myString.Contains(arrayItem)))...
Now, you are actually asking for "string starts with given word" and not just "contains", so you need StartsWith (which conveniently lets you specify case sensitivity, unlike Contains; see Case insensitive 'Contains(string)'):
if (arr.Any(arrayItem => myString.StartsWith(
arrayItem, StringComparison.CurrentCultureIgnoreCase))) ...
Note that this code will accept "tableAAA bob". If you really need to break on a word boundary, a regular expression may be a better choice. Building regular expressions dynamically is trivial as long as you properly escape all the values.
The regex should consist of:
beginning of string - ^
properly escaped word you are searching for - Escape Special Character in Regex
word break - \b
if (arr.Any(arrayItem => Regex.IsMatch(myString,
    String.Format(@"^{0}\b", Regex.Escape(arrayItem)),
    RegexOptions.IgnoreCase))) ...
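To make the difference between the two checks concrete, here is a small sketch that puts them side by side. The class and method names are made up for the example; the calls themselves are the ones discussed above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixCheck
{
    // True if text starts with any candidate, ignoring case.
    // No word boundary is required, so "tableAAA bob" is accepted too.
    public static bool StartsWithAny(string text, string[] candidates) =>
        candidates.Any(c => text.StartsWith(c, StringComparison.CurrentCultureIgnoreCase));

    // True only if text starts with a candidate that ends at a word boundary.
    public static bool StartsWithWord(string text, string[] candidates) =>
        candidates.Any(c => Regex.IsMatch(
            text,
            "^" + Regex.Escape(c) + @"\b",
            RegexOptions.IgnoreCase));
}
```

With the array from the question, both methods accept "Table a" and reject "a table"; only StartsWithAny accepts "tableAAA bob".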
You can do something like the following in TypeScript. Instead of startsWith you can also use includes, an equality check, etc.
public namesList: Array<string> = ['name1','name2','name3','name4','name5'];
// SomeString = 'name1, Hello there';
private isNamePresent(SomeString : string):boolean{
if (this.namesList.find(name => SomeString.startsWith(name)))
return true;
return false;
}
I think I understand what you are trying to say here, although there is still some ambiguity. Are you trying to see if one word in your string (which is a sentence) exists in your array?
@Amy is correct, this might not have to do with regex at all.
I think this segment of code will do what you want in Java (which can easily be translated to C#):
Java:
x = x.toLowerCase();
String[] words = x.split("\\s+");
for (String word : words) {
    for (String element : arr) {
        if (element.equals(word)) {
            return true;
        }
    }
}
return false;
You can also use a Set to store the elements in your array, which can make lookups more efficient.
Java:
// requires java.util.Arrays and java.util.HashSet
x = x.toLowerCase();
String[] words = x.split("\\s+");
HashSet<String> set = new HashSet<>(Arrays.asList(arr));
for (String word : words) {
    if (set.contains(word)) {
        return true;
    }
}
return false;
Edit: (12/22, 11:05am)
I rewrote my solution in C#, thanks to reminders by @Amy and @JohnyL. Since the author only wants to match the first word of the string, this edited code should work :)
C#:
static bool Contains(string x, string[] arr)
{
    // Only the first word of the string is compared against the array
    string[] words = x.ToLower().Split(' ');
    var set = new HashSet<string>(arr);
    return set.Contains(words[0]);
}
Sorry my question was so vague, but here is the solution, thanks to help from a few of the people who answered.
var regex = new Regex("^(table|chair|plate) *.*");
if (regex.IsMatch(x.ToLower())){}

Replace Multiple References of a pattern with Regex

I have a string which is in the following form
$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#
Basically this string is made up of the following parts
$ = Delimiter Start
(Some Text)
# = Delimiter End
(all of this n times)
I would now like to replace each of these sections with some meaningful text. Therefore I need to extract these sections, do something based on the text inside each section and then replace the section with the result. So the resulting string should look something like this:
12V, 0603, +20%, -20%
The commas and everything else that is not contained within the section stays as it is, the sections get replaced by meaningful values.
For the question: Can you help me with a Regex pattern that finds out where these sections are so I can replace them?
You need to use the Regex.Replace method and use a MatchEvaluator delegate to decide what the replacement value should be.
The pattern you need can be $ then anything except #, then #. We put the middle bit in brackets so it is stored as a separate group in the result.
\$([^#]+)#
The full thing can be something like this (the appropriate replacement logic is up to you):
string value = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";
string result = Regex.Replace(value, @"\$([^#]+)#", m =>
{
    // This method takes the matching value and needs to return the correct replacement
    // m.Value is e.g. "$KL\U#", m.Groups[1].Value is the bit in ()s between $ and #
    switch (m.Groups[1].Value)
    {
        case @"KL\U":
            return "12V";
        case @"AS\gehaeuse":
            return "0603";
        case @"KL\tol_plus":
            return "+20%";
        case @"KL\tol_minus":
            return "-20%";
        default:
            return m.Groups[1].Value;
    }
});
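If the list of sections grows, the switch can be swapped for a lookup table. The sketch below is one way to do that, assuming the values are known up front. The dictionary contents are just the four examples from the question, the class name is made up, and unlike the switch above it leaves unknown sections untouched, delimiters included.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SectionReplacer
{
    // Example lookup table; keys are the text between '$' and '#'.
    static readonly Dictionary<string, string> Values = new Dictionary<string, string>
    {
        { @"KL\U", "12V" },
        { @"AS\gehaeuse", "0603" },
        { @"KL\tol_plus", "+20%" },
        { @"KL\tol_minus", "-20%" },
    };

    public static string Expand(string input) =>
        Regex.Replace(input, @"\$([^#]+)#", m =>
            // Known keys are replaced; unknown sections are returned as-is.
            Values.TryGetValue(m.Groups[1].Value, out var replacement)
                ? replacement
                : m.Value);
}
```

For the string from the question, SectionReplacer.Expand should return "12V, 0603, +20%, -20%".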
As far as matching the pattern, you're wanting:
\$[^#]+#
The rest of your question isn't very clear. If you need to replace the original string with some meaningful values, just loop through your matches:
var str = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";
foreach (Match match in Regex.Matches(str, @"\$[^#]+#"))
{
    str = str.Replace(match.ToString(), "something meaningful");
}
Beyond that, you'll have to provide more context.
Are you sure you don't want to do just plain string manipulation?
var str = @"$KL\U#, $AS\gehaeuse#, $KL\tol_plus#, $KL\tol_minus#";
string ReturnManipulatedString(string str)
{
    var list = str.Split('$');
    string newValues = string.Empty;
    foreach (string st in list)
    {
        var temp = st.Split('#');
        if (temp.Length > 1)
        {
            // temp[0] is the text between $ and #, temp[1] is what follows the section
            newValues += ManipulateStuff(temp[0]) + temp[1];
        }
        else
        {
            // No closing #: this is the text before the first $
            newValues += temp[0];
        }
    }
    return newValues;
}

match first digits before # symbol

How can I match the leading digits before the first # in this line?
26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html
I'm trying to get the number 26909578.
My attempt:
string text = @"26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html";
MatchCollection m1 = Regex.Matches(text, @"(.+?)#", RegexOptions.Singleline);
but this outputs all of the text.
Make it explicit that it has to start at the beginning of the string:
@"^(.+?)#"
Alternatively, if you know that this will always be a number, restrict the possible characters to digits:
@"^\d+"
Alternatively use the function Match instead of Matches. Matches explicitly says, "give me all the matches", while Match will only return the first one.
Or, in a trivial case like this, you might also consider a non-RegEx approach. The IndexOf() method will locate the '#' and you could easily strip off what came before.
I even wrote a sscanf() replacement for C#, which you can see in my article A sscanf() Replacement for .NET.
If you don't want to use regex, use a StringBuilder and loop until you hit the #, like this:
StringBuilder sb = new StringBuilder();
string yourdata = "yourdata";
int i = 0;
// Stop at the first '#', or at the end of the string if there is none
while (i < yourdata.Length && yourdata[i] != '#')
{
    sb.Append(yourdata[i]);
    i++;
}
// When you reach the '#', the StringBuilder holds the number you want
string answer = sb.ToString();
The entire string (except the final url) is composed of segments that can be matched by (.+?)#, so you will get several matches. Retrieve only the first match from the collection returned by matching .+?(?=#)
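Putting the suggestions from this thread together, a short sketch of the two simplest routes might look like this. The class and method names are made up for the example; it assumes the wanted value is always the first #-separated field.

```csharp
using System.Text.RegularExpressions;

public static class FirstField
{
    // Regex route: digits anchored at the start of the string.
    // Returns null when the string does not start with a digit.
    public static string ViaRegex(string text)
    {
        Match m = Regex.Match(text, @"^\d+");
        return m.Success ? m.Value : null;
    }

    // Non-regex route: everything before the first '#'.
    // Returns the whole string when there is no '#'.
    public static string ViaIndexOf(string text)
    {
        int pos = text.IndexOf('#');
        return pos >= 0 ? text.Substring(0, pos) : text;
    }
}
```

Both methods should return "26909578" for the line in the question. Regex.Match is used rather than Regex.Matches because only the first match is wanted.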

Regex: C# extract text within double quotes

I want to extract only those words within double quotes. So, if the content is:
Would "you" like to have responses to your "questions" sent to you via email?
The answer must be
you
questions
Try this regex:
\"[^\"]*\"
or
\".*?\"
Explanation:
[^ character_group ]
Negation: Matches any single character that is not in character_group.
*?
Matches the previous element zero or more times, but as few times as possible.
and a sample code:
foreach(Match match in Regex.Matches(inputString, "\"([^\"]*)\""))
Console.WriteLine(match.ToString());
//or in LINQ
var result = from Match match in Regex.Matches(line, "\"([^\"]*)\"")
select match.ToString();
Based on @Ria's answer:
static void Main(string[] args)
{
string str = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
var reg = new Regex("\".*?\"");
var matches = reg.Matches(str);
foreach (var item in matches)
{
Console.WriteLine(item.ToString());
}
}
The output is:
"you"
"questions"
You can use string.TrimStart() and string.TrimEnd() to remove the double quotes if you don't want them.
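As an alternative to trimming, the capture group in the "\"([^\"]*)\"" pattern already holds the text without the quotes, so it can be read from Groups[1] directly. A small sketch, with a made-up class name:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedText
{
    // Returns the text inside each pair of double quotes, without the quotes.
    public static List<string> Extract(string input) =>
        Regex.Matches(input, "\"([^\"]*)\"")
             .Cast<Match>()                    // MatchCollection is non-generic on older frameworks
             .Select(m => m.Groups[1].Value)   // group 1 is the part between the quotes
             .ToList();
}
```

For the sentence in the question this should yield the two items you and questions.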
I like the regex solutions. You could also think of something like this
string str = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
var stringArray = str.Split('"');
Then take the odd elements from the array. If you use linq, you can do it like this:
var stringArray = str.Split('"').Where((item, index) => index % 2 != 0);
This also steals the Regex from @Ria, but allows you to get them into an array where you then remove the quotes:
string strText = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
MatchCollection mc = Regex.Matches(strText, "\"([^\"]*)\"");
for (int z=0; z < mc.Count; z++)
{
Response.Write(mc[z].ToString().Replace("\"", ""));
}
I combine Regex and Trim:
const string searchString = "This is a \"search text\" and \"another text\" and not \"this text";
var collection = Regex.Matches(searchString, "\\\"(.*?)\\\"");
foreach (var item in collection)
{
Console.WriteLine(item.ToString().Trim('"'));
}
Result:
search text
another text
Try this (\"\w+\")+
I suggest you download Expresso
http://www.ultrapico.com/Expresso.htm
I needed to do this in C# for parsing CSV and none of these worked for me so I came up with this:
\s*(?:(?:(['"])(?<value>(?:\\\1|[^\1])*?)\1)|(?<value>[^'",]+?))\s*(?:,|$)
This will parse out a field with or without quotes and will exclude the quotes from the value while keeping embedded quotes and commas. <value> contains the parsed field value. Without using named groups, either group 2 or 3 contains the value.
There are better and more efficient ways to do CSV parsing and this one will not be effective at identifying bad input. But if you can be sure of your input format and performance is not an issue, this might work for you.
Slight improvement on the answer by @Ria,
\"[^\" ][^\"]*\"
Will recognize a starting double quote only when not followed by a space to allow trailing inch specifiers.
Side effect: It will not recognize "" as a quoted value.

Multiple occurrences of text in a string

How could I use a for loop to go through each iteration of a given phrase in a string? For instance, say I had the following string:
Hey, this is an example string. A string is a collection of characters.
And every time there was an "is", I wanted to assign the three characters after it to a new string. I understand how to do that ONCE, but I'm trying to figure out how a for loop could be used to go through multiple instances of the same word.
If you must use a for-loop for whatever reason, you can replace the relevant part of the code provided by ja72 with:
// Stop early enough that "is" plus the three following characters still fit
for (int i = 0; i + 5 <= text.Length; i++)
{
    if (text[i] == 'i' && text[i + 1] == 's')
        sb.Append(text.Substring(i + 2, 3));
}
Unfortunately, I don't have enough reputation to add this as a comment here, hence posting it as an answer!
Is this what you want?
static void Main(string[] args)
{
string text=@"Hey, this is an example string. A string is a collection of characters.";
StringBuilder sb=new StringBuilder();
int i=-1;
while ((i=text.IndexOf("is", i+1))>=0)
{
sb.Append(text.Substring(i+2, 3));
}
string result=sb.ToString();
}
//result " is an a "
You can use a regex like this:
Regex re = new Regex("(?:is)(.{3})");
This regex looks for is (?:is), and takes the next three characters (.{3})
Then you use the regex to find all matches: Regex.Matches(). This returns a match for each is that is followed by 3 characters. Matches do not overlap, so an is that falls inside the three characters taken by a previous match (as in "this is") is skipped. Each match has two groups:
Group 0: which includes is and the next three characters
Group 1: which includes the next three characters
MatchCollection matches = re.Matches("Hey, this is an example string. A string is a collection of characters.");
StringBuilder sb = new StringBuilder();
foreach (Match m in matches)
{
    sb.Append(m.Groups[1].Value);
}
Using Regex is more concise than looping through the characters of the string, and it may also be faster. If the same regex is reused many times, RegexOptions.Compiled in the regex constructor can help: Regex Constructor (String, RegexOptions)
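If every is should count, including one that sits inside the three characters following a previous is, a lookbehind is one option. This is a sketch of a variation on the pattern above, not the pattern from the answer, and the class name is made up.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class AfterIs
{
    // Collects the three characters that follow every "is".
    // The lookbehind (?<=is) does not consume "is", so an "is" that falls
    // inside the three characters of the previous match is still found.
    public static string Collect(string text) =>
        string.Concat(Regex.Matches(text, "(?<=is).{3}")
                           .Cast<Match>()
                           .Select(m => m.Value));
}
```

For the example sentence this should produce " is an a ", the same result as the IndexOf loop shown earlier in this thread.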
