I have a problem with string replacement: replacing one name also changes other names that contain it as a substring. For example:
component1 = 5;
component2 = 6;
component10= 7;
When I replace component1 with variable, component10 also changes and becomes variable0.
How can I prevent this in C#?
You can use a word boundary. Your regex would be
\bcomponent1\b
This matches component1 as a separate word and not as a substring.
Your code would be
string output = Regex.Replace(input, @"\bcomponent1\b", "variable");
The @ is required: without it, the C# compiler treats \b in a regular string literal as the backspace escape, so the regex never sees a word boundary. Alternatively, write \\b.
Just replace them in descending order of substring length.
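Both suggestions above can be sketched roughly as follows. The class and method names (IdentifierRenamer, RenameWholeWords, RenameLongestFirst) are made up for illustration, and the sketch assumes the replacement values contain no "$" (which Regex.Replace would interpret) and do not themselves contain any of the keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
public static class IdentifierRenamer
{
    // Regex approach: each key is matched only as a whole word.
    // Regex.Escape makes the key literal text rather than regex syntax.
    public static string RenameWholeWords(string input, IDictionary<string, string> renames)
    {
        foreach (var pair in renames)
        {
            string pattern = @"\b" + Regex.Escape(pair.Key) + @"\b";
            input = Regex.Replace(input, pattern, pair.Value);
        }
        return input;
    }

    // Plain string.Replace, longest key first, so that "component10" is
    // handled before "component1" can rewrite its prefix.
    public static string RenameLongestFirst(string input, IDictionary<string, string> renames)
    {
        foreach (var pair in renames.OrderByDescending(p => p.Key.Length))
        {
            input = input.Replace(pair.Key, pair.Value);
        }
        return input;
    }
}
```

The longest-first variant only helps when every longer name is itself in the rename table; if component10 must stay untouched, the word-boundary variant is the safer choice.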
I am trying to make a regular expression that will tell me if a string has {0#} where zero can be repeated. Once I confirm that a string has this I am then trying to set it to a variable so I can count the number of 0s and replace the # with another number. I have /([{0]})([#}])/g which works on detection but not on pulling it out to another variable.
Edit:
Thanks to all, the answer was
Regex regex = new Regex(@"\{(0+)(#)\}");
Match match = regex.Match(text);
if (match.Success)
{
int zeros = Regex.Matches(match.Value, "0").Count;
}
Use this:
\{(0+)(#)\}
character {
then one or more occurrences of 0
a # sign
character }
You are super close. The problem you are having is because your capture group - the ( ) needs to be just around the zeroes. You also don't strictly need the other capture group unless you are doing something with it. You can rewrite your regex like this:
{(0+)#}
{ - match '{'
(0+) - match and capture one or more '0'
# - match '#'
} - match '}'
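A minimal sketch of how the capture group can be used, assuming the goal is to count the zeros and to put a number where the # was while keeping the zeros and braces. The class and method names are hypothetical, and what the final string should look like is a guess, since the question does not say.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
public static class ZeroPlaceholder
{
    // "{", one or more zeros (captured as group 1), "#", "}".
    private static readonly Regex Placeholder = new Regex(@"\{(0+)#\}");

    // Number of zeros in the first placeholder, or 0 if there is none.
    public static int CountZeros(string text)
    {
        Match match = Placeholder.Match(text);
        return match.Success ? match.Groups[1].Length : 0;
    }

    // Replaces the '#' in every placeholder with 'number', keeping the zeros.
    public static string FillNumber(string text, int number)
    {
        return Placeholder.Replace(text, m => "{" + m.Groups[1].Value + number + "}");
    }
}
```

Reading the length of group 1 avoids running a second regex over the match just to count the zeros.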
I have a regex that I am trying to pass a variable to:
int i = 0;
Match match = Regex.Match(strFile, "(^.{i})|(godness\\w+)(?<=\\2(\\d+).*?\\2)(\\d+)");
I'd like the regex engine to parse {i} as the number that the i variable holds.
The way I am doing that does not work as I get no matches when the text contains matching substrings.
It is not clear what strings you want to match with your regex, but if you need to use a variable in the pattern, you can use string interpolation inside a verbatim string literal. Verbatim string literals are preferred when declaring regex patterns because they avoid over-escaping.
String interpolation was only introduced in C# 6.0; in earlier versions you can use string.Format (literal braces are doubled, so {{{0}}} produces {0} when i = 0):
string.Format(@"(^.{{{0}}})|(godness\w+)(?<=\2(\d+).*?\2)(\d+)", i)
Beginning with C# 6.0, this seems a better alternative:
int i = 0;
Match match = Regex.Match(strFile, $@"(^.{{{i}}})|(godness\w+)(?<=\2(\d+).*?\2)(\d+)");
The regex pattern will look like
(^.{0})|(godness\w+)(?<=\2(\d+).*?\2)(\d+)
   ^^^
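You may try this approach, where i is passed as a parameter and can hold any value. Note that the literal braces must be doubled; otherwise string.Format consumes them and the quantifier is lost.
int i = 0;
string Value = string.Format("(^.{{{0}}})|(godness\\w+)(?<=\\2(\\d+).*?\\2)(\\d+)", i);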
Match match = Regex.Match(strFile, Value);
I have, for example, these strings:
HU_husnummer
HU_Adrs
How can I replace HU with MI?
So it will be MI_husnummer and MI_Adrs.
I am not very good at regex but I would like to solve it with regex.
EDIT:
The sample code I have now and that still does not work is:
string test = Regex.Replace("[HU_husnummer] int NOT NULL","^HU","MI");
Judging by your comments, you actually need
string test = Regex.Replace("[HU_husnummer] int NOT NULL", @"^\[HU", "[MI");
In case your input string really starts with HU, remove the \[ from the regex pattern.
The regex is @"^\[HU" (note the verbatim string literal notation used for the regex pattern):
^ - matches the start of string
\[ - matches a literal [ (since it is a special regex metacharacter denoting a beginning of a character class)
HU - matches HU literally.
String varString="HU_husnummer ";
varString=varString.Replace("HU_","MI_");
Links
https://msdn.microsoft.com/en-us/library/system.string.replace(v=vs.110).aspx
http://www.dotnetperls.com/replace
using Substring
var abc = "HU_husnummer";
var result = "MI" + abc.Substring(2);
Replace in Regex.
string result = Regex.Replace(abc, "^HU", "MI");
What is the best way to trim ALL non-alphanumeric characters from the beginning and end of a string? I tried manually listing the characters that I do not need, but that doesn't work well. I just need to trim anything that is not alphanumeric.
I tried using this function:
string something = "()&*1#^#47*^#21%Littering aaaannnndóú(*&^1#*32%#**)7(#9&^";
string somethingNew = Regex.Replace(something, @"[^\p{L}-\s]+", "");
But it removes all characters that are non alpha numeric from the string. What I basically want is like this:
"test1" -> test1
#!#!2test# -> 2test
(test3) -> test3
##test4---- -> test4
I do want to support unicode characters but not symbols..
EDIT:
The output of the example should be:
Littering aaaannnndóú
Regards
Assuming you want to trim non-alphanumeric characters from the start and end of your string:
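// Drop non-alphanumerics from the front, reverse, drop them from the former end, reverse back.
s = new string(s.SkipWhile(c => !char.IsLetterOrDigit(c))
               .Reverse()
               .SkipWhile(c => !char.IsLetterOrDigit(c))
               .Reverse()
               .ToArray());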
@"[^\p{L}\s-]+(test\d*)|(test\d*)[^\p{L}\s-]+","$1"
You can use the String.Trim(Char[]) method in the .NET library to trim unwanted characters from a string.
From MSDN : String.Trim Method (Char[])
Removes all leading and trailing occurrences of a set of characters
specified in an array from the current String object.
Before trimming, you first need to identify whether a character is a letter or digit; if it is non-alphanumeric, you can use String.Trim(Char[]) to remove it.
Use Char.IsLetterOrDigit() to identify whether a character is alphanumeric.
From MSDN: Char.IsLetterOrDigit()
Indicates whether a Unicode character is categorized as a letter or a
decimal digit.
Try This:
string str = "()&*1#^#47*^#21%Littering aaaannnndóú(*&^1#*32%#**)7(#9&^";
foreach (char ch in str)
{
if (!char.IsLetterOrDigit(ch))
str = str.Trim(ch);
}
Output:
1#^#47*^#21%Littering aaaannnndóú(*&^1#*32%#**)7(#9
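As an alternative sketch, the same char.IsLetterOrDigit test can be applied by scanning inward from both ends, so the result does not depend on the order in which the unwanted characters happen to occur in the string. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of the original answers.
public static class AlnumTrimmer
{
    // Removes leading and trailing characters that are not Unicode letters
    // or decimal digits; characters in the middle are left untouched.
    public static string TrimNonAlphanumeric(string s)
    {
        int start = 0;
        int end = s.Length - 1;

        // Advance past non-alphanumerics at the front.
        while (start <= end && !char.IsLetterOrDigit(s[start])) start++;

        // Back up past non-alphanumerics at the end.
        while (end >= start && !char.IsLetterOrDigit(s[end])) end--;

        // Yields "" when nothing alphanumeric is left.
        return s.Substring(start, end - start + 1);
    }
}
```

For the sample string in the question this gives the same result as the output shown above, because the string's first and last alphanumeric characters are the digits 1 and 9.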
If you need to remove any character which is not alphanumeric, you can use IsLetterOrDigit paired with a Where to go through every character. And because we're working at the char level, we'll need a little Concat at the end to bring everything back into a string.
string result = string.Concat(input.Where(char.IsLetterOrDigit));
which you can easily convert into an extension method
public static class Extensions
{
public static string ToAlphaNum(this string input)
{
return string.Concat(input.Where(char.IsLetterOrDigit));
}
}
that you can use like this :
string testString = "#!#!\"(test123)\"";
string result = testString.ToAlphaNum(); //test123
Note: this will remove every non-alphanumeric character from your string, if you really need to remove only those at the beginning/end, please add more details about what defines a beginning or an end and add more examples.
And you could also replace all the non-letters/numbers at the beginning and/or end of the line:
^[^\p{L}\p{N}]*|[^\p{L}\p{N}]*$
used as
resultString = Regex.Replace(subjectString, @"^[^\p{L}\p{N}]*|[^\p{L}\p{N}]*$", "", RegexOptions.Multiline);
If you really want to remove characters only at the beginning and end of the whole string, and not line by line, then remove the "^$ match at line breaks" option (RegexOptions.Multiline).
If you wanted to include leading or trailing underscores, as characters to be retained, you could simplify the regex to:
^\W+|\W+$
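To compare the two patterns side by side, here is a small sketch that wraps each in a helper and applies it to the whole string (no RegexOptions.Multiline). The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
public static class RegexTrimmer
{
    // Anything that is not a Unicode letter or number, at either end.
    private static readonly Regex NonAlnumEdges =
        new Regex(@"^[^\p{L}\p{N}]*|[^\p{L}\p{N}]*$");

    // \W excludes the underscore, so leading/trailing underscores survive.
    private static readonly Regex NonWordEdges = new Regex(@"^\W+|\W+$");

    public static string TrimNonAlnum(string s)
    {
        return NonAlnumEdges.Replace(s, "");
    }

    public static string TrimNonWord(string s)
    {
        return NonWordEdges.Replace(s, "");
    }
}
```

With the input "__test4--", TrimNonAlnum returns "test4" while TrimNonWord returns "__test4", which is the difference the underscore remark above describes.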
The core of the regex:
[^\p{L}\p{N}]
is a negated character class: it matches any character that is not in the Unicode category Letter \p{L} or Number \p{N}.
In other words:
Trim non-unicode alphanumeric characters
^[^\p{L}\p{N}]*|[^\p{L}\p{N}]*$
Options: Case sensitive; Exact spacing; Dot doesn't match line breaks; ^$ match at line breaks; Parentheses capture
Match this alternative «^[^\p{L}\p{N}]*»
Assert position at the beginning of a line «^»
Match any single character NOT present in the list below «[^\p{L}\p{N}]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A character from the Unicode category “letter” «\p{L}»
A character from the Unicode category “number” «\p{N}»
Or match this alternative «[^\p{L}\p{N}]*$»
Match any single character NOT present in the list below «[^\p{L}\p{N}]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A character from the Unicode category “letter” «\p{L}»
A character from the Unicode category “number” «\p{N}»
Assert position at the end of a line «$»
Created with RegexBuddy
Without using regex:
In Java, you could do the following (the C# syntax would be nearly the same, with the same functionality):
while (true) {
if (word.length() == 0) {
return ""; // bad
}
if (!Character.isLetter(word.charAt(0))) {
word = word.substring(1);
continue; // so we are doing front first
}
if (!Character.isLetter(word.charAt(word.length()-1))) {
word = word.substring(0, word.length()-1);
continue; // then we are doing end
}
break; // if front is done, and end is done
}
you could use this pattern
^[^[:alnum:]]+|[^[:alnum:]]+$
with g option
I am using regex to replace certain keywords from a string (or Stringbuilder) with the ones that I choose. However, I fail to build a valid regex pattern to replace only whole words.
For example, if I have InputString = "fox foxy" and want to replace "fox" with "dog", the output would be "dog dogy".
What is the valid RegEx pattern to take only "fox" and leave "foxy"?
public string Replace(string KeywordToReplace, string Replacement)
{
this.Replacement = Replacement;
this.KeywordToReplace = KeywordToReplace;
Regex RegExHelper = new Regex(KeywordToReplace, RegexOptions.IgnoreCase);
string Output = RegExHelper.Replace(InputString, Replacement);
return Output;
}
Thanks!
Regexes support a special escape sequence that represents a word boundary. Word characters are those matched by \w: letters, digits, and the underscore. So a word boundary lies between a character that belongs to this group and one that doesn't. The escape sequence is \b:
\bfox\b
Do not forget to put the '@' symbol before your "\bword\b".
For example:
address = Regex.Replace(address, @"\bNE\b", "Northeast");
The @ symbol makes the literal a verbatim string, so the C# compiler passes the backslash (\) through to the regex engine unchanged instead of treating \b as an escape sequence.
You need to use a word boundary:
KeywordToReplace = @"\byourWord\b";
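Applied to a Replace method like the one in the question, the word-boundary advice might look like the sketch below. The class and method names are hypothetical, and the sketch assumes the keyword starts and ends with a word character (otherwise \b will not sit where you expect).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
public static class WholeWordReplacer
{
    public static string ReplaceWholeWord(string input, string keyword, string replacement)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the keyword
        // from being interpreted as regex syntax.
        string pattern = @"\b" + Regex.Escape(keyword) + @"\b";

        // The MatchEvaluator overload inserts 'replacement' literally, so a
        // "$" in it is not treated as a substitution.
        return Regex.Replace(input, pattern, m => replacement, RegexOptions.IgnoreCase);
    }
}
```

For example, ReplaceWholeWord("fox foxy", "fox", "dog") leaves "foxy" alone and returns "dog foxy".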