I'm trying to write a regex that will take a string of the form:
<123>, ;<123>:::,<123>
where 123 is some number and in between the numbers is some punctuation.
I need a regex that will replace all the punctuation between the number fields with "".
I tried this:
Regex.Replace(s, ">.*<", "");
But had no luck. What regex would accomplish this?
Edit: My original regex was a bit misleading, sorry! As the commenters said, I'm looking for <123><123><123>
Not sure about the exact C# syntax either, but if your string is guaranteed not to have numbers outside those angle brackets, then you should be able to get away with this:
Regex.Replace(s, @"[^\d<>]+", "");
This removes anything that isn't a digit, "<", or ">". If you also want to remove the angle brackets, it's even simpler:
Regex.Replace(s, @"[^\d]+", "");
You need to make the .* part non-greedy, otherwise it will pick up everything between the first > and the last < in your string. Try something like:
Regex.Replace(s, ">.*?<", "");
This will erase the > and < chars also. If you want to preserve those:
Regex.Replace(s, ">.*?<", "><");
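To make the difference between the greedy and the non-greedy form concrete, here is a minimal sketch run on the sample string from the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVersusLazy
{
    // Greedy: ".*" runs from the first '>' all the way to the last '<'.
    public static string Greedy(string s) => Regex.Replace(s, ">.*<", "><");

    // Lazy: ".*?" stops at the nearest '<' after each '>'.
    public static string Lazy(string s) => Regex.Replace(s, ">.*?<", "><");

    public static void Main()
    {
        string s = "<123>, ;<123>:::,<123>";
        Console.WriteLine(Greedy(s)); // <123><123>
        Console.WriteLine(Lazy(s));   // <123><123><123>
    }
}
```

The greedy version swallows the middle <123> because it lies between the first > and the last <.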
Both of these should work:
Regex.Replace(s, @"(\>|^).*?($|\<(?=\d{3}\>))", "$1$2");
or
String.Concat(Regex.Matches(s, @"\<\d{3}\>")
.OfType<Match>().Select(a => a.Groups[0]));
You should use brackets as suggested, but I didn't get what exactly you wanted to replace.
string s = "<123>, ;<123>:::,<123>";
s = (new Regex("[<>:, ;]")).Replace(s, "\"");
The final string will be:
"123"""""123""""""123"
I have this regex:
Regex.Replace(listing.Company, @"[^A-Za-z0-9_\.~]+", "-");
listing.Company is a string. This works, but when the string has dots it does not remove them.
Could you please help me out?
In your current regex, \. appears inside the negated character class [^...], so dots are among the characters the pattern leaves alone; that is why Regex.Replace does not touch them. Also, your regex does nothing to convert the input string to lower case. You can do that afterwards, but doing it before the Replace makes the pattern simpler.
Try this method out:
var output = Regex.Replace(listing.Company.ToLower(), "[^a-z0-9_]+", "-");
Perhaps you are looking for something like this:
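string res = Regex.Replace(listing.Company, @"[\W+\.~]", "-");
Here the regex engine looks for any non-word character (anything other than a letter, digit, or underscore), which already covers the dot and ~, as well as a literal + (inside a character class + is not a quantifier), and replaces each one with "-".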
Try:
Regex.Replace(listing.Company.ToLower(), @"[^a-z0-9_]+", "-");
You are excluding \., which stands for a literal dot, so dots are kept.
Also, if you want the result in lower case, you need to convert the string to lower case first.
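As a quick check of the lower-case-then-replace approach, here is a small sketch. ToSlug is a hypothetical helper name, the sample company name is invented, and ToLowerInvariant is used on the assumption that culture-specific casing is not wanted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugDemo
{
    // Hypothetical helper: lower-case first, then collapse every run of
    // characters outside a-z, 0-9 and underscore into a single hyphen.
    public static string ToSlug(string company) =>
        Regex.Replace(company.ToLowerInvariant(), "[^a-z0-9_]+", "-");

    public static void Main()
    {
        Console.WriteLine(ToSlug("Acme Inc. & Co")); // acme-inc-co
    }
}
```

Because the dot is no longer listed inside the negated class, it is replaced along with spaces and other punctuation.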
I have a string in a JSON-like format:
XYZ DIV Parameters: width=\"1280\" height=\"720\", session=\"1\"
Now I want to remove width=\"1280\" height=\"720\" from this string.
Note: There can be any number in place of 1280 and 720. So, I can't just replace it with null.
How can I solve this, either with a regex or with any better method?
Regex to be replaced with empty string:
(width|height)=\\"\d+\\"
Code:
string input = @"XYZ DIV Parameters: width=\""1280\"" height=\""720\"", session=\""1\""";
string output = Regex.Replace(input, @"(width|height)=\\""\d+\\""", string.Empty);
You could do a find and replace using the following regex:
width=\\"\d+\\" replaced with a blank string, and likewise height=\\"\d+\\" replaced with a blank string.
This removes the entire text of width=\"XYZ\". If you only want to blank out the numbers, replace with a string that suits your needs (width=\"\" for example).
If you can guarantee the width and height will ALWAYS be in that format and ALWAYS follow each other separated by a space, you can combine them into one bigger regex find/replace using width=\\"\d+\\" height=\\"\d+\\".
A little more explanation of the regex, so you take something away and not just a quick fix :)
width=\\"\d+\\" breaks down to:
width= pretty simple, just the literal text that starts your removal.
\\" since \ is a special char in regex you have to escape it; the " char can then follow as normal.
\d+ digits \d, one or more of them +. No non-greedy modifier is needed here: \d matches only digits, so it can never run past the closing \". (A non-greedy quantifier would be written *? or +?; the form *+ is a possessive quantifier in some other engines and is not supported by .NET.)
\\" to end the regex.
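The combined width/height removal can be sketched in C# as follows. This assumes the two attributes always appear together and are separated by one space, and it uses the plain quantifier \d+ (one or more digits). RemoveSize is a hypothetical helper name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RemoveSizeDemo
{
    // Regex text: width=\\"\d+\\" height=\\"\d+\\"
    // (written as a verbatim string, so each " is doubled)
    private const string Pattern = @"width=\\""\d+\\"" height=\\""\d+\\""";

    // Hypothetical helper: deletes the width/height pair from the input.
    public static string RemoveSize(string input) =>
        Regex.Replace(input, Pattern, string.Empty);

    public static void Main()
    {
        string input = @"XYZ DIV Parameters: width=\""1280\"" height=\""720\"", session=\""1\""";
        Console.WriteLine(RemoveSize(input));
        // XYZ DIV Parameters: , session=\"1\"
    }
}
```

The comma and space after the pair are left in place; extend the pattern if they should go as well.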
This keeps the width and height attributes but blanks out their values:
string resultString = null;
try {
Regex regexObj = new Regex(@"^(.*?)width=\\"".*?\\"" height=\\"".*?\\""(.*?)$", RegexOptions.IgnoreCase);
resultString = regexObj.Replace(subjectString, @"$1width=\""\"" height=\""\""$2");
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
I'm quite the Regex novice. I have a series of strings similar to "[$myVar.myVar_STATE]", and I need to replace the second myVar, the one that begins with a period and ends with an underscore. It has to match exactly: sometimes I'll have "[$myVar.myVar_moreMyVar_STATE]", and in that case I don't want to replace anything.
I've tried things like "\b.myVar_\b", "\.\bmyVar_\b" and several more, all with no luck.
How about this:
\[\$myVar\.([^_]+)_STATE\]
Matches:
[$myVar.myVar_STATE] // matches and captures 'myVar'
[$myVar.myVar_moreMyVar_STATE] // no match
Working regex example:
http://regex101.com/r/yM9jQ3
Or if _STATE was variable, you could use this: (as long as the text in the STATE part does not have underscores in it.)
\[\$myVar\.([^_]+)_[^_]+\]
Working regex example:
http://regex101.com/r/kW8oE1
Edit: Following the OP's comments below, this should be what he's going for:
(\[\$myVar\.)([^_]+)(_[^_]+\])
Regex replace example:
http://regex101.com/r/pU6yL8
C#
var pattern = @"(\[\$myVar\.)([^_]+)(_[^_]+\])";
var replaced = Regex.Replace(input, pattern, "${1}" + newVar + "$3");
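Here is a self-contained sketch of the three-group replacement, checked against both sample strings from the question. Rename is a hypothetical helper name. The replacement uses ${1} instead of $1 so that a new name starting with a digit is not read as part of the group number.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupReplaceDemo
{
    // Group 1: "[$myVar."   Group 2: old name   Group 3: "_STATE]"
    private const string Pattern = @"(\[\$myVar\.)([^_]+)(_[^_]+\])";

    // Hypothetical helper: swaps group 2 for newVar, keeping groups 1 and 3.
    // Assumes newVar contains no '$', which is special in replacement strings.
    public static string Rename(string input, string newVar) =>
        Regex.Replace(input, Pattern, "${1}" + newVar + "$3");

    public static void Main()
    {
        Console.WriteLine(Rename("[$myVar.myVar_STATE]", "other"));
        // [$myVar.other_STATE]
        Console.WriteLine(Rename("[$myVar.myVar_moreMyVar_STATE]", "other"));
        // [$myVar.myVar_moreMyVar_STATE] (no match, unchanged)
    }
}
```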
What about something like:
.*\.(myVar_).*
This looks for anything, then a literal . (escaped, since an unescaped . matches any character) and "myVar_", followed by anything.
It matches:
"[$myVar.myVar_STATE]"
And only the first myVar_ here:
"[$myVar.myVar_moremyVar_STATE]"
This should do it:
\[\$myVar\.(.*?)_STATE\]
You can use this little trick to pick out the groups, and build the replacement at the end, like so:
var replacement = "something";
var input = #"[$myVar.myVar_STATE]";
var pattern = @"(\[\$myVar\.)(.*?)_(.*?)]";
var replaced = Regex.Replace(input, pattern, "${1}" + replacement + "_$3]");
C# already has built-in methods to do this:
string text = ".asda_";
Response.Write((text.StartsWith(".") && text.EndsWith("_")));
Is Regex really required?
string input = "[$myVar.myVar_STATE]";
string oldVar = "myVar";
string newVar = "myNewVar";
string result = input.Replace("." + oldVar + "_STATE]", "." + newVar + "_STATE]");
In case "STATE" is a variable part, then we'll need to use Regex. The easiest way is to use this Regex pattern which matches a position between a prefix and a suffix. Prefix and suffix are used for searching but are not included in the resulting match:
(?<=prefix)find(?=suffix)
result =
Regex.Replace(input, @"(?<=\.)" + Regex.Escape(oldVar) + "(?=_[A-Z]+])", newVar);
Explanation:
The prefix part is \., which stands for ".".
The find part is the escaped old variable to be replaced. Regex escaping makes sure that characters with a special meaning in Regex are escaped.
The suffix part is _[A-Z]+], an underscore followed by at least one letter followed by "]". Note: the closing ] does not need to be escaped. An opening bracket [ would have to be escaped like this: \[. We cannot use \w for word characters for the STATE part, as \w includes underscores. You might have to adapt the [A-Z] part to exactly match all possible states (e.g. if the state has digits, use [A-Z0-9]).
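The prefix/find/suffix idea can be tried out with the short sketch below. ReplaceVar is a hypothetical helper name, and the state is assumed to consist of upper-case letters only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: replaces oldVar only where it sits between "."
    // and "_STATE]"-like text. The lookbehind and lookahead are checked
    // but are not part of the match, so only oldVar itself is replaced.
    // Assumes newVar contains no '$', which is special in replacement strings.
    public static string ReplaceVar(string input, string oldVar, string newVar)
    {
        string pattern = @"(?<=\.)" + Regex.Escape(oldVar) + @"(?=_[A-Z]+\])";
        return Regex.Replace(input, pattern, newVar);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceVar("[$myVar.myVar_STATE]", "myVar", "myNewVar"));
        // [$myVar.myNewVar_STATE]
        Console.WriteLine(ReplaceVar("[$myVar.myVar_moreMyVar_STATE]", "myVar", "myNewVar"));
        // [$myVar.myVar_moreMyVar_STATE] (suffix does not match, unchanged)
    }
}
```

The first myVar is never touched because it is preceded by "$" rather than ".".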
I have a bunch of strings and I am using a Regex to replace unwanted characters as needed.
However, I am having an issue with removing dates, example: 1/09/2014 1/29 or 1-29.
How can I remove those? I'm experimenting with something like this, taken from "Strip Invalid Characters", but it is way off; simply listing individual characters does not work:
Regex.Replace(strIn, @"[^\w\.#-]", "");
Sample input will look exactly like this: Today 01/29/2014 I will go to the concert.
Output: Today I will go to the concert.
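This should work. To generate the date pattern I used http://txt2re.com/, a very handy tool. The non-greedy filler that the tool puts in front of the pattern is left out here: including it in the Replace call would also delete the text before the date.
string txt="Today 01/29/2014 I will go to the concert";
string re2="((?:[0]?[1-9]|[1][012])[-:\\/.](?:(?:[0-2]?\\d{1})|(?:[3][01]{1}))[-:\\/.](?:(?:[1]{1}\\d{1}\\d{1}\\d{1})|(?:[2]{1}\\d{3})))(?![\\d])"; // MM/DD/YYYY
string re3="\\s?"; // Optional whitespace after the date, so no double space is left behind
var newString = Regex.Replace(txt, re2+re3, ""); // "Today I will go to the concert"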
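The question also mentions shorter forms such as 1/29 and 1-29. A simpler pattern that covers those is sketched below. It is an assumption about what counts as a date, not a validated date parser: it would also remove things like "3-4" in "3-4 people". StripDates is a hypothetical helper name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripDatesDemo
{
    // One or two digits, a / or -, one or two digits, then an optional
    // /year or -year part, plus at most one trailing whitespace character.
    private const string Pattern = @"\b\d{1,2}[/-]\d{1,2}(?:[/-]\d{2,4})?\b\s?";

    // Hypothetical helper: removes every date-like token from the text.
    public static string StripDates(string text) =>
        Regex.Replace(text, Pattern, string.Empty);

    public static void Main()
    {
        Console.WriteLine(StripDates("Today 01/29/2014 I will go to the concert."));
        // Today I will go to the concert.
        Console.WriteLine(StripDates("Meet 1-29 at noon"));
        // Meet at noon
    }
}
```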
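Try this regex. It matches the text before and after an MM/DD/YYYY date (two-digit month and day, separated by / or -) rather than the date itself, so you would keep the matches instead of replacing them: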
(?<=[0-9]{2}[\/\-][0-9]{2}[\/\-][0-9]{4}).*$|.*(?=[0-9]{2}[\/\-][0-9]{2}[\/\-][0-9]{4})
I have some imported data that leaves me with invalid character symbols such as:
Caf�
Just wondering what's the easiest way to find/replace these in string content?
var newString = yourString.Replace("�", "");
where yourString is Caf�.
The special character can be used in the Replace statement. It should be as simple as that.
This may help you. Results depend on what type of text you want to keep or remove...
MSDN: How to: Strip Invalid Characters from a String.
This will remove every non-alphanumeric character while leaving punctuation and spaces intact:
string result = Regex.Replace(textBox1.Text, @"[^\w\p{P} ]+", "");
If you want only letters and numbers, and want to clear punctuation too, remove \p{P} from the expression.