Regex to remove dots from string - c#

I have this regex:
Regex.Replace(listing.Company, @"[^A-Za-z0-9_\.~]+", "-");
listing.Company is a string. This works, but when the string contains dots it does not remove them.
Could you please help me out?

In your current regex, \. sits inside the negated character class [^...], so dots count as allowed characters and Regex.Replace leaves them alone. Also, your regex does nothing to convert the input string to lower case. You can do that afterwards, but doing it before the Replace makes your pattern simpler.
Try this method out:
var output = Regex.Replace(listing.Company.ToLower(), "[^a-z0-9_]+", "-");
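A minimal sketch of that replacement, written as C# top-level statements. The sample company name is made up for illustration, and ToLowerInvariant is used here instead of ToLower so the result does not depend on the current culture; that is a choice of this sketch, not something the question requires.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample value standing in for listing.Company.
string company = "Acme Inc. & Sons~Ltd";

// Lowercase first, then collapse every run of characters outside
// a-z, 0-9 and underscore (dots and tildes included) into one hyphen.
string slug = Regex.Replace(company.ToLowerInvariant(), "[^a-z0-9_]+", "-");

Console.WriteLine(slug); // acme-inc-sons-ltd
```

Because the quantifier + applies to the whole negated class, the run ". & " becomes a single hyphen rather than four.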

Perhaps you are looking for something like this:
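string res = Regex.Replace(listing.Company, @"[\W+\.~]", "-");
Here the regex engine looks for any non-word character (\W matches anything other than a letter, a digit, or an underscore) and replaces it with "-". Dot and ~ are already non-word characters, so listing them is redundant, and inside a character class + is a literal plus sign rather than a quantifier.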

Try
Regex.Replace(listing.Company.ToLower(), @"[^a-z0-9_]+", "-");
Your character class lists \. (the dot) among the characters to keep, so dots are never replaced.
Also, if you want the result in lowercase, you need to convert the string to lower case first.

Related

How to split Alphanumeric with Symbol in C#

I want to split an alphanumeric string into two parts, a numeric part and an alphabetic part, where the two are separated by a special character such as -
string mystring = "1- Any Thing";
I want to store them like this:
numberPart = 1
alphaPart = Any Thing
For this I am using Regex:
Regex re = new Regex(@"([a-zA-Z]+)(\d+)");
Match result = re.Match("1- Any Thing");
string alphaPart = result.Groups[1].Value;
string numberPart = result.Groups[2].Value;
If there is no space or symbol between the two parts, this works fine, but with a space and a symbol both alphaPart and numberPart come back empty. What am I doing wrong? Perhaps the regex is wrong for this kind of input; please advise.
Try this:
(\d+)(?:[^\w]+)?([a-zA-Z\s]+)
Explanation:
(\d+) - captures one or more digits
[^\w]+ - matches one or more non-word characters (anything other than letters, digits, and underscore)
(?:...)? - makes that separator optional, so the pattern still matches when nothing sits between the number and the text
[a-zA-Z\s]+ - matches letters, including any whitespace between them
Start of string is matched with ^.
Digits are matched with \d+.
Any non-alphanumeric characters are matched with [\W_] or \W.
Anything is matched with .*.
Use
(?s)^(\d+)\W*(.*)
(?s) makes . match linebreaks. So, it literally matches everything.
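A short sketch of how that pattern could be applied to the string from the question, written as C# top-level statements. Falling back to an empty string when nothing matches is an assumption of this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

string mystring = "1- Any Thing";

// Group 1: leading digits. \W* skips the separator ("- ").
// Group 2: everything that remains, including line breaks because of (?s).
Match m = Regex.Match(mystring, @"(?s)^(\d+)\W*(.*)");

string numberPart = m.Success ? m.Groups[1].Value : "";
string alphaPart = m.Success ? m.Groups[2].Value : "";

Console.WriteLine($"numberPart = {numberPart}"); // numberPart = 1
Console.WriteLine($"alphaPart = {alphaPart}");   // alphaPart = Any Thing
```

Checking m.Success first matters because Groups[n].Value is an empty string, not null, when the match fails.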

Regex to match a word beginning with a period and ending with an underscore?

I'm quite the Regex novice, but I have a series of strings similar to this "[$myVar.myVar_STATE]" I need to replace the 2nd myVar that begins with a period and ends with an underscore. I need it to match it exactly, as sometimes I'll have "[$myVar.myVar_moreMyVar_STATE]" and in that case I wouldn't want to replace anything.
I've tried things like "\b.myVar_\b", "\.\bmyVar_\b" and several more, all to no luck.
How about this:
\[\$myVar\.([^_]+)_STATE\]
Matches:
[$myVar.myVar_STATE] // matches and captures 'myVar'
[$myVar.myVar_moreMyVar_STATE] // no match
Working regex example:
http://regex101.com/r/yM9jQ3
Or if _STATE was variable, you could use this: (as long as the text in the STATE part does not have underscores in it.)
\[\$myVar\.([^_]+)_[^_]+\]
Working regex example:
http://regex101.com/r/kW8oE1
Edit: Following the OP's comments below, this should be what they're going for:
(\[\$myVar\.)([^_]+)(_[^_]+\])
Regex replace example:
http://regex101.com/r/pU6yL8
C#
var pattern = @"(\[\$myVar\.)([^_]+)(_[^_]+\])";
var replaced = Regex.Replace(input, pattern, "$1" + newVar + "$3");
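A runnable sketch of the three-group replacement, written as C# top-level statements. The value of newVar is made up, and the ${1} / ${3} form is used instead of $1 / $3 so that a replacement value starting with a digit cannot be misread as part of the group number; that is a defensive choice of this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

string newVar = "myNewVar"; // hypothetical replacement name
string pattern = @"(\[\$myVar\.)([^_]+)(_[^_]+\])";

// Group 1 is the "[$myVar." prefix, group 2 the name to swap out,
// group 3 the "_STATE]" suffix. Only group 2 is replaced.
string replaced = Regex.Replace("[$myVar.myVar_STATE]", pattern, "${1}" + newVar + "${3}");

// Two underscores after the dot: [^_]+ cannot cross them, so nothing matches.
string untouched = Regex.Replace("[$myVar.myVar_moreMyVar_STATE]", pattern, "${1}" + newVar + "${3}");

Console.WriteLine(replaced);  // [$myVar.myNewVar_STATE]
Console.WriteLine(untouched); // [$myVar.myVar_moreMyVar_STATE]
```

If newVar could itself contain a $ character, it would need escaping as $$ first, since $ is special in replacement strings.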
What about something like:
.*\.(myVar_).*
This looks for anything, then a literal . and "myVar_", followed by anything.
It matches:
"[$myVar.myVar_STATE]"
And only the first myVar_ here:
"[$myVar.myVar_moremyVar_STATE]"
This should do it:
\[\$myVar\.(.*?)_STATE\]
You can use this little trick to pick out the groups, and build the replacement at the end, like so:
var replacement = "something";
var input = @"[$myVar.myVar_STATE]";
var pattern = @"(\[\$myVar\.)(.*?)_(.*?)]";
var replaced = Regex.Replace(input, pattern, "$1" + replacement + "_$3]");
C# already has a built-in method to do this:
string text = ".asda_";
Response.Write((text.StartsWith(".") && text.EndsWith("_")));
Is Regex really required?
string input = "[$myVar.myVar_STATE]";
string oldVar = "myVar";
string newVar = "myNewVar";
string result = input.Replace("." + oldVar + "_STATE]", "." + newVar + "_STATE]");
In case "STATE" is a variable part, then we'll need to use Regex. The easiest way is to use this Regex pattern which matches a position between a prefix and a suffix. Prefix and suffix are used for searching but are not included in the resulting match:
(?<=prefix)find(?=suffix)
result =
Regex.Replace(input, @"(?<=\.)" + Regex.Escape(oldVar) + "(?=_[A-Z]+])", newVar);
Explanation:
The prefix part is \., which stands for ".".
The find part is the escaped old variable to be replaced. Regex escaping makes sure that characters with a special meaning in Regex are escaped.
The suffix part is _[A-Z]+], an underscore followed by at least one letter followed by "]". Note: the second ] does not need to be escaped. An opening bracket [ would have to be escaped like this: \[. We cannot use \w for word characters for the STATE part, as \w includes underscores. You might have to adapt the [A-Z] part to exactly match all possible states (e.g. if the state has digits, use [A-Z0-9]).
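The lookbehind/lookahead approach can be sketched as C# top-level statements. The variable values are the ones used in this answer; the closing bracket is escaped as \] in this sketch purely for readability, which does not change what the pattern matches.

```csharp
using System;
using System.Text.RegularExpressions;

string oldVar = "myVar";
string newVar = "myNewVar";

// (?<=\.) requires a preceding dot, (?=_[A-Z]+\]) requires "_STATE]"-like text after.
// Neither lookaround is part of the match, so only oldVar itself is replaced.
string pattern = @"(?<=\.)" + Regex.Escape(oldVar) + @"(?=_[A-Z]+\])";

string replaced = Regex.Replace("[$myVar.myVar_STATE]", pattern, newVar);
string untouched = Regex.Replace("[$myVar.myVar_moreMyVar_STATE]", pattern, newVar);

Console.WriteLine(replaced);  // [$myVar.myNewVar_STATE]
Console.WriteLine(untouched); // [$myVar.myVar_moreMyVar_STATE]
```

The second input stays unchanged because "_moreMyVar_STATE]" begins with a lowercase letter after the underscore, which [A-Z]+ rejects.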

Replace Pattern with ""

I'm trying to write a regex that will take a string of the form:
<123>, ;<123>:::,<123>
where 123 is some number and in between the numbers is some punctuation.
I need a regex that will replace all the punctuation between the number fields with "".
I tried this:
Regex.Replace(s, ">.*<", "");
But had no luck. What regex would accomplish this?
Edit: My original regex was a bit misleading, sorry! As the commenters said, I'm looking for <123><123><123>
Not sure about the exact C# syntax either, but if your string is guaranteed not to have numbers outside those angle brackets, then you should be able to get away with this:
Regex.Replace(s, @"[^\d<>]+", "");
So remove anything that isn't a number or "<" or ">". If you also want to remove the angle brackets it's even simpler:
Regex.Replace(s, @"[^\d]+", "");
You need to make the .* part non-greedy, otherwise it will pick up everything between the first > and the last < in your string. Try something like:
Regex.Replace(s, ">.*?<", "");
This will erase the > and < chars also. If you want to preserve those:
Regex.Replace(s, ">.*?<", "><");
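A small sketch contrasting the greedy and non-greedy forms on the string from the question, written as C# top-level statements. The character-class variant at the end assumes, as one of the answers does, that digits never appear outside the angle brackets.

```csharp
using System;
using System.Text.RegularExpressions;

string s = "<123>, ;<123>:::,<123>";

// Greedy: .* runs from the first '>' to the last '<', swallowing the middle field.
string greedy = Regex.Replace(s, ">.*<", "><");

// Non-greedy: .*? stops at the nearest '<', so each gap is handled separately.
string lazy = Regex.Replace(s, ">.*?<", "><");

// Alternative: delete every run of characters that is not a digit or angle bracket.
string byClass = Regex.Replace(s, @"[^\d<>]+", "");

Console.WriteLine(greedy);  // <123><123>
Console.WriteLine(lazy);    // <123><123><123>
Console.WriteLine(byClass); // <123><123><123>
```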
Both of these should work:
Regex.Replace(s, @"(\>|^).*?($|\<(?=\d{3}\>))", "$1$2");
or
String.Concat(Regex.Matches(s, @"\<\d{3}\>")
.OfType<Match>().Select(a => a.Groups[0]));
You should use brackets as suggested, but I didn't get what exactly you wanted to replace.
string s = "<123>, ;<123>:::,<123>";
s = (new Regex("[<>:, ;]")).Replace(s, "\"");
The final string will be:
"123"""""123""""""123"

Remove punctuation from string with Regex

I'm really bad with Regex, but I want to remove all these .,;:'"$#@!?/*&^-+ from a string:
string x = "This is a test string, with lots of: punctuations; in it?!.";
How can I do that?
First, please read here for information on regular expressions. It's worth learning.
You can use this:
Regex.Replace("This is a test string, with lots of: punctuations; in it?!.", @"[^\w\s]", "");
Which means:
[ #Character block start.
^ #Not these characters (letters, numbers).
\w #Word characters.
\s #Space characters.
] #Character block end.
In the end it reads "replace any character that is not a word character or a space character with nothing."
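A runnable sketch of that replacement on the string from the question, written as C# top-level statements. The second input is made up to show one consequence worth knowing: \w includes the underscore, so underscores survive.

```csharp
using System;
using System.Text.RegularExpressions;

string x = "This is a test string, with lots of: punctuations; in it?!.";

// Remove every character that is neither a word character nor whitespace.
string cleaned = Regex.Replace(x, @"[^\w\s]", "");

// Hypothetical extra input: the underscore is a word character and is kept.
string withUnderscore = Regex.Replace("a_b-c", @"[^\w\s]", "");

Console.WriteLine(cleaned);        // This is a test string with lots of punctuations in it
Console.WriteLine(withUnderscore); // a_bc
```

If underscores should go too, a class such as [^\w\s]|_ would be one way to extend the pattern.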
This code shows the full RegEx replace process and gives a sample Regex that only keeps letters, numbers, and spaces in a string - replacing ALL other characters with an empty string:
//Regex to remove all non-alphanumeric characters
System.Text.RegularExpressions.Regex TitleRegex = new
System.Text.RegularExpressions.Regex("[^a-z0-9 ]+",
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
string ParsedString = TitleRegex.Replace(stringToParse, String.Empty);
return ParsedString;
And I've also stored the code here for future use:
http://code.justingengo.com/post/Use%20a%20Regular%20Expression%20to%20Remove%20all%20Punctuation%20from%20a%20String
Sincerely,
S. Justin Gengo
http://www.justingengo.com

.NET RegEx for letters and spaces

I am trying to create a regular expression in C# that allows only alphanumeric characters and spaces. Currently, I am trying the following:
string pattern = @"^\w+$";
Regex regex = new Regex(pattern);
if (regex.IsMatch(value) == false)
{
// Display error
}
What am I doing wrong?
If you just need English, try this regex:
"^[A-Za-z ]+$"
The brackets specify a set of characters
A-Z: All capital letters
a-z: All lowercase letters
' ': Spaces
If you need Unicode / internationalization, you can try this regex:
@"^[\p{L}\s]+$"
See https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#word-character-w
This regex will match all unicode letters and spaces, which may be more than you need, so if you just need English / basic Roman letters, the first regex will be simpler and faster to execute.
Note that for both regex I have included the ^ and $ operator which mean match at start and end. If you need to pull this out of a string and it doesn't need to be the entire string, you can remove those two operators.
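A brief sketch comparing an ASCII-only pattern with a Unicode-aware one, written as C# top-level statements. It assumes the Unicode variant is meant to be anchored at both ends, i.e. ^[\p{L}\s]+$; the sample inputs are made up, and the accented letter is written as an escape so the example does not depend on file encoding.

```csharp
using System;
using System.Text.RegularExpressions;

Regex asciiOnly = new Regex("^[A-Za-z ]+$");
Regex unicodeLetters = new Regex(@"^[\p{L}\s]+$");

string plain = "Hello World";
string accented = "H\u00e9llo World"; // "Héllo World"
string withDigits = "Hello World 42";

bool asciiPlain = asciiOnly.IsMatch(plain);            // true
bool asciiAccented = asciiOnly.IsMatch(accented);      // false: é is outside A-Z/a-z
bool unicodeAccented = unicodeLetters.IsMatch(accented); // true: \p{L} covers any letter
bool unicodeDigits = unicodeLetters.IsMatch(withDigits); // false: digits are not letters

Console.WriteLine($"{asciiPlain} {asciiAccented} {unicodeAccented} {unicodeDigits}");
```

To allow digits as well, as the question asks, \p{Nd} could be added to the Unicode class or 0-9 to the ASCII one.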
Try this for all letters with spaces:
@"^[\p{L} ]+$"
The character class \w does not match spaces. Try replacing it with [\w ] (there's a space after the \w to match word characters and spaces). You could also replace the space with \s if you want to match any whitespace.
If, other than 0-9, a-z and A-Z, you also need to cover accented letters like ï, é, æ, Ć or Ş, then you are better off using the Unicode categories \p{...} for matching, i.e. (note the space):
string pattern = @"^[\p{L}\p{Nd} ]+$";
This regex works great for me.
Regex rgx = new Regex("[^a-zA-Z0-9_ ]+");
if (rgx.IsMatch(yourstring))
{
var err = "Special characters are not allowed in Tags";
}
