Regex to remove certain repeating characters but ignore others [duplicate]


I'm trying to find a regexp that only matches strings if they don't contain a dot, e.g. it matches stackoverflow, 42abc47 or a-bc-31_4 but doesn't match .swp, stack.overflow or test..

^[^.]*$
or
^[^.]+$
Which one to use depends on whether you want to match the empty string. Some applications implicitly supply the ^ and $, in which case they are unnecessary; the HTML5 input element's pattern attribute is one example.
You can find a lot more great information on the regular-expressions.info site.

Use a regex that doesn't have any dots:
^[^.]*$
That is zero or more characters that are not dots in the whole string. Some regex libraries I have used in the past had ways of getting an exact match. In that case you don't need the ^ and $. Having a language in your question would help.
By the way, you don't have to use a regex. In Java you could write:
!someString.contains(".");
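Both approaches above carry over to C#. The following is a minimal sketch, not a definitive implementation; the class and method names (DotFreeCheck, HasNoDotRegex, HasNoDotPlain) are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotFreeCheck
{
    // Hypothetical helper: true when the string contains no '.' (the empty string is allowed).
    public static bool HasNoDotRegex(string s) => Regex.IsMatch(s, @"^[^.]*$");

    // The same check without a regex.
    public static bool HasNoDotPlain(string s) => !s.Contains(".");

    public static void Main()
    {
        // Sample strings taken from the question.
        foreach (var s in new[] { "stackoverflow", "42abc47", "a-bc-31_4", ".swp", "test.." })
        {
            Console.WriteLine("{0}: regex={1}, plain={2}", s, HasNoDotRegex(s), HasNoDotPlain(s));
        }
    }
}
```

Swapping * for + in the pattern would reject the empty string, which the plain Contains check does not do on its own.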

Validation requirement: the first character must be a letter, and a dot '.' is not allowed anywhere in the target string.
// The input string we are using
string input = "1A_aaA"; // starts with a digit, so this sample prints "Not Match"
// First character must be a letter; the rest may be anything except a dot
Regex r1 = new Regex("^[A-Za-z][^.]*$");
// Match the input and write results
Match match = r1.Match(input);
if (match.Success)
{
Console.WriteLine("Valid: {0}", match.Value);
}
else
{
Console.WriteLine("Not Match");
}

Related

Writing a proper regex to allow number and only combinations of letters and numbers mixed up

I have a string example which looks like this:
51925120851209567
The length of the string and numbers may vary, however I want to only enable the string to contain just either numbers, or for it to be a combination of letters and numbers. For example a valid one would be something like this:
B0031Y4M8S // contains combination of letters and numbers without white space
An invalid string would be:
Does not apply // this one contains white spaces and has only letters
To summarize things up, the regex should allow only these combinations:
51925120851209567 // contains only numbers and is valid
B0031Y4M8S // contains combination of numbers and letters and is valid as well
Everything else is invalid...
The current solution that I have only allows the string to be a set of digits and nothing else. However, I'm not sure how to also accept a combination of numbers and letters, without white space or special characters, as valid.
Regex regex = new Regex("^[0-9]+$");
if (regex.IsMatch(parameter))
{
// allow if statement to pass if the regex matches
}
Can someone help me out?

You may use
^(?![A-Za-z]+$)[0-9A-Za-z]+$
It matches 1+ alphanumeric chars but fails the match if the whole string consists of just letters.
Details
^ - start of a string
(?![A-Za-z]+$) - a negative lookahead that fails the match if there are 1+ ASCII letters followed with the end of string immediately to the right of the current location
[0-9A-Za-z]+ - 1+ ASCII letters or digits
$ - end of string.
See the regex demo.
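As a rough illustration, the lookahead pattern can be dropped into the question's C# check like this. This is a sketch only; the AlnumCheck class and IsValid method are hypothetical names, and the sample strings come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlnumCheck
{
    // Digits only, or letters mixed with digits; letters-only strings are rejected by the lookahead.
    private static readonly Regex Pattern = new Regex(@"^(?![A-Za-z]+$)[0-9A-Za-z]+$");

    // Hypothetical helper wrapping the pattern.
    public static bool IsValid(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        foreach (var s in new[] { "51925120851209567", "B0031Y4M8S", "Does not apply", "abc" })
        {
            Console.WriteLine("{0}: {1}", s, IsValid(s));
        }
    }
}
```

The first two strings are accepted; "Does not apply" fails because of the spaces, and "abc" fails because it is letters only.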

@The fourth bird's answer will almost get you there. I'm no regex expert, but an easy way to get you what you want would be to use:
Regex regex = new Regex("^[a-zA-Z0-9]+$");
This will get you the first level of exclusion. If it passes that, then check with:
Regex regex = new Regex("^[a-zA-Z]+$");
If it matches that, then you know it's only alphabetical characters and you can skip it. I'm sure there's a better way to code golf this one out, but this should work for now if you're in a crunch.

Regex working in Regexr but not C#, why?

From the below mentioned input string, I want to extract the values specified in {} for s:ds field. I have attached my regex pattern. Now the pattern I used for testing on http://www.regexr.com/ is:
s:ds=\\\"({[\d\w]{8}\-([\d\w]{4}\-){3}[\d\w]{12}})\\\"
and it works absolutely fine.
But the same pattern does not work in C# code. I have also used \\ instead of \ for the C# code and replaced \" with \"". Let me know if I'm doing something wrong. Below is the code snippet.
string inputString is "s:ds=\"{46C01EB7-6D43-4E2A-9267-608DE8AFA311}\" s:ds=\"{37BA4BA0-581C-40DC-A542-FFD9E99BC345}\" s:id=\"{C091E71D-4817-49BC-B120-56CE88BC52C2}\"";
string regex = @"s:ds=\\\""({[\d\w]{8}\-(?:[\d\w]{4}\-){3}[\d\w]{12}})\\\""";
MatchCollection matchCollection = Regex.Matches(layoutField, regex);
if (matchCollection.Count > 1)
{
Log.Info("Collection Found.", this);
}

If you only want to match the values...
You should be able to just use ([\d\w]{8}\-([\d\w]{4}\-){3}[\d\w]{12}) for your expression if you only want to match the values within your curly braces:
string input = "s:ds=\"{46C01EB7-6D43-4E2A-9267-608DE8AFA311} ...";
// Use the following expression to just match your GUID values
string regex = @"([\d\w]{8}\-([\d\w]{4}\-){3}[\d\w]{12})";
// Store your matches
MatchCollection matchCollection = Regex.Matches(input, regex);
// Iterate through each one
foreach(var match in matchCollection)
{
// Output the match
Console.WriteLine("Collection Found : {0}", match);
}
If you want to only match those following s:ds...
If you only want to capture the values for s:ds sections, you could consider prepending (?<=(s:ds=""{)) to your expression (written here as it appears inside a C# verbatim string). This is a look-behind that only matches values preceded by s:ds="{ :
string regex = @"(?<=(s:ds=""{))([\d\w]{8}\-([\d\w]{4}\-){3}[\d\w]{12})";
Note that this does not match the s:id element.
Another Consideration
Currently you are using \w to match "word" characters within your expression, and while this might work for your uses, it matches all digits (\d), letters (a-zA-Z) and underscores (_). It's unlikely that you need all of these, so consider narrowing your character sets to what you actually expect, such as [A-Z\d] to match only uppercase letters and digits, or [0-9A-Fa-f] if you are only expecting GUID values (i.e. hex).

Looks like you might be over-escaping.
Give this a shot:
@"s:ds=\""({[\d\w]{8}\-([\d\w]{4}\-){3}[\d\w]{12}})\"""
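To illustrate the escaping point, here is a minimal sketch run against the question's input string. It assumes the values are hex GUIDs, so it swaps [\d\w] for [0-9A-Fa-f] and escapes the braces; those choices, and the DsExtractor and ExtractDsValues names, are illustrative assumptions rather than part of the original answers.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DsExtractor
{
    // Hypothetical helper: returns the brace-wrapped value of every s:ds="..." attribute.
    public static List<string> ExtractDsValues(string input)
    {
        // In a verbatim string, "" is a literal quote, so no backslash is needed before it.
        const string pattern = @"s:ds=""(\{[0-9A-Fa-f]{8}-(?:[0-9A-Fa-f]{4}-){3}[0-9A-Fa-f]{12}\})""";
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            values.Add(m.Groups[1].Value);
        }
        return values;
    }

    public static void Main()
    {
        string input = "s:ds=\"{46C01EB7-6D43-4E2A-9267-608DE8AFA311}\" s:ds=\"{37BA4BA0-581C-40DC-A542-FFD9E99BC345}\" s:id=\"{C091E71D-4817-49BC-B120-56CE88BC52C2}\"";
        foreach (var value in ExtractDsValues(input))
        {
            Console.WriteLine(value);
        }
    }
}
```

Only the two s:ds values are returned; the s:id value is skipped because the attribute name does not match.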

Regular Expression to match /u/{word or underscore or numbers}

I have tried and failed for two days now to successfully match /u/{word or underscore or numbers}. I also need to ignore the value if it is in a link (ex: <a href="asdfasdf/u/word" />). I have exhausted all options. Can someone please help me out here?
Edit: I am unfamiliar with regular expressions and am still trying to figure them out. Excuse me if this is a noobish question. And to clarify, I can get the matches fine. I just don't understand in Regex how to ignore a match completely if a certain character follows.
Example:
/u/username
/u/username this is
this/is/u/user
<a href="http://www.regex.com/u/something/" />
I want to match the first two occurrences of /u/username.
This is embarrassing, but here is my current regex /u/\w*[^"]

You can use this pattern:
/u/\w*
It will match the string /u/ followed by zero or more letters, numbers, or underscores. To ensure that the string consists only of this pattern, use start (^) and end ($) anchors, like this:
^/u/\w*$
For example:
string result = Regex.Match(input, @"^/u/\w*$").Value;
If you're trying to do some special parsing of HTML, I'm afraid regular expressions are a pretty bad option. You really should find some way of properly parsing the document first. Nevertheless, here's a very crude pattern that will ignore this sequence if it happens to be inside an href attribute (it also assumes the attribute value is surrounded by quotation marks):
(?<!href="[^"]*)/u/\w*
For example:
string input = @"<a href=""/u/foo"" /> /u/bar";
string pattern = @"(?<!href=""[^""]*)/u/\w*";
string result = Regex.Match(input, pattern).Value; // will match /u/bar but not /u/foo
This pattern will match any sequence that doesn't have a word character (letter, number, or underscore), quote, or forward slash in front of it:
(?<![\w""/])/u/\w*
This example shows how it can be used get all matches from the string:
var input = @"/u/username
/u/username this is
this/is/u/user <a href=""http://www.regex.com/u/something/"" />";
var pattern = @"(?<![\w""/])/u/\w*";
foreach(Match match in Regex.Matches(input, pattern))
{
System.Console.WriteLine(match.Value);
}
The output will be:
/u/username
/u/username

This regular expression will meet your test scenario
\w*(/u)*[a-z,A-Z,0-9]+$
This actually catches on the characters unique to HTML tags, so as long as you want to ignore HTML code, this will do the trick.

Regex to exclude all chars except letters

I'm a real regex n00b so I ask your help:
I need a regex which matches only letters and numbers and excludes punctuation, non-ASCII characters and spaces.
"ilikestackoverflow2012" would be a valid string.
"f### you °§è" not valid.
"hello world" not valid
"hello-world" and "*hello_world*" not valid
and so on.
I need it to make a possibly complex business name url friendly.
Thanks in advance!

You don't need regex for this.
string s = "......";
var isValid = s.All(Char.IsLetterOrDigit);
-
I need it to make a possibly complex business name url friendly
You can also use HttpUtility.UrlEncode
var urlFriendlyString = HttpUtility.UrlEncode(yourString);
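One caveat with the LINQ approach: Char.IsLetterOrDigit is Unicode-aware, so it also accepts letters such as è, which the question wants rejected. The sketch below contrasts it with an ASCII-only check. The AsciiAlnum class and both method names are hypothetical, and the empty-string rejection is an added assumption (All returns true for an empty sequence).

```csharp
using System;
using System.Linq;

public static class AsciiAlnum
{
    // Unicode-aware: accepts any letter or digit, including non-ASCII ones.
    public static bool IsUnicodeAlnum(string s) => s.Length > 0 && s.All(char.IsLetterOrDigit);

    // ASCII-only: accepts 0-9, A-Z and a-z, and nothing else.
    public static bool IsAsciiAlnum(string s) =>
        s.Length > 0 &&
        s.All(c => (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'));

    public static void Main()
    {
        foreach (var s in new[] { "ilikestackoverflow2012", "hello world", "hello-world", "è" })
        {
            Console.WriteLine("{0}: unicode={1}, ascii={2}", s, IsUnicodeAlnum(s), IsAsciiAlnum(s));
        }
    }
}
```

The two checks agree on the first three samples and differ only on "è", which the Unicode-aware version accepts.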

To validate a string you can use the following regular expression with Regex.IsMatch:
"^[0-9A-Za-z]+$"
Explanation:
^ is a start of string anchor.
[...] is a character class.
+ means one or more.
$ is an end of string anchor.
I need it to make a possibly complex business name url friendly
Then you want to replace the characters that don't match. Use Regex.Replace with the following regular expression:
"[^0-9A-Za-z]+"
Explanation:
[^...] is a negated character class.
Code:
string result = Regex.Replace(input, "[^0-9A-Za-z]+" , "");
See it working online: ideone
Note that different business names could give the same resulting string. For example, businesses whose names contain only Chinese characters will all give the empty string.

You can use below regex.
^[a-zA-Z0-9]+$

^[0-9a-zA-Z]+$
Matches one or more alphanumeric characters with no spaces or non-alpha characters.

Try this:
var regex = new Regex(@"^[a-zA-Z0-9]+$");
var test = new[] {"ilikestack", "hello world", "hello-world", "###"};
foreach (var s in test)
Console.WriteLine("{0}: {1}", s, regex.IsMatch(s));
EDIT: If you want something like @Andre_Miller said, use the negated form of the character class with Regex.Replace():
Regex.Replace(s, @"[^a-zA-Z0-9]+", "")
OR
var regex = new Regex(@"[^a-zA-Z0-9]+");
regex.Replace("input-string-##$##", "");

Try
^[a-zA-Z0-9]+$
www.regexr.com is a GREAT resource.

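What's wrong with [:alnum:]? It's a POSIX standard. So your whole regex would be: ^[[:alnum:]]+$. Note that POSIX character classes only work inside a bracket expression, and the .NET regex engine does not support them, so this applies to POSIX-style flavors only.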
The wikipedia article for regular expressions includes lots of examples and details.

Regex for a specific url pattern

In C#, how would I capture the integer value in the URL like:
/someBlah/a/3434/b/232/999.aspx
I need to get the 999 value from the above url.
The url HAS to have the /someBlah/ in it.
All other values like a/3434/b/232/ can be any character/number.
Do I have to escape the '/'?

Try the following
var match = Regex.Match(url, @"^http://.*someBlah.*/(\w+)\.aspx$");
if ( match.Success ) {
string name = match.Groups[1].Value;
}
You didn't specify what names could appear in front of the .aspx extension. I took the simple approach of using the \w character class, which matches letters, digits and underscores. You can modify it as necessary to include other characters.

You are effectively getting the file name without an extension.
Although you specifically asked for a regular expression, unless you are in a scenario where you really need to use one, I'd recommend that you use System.IO.Path.GetFileNameWithoutExtension:
Path.GetFileNameWithoutExtension(Context.Request.FilePath)

^(?:.+/)*(?:.+)?/someBlah/(?:.+/)*(.+)\.aspx$
This is a bit exhaustive, but it can handle scenarios where /someBlah/ does not have to be at the beginning of the string.
The ?: operator implies a non-capturing group, which may or may not be supported by your RegEx flavor.

Regex regex = new Regex("^http://.*someBlah.*/(\\d+)\\.aspx$");
Match match = regex.Match(url);
int result;
if (match.Success)
{
int.TryParse(match.Groups[1].Value, out result);
}
Using \d rather than \w ensures that you only match digits, and unless the ignore case flag is set the capitalisation of someBlah must be correct.
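The question's sample is a path without an http:// prefix, so a variant that drops that anchor may be closer to what was asked. The following is a sketch under that assumption; PageIdParser and TryGetPageId are hypothetical names. It also shows that '/' needs no escaping in a .NET pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageIdParser
{
    // Hypothetical helper: returns the numeric file name of an .aspx path that
    // contains /someBlah/, or null when the path does not fit that shape.
    public static int? TryGetPageId(string path)
    {
        // /someBlah/, then any number of "segment/" parts, then digits and ".aspx" at the end.
        var match = Regex.Match(path, @"/someBlah/(?:[^/]+/)*(\d+)\.aspx$");
        if (match.Success && int.TryParse(match.Groups[1].Value, out int id))
        {
            return id;
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(TryGetPageId("/someBlah/a/3434/b/232/999.aspx")); // prints 999
        Console.WriteLine(TryGetPageId("/other/a/999.aspx").HasValue);      // prints False
    }
}
```

Returning int? keeps the "no match" case explicit instead of leaving a default 0 in the result variable.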
