Why is regex not matching on unicode character - c#

So I am trying to write a regex in c# (.NET) to match on a range of unicode characters that could potentially be found in a string. As a simple test, I attempted to match on a single unicode character, \u8221, which is the character ”. If I use the regex string "”", I get a match against my test string that contains this character. If, however, I change my regex to "\u8221", I don't get a match. Anyone know why this could be and how to get it to work? I have been pulling my hair out over this one. Thanks in advance.

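You are not matching the correct character. \u requires a four-digit character code in hexadecimal, and 8221 is the decimal code of ”. Read as hexadecimal, \u8221 refers to a different character (U+8221). Decimal 8221 is hexadecimal 201D, so try \u201D instead.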
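A minimal C# sketch of the difference between the two escapes. The sample string and the names UnicodeEscapeDemo, Sample, MatchesHex and MatchesDecimalMistake are made up for illustration and are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeEscapeDemo
{
    // Hypothetical sample text ending in a right double quotation mark (U+201D).
    public const string Sample = "She said \u201Chello\u201D";

    // \u takes exactly four hexadecimal digits, so this pattern targets U+201D.
    public static bool MatchesHex(string input) =>
        Regex.IsMatch(input, @"\u201D");

    // 8221 is the decimal value of U+201D; read as hex, \u8221 is another character.
    public static bool MatchesDecimalMistake(string input) =>
        Regex.IsMatch(input, @"\u8221");
}
```

With this sample, MatchesHex should report a match while MatchesDecimalMistake should not.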

Related

Regex not able to detect ,(comma)

I'm trying to validate a date string, and for that I have written the following regex:
[0-3]*\d{1}(st|nd|rd|th)?[\s\-\/]?(Jan|January|Feb|February|Mar|March|Apr|April|May|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December|[0-1]?\d{1})[\s\-\/,]?(\d{4}|\d{2})
The input string for my regex is "31st March, 2018".
I have already included the comma character in my regex (the [\s\-\/,] part), but the above input fails to validate.
Can anyone point out what correction is needed in the regex so that it accepts the comma in the string?
You're missing the space between the comma and the year. You should add \s? after the block that matches the comma.
[0-3]?\d{1}(st|nd|rd|th)?[\s\-\/]?(Jan|January|Feb|February|Mar|March|Apr|April|May|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December|[0-1]?\d{1})[\s\-\/,]?\s?(\d{4}|\d{2})
Also, you need not escape / inside [], or specify a {1} quantifier to match a single character. Keep the hyphen first or last inside [] so that it is read literally rather than as a range (written as [ -/], it would mean every character from space to /). You can simplify your regex to:
[0-3]?\d(st|nd|rd|th)?[\s/-]?(Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December|[0-1]?\d)[\s/,-]?\s?(\d{4}|\d{2})
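A short C# check of this kind of pattern against the sample input. This is a sketch under a few assumptions: the ^ and $ anchors are added so that the whole input must match, the hyphen is placed last in each character class so it is read literally, and DatePatternDemo and LooksLikeDate are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DatePatternDemo
{
    // The \s? after the separator class allows the space in "March, 2018".
    // Anchors (^ and $) are an assumption added for whole-string validation.
    private const string Pattern =
        @"^[0-3]?\d(st|nd|rd|th)?[\s/-]?" +
        @"(Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|" +
        @"Aug|August|Sep|September|Oct|October|Nov|November|Dec|December|[0-1]?\d)" +
        @"[\s/,-]?\s?(\d{4}|\d{2})$";

    public static bool LooksLikeDate(string input) =>
        Regex.IsMatch(input, Pattern);
}
```

Note that this only checks the shape of the text; it still accepts impossible dates such as "39th Feb 2018".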
PS: I don't really know much C#, so I can't speak to what others recommend about using DateTime, but it sounds like it may be worth checking out.

Regular Expression for Alphanumeric characters with one special char in between

Can anyone please help me figure out a Regex attribute for a string field?
I want my string to be in the format FirstName#LastName, and that's it. I require exactly one special character in between, and everything else must be alphabetic.
You can use the expression [A-Za-z]+#[A-Za-z]+ to test for a nonempty string of alphabetical characters, followed by a # sign, followed by another nonempty string of alphabetical characters.
If you want to accept any non-alphanumeric characters in the middle, like $,#,_,- etc, you can use the following,
[a-zA-Z]+[^a-zA-Z\d\s][a-zA-Z]+
it will match all these among others,
FirstName#LastName
FirstName-LastName
FirstName_LastName
FirstName$LastName
FirstName:LastName
If you want to match whitespace in between as well then simply remove \s from above expression.
Hope it helps.
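Both expressions can be tried with Regex.IsMatch. In this sketch the ^ and \z anchors are an addition, assumed here so that the whole string has to match rather than a substring; NamePatternDemo and its method names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamePatternDemo
{
    // Exactly one '#' between two non-empty runs of ASCII letters.
    public static bool IsHashSeparated(string s) =>
        Regex.IsMatch(s, @"^[A-Za-z]+#[A-Za-z]+\z");

    // One separator that is not an ASCII letter, a digit, or whitespace.
    public static bool IsSymbolSeparated(string s) =>
        Regex.IsMatch(s, @"^[a-zA-Z]+[^a-zA-Z\d\s][a-zA-Z]+\z");
}
```

For example, IsHashSeparated rejects "First#Last#Name" because only one separator is allowed, and IsSymbolSeparated rejects "FirstName LastName" because \s is excluded from the separator class.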

Regex in between characters

I'm trying to create a regex that will match the ASCII characters in a string so that they can be hex-converted afterwards. The string is received as follows: <<<441234567895,ASCII,4,54657379>>>, so I am looking to match everything between the third comma and the >>> characters at the end of the string, like so.
<<<441234567895,ASCII,4,54657379>>>
So far I have managed to create this regex (/([^,]*,[^,]*)*([^;]*)>>>/) for it but the third comma is picked up as well which I don't want. What do I need to do to remove it from the match?
thanks Callum
(?<=,)[^,]+(?=>>>)
This should do it. See demo.
https://regex101.com/r/sJ9gM7/79
Do you need to use Regex?
string input = "<<<441234567895,ASCII,4,54657379>>>";
string match = input.Substring(3, input.Length - 6).Split(',')[3];
You can also use further splits on the beginning and ending padding strings or check their lengths if you want something safer than the Substring magic.
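The two approaches above can be compared side by side. This is a sketch only: PayloadDemo and its method names are illustrative, and the HexToAscii helper assumes the payload is an even-length run of hex digit pairs with one ASCII byte per pair, which the question implies but does not state.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PayloadDemo
{
    // Lookbehind requires a preceding comma; lookahead requires the closing >>>.
    // Returns an empty string when nothing matches.
    public static string ExtractWithRegex(string input) =>
        Regex.Match(input, @"(?<=,)[^,]+(?=>>>)").Value;

    // Strip the 3-character <<< and >>> padding, then take the fourth field.
    public static string ExtractWithSplit(string input) =>
        input.Substring(3, input.Length - 6).Split(',')[3];

    // Assumption: an even number of hex digits, each pair encoding one ASCII byte.
    public static string HexToAscii(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return Encoding.ASCII.GetString(bytes);
    }
}
```

For the sample input, both extraction methods yield 54657379. The Split version throws on malformed input (too short, or fewer than four fields), so the regex version may be easier to guard.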

Regex Expression Only Numbers and Characters

I created the following regex for my C# file. Basically I want the user's input to contain only regular characters (A-Z, lower or upper case) and numbers (no spaces or symbols).
[a-zA-Z0-9]
For some reason it only fails when the input is a symbol on its own. If there are characters mixed in with it, the expression passes.
I can show you the code where I implement it, but I think the problem is my expression.
Thanks!
The problem is that it can match anywhere. You need anchors:
^[a-zA-Z0-9]+\z
^ matches the start of a string, and \z matches the end of a string.
(Note: in .NET regex, $ matches the end of a string with an optional newline.)
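A small C# sketch of how the three variants behave. AlphanumericDemo and its method names are illustrative; the patterns are the ones discussed in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlphanumericDemo
{
    // No anchors: true as soon as any one character is alphanumeric.
    public static bool Unanchored(string s) =>
        Regex.IsMatch(s, "[a-zA-Z0-9]");

    // $ also matches just before a final newline, so "abc\n" passes.
    public static bool DollarAnchored(string s) =>
        Regex.IsMatch(s, "^[a-zA-Z0-9]+$");

    // \z matches only at the very end of the string.
    public static bool Strict(string s) =>
        Regex.IsMatch(s, @"^[a-zA-Z0-9]+\z");
}
```

This is why "abc$" passes the unanchored pattern while "$" alone fails: the unanchored pattern only needs one alphanumeric character somewhere in the input.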
This is because the pattern matches as soon as any single character in the string is alphanumeric. You need the following, which forces it to match the entire string, not just part of it:
^[0-9a-zA-Z]*$
That regex will match every single alphanumeric character in the string as separate matches.
If you want to make sure the whole string the user entered only has alphanumeric characters you need to do something like:
^[a-zA-Z0-9]+$
Are you making sure to check the whole string? That is are you using an expression like
^[a-zA-Z0-9]*$
where ^ means the start of the string and $ means the end of the string?

Regex - Match a string only when it contains any alphabetic characters

example strings
785*()&!~`a
##$%$~2343
455frt&*&*
I want to capture the first and the third but not the second, since it doesn't contain any alphabetic character. Please help.
In fact, I think [a-zA-Z] might suffice to match your strings.
To capture the whole thing, try: ^.*[a-zA-Z].*$
Here is one possible way:
.*[a-zA-Z]+
You should maybe clarify a bit what you mean by 'capturing': do you want the whole string or just the ASCII bits?
Also, you don't say if it should match just plain Roman alphabet (A to Z) or if it should also match Unicode chars to match strings in other languages.
If you just need to test your string, in C# you would do:
bool matching = Regex.IsMatch(myString, "[a-zA-Z]");
You wouldn't need anything else, since just one letter anywhere in the myString string will match (according to your definition).
This is my favorite RegEx testing site: Javascript Regexp Tester and Cheat Sheet
If you want to match all letters (including non-ASCII ones), use \p{L} instead of [a-zA-Z]. See Unicode categories.
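A brief C# sketch contrasting the ASCII-only test with the Unicode letter category, run against the example strings from the question. LetterDemo and its method names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LetterDemo
{
    // True if the string contains at least one ASCII letter anywhere.
    public static bool HasAsciiLetter(string s) =>
        Regex.IsMatch(s, "[a-zA-Z]");

    // True if the string contains at least one letter from any script.
    public static bool HasAnyLetter(string s) =>
        Regex.IsMatch(s, @"\p{L}");
}
```

The first and third example strings contain a letter and pass both checks; the second does not. A string such as "123é" passes only the \p{L} check.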
