c# String - Split Pascal Case

I've been trying to get a C# regex command to turn something like
EYDLessThan5Days
into
EYD Less Than 5 Days
Any ideas?
The code I used:
public static string SplitPascalCase(this string value) {
Regex NameExpression = new Regex("([A-Z]+(?=$|[A-Z][a-z])|[A-Z]?[a-z0-9]+)",
RegexOptions.Compiled);
return NameExpression.Replace(value, " $1").Trim();
}
Output:
EYD Less Than5 Days
But this still gives me the wrong result.
I already asked about this for JavaScript, but when I implemented the same logic in C#, it failed.
Please help me. Thanks.

Use lookarounds in your regex so that it matches zero-width boundaries and does not consume any characters; this lets adjacent boundaries be matched independently.
(?<=[A-Za-z])(?=[A-Z][a-z])|(?<=[a-z0-9])(?=[0-9]?[A-Z])
Replace the matched boundaries with a space.
Regex.Replace(yourString, @"(?<=[A-Za-z])(?=[A-Z][a-z])|(?<=[a-z0-9])(?=[0-9]?[A-Z])", " ");
Explanation:
(?<=[A-Za-z])(?=[A-Z][a-z]) matches the boundary between any letter and an uppercase letter that is immediately followed by a lowercase letter. For example, in the string ABc it matches the boundary between A and Bc, and in aBc it matches the boundary between a and Bc.
| is the alternation (logical OR) operator.
(?<=[a-z0-9])(?=[0-9]?[A-Z]) matches the boundary between a lowercase letter or digit and an optional digit that is immediately followed by an uppercase letter. For example, in the string a9A it matches the boundary between a and 9A and also the boundary between 9 and A, because [0-9] is optional in the positive lookahead.
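To see the boundaries in action, here is a minimal sketch that wraps the pattern above in a helper. The name SplitPascalCase is borrowed from the question; the top-level-statement layout is just an assumption to keep the sample short.

```csharp
using System;
using System.Text.RegularExpressions;

// Insert a space at every zero-width boundary matched by the lookaround pattern.
static string SplitPascalCase(string value)
{
    const string boundary =
        @"(?<=[A-Za-z])(?=[A-Z][a-z])|(?<=[a-z0-9])(?=[0-9]?[A-Z])";
    return Regex.Replace(value, boundary, " ");
}

Console.WriteLine(SplitPascalCase("EYDLessThan5Days")); // EYD Less Than 5 Days
Console.WriteLine(SplitPascalCase("a9A"));              // a 9 A
```

Because nothing is consumed, no character of the input is ever lost; only spaces are added.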

You can just match the pieces and join them:
var arr = Regex.Matches(str, @"[A-Z]+(?=[A-Z][a-z]+)|\d|[A-Z][a-z]+").Cast<Match>().Select(m => m.Value).ToArray();
Console.WriteLine(String.Join(" ", arr));
The regex isn't complex at all: it matches each run of capitals that precedes a capitalized word, each single digit, and each capitalized word, and the matches are then joined with " ".

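Something like this should do:
string pattern = @"(?<=\d)(?=[a-zA-Z])|(?<=[a-zA-Z])(?=\d)|(?=[A-Z][a-z])|(?<=[a-z])(?=[A-Z])";
string result = Regex.Replace(input, pattern, " ");
Note that the third alternative, (?=[A-Z][a-z]), has no lookbehind, so it also matches at the very start of an input such as LessThan; call Trim() on the result if that matters.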

Related

How to split Alphanumeric with Symbol in C#

I want to split an alphanumeric string into two parts, numeric and alphabetic, separated by a special character such as -:
string mystring = "1- Any Thing";
I want to store it like this:
numberPart = 1
alphaPart = Any Thing
For this I am using Regex:
Regex re = new Regex(@"([a-zA-Z]+)(\d+)");
Match result = re.Match("1- Any Thing");
string alphaPart = result.Groups[1].Value;
string numberPart = result.Groups[2].Value;
If there is no space in the string it works fine, but with a space and a symbol both alphaPart and numberPart come back empty. What am I doing wrong? Maybe the regex is wrong for this kind of filter; please advise.

Try this:
(\d+)(?:[^\w]+)?([a-zA-Z\s]+)
Explanation:
(\d+) captures one or more digits.
[^\w]+ matches one or more non-word characters (anything other than letters, digits, and underscore).
? makes that separator optional, so the pattern also works when nothing sits between the number and the word.
[a-zA-Z\s]+ matches letters, including any whitespace between them.

Start of string is matched with ^.
Digits are matched with \d+.
Any non-alphanumeric characters are matched with [\W_] or \W.
Anything is matched with .*.
Use
(?s)^(\d+)\W*(.*)
(?s) makes . match line breaks as well, so .* literally matches everything that remains.
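As a rough sketch of how that pattern could be applied to the string from the question: the helper name TrySplitLeadingNumber and its out parameters are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

// Splits "1- Any Thing" style input into its leading number and the remaining text.
// Returns false when the input does not start with a digit.
static bool TrySplitLeadingNumber(string input, out string numberPart, out string alphaPart)
{
    Match m = Regex.Match(input, @"(?s)^(\d+)\W*(.*)");
    numberPart = m.Success ? m.Groups[1].Value : "";
    alphaPart = m.Success ? m.Groups[2].Value : "";
    return m.Success;
}

if (TrySplitLeadingNumber("1- Any Thing", out string number, out string alpha))
{
    Console.WriteLine(number); // 1
    Console.WriteLine(alpha);  // Any Thing
}
```

Checking Success before reading the groups avoids silently working with empty values when the input has no leading number.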

Lookbehind with equal sign

I want to match
===Something===
but not
====Something====
I've come up with the following regular expression
Regex.Match("====Something====", @"^\s*===\s*(?<!=====\s*)(?<Title>.*?)\s*===\s*$").Groups["Title"]
but it returns
=Something=
Please help: what is the issue with the lookbehind pattern?

Match the whole token. Read aloud, the expression below says: find three = signs that are not preceded by another =, then one or more word characters, then three = signs that are not followed by another =.
Hence if there are four equals signs at the start, it won't match.
Note that .NET has no \< and \> word anchors, and a bare < or > matches a literal angle bracket, so a pattern such as <={3}(\w+)={3}> would only match text wrapped in angle brackets. Negative lookarounds are used to delimit the token instead.
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string textToSearchThrough = "===Something===";
        string textToSearchThrough2 = "====Something====";
        // Add \s* inside the delimiters if you wish to allow padding.
        string regexExpression = @"(?<!=)={3}(\w+)={3}(?!=)";
        Regex r = new Regex(regexExpression);
        // Switch to textToSearchThrough2 to check the four-sign case.
        Match m = r.Match(textToSearchThrough);
        Console.WriteLine(m.Success.ToString());
        Console.ReadLine();
    }
}

One more possible solution:
(?<!=)===(?!=)(?<Title>.*?)(?<!=)===(?!=)
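A quick way to check that pattern against both inputs from the question might look like the sketch below. TryGetTitle is a hypothetical helper name used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Extracts the title only when it is wrapped in exactly three '=' on each side.
static bool TryGetTitle(string line, out string title)
{
    Match m = Regex.Match(line, @"(?<!=)===(?!=)(?<Title>.*?)(?<!=)===(?!=)");
    title = m.Success ? m.Groups["Title"].Value : "";
    return m.Success;
}

Console.WriteLine(TryGetTitle("===Something===", out string t1));   // True
Console.WriteLine(t1);                                              // Something
Console.WriteLine(TryGetTitle("====Something====", out string t2)); // False
```

The lookarounds on each side of === reject a fourth = without consuming it, which is why the four-sign input produces no match at all rather than a shortened one.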

Your regex goes wrong because it uses .*?, which can also match =. It finds ===, then accepts anything (including further = characters), and then looks for a match that ends with === again. So it also matches the === inside a ========= string, which is not what you are looking for. If you change . (any character) to \w (a word character), it matches only ===Something===, even without the lookbehind. It is also better to use \w+ rather than \w* so that a bare ====== with no word in between does not match (if you don't want it to):
^\s*===\s*(?<Title>\w+?)\s*===\s*$

Regex - How to replace a certain word given a few starting letters

I have the following string: "ct lungs, mediastinum". I want to do a Regex.Replace so that any word starting with the letters "media" is converted to "chest".
So, the following strings should all be converted to "ct chest no contrast":
"ct media no contrast"
"ct medias no contrast"
"ct mediastin no contrast"
etc.
I wrote
Regex.Replace(myString, @"\bmedia.*\b", " chest ")
but this takes "media" and everything after it and changes it all to "chest". So, if I use the above on the given example, the words "no contrast" are lost. What can I do to replace only the word starting with "media" with "chest" and leave everything after it in the string as it is?
Thanks a lot!

The .* is greedy, meaning it will try to match as many characters as possible. You can make it match as few as possible by using .*? instead. The replacement should also drop its surrounding spaces, otherwise the result contains double spaces.
Regex.Replace(myString, @"\bmedia.*?\b", "chest")

string text = Regex.Replace(inputString, @"media\w*", "chest", RegexOptions.None);
This means: replace media followed by zero or more word characters with chest.
You may want to use:
\bmedia\w*
\b means word boundary, so the replacement only happens if the word starts with media.
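A small sketch of the \bmedia\w* variant, run over the sample strings from the question. The helper name ReplaceMediaWords is invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Replaces every word that starts with "media" by "chest"; the rest of the text is untouched.
static string ReplaceMediaWords(string text) =>
    Regex.Replace(text, @"\bmedia\w*", "chest");

Console.WriteLine(ReplaceMediaWords("ct media no contrast"));     // ct chest no contrast
Console.WriteLine(ReplaceMediaWords("ct mediastin no contrast")); // ct chest no contrast
Console.WriteLine(ReplaceMediaWords("ct lungs, mediastinum"));    // ct lungs, chest
Console.WriteLine(ReplaceMediaWords("multimedia"));               // multimedia
```

Since \w* stops at the first non-word character, the words after the match survive, and the leading \b keeps words like multimedia from being touched.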

In regex, \S* means zero or more non-whitespace characters. So try this one:
Regex.Replace(myString, @"\bmedia\S*", "chest")
You can change \S* to [a-zA-Z]* if you want to allow only letters.

Make your * quantifier non-greedy by following it with ?. This means it will stop consuming at the first word boundary it finds, not the last one (the end of the string).

Without Regex, maybe something like this? (It needs using System.Linq.)
var words = myString.Split(' ').Select(w => w.StartsWith("media") ? "chest" : w);
myString = String.Join(" ", words);

Replace with wildcards

I need some advice. Suppose I have the following string: Read Variable
I want to find all pieces of text like this in a string and turn each of them into the following: Variable = MessageBox.Show. Some additional examples:
"Read Dog" --> "Dog = MessageBox.Show"
"Read Cat" --> "Cat = MessageBox.Show"
Can you help me? I need quick advice on using RegEx in C#. I think it is a job involving wildcards, but I do not know how to use them very well... Also, I need this for a school project tomorrow... Thanks!
Edit: This is what I have done so far and it does not work: Regex.Replace(String, "Read ", " = Messagebox.Show").

You can do this:
string ns = Regex.Replace(yourString, @"Read\s+(.*?)(?:\s|$)", "$1 = MessageBox.Show");
\s+ matches one or more whitespace characters.
(.*?)(?:\s|$) matches as few characters as possible up to the first whitespace character (\s) or the end of the string ($).
$1 refers to the first captured group, i.e. (.*?).

You might want to clarify your question... but here goes:
If you want to match the next word after "Read " in regex, use Read (\w*) where \w is the word character class and * is the greedy match operator.
If you want to match everything after "Read " in regex, use Read (.*)$ where . will match all characters and $ means end of line.
With either regex, you can use a replacement of $1 = MessageBox.Show, as $1 references the first matched group (the one denoted by the parentheses).
Complete code:
replacedString = Regex.Replace(inStr, @"Read (.*)$", "$1 = MessageBox.Show");

The problem with your attempt is that it cannot know that the replacement string should be inserted after your variable. Let's assume that valid variable names contain letters, digits and underscores (which can be conveniently matched with \w). That means any other character ends the variable name. Then you can match the variable name, capture it (using parentheses) and put it in the replacement string with $1:
output = Regex.Replace(input, @"Read\s+(\w+)", "$1 = MessageBox.Show");
Note that \s+ matches one or more arbitrary whitespace characters. \w+ matches one or more letters, digits and underscores. If you want to restrict variable names to letters only, this is the place to change it:
output = Regex.Replace(input, @"Read\s+([a-zA-Z]+)", "$1 = MessageBox.Show");
Finally, note that in C# it is advisable to write regular expressions as verbatim strings (@"..."). Otherwise, you will have to double escape everything so that the backslashes get through to the regex engine, and that really lessens the readability of the regex.
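Put together, the capture-and-substitute idea could be sketched like this. RewriteReads is a hypothetical helper name, and the two-statement input is only an illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Rewrites every "Read <name>" into "<name> = MessageBox.Show" using capture group 1.
static string RewriteReads(string source) =>
    Regex.Replace(source, @"Read\s+(\w+)", "$1 = MessageBox.Show");

Console.WriteLine(RewriteReads("Read Dog"));           // Dog = MessageBox.Show
Console.WriteLine(RewriteReads("Read Dog; Read Cat")); // Dog = MessageBox.Show; Cat = MessageBox.Show
```

Regex.Replace handles every occurrence in one call, so no loop is needed when the input contains several Read statements.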

Using Regular Expression Match a String that contains numbers letters and dashes

I need to match this string 011Q-0SH3-936729 but not 345376346 or asfsdfgsfsdf
It has to contain characters AND numbers AND dashes
The pattern could be 011Q-0SH3-936729 or 011Q-0SH3-936729-SDF3 or 000-222-AAAA or 011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729, and I want to be able to match any one of those. The reason is that I don't really know whether the format is fixed, and I have no way of finding out either, so I need a generic solution for a pattern with any number of dashes that recurs any number of times.
Sorry this is probably a stupid question, but I really suck at Regular expressions.
TIA

foundMatch = Regex.IsMatch(subjectString,
    @"^                # Start of the string
    (?=.*\p{L})        # Assert that there is at least one letter
    (?=.*\p{N})        # and at least one digit
    (?=.*-)            # and at least one dash.
    [\p{L}\p{N}-]*     # Match a string of letters, digits and dashes
    $                  # until the end of the string.",
    RegexOptions.IgnorePatternWhitespace);
should do what you want. If by letters/digits you meant "only ASCII letters/digits" (and not international/Unicode letters, too), then use
foundMatch = Regex.IsMatch(subjectString,
    @"^                # Start of the string
    (?=.*[A-Z])        # Assert that there is at least one letter
    (?=.*[0-9])        # and at least one digit
    (?=.*-)            # and at least one dash.
    [A-Z0-9-]*         # Match a string of letters, digits and dashes
    $                  # until the end of the string.",
    RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);
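For a quick sanity check of the ASCII variant, a compact sketch might look like this. The pattern is the same as above but written on a single line without comments, and the helper name LooksLikeKey is invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// True only when the whole string consists of ASCII letters, digits and dashes
// and contains at least one of each.
static bool LooksLikeKey(string candidate) =>
    Regex.IsMatch(candidate,
        @"^(?=.*[A-Z])(?=.*[0-9])(?=.*-)[A-Z0-9-]*$",
        RegexOptions.IgnoreCase);

Console.WriteLine(LooksLikeKey("011Q-0SH3-936729")); // True
Console.WriteLine(LooksLikeKey("000-222-AAAA"));     // True
Console.WriteLine(LooksLikeKey("345376346"));        // False
Console.WriteLine(LooksLikeKey("asfsdfgsfsdf"));     // False
```

Each lookahead scans from the start of the string without consuming anything, so the three requirements are checked independently before the character class validates the string as a whole.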

EDIT:
This will match any of the keys provided in your comments:
^[0-9A-Z]+(-[0-9A-Z]+)+$
This means the key starts with a digit or letter and contains at least one dash.

Without more info about the regularity of the dashes or otherwise, this is the best we can do:
Regex.IsMatch(input, @"[A-Z0-9\-]+\-[A-Z0-9]")
Although this will also match -A-0.

Most naive implementation EVER (might get you started):
([0-9]|[A-Z])+(-)([0-9]|[A-Z])+(-)([0-9]|[A-Z])+
Tested with Regex Coach.
EDIT:
That matches only three groups; here is another, slightly better one:
([0-9A-Z]+\-)+([0-9A-Z]+)

Are you applying the regex to a whole string (i.e., validating or filtering)? If so, Tim's answer should put you right. But if you're plucking matches from a larger string, it gets a bit more complicated. Here's how I would do that:
string input = @"Pattern could be 011Q-0SH3-936729 or 011Q-0SH3-936729-SDF3 or 000-222-AAAA or 011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729 but not 345-3763-46 or ASFS-DFGS-FSDF or ASD123FGH987.";
Regex pluckingRegex = new Regex(
    @"(?<!\S)          # start of 'word'
    (?=\S*\p{L})       # contains a letter
    (?=\S*\p{N})       # contains a digit
    (?=\S*-)           # contains a hyphen
    [\p{L}\p{N}-]+     # gobble up letters, digits and hyphens only
    (?!\S)             # end of 'word'
    ", RegexOptions.IgnorePatternWhitespace);
foreach (Match m in pluckingRegex.Matches(input))
{
Console.WriteLine(m.Value);
}
Output:
011Q-0SH3-936729
011Q-0SH3-936729-SDF3
000-222-AAAA
011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729
The negative lookarounds serve as 'word' boundaries: they ensure the matched substring starts either at the beginning of the string or after a whitespace character ((?<!\S)), and ends either at the end of the string or before a whitespace character ((?!\S)).
The three positive lookaheads work just like Tim's, except they use \S* to skip whatever precedes the first letter/digit/hyphen. We can't use .* in this case because that would allow it to skip to the next word, or the next, etc., defeating the purpose of the lookahead.
