C# RegEx Keep Line Breaks

C# RegEx Keep Line Breaks - c#

I need to remove all special characters except line breaks. Does anyone know of a RegEx that would accomplish this task? Here is my RegEx:
string b = "ABC\r\nVVV";
string a = Regex.Replace(b, "[^\\x20-\\x7E]", "");

You match any char other than a char from space to tilde with "[^\\x20-\\x7E]". So, it matches CR and LF symbols. To avoid matching those chars, add them to the character class, and it is best to add + after the ] to match 1 or more occurrences to remove whole sequences at once:
string a = Regex.Replace(b, "[^\\x20-\\x7E\r\n]+", "");
See the regex demo at RegexStorm.

Related

How to split Alphanumeric with Symbol in C#

I want to spilt Alphanumeric with two part Alpha and numeric with special character like -
string mystring = "1- Any Thing"
I want to store like:
numberPart = 1
alphaPart = Any Thing
For this i am using Regex
Regex re = new Regex(#"([a-zA-Z]+)(\d+)");
Match result = re.Match("1- Any Thing");
string alphaPart = result.Groups[1].Value;
string numberPart = result.Groups[2].Value;
If there is no space in between string its working fine but space and symbol both alphaPart and numberPart showing null where i am doing wrong Might be Regex expression is wrong for this type of filter please suggest me on same

Try this:
(\d+)(?:[^\w]+)?([a-zA-Z\s]+)
Demo
Explanation:
(\d+) - capture one or more digit
[^\w]+ match anything except alphabets
? this tell that anything between word and number can appear or not(when not space is between them)
[a-zA-Z\s]+ match alphabets(even if between them have spaces)

Start of string is matched with ^.
Digits are matched with \d+.
Any non-alphanumeric characters are matched with [\W_] or \W.
Anything is matched with .*.
Use
(?s)^(\d+)\W*(.*)
See proof
(?s) makes . match linebreaks. So, it literally matches everything.

Regex replace special character

I need help in my regex.
I need to remove the special character found in the start of text
for example I have a text like this
.just a $#text this should not be incl#uded
The output should be like this
just a text this should not be incl#uded
I've been testing my regex here but i can't make it work
([\!-\/\;-\#]+)[\w\d]+
How do I limit the regex to check only the text that starts in special characters?
Thank you

Use \B[!-/;-#]+\s*\b:
var result = Regex.Replace(s, #"\B[!-/;-#]+\s*\b", "");
See the regex demo
Details
\B - the position other than a word boundary (there must be start of string or a non-word char immediately to the left of the current position)
[!-/;-#]+ - 1 or more ASCII punctuation
\s* - 0+ whitespace chars
\b - a word boundary, there must be a letter/digit/underscore immediately to the right of the current location.
If you plan to remove all punctuation and symbols, use
var result = Regex.Replace(s, #"\B[\p{P}\p{S}]+\s*\b", "");
See another regex demo.
Note that \p{P} matches any punctuation symbols and \p{S} matches any symbols.

Use lookahead:
(^[.$#]+|(?<= )[.$#]+)
The ^[.$#]+ is used to match the special characters at the start of a line.
The (?<= )[.$#]+) is used to matching the special characters at the start of a word which is in the sentence.
Add your special characters in the character group [] as you need.

Following are two possible options from your question details. Hope it will help you.
string input = ".just a $#text this should not be incl#uded";
//REMOVING ALL THE SPECIAL CHARACTERS FROM THE WHOLE STRING
string output1 = Regex.Replace(input, #"[^0-9a-zA-Z\ ]+", "");
// REMOVE LEADING SPECIAL CHARACTERS FROM EACH WORD IN THE STRING. WILL KEEP OTHER SPECIAL CHARACTERS
var split = input.Split();
string output2 = string.Join(" ", split.Select(s=> Regex.Replace(s, #"^[^0-9a-zA-Z]+", "")).ToArray());

Negative lookahead is fine here :
(?![\.\$#].*)[\S]+
https://regex101.com/r/i0aacp/11/
[\S] match any character
(?![\.\$#].*) negative lookahead means those characters [\S]+ should not start with any of \.\$#

Make Regex Match word containing spetial characters

My Code is like this:
string currentPageSlug = "securities/EBR#03L$ZZZ";
string patern= #"securities/(\w+)[\#\$]";
string res = Regex.Match(currentPageSlug, patern).Value;
Console.WriteLine(res);
which gives me this result:
securities/EBR#
but I want to get:
securities/EBR#03L$ZZZ
whole word including all special characters (# and $ and maybe others too)
my regex pattern does not seem to work.

Your regex matches words followed by a single special character. You need to include [#$] in the repeating construct +, like this:
string patern= #"securities/((?:\w|[#$])+)";
Note that since # and $ are used inside a character class, it is not necessary to escape them with a backslash \.

C# - Removing single word in string after certain character

I have string that I would like to remove any word following a "\", whether in the middle or at the end, such as:
testing a\determiner checking test one\pronoun
desired result:
testing a checking test one
I have tried a simple regex that removes anything between the backslash and whitespace, but it gives the following result:
string input = "testing a\determiner checking test one\pronoun";
Regex regex = new Regex(#"\\.*\s");
string output = regex.Replace(input, " ");
Result:
testing a one\pronoun
It looks like this regex matches from the backslash until the last whitespace in the string. I cannot seem to figure out how to match from the backlash to the next whitespace. Also, I am not guaranteed a whitespace at the end, so I would need to handle that. I could continue processing the string and remove any text after the backslash, but I was hoping I could handle both cases with one step.
Any advice would be appreciated.

Change .* which match any characters, to \w*, which only match word characters.
Regex regex = new Regex(#"\\\w*");
string output = regex.Replace(input, "");

".*" matches zero or more characters of any kind. Consider using "\w+" instead, which matches one or more "word" characters (not including whitespace).
Using "+" instead of "*" would allow a backslash followed by a non-"word" character to remain unmatched. For example, no matches would be found in the sentence "Sometimes I experience \ an uncontrollable compulsion \ to intersperse backslash \ characters throughout my sentences!"

With your current pattern, .* tells the parser to be "greedy," that is, to take as much of the string as possible until it hits a space. Adding a ? right after that * tells it instead to make the capture as small as possible--to stop as soon as it hits the first space.
Next, you want to end at not just a space, but at either a space or the end of the string. The $ symbol captures the end of the string, and | means or. Group those together using parentheses and your group collectively tells the parser to stop at either a space or the end of the string. Your code will look like this:
string input = #"testing a\determiner checking test one\pronoun";
Regex regex = new Regex(#"\\.*?(\s|$)");
string output = regex.Replace(input, " ");

Try this regex (\\[^\s]*)
(\\[^\s]*)
1st Capturing group (\\[^\s]*)
\\ matches the character \ literally
[^\s]* match a single character not present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\s match any white space character [\r\n\t\f ].

Remove punctuation from string with Regex

I'm really bad with Regex but I want to remove all these .,;:'"$##!?/*&^-+ out of a string
string x = "This is a test string, with lots of: punctuations; in it?!.";
How can I do that ?

First, please read here for information on regular expressions. It's worth learning.
You can use this:
Regex.Replace("This is a test string, with lots of: punctuations; in it?!.", #"[^\w\s]", "");
Which means:
[ #Character block start.
^ #Not these characters (letters, numbers).
\w #Word characters.
\s #Space characters.
] #Character block end.
In the end it reads "replace any character that is not a word character or a space character with nothing."

This code shows the full RegEx replace process and gives a sample Regex that only keeps letters, numbers, and spaces in a string - replacing ALL other characters with an empty string:
//Regex to remove all non-alphanumeric characters
System.Text.RegularExpressions.Regex TitleRegex = new
System.Text.RegularExpressions.Regex("[^a-z0-9 ]+",
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
string ParsedString = TitleRegex.Replace(stringToParse, String.Empty);
return ParsedString;
And I've also stored the code here for future use:
http://code.justingengo.com/post/Use%20a%20Regular%20Expression%20to%20Remove%20all%20Punctuation%20from%20a%20String
Sincerely,
S. Justin Gengo
http://www.justingengo.com

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# RegEx Keep Line Breaks - c#

I need to remove all special characters except line breaks. Does anyone know of a RegEx that would accomplish this task? Here is my RegEx: string b = "ABC\r\nVVV"; string a = Regex.Replace(b, "[^\\x20-\\x7E]", "");

Related

How to split Alphanumeric with Symbol in C#

Regex replace special character

Make Regex Match word containing spetial characters

C# - Removing single word in string after certain character

Remove punctuation from string with Regex

Categories

Resources