Regex to match only numbers, no apostrophes - C#

I want to match only the numbers in the following string:
String: "40’000"
Match: "40000"
Basically, I am trying to ignore the apostrophe.
I am using C#, in case it matters.
I can't use any other C# methods; I need to use only Regex.

Use Regex.Replace like this; it removes every character except digits:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d]", "");

Since you said you just want to pick up the numbers, how about doing it without regex?
var s = "40’000";
var result = new string(s.Where(char.IsDigit).ToArray());
Console.WriteLine(result); // 40000

I suggest using a regex to find the special characters rather than the digits, and then replacing them with an empty string.
A simple (?=\S)\D should be enough; the (?=\S) lookahead makes the pattern skip whitespace, such as whitespace at the end of the number.

Use Regex.Replace like this; it removes every character except digits and periods:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d.]", "");

Don't complicate your life; use Regex.Replace:
string s = "40'000";
string replaced = Regex.Replace(s, @"\D", "");
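One detail the answers above leave open: in .NET, \d also matches non-ASCII Unicode decimal digits unless RegexOptions.ECMAScript is set. If only 0-9 should survive, a minimal sketch could look like the following. The helper name KeepAsciiDigits is hypothetical, and the snippet assumes C# 9 top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answers above): keeps only ASCII digits 0-9.
// [^0-9] is used instead of \D because \d in .NET also matches
// non-ASCII Unicode decimal digits by default.
static string KeepAsciiDigits(string input) =>
    Regex.Replace(input, "[^0-9]", "");

Console.WriteLine(KeepAsciiDigits("40\u2019000")); // 40000 (typographic apostrophe)
Console.WriteLine(KeepAsciiDigits("40'000"));      // 40000 (ASCII apostrophe)
Console.WriteLine(KeepAsciiDigits("1 234.50"));    // 123450
```

If Unicode digits are acceptable in the result, the shorter \D pattern from the answers above does the same job.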

Related

Need Regex to match [#URL^Url Description^#]

I need a regex to find this text
[#URL^Url Description^#]
in a string and replace it with
Url Description
"Url Description" can be a set of characters in any language.
Are there any regex experts out there who can help me?
Thanks.
It might look a bit confusing, but you can use the following:
string str = @"[#URL^Url Description^#]";
var regex = new Regex(@"^[^^]+\^([^^]+)\^[^^]+$");
var result = regex.Replace(str, @"$1");
The first ^ anchors the match at the beginning of the string;
[^^]+ matches one or more characters that are not a caret;
\^ is a literal caret;
$ is the end of the string.
Basically, it captures all the characters between the carets (^) as group 1, so the replacement can place them between <a> tags.
You can also replace the last line with this:
var result = regex.Replace(str, "<a href=\"" + link + "\">$1</a>");
where link is the variable containing the link you want to insert.
Why don't you use String.Replace()? A regex would work, but it looks like the format is well defined and regexes are harder to read.
string url = "[#URL^blah^#]";
string url_html = url.Replace("[#URL^", "<a href=\"http://www.somewhere.net\">")
.Replace("^#]", "</a>");
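The regex answer above is anchored with ^ and $, so it only matches when the whole string is the token. For a token embedded in longer text, a minimal sketch with the brackets written out as literals might look like this. The helper name StripUrlTokens is hypothetical, and the snippet assumes C# 9 top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces every [#URL^...^#] token with its inner text.
// \[ \^ and \] are escaped literals; ([^^]+) captures the text between the
// carets, whatever language it is written in.
static string StripUrlTokens(string text) =>
    Regex.Replace(text, @"\[#URL\^([^^]+)\^#\]", "$1");

Console.WriteLine(StripUrlTokens("Visit [#URL^My Site^#] today."));
// Visit My Site today.
```

Because the capture group cannot cross a caret, several tokens in the same string are replaced independently.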

How can I use RegEx (Or Should I) to extract a string between the starting string '__' and ending with '__' or 'nothing'

RegEx has always confused me.
I have a string like this:
IDE\DiskDJ205GA20_____________________________A3VS____\5&1003ca0&0&0.0.0
Or Sometimes stored like this:
IDE\DiskSJ305GA23_____________________________PG33S\6&2003Sa0&0&0.0.0
I want to get the 'A3VS' or 'PG33S' string. It's my firmware, and it varies in length and type. I used to use:
string[] split = PNP.Split('\\'); // where PNP is my string name
var start = split[1].LastIndexOf('_');
string mystring = split[1].Substring(start + 1);
But that only works for strings that don't end with __ after the firmware string. I noticed that some have additional '_' characters after it.
Is RegEx the way to solve this, or is there a better way?
Without RegEx, it can be expressed like this:
var firmware = PNP.Split(new[] {'_'}, StringSplitOptions.RemoveEmptyEntries)[1].Split('\\')[0];
Or, reusing split from the question, trim the trailing underscores first:
string s = split[1].TrimEnd('_');
string mystring = s.Substring(s.LastIndexOf('_') + 1);
If you want the RegEx way to do it, here it is:
Regex regex = new Regex(@"\\.*_+(?<firmware>[A-Za-z0-9]+)_*\\");
var m1 = regex.Match(@"IDE\DiskSJ305GA23_____________________________PG33S\6&2003Sa0&0&0.0.0");
var g1 = m1.Groups["firmware"].Value;
// g1 == "PG33S"
Keep in mind you have to use [A-Za-z0-9] instead of \w in the capture subexpression since \w also matches an underscore (_).
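To see that the pattern above copes with both layouts from the question, it can be wrapped in a small helper and run against each sample. This is only a sketch: the name ExtractFirmware is hypothetical, the helper returns an empty string when nothing matches, and the snippet assumes C# 9 top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper built on the pattern from the answer above.
// An unmatched group has an empty Value, so "" means "no firmware found".
static string ExtractFirmware(string pnpId) =>
    Regex.Match(pnpId, @"\\.*_+(?<firmware>[A-Za-z0-9]+)_*\\")
         .Groups["firmware"].Value;

// Verbatim strings (@"...") keep the backslashes in the IDs literal.
Console.WriteLine(ExtractFirmware(
    @"IDE\DiskDJ205GA20_____________________________A3VS____\5&1003ca0&0&0.0.0")); // A3VS
Console.WriteLine(ExtractFirmware(
    @"IDE\DiskSJ305GA23_____________________________PG33S\6&2003Sa0&0&0.0.0"));    // PG33S
```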

C# Replace group of numbers in a string with a single character

Does anybody know how I can replace a group of numbers in a string by one *? For example, if I have a string like this "Test123456.txt", I want to convert it to "Test#.txt". I have seen plenty of examples that replace each individual number with a new character, but none that deal with a group of numbers. Any help is much appreciated!
Regex r = new Regex(@"\d+", RegexOptions.None);
Console.WriteLine(r.Replace("Test123456.txt", "#"));
Console.Read();
Use Regex.Replace() as follows:
string fileName = "Test12345.txt";
string newFileName = Regex.Replace(fileName, @"[\d]+", "#");
You can use a regex to do this, but if you know the exact text, then using the string.Replace method would be more efficient:
string str = "blahblahblahTest123456.txt";
str = str.Replace("Test123456.txt", "Test#.txt");
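The difference the question is asking about comes down to the + quantifier: \d replaces each digit separately, while \d+ treats a whole run of digits as one match. A minimal sketch follows; the helper name CollapseDigitRuns is hypothetical, and the snippet assumes C# 9 top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces each run of consecutive digits with one marker.
// The marker is used as a Regex replacement string, so avoid '$' in it.
static string CollapseDigitRuns(string input, string marker) =>
    Regex.Replace(input, @"\d+", marker);

string fileName = "Test123456.txt";
Console.WriteLine(Regex.Replace(fileName, @"\d", "#"));  // Test######.txt (one # per digit)
Console.WriteLine(CollapseDigitRuns(fileName, "#"));     // Test#.txt (one # per run)
Console.WriteLine(CollapseDigitRuns("a1b22c333", "#"));  // a#b#c#
```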

Remove alphabets from a string

I want to remove alphabets from a string. What is the best way to do it? To be more precise, I have the MAC address of a system, and I want to extract only the numbers from it. I have found this article on Stack Overflow.
I want to know, if using the regex is the best way or there are other ways to do it (maybe using LINQ).
To get the digits, you can use this regex:
var digits = Regex.Replace(text, @"\D", "");
\D matches anything that is not a digit, so removing those will give you the remaining digits.
The LINQ approach would be as follows:
string input = "12-34-56-78-9A-BC";
string result = new String(input.Where(Char.IsDigit).ToArray());
Non-LINQ / 2.0 approach:
string result = new String(Array.FindAll(input.ToCharArray(),
delegate(char c) { return Char.IsDigit(c); }));
This will replace anything that's not a number and leave you with just numbers:
string text = "abc123abc:13sdf2";
string numbers = Regex.Replace(text, @"[^\d]+", "");
Console.WriteLine(numbers);

How to remove non-ASCII word from a string in C#

I want to filter some string which has some wrong letters (non-ASCII). It looks different in Notepad, Visual Studio 2010 and MySQL.
How can I check if a string has non-ASCII letters and how I can remove them?
You could use a regular expression to filter non ASCII characters:
string input = "AB £ CD";
string result = Regex.Replace(input, "[^\x0d\x0a\x20-\x7e\t]", "");
You could use Regular Expressions.
Regex.Replace(input, "[^a-zA-Z0-9]+", "")
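You could also use \W+ as the pattern to remove any non-word character. Note that in .NET \w is Unicode-aware by default, so \W+ leaves non-ASCII letters and digits in place.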
This has been a godsend:
Regex.Replace(input, @"[^\u0000-\u007F]", "");
I think I originally got it elsewhere, but here is a link to the same answer:
How can you strip non-ASCII characters from a string? (in C#)
string testString = Regex.Replace(OldString, @"[\u0000-\u0008\u000A-\u001F\u0100-\uFFFF]", "");
First, you need to determine what you mean by a "word". If you mean non-ASCII, does this imply non-English?
Personally, I'd ask why you need to do this and what fundamental assumption has your application got that conflicts with your data? Depending on the situation, I suggest you either re-encode the text from the source encoding, although this will be a lossy conversion, or alternatively, address that fundamental assumption so that your application handles data correctly.
I think something as simple as this would probably work, wouldn't it?
public static string AsciiOnly(this string input, bool includeExtendedAscii)
{
    int upperLimit = includeExtendedAscii ? 255 : 127;
    char[] asciiChars = input.Where(c => (int)c <= upperLimit).ToArray();
    return new string(asciiChars);
}
Example usage:
string input = "AB£ȼCD";
string asciiOnly = input.AsciiOnly(false); // returns "ABCD"
string extendedAsciiOnly = input.AsciiOnly(true); // returns "AB£CD"
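The answers above cover removal, but the question also asks how to check whether a string contains non-ASCII characters at all. A minimal sketch of two equivalent checks follows. The helper names HasNonAscii and HasNonAsciiRegex are hypothetical, and the snippet assumes C# 9 top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: true if any UTF-16 code unit lies outside 0-127.
static bool HasNonAscii(string s) => s.Any(c => c > 127);

// Hypothetical helper: the same test, using the character class
// from the regex answers above.
static bool HasNonAsciiRegex(string s) => Regex.IsMatch(s, @"[^\u0000-\u007F]");

Console.WriteLine(HasNonAscii("AB £ CD"));      // True
Console.WriteLine(HasNonAscii("ABCD"));         // False
Console.WriteLine(HasNonAsciiRegex("AB £ CD")); // True
```

Checking first lets you log or reject the offending input rather than silently stripping characters from it.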
