I am new to C#. Say that I have a string like this:
string test = "yes/, I~ know# there# are% invalid£ characters$ in& this* string^";
If I wanted to get rid of a single invalid symbol, I would do:
if (test.Contains('/'))
{
test = test.Replace("/","");
}
But is there a way I can use a list of symbols as argument of the Contains and Replace functions, instead of deleting symbols one by one?
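I would go with the regular expression solution:
test = Regex.Replace(test, @"\/|~|#|%|£|\$|&|\*|\^", "");
Add one alternative per character, separated by | (the regex "or" operator), and let Replace remove every match.
Bear in mind that \/ matches a literal /. .NET does not actually require / to be escaped, but characters with a special meaning in a regex, such as $, * and ^, do need the backslash.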
You'll likely be better off defining acceptable characters than trying to think of and code for everything you need to eliminate.
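As a rough illustration of the "define what is acceptable" idea, here is a minimal sketch. The class name TextCleaner, the method name KeepAllowed, and the particular allowed set (letters, digits, spaces and commas) are assumptions made for this example, not something from the question; adjust the predicate to whatever you consider valid.

```csharp
using System.Linq;

public static class TextCleaner
{
    // Hypothetical helper: keeps only characters we explicitly allow
    // (letters, digits, space and comma) and drops everything else.
    public static string KeepAllowed(string input) =>
        new string(input
            .Where(c => char.IsLetterOrDigit(c) || c == ' ' || c == ',')
            .ToArray());
}
```

Calling TextCleaner.KeepAllowed(test) on the string from the question should leave only the words, spaces and the comma, without having to list every symbol that might turn up.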
Since you mention that you are learning, this sounds like the perfect time to learn about regular expressions. Here are a couple of links to get you started:
Regular Expression Language - Quick Reference (MSDN)
C# Regex.Match Examples (DotNetPerls)
I don't think there is such a feature out of the box.
I think your idea is pretty much on point, although in my opinion you don't really need the if (test.Contains(..)) part. With it, the string is scanned once to see whether the character is present, and then scanned a second time to replace it if it is.
It would be faster just to replace the special characters right away. So...
List<string> specialChars = new List<string>() {"*", "/", "&"};
for (var i = 0; i < specialChars.Count; i++)
{
test = test.Replace(specialChars[i],"");
}
Your solution is:
Path.GetInvalidPathChars()
So the code would look something like this:
string illegal = "yes/, I~ know# there# are% invalid£ characters$ in& this* string^";
string invalid = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
foreach (char c in invalid)
{
illegal = illegal.Replace(c.ToString(), "");
}
Another variant:
List<string> chars = new List<string> {"!", "#"};
string test = "My funny! string#";
foreach (var c in chars)
{
test = test.Replace(c,"");
}
No need to use Contains as Replace does that.
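The loops above call Replace once per symbol, so the string is rebuilt for every entry in the list. A possible single-pass variant is sketched below; the names SymbolStripper and RemoveAll are made up for this example, and it assumes the unwanted symbols are single characters.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SymbolStripper
{
    // Hypothetical helper: copies every character that is NOT in the
    // unwanted set into a StringBuilder, walking the input only once.
    public static string RemoveAll(string input, IEnumerable<char> unwanted)
    {
        var blocked = new HashSet<char>(unwanted);
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (!blocked.Contains(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Because a string is itself a sequence of characters, the set can be passed as a plain string, for example SymbolStripper.RemoveAll(test, "!#").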
I am trying to learn some .NET 6 and C#, and I am struggling a lot with regular expressions, more specifically with Avalonia on Windows, if that is relevant.
I am trying to build a small app with 2 text boxes. I write text in one and get the text "filtered" in the other, using a value converter.
I would like to filter math expressions to try to solve them later on. Something simple, kind of a way of writing text math and getting results in real time.
I have been trying for several weeks to figure out this regular expression on my own, with no success whatsoever.
I would like to replace "_Expression{BLABLA}" in my string with "BLABLA". To test my expressions I have been checking them at http://regexstorm.net/ and https://regex101.com/, and according to them my matches should be correct (unless I misunderstood the results). But the results in my little app are extremely odd to me, and I finally decided to ask for help.
Here is my code:
private static string? FilterStr(object value)
{
if (value is string str)
{
string pattern = @"\b_Expression{(.+?)\w*}";
Regex rgx = new(pattern);
foreach (Match match in rgx.Matches(str))
{
string aux = "";
aux = match.Value;
aux = Regex.Replace(aux, @"_Expression{", "");
aux = Regex.Replace(aux, @"[\}]", "");
str = Regex.Replace(str, match.Value, aux);
}
return new string(str);
}
return null;
}
Then the results for some sample inputs are:
Input:
Some text
_Expression{x}
_Expression{1}
_Expression{4}
_Expression{4.5} _Expression{4+4}
_Expression{4-4} _Expression{4*x}
_Expression{x/x}
_Expression{x^4}
_Expression{sin(x)}
Output:
Some text
x
1{1}
1{4}
1{4.5} 1{4+4}
1{4-4} 1{4*x}
1{x/x}
1{x^4}
1{sin(x)}
or
Input:
Some text
_Expression{x}
_Expression{4}
_Expression{4.5} _Expression{4+4}
_Expression{4-4} _Expression{4*x}
_Expression{x/x}
_Expression{x^4}
_Expression{sin(x)}
Output:
Some text
x
_Expression{4}
4.5 _Expression{4+4}
4-4 _Expression{4*x}
x/x
_Expression{x^4}
_Expression{sin(x)}
This behaviour is very confusing to me. I can't see why "(.+?)" works with some of them and not with others... Or maybe I haven't defined something properly, or my Replace is wrong? I can't see it...
Thanks a lot for the time! :)
There are some missing parts in your regular expression, for example it doesn't have the curly braces { and } escaped, since curly braces have a special meaning in a regular expression; they are used as quantifiers.
Use the one below.
For extracting the math expression between the curly braces, it uses a named capturing group with name mathExpression.
_Expression\{(?<mathExpression>.+?)\}
_Expression\{ : start with the fixed text _Expression{
(?<mathExpression> : start a named capturing group with name mathExpression
.+? : take the next characters in a non greedy way
) : end the named capturing group
\} : end with the fixed character }
The example below outputs 2 matches:
Regex regex = new(@"_Expression\{(?<mathExpression>.+?)\}");
var matches = regex.Matches(@"_Expression{4.5} _Expression{4+4}");
foreach (Match match in matches.Where(o => o.Success))
{
var mathExpression = match.Groups["mathExpression"];
Console.WriteLine(mathExpression);
}
Output
4.5
4+4
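To get from matching to the replacement the question asks for, one option is to let Regex.Replace substitute the named group directly. The sketch below is only one possible shape: the class name ExpressionFilter and the method name Strip are invented for this example. It also sidesteps a likely cause of the odd results in the question, where match.Value is passed to Regex.Replace as a pattern, so text such as {4}, +, * or ^ is interpreted as regex syntax rather than as literal characters.

```csharp
using System.Text.RegularExpressions;

public static class ExpressionFilter
{
    // Same pattern as in the answer above, with the braces escaped.
    private static readonly Regex Pattern =
        new Regex(@"_Expression\{(?<mathExpression>.+?)\}");

    // Replaces every "_Expression{...}" with just the text between
    // the braces, using the ${name} substitution syntax.
    public static string Strip(string input) =>
        Pattern.Replace(input, "${mathExpression}");
}
```

With this, ExpressionFilter.Strip("_Expression{4.5} _Expression{4+4}") should return "4.5 4+4" in a single call, with no loop over the matches.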
I have a couple of strings and I want them to be transformed like shown below
In the first two examples " is included in the input string.
But " does not always come with the input string, as shown in the last two examples.
Basically I need the string between |" and "| or string between first and last occurrence of |
Can someone please let me know how to find the match for the output string that I need which will work for all of these strings? I am trying to code these in C#.
Thanks in advance for any help
I would propose an alternative to regex. Simply using Substring and Replace:
List<string> input = new List<string>
{
"501000061|\"B084PD449Q|2088|1\"|",
"504000585|\"B000NSIAG0|3115|0\"|",
"508000036|B084S1FVH5|42|1|",
"504000584|B000NSIAG0|3115|0|"
};
foreach (var element in input)
{
string transformed = element.Substring(10, element.Length - 11)
.Replace("\"", string.Empty);
Console.WriteLine(transformed);
}
Output:
B084PD449Q|2088|1
B000NSIAG0|3115|0
B084S1FVH5|42|1
B000NSIAG0|3115|0
^[0-9]{9}\|"?([A-Z0-9]{10})\|([0-9]{2,})\|([0-9])"?\|$
This regex is a bit more rigid than the ones already proposed; it is based on the input examples that you've provided. Each quantifier sits inside its capturing group so that the group captures the whole field rather than only its last character.
I suggest you look into the non-regex solutions that have already been pointed out. However, if you absolutely must use regex, here's how to do it in C# for this example:
var pattern = @"^[0-9]{9}\|""?([A-Z0-9]{10})\|([0-9]{2,})\|([0-9])""?\|$";
var replacement = "$1|$2|$3";
var input = "501000061|\"B084PD449Q|2088|1\"|";
var result = Regex.Replace(input, pattern, replacement);
Here's one without regex:
using System;
using System.Linq;
public class Program
{
public static void Main()
{
var inp = "504000585|\"B000NSIAG0|3115|0\"|";
var res = string
.Join("|", inp.Split(new []{'|'}, StringSplitOptions.RemoveEmptyEntries).Skip(1))
.Replace("\"", "");
Console.WriteLine(res);
}
}
https://dotnetfiddle.net/i39XUY
B000NSIAG0|3115|0
What is the regular expression (in JavaScript, if it matters) to match only if the text is an exact match? That is, there should be no extra characters at either end of the string.
For example, if I'm trying to match for abc, then 1abc1, 1abc, and abc1 would not match.
Use the start and end delimiters: ^abc$
It depends. You could
string.match(/^abc$/)
But that would not match the following string: 'the first 3 letters of the alphabet are abc. not abc123'
I think you would want to use \b (word boundaries):
var str = 'the first 3 letters of the alphabet are abc. not abc123';
var pat = /\b(abc)\b/g;
console.log(str.match(pat));
Live example: http://jsfiddle.net/uu5VJ/
If the former solution works for you, I would advise against using it.
That means you may have something like the following:
var strs = ['abc', 'abc1', 'abc2']
for (var i = 0; i < strs.length; i++) {
if (strs[i] == 'abc') {
//do something
}
else {
//do something else
}
}
While you could use
if (strs[i].match(/^abc$/g)) {
//do something
}
It would be considerably more resource-intensive. For me, a general rule of thumb is for a simple string comparison use a conditional expression, for a more dynamic pattern use a regular expression.
More on JavaScript regexes: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
Use "^" for the beginning of the line and "$" for the end of it. E.g.:
var re = /^abc$/;
Would match "abc" but not "1abc" or "abc1". You can learn more at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
I am trying to see if my string starts with a string in an array of strings I've created. Here is my code:
string x = "Table a";
string y = "a table";
string[] arr = new string[] { "table", "chair", "plate" };
if (arr.Contains(x.ToLower())){
// this should be true
}
if (arr.Contains(y.ToLower())){
// this should be false
}
How can I make it so my first if statement comes up true? I'd like to match just the beginning of string x against the contents of the array, ignoring case and any following characters. I thought I needed regex to do this, but I could be mistaken. I'm a bit of a newbie with regex.
It seems you want to check if your string contains an element from your list, so this should be what you are looking for:
if (arr.Any(c => x.ToLower().Contains(c)))
Or simpler:
if (arr.Any(x.ToLower().Contains))
Or based on your comments you may use this:
if (arr.Any(x.ToLower().Split(' ')[0].Contains))
Because you said you want regex...
you can define a regex as var regex = new Regex("(table|plate|fork)");
and check it with if (regex.IsMatch(myString)) { ... }
But for the issue at hand you don't have to use Regex, as you are searching for an exact substring. You can use
(as @S.Akbari mentioned): if (arr.Any(c => x.ToLower().Contains(c))) { ... }
Enumerable.Contains matches exact values (and there is no built-in comparer that checks for "starts with"), so you need Any, which takes a predicate that receives each array element as a parameter and performs the check. The first step is to turn "contains" the other way around, so that the given string is checked for containing an element from the array:
var myString = "some string";
if (arr.Any(arrayItem => myString.Contains(arrayItem)))...
Now, you are actually asking for "string starts with given word" and not just "contains", so you need StartsWith (which, unlike Contains, conveniently lets you specify case sensitivity; see "Case insensitive 'Contains(string)'"):
if (arr.Any(arrayItem => myString.StartsWith(
arrayItem, StringComparison.CurrentCultureIgnoreCase))) ...
Note that this code will accept "tableAAA bob". If you really need to break on a word boundary, a regular expression may be a better choice. Building regular expressions dynamically is trivial as long as you properly escape all the values.
The regex should consist of:
beginning of string - ^
the properly escaped word you are searching for - Escape Special Character in Regex
word break - \b
if (arr.Any(arrayItem => Regex.IsMatch(myString,
    String.Format(@"^{0}\b", Regex.Escape(arrayItem)),
    RegexOptions.IgnoreCase))) ...
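Pulled together, the word-boundary check above could look roughly like the following. The class and method names (PrefixMatcher, StartsWithAnyWord) are assumptions for this sketch, and it presumes the array entries end in a letter or digit so that \b behaves as a word break after them.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixMatcher
{
    // Hypothetical helper: true when text begins with one of the words,
    // ignoring case, and the word is followed by a word boundary.
    public static bool StartsWithAnyWord(string text, string[] words)
    {
        return words.Any(word => Regex.IsMatch(
            text,
            "^" + Regex.Escape(word) + @"\b",
            RegexOptions.IgnoreCase));
    }
}
```

For the strings in the question, "Table a" would be accepted and "a table" rejected; "tableAAA bob" is rejected as well, because there is no word boundary right after "table".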
You can do something like the following using TypeScript. Instead of startsWith you can also use includes, a strict equality check, etc.
public namesList: Array<string> = ['name1','name2','name3','name4','name5'];
// SomeString = 'name1, Hello there';
private isNamePresent(SomeString : string):boolean{
if (this.namesList.find(name => SomeString.startsWith(name)))
return true;
return false;
}
I think I understand what you are trying to say here, although there is still some ambiguity. Are you trying to see if one word in your string (which is a sentence) exists in your array?
@Amy is correct, this might not have to do with regex at all.
I think this segment of code will do what you want in Java (which can easily be translated to C#):
Java:
// assumes x is a String and arr is a String[]
x = x.toLowerCase();
String[] words = x.split("\\s+");
for (String word : words) {
    for (String element : arr) {
        if (element.equals(word)) {
            return true;
        }
    }
}
return false;
You can also use a Set to store the elements in your array, which can make look up more efficient.
Java:
// needs java.util.Arrays, java.util.HashSet and java.util.Set
x = x.toLowerCase();
String[] words = x.split("\\s+");
Set<String> set = new HashSet<>(Arrays.asList(arr));
for (String word : words) {
    if (set.contains(word)) {
        return true;
    }
}
return false;
Edit: (12/22, 11:05am)
I rewrote my solution in C#, thanks to reminders by @Amy and @JohnyL. Since the author only wants to match the first word of the string, this edited code should work :)
C#:
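// x is the sentence to test, arr holds the candidate first words
static bool Contains(string x, string[] arr)
{
    x = x.ToLower();
    string[] words = x.Split(' ');
    var set = new HashSet<string>(arr);
    // only the first word of the string is compared
    return set.Contains(words[0]);
}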
Sorry my question was so vague but here is the solution thanks to some help from a few people that answered.
var regex = new Regex("^(table|chair|plate) *.*");
if (regex.IsMatch(x.ToLower())){}
I have a string for example: "GamerTag":"A Talented Boy","GamerTileUrl" and what I have been trying and failing to get is the value: A Talented Boy. I need help creating a regex string to get specifically just A Talented Boy. Can somebody please help me!
var str = "\"GamerTag\":\"A Talented Boy\",\"GamerTileUrl\"";
var colonParts = str.Split(':');
if (colonParts.Length >= 2) {
var commaParts = colonParts[1].Split(',');
var aTalentedBoy = commaParts[0];
var gamerTileUrl = commaParts[1];
}
This allows you to also get other parts of the comma-separated list.
Suppose s is your string (no check here):
s = s.Split(':')[1].Split(',')[0].Trim('"');
If you want to have a Regex solution, here it is:
s = "\"GamerTag\":\"A Talented Boy\",\"GamerTileUrl\"";
Regex reg = new Regex("(?<=:\").+?(?=\")");
s = reg.Match(s).Value;
You can use string methods:
string result = text.Split(':').Last().Split(',').First().Trim('"');
The First/Last extension methods prevent exceptions when the separators are missing.
I think it's safe to assume that your string is actually bigger than what you showed us and that it contains multiple key/value pairs? I think this will do what you are looking for:
str.Split("GamerTag\":")[1].Split("\"")[1];
The first split targets GamerTag": and gets everything after it. The second split gets everything between the first and second " in the chunk that follows GamerTag":
How about this?
\:\"([^\"]+)\"
This matches the colon and the opening quote, then captures any non-quote characters up to the next quote.
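In C# the pattern can be applied along these lines. This is only a sketch: GamerTagParser and ExtractFirstValue are names invented for the example, and it returns the first quoted value that follows a colon, which for the sample string is the GamerTag value.

```csharp
using System.Text.RegularExpressions;

public static class GamerTagParser
{
    // Hypothetical helper: returns the text inside the first :"..." pair,
    // or null when the input contains no such pair.
    public static string ExtractFirstValue(string input)
    {
        // In a verbatim string a double quote is written as "".
        Match match = Regex.Match(input, @":""([^""]+)""");
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

The value comes from capture group 1, so the surrounding colon and quotes are not part of the result.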