c# regex clarification [duplicate]

What is the regular expression (in JavaScript if it matters) to only match if the text is an exact match? That is, there should be no extra characters at either end of the string.
For example, if I'm trying to match for abc, then 1abc1, 1abc, and abc1 would not match.

Use the start and end delimiters: ^abc$

It depends. You could use
string.match(/^abc$/)
But that would not match the following string: 'the first 3 letters of the alphabet are abc. not abc123'
I think you would want to use \b (word boundaries):
var str = 'the first 3 letters of the alphabet are abc. not abc123';
var pat = /\b(abc)\b/g;
console.log(str.match(pat));
Live example: http://jsfiddle.net/uu5VJ/
If the former solution works for you, I would advise against using it, because it means you are really doing a plain string comparison.
You may have something like the following:
var strs = ['abc', 'abc1', 'abc2'];
for (var i = 0; i < strs.length; i++) {
    if (strs[i] == 'abc') {
        // do something
    }
    else {
        // do something else
    }
}
While you could use
if (strs[i].match(/^abc$/)) {
    // do something
}
It would be considerably more resource-intensive. For me, a general rule of thumb is: for a simple string comparison use a conditional expression; for a more dynamic pattern use a regular expression.
More on JavaScript regexes: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions

"^" matches the beginning of the line and "$" matches the end of it. E.g.:
var re = /^abc$/;
Would match "abc" but not "1abc" or "abc1". You can learn more at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
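Since the title mentions C#, here is a minimal sketch of the same anchored match in .NET. The helper name IsExactly is made up for illustration. One caveat worth knowing: in .NET, $ also matches just before a trailing newline, so \z is the stricter end anchor.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true only when the whole input is exactly "abc".
// \A and \z anchor to the very start and very end of the string.
static bool IsExactly(string input) => Regex.IsMatch(input, @"\Aabc\z");

Console.WriteLine(IsExactly("abc"));    // True
Console.WriteLine(IsExactly("1abc"));   // False
Console.WriteLine(IsExactly("abc1"));   // False

// With ^ and $ a trailing newline is still accepted; with \z it is not.
Console.WriteLine(Regex.IsMatch("abc\n", "^abc$"));   // True
Console.WriteLine(IsExactly("abc\n"));                // False
```

If a trailing newline is acceptable in your input, ^abc$ is fine as it is.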

Related

Regular expression issues in .net 6 value converter

I am trying to learn some .net6 and c# and I am struggling a lot with regular expressions. More specifically, I am using Avalonia on Windows, if that is relevant.
I am trying to build a small app with 2 textboxes. I write text in one and get the text "filtered" in the other one using a value converter.
I would like to filter math expressions and try to solve them later on. Something simple, a way of writing math as text and getting results in real time.
I have been trying for several weeks to figure out this regular expression on my own with no success whatsoever.
I would like to replace in my string "_Expression{BLABLA}" for "BLABLA". For testing my expressions I have been checking in http://regexstorm.net/ and https://regex101.com/ and according to them my matches should be correct (unless I misunderstood the results). But the results in my little app are extremely odd to me and I finally decided to ask for help.
Here is my code:
private static string? FilterStr(object value)
{
    if (value is string str)
    {
        string pattern = @"\b_Expression{(.+?)\w*}";
        Regex rgx = new(pattern);
        foreach (Match match in rgx.Matches(str))
        {
            string aux = "";
            aux = match.Value;
            aux = Regex.Replace(aux, @"_Expression{", "");
            aux = Regex.Replace(aux, @"[\}]", "");
            str = Regex.Replace(str, match.Value, aux);
        }
        return new string(str);
    }
    return null;
}
Then the results for some sample inputs are:
Input:
Some text
_Expression{x}
_Expression{1}
_Expression{4}
_Expression{4.5} _Expression{4+4}
_Expression{4-4} _Expression{4*x}
_Expression{x/x}
_Expression{x^4}
_Expression{sin(x)}
Output:
Some text
x
1{1}
1{4}
1{4.5} 1{4+4}
1{4-4} 1{4*x}
1{x/x}
1{x^4}
1{sin(x)}
or
Input:
Some text
_Expression{x}
_Expression{4}
_Expression{4.5} _Expression{4+4}
_Expression{4-4} _Expression{4*x}
_Expression{x/x}
_Expression{x^4}
_Expression{sin(x)}
Output:
Some text
x
_Expression{4}
4.5 _Expression{4+4}
4-4 _Expression{4*x}
x/x
_Expression{x^4}
_Expression{sin(x)}
This behaviour is very confusing to me. I can't see why "(.+?)" works with some of them and not with others. Maybe I haven't defined something properly, or my Replace is wrong? I can't see it...
Thanks a lot for the time! :)

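Your regular expression does not escape the curly braces { and }, which have a special meaning in a regular expression: they are used for quantifiers. This matters most in the Regex.Replace(str, match.Value, aux) call, where the matched text itself is used as a pattern: _Expression{1} then means "_Expressio followed by exactly one n", which is why every _Expression in your first sample gets replaced by 1.
Use the pattern below instead.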
For extracting the math expression between the curly braces, it uses a named capturing group with name mathExpression.
_Expression\{(?<mathExpression>.+?)\}
_Expression\{ : start with the fixed text _Expression{
(?<mathExpression> : start a named capturing group with name mathExpression
.+? : take the next characters in a non greedy way
) : end the named capturing group
\} : end with the fixed character }
The below example will output 2 matches
Regex regex = new(@"_Expression\{(?<mathExpression>.+?)\}");
var matches = regex.Matches(@"_Expression{4.5} _Expression{4+4}");
foreach (Match match in matches.Where(o => o.Success))
{
    var mathExpression = match.Groups["mathExpression"];
    Console.WriteLine(mathExpression);
}
Output
4.5
4+4
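The whole loop from the question can also be collapsed into a single Regex.Replace call. The following is a minimal sketch rather than the answerer's code: the Unwrap helper name is an assumption, and it uses the ${mathExpression} substitution to keep only the captured text.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces every _Expression{...} with its inner text.
static string Unwrap(string input) =>
    Regex.Replace(input, @"_Expression\{(?<mathExpression>.+?)\}", "${mathExpression}");

Console.WriteLine(Unwrap("_Expression{4.5} _Expression{4+4}"));   // 4.5 4+4
Console.WriteLine(Unwrap("_Expression{1} _Expression{sin(x)}"));  // 1 sin(x)

// If matched text must be reused as a pattern, escape it first;
// otherwise "{1}" is read as a quantifier on the preceding "n".
string raw = "_Expression{1}";
Console.WriteLine(Regex.IsMatch("_Expression", raw));               // True: {1} acts as a quantifier
Console.WriteLine(Regex.IsMatch("_Expression", Regex.Escape(raw))); // False: the brace is literal
```

Doing the replacement in one pass avoids feeding user text back into the regex engine at all, which sidesteps the quantifier problem entirely.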

Check array for string that starts with given one (ignoring case)

I am trying to see if my string starts with a string in an array of strings I've created. Here is my code:
string x = "Table a";
string y = "a table";
string[] arr = new string[] { "table", "chair", "plate" };
if (arr.Contains(x.ToLower())){
    // this should be true
}
if (arr.Contains(y.ToLower())){
    // this should be false
}
How can I make it so my first if statement comes up true? I'd like to just match the beginning of string x against the contents of the array while ignoring the case and the following characters. I thought I needed regex to do this but I could be mistaken. I'm a bit of a newbie with regex.

It seems you want to check if your string contains an element from your list, so this should be what you are looking for:
if (arr.Any(c => x.ToLower().Contains(c)))
Or simpler:
if (arr.Any(x.ToLower().Contains))
Or based on your comments you may use this:
if (arr.Any(x.ToLower().Split(' ')[0].Contains))

Because you said you want regex...
you can set a regex with var regex = new Regex("(table|plate|fork)");
and check it with if (regex.IsMatch(myString)) { ... }
But for the issue at hand you don't have to use Regex, as you are searching for an exact substring. You can use (as @S.Akbari mentioned):
if (arr.Any(c => x.ToLower().Contains(c))) { ... }

Enumerable.Contains matches exact values (and there is no built-in comparer that checks for "starts with"), so you need Any, which takes a predicate that receives each array element as a parameter and performs the check. So the first step is to turn "contains" the other way around, so that the given string has to contain an element from the array:
var myString = "some string";
if (arr.Any(arrayItem => myString.Contains(arrayItem)))...
Now, you are actually asking for "string starts with given word" and not just contains, so you need StartsWith (which, unlike Contains, conveniently lets you specify case sensitivity; see Case insensitive 'Contains(string)'):
if (arr.Any(arrayItem => myString.StartsWith(
    arrayItem, StringComparison.CurrentCultureIgnoreCase))) ...
Note that this code will accept "tableAAA bob". If you really need to break on a word boundary, a regular expression may be the better choice. Building regular expressions dynamically is trivial as long as you properly escape all the values.
The regex should consist of:
beginning of string - ^
properly escaped word you are searching for - Escape Special Character in Regex
word break - \b
if (arr.Any(arrayItem => Regex.IsMatch(myString,
    String.Format(@"^{0}\b", Regex.Escape(arrayItem)),
    RegexOptions.IgnoreCase))) ...
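To make the difference between the two checks concrete, here is a small sketch assuming the array holds plain words. The helper names StartsWithAny and StartsWithWord are invented for this example, and it uses OrdinalIgnoreCase rather than the culture-sensitive comparison shown above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string[] arr = { "table", "chair", "plate" };

// Prefix check only: also accepts "tableAAA bob".
bool StartsWithAny(string s) =>
    arr.Any(item => s.StartsWith(item, StringComparison.OrdinalIgnoreCase));

// Prefix followed by a word boundary: rejects "tableAAA bob".
bool StartsWithWord(string s) =>
    arr.Any(item => Regex.IsMatch(s, "^" + Regex.Escape(item) + @"\b", RegexOptions.IgnoreCase));

Console.WriteLine(StartsWithAny("Table a"));        // True
Console.WriteLine(StartsWithAny("a table"));        // False
Console.WriteLine(StartsWithAny("tableAAA bob"));   // True
Console.WriteLine(StartsWithWord("Table a"));       // True
Console.WriteLine(StartsWithWord("tableAAA bob"));  // False
```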

You can do something like the code below using TypeScript. Instead of startsWith you can also use contains or equals etc.
public namesList: Array<string> = ['name1','name2','name3','name4','name5'];
// SomeString = 'name1, Hello there';
private isNamePresent(SomeString: string): boolean {
    if (this.namesList.find(name => SomeString.startsWith(name)))
        return true;
    return false;
}

I think I understand what you are trying to say here, although there is still some ambiguity. Are you trying to see if one word in your string (which is a sentence) exists in your array?
@Amy is correct, this might not have anything to do with Regex at all.
I think this segment of code will do what you want in Java (which can easily be translated to C#):
Java:
x = x.toLowerCase();
String[] words = x.split("\\s+");
for (String word : words) {
    for (String element : arr) {
        if (element.equals(word)) {
            return true;
        }
    }
}
return false;
You can also use a Set to store the elements in your array, which can make look up more efficient.
Java:
x = x.toLowerCase();
String[] words = x.split("\\s+");
Set<String> set = new HashSet<>(Arrays.asList(arr));
for (String word : words) {
    if (set.contains(word)) {
        return true;
    }
}
return false;
Edit: (12/22, 11:05am)
I rewrote my solution in C#, thanks to reminders by @Amy and @JohnyL. Since the author only wants to match the first word of the string, this edited code should work :)
C#:
static bool contains(){
    x = x.ToLower();
    string[] words = x.Split(' ');
    var set = new HashSet<string>(arr);
    if(set.Contains(words[0])){
        return true;
    }
    return false;
}

Sorry my question was so vague, but here is the solution, thanks to help from a few of the people who answered.
var regex = new Regex("^(table|chair|plate) *.*");
if (regex.IsMatch(x.ToLower())){}

match first digits before # symbol

How do I match the leading digits before the first # in this line?
26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html
I'm trying to get this number: 26909578
My attempt:
string text = @"26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html";
MatchCollection m1 = Regex.Matches(text, @"(.+?)#", RegexOptions.Singleline);
but then it outputs all the text.

Make it explicit that it has to start at the beginning of the string:
@"^(.+?)#"
Alternatively, if you know that this will always be a number, restrict the possible characters to digits:
@"^\d+"
Alternatively use the function Match instead of Matches. Matches explicitly says, "give me all the matches", while Match will only return the first one.

Or, in a trivial case like this, you might also consider a non-RegEx approach. The IndexOf() method will locate the '#' and you could easily strip off what came before.
I even wrote a sscanf() replacement for C#, which you can see in my article A sscanf() Replacement for .NET.
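A minimal sketch of the IndexOf() approach, assuming the wanted value is everything before the first '#'. The helper name BeforeHash is hypothetical.

```csharp
using System;

// Hypothetical helper: returns the text before the first '#',
// or the whole string when there is no '#'.
static string BeforeHash(string line)
{
    int pos = line.IndexOf('#');
    return pos < 0 ? line : line.Substring(0, pos);
}

Console.WriteLine(BeforeHash("26909578#Sbrntrl_7x06-lilla.avi#356028416"));  // 26909578
Console.WriteLine(BeforeHash("no separator"));                                // no separator
```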

If you don't want to (or don't like to) use regex, use a StringBuilder and just loop until you hit the #, like this:
StringBuilder sb = new StringBuilder();
string yourdata = "yourdata";
int i = 0;
// stop at the first '#' or at the end of the string, whichever comes first
while (i < yourdata.Length && yourdata[i] != '#')
{
    sb.Append(yourdata[i]);
    i++;
}
// when you get to that # your StringBuilder will have the number you want in it, so return it with .ToString()
string answer = sb.ToString();

The entire string (except the final url) is composed of segments that can be matched by (.+?)#, so you will get several matches. Retrieve only the first match from the collection returned by matching .+?(?=#)

Match.Regex syntax

I have a string that can be either
"MyName (ctid 5555)"
or
"OtherName (id 555-5555-5555-555)"
I tried to write a regex to fetch ctid or id, like so:
"(?<=ctid=).+(?=))"
Checking here gave 0 results.
What's wrong with my syntax?

Try this pattern: (?<=\((?:ctid|id)\s).+?(?=\))
It uses a look-behind to check for "ctid" or "id" followed by whitespace, then it matches any content up till the closing parenthesis.
string[] inputs = { "MyName (ctid 5555)", "OtherName (id 555-5555-5555-555)" };
string pattern = @"(?<=\((?:ctid|id)\s).+?(?=\))";
foreach (var input in inputs)
{
    var result = Regex.Match(input, pattern).Value;
    Console.WriteLine(result);
}
If you clarify your question a better solution might exist. If you care to know whether the value was a "ctid" or an "id" then named capture groups could be used.

Based on your example, I am assuming you require a regex that matches each format explicitly:
// textToTest is the input string, e.g. "MyName (ctid 5555)"
var idRegEx = @"^.*?\s\(id\s(\d{3}-\d{4}-\d{4}-\d{3})\)$";
var ctIdRegex = @"^.*?\s\(ctid\s(\d{4})\)$";
try
{
    // Groups[1].Value is an empty string when the pattern does not match
    var idMatch = Regex.Match(textToTest, idRegEx, RegexOptions.IgnoreCase).Groups[1].Value;
    var ctIdMatch = Regex.Match(textToTest, ctIdRegex, RegexOptions.IgnoreCase).Groups[1].Value;
}
catch (ArgumentException)
{
    // Regex is wrong
}

Assuming that a ctid is always 4 digits, and an id is always 3-4-4-3 digits, and that either way it is enclosed in round brackets, I would do:
\((?:ctid (?<ctid>\d{4})|id (?<id>\d{3}-\d{4}-\d{4}-\d{3}))\)
This adds named groups and does validity checking at the same time. For example, you can use match.Groups["ctid"].Value to get the ctid value, or match.Groups["id"].Value to get the id value. Because there is validity checking, you'll never get (what I am assuming is) an invalid id like "(id 123)" (since it doesn't have the 3-4-4-3 pattern).
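As a rough illustration of how the named groups could be read back, here is a sketch that reports which alternative matched. The Describe helper and its output format are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

var pattern = new Regex(@"\((?:ctid (?<ctid>\d{4})|id (?<id>\d{3}-\d{4}-\d{4}-\d{3}))\)");

// Hypothetical helper: reports which kind of identifier was found, if any.
string Describe(string input)
{
    Match m = pattern.Match(input);
    if (!m.Success) return "no match";
    return m.Groups["ctid"].Success
        ? "ctid=" + m.Groups["ctid"].Value
        : "id=" + m.Groups["id"].Value;
}

Console.WriteLine(Describe("MyName (ctid 5555)"));               // ctid=5555
Console.WriteLine(Describe("OtherName (id 555-5555-5555-555)")); // id=555-5555-5555-555
Console.WriteLine(Describe("Broken (id 123)"));                  // no match
```

Checking Groups["ctid"].Success is what tells the two alternatives apart, since only one of the named groups takes part in any given match.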

Not sure what you want exactly
(?:(ct)?id)\s(.+?)\)
But this regex worked for me at
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
you just need to grab the 2nd group though...
If you don't really want the look around regex, then
\((ct)?id\s(.+?)\)
might do it as well (and is more readable for regex beginners)

Well, you're looking for 'ctid=' and the string has 'ctid '. You'll also need to escape the parenthesis in the lookahead (change ')' to '\)').

How can I find a string after a specific string/character using regex

I am hopeless with regex (C#), so I would appreciate some help:
Basically, I need to parse a text and find the following information inside it:
Sample text:
KeywordB: TextToFind the rest is not relevant but KeywordB: Text ToFindB and then some more text.
I need to find the word(s) after a certain keyword which may end with a “:”.
[UPDATE]
Thanks Andrew and Alan: Sorry for reopening the question but there is quite an important thing missing in that regex. As I wrote in my last comment, Is it possible to have a variable (how many words to look for, depending on the keyword) as part of the regex?
Or: I could have a different regex for each keyword (will only be a hand full). But still don't know how to have the "words to look for" constant inside the regex

The basic regex is this:
var pattern = @"KeywordB:\s*(\w*)";
\s* = any number of spaces
\w* = 0 or more word characters (non-space, basically)
() = make a group, so you can extract the part that matched
var pattern = @"KeywordB:\s*(\w*)";
var test = @"KeywordB: TextToFind";
var match = Regex.Match(test, pattern);
if (match.Success) {
    Console.Write("Value found = {0}", match.Groups[1]);
}
If you have more than one of these on a line, you can use this:
var test = @"KeywordB: TextToFind KeyWordF: MoreText";
var matches = Regex.Matches(test, @"(?:\s*(?<key>\w*):\s?(?<value>\w*))");
foreach (Match f in matches) {
    Console.WriteLine("Keyword '{0}' = '{1}'", f.Groups["key"], f.Groups["value"]);
}
Also, check out the regex designer here: http://www.radsoftware.com.au/. It is free, and I use it constantly. It works great to prototype expressions. You need to rearrange the UI for basic work, but after that it's easy.
(fyi) The "@" before strings means that \ no longer means something special, so you can type @"c:\fun.txt" instead of "c:\\fun.txt"

Let me know if I should delete the old post, but perhaps someone wants to read it.
The way to do a "words to look for" inside the regex is like this:
regex = #"(Key1|Key2|Key3|LastName|FirstName|Etc):"
What you are doing probably isn't worth the effort in a regex, though it can probably be done the way you want (still not 100% clear on requirements, though). It involves looking ahead to the next match, and stopping at that point.
Here is a re-write as a regex + regular functional code that should do the trick. It doesn't care about spaces, so if you ask for "Key2" like below, it will separate it from the value.
string[] keys = {"Key1", "Key2", "Key3"};
string source = "Key1:Value1Key2: ValueAnd A: To Test Key3: Something";
FindKeys(keys, source);
private void FindKeys(IEnumerable<string> keywords, string source) {
    var found = new Dictionary<string, string>(10);
    var keys = string.Join("|", keywords.ToArray());
    var matches = Regex.Matches(source, @"(?<key>" + keys + "):",
        RegexOptions.IgnoreCase);
    foreach (Match m in matches) {
        var key = m.Groups["key"].ToString();
        var start = m.Index + m.Length;
        var nx = m.NextMatch();
        var end = (nx.Success ? nx.Index : source.Length);
        found.Add(key, source.Substring(start, end - start));
    }
    foreach (var n in found) {
        Console.WriteLine("Key={0}, Value={1}", n.Key, n.Value);
    }
}
And the output from this is:
Key=Key1, Value=Value1
Key=Key2, Value= ValueAnd A: To Test
Key=Key3, Value= Something

/KeywordB\: (\w+)/
This matches the word that comes right after your keyword. As you didn't mention any terminator, I assumed that you wanted only the word next to the keyword.
