Regex to check a string

I'm trying to check a string and then extract all the variables that start with #. I can't find the appropriate regular expression to check the string. The string may start with # or ", and if it starts with " it must have a matching closing ".
Example 1:
"ip : "+#value1+"."+#value2+"."+#value3+"."+#value4
Example 2:
#nameParameter "#yahoo.com"
Thanks

It would probably be easiest to first split the string on each quoted string, then check the unquoted parts for #'s. For example, all quoted strings can be matched with /"[^"]*"/. Calling Regex.Split with that pattern returns an array of the non-quoted parts, and you can then run the expression /#\w+/ over each part to find the variables.
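The split-then-search idea above could be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the class name VariableExtractor and the method ExtractVariables are hypothetical, and it assumes quoted strings never contain escaped quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VariableExtractor
{
    // Hypothetical helper: returns every #name found outside double-quoted strings.
    public static List<string> ExtractVariables(string input)
    {
        var result = new List<string>();
        // Splitting on quoted strings leaves only the unquoted parts.
        foreach (string part in Regex.Split(input, "\"[^\"]*\""))
        {
            foreach (Match m in Regex.Matches(part, @"#\w+"))
            {
                result.Add(m.Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Example 1 from the question.
        string text = "\"ip : \"+#value1+\".\"+#value2+\".\"+#value3+\".\"+#value4";
        Console.WriteLine(string.Join(", ", ExtractVariables(text)));
        // prints: #value1, #value2, #value3, #value4
    }
}
```

For Example 2 from the question, the same helper returns only #nameParameter, because "#yahoo.com" is removed by the split.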

Try this:
string text = "#nameParameter \"#yahoo.com\"";
// The negative lookbehind skips any # that directly follows a double quote.
Regex variables = new Regex(@"(?<!"")#\w+", RegexOptions.Compiled);
foreach (Match match in variables.Matches(text))
{
    Console.WriteLine(match.Value);
}

To check the strings you have provided in your post:
(^("[^"\r\n]"\s+#[\w.]+\s*+?)+)|(((^#[\w.]+)|("#[\w.]+"))\s*)+

Related

Replacing a portion of a string with an exact matching

I want to replace a portion of a string only if it exactly matches the given text.
My use case is as follows:
var text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
string result = text.Replace("wd:response", "response");
/*
* expecting the below text
<response><wd:response-data></wd:response-data></response>
*
*/
I followed the following answers:
Way to have String.Replace only hit "whole words"
Regular expression for exact match of a string
But I failed to achieve what I want.
Please share your thoughts/solutions.
Sample on
https://dotnetfiddle.net/pMkO8Q

In general, you should really be parsing and manipulating XML as XML, using functions that know how XML works and what's legal in the language. Regex and other naive text manipulation will often lead you into trouble.
That said, for a very simple solution to this specific problem, you can do this with two replaces:
var text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
text = text.Replace("wd:response>", "response>").Replace("wd:response ", "response ");
(Note the spaces at the end of the parameters to the second replace.)
Alternatively, use a regex similar to "wd:response\s*>".
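The regex alternative mentioned above might look like the sketch below. The class name TagRenamer and the method StripPrefix are made up for illustration; the capture group around \s*> is an addition so that the original whitespace and the closing > can be put back with $1.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagRenamer
{
    // Hypothetical helper: drops the "wd:" prefix only where "wd:response"
    // is followed by optional whitespace and ">", so "wd:response-data" is untouched.
    public static string StripPrefix(string xml)
    {
        return Regex.Replace(xml, @"wd:response(\s*>)", "response$1");
    }

    public static void Main()
    {
        string text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
        Console.WriteLine(StripPrefix(text));
        // prints: <response><wd:response-data></wd:response-data></response >
    }
}
```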

The easiest way to achieve the result shown in your .NET fiddle is to use Replace as below.
string result = text.Replace("wd:response>", "response>");
But the proper way to achieve this is to parse the text as XML.

You can capture the string wd:response in a capturing group and replace it with Regex.Replace and a MatchEvaluator, like this.
Regex explanation - </?(wd:response)\s*>
Match < literally
Match / optionally, hence the ?
Match the string wd:response and place it in a capturing group enclosed with ()
Match any amount of optional whitespace with \s*
Match > literally
using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main(string[] args)
    {
        string text = "<wd:response><wd:response-data></wd:response-data></wd:response >";
        string replacePattern = "response";
        string pattern = @"</?(wd:response)\s*>";
        string replacedPattern = Regex.Replace(text, pattern, match =>
        {
            // Extract the first group
            Group group = match.Groups[1];
            // Offset of the group inside the matched text
            int offset = group.Index - match.Index;
            // Replace the group value with the replacePattern
            return string.Format("{0}{1}{2}", match.Value.Substring(0, offset), replacePattern, match.Value.Substring(offset + group.Length));
        });
        Console.WriteLine(replacedPattern);
    }
}
Outputting:
<response><wd:response-data></wd:response-data></response >

Splitting of a string using Regex

I have string of the following format:
string test = "test.BO.ID";
My aim is to get the part of the string that comes after the first dot.
So ideally I am expecting the output "BO.ID".
Here is what I have tried:
// Checking for the first occurrence and taking whatever comes after the dot
var output = Regex.Match(test, @"^(?=.).*?");
The output I am getting is empty.
What modification do I need to make to the regex?

You get an empty output because your pattern can match an empty string at the start of the string, and that is enough: .*? is a lazy subpattern, so it matches as few characters as possible (here, none), and the unescaped . in the lookahead matches any char rather than a literal dot.
Use (the value will be in Match.Groups[1].Value)
\.(.*)
or (with a lookbehind, to get the string as Match.Value)
(?<=\.).*
A non-regex approach is to use String.Split with a count argument:
var s = "test.BO.ID";
var res = s.Split(new[] {"."}, 2, StringSplitOptions.None);
if (res.Length > 1)
    Console.WriteLine(res[1]);
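The two regex options above can be sketched side by side as follows. The class and method names are hypothetical, and returning an empty string when there is no dot is an assumption made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AfterFirstDot
{
    // Uses a capturing group: the text after the first dot is in Groups[1].
    public static string WithGroup(string s)
    {
        Match m = Regex.Match(s, @"\.(.*)");
        return m.Success ? m.Groups[1].Value : "";
    }

    // Uses a lookbehind: the text after the first dot is the whole match.
    public static string WithLookbehind(string s)
    {
        Match m = Regex.Match(s, @"(?<=\.).*");
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        Console.WriteLine(WithGroup("test.BO.ID"));      // BO.ID
        Console.WriteLine(WithLookbehind("test.BO.ID")); // BO.ID
    }
}
```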

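If you only want the part after the first dot you don't need a regex at all. Add 1 to the index so the dot itself is skipped (if there is no dot, IndexOf returns -1 and the whole string is returned):
x.Substring(x.IndexOf('.') + 1)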

Regular Expression without braces

I have the following sample cases:
1) "Sample"
2) "[10,25]"
I want to form a single regular expression pattern that, when applied to the above examples, returns "Sample" and "10,25".
Note: the input strings do not include the quotes.
I came up with the expression (?<=\[)(.*?)(?=\]). It satisfies the second case and retrieves only "10,25", but for the first case it returns blank. I want "Sample" to be returned. Can anyone help me?
C#.

Here you go: a small regex using a positive lookbehind. These are sometimes very handy.
Regex
(?<=^|\[)([\w,]+)
Test string
Sample
[10,25]
Result
MATCH 1
[0-6] Sample
MATCH 2
[8-13] 10,25
try at regex101.com
If the " marks are included in your original string, use the regex below, which also accepts a " before the value. You can remove ^| from the lookbehind if the " mark is always present, or leave it as it is if your text mixes values with and without " marks.
Regex
(?<=^|\[|\")([\w,]+)
try at regex101.com
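Since the question is tagged C#, here is a minimal sketch of how the first lookbehind pattern could be applied with the .NET Regex class. The names BracketStripper and Extract are hypothetical, and the capturing group is dropped because the whole match is already the wanted value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketStripper
{
    // Returns the run of word characters and commas that starts either at the
    // beginning of the input or right after an opening square bracket.
    public static string Extract(string input)
    {
        Match m = Regex.Match(input, @"(?<=^|\[)[\w,]+");
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        Console.WriteLine(Extract("Sample"));  // Sample
        Console.WriteLine(Extract("[10,25]")); // 10,25
    }
}
```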

As far as I can tell, the below regex should help:
Regex regex = new Regex(@"^\w+|[[](\w)+\,(\w)+[]]$");
This will match a single word at the start of the string, or two alphanumeric words separated by a comma inside square brackets. Note that in the bracketed case the match value still includes the brackets.

One Java example:
// String input = "Sample";
String input = "[10,25]";
String text = "[^,\\[\\]]+";
Pattern pMod = Pattern.compile("(" + text + ")|(?>\\[(" + text + "," + text + ")\\])");
Matcher mMod = pMod.matcher(input);
while (mMod.find()) {
    if (mMod.group(1) != null) {
        System.out.println(mMod.group(1));
    }
    if (mMod.group(2) != null) {
        System.out.println(mMod.group(2));
    }
}
If the input is "[hello&bye,25|35]", then the output is hello&bye,25|35

Regular expressions in C# for extracting parts

I have this text:
" </SYM field/NN name=/IN ""/"" object/NN ""/"" >/SYM Categories/NNS :/: Cars/NNS ,/, About/RB Model/NNP :/: "
I would like to extract values such as
Categories/NNS :/: Cars/NNS ,/, About/RB
where the pattern is
WORD + /NNS + :/: ANYTHING until you reach the same pattern
I tried:
Match match = Regex.Match(input, @"([A-Za-z0-9\-]+)/NNS :/: ([A-Za-z0-9\-/s]+)",
    RegexOptions.IgnoreCase);
if (match.Success)
{
string key = match.Groups[1].Value;
Console.WriteLine(key);
}
and the answer I got back was:
Categories
instead of
Categories/NNS :/: Cars/NNS ,/, About/RB
What am I doing wrong?

You need to enclose the parts of the regex you want in the result inside parentheses.
To obtain what you're looking for, replace your regexp with the following (not tested, and I don't know the C# regex specifics, but it should be OK):
"((?:[A-Za-z0-9\-]+)/NNS :/: (?:[A-Za-z0-9\-/s]+))"
The outer parentheses mean that you'll get the entire string as the result.
An opening parenthesis followed by ?: marks a non-capturing group, meaning you don't want that part as a separate result.
If you left out the ?:, the result would contain your entire string, then the string matching the first sub-regex, then the string matching the second sub-regex.

Why don't you use match.Value? Everything you put in parentheses represents a group, but it looks like you want the whole thing.
Match match = Regex.Match(input, @"([A-Za-z0-9\-]+)/NNS :/: ([A-Za-z0-9\-/s]+)",
    RegexOptions.IgnoreCase);
if (match.Success)
{
    string key = match.Value;
    Console.WriteLine(key);
}
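To make the difference between match.Value and the numbered groups concrete, here is a small sketch that runs the pattern from the question on a shortened input. The class name MatchParts and the method Describe are hypothetical. One thing it also shows: the character class contains a literal / and s rather than \s, so the match stops at the first space after "Cars/NNS".

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchParts
{
    // Pattern taken unchanged from the question.
    private const string Pattern = @"([A-Za-z0-9\-]+)/NNS :/: ([A-Za-z0-9\-/s]+)";

    // Returns { whole match, group 1, group 2 }, or an empty array if nothing matched.
    public static string[] Describe(string input)
    {
        Match match = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        if (!match.Success)
        {
            return new string[0];
        }
        return new[] { match.Value, match.Groups[1].Value, match.Groups[2].Value };
    }

    public static void Main()
    {
        // Shortened tail of the text from the question.
        string input = "Categories/NNS :/: Cars/NNS ,/, About/RB Model/NNP :/: ";
        foreach (string part in Describe(input))
        {
            Console.WriteLine(part);
        }
        // prints:
        // Categories/NNS :/: Cars/NNS
        // Categories
        // Cars/NNS
    }
}
```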

match first digits before # symbol

How do I match the first digits before # in this line?
26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html
I'm trying to get this number: 26909578
My attempt:
string text = @"26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#[URL=http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html]http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html[/URL]#http://bitshare.com/files/dvk9o1oz/Sbrntrl_7x06-lilla.avi.html#http://bitshare.com/?f=dvk9o1oz#http://bitshare.com/delete/dvk9o1oz/4511e6f3612961f961a761adcb7e40a0/Sbrntrl_7x06-lilla.avi.html";
MatchCollection m1 = Regex.Matches(text, @"(.+?)#", RegexOptions.Singleline);
but then it outputs all the text.

Make it explicit that the match has to start at the beginning of the string:
@"^(.+?)#"
Alternatively, if you know that this will always be a number, restrict the possible characters to digits:
@"^\d+"
Alternatively, use the function Match instead of Matches. Matches explicitly says "give me all the matches", while Match only returns the first one.
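Combining the last two suggestions (anchor to the start, allow only digits, and call Match rather than Matches) could look like this sketch. The names LeadingId and FirstDigits are hypothetical, and the input is a shortened version of the line from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeadingId
{
    // Returns the run of digits at the very start of the line, or "" if there is none.
    public static string FirstDigits(string line)
    {
        Match m = Regex.Match(line, @"^\d+");
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        string line = "26909578#Sbrntrl_7x06-lilla.avi#356028416#2012-10-24 09:06#0";
        Console.WriteLine(FirstDigits(line)); // 26909578
    }
}
```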

Or, in a trivial case like this, you might also consider a non-RegEx approach. The IndexOf() method will locate the '#' and you could easily strip off what came before.
I even wrote a sscanf() replacement for C#, which you can see in my article A sscanf() Replacement for .NET.
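The IndexOf() approach mentioned above might be sketched as follows. BeforeHash and TakeBeforeHash are hypothetical names, and returning the whole line when no '#' is present is an assumption made for the example.

```csharp
using System;

public static class BeforeHash
{
    // Returns everything before the first '#'.
    // Assumption: if the line contains no '#', the whole line is returned.
    public static string TakeBeforeHash(string line)
    {
        int hash = line.IndexOf('#');
        return hash < 0 ? line : line.Substring(0, hash);
    }

    public static void Main()
    {
        Console.WriteLine(TakeBeforeHash("26909578#Sbrntrl_7x06-lilla.avi#356028416")); // 26909578
    }
}
```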

If you don't want to use regex, use a StringBuilder and just loop until you hit the #, like this:
StringBuilder sb = new StringBuilder();
string yourdata = "yourdata";
int i = 0;
// Stop at the first '#', or at the end of the string if there is none.
while (i < yourdata.Length && yourdata[i] != '#')
{
    sb.Append(yourdata[i]);
    i++;
}
// At this point the StringBuilder holds everything before the '#', so return it with ToString().
string answer = sb.ToString();

The entire string (except the final URL) is composed of segments that can be matched by (.+?)#, so you will get several matches. Retrieve only the first match from the collection returned by matching .+?(?=#).
