How to use C# regex to fetch a substring between recognized patterns? - c#

My C# code stores some text.
I want to fetch words that have no known pattern but appear between words that do. I don't want to fetch the words that match the known patterns.
For example:
My company! 02-45895438 more details: myDomain.mysite.com
Can I fetch them with something like this?
<vendorName?>\\s*\\d{2}-d{6}\\s*more details: <site?>
vendorName = "My company!" or "My company! "
site = "myDomain.mysite.com"
Is there any way to do so with regex?

From your description, it seems you want to find "myDomain.mysite.com" in the string "My company! 02-45895438 more details: myDomain.mysite.com". If that's the case, you can use a regex similar to this one to get the string you want:
(?<=My company! 02-45895438 more details: ).*
The lookbehind requires the preceding text to be present but omits it from the match, so only the substring after it is returned. Note that the quantifier has to be greedy here: a lazy .*? at the end of a pattern matches the empty string.

You can do this by using parentheses. For example, this will give you the contents of a bold tag:
<b>([^<]+)</b>
You can then use Regex.Match to get a Match object, and read the groups via Match.Groups. Each set of parentheses is a group. Groups[0] is always the entire match, so in this case Groups[1] is the one group that contains the tag's content.

This is the syntax I was looking for:
(?<TheServer>\w*)
as in:
string matchPattern = @"\\\\(?<TheServer>\w*)\\(?<TheService>\w*)\\";
See:
http://en.csharp-online.net/CSharp_Regular_Expression_Recipes%E2%80%94Extracting_Groups_from_a_MatchCollection
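Applied to the sample line from the question, named groups could look roughly like the sketch below. The class and method names are hypothetical, and the pattern assumes the phone number is always two digits, a dash, and more digits, followed by the literal text "more details:".

```csharp
using System;
using System.Text.RegularExpressions;

static class VendorParser
{
    // Hypothetical helper: returns (vendorName, site), or null when the line does not match.
    public static Tuple<string, string> Parse(string line)
    {
        // The lazy .+? stops as soon as the phone number pattern can match after it.
        const string pattern =
            @"^(?<vendorName>.+?)\s*\d{2}-\d+\s*more details:\s*(?<site>\S+)";
        Match m = Regex.Match(line, pattern);
        return m.Success
            ? Tuple.Create(m.Groups["vendorName"].Value, m.Groups["site"].Value)
            : null;
    }

    static void Main()
    {
        var result = Parse("My company! 02-45895438 more details: myDomain.mysite.com");
        Console.WriteLine(result.Item1); // My company!
        Console.WriteLine(result.Item2); // myDomain.mysite.com
    }
}
```

Because \s* sits outside the vendorName group, the trailing space is not captured.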

Related

C# Extract part of the string that starts with specific word

I have multiple strings that look like this:
this is ab-skdn string
ab-sdhnif my string
For each string, I need to pull out the part that starts with ab-.
For example, I need ab-skdn and ab-sdhnif.
How could I do that using C#?
My code would look something like this:
var myString = "this is ab-skdn string";
match = "ab-skdn"
Use a regex to find the matches. A simple regex that does what you need is:
ab-[a-z]*
Since you didn't provide any code, I can't show how to use the regex in your context, but there are plenty of examples out there. A quick Google search on how to use Regex in C# should get you started. The link I provided also has some good examples.
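For the two sample strings above, a minimal sketch might look like this. The helper name is made up for illustration, and the \b word boundary is an addition that keeps the pattern from matching in the middle of a longer word such as "cab-x".

```csharp
using System;
using System.Text.RegularExpressions;

static class PrefixFinder
{
    // Hypothetical helper: returns the first "ab-..." token, or "" when there is none.
    public static string FindToken(string text)
    {
        Match m = Regex.Match(text, @"\bab-[a-z]*");
        return m.Success ? m.Value : string.Empty;
    }

    static void Main()
    {
        Console.WriteLine(FindToken("this is ab-skdn string")); // ab-skdn
        Console.WriteLine(FindToken("ab-sdhnif my string"));    // ab-sdhnif
    }
}
```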

How do I see if a string contains another string with quotes in it?

I am trying to see if a large string contains this line of HTML:
<label ng-class="choiceCaptionClass" class="ng-binding choice-caption">Was this information helpful?</label>
As you can see, this snippet has quotations in multiple places and it's causing problems when I do something like this:
Assert.IsTrue(responseContent.Contains("<label ng-class="choiceCaptionClass" class="ng - binding choice - caption">Was this information helpful?</label>"));
I've tried both of these ways of defining the string:
@"<label ng-class=""choiceCaptionClass"" class=""ng - binding choice - caption"">Was this information helpful?</label>"
and
"<label ng-class=\"choiceCaptionClass\" class=\"ng - binding choice - caption\">Was this information helpful?</label>"
But in each case the Contains() method looks for the literal string with either the double quotes or the backslashes. Is there another way I could define this string so I can correctly search for it?
Escaping the double-quotes with backslashes is the proper thing to do.
The reason your search may be failing is that the strings don't actually match. For example, in your version with backslashes, you have spaces around some of the dashes but your HTML string does not.
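A small sketch of this point: the backslash-escaped literal and the verbatim literal produce the same runtime string, and Contains fails only when the text itself differs. The surrounding div in the sample page is invented for the example.

```csharp
using System;

static class QuoteDemo
{
    // Same text written two ways: backslash escapes vs. a verbatim literal with doubled quotes.
    public const string Escaped =
        "<label ng-class=\"choiceCaptionClass\" class=\"ng-binding choice-caption\">Was this information helpful?</label>";
    public const string Verbatim =
        @"<label ng-class=""choiceCaptionClass"" class=""ng-binding choice-caption"">Was this information helpful?</label>";
    // The variant from the question, with stray spaces around the dashes.
    public const string WithSpaces =
        "<label ng-class=\"choiceCaptionClass\" class=\"ng - binding choice - caption\">Was this information helpful?</label>";

    static void Main()
    {
        string page = "<div>" + Escaped + "</div>"; // stand-in for responseContent

        Console.WriteLine(Escaped == Verbatim);       // True: the escapes exist only in source code
        Console.WriteLine(page.Contains(Verbatim));   // True
        Console.WriteLine(page.Contains(WithSpaces)); // False: the extra spaces break the match
    }
}
```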
Try using regular expressions. I made this one for you but you can test your own regex here.
var regex = new Regex(@"<label\s+ng-class\s*=\s*""choiceCaptionClass""\s+class\s*=\s*""ng-binding choice-caption""\s*>\s*Was this information helpful\?\s*</label>", RegexOptions.IgnoreCase);
Assert.IsTrue(regex.IsMatch(responseContent));
If this is not working, use the tester tool to figure out which part of the pattern fails to match.
Hope this helps!

asp.net c# allowing users to search string using multiple terms

I am trying to add a search feature to my application which will allow someone to enter several words and search for those in my data.
Doing single words and phrases is simple:
if (x.Title.ToUpper().Contains(tbSearch.Text.ToUpper()) || x.Description.ToUpper().Contains(tbSearch.Text.ToUpper()))
BUT how do I work out if someone entered a search for "red car" and the title was "the car that is red"? I know I could split on SPACE and then search for each term but this seems over complicated and I would also need to strip out non word characters.
I've been looking at using RegExes but am not sure if it would search for items in order or any order.
I guess I'm trying to basically create a simple google search in my application.
Have you considered using a proper search engine such as Lucene? The StandardAnalyzer in Lucene uses the StandardTokenizer, which takes care of (some) special characters, when tokenizing. It would for example split "red-car" into the tokens "red car", thereby "removing" special characters.
In order to search in multiple fields in a Lucene index, you could use the MultiFieldQueryParser.
I think you are looking for something like this:
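// Extension method: must be declared inside a static class.
// Requires: using System.Text.RegularExpressions;
public static bool HasWordsContaining(this string searchCriteria, string toFilter)
{
    // Match toFilter at the start of the text or right after a space, ignoring case.
    var regex = new Regex(string.Format("^{0}| {0}", Regex.Escape(toFilter)), RegexOptions.IgnoreCase);
    return regex.IsMatch(searchCriteria);
}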
Usage:
someList.Where(x=>x.Name.HasWordsContaining(searchedText)).ToList();
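For the "red car" versus "the car that is red" case from the question, the split-and-check approach can be quite short. The sketch below is one possible way, not something taken from the answers: the names are hypothetical, and it treats a term as found when it appears anywhere in the text, so "car" would also match "cart".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class MultiTermSearch
{
    // Hypothetical helper: true when every term in the query occurs in the text, in any order.
    public static bool ContainsAllTerms(string text, string query)
    {
        // Splitting on runs of non-word characters also strips punctuation from the query.
        string[] terms = Regex.Split(query, @"\W+")
                              .Where(t => t.Length > 0)
                              .ToArray();

        return terms.Length > 0 &&
               terms.All(t => text.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0);
    }

    static void Main()
    {
        Console.WriteLine(ContainsAllTerms("the car that is red", "red car")); // True
        Console.WriteLine(ContainsAllTerms("the bus that is red", "red car")); // False
    }
}
```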
You might use SQL Server full-text search for this, via CONTAINSTABLE or FREETEXTTABLE. You can use a SPROC and pass in the search string. The example below uses FREETEXTTABLE:
USE AdventureWorks2012
GO
SELECT
    KEY_TBL.RANK,
    FT_TBL.Description
FROM
    Production.ProductDescription AS FT_TBL
    INNER JOIN
    FREETEXTTABLE
    (
        Production.ProductDescription,
        Description,
        'perfect all-around bike'
    ) AS KEY_TBL
    ON FT_TBL.ProductDescriptionID = KEY_TBL.[KEY]
ORDER BY KEY_TBL.RANK DESC
GO
https://msdn.microsoft.com/en-us/library/ms142583.aspx

Correction in this simple regular expression

I am new to regular expressions. The one I have written is probably very simple, but I do not know where I am going wrong.
@"^([a-zA-Z._]+)#([\d]+)"
This RE is for the following string:
somename#somenumber
Now I am trying to retrieve somename and somenumber. This is what I did:
ac.name = m.Groups[0].Value;
ac.number = m.Groups[1].Value;
Here ac.name reads the complete string, and ac.number reads somenumber. Where am I wrong in ac.name?
I guess the regex is correct. The problem is that you read ac.name from group 0, which is the whole match, instead of group 1. Try this:
ac.name = m.Groups[1].Value;
ac.number = m.Groups[2].Value;
This regex is correct. I think your mistake is somewhere else. You seem to be using C#, so you should look at how regexes are used in that language.
Looking at the code sample on MSDN, capture groups are numbered from 1 when accessing Groups, because index 0 holds the entire match (as Kent also suggested). So, use this:
String name = m.Groups[1].Value;
String number = m.Groups[2].Value;
Use this regex: (\w+)#(\d+([.,]\d+)?)
Groups[1] will contain the name.
Groups[2] will contain the number.
I think you should move the + into the capture group:
@"^([a-zA-Z._]+)#([\d]+)"
If this is C#, try without the ^
([a-zA-Z\._]+)#([\d]+)
I just tried it out and it groups properly
Update: escaped the .
If you want only one match (and hence the ^ in original expression), use .Match instead of .Matches method. See MSDN documentation on Regular Expression Classes.
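The group numbering discussed in these answers can be checked with a short sketch. The input string is invented for the example, and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupIndexDemo
{
    // Hypothetical helper: returns the Match for the pattern from the question.
    public static Match Parse(string input)
    {
        return Regex.Match(input, @"^([a-zA-Z._]+)#([\d]+)");
    }

    static void Main()
    {
        Match m = Parse("some.name#12345");

        Console.WriteLine(m.Groups[0].Value); // some.name#12345 (the entire match)
        Console.WriteLine(m.Groups[1].Value); // some.name       (first parentheses)
        Console.WriteLine(m.Groups[2].Value); // 12345           (second parentheses)
    }
}
```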

C# string masking/formatting/filtering with or without regex

Hopefully this isn't too complicated, I just can't seem to find the answer I need.
I have a string with variables in, such as: this is a %variable% string
The format of the variables within the string is arbitrary, although in this example we're using the filter %{0}%
I am wanting to match variable names to properties and ideally I don't want to loop through GetProperties, formatting and testing each name. What I'd like to do is obtain "variable" as a string and test that.
I already use RegEx to get a list of the variables in a string, using the given filter:
string regExSyntax = string.Format(syntax, @"(?<word>\w+)");
but this returns them WITH the '%' (e.g. '%variable%') and as I said, that filter is arbitrary so I can't just do a string.Replace.
This feels like it should be straight-forward....
Thanks!
"(?<word>\w+)"
This just captures any run of word characters (letters, digits, and underscores) and puts it into a named capturing group called "word".
You might be interested in learning about lookbehind and lookahead. For example:
"(?<=%)(?<word>\w+)(?=%)"
You can make it a bit more generic by putting your filter in a separate variable:
string Boundie = "%";
string Expression = @"(?<=" + Boundie + @")(?<word>\w+)(?=" + Boundie + @")";
I hope this is anywhere near what you are looking for.
Given that your regex syntax is: string regExSyntax = string.Format(syntax, @"(?<word>\w+)");, I assume you're then going to create a Regex and use it to match against some string:
Regex reExtractVars = new Regex(regExSyntax);
Match m = reExtractVars.Match(inputString);
while (m.Success)
{
    // get the matched variable
    string wholeVar = m.Value; // returns "%variable%"
    // get just the "word"
    string wordOnly = m.Groups["word"].Value; // returns "variable"
    m = m.NextMatch();
}
Or have I completely misunderstood the problem?
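Since the filter is arbitrary, it may contain characters that are special in a regex, such as [ or $. One way to handle that, sketched below, is to escape the literal parts of the filter with Regex.Escape before inserting the named group. The helper name is hypothetical, and the sketch assumes the filter contains exactly one {0} placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class VariableExtractor
{
    // Hypothetical helper: returns the variable names found in input, without delimiters.
    // Assumes filter contains exactly one "{0}" placeholder, e.g. "%{0}%" or "[{0}]".
    public static List<string> ExtractNames(string input, string filter)
    {
        string[] parts = filter.Split(new[] { "{0}" }, StringSplitOptions.None);

        // Escape the literal delimiters so characters such as '[' or '$' are matched as text.
        string pattern = Regex.Escape(parts[0]) + @"(?<word>\w+)" + Regex.Escape(parts[1]);

        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["word"].Value)
                    .ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", ExtractNames("this is a %variable% string", "%{0}%"))); // variable
        Console.WriteLine(string.Join(",", ExtractNames("[first] and [second]", "[{0}]")));        // first,second
    }
}
```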
Acron,
If you're going to roll-your own script parser... apart from being "a bit mad", unless that's the point of the exercise (is it?), then I strongly suggest that you KISS it... Keep It Simple Stoopid.
So what denotes a VARIABLE in your scripting syntax? Is it the percent signs? And they're fixed, yes? So %name% is a variable, but #comment# is NOT a variable... correct? The phrase "that filter is arbitrary" has me worried. What's a "filter"?
If this isn't homework then just use an existing scripting engine, with existing, well defined, well known syntax. Something like Jint, for example.
Cheers. Keith.
