How to compare two regex matches? - c#

I'm writing software for TV series and I want to verify, inside a foreach loop, that two regex matches have the same meaning (e.g. S03E01 == 03x01).
Here's the code I have:
Regex regex = new Regex(@"S?\d{1,2}[x|e]?\d{1,2}", RegexOptions.IgnoreCase);
foreach (string file in path) {
if (regex.IsMatch(file)) {
//something
}
}
How do I do that?

Transform the filename to one format and keep them in a collection to match them:
Dictionary<string, string> dict = new Dictionary<string,string>();
Regex regex = new Regex(@"S?(\d{1,2})[xe]?(\d{1,2})", RegexOptions.IgnoreCase);
foreach (string file in path)
{
var match = regex.Match(file);
if (match.Success)
{
string key = "S" + match.Groups[1].Value.PadLeft(2, '0') + "E" + match.Groups[2].Value.PadLeft(2, '0');
if (dict.ContainsKey(key))
{
// .. already in there
}
else
{
dict[key] = file;
}
}
}
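If the goal is only to compare two names directly, the same normalization idea can be sketched as a small helper that parses both matches into numbers and compares them. This is a sketch under assumptions, not the original poster's code: the EpisodeKey class and its method names are made up for illustration, and the separator is made mandatory here so that a bare run of digits such as a year is not mistaken for an episode token.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class EpisodeKey
{
    // Optional "S", season digits, a mandatory "x" or "e", episode digits.
    private static readonly Regex Pattern =
        new Regex(@"S?(\d{1,2})[xe](\d{1,2})", RegexOptions.IgnoreCase);

    // Returns true and the parsed numbers when the name contains an episode token.
    public static bool TryParse(string name, out int season, out int episode)
    {
        season = 0;
        episode = 0;
        Match m = Pattern.Match(name);
        if (!m.Success)
        {
            return false;
        }
        // Parsing to int makes "03" and "3" compare equal.
        season = int.Parse(m.Groups[1].Value);
        episode = int.Parse(m.Groups[2].Value);
        return true;
    }

    // Two names refer to the same episode when both parse and the numbers agree.
    public static bool SameEpisode(string a, string b)
    {
        int sa, ea, sb, eb;
        return TryParse(a, out sa, out ea)
            && TryParse(b, out sb, out eb)
            && sa == sb && ea == eb;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(EpisodeKey.SameEpisode("Show.S03E01.mkv", "Show.03x01.avi")); // True
        Console.WriteLine(EpisodeKey.SameEpisode("Show.S03E01.mkv", "Show.3x02.avi"));  // False
    }
}
```

Comparing integers rather than padded strings avoids having to decide on a canonical text format at all.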

Related

C# Replace regex matched pattern using dictionary

I am trying to replace a pattern in my string where only the words between the tags should be replaced. The word that needs to be replaced resides in a dictionary as key and value pair.
Currently this is what I am trying:
string input = "<a>hello</a> <b>hello world</b> <c>I like apple</c>";
string pattern = (@"(?<=>)(.)?[^<>]*(?=</)");
Regex match = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = match.Matches(input);
var dictionary1 = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
dictionary1.Add("hello", "Hi");
dictionary1.Add("world", "people");
dictionary1.Add("apple", "fruit");
string output = "";
output = match.Replace(input, replace => { return dictionary1.ContainsKey(replace.Value) ? dictionary1[replace.Value] : replace.Value; });
Console.WriteLine(output);
Console.ReadLine();
Using this, it does replace but only the first 'hello' and not the second one. I want to replace every occurrence of 'hello' between the tags.
Any help will be much appreciated.
The problem is that the matches are:
hello
hello world
I like apple
so e.g. hello world is not in your dictionary.
Based on your code, this could be a solution:
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var dictionary1 = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
dictionary1.Add("hello", "Hi");
dictionary1.Add("world", "people");
dictionary1.Add("apple", "fruit");
string input = "<a>hello</a> <b>hello world</b> <c>I like apple</c>";
string pattern = ("(?<=>)(.)?[^<>]list|" + GetKeyList(dictionary1) + "(?=</)");
Regex match = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = match.Matches(input);
string output = "";
output = match.Replace(input, replace => {
Console.WriteLine(" - " + replace.Value);
return dictionary1.ContainsKey(replace.Value) ? dictionary1[replace.Value] : replace.Value;
});
Console.WriteLine(output);
}
private static string GetKeyList(Dictionary<string, string> list)
{
return string.Join("|", new List<string>(list.Keys).ToArray());
}
}
Fiddle: https://dotnetfiddle.net/zNkEDv
If someone wants to dig into this and tell me why I need a "list|" in the pattern (without it, the first item is ignored), I'd appreciate it.
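A likely explanation for the "list|" oddity: the | operator has the lowest precedence in a regex, so the lookbehind prefix binds only to the first alternative and the lookahead only to the last one. Without the dummy "list|", the first alternative becomes (?<=>)(.)?[^<>]hello, which needs at least one extra character in front of hello and so never matches it directly after a tag. Wrapping the keys in a non-capturing group avoids the problem. The sketch below assumes the keys are plain words; the TagWordReplacer name is made up for illustration, and the variable-length lookbehind relies on .NET regex behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class TagWordReplacer
{
    public static string Replace(string input, IDictionary<string, string> map)
    {
        // Escape each key and group the alternation so the lookarounds
        // apply to every key, not just the first and last one.
        string keys = string.Join("|", map.Keys.Select(k => Regex.Escape(k)));
        string pattern = @"(?<=>[^<>]*)\b(?:" + keys + @")\b(?=[^<>]*</)";

        return Regex.Replace(
            input,
            pattern,
            m =>
            {
                string replacement;
                // Fall back to the matched text if the dictionary lookup fails.
                return map.TryGetValue(m.Value, out replacement) ? replacement : m.Value;
            },
            RegexOptions.IgnoreCase);
    }
}

public static class Program
{
    public static void Main()
    {
        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "hello", "Hi" },
            { "world", "people" },
            { "apple", "fruit" }
        };
        string input = "<a>hello</a> <b>hello world</b> <c>I like apple</c>";
        // Prints: <a>Hi</a> <b>Hi people</b> <c>I like fruit</c>
        Console.WriteLine(TagWordReplacer.Replace(input, map));
    }
}
```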
This is another way of doing it - I parse the string into XML and then select elements containing the keys in your dictionary and then replace each element's value.
However, you have to have a valid XML document - your example lacks a root node.
var xDocument = XDocument.Parse("<root><a>hello</a> <b>hello world</b> <c>I like apple</c></root>");
var dictionary1 = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) { { "hello", "Hi" }, { "world", "people" }, { "apple", "fruit" } };
string pattern = @"\w+";
Regex match = new Regex(pattern, RegexOptions.IgnoreCase);
var xElements = xDocument.Root.Descendants()
.Where(x => dictionary1.Keys.Any(s => x.Value.Contains(s)));
foreach (var xElement in xElements)
{
var updated = match.Replace(xElement.Value,
replace => {
return dictionary1.ContainsKey(replace.Value)
? dictionary1[replace.Value] : replace.Value; });
xElement.Value = updated;
}
string output = xDocument.ToString(SaveOptions.DisableFormatting);
This pattern of "\w+" matches words, not spaces.
This LINQ selects descendants of the root node where the element value contains any of the keys of your dictionary:
var xElements = xDocument.Root.Descendants().Where(x => dictionary1.Keys.Any(s => x.Value.Contains(s)));
I then iterate through the XElement enumerable collection returned and apply your replacement MatchEvaluator to just the string value, which is a lot easier!
The final output is <root><a>Hi</a><b>Hi people</b><c>I like fruit</c></root>. You could then remove the opening and closing <root> and </root> tags, but I don't know what your complete XML looks like.
This will do what you want (from what you have provided so far):
private static Dictionary<string, string> dict;
static void Main(string[] args)
{
dict =
new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
{ "hello", "Hi" },
{ "world", "people" },
{ "apple", "fruit" }
};
var input = "<a>hello</a> <b>hello world</b> apple <c>I like apple</c> hello";
var pattern = @"<.>([^<>]+)<\/.>";
var output = Regex.Replace(input, pattern, Replacer);
Console.WriteLine(output);
Console.ReadLine();
}
static string Replacer(Match match)
{
var value = match.Value;
foreach (var kvp in dict)
{
if (value.Contains(kvp.Key)) value = value.Replace(kvp.Key, kvp.Value);
}
return value;
}

How to split and take multiple strings from a url in c#?

I have a string looking something like this:
/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag
I want a list of string with "Orgrimmar", "Stormwind" and "Undercity". How is this possible so that it splits AFTER Query and between & and + in order so that we avoid getting a string like this "Orgrimmar+l%C3%A4n=01&Stormwind".
Let us assume that we don't know the name of the strings.. :)
Update: I still can't get it to work. I have added a list of counties that I can use for validation, but I still find this case hard. countyList is used to check that the counties/cities in the URL match a pre-existing collection.
var countyQuery = Request.Url.Query;
var counties = this._locationService.GetAllCounties();
List<string> countyList = new List<string>();
List<string> selectedCountiesList = new List<string>();
foreach (var i in counties)
{
countyList.Add(i.Name);
}
Regex r = new Regex(@"&(.+?)\+");
MatchCollection mc = r.Matches(countyQuery);
foreach (Match curMatch in mc)
{
if (countyList.Contains(curMatch.Groups[1].Value))
{
selectedCountiesList.Add(curMatch.Groups[1].Value);
}
}
return selectedCountiesList;
I changed the URL to /?Gender=&Age=&Query=&county=13&county=08&county=01&Page=1
where 13, 08, 01 and so on are the IDs of the counties.
The final solution was:
var selectedCountyQuery = Request.QueryString
//CountySearch = "county"
[QueryStringParameters.CountySearch];
List<string> countyList = new List<string>();
List<string> selectedCounties = new List<string>();
if (!string.IsNullOrEmpty(selectedCountyQuery))
{
var selectedCountiesArray = selectedCountyQuery.Split(new[]{ ',' });
foreach (var selectedCounty in selectedCountiesArray)
{
selectedCounties.Add(selectedCounty);
}
}
return selectedCounties;
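Outside of an ASP.NET request, the same repeated-parameter lookup can be sketched with HttpUtility.ParseQueryString, which returns a NameValueCollection whose GetValues method yields every value for a key. This is a sketch under assumptions: the CountyQuery class name is made up for illustration, and on .NET Framework the project needs a reference to System.Web.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// Hypothetical helper, named for illustration only.
public static class CountyQuery
{
    public static string[] SelectedCountyIds(string query)
    {
        // ParseQueryString accepts the query with or without the leading '?'.
        NameValueCollection parameters = HttpUtility.ParseQueryString(query);

        // GetValues returns null when the key is absent, so fall back to an empty array.
        return parameters.GetValues("county") ?? new string[0];
    }
}

public static class Program
{
    public static void Main()
    {
        string query = "?Gender=&Age=&Query=&county=13&county=08&county=01&Page=1";
        // Prints: 13, 08, 01
        Console.WriteLine(string.Join(", ", CountyQuery.SelectedCountyIds(query)));
    }
}
```

Using GetValues avoids splitting the comma-joined string by hand, which matters if a value could itself contain a comma.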
You can get every parameter name and value with the Split() method.
Example:
var URL = "controller/method?var1=&var2=&var3=dsgdf";
var ParameterPart = URL.Split('?')[1];
var ParametersArray = ParameterPart.Split('&');
// ParametersArray: ["var1=", "var2=", "var3=dsgdf"]
foreach (var Parameter in ParametersArray)
{
var ParameterName = Parameter.Split('=')[0];
var ParameterValue = Parameter.Split('=')[1];
}
You can use a regex and extract the matches:
Regex r = new Regex(@"&(.+?)\+");
MatchCollection mc = r.Matches(s);
Then you can iterate over your desired strings (in this case WoW cities) like this:
foreach(Match curMatch in mc)
{
Console.WriteLine(curMatch.Groups[1].Value);
}
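One caveat with the lazy .+? in that pattern: on the sample string from the question, the first match starts at the first & and runs up to the first +, so group 1 comes out as "Age=&Query=&Orgrimmar" rather than "Orgrimmar". A tighter character class that excludes &, = and + avoids this. The sketch below is one way to write it; the CountyNames class name is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class CountyNames
{
    public static List<string> Extract(string query)
    {
        var names = new List<string>();
        // The name may not contain '&', '=' or '+', so a match cannot
        // span several parameters; it must be followed directly by '+'.
        foreach (Match m in Regex.Matches(query, @"&([^&=+]+)\+"))
        {
            names.Add(m.Groups[1].Value);
        }
        return names;
    }
}

public static class Program
{
    public static void Main()
    {
        string s = "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag";
        // Prints: Orgrimmar, Stormwind, Undercity
        Console.WriteLine(string.Join(", ", CountyNames.Extract(s)));
    }
}
```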
string[] numbers = { "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag" };
string sPattern = @"(?<=&Orgrimmar)+";
foreach (string s in numbers) {
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern)) { System.Console.WriteLine(" - valid"); }
else { System.Console.WriteLine(" - invalid"); }
}
Output: valid
string[] numbers ={ "/Gender=&Age=&Query=Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag"};
Output: invalid
Further to check two parameters:
string[] numbers = { "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag" };
string sPattern = @"(?<=&Orgrimmar)+";
string sPattern2 = @"(?<=&Stormwind)+";
foreach (string s in numbers){
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern) && System.Text.RegularExpressions.Regex.IsMatch(s, sPattern2))
...

How to check if regex groups are equal?

I have a RegEx that checks my string. In my string I have two groups ?<key> and ?<value>. So here is my sample string:
string input = "key=value&key=value1&key=value2";
I use a MatchCollection, and this is the code I use to print my groups to the console:
string input = Console.ReadLine();
string pattern = @"(?<key>\w+)=(?<value>\w+)";
Regex rgx = new Regex(pattern);
MatchCollection matches = rgx.Matches(input);
foreach (Match item in matches)
{
Console.Write("{0}=[{1}]",item.Groups["key"], item.Groups["value"]);
}
I get an output like this: key=[value]key=[value1]key=[value2]
But I want my output to be like this: key=[value, value1, value2]
My point is how to check the group "key" if it's equal to the previous one so I can make the output like that I want.
You can use a Dictionary<string, List<string>>:
string pattern = @"(?<key>\w+)=(?<value>\w+)";
Regex rgx = new Regex(pattern);
MatchCollection matches = rgx.Matches(input);
Dictionary<string, List<string>> results = new Dictionary<string, List<string>>();
foreach (Match item in matches)
{
if (!results.ContainsKey(item.Groups["key"].Value)) {
results.Add(item.Groups["key"].Value, new List<string>());
}
results[item.Groups["key"].Value].Add(item.Groups["value"].Value);
}
foreach (var r in results) {
Console.Write("{0}=[{1}]", r.Key, string.Join(", ", r.Value));
}
Note the use of string.Join to output the data in the format required.
Use a Dictionary<string,List<string>>
Something like:
var dict = new Dictionary<string,List<string>>();
foreach (Match item in matches)
{
// .Value is needed: Groups["key"] is a Group, not a string
var key = item.Groups["key"].Value;
var val = item.Groups["value"].Value;
if (!dict.ContainsKey(key))
{
dict[key] = new List<string>();
}
dict[key].Add(val);
}
You can use Linq GroupBy method:
string input = "key=value&key=value1&key=value2&key1=value3&key1=value4";
string pattern = @"(?<key>\w+)=(?<value>\w+)";
Regex rgx = new Regex(pattern);
MatchCollection matches = rgx.Matches(input);
foreach (var result in matches
.Cast<Match>()
.GroupBy(k => k.Groups["key"].Value, v => v.Groups["value"].Value))
{
Console.WriteLine("{0}=[{1}]", result.Key, String.Join(",", result));
}
Output for the snippet (here I've added another key, key1, with two values to your original input string):
key=[value,value1,value2]
key1=[value3,value4]

Question on searching a string using regex and storing in a List

Below is code used to search a string where Identity=" " exists and stores that line in a List. I need to add to this search so that it not only picks up Identity=" " but ALSO where FrameworkSiteID=" ". How can I modify the below code to do this?
Many thanks.
List<KeyValuePair<string, string>> IdentityLines = new List<KeyValuePair<string, string>>();
foreach(FileInfo file in Files)
{
string line = "";
using(StreamReader sr = new StreamReader(file.FullName))
{
while(!String.IsNullOrEmpty(line = sr.ReadLine()))
{
if (line.ToUpper().Contains("IDENTITY="))
{
string login = reg.Match(line).Groups[0].Value;
IdentityLines.Add(new KeyValuePair<string, string>(file.Name, login));
}
else
{
IdentityLines.Add(new KeyValuePair<string, string>(file.Name,"NO LOGIN"));
}
}
//More additional code, not included..
Fixed:
static void TestRegularExpression()
{
String line = "Some text here, blah blah Identity=\"EDN\\nuckol\" and FRAMEworkSiteID=\"DesotoGeneral\" and other stuff.";
Match m1 = Regex.Match(line, "((identity)(=)('|\")([a-zA-Z]*)([\\\\]*)([a-zA-Z]*)('|\"))", RegexOptions.IgnoreCase);
Match m2 = Regex.Match(line, "((frameworkSiteID)(=)('|\")([a-zA-Z]*)('|\"))", RegexOptions.IgnoreCase);
if (m1.Success && m2.Success)
{
//...
Console.WriteLine("Success!");
Console.ReadLine();
}
}
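If both attributes should be picked up in one pass over each line, a single pattern with an alternation on the attribute name is one option. This is a sketch under assumptions, not the poster's code: the AttributeScanner class and the group names are made up for illustration, and values are assumed to be wrapped in matching single or double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class AttributeScanner
{
    // name: either attribute; quote: the opening quote character;
    // value: everything up to the same quote character again.
    private static readonly Regex Pattern = new Regex(
        @"\b(?<name>Identity|FrameworkSiteID)=(?<quote>['""])(?<value>.*?)\k<quote>",
        RegexOptions.IgnoreCase);

    public static Dictionary<string, string> Scan(string line)
    {
        var found = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Pattern.Matches(line))
        {
            found[m.Groups["name"].Value] = m.Groups["value"].Value;
        }
        return found;
    }
}

public static class Program
{
    public static void Main()
    {
        string line = "Some text here, blah blah Identity=\"EDN\\nuckol\" and FRAMEworkSiteID=\"DesotoGeneral\" and other stuff.";
        foreach (var pair in AttributeScanner.Scan(line))
        {
            Console.WriteLine("{0} -> {1}", pair.Key, pair.Value);
        }
    }
}
```

The case-insensitive dictionary lets callers look up "Identity" or "FrameworkSiteID" regardless of how the attribute was cased in the file.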
Here's a regular expression tester I like to use.
http://gskinner.com/RegExr/
-Matt

How do I get the name of captured groups in a C# Regex?

Is there a way to get the name of a captured group in C#?
string line = "No.123456789 04/09/2009 999";
Regex regex = new Regex(@"(?<number>[\d]{9}) (?<date>[\d]{2}/[\d]{2}/[\d]{4}) (?<code>.*)");
GroupCollection groups = regex.Match(line).Groups;
foreach (Group group in groups)
{
Console.WriteLine("Group: {0}, Value: {1}", ???, group.Value);
}
I want to get this result:
Group: [I don't know what should go here], Value: 123456789 04/09/2009 999
Group: number, Value: 123456789
Group: date, Value: 04/09/2009
Group: code, Value: 999
Use GetGroupNames to get the list of groups in an expression and then iterate over those, using the names as keys into the groups collection.
For example,
GroupCollection groups = regex.Match(line).Groups;
foreach (string groupName in regex.GetGroupNames())
{
Console.WriteLine(
"Group: {0}, Value: {1}",
groupName,
groups[groupName].Value);
}
The cleanest way to do this is by using this extension method:
public static class MyExtensionMethods
{
public static Dictionary<string, string> MatchNamedCaptures(this Regex regex, string input)
{
var namedCaptureDictionary = new Dictionary<string, string>();
GroupCollection groups = regex.Match(input).Groups;
string [] groupNames = regex.GetGroupNames();
foreach (string groupName in groupNames)
if (groups[groupName].Captures.Count > 0)
namedCaptureDictionary.Add(groupName,groups[groupName].Value);
return namedCaptureDictionary;
}
}
Once this extension method is in place, you can get names and values like this:
var regex = new Regex(@"(?<year>[\d]+)\|(?<month>[\d]+)\|(?<day>[\d]+)");
var namedCaptures = regex.MatchNamedCaptures(wikiDate);
string s = "";
foreach (var item in namedCaptures)
{
s += item.Key + ": " + item.Value + "\r\n";
}
s += namedCaptures["year"];
s += namedCaptures["month"];
s += namedCaptures["day"];
Since .NET Framework 4.7, the Group.Name property is available.
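A minimal sketch of that approach, using the line and pattern from the question; the GroupLister class name is made up for illustration. The whole-match group is reported under the name "0".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class GroupLister
{
    public static List<string> Describe(Regex regex, string input)
    {
        var lines = new List<string>();
        // Group.Name requires .NET Framework 4.7 or later (or .NET Core).
        foreach (Group group in regex.Match(input).Groups)
        {
            lines.Add(string.Format("Group: {0}, Value: {1}", group.Name, group.Value));
        }
        return lines;
    }
}

public static class Program
{
    public static void Main()
    {
        string line = "No.123456789 04/09/2009 999";
        var regex = new Regex(@"(?<number>\d{9}) (?<date>\d{2}/\d{2}/\d{4}) (?<code>.*)");
        foreach (string described in GroupLister.Describe(regex, line))
        {
            Console.WriteLine(described);
        }
    }
}
```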
You should use GetGroupNames(); and the code will look something like this:
string line = "No.123456789 04/09/2009 999";
Regex regex =
new Regex(@"(?<number>[\d]{9}) (?<date>[\d]{2}/[\d]{2}/[\d]{4}) (?<code>.*)");
GroupCollection groups = regex.Match(line).Groups;
var grpNames = regex.GetGroupNames();
foreach (var grpName in grpNames)
{
Console.WriteLine("Group: {0}, Value: {1}", grpName, groups[grpName].Value);
}
To update the existing extension method answer by @whitneyland with one that can handle multiple matches:
public static List<Dictionary<string, string>> MatchNamedCaptures(this Regex regex, string input)
{
var namedCaptureList = new List<Dictionary<string, string>>();
var match = regex.Match(input);
do
{
Dictionary<string, string> namedCaptureDictionary = new Dictionary<string, string>();
GroupCollection groups = match.Groups;
string[] groupNames = regex.GetGroupNames();
foreach (string groupName in groupNames)
{
if (groups[groupName].Captures.Count > 0)
namedCaptureDictionary.Add(groupName, groups[groupName].Value);
}
namedCaptureList.Add(namedCaptureDictionary);
match = match.NextMatch();
}
while (match!=null && match.Success);
return namedCaptureList;
}
Usage:
Regex pickoutInfo = new Regex(@"(?<key>[^=;,]+)=(?<val>[^;,]+(,\d+)?)", RegexOptions.ExplicitCapture);
var matches = pickoutInfo.MatchNamedCaptures(_context.Database.GetConnectionString());
string server = matches.Single( a => a["key"]=="Server")["val"];
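The Regex class is the key to this! Note that Group.Index is the position of the capture in the input string, not the group number, so iterate over the group numbers instead:
foreach (int groupNumber in regex.GetGroupNumbers())
{
Console.WriteLine("Group: {0}, Value: {1}", regex.GroupNameFromNumber(groupNumber), match.Groups[groupNumber].Value);
}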
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.groupnamefromnumber.aspx
