Using RegEx in Dictionary - c#

I'm new to C#, and it looks like I need to use Regex with Dictionary<string, Action>.
The example below works for me; I wrote it to test my understanding of Regex in C#:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string sPattern = "word1|word2";
string input = "word2";
Match result = Regex.Match(input, sPattern);
Console.WriteLine(result);
}
}
I tried to include it in Dictionary as below, but failed:
var functions = new Dictionary<Match, Action>();
functions.Add(Regex.Match(string, sPattern), CountParameters);
Action action;
if (functions.TryGetValue("word1|word2", out action)) {action.Invoke(); }
It gave me an "invalid expression" error for string at Regex.Match(string, sPattern), and a "cannot convert string to Match" error at .TryGetValue("word1|word2").
UPDATE
I restructured my code as shown below, so I have no compile errors, but nothing is printed as a result:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string sPattern1 = "word1|word2";
string sPattern2 = "word3|word4";
string input = "word2";
var functions = new Dictionary<string, Action>();
functions.Add("word1", CountParameters);
functions.Add("word3", SomeOtherMethodName);
Action action;
if (functions.TryGetValue((Regex.Match(input, sPattern1)).ToString(), out action))
{
action.Invoke();
}
else
{
// No function with that name
}
}
public static void CountParameters()
{
Console.WriteLine("Fn 1");
}
public static void SomeOtherMethodName()
{
Console.WriteLine("Fn 2");
}
}
The above works if string input = "word1"; but not if string input = "word2";, even though the regex should treat word1 and word2 as the same, given the pattern "word1|word2".
UPDATE 2
In case it was not clear enough, the output of the above should be:
Executing CountParameters when the input is word1 or word2, as the regex should treat them as the same because of the | used in the pattern above.
Executing SomeOtherMethodName when the input is word3 or word4, as the regex should treat them as the same because of the | used in the pattern above.
And so on, in case I add more regex patterns using the OR operator, |.

I think you want something like this:
var input = "word2";
var functions = new Dictionary<Regex, Action>
{
{new Regex("word1|word2"), CountParameters}
};
functions.FirstOrDefault(f => f.Key.IsMatch(input)).Value?.Invoke();
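A fuller sketch of the same idea, covering both patterns from the question. It assumes an ordered array of pairs rather than a dictionary, since the lookup scans every entry anyway, and it anchors each pattern with ^(?:...)$ so that an input such as word10 does not match. The Routes and Dispatch names are illustrative and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Program
{
    // Ordered (pattern, handler) pairs; the first pattern that matches wins.
    // Routes is a hypothetical name used only for this sketch.
    private static readonly (Regex Pattern, Action Handler)[] Routes =
    {
        (new Regex("^(?:word1|word2)$"), CountParameters),
        (new Regex("^(?:word3|word4)$"), SomeOtherMethodName),
    };

    // Runs the handler of the first matching pattern.
    // Returns false when no pattern matches the input.
    public static bool Dispatch(string input)
    {
        foreach (var (pattern, handler) in Routes)
        {
            if (pattern.IsMatch(input))
            {
                handler();
                return true;
            }
        }
        return false;
    }

    public static void CountParameters() => Console.WriteLine("Fn 1");

    public static void SomeOtherMethodName() => Console.WriteLine("Fn 2");

    public static void Main()
    {
        Dispatch("word2");                    // prints "Fn 1"
        Dispatch("word4");                    // prints "Fn 2"
        Console.WriteLine(Dispatch("other")); // prints "False"
    }
}
```

Keeping the patterns as Regex objects avoids recompiling them on every lookup, and the explicit return value replaces the empty else branch from the question.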

Related

Replacing a string with an equal amount of underscores

I'm trying to replace all characters of a string with underscores. From my reading, a string is immutable, which means it cannot be modified once it has been created.
I've decided to use StringBuilder to carry out the modification, though I need the underscores for display purposes only (a hangman game) and do not want to alter the original value.
I've read through the Microsoft docs and feel like I'm doing the right thing, but I cannot understand why it won't work. Code below.
using System;
using System.Text;
namespace randomtesting
{
internal class Program
{
static void Main(string[] args)
{
string str = "hello";
StringBuilder sb = new StringBuilder(str);
sb.Replace(str, "_", 0, str.Length);
Console.WriteLine(str);
}
}
}
Edit -
What I ended up doing to get the behaviour I wanted is below. I'm unsure whether it is ideal, so please provide feedback if there's a better way; it doesn't feel like the most efficient approach, but it works.
using System;
using System.Text;
namespace randomtesting
{
internal class Program
{
static void Main(string[] args)
{
string str = "hello";
string strDisplayedAsUnderscores = new string('_', str.Length);
Console.WriteLine(strDisplayedAsUnderscores);
char guess = Convert.ToChar(Console.ReadLine().ToLower()); //reads the user's guess
int guessIndex = str.IndexOf(guess); //gets the index of the character guessed in relation to the original word
StringBuilder word = new StringBuilder(strDisplayedAsUnderscores); //converts the underscores into a StringBuilder string
if (str.Contains(guess))
{
Console.WriteLine(word.Replace('_', guess, guessIndex, 1));
//if guess is contained in the original word
//replace the indexed underscore with the
//guessed character
}
}
}
}
You want this System.String constructor:
string result = new string('_', str.Length);
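For the hangman use case in the question's edit, IndexOf finds only the first occurrence of the guessed letter, so a word like "hello" would reveal just one l. A minimal sketch of one way to reveal every occurrence, assuming the display string always has the same length as the secret word; Reveal is a hypothetical helper name.

```csharp
using System;
using System.Text;

public static class Hangman
{
    // Returns a copy of display in which every position where secret
    // contains guess is revealed. Assumes both strings have equal length.
    public static string Reveal(string secret, string display, char guess)
    {
        var sb = new StringBuilder(display);
        for (int i = 0; i < secret.Length; i++)
        {
            if (secret[i] == guess)
            {
                sb[i] = guess;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string secret = "hello";
        string display = new string('_', secret.Length);

        display = Reveal(secret, display, 'l');
        Console.WriteLine(display); // prints "__ll_"

        display = Reveal(secret, display, 'h');
        Console.WriteLine(display); // prints "h_ll_"
    }
}
```

The secret string is never modified; only the display copy changes between guesses.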

Remove characters from List<string> in between separators (from text file)

I need a fast way to replace text in a text file.
From this: somename#somedomain.com:hello_world
To this: somename:hello_world
It needs to be FAST and support text files with multiple lines.
I tried splitting the string into three parts, but it seems slow. See the example in the code below.
public static void Conversion()
{
List<string> list = File.ReadAllLines("ETU/Tut.txt").ToList();
Console.WriteLine("Please wait, converting in progress !");
foreach (string combination in list)
{
if (combination.Contains("#"))
{
write: try
{
using (StreamWriter sw = new
StreamWriter("ETU/UPCombination.txt", true))
{
sw.WriteLine(combination.Split('#', ':')[0] + ":"
+ combination.Split('#', ':')[2]);
}
}
catch
{
goto write;
}
}
else
{
Console.WriteLine("At least one line doesn't contain #");
}
}
}
So I need a fast way to convert every line in the text file from
somename#somedomain.com:hello_world
to: somename:hello_world
and then save the result to a different text file.
Remember: the domain part always changes!
Most likely not the fastest option, but it is pretty fast to match an expression similar to
#[^:]+
and replace each match with an empty string.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"#[^:]+";
string substitution = @"";
string input = @"somename#somedomain.com:hello_world1
somename#some_other_domain.com:hello_world2";
RegexOptions options = RegexOptions.Multiline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(input, substitution);
Console.WriteLine(result);
}
}
If you wish to simplify, modify, or explore the expression, regex101.com explains it in its top-right panel, and you can also watch there how it matches against sample inputs.
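Much of the cost in the question's code probably comes from reopening the output file for every line rather than from the string handling. A minimal sketch that streams the input and keeps a single writer open, using IndexOf instead of a regex; StripDomain and ConvertFile are hypothetical names, and overwriting the output file (instead of appending, as the question does) is an assumption.

```csharp
using System;
using System.IO;

public static class Converter
{
    // "name#domain:rest" -> "name:rest".
    // Returns null when the line has no '#' followed later by ':'.
    public static string StripDomain(string line)
    {
        int hash = line.IndexOf('#');
        if (hash < 0) return null;
        int colon = line.IndexOf(':', hash + 1);
        if (colon < 0) return null;
        return line.Substring(0, hash) + line.Substring(colon);
    }

    // Streams inputPath line by line and writes converted lines to outputPath.
    // The writer is opened once; lines that cannot be converted are skipped.
    public static void ConvertFile(string inputPath, string outputPath)
    {
        using (var writer = new StreamWriter(outputPath, append: false))
        {
            foreach (string line in File.ReadLines(inputPath))
            {
                string converted = StripDomain(line);
                if (converted != null)
                {
                    writer.WriteLine(converted);
                }
            }
        }
    }

    public static void Main()
    {
        // prints "somename:hello_world"
        Console.WriteLine(StripDomain("somename#somedomain.com:hello_world"));
    }
}
```

With the paths from the question, the call would be ConvertFile("ETU/Tut.txt", "ETU/UPCombination.txt").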

How to extract name and version from string

I have many filenames such as:
libgcc1-5.2.0-r0.70413e92.rbt.xar
python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar
u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar
I need to reliably extract the name, version and "rbt" or "norbt" from this. What is the best way? I am trying regex, something like:
(?<fileName>.*?)-(?<version>.+).(rbt|norbt).xar
The issue is that both the file name and the version can contain multiple hyphens and dots. I am not sure whether there is a clean answer, but I have two questions:
What is the best strategy to extract values such as these?
How would I be able to figure out which version is greater?
Expected output is:
libgcc1, 5.2.0-r0.70413e92, rbt
python3-sqlite3, 3.4.3-r1.0.f25d9e76, rbt
u-boot-signed-pad.bin, v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57, rbt
This will give you what you want without using Regex:
var fileNames = new List<string>(){
"libgcc1-5.2.0-r0.70413e92.rbt.xar",
"python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar",
"u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar"
};
foreach(var file in fileNames){
var spl = file.Split('-');
string name = string.Join("-",spl.Take(spl.Length-2));
string versionRbt = string.Join("-",spl.Skip(spl.Length-2));
string rbtNorbt = versionRbt.IndexOf("norbt") > 0 ? "norbt" : "rbt";
string version = versionRbt.Replace($".{rbtNorbt}.xar","");
Console.WriteLine($"name={name};version={version};rbt={rbtNorbt}");
}
Output:
name=libgcc1;version=5.2.0-r0.70413e92;rbt=rbt
name=python3-sqlite3;version=3.4.3-r1.0.f25d9e76;rbt=rbt
name=u-boot-signed-pad.bin;version=v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57;rbt=rbt
Edit:
Or using Regex:
var m = Regex.Match(file, @"^(?<fileName>.*)-(?<version>.+-.+)\.(rbt|norbt)\.xar$");
string name = m.Groups["fileName"].Value;
string version = m.Groups["version"].Value;
string rbtNorbt = m.Groups[1].Value;
The output will be the same. Both approaches assume that the version contains exactly one hyphen (-).
I tested the following code and it works with Regex. I used the RightToLeft option.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication107
{
class Program
{
static void Main(string[] args)
{
string[] inputs = {
"libgcc1-5.2.0-r0.70413e92.rbt.xar",
"python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar",
"u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar"
};
string pattern = @"(?'prefix'.+)-(?'middle'[^-][\w+\.]+-[\w+\.]+)\.(?'extension'[^\.]+)\.xar";
foreach (string input in inputs)
{
Match match = Regex.Match(input, pattern, RegexOptions.RightToLeft);
Console.WriteLine("prefix : '{0}', middle : '{1}', extension : '{2}'",
match.Groups["prefix"].Value,
match.Groups["middle"].Value,
match.Groups["extension"].Value
);
}
Console.ReadLine();
}
}
}
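Neither answer covers the second question, which version is greater. One possible sketch, assuming that only the first dotted run of digits matters (for example 5.2.0 or 2015.10) and that everything after it, such as the r0 revision and the hash, can be ignored. LeadingVersion and Compare are hypothetical helper names; System.Version does the numeric comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VersionCompare
{
    // Extracts the first dotted run of digits (two to four components)
    // and parses it as a System.Version. Falls back to 0.0 if none is found.
    public static Version LeadingVersion(string raw)
    {
        Match m = Regex.Match(raw, @"\d+(?:\.\d+){1,3}");
        return m.Success ? Version.Parse(m.Value) : new Version(0, 0);
    }

    // Negative if a is older than b, zero if equal, positive if newer.
    public static int Compare(string a, string b)
    {
        return LeadingVersion(a).CompareTo(LeadingVersion(b));
    }

    public static void Main()
    {
        // prints "2015.10"
        Console.WriteLine(LeadingVersion("v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57"));

        // prints "True" because 3.4.3 is lower than 5.2.0
        Console.WriteLine(Compare("3.4.3-r1.0.f25d9e76", "5.2.0-r0.70413e92") < 0);
    }
}
```

Comparing as numbers rather than as strings matters here: 5.10 sorts after 5.2 numerically but before it alphabetically.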

regex pattern for the following string Amby : Dexter,Dexter : Karla

I have an input list that takes input in the above format and joins the entries into a comma-separated string. I would like to get the strings before and after each colon (:).
I tried this regex pattern:
string[] reg = Regex.Split(x, @"^(?:[\w ]\:\s[\w]+)+$");
but it doesn't seem to work. Please help.
Below is my code. This is a C# console application
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
namespace test
{
class Program
{
static void Main(string[] args)
{
List<string> input = new List<string>();
Console.WriteLine("Please enter your input");
string readinput = Console.ReadLine();
input.Add(readinput);
while (readinput != "")
{
readinput = Console.ReadLine();
input.Add(readinput);
}
string x = string.Join(",", input.ToArray());
Console.WriteLine(x);
// using regex
string[] reg = Regex.Split(x, @"^(?:[\w ]\:\s[\w]+)+$");
Console.WriteLine(reg);
Console.ReadLine();
}
}
}
Sorry, I was not very clear. The input is:
Amby : Dexter,
Dexter : Karla,
Karla : Matt .....
The expected output is Amby, Dexter, Karla, Matt....
If I understood you correctly... User enters some strings, and then you join them with commas. After that you want to split that string by colons?
Why don't you use simpler solution like this:
string[] reg = x.Split(':').Select(s => s.Trim()).ToArray();
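Splitting on the colon alone leaves entries such as "Dexter,Dexter" joined, because the input was concatenated with commas. A minimal sketch that splits on both separators and drops repeated names to reach the expected output; ExtractNames is a hypothetical helper name, and the sketch relies on Distinct keeping the first occurrence of each name in order, which is how LINQ to Objects behaves in practice.

```csharp
using System;
using System.Linq;

public static class NameSplitter
{
    // Splits on ':' and ',', trims whitespace, and removes empty
    // entries and repeated names.
    public static string[] ExtractNames(string joined)
    {
        return joined
            .Split(new[] { ':', ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.Trim())
            .Where(s => s.Length > 0)
            .Distinct()
            .ToArray();
    }

    public static void Main()
    {
        // The trailing comma mimics the empty last entry added by the question's read loop.
        string x = "Amby : Dexter,Dexter : Karla,Karla : Matt,";
        Console.WriteLine(string.Join(", ", ExtractNames(x))); // prints "Amby, Dexter, Karla, Matt"
    }
}
```

Joining the array before printing also avoids the System.String[] output that Console.WriteLine(reg) produces in the question.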
Maybe this will get you started:
new Regex(@"(([a-zA-Z])+(?:[\s\:\,]+))").Matches("...");
or this regex:
@"\b([a-zA-Z])+\b"
Then iterate over the MatchCollection.

Ignore special characters in Examine

In Umbraco, I use Examine to search the website, but the content is in French. Everything works fine, except that searching for "Français" does not return the same results as "Francais". Is there a way to ignore those French characters? I tried to find a FrenchAnalyzer for Lucene/Examine but did not find anything. I use fuzzy search, so it returns results even if the words are not identical.
Here's the code of my search:
public static ISearchResults Search(string searchTerm)
{
var provider = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
var criteria = provider.CreateSearchCriteria(BooleanOperation.Or);
var crawl = criteria.GroupedOr(BoostedSearchableFields, searchTerm.Boost(15))
.Or().GroupedOr(BoostedSearchableFields, searchTerm.Fuzzy(Fuzziness))
.Or().GroupedOr(SearchableFields, searchTerm.Fuzzy(Fuzziness))
.Not().Field("umbracoNavHide", "1");
return provider.Search(crawl.Compile());
}
We ended up using a custom analyzer based on the SnowballAnalyzer:
public class CustomAnalyzer : SnowballAnalyzer
{
public CustomAnalyzer() : base("French") { }
public override TokenStream TokenStream(string fieldName, TextReader reader)
{
TokenStream result = base.TokenStream(fieldName, reader);
result = new ISOLatin1AccentFilter(result);
return result;
}
}
Try using Regex like this:
var strInput = "Français";
var strToReplace = string.Empty;
var sNewString = Regex.Replace(strInput, "[^A-Za-z0-9]", strToReplace);
I've used the pattern "[^A-Za-z0-9]" to replace every character that is not an ASCII letter or digit with an empty string. Note that this removes the ç rather than converting it, so "Français" becomes "Franais".
Hope it helps.
You can convert Unicode characters with diacritics to their unaccented English equivalents using the following method. That will enable you to find "Français" with the search term "Francais".
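using System.Globalization;
using System.Linq;
using System.Text;
// Extension methods must live in a static class; the class name here is arbitrary.
public static class StringExtensions
{
public static string RemoveDiacritics(this string text)
{
if (string.IsNullOrWhiteSpace(text))
return text;
// FormD splits each accented letter into a base letter plus combining marks.
text = text.Normalize(NormalizationForm.FormD);
// Drop the combining marks and keep the base letters.
var chars = text.Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark).ToArray();
return new string(chars).Normalize(NormalizationForm.FormC);
}
}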
Use it on any string like this:
var converted = unicodeString.RemoveDiacritics();
