How to extract name and version from string - c#

I have many filenames such as:
libgcc1-5.2.0-r0.70413e92.rbt.xar
python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar
u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar
I need to reliably extract the name, version and "rbt" or "norbt" from this. What is the best way? I am trying regex, something like:
(?<fileName>.*?)-(?<version>.+).(rbt|norbt).xar
The issue is that the file name and the version can both contain multiple hyphens. I am not sure there is an answer, but I have two questions:
What is the best strategy to extract values such as these?
How would I be able to figure out which version is greater?
Expected output is:
libgcc1, 5.2.0-r0.70413e92, rbt
python3-sqlite3, 3.4.3-r1.0.f25d9e76, rbt
u-boot-signed-pad.bin, v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57, rbt

This will give you what you want without using Regex:
var fileNames = new List<string>(){
"libgcc1-5.2.0-r0.70413e92.rbt.xar",
"python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar",
"u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar"
};
foreach(var file in fileNames){
var spl = file.Split('-');
string name = string.Join("-",spl.Take(spl.Length-2));
string versionRbt = string.Join("-",spl.Skip(spl.Length-2));
string rbtNorbt = versionRbt.IndexOf("norbt") > 0 ? "norbt" : "rbt";
string version = versionRbt.Replace($".{rbtNorbt}.xar","");
Console.WriteLine($"name={name};version={version};rbt={rbtNorbt}");
}
Output:
name=libgcc1;version=5.2.0-r0.70413e92;rbt=rbt
name=python3-sqlite3;version=3.4.3-r1.0.f25d9e76;rbt=rbt
name=u-boot-signed-pad.bin;version=v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57;rbt=rbt
Edit:
Or using Regex:
var m = Regex.Match(file, @"^(?<fileName>.*)-(?<version>.+-.+)\.(rbt|norbt)\.xar$");
string name = m.Groups["fileName"].Value;
string version = m.Groups["version"].Value;
string rbtNorbt = m.Groups[1].Value;
The output will be the same. Both approaches assume that "version" contains exactly one hyphen (-).
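As a minimal sketch, the regex above could be wrapped in a helper that also reports when a file name does not fit the pattern, rather than silently returning empty strings. The names PackageFileName, TryParse and the group name kind are hypothetical, and the sketch keeps the same assumption that the version contains exactly one hyphen.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PackageFileName
{
    // Same pattern as in the answer, with the rbt/norbt alternative given a name.
    private static readonly Regex Pattern = new Regex(
        @"^(?<fileName>.*)-(?<version>.+-.+)\.(?<kind>rbt|norbt)\.xar$");

    // Returns false when the input does not match; the out values are then empty strings.
    public static bool TryParse(string file, out string name, out string version, out string kind)
    {
        Match m = Pattern.Match(file);
        name = m.Groups["fileName"].Value;
        version = m.Groups["version"].Value;
        kind = m.Groups["kind"].Value;
        return m.Success;
    }
}
```

Checking the return value before using the out parameters avoids treating an unrelated file name as a package with an empty name and version.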

I tested the following code and it works with Regex. I used the RightToLeft option.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication107
{
class Program
{
static void Main(string[] args)
{
string[] inputs = {
"libgcc1-5.2.0-r0.70413e92.rbt.xar",
"python3-sqlite3-3.4.3-r1.0.f25d9e76.rbt.xar",
"u-boot-signed-pad.bin-v2015.10+gitAUTOINC+1b6aee73e6-r0.02df1c57.rbt.xar"
};
string pattern = @"(?'prefix'.+)-(?'middle'[^-][\w+\.]+-[\w+\.]+)\.(?'extension'[^\.]+)\.xar";
foreach (string input in inputs)
{
Match match = Regex.Match(input, pattern, RegexOptions.RightToLeft);
Console.WriteLine("prefix : '{0}', middle : '{1}', extension : '{2}'",
match.Groups["prefix"].Value,
match.Groups["middle"].Value,
match.Groups["extension"].Value
);
}
Console.ReadLine();
}
}
}

Related

Issue with Regex Replace? [duplicate]

I am using Regex to replace all the strings in a template. Everything works fine until there is a value I want to replace, which is $0.00. I can't seem to properly replace the $0 as replacement text. The output I am getting is "Project Cost: [[ProjectCost]].00". Any idea why?
Here is an example of the code with some simplified variables.
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using Newtonsoft.Json.Linq;
using System;
using System.Collections.Generic;
using System.Security;
using System.Text.RegularExpressions;
namespace Export.Services
{
public class CommonExportService
{
private Dictionary<string, string> _formTokens;
public CommonExportService() {
_formTokens = new Dictionary<string, string> { { "EstimatedOneTimeProjectCost", "0.00" } };
}
private string GetReplacementText(string replacementText)
{
replacementText = "Project Cost: [[EstimatedOneTimeProjectCost]]";
//replacement text = "Project Cost: [[ProjectCost]]"
foreach (var token in _formTokens)
{
var val = token.Value;
var key = token.Key;
//work around for now
//if (val.Equals("$0.00")) {
// val = "0.00";
//}
var reg = new Regex(Regex.Escape("[[" + key + "]]"));
if (reg.IsMatch(replacementText))
replacementText = reg.Replace(replacementText, SecurityElement.Escape(val ?? string.Empty));
else {
}
}
return replacementText;
//$0.00 does not replace, something is happening with the $0 before the decimal
//the output becomes Project Cost: [[EstimatedOneTimeProjectCost]].00
//The output is correct for these
//0.00 replaces correctly
//$.00 replaces correctly
//0 replaces correctly
//00 replaces correctly
//$ replaces correctly
}
}
}
Since your replacement string is built dynamically, you need to take care of the $ character in it. In a .NET replacement pattern, $0 is a substitution that refers to the whole match, so when $ is followed by 0 the matched text itself is inserted instead of a literal "$0".
You just need to escape each $ in the replacement string by doubling it:
return replacementText.Replace("$", "$$");
Then your replacement pattern will contain $$0, which is interpreted as the literal string $0.
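A short sketch of how the escaping might be applied to the token value before it is passed to Regex.Replace. The class ReplacementDemo and the method ReplaceToken are hypothetical names, and the SecurityElement.Escape call from the question is left out to keep the example small.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplacementDemo
{
    // Replaces [[key]] in the template with value, treating value as literal text.
    public static string ReplaceToken(string template, string key, string value)
    {
        var reg = new Regex(Regex.Escape("[[" + key + "]]"));
        // Doubling each $ stops "$0" from being read as "insert the whole match".
        return reg.Replace(template, value.Replace("$", "$$"));
    }
}
```

With this in place, ReplacementDemo.ReplaceToken("Project Cost: [[ProjectCost]]", "ProjectCost", "$0.00") yields "Project Cost: $0.00".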

regex pattern for the following string Amby : Dexter,Dexter : Karla

I have an input list that takes input in the above format and puts it into a comma-separated string. I would like to get the strings before and after each colon (:).
I tried this regex pattern
string[] reg = Regex.Split(x, @"^(?:[\w ]\:\s[\w]+)+$");
but it doesn't seem to work. Please help.
Below is my code. This is a C# console application
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
namespace test
{
class Program
{
static void Main(string[] args)
{
List<string> input = new List<string>();
Console.WriteLine("Please enter your input");
string readinput = Console.ReadLine();
input.Add(readinput);
while (readinput != "")
{
readinput = Console.ReadLine();
input.Add(readinput);
}
string x = string.Join(",", input.ToArray());
Console.WriteLine(x);
// using regex
string[] reg = Regex.Split(x, @"^(?:[\w ]\:\s[\w]+)+$");
Console.WriteLine(reg);
Console.ReadLine();
}
}
}
Sorry, I was not very clear. The input is:
Amby : Dexter,
Dexter : Karla,
Karla : Matt .....
The expected output is Amby, Dexter, Karla, Matt....
If I understood you correctly... User enters some strings, and then you join them with commas. After that you want to split that string by colons?
Why don't you use simpler solution like this:
string[] reg = x.Split(':').Select(s => s.Trim()).ToArray();
Maybe this will get you started:
new Regex(@"(([a-zA-Z])+(?:[\s\:\,]+))").Matches("...");
or this regex
@"\b([a-zA-Z])+\b"
Iterate over the MatchCollection.
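For comparison, a sketch without Regex that splits on both separators and keeps each name once. It assumes the expected output is every distinct name in first-seen order, which is only a reading of the example given in the question; NameSplitter and UniqueNames are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class NameSplitter
{
    // Splits on ':' and ',', trims each piece, and keeps the first occurrence of each name.
    public static List<string> UniqueNames(string joined)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (var piece in joined.Split(new[] { ':', ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var name = piece.Trim();
            // HashSet.Add returns false for a name that was already recorded.
            if (name.Length > 0 && seen.Add(name))
            {
                result.Add(name);
            }
        }
        return result;
    }
}
```

Since Console.WriteLine on a collection prints only its type name, the result would be printed with something like Console.WriteLine(string.Join(", ", NameSplitter.UniqueNames(x))).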

Using RegEx in Dictionary

I'm new to C#, and it looks like I need to use Regex with Dictionary<string, Action>.
The example below works for me, as a test of my understanding of Regex in C#:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string sPattern = "word1|word2";
string input = "word2";
Match result = Regex.Match(input, sPattern);
Console.WriteLine(result);
}
}
I tried to include it in Dictionary as below, but failed:
var functions = new Dictionary<Match, Action>();
functions.Add(Regex.Match(string, sPattern), CountParameters);
Action action;
if (functions.TryGetValue("word1|word2", out action)) {action.Invoke(); }
It gave me an invalid expression error for string at Regex.Match(string, sPattern), and a cannot convert string to Match error at .TryGetValue("word1|word2").
UPDATE
I restructured my code like below, so I've no compiling error, but nothing is printed out as a result:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string sPattern1 = "word1|word2";
string sPattern2 = "word3|word4";
string input = "word2";
var functions = new Dictionary<string, Action>();
functions.Add("word1", CountParameters);
functions.Add("word3", SomeOtherMethodName);
Action action;
if (functions.TryGetValue((Regex.Match(input, sPattern1)).ToString(), out action))
{
action.Invoke();
}
else
{
// No function with that name
}
}
public static void CountParameters()
{
Console.WriteLine("Fn 1");
}
public static void SomeOtherMethodName()
{
Console.WriteLine("Fn 2");
}
}
The above works if string input = "word1"; but not if string input = "word2"; even though the regex should treat word1 and word2 as the same, given string sPattern1 = "word1|word2";
UPDATE 2
In case it was not clear enough, the output of the above should be:
Executing CountParameters in case the input is word1 or word2, as the RegEx should consider them the same considering the | used in the pattern above.
Executing SomeOtherMethodName in case the input is word3 or word4, as the RegEx should consider them the same considering the | used in the pattern above.
and so on, in case I added more RegEx expression using the OR which is |
I think you want something like this:
var input = "word2";
var functions = new Dictionary<Regex, Action>
{
{new Regex("word1|word2"), CountParameters}
};
functions.FirstOrDefault(f => f.Key.IsMatch(input)).Value?.Invoke();
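A slightly fuller sketch of the same idea with two patterns. To make the result easy to check it uses Func<string> instead of Action, which is a deliberate change from the question; the Dispatcher class and the anchored patterns are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class Dispatcher
{
    // Patterns are anchored so that, for example, "word10" does not match "word1".
    private static readonly Dictionary<Regex, Func<string>> Functions =
        new Dictionary<Regex, Func<string>>
        {
            { new Regex("^(word1|word2)$"), () => "Fn 1" },
            { new Regex("^(word3|word4)$"), () => "Fn 2" },
        };

    // Returns the result of a handler whose pattern matches, or null if none does.
    public static string Dispatch(string input)
    {
        // FirstOrDefault yields an empty KeyValuePair (null Value) when nothing matches.
        var entry = Functions.FirstOrDefault(f => f.Key.IsMatch(input));
        return entry.Value?.Invoke();
    }
}
```

A Dictionary does not guarantee enumeration order, so this approach is safest when the patterns cannot match the same input; with overlapping patterns a List of pattern and handler pairs would make the priority explicit.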

Extracting and reading decimal numbers from a string in C#

I am relatively new to C# programming and I apologize if this is a simple matter, but I need help with something.
I need a function which will 'extract' regular AND decimal numbers from a string and place them in an array. I'm familiar with
string[] extractData = Regex.Split(someInput, @"\D+");
but that only takes out integers. If I have a string "19 something 58" it will take 19 and 58 and store them into two different array fields. However if I had "19.58 something" it will again take them as two separate numbers, while I want to register it as one decimal number.
Is there a way to make it 'read' such numbers as one decimal number, using Regex or some other method?
Thanks in advance.
Try the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication9
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
string[] inputs = {
"9 something 58" ,
"19.58 something"
};
foreach (string input in inputs)
{
MatchCollection matches = Regex.Matches(input, @"(?'number'\d*\.\d*)|(?'number'\d+[^\.])");
foreach (Match match in matches)
{
Console.WriteLine("Number : {0}", match.Groups["number"].Value);
}
}
Console.ReadLine();
}
}
}
Try this:
Regex.Replace(someInput, @"[^-?\d+\.]", "");
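Another possible sketch matches the numbers directly instead of splitting or stripping characters. It assumes unsigned numbers with a dot as the decimal separator; NumberExtractor and Extract are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // [0-9]+(?:\.[0-9]+)? matches digits optionally followed by a dot and more digits,
    // so "19.58" is one match while "19 something 58" gives two.
    public static decimal[] Extract(string input)
    {
        return Regex.Matches(input, @"[0-9]+(?:\.[0-9]+)?")
            .Cast<Match>()
            // InvariantCulture makes the dot the decimal separator regardless of system locale.
            .Select(m => decimal.Parse(m.Value, CultureInfo.InvariantCulture))
            .ToArray();
    }
}
```

Parsing with the invariant culture matters on systems whose locale uses a comma as the decimal separator, where a plain decimal.Parse("19.58") may fail or give a different value.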

Delimit a string by character unless within quotation marks C#

I need to delimit text by a single character, a comma. But I only want to use that comma as a delimiter if it is not enclosed in quotation marks.
An example:
Method,value1,value2
Would contain three values: Method, value1 and value2
But:
Method,"value1,value2"
Would contain two values: Method and "value1,value2"
I'm not really sure how to go about this as when splitting a string I would use:
String.Split(',');
But that would split on ALL commas. Is this possible without getting overly complicated and having to manually check every character of the string?
Thanks in advance
Copied from my comment: Use an available CSV parser such as Microsoft.VisualBasic.FileIO.TextFieldParser.
As requested, here is an example for the TextFieldParser:
var allLineFields = new List<string[]>();
string sampleText = "Method,\"value1,value2\"";
var reader = new System.IO.StringReader(sampleText);
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader))
{
parser.Delimiters = new string[] { "," };
parser.HasFieldsEnclosedInQuotes = true; // <--- !!!
string[] fields;
while ((fields = parser.ReadFields()) != null)
{
allLineFields.Add(fields);
}
}
This list now contains a single string[] with two strings. I have used a StringReader because this sample uses a string; if the source is a file, use a StreamReader (for example via File.OpenText).
You can try Regex.Split() to split the data up using the pattern
",|(\"[^\"]*\")"
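This splits on commas, and because the quoted alternative is inside a capturing group, Regex.Split also returns each quoted section as its own entry. The empty strings that result are filtered out afterwards.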
Code Sample:
using System;
using System.Linq;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string data = "Method,\"value1,value2\",Method2";
string[] pieces = Regex.Split(data, ",|(\"[^\"]*\")").Where(exp => !String.IsNullOrEmpty(exp)).ToArray();
foreach (string piece in pieces)
{
Console.WriteLine(piece);
}
}
}
Results:
Method
"value1,value2"
Method2