Custom List<string[]> Sort - c#

I have a list of string[].
List<string[]> cardDataBase;
I need to sort that list by each list-item's second string value (item[1]) in custom order.
The custom order is a bit complicated, order by those starting characters:
"MW1"
"FW"
"DN"
"MWSTX1CK"
"MWSTX2FF"
then order by these letters following above starting letters:
"Q"
"J"
"C"
"E"
"I"
"A"
and then by the numbers following above.
a sample, unordered list left, ordered right:
MW1E10 MW1Q04
MWSTX2FFI06 MW1Q05
FWQ02 MW1E10
MW1Q04 MW1I06
MW1Q05 FWQ02
FWI01 FWI01
MWSTX2FFA01 DNC03
DNC03 MWSTX1CKC02
MWSTX1CKC02 MWSTX2FFI03
MWSTX2FFI03 MWSTX2FFI06
MW1I06 MWSTX2FFA01
I tried Linq but I am not that good in it right now and cannot solve this on my own. Do I need a dictionary, regex or a dictionary with regex in it? What would be the best approach?

I think you're approaching this incorrectly. You're not sorting strings, you're sorting structured objects that are misrepresented as strings (somebody aptly named this antipattern "stringly typed"). Your requirements show that you know this structure, yet it's not represented in the data structure List<string[]>, and that's making your life hard. You should parse that structure into a real type (struct or class), and then sort that.
enum PrefixCode { MW1, FW, DN, MWSTX1CK, MWSTX2FF, }
enum TheseLetters { Q, J, C, E, I, A, }
struct CardRecord : IComparable<CardRecord> {
public readonly PrefixCode Code;
public readonly TheseLetters Letter;
public readonly uint Number;
public CardRecord(string input) {
Code = ParseEnum<PrefixCode>(ref input);
Letter = ParseEnum<TheseLetters>(ref input);
Number = uint.Parse(input);
}
static T ParseEnum<T>(ref string input) { //assumes non-overlapping prefixes
foreach(T val in Enum.GetValues(typeof(T))) {
if(input.StartsWith(val.ToString())) {
input = input.Substring(val.ToString().Length);
return val;
}
}
throw new InvalidOperationException("Failed to parse: "+input);
}
public int CompareTo(CardRecord other) {
var codeCmp = Code.CompareTo(other.Code);
if (codeCmp!=0) return codeCmp;
var letterCmp = Letter.CompareTo(other.Letter);
if (letterCmp!=0) return letterCmp;
return Number.CompareTo(other.Number);
}
public override string ToString() {
return Code.ToString() + Letter + Number.ToString("00");
}
}
A program using the above to process your example might then be:
using System;
using System.Linq;
static class Program {
static void Main() {
var inputStrings = new []{ "MW1E10", "MWSTX2FFI06", "FWQ02", "MW1Q04", "MW1Q05",
"FWI01", "MWSTX2FFA01", "DNC03", "MWSTX1CKC02", "MWSTX2FFI03", "MW1I06" };
var outputStrings = inputStrings
.Select(s => new CardRecord(s))
.OrderBy(c => c)
.Select(c => c.ToString());
Console.WriteLine(string.Join("\n", outputStrings));
}
}
This generates the same ordering as in your example. In real code, I'd recommend you name the types according to what they represent, and not, for example, TheseLetters.
This solution - with a real parse step - is superior because it's almost certain that you'll want to do more with this data at some point, and this allows you to actually access the components of the data easily. Furthermore, it's comprehensible to a future maintainer since the reason behind the ordering is somewhat clear. By contrast, if you chose to do complex string-based processing it's often very hard to understand what's going on (especially if it's part of a larger program, and not a tiny example as here).
Making new types is cheap. If your method's return value doesn't quite "fit" in an existing type, just make a new one, even if that means thousands of types.

This is a bit of spoon-feeding, but I found the question pretty interesting and perhaps it will be useful for others. I also added some comments to explain:
void Main()
{
var cardDatabase = new List<string>{
"MW1E10",
"MWSTX2FFI06",
"FWQ02",
"MW1Q04",
"MW1Q05",
"FWI01",
"MWSTX2FFA01",
"DNC03",
"MWSTX1CKC02",
"MWSTX2FFI03",
"MW1I06",
};
var orderTable = new List<string>[]{
new List<string>
{
"MW1",
"FW",
"DN",
"MWSTX1CK",
"MWSTX2FF"
},
new List<string>
{
"Q",
"J",
"C",
"E",
"I",
"A"
}
};
var test = cardDatabase.Select(input => {
var r = Regex.Match(input, "^(MW1|FW|DN|MWSTX1CK|MWSTX2FF)(Q|J|C|E|I|A)([0-9]+)$");
if(!r.Success) throw new Exception("Invalid data!");
// for each input string,
// we are going to split it into "substrings",
// eg: MWSTX1CKC02 will be
// [MWSTX1CK, C, 02]
// after that, we use IndexOf on each component
// to calculate "real" order,
// note that thirdComponent(aka number component)
// does not need IndexOf because it is already representing the real order,
// we still want to convert string to integer though, because we don't like
// "string ordering" for numbers.
return new
{
input = input,
firstComponent = orderTable[0].IndexOf(r.Groups[1].Value),
secondComponent = orderTable[1].IndexOf(r.Groups[2].Value),
thirdComponent = int.Parse(r.Groups[3].Value)
};
// and after it's done,
// we start using LINQ OrderBy and ThenBy functions
// to have our custom sorting.
})
.OrderBy(calculatedInput => calculatedInput.firstComponent)
.ThenBy(calculatedInput => calculatedInput.secondComponent)
.ThenBy(calculatedInput => calculatedInput.thirdComponent)
.Select(calculatedInput => calculatedInput.input)
.ToList();
// print the items; passing the list itself would only print its type name
Console.WriteLine(string.Join("\n", test));
}

You can use the Array.Sort() method: the first parameter is the array you're sorting, and the second is a comparer (or a Comparison<T> delegate) that contains the complicated logic for determining the order. List<T>.Sort() accepts the same kind of comparison.
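As a rough sketch of that idea applied to the List<string[]> from the question, the comparison can be built from a small key function. The names CardSortSketch, Key and SortCards are made up for this illustration, and the value-tuple syntax assumes C# 7 or later. Note that the key is recomputed on every comparison and that List<T>.Sort is not a stable sort.

```csharp
using System;
using System.Collections.Generic;

static class CardSortSketch
{
    // Prefixes and letters in the custom order from the question.
    static readonly string[] Prefixes = { "MW1", "FW", "DN", "MWSTX1CK", "MWSTX2FF" };
    static readonly string[] Letters = { "Q", "J", "C", "E", "I", "A" };

    // Hypothetical helper: turns "MWSTX1CKC02" into (3, 2, 2).
    public static (int Prefix, int Letter, int Number) Key(string code)
    {
        for (int p = 0; p < Prefixes.Length; p++)
        {
            if (!code.StartsWith(Prefixes[p], StringComparison.Ordinal)) continue;
            string rest = code.Substring(Prefixes[p].Length);
            int l = rest.Length == 0 ? -1 : Array.IndexOf(Letters, rest.Substring(0, 1));
            if (l < 0) break;
            return (p, l, int.Parse(rest.Substring(1)));
        }
        throw new FormatException("Unrecognised card code: " + code);
    }

    public static void SortCards(List<string[]> cards)
    {
        // List<T>.Sort accepts a Comparison<T>; value tuples compare component by component.
        cards.Sort((a, b) => Key(a[1]).CompareTo(Key(b[1])));
    }

    static void Main()
    {
        var cardDataBase = new List<string[]>
        {
            new[] { "card A", "MWSTX2FFI06" },
            new[] { "card B", "FWQ02" },
            new[] { "card C", "MW1Q04" },
        };
        SortCards(cardDataBase);
        foreach (var card in cardDataBase) Console.WriteLine(card[1]);
    }
}
```

The sort happens in place, so the original list is reordered rather than copied.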

You can use the Enumerable.OrderBy extension method provided by the System.Linq namespace.
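A minimal sketch of how that might look for a List<string[]> keyed on item[1], assuming the prefixes, letters and number format listed in the question. OrderBySketch, Parse and SortCards are names invented for this example; unlike List<T>.Sort, OrderBy returns a new sequence and leaves the input untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class OrderBySketch
{
    static readonly List<string> Prefixes = new List<string> { "MW1", "FW", "DN", "MWSTX1CK", "MWSTX2FF" };
    static readonly List<string> Letters = new List<string> { "Q", "J", "C", "E", "I", "A" };
    static readonly Regex Pattern = new Regex("^(MW1|FW|DN|MWSTX1CK|MWSTX2FF)([QJCEIA])([0-9]+)$");

    // Splits a code into prefix, letter and number groups, or throws on bad data.
    static Match Parse(string code)
    {
        Match m = Pattern.Match(code);
        if (!m.Success) throw new FormatException("Unrecognised card code: " + code);
        return m;
    }

    public static List<string[]> SortCards(IEnumerable<string[]> cards)
    {
        return cards
            .Select(card => new { Card = card, Parts = Parse(card[1]) })
            .OrderBy(x => Prefixes.IndexOf(x.Parts.Groups[1].Value))
            .ThenBy(x => Letters.IndexOf(x.Parts.Groups[2].Value))
            .ThenBy(x => int.Parse(x.Parts.Groups[3].Value))
            .Select(x => x.Card)
            .ToList();
    }

    static void Main()
    {
        var cardDataBase = new List<string[]>
        {
            new[] { "card A", "FWI01" },
            new[] { "card B", "FWQ02" },
            new[] { "card C", "MW1I06" },
        };
        foreach (var card in SortCards(cardDataBase)) Console.WriteLine(card[1]);
    }
}
```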

Related

How to use .OrderBy for multiple conditions, one only sometimes used?

Sorry if the title is confusing; I had some trouble putting my problem into words.
I have a List, where every string is composed of 2 words, delimited by space.
For example:
{ "word1 word2", "wordA wordB", "dog cat", "mouse cat" }
I want to use OrderBy to sort the list by the 2nd word, if any words are equal, I then want to sort those by the 1st word. I'm having trouble figuring out how to handle the 2nd condition for this (sorting by 1st word only if 2nd words are equal).
I originally tried:
public List<string> SpecialSort(List<string> text)
{
return text.OrderBy(x => x.Split(' ')[1]).ThenBy(x => x.Split(' ')[0]).ToList();
}
but this seems to just sort first by the 2nd word, and then re-sort everything by the 1st word. Is there a way for me to do this where I only sort by 1st word if the 2nd words are equal?
Thanks!
My advice would be to split the text into words, while keeping the original text in a Select. Then sort the sequence and finally remove the split words.
Requirement
Input: a sequence of strings, every string has exactly one space.
This space is neither the first nor the last character.
The characters before this one and only space are defined as the first word.
The characters after the space are defined as the second word.
Output: Sort the sequence by 2nd word, then by 1st word.
IEnumerable<string> inputTexts = ...
const char splitChar = ' ';
// first add the split words
var sortedSequence = inputTexts.Select(txt => new
{
Original = txt,
Split = txt.Split(splitChar),
})
// then sort by the split words
.OrderBy(splitTxt => splitTxt.Split[1])
.ThenBy(splitTxt => splitTxt.Split[0])
// finally remove the split words
.Select(splitTxt => splitTxt.Original);
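For what it's worth, ThenBy does not re-sort everything: it is only consulted for elements whose OrderBy keys compare equal. A small check with the sample data from the question (the class name ThenBySketch and the choice of StringComparer.Ordinal are assumptions made for this illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ThenBySketch
{
    public static List<string> SpecialSort(IEnumerable<string> text)
    {
        return text
            .OrderBy(x => x.Split(' ')[1], StringComparer.Ordinal)   // primary key: 2nd word
            .ThenBy(x => x.Split(' ')[0], StringComparer.Ordinal)    // tie-breaker: 1st word
            .ToList();
    }

    static void Main()
    {
        var words = new List<string> { "word1 word2", "wordA wordB", "dog cat", "mouse cat" };
        Console.WriteLine(string.Join(", ", SpecialSort(words)));
        // prints: dog cat, mouse cat, word1 word2, wordA wordB
    }
}
```

Both "cat" entries come first, ordered among themselves by their first word, while the remaining entries stay ordered by their second word.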
Creating intermediate results inside an .OrderBy() statement can be painful, because a comparer may need to compute them several times for each object. To keep it maintainable, I would also write a class that takes the original value and derives the desired elements from it, then feed these intermediate objects into a dedicated comparer that sorts them. At the end, just read the original value back out of the intermediate class and you're done.
A rough sketch for your example would look something like this:
using System;
using System.Collections.Generic;
using System.Linq;
public static class Program
{
private static void Main(string[] args)
{
var words = new List<string>{"word1 word2", "wordA wordB", "dog cat", "mouse cat"};
var ordered = words
.Select(SpecialComparerInstance.Create)
.OrderBy(special => special, SpecialComparer.Default)
.Select(special => special.Value);
foreach (var item in ordered)
{
Console.WriteLine(item);
}
}
}
public class SpecialComparerInstance
{
public static SpecialComparerInstance Create(string value) => new SpecialComparerInstance(value);
public SpecialComparerInstance(string value)
{
if (string.IsNullOrEmpty(value))
throw new ArgumentNullException(nameof(value));
var elements = value.Split(' ');
if (elements.Length != 2)
throw new ArgumentException("Must contain exactly one space character", nameof(value));
Value = value;
FirstOrderValue = elements[1];
SecondOrderValue = elements[0];
}
public string Value { get; }
public string FirstOrderValue { get; }
public string SecondOrderValue { get; }
}
public class SpecialComparer : IComparer<SpecialComparerInstance>
{
public static readonly IComparer<SpecialComparerInstance> Default = new SpecialComparer(StringComparer.Ordinal);
private readonly StringComparer _comparer;
public SpecialComparer(StringComparer comparer)
{
_comparer = comparer;
}
public int Compare(SpecialComparerInstance x, SpecialComparerInstance y)
{
if (ReferenceEquals(x, y))
return 0;
if (ReferenceEquals(x, null))
return 1;
if (ReferenceEquals(y, null))
return -1;
var result = _comparer.Compare(x.FirstOrderValue, y.FirstOrderValue);
if (result == 0)
result = _comparer.Compare(x.SecondOrderValue, y.SecondOrderValue);
return result;
}
}

How to check whether two lists items have value equality using EqualityComparer? [duplicate]

Before marking this as duplicate because of its title please consider the following short program:
static void Main()
{
var expected = new List<long[]> { new[] { Convert.ToInt64(1), Convert.ToInt64(999999) } };
var actual = DoSomething();
if (!actual.SequenceEqual(expected)) throw new Exception();
}
static IEnumerable<long[]> DoSomething()
{
yield return new[] { Convert.ToInt64(1), Convert.ToInt64(999999) };
}
I have a method which returns a sequence of arrays of type long. To test it I wrote some test-code similar to that one within Main.
However, I get the exception and I don't know why. Shouldn't the expected sequence be comparable to the one actually returned, or did I miss something?
To me it looks as if both the method and the expected list contain exactly one element, which is an array of type long. Isn't that so?
EDIT: So how do I avoid the exception, i.e. how do I compare the elements within the enumeration so that they are considered equal?
The actual problem is that you're comparing two long[] instances, and Enumerable.SequenceEqual will use an ObjectEqualityComparer<Int64[]> (you can see this by examining EqualityComparer<long[]>.Default, which is what Enumerable.SequenceEqual uses internally). That comparer compares the references of the two arrays, not the values stored inside them, and the references obviously aren't the same.
To get around this, you could write a custom EqualityComparer<long[]>:
static void Main()
{
var expected = new List<long[]>
{ new[] { Convert.ToInt64(1), Convert.ToInt64(999999) } };
var actual = DoSomething();
if (!actual.SequenceEqual(expected, new LongArrayComparer()))
throw new Exception();
}
public class LongArrayComparer : EqualityComparer<long[]>
{
public override bool Equals(long[] first, long[] second)
{
return first.SequenceEqual(second);
}
// GetHashCode implementation courtesy of @JonSkeet
// from http://stackoverflow.com/questions/7244699/gethashcode-on-byte-array
public override int GetHashCode(long[] arr)
{
unchecked
{
if (arr == null)
{
return 0;
}
int hash = 17;
foreach (long element in arr)
{
hash = hash * 31 + element.GetHashCode();
}
return hash;
}
}
}
No, your sequences are not equal!
Let's remove the sequence bit, and just take what is in the first element of each item:
var firstExpected = new[] { Convert.ToInt64(1), Convert.ToInt64(999999) };
var firstActual = new[] { Convert.ToInt64(1), Convert.ToInt64(999999) };
Console.WriteLine(firstExpected == firstActual); // writes "false"
The code above compares two separate arrays for equality. For arrays, == does not check the contents; it checks whether the references are equal.
Your code using SequenceEqual is, essentially, doing the same thing. It checks the references of each element in an enumerable.
SequenceEqual tests whether the elements of the two sequences are equal. The elements here are of type long[], so we actually compare two different arrays (which happen to contain the same values) against each other, and that is obviously done by comparing their references instead of their contents.
So what we actually check here is expected[0] == actual[0] instead of expected[0].SequenceEqual(actual[0]).
This obviously returns false, as the two arrays are different references.
If we flatten the hierarchy using SelectMany we get what we want:
if (!actual.SelectMany(x => x).SequenceEqual(expected.SelectMany(x => x))) throw new Exception();
EDIT:
Based on this approach I found another elegant way to check whether all the elements of expected are also contained in actual:
if (!expected.All(x => actual.Any(y => y.SequenceEqual(x)))) throw new Exception();
This checks whether, for every sub-list in expected, there is a list in actual that is sequentially identical to it. This seems much smarter to me, as we need neither a custom EqualityComparer nor a weird hash code implementation.
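One caveat with both shortcuts: the All/Any check ignores the order of the arrays and any extra arrays in actual, and the SelectMany version cannot tell { {1}, {2} } from { {1, 2} }. If position matters, a pairwise comparison is a possible middle ground. The sketch below is only one way to do it; NestedEqualitySketch and NestedSequenceEqual are names invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NestedEqualitySketch
{
    // True when both sequences hold the same number of arrays and each pair of
    // arrays at the same position contains the same values in the same order.
    public static bool NestedSequenceEqual(IEnumerable<long[]> expected, IEnumerable<long[]> actual)
    {
        var e = expected.ToList();
        var a = actual.ToList();
        return e.Count == a.Count
            && e.Zip(a, (x, y) => x.SequenceEqual(y)).All(equal => equal);
    }

    static void Main()
    {
        var expected = new List<long[]> { new[] { 1L, 999999L } };
        var actual = new List<long[]> { new[] { 1L, 999999L } };
        Console.WriteLine(NestedSequenceEqual(expected, actual)); // True
        Console.WriteLine(expected.SequenceEqual(actual));        // False: compares references
    }
}
```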

C#: Dynamically Constructing Variables

I get from an input a group of double variables named: weight0, weight1...weight49.
I want to dynamically insert them into a double Array for easier manipulation.
But instead of calling each one like: Weights[0] = weight0...Weights[49] = weight49 I want to do it with a single loop.
Is there a way to do it?
No, basically - unless you mean at the same time that you create the array:
var weights = new[] {weight0, weight1, weight2, ... , weight48, weight49};
Personally, I'd be tempted to get rid of the 50 variables, and use the array from the outset, but that may not be possible in all cases.
You could use reflection to determine the index of the array from the variable names, but this is far from efficient. See this post for details.
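A minimal sketch of the reflection idea, assuming the values are instance fields named weight0, weight1, ... on some object (reflection cannot see local variables). The Sample class and the CollectWeights helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical holder type; the real one would carry weight0..weight49.
class Sample
{
    public double weight0 = 1.5, weight1 = 2.5, weight2 = 3.5;
}

static class WeightReflectionSketch
{
    // Copies the fields weight0 .. weight(count-1) of 'source' into a new array.
    public static double[] CollectWeights(object source, int count)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        var weights = new double[count];
        for (int i = 0; i < count; i++)
        {
            FieldInfo field = source.GetType().GetField("weight" + i, flags);
            if (field == null)
                throw new MissingFieldException(source.GetType().Name, "weight" + i);
            weights[i] = (double)field.GetValue(source);
        }
        return weights;
    }

    static void Main()
    {
        double[] weights = CollectWeights(new Sample(), 3);
        Console.WriteLine(string.Join(" ", weights));
    }
}
```

The array holds copies, so later changes to the fields are not reflected in it.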
I would try to do it with a list of KeyValuePair objects:
// sample data
var weight = 1.00;
// create a list
var tmp = new List<KeyValuePair<string,object>>();
// Here you can add your variables
tmp.Add(new KeyValuePair<string,object>("weights" + tmp.Count.ToString()
, weight));
// If needed convert to array
var weights = tmp.ToArray();
// get the information out of the array
var weightValue = weights[0].Value;
var weightKey = weights[0].Key;
I think this will give you all the options, you might need for the array. Give it a try.
I'm putting this up because you can do it - so long as these variables are actually fields/properties. Whether you should is another matter - this solution, while reusable, is slow (needs delegate caching) and I have to say I agree with Marc Gravell - consider using an array throughout if you can.
If the variables are properties then it needs changing. Also if you need to write the array back to the variables in one shot (because this solution generates an array with copies of all the doubles, I wouldn't consider creating an object array with boxed doubles), that requires another method...
So here goes. First a holy wall of code/extension method:
//paste this as a direct child of a namespace (not a nested class)
//requires: using System; using System.Linq; using System.Linq.Expressions;
public static class SO8877853Extensions
{
public static TArray[] FieldsToArray<TObj, TArray>(this TObj o,string fieldPrefix)
{
if(string.IsNullOrWhiteSpace(fieldPrefix))
throw new ArgumentException("fieldPrefix must not be null/empty/whitespace",
"fieldPrefix");
//I've done this slightly more expanded than it really needs to be...
var fields = typeof(TObj).GetFields(System.Reflection.BindingFlags.Instance
| System.Reflection.BindingFlags.Public
| System.Reflection.BindingFlags.NonPublic)
.Where(f =>f.Name.StartsWith(fieldPrefix) && f.FieldType.Equals(typeof(TArray)))
.Select(f =>new{ Field = f, OrdinalStr = f.Name.Substring(fieldPrefix.Length)})
.Where(f => { int unused; return int.TryParse(f.OrdinalStr, out unused);})
.Select(f => new { Field = f.Field, Ordinal = int.Parse(f.OrdinalStr) })
.OrderBy(f => f.Ordinal).ToArray();
//doesn't handle ordinal gaps e.g. 0,1,2,7
if(fields.Length == 0)
throw new ArgumentException(
string.Format("No fields found with the prefix {0}",
fieldPrefix),
"fieldPrefix");
//could instead bake the 'o' reference as a constant - but if
//you are caching the delegate, it makes it non-reusable.
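//use typeof(TObj) so the parameter matches Func<TObj, TArray[]> even when
//'o' is an instance of a derived type
ParameterExpression pThis = Expression.Parameter(typeof(TObj));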
//generates a dynamic new double[] { var0, var1 ... } expression
var lambda = Expression.Lambda<Func<TObj, TArray[]>>(
Expression.NewArrayInit(typeof(TArray),
fields.Select(f => Expression.Field(pThis, f.Field))), pThis);
//you could cache this delegate here by typeof(TObj),
//fieldPrefix and typeof(TArray) in a Dictionary/ConcurrentDictionary
return lambda.Compile()(o);
}
}
The extension method above will work on any type. It's generic over both the instance type and desired array type to simplify the creation of the lambda in code - it doesn't have to be generic though.
You pass in the name prefix for a group of fields - in your case "weight" - it then searches all the public and private instance fields for those with that prefix that also have a suffix which can be parsed into an integer. It then orders those fields based on that ordinal. It does not check for gaps in the ordinal list - so a type with weight0 and weight2 would work, but would only create a two-element array.
Then it bakes a dynamic piece of code using Expression trees, compiles it (at this point, as mentioned in the code, it would be good to cache the delegate against TObj and TArray for future use) and then executes it, returning the result.
Now add this to a test class in a standard unit test project:
private class SO8877853
{
private double field0 = 1.0;
private double field1 = -5.0;
private double field2 = 10.0;
public double[] AsArray()
{
//it would be nice not to have to pass both type names here - that
//can be achieved by making the extension method pass out the array
//via an 'out TArray[]' instead.
return this.FieldsToArray<SO8877853, double>("field");
}
}
[TestMethod]
public void TestThatItWorks()
{
var asArray = new SO8877853().AsArray();
Assert.IsTrue(new[] { 1.0, -5.0, 10.0 }.SequenceEqual(asArray));
}
Like I say - I'm not condoning doing this, nor am I expecting any +1s for it - but I'm a sucker for a challenge :)

Finding duplicates in List<string>

In a list with some hundred thousand entries, how does one go about comparing each entry with the rest of the list for duplicates?
For example, List<string> fileNames contains both "00012345.pdf" and "12345.pdf", which are considered duplicates. What is the best strategy for flagging this kind of duplicate?
Thanks
Update: The naming of files is restricted to numbers. They are padded with zeros. Duplicates are where the padding is missing. Thus, "123.pdf" & "000123.pdf" are duplicates.
You probably want to implement your own substring comparer to test equality based on whether a substring is contained within another string.
This isn't necessarily optimised, but it will work. You could also possibly consider using Parallel Linq if you are using .NET 4.0.
EDIT: Answer updated to reflect refined question after it was edited
void Main()
{
List<string> stringList = new List<string> { "00012345.pdf","12345.pdf","notaduplicate.jpg","3453456363234.jpg"};
IEqualityComparer<string> comparer = new NumericFilenameEqualityComparer ();
var duplicates = stringList.GroupBy (s => s, comparer).Where(grp => grp.Count() > 1);
// do something with grouped duplicates...
}
// Not safe for nulls!
// NB: do your own parameter / null checks / string-case options etc.!
public class NumericFilenameEqualityComparer : IEqualityComparer<string> {
private static readonly Regex digitFilenameRegex = new Regex(@"\d+", RegexOptions.Compiled);
// first run of digits in the name as a number; long.MaxValue when there is none
private static long NumericValue(string value) {
Match digitsMatch = digitFilenameRegex.Match(value);
return digitsMatch.Success ? long.Parse(digitsMatch.Value) : long.MaxValue;
}
public bool Equals(string left, string right) {
return NumericValue(left) == NumericValue(right);
}
// hash the same value that Equals compares, so equal names land in the same bucket
// and unequal names are spread out instead of all sharing one hash code
public int GetHashCode(string value) {
return NumericValue(value).GetHashCode();
}
}
I understand you are looking for duplicates in order to remove them?
One way to go about it could be the following:
Create a class MyString which takes care of the duplication rules. That is, it overrides Equals and GetHashCode to reproduce exactly the duplication rules you have in mind. (I understand from your question that 00012345.pdf and 12345.pdf should be considered duplicates?)
Make this class explicitly or implicitly convertible to string (or override ToString() for that matter).
Create a HashSet<MyString> and fill it by iterating through your original List<String>, checking for duplicates.
Might be dirty but it will do the trick. The only "hard" part here is correctly implementing your duplication rules.
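The steps above could be sketched roughly as follows. PaddedName plays the role of MyString, and its rule (names are equal once leading zeros are dropped, ignoring case) is an assumption based on the update to the question; DuplicateSketch and FindDuplicates are likewise names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: two names are equal when they match after dropping leading zeros.
struct PaddedName : IEquatable<PaddedName>
{
    public string Original { get; }
    private readonly string key;

    public PaddedName(string original)
    {
        Original = original;
        key = original.TrimStart('0');
    }

    public bool Equals(PaddedName other) =>
        string.Equals(key, other.key, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object obj) => obj is PaddedName other && Equals(other);

    // Must agree with Equals: hash the trimmed key, not the original text.
    public override int GetHashCode() => StringComparer.OrdinalIgnoreCase.GetHashCode(key ?? "");

    public override string ToString() => Original;
}

static class DuplicateSketch
{
    // Returns every name whose trimmed form was already seen earlier in the sequence.
    public static List<string> FindDuplicates(IEnumerable<string> fileNames)
    {
        var seen = new HashSet<PaddedName>();
        var duplicates = new List<string>();
        foreach (string name in fileNames)
        {
            // HashSet<T>.Add returns false when an equal element is already present.
            if (!seen.Add(new PaddedName(name))) duplicates.Add(name);
        }
        return duplicates;
    }

    static void Main()
    {
        var fileNames = new List<string> { "00012345.pdf", "12345.pdf", "12.pdf" };
        Console.WriteLine(string.Join(", ", FindDuplicates(fileNames)));
    }
}
```

Because the set is hash-based, each name is handled in roughly constant time, which matters for a list with hundreds of thousands of entries.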
I have a simple solution for finding duplicate words and characters in a string.
For words:
public class Test {
public static void main(String[] args) {
findDuplicateWords("i am am a a learner learner learner");
}
private static void findDuplicateWords(String string) {
HashMap<String,Integer> hm=new HashMap<>();
String[] s=string.split(" ");
for(String tempString:s){
if(hm.get(tempString)!=null){
hm.put(tempString, hm.get(tempString)+1);
}
else{
hm.put(tempString,1);
}
}
System.out.println(hm);
}
}
For characters, use a for loop, get the string length and use charAt().
Maybe something like this:
List<string> theList = new List<string>() { "00012345.pdf", "00012345.pdf", "12345.pdf", "1234567.pdf", "12.pdf" };
theList.GroupBy(txt => txt)
.Where(grouping => grouping.Count() > 1)
.ToList()
.ForEach(groupItem => Console.WriteLine("{0} duplicated {1} times with these values {2}",
groupItem.Key,
groupItem.Count(),
string.Join(" ", groupItem.ToArray())));

String Parsing in C#

What is the most efficient way to parse a C# string in the form of
"(params (abc 1.3)(sdc 2.0)(www 3.05)....)"
into a struct in the form
struct Params
{
double abc,sdc,www....;
}
Thanks
EDIT
The structure always has the same parameters (same names, only doubles, known at compile time), but the order is not guaranteed. Only one struct at a time.
using System;
namespace ConsoleApplication1
{
class Program
{
struct Params
{
public double abc, sdc;
};
static void Main(string[] args)
{
string s = "(params (abc 1.3)(sdc 2.0))";
Params p = new Params();
object pbox = (object)p; // structs must be boxed for SetValue() to work
string[] arr = s.Substring(8).Replace(")", "").Split(new char[] { ' ', '(', }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < arr.Length; i+=2)
typeof(Params).GetField(arr[i]).SetValue(pbox, double.Parse(arr[i + 1]));
p = (Params)pbox;
Console.WriteLine("p.abc={0} p.sdc={1}", p.abc, p.sdc);
}
}
}
Note: if you used a class instead of a struct the boxing/unboxing would not be necessary.
Depending on your complete grammar you have a few options:
If it's a very simple grammar and you don't have to test for errors in it, you could simply go with the code below (which will be fast):
var input = "(params (abc 1.3)(sdc 2.0)(www 3.05)....)";
var tokens = input.Split('(');
//the input starts with '(', so tokens[0] is empty and tokens[1] holds the type name
var typeName = tokens[1].Trim();
//you'll need more than the type name (assembly/namespace) so I'll leave that to you
Type t = getStructFromType(typeName);
var obj = TypeDescriptor.CreateInstance(null, t, null, null);
for(var i = 2;i<tokens.Length;i++)
{
var innerTokens = tokens[i].Trim(' ', ')').Split(' ');
var fieldName = innerTokens[0];
var value = Convert.ToDouble(innerTokens[1]);
var field = t.GetField(fieldName);
field.SetValue(obj, value);
}
That simple approach, however, requires a well-conforming string or it will misbehave.
If the grammar is a bit more complicated, e.g. nested ( ), then that simple approach won't work.
You could try to use a regex, but that still requires a rather simple grammar, so if you end up with a complex grammar your best choice is a real parser. Irony is easy to use since you can write it all in plain C# (some knowledge of BNF is a plus though).
Do you need to support multiple structs ? In other words, does this need to be dynamic; or do you know the struct definition at compile time ?
Parsing the string with a regex would be the obvious choice.
Here is a regex, that will parse your string format:
private static readonly Regex regParser = new Regex(@"^\(params\s(\((?<name>[a-zA-Z]+)\s(?<value>[\d\.]+)\))+\)$", RegexOptions.Compiled);
Running that regex on a string will give you two groups named "name" and "value". The Captures property of each group will contain the names and values.
If the struct type is unknown at compile time, then you will need to use reflection to fill in the fields.
If you mean to generate the struct definition at runtime, you will need to use Reflection to emit the type; or you will need to generate the source code.
Which part are you having trouble with ?
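To illustrate the Captures idea, here is a small sketch that runs the regex above and pairs up the captured names and values in a dictionary. The CaptureSketch class, the Parse method and the use of CultureInfo.InvariantCulture are assumptions made for this example, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

static class CaptureSketch
{
    static readonly Regex regParser = new Regex(
        @"^\(params\s(\((?<name>[a-zA-Z]+)\s(?<value>[\d\.]+)\))+\)$", RegexOptions.Compiled);

    public static Dictionary<string, double> Parse(string input)
    {
        Match match = regParser.Match(input);
        if (!match.Success) throw new FormatException("Unexpected input: " + input);

        // A group inside a repeated part keeps one capture per repetition.
        CaptureCollection names = match.Groups["name"].Captures;
        CaptureCollection values = match.Groups["value"].Captures;

        var result = new Dictionary<string, double>();
        for (int i = 0; i < names.Count; i++)
        {
            // Invariant culture so "1.3" parses the same on every machine.
            result[names[i].Value] = double.Parse(values[i].Value, CultureInfo.InvariantCulture);
        }
        return result;
    }

    static void Main()
    {
        var parsed = Parse("(params (abc 1.3)(sdc 2.0)(www 3.05))");
        foreach (var pair in parsed) Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

With the names known at compile time, the dictionary entries can then be copied into the struct's fields by hand, with no reflection involved.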
A regex can do the job for you:
public Dictionary<string, double> ParseString(string input){
var dict = new Dictionary<string, double>();
try
{
var re = new Regex(@"(?:\(params\s)?(?:\((?<n>[^\s]+)\s(?<v>[^\)]+)\))");
foreach (Match m in re.Matches(input))
dict.Add(m.Groups["n"].Value, double.Parse(m.Groups["v"].Value));
}
catch
{
throw new Exception("Invalid format!");
}
return dict;
}
use it like:
string str = "(params (abc 1.3)(sdc 2.0)(www 3.05))";
var parsed = ParseString(str);
// parsed["abc"] would now return 1.3
That might fit better than creating a lot of different structs for every possible input string, and using reflection for filling them. I dont think that is worth the effort.
Furthermore I assumed the input string is always in exactly the format you posted.
You might consider performing just enough string manipulation to make the input look like standard command line arguments then use an off-the-shelf command line argument parser like NDesk.Options to populate the Params object. You give up some efficiency but you make it up in maintainability.
public Params Parse(string input)
{
var @params = new Params();
var argv = ConvertToArgv(input);
new NDesk.Options.OptionSet
{
{"abc=", v => Double.TryParse(v, out @params.abc)},
{"sdc=", v => Double.TryParse(v, out @params.sdc)},
{"www=", v => Double.TryParse(v, out @params.www)}
}
.Parse(argv);
return @params;
}
private string[] ConvertToArgv(string input)
{
return input
.Replace('(', '-')
.Split(new[] {')', ' '});
}
Do you want to build a data representation of your defined syntax?
If you are looking for easy maintainability without having to write long regex statements, you could build your own lexer/parser. Here is a prior discussion on SO, with good links in the answers as well, to help you:
Poor man's "lexer" for C#
I would just do a basic recursive-descent parser. It may be more general than you want, but nothing else will be much faster.
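A minimal hand-written sketch in that style, with one method per grammar rule. The grammar in the comment is an assumption derived from the sample string; since it has no nesting, nothing actually recurses here, but a nested rule would simply call back into a parse method. ParamsParser and its members are names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Grammar assumed for this sketch:
//   input := '(' "params" pair* ')'
//   pair  := '(' name number ')'
sealed class ParamsParser
{
    private readonly string text;
    private int pos;

    private ParamsParser(string text) { this.text = text; }

    public static Dictionary<string, double> Parse(string input)
    {
        var parser = new ParamsParser(input);
        var result = parser.ParseInput();
        parser.SkipSpaces();
        if (parser.pos != input.Length)
            throw new FormatException("Trailing characters at position " + parser.pos);
        return result;
    }

    private Dictionary<string, double> ParseInput()
    {
        var result = new Dictionary<string, double>();
        Expect('(');
        if (ReadToken() != "params") throw new FormatException("Expected 'params'");
        SkipSpaces();
        while (pos < text.Length && text[pos] == '(')
        {
            ParsePair(result);
            SkipSpaces();
        }
        Expect(')');
        return result;
    }

    private void ParsePair(Dictionary<string, double> result)
    {
        Expect('(');
        string name = ReadToken();
        string number = ReadToken();
        Expect(')');
        result[name] = double.Parse(number, CultureInfo.InvariantCulture);
    }

    // Consumes the given character (after optional spaces) or throws.
    private void Expect(char c)
    {
        SkipSpaces();
        if (pos >= text.Length || text[pos] != c)
            throw new FormatException("Expected '" + c + "' at position " + pos);
        pos++;
    }

    // Reads characters up to the next space or parenthesis.
    private string ReadToken()
    {
        SkipSpaces();
        int start = pos;
        while (pos < text.Length && text[pos] != ' ' && text[pos] != '(' && text[pos] != ')') pos++;
        if (pos == start) throw new FormatException("Expected a token at position " + pos);
        return text.Substring(start, pos - start);
    }

    private void SkipSpaces()
    {
        while (pos < text.Length && text[pos] == ' ') pos++;
    }
}

static class ParserDemo
{
    static void Main()
    {
        var parsed = ParamsParser.Parse("(params (abc 1.3)(sdc 2.0))");
        Console.WriteLine(parsed["abc"] + " " + parsed["sdc"]);
    }
}
```

The parser walks the string once and reports the position of the first thing it does not understand.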
Here's an out-of-the-box approach:
join adjacent pairs with a comma, convert () to {} and [SPACE] to ":", then use System.Web.Script.Serialization.JavaScriptSerializer.DeserializeObject
// note: this relies on the serializer accepting unquoted member names
string s = "(params (abc 1.3)(sdc 2.0))"
.Replace(")(", ",")
.Replace(" ", ":")
.Replace("(", "{")
.Replace(")","}");
return new System.Web.Script.Serialization.JavaScriptSerializer().DeserializeObject(s);
