How to split string based on multiple Keywords? - c#

I have a string containing values separated by the special character ';', and I need to split it and store each value in a separate string. In my ipStr, keywords like ServerName, DBName, TableNames and ColumnNames are identifiers that will not change; only the values may change.
For example:
string ipStr = "ServerName=DevTestServer;DBName=CustomerSummary;TableNames=CustomerDetails&OrderDetails;ColumnNames=ID,CustName,OrderID;";
Now I want to split out the ServerName, DBName, TableNames and ColumnNames values and store each value in a different string. I tried the code below, but after finding ServerName, identifying the DBName part looks difficult, and it does not look like a proper way of coding.
string ServerIdentifier = "ServerName=";
string separator = ";";
string serverName = ipStr.Substring(ipStr.IndexOf(ServerIdentifier), ipStr.IndexOf(separator));
What is the easiest way to get values like the ones below from ipStr?
string ServerName="DevTestServer";
string DBName="CustomerSummary";
string TableNames="CustomerDetails&OrderDetails";
string ColumnNames="ID,CustName,OrderID";

SqlConnectionStringBuilder won't work because ServerName etc. isn't a valid token in a connection string.
However, a low-tech approach is to use a good old-fashioned Split and ToDictionary:
var someWeirdStr = "ServerName=DevTestServer;DBName=CustomerSummary;TableNames=CustomerDetails&OrderDetails;ColumnNames=ID,CustName,OrderID;";
var results = someWeirdStr
.Split(';',StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.Split('='))
.ToDictionary(x => x[0], x => x.ElementAtOrDefault(1));
Console.WriteLine(results["ServerName"]);
Console.WriteLine(results["DBName"]);
Console.WriteLine(results["TableNames"]);
Console.WriteLine(results["ColumnNames"]);
Output
DevTestServer
CustomerSummary
CustomerDetails&OrderDetails
ID,CustName,OrderID

You need to split the string on the semicolon and remove any empty strings. After that, you can split each part on the equals sign and create a dictionary of the results.
string ipStr = "ServerName=DevTestServer;DBName=CustomerSummary;TableNames=CustomerDetails&OrderDetails;ColumnNames=ID,CustName,OrderID;";
var values = ipStr.Split(';')
.Where(x => !string.IsNullOrEmpty(x))
.Select(x => {
var pair = x.Split('=');
return KeyValuePair.Create<string, string>(pair[0], pair[1]);
})
.ToDictionary(pair => pair.Key, pair => pair.Value);
foreach (var i in values) {
Console.WriteLine($"{i.Key}: {i.Value}");
}
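The Split-based answers above assume that a value never contains '='. A minimal sketch, assuming the same key=value; layout as ipStr, that splits each pair on the first '=' only. The KeyValueParser class and its Parse method are hypothetical names chosen for illustration, not part of any library:

```csharp
using System;
using System.Collections.Generic;

public static class KeyValueParser
{
    // Hypothetical helper: parses "key=value;key=value;" into a dictionary.
    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        foreach (string part in input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first '=' only, so a value may itself contain '='.
            string[] pair = part.Split(new[] { '=' }, 2);
            string key = pair[0].Trim();
            // A part without '=' is stored with an empty value (an assumption of this sketch).
            string value = pair.Length > 1 ? pair[1] : string.Empty;
            result[key] = value; // a repeated key overwrites the earlier value
        }
        return result;
    }

    public static void Main()
    {
        var values = Parse("ServerName=DevTestServer;Filter=a=b;");
        Console.WriteLine(values["ServerName"]); // DevTestServer
        Console.WriteLine(values["Filter"]);     // a=b
    }
}
```

Using the indexer instead of ToDictionary means a duplicate key does not throw; whether overwriting is the right behavior depends on the data.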

Here is a working demo:
string ipStr = "ServerName=DevTestServer;DBName=CustomerSummary;TableNames=CustomerDetails&OrderDetails;ColumnNames=ID,CustName,OrderID;";
Dictionary<string, string> dict = Regex
.Matches(ipStr, @"\s*(?<key>[^;=]+)\s*=\s*((?<value>[^'][^;]*)|'(?<value>[^']*)')")
.Cast<Match>()
.ToDictionary(m => m.Groups["key"].Value,m => m.Groups["value"].Value);

Related

What is elegant way to create array from list?

I have this string:
"(Id=7) OR (Id=6) OR (Id=8)"
from the string above how can I create array or list like this:
"Id=6"
"Id=7"
"Id=8"
Without using Regex but with some Linq you could write
string test = "(Id=7) OR (Id=6) OR (Id=8)";
var result = test
.Split(new string[] { " OR "}, StringSplitOptions.None)
.Select(x => x.Trim('(', ')'))
.ToList();
If you also need to take into consideration the presence of the AND operator, or a variable number of spaces between the AND/OR and the conditions, then you could change the code to this:
string test = "(Id=7) OR (Id=6) OR (Id=8)";
var result = test
.Split(new string[] { "OR", "AND"}, StringSplitOptions.None)
.Select(x => x.Trim('(', ')', ' '))
.ToList();
I suggest combining regex and LINQ powers:
var result = Regex.Matches(input, @"\(([^()]+)\)")
.Cast<Match>()
.Select(p => p.Groups[1].Value)
.ToList();
The \(([^()]+)\) pattern (see its demo) will match all (...) strings and use the Group 1 (inside unescaped (...)) to build the final list.
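For reference, a self-contained sketch of that pattern applied to the string from the question. The ParenthesisDemo class and ExtractGroups method are hypothetical names used only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ParenthesisDemo
{
    // Hypothetical helper: returns the text found inside each (...) pair.
    public static List<string> ExtractGroups(string input)
    {
        // \( and \) match literal parentheses; ([^()]+) captures what lies between them.
        return Regex.Matches(input, @"\(([^()]+)\)")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        foreach (string item in ExtractGroups("(Id=7) OR (Id=6) OR (Id=8)"))
            Console.WriteLine(item); // prints Id=7, Id=6, Id=8 on separate lines
    }
}
```

The matches come back in the order they appear in the input; if a sorted list is wanted, an OrderBy call before ToList would be one option.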
Simply grab the matches
(?<=\()[^)]*(?=\))
See demo.
https://regex101.com/r/iJ7bT6/18
string strRegex = @"(?<=\()[^)]*(?=\))";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline);
string strTargetString = @"(Id=7) OR (Id=6) OR (Id=8)";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success)
{
// Add your code here
}
}

Identifying and grouping similar items in a collection of strings

I have a collection of strings like the following:
List<string> codes = new List<string>
{
"44.01", "44.02", "44.03", "44.04", "44.05", "44.06", "44.07", "44.08", "46", "47.10"
};
Each string is made up of two components separated by a full stop - a prefix code and a subcode. Some of the strings don't have sub codes.
I want to be able to combine the strings whose prefixes are the same and output them as follows, along with the other codes:
44(01,02,03,04,05,06,07,08),46,47.10
I'm stuck at the first hurdle of this, which is how to identify and group together the codes whose prefix values are the same, so that I can combine them into a single string as you can see above.
You can do:
var query = codes.Select(c =>
new
{
SplitArray = c.Split('.'), //to avoid multiple split
Value = c
})
.Select(c => new
{
Prefix = c.SplitArray.First(), //you can avoid multiple split if you split first and use it later
PostFix = c.SplitArray.Last(),
Value = c.Value,
})
.GroupBy(r => r.Prefix)
.Select(grp => new
{
Key = grp.Key,
Items = grp.Count() > 1 ? String.Join(",", grp.Select(t => t.PostFix)) : "",
Value = grp.First().Value,
});
This is how it works:
Split each item in the list on the delimiter and populate an anonymous type with the prefix, postfix and original value.
Then group on Prefix.
Finally, select the values and combine the postfix values using string.Join.
For output:
foreach (var item in query)
{
if(String.IsNullOrWhiteSpace(item.Items))
Console.WriteLine(item.Value);
else
Console.WriteLine("{0}({1})", item.Key, item.Items);
}
Output would be:
44(01,02,03,04,05,06,07,08)
46
47.10
Try this:-
var result = codes.Select(x => new { SplitArr = x.Split('.'), OriginalValue = x })
.GroupBy(x => x.SplitArr[0])
.Select(x => new
{
Prefix= x.Key,
subCode = x.Count() > 1 ?
String.Join(",", x.Select(z => z.SplitArr[1])) : "",
OriginalValue = x.First().OriginalValue
});
You can print your desired output like this:-
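foreach (var item in result)
{
// Fall back to the original value when there are no grouped sub codes.
if (String.IsNullOrEmpty(item.subCode))
Console.Write("{0},", item.OriginalValue);
else
Console.Write("{0}({1}),", item.Prefix, item.subCode);
}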
Working Fiddle.
Outlined idea:
Use Dictionary<string, List<string>> for collecting your result
In a loop over your list, use string.Split(); the first element will be your Dictionary key. Create a new List<string> there if the key doesn't exist yet
If the result of the split has a second element, append that to the List
Use a second loop to format that Dictionary into your output string
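The outlined steps can be sketched as follows. This is only one possible reading of the outline: the CodeGrouper class, the Combine method and the separate order list (added because Dictionary does not guarantee enumeration order) are assumptions of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CodeGrouper
{
    // Hypothetical helper: combines codes such as "44.01" and "44.02" into "44(01,02)".
    public static string Combine(IEnumerable<string> codes)
    {
        var groups = new Dictionary<string, List<string>>();
        var order = new List<string>(); // remembers the first-seen order of prefixes

        // First loop: collect the sub codes per prefix.
        foreach (string code in codes)
        {
            string[] parts = code.Split('.');
            if (!groups.TryGetValue(parts[0], out List<string> subCodes))
            {
                subCodes = new List<string>();
                groups[parts[0]] = subCodes;
                order.Add(parts[0]);
            }
            if (parts.Length > 1) subCodes.Add(parts[1]);
        }

        // Second pass: format each prefix according to how many sub codes it has.
        var formatted = order.Select(prefix =>
        {
            List<string> subCodes = groups[prefix];
            if (subCodes.Count == 0) return prefix;
            if (subCodes.Count == 1) return prefix + "." + subCodes[0];
            return prefix + "(" + string.Join(",", subCodes) + ")";
        });
        return string.Join(",", formatted);
    }

    public static void Main()
    {
        var codes = new List<string>
        {
            "44.01", "44.02", "44.03", "44.04", "44.05", "44.06", "44.07", "44.08", "46", "47.10"
        };
        Console.WriteLine(Combine(codes)); // 44(01,02,03,04,05,06,07,08),46,47.10
    }
}
```

Sub codes are kept as strings, so leading zeros survive. A bare prefix that also appears with sub codes (for example "44" and "44.01") is simply absorbed into the group.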
Of course, linq is possible too, e.g.
List<string> codes = new List<string>() {
"44.01", "44.05", "47", "42.02", "44.03" };
var result = string.Join(",",
codes.OrderBy(x => x)
.Select(x => x.Split('.'))
.GroupBy(x => x[0])
.Select((x) =>
{
if (x.Count() == 0) return x.Key;
else if (x.Count() == 1) return string.Join(".", x.First());
else return x.Key + "(" + string.Join(",", x.Select(e => e[1]).ToArray()) + ")";
}).ToArray());
Gotta love linq ... haha ... I think this is a monster.
You can do it all in one clever LINQ:
var grouped = codes.Select(x => x.Split('.'))
.Select(x => new
{
Prefix = int.Parse(x[0]),
Subcode = x.Length > 1 ? int.Parse(x[1]) : (int?)null
})
.GroupBy(k => k.Prefix)
.Select(g => new
{
Prefix = g.Key,
Subcodes = g.Where(s => s.Subcode.HasValue).Select(s => s.Subcode)
})
.Select(x =>
x.Prefix +
(x.Subcodes.Count() == 1 ? string.Format(".{0}", x.Subcodes.First()) :
x.Subcodes.Count() > 1 ? string.Format("({0})", string.Join(",", x.Subcodes))
: string.Empty)
).ToArray();
First it splits each string into code and subcode.
Then it groups by code and gets all subcodes as a collection.
Finally it selects each group in the appropriate format.
Looking at the problem, I think you should stop just before the last Select and let the data presentation be done in another part/method of your application.
The old fashioned way:
List<string> codes = new List<string>() {"44.01", "44.05", "47", "42.02", "44.03" };
string output = "";
for (int i = 0; i < codes.Count; i++)
{
string[] items = (codes[i] + "..").Split('.'); // padding guarantees that items[1] exists
int pos1 = output.IndexOf("," + items[0] + "(");
if (pos1 < 0) output += "," + items[0] + "(" + items[1] + ")"; // first occurrence of code: add it
else
{ // Code already inserted: find the insert point
int pos2 = output.IndexOf(')', pos1); // closing parenthesis of this code's group
output = output.Substring(0, pos2) + "," + items[1] + output.Substring(pos2);
}
}
if (output.Length > 0) output = output.Substring(1).Replace("()", "");
This will work, including the correct formats for no subcodes, a single subcode, multiple subcodes. It also doesn't assume the prefix or subcodes are numeric, so it leaves leading zeros as is. Your question didn't show what to do in the case you have a prefix without subcode AND the same prefix with subcode, so it may not work in that edge case (44,44.01). I have it so that it ignores the prefix without subcode in that edge case.
List<string> codes = new List<string>
{
"44.01", "44.02", "44.03", "44.04", "44.05", "44.06", "44.07", "44.08", "46", "47.10"
};
var result=codes.Select(x => (x+".").Split('.'))
.Select(x => new
{
Prefix = x[0],
Subcode = x[1]
})
.GroupBy(k => k.Prefix)
.Select(g => new
{
Prefix = g.Key,
Subcodes = g.Where(s => s.Subcode!="").Select(s => s.Subcode)
})
.Select(x =>
x.Prefix +
(x.Subcodes.Count() == 0 ? string.Empty :
string.Format(x.Subcodes.Count()>1?"({0})":".{0}",
string.Join(",", x.Subcodes)))
).ToArray();
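General idea, but I'm sure replacing the Substring calls with Regex would be a lot better as well:
// Requires: using System.Linq; using System.Text;
// Assumes every prefix is exactly two characters long; a code without a sub code yields "46()".
List<string> newCodes = new List<string>();
foreach (string sub1 in codes.Select(item => item.Substring(0, 2)).Distinct())
{
StringBuilder code = new StringBuilder();
code.Append(sub1 + "(");
// Substring(2) keeps the leading '.', so trim it before joining the sub codes.
code.Append(string.Join(",", codes.Where(item => item.Substring(0, 2) == sub1).Select(item => item.Substring(2).TrimStart('.'))));
code.Append(")");
newCodes.Add(code.ToString());
}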
You could go a couple of ways. I could see you making a Dictionary<string,List<string>> so that you could have "44" map to a list of {".01", ".02", ".03", etc.}. This would require you to process the codes before adding them to this list (i.e. separating out the two parts of the code and handling the case where there is only one part).
Or you could put them into a SortedSet and provide your own IComparer<string> which knows that these are codes and how to sort them (at least that'd be more reliable than grouping them alphabetically). Iterating over this SortedSet would still require special logic, though, so perhaps the Dictionary to List option above is still preferable.
In either case you would still need to handle the special case "46", where there is no second element in the code. In the dictionary example, would you insert a String.Empty into the list? I'm not sure what you'd output if you got a list {"46", "46.1"}: would you display it as "46(null,1)", "46(0,1)", "46(,1)" or "46(1)"?

Array of concatenated strings into Dictionary<int, int>

I have an array of strings. Each string is two numbers separated with a "|".
How can I get this array of strings into a Dictionary<int,int> without looping through the array, splitting each string and adding to the dictionary?
Is there a better way?
simply,
var result = strings
.Select(s => s.Split('|'))
.ToDictionary(a => int.Parse(a[0]), a => int.Parse(a[1]));
if duplicates are allowed,
var result = strings
.Select(s => s.Split('|'))
.ToLookup(a => int.Parse(a[0]), a => int.Parse(a[1]));
You can use ToDictionary method:
var dictionary = stringArray.ToDictionary(x => x.Split('|')[0], x => x.Split('|')[1]);
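But you should be aware that this produces a Dictionary<string, string> (wrap each part in int.Parse to get int keys and values), splits every string twice, and will throw an ArgumentException if there are duplicate keys.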
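A minimal sketch, assuming every string is a well-formed "int|int" pair, that produces a Dictionary<int, int> and tolerates duplicate keys by keeping the last value seen. The PairParser class, the ToIntDictionary method and the "last value wins" rule are assumptions made for illustration; the question does not say how duplicates should be treated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairParser
{
    // Hypothetical helper: turns strings such as "1|10" into dictionary entries.
    public static Dictionary<int, int> ToIntDictionary(IEnumerable<string> items)
    {
        return items
            .Select(s => s.Split('|'))                            // split each string once
            .GroupBy(a => int.Parse(a[0]), a => int.Parse(a[1]))  // group values by key
            .ToDictionary(g => g.Key, g => g.Last());             // keep the last value per key
    }

    public static void Main()
    {
        var dict = ToIntDictionary(new[] { "1|10", "2|20", "1|30" });
        Console.WriteLine(dict[1]); // 30
        Console.WriteLine(dict[2]); // 20
    }
}
```

int.Parse throws a FormatException on malformed input; int.TryParse would be the usual alternative if the data cannot be trusted.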

Using Dictionary to count the number of appearances

My problem is that I am trying to take a body of text from a text box, for example:
"Spent the day with "insert famous name" '#excited #happy #happy"
Then I want to count how many times each hashtag appears in the body, which can be any length of text.
So the above would return this:
excited = 1
happy = 2
I was planning on using a dictionary, but I am not sure how I would implement the search for the hashtags and add them to the dictionary.
This is all I have so far
string body = txtBody.Text;
Dictionary<string, string> dic = new Dictionary<string, string>();
foreach(char c in body)
{
}
thanks for any help
This can be achieved with a couple of LINQ methods:
var text = "Spent the day with <insert famous name> #excited #happy #happy";
var hashtags = text.Split(new[] { ' ' })
.Where(word => word.StartsWith("#"))
.GroupBy(hashtag => hashtag)
.ToDictionary(group => group.Key, group => group.Count());
Console.WriteLine(string.Join("; ", hashtags.Select(kvp => kvp.Key + ": " + kvp.Value)));
This will print
#excited: 1; #happy: 2
This will find any hashtags in a string of the form a hash followed by one or more non-whitespace characters and create a dictionary of them versus their count.
You did mean Dictionary<string, int> really, didn't you?
var input = "Spent the day with \"insert famous name\" '#excited #happy #happy";
Dictionary<string, int> dic =
Regex
.Matches(input, @"(?<=\#)\S+")
.Cast<Match>()
.Select(m => m.Value)
.GroupBy(s => s)
.ToDictionary(g => g.Key, g => g.Count());
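For comparison, a loop-based sketch closer to the foreach attempt in the question. It assumes a hashtag is whatever follows the first '#' in a whitespace-separated word; the HashtagCounter class and Count method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class HashtagCounter
{
    // Hypothetical helper: counts hashtags (without the leading '#') in a body of text.
    public static Dictionary<string, int> Count(string body)
    {
        var counts = new Dictionary<string, int>();
        // A null separator array makes Split break on whitespace.
        foreach (string word in body.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            int hash = word.IndexOf('#');
            if (hash < 0 || hash == word.Length - 1) continue; // no '#', or nothing after it
            string tag = word.Substring(hash + 1);
            counts.TryGetValue(tag, out int current); // current stays 0 for a new tag
            counts[tag] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        var counts = Count("Spent the day with \"insert famous name\" '#excited #happy #happy");
        foreach (var pair in counts)
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

Looking for '#' anywhere in the word, rather than only at the start, is what lets the '#excited token from the question (with its leading apostrophe) be counted.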

Regex Split String at particular word

I would like to use the .net Regex.Split method to split this input string into an array. It must group the word.
Input: **AAA**-1111,**AAA**-666,**SMT**-QWQE,**SMT**-TTTR
Expected output:
**AAA** : 1111,666
**SMT** : QWQE,TTTR
What pattern do I need to use?
As the comment on the question notes, you cannot do this in a single step (regex or not).
So:
Split on commas.
Split on dash (but keep the pairs)
Group by the first part of each pair.
Something like:
var result = from outer in input.Split(',')
let p = outer.Split('-') // will be string[2]
select new { identifier = p[0], value = p[1] }
into pair
group pair.value by pair.identifier into g
select new {
identifier = g.Key,
values = String.Join(",", g)
};
This should give you an IEnumerable with a key string and a string listing the values for each key, separated by commas:
var input = "AAA-1111,AAA-666,SMT-QWQE,SMT-TTTR";
var list = input.Split(',')
.Select(pair => pair.Split('-'))
.GroupBy(pair => pair.First())
.Select(grp =>
new{
key = grp.Key,
items = String.Join(",", grp.Select(x => x[1]))
});
You can then use it for example like this (if you just want to output the values):
string output = "";
foreach(var grp in list)
{
output += grp.key + ": " + grp.items + Environment.NewLine;
}
FWIW here's the same solution in fluent syntax which might be easier to understand:
string input = "AAA-1111,AAA-666,SMT-QWQE,SMT-TTTR";
Dictionary<string, string> output = input.Split(',') // first split by ','
.Select(el => el.Split('-')) // then split each inner element by '-'
.GroupBy(el => el.ElementAt(0), el => el.ElementAt(1)) // group by the part that comes before '-'
.ToDictionary(grp => grp.Key, grp => string.Join(",", grp)); // convert to a dictionary with comma separated values
output["AAA"] // 1111,666
output["SMT"] // QWQE,TTTR
