Split string around colon - c#

Could you help me split a string into key-value pairs around the colon delimiter? I am having trouble with this.
e.g.
"somekey:value1 value2 another:<value3 one_more:value4..value5"
output
<"somekey", "value1 value2">
<"another", "<value3">
<"one_more", "value4..value5">

Use this if you just want a simple conversion; you can also use a regex.
private static Dictionary<string, string> Dictionary(string str)
{
    var dictionary = new Dictionary<string, string>();
    string key = null;
    foreach (var item in str.Split(' '))
    {
        if (item.Contains(":"))
        {
            // A token containing a colon starts a new key/value pair.
            // Split on the first colon only, so the value may contain colons.
            var split = item.Split(new[] { ':' }, 2);
            key = split[0];
            // The indexer overwrites a repeated key instead of throwing like Add.
            dictionary[key] = split[1];
        }
        else if (key != null)
        {
            // Any other token continues the value of the current key.
            dictionary[key] += " " + item;
        }
    }
    return dictionary;
}

The regex extracting such key-value pairs is
([^\s:]+):(.*?)(?=\s+[^\s:]+:|$)
The tricky part here is the (?=\s+[^\s:]+:|$) lookahead, which tells the lazy "match anything for the value" group ((.*?)) to stop as soon as it reaches the next key preceded by whitespace (\s+[^\s:]+:) or the end of the string ($).
Then the match groups can be extracted as follows:
var input = "somekey:value1 value2 another:<value3 one_more:value4..value5";
var matches = Regex.Matches(input, @"([^\s:]+):(.*?)(?=\s+[^\s:]+:|$)");
var pairs = matches.Select(m => (m.Groups[1].Value, m.Groups[2].Value));
foreach (var (key, value) in pairs)
{
Console.WriteLine($"<\"{key}\": \"{value}\">");
}
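If the pairs are needed as a dictionary rather than printed, the same pattern can feed one directly. This is only a minimal sketch, assuming .NET 6 or later (top-level statements); ParsePairs is a hypothetical helper name, and keeping the last value for a repeated key is an assumption rather than something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// ParsePairs is a hypothetical helper name, not part of the answer above.
static Dictionary<string, string> ParsePairs(string input)
{
    var result = new Dictionary<string, string>();
    // Same pattern as above: key, colon, then a lazy value that stops
    // before the next "key:" or at the end of the string.
    foreach (Match m in Regex.Matches(input, @"([^\s:]+):(.*?)(?=\s+[^\s:]+:|$)"))
    {
        // Assumption: a repeated key keeps its last value (the indexer overwrites).
        result[m.Groups[1].Value] = m.Groups[2].Value;
    }
    return result;
}

var parsed = ParsePairs("somekey:value1 value2 another:<value3 one_more:value4..value5");
foreach (var pair in parsed)
{
    Console.WriteLine($"<\"{pair.Key}\", \"{pair.Value}\">");
}
```

Using the indexer instead of Add is a deliberate choice here: Add would throw an ArgumentException if the same key appeared twice in the input.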

You can try this regex.
string givenString =
@"key1:value1 value2 key2:<value3 key3:value4..value5";
Dictionary<string, string> result1 = Regex
.Split(givenString, "([a-z0-9]+:)")
.Skip(1) // will skip the first empty
.Select((item, index) => new {
value = item.Trim(),
index = index / 2
})
.GroupBy(item => item.index)
.ToDictionary(chunk => chunk.First().value.TrimEnd(':'),
chunk => chunk.Last().value);

Related

Convert string to dictionary key value - split issue

I am trying to convert a string to a key-value pair for testing purposes. The issue I have is that when I split the string, the value can be null on rare occasions.
For example:
"Sent On\r\n2021-01-31 09:18:42"
"Priority\r\nLow"
When I use the following code, it works fine for all records except when the value is null. There will always be a key.
Dictionary<string, string> details = new Dictionary<string, string>();
foreach (var row in Rows)
{
var text = row.Text.Replace("\r\n", ",");
var splitText = text.IndexOf(",");
var key = text.Substring(0, splitText);
var value = text.Substring(splitText + 1);
details.Add(key, value);
}
return details;
The issue is when the text looks like this and has only a key and no value. I can't split the text on '\r\n' either, as there is no value:
"Read On"
How can I modify my code to check for this scenario?
Give this a go:
foreach (var row in Rows)
{
var parts = row.Text.Split(new [] { Environment.NewLine }, StringSplitOptions.None);
var key = parts[0];
var value = parts.Length > 1 ? parts[1] : null;
details.Add(key, value);
}
Or even this:
Dictionary<string, string> details =
Rows
.Select(row => row.Text.Split(new [] { Environment.NewLine }, StringSplitOptions.None))
.ToDictionary(parts => parts[0], parts => parts.Length > 1 ? parts[1] : null);
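Both versions rely on Environment.NewLine, which is "\n" rather than "\r\n" on Linux and macOS. The following is a minimal sketch that splits on the literal "\r\n" from the question instead. It assumes .NET 6 or later (top-level statements), uses plain strings in place of the question's Rows collection, and SplitRow is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

// SplitRow is a hypothetical helper; it returns a null value for key-only rows.
static KeyValuePair<string, string> SplitRow(string text)
{
    // Limit the split to two parts so a value containing "\r\n" stays whole.
    var parts = text.Split(new[] { "\r\n" }, 2, StringSplitOptions.None);
    return new KeyValuePair<string, string>(parts[0], parts.Length > 1 ? parts[1] : null);
}

var details = new Dictionary<string, string>();
foreach (var text in new[] { "Sent On\r\n2021-01-31 09:18:42", "Priority\r\nLow", "Read On" })
{
    var row = SplitRow(text);
    details[row.Key] = row.Value;
}

foreach (var entry in details)
{
    Console.WriteLine($"{entry.Key} = {entry.Value ?? "(none)"}");
}
```

Whether a missing value should be stored as null or as an empty string depends on how the dictionary is consumed later; null is used here only because the answers above do the same.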

Reading characters from a string and counting each one of them

The issue I have with my code is the following: I cannot work out how to read each character of the input and keep a single running count for each one, so that the totals can be printed at the end. Here is my code:
class Program
{
static void Main()
{
SortedDictionary<string, int> text = new SortedDictionary<string, int>();
string[] characters = Console.ReadLine()
.Split()
.ToArray();
foreach (var character in characters)
{
if (text.ContainsKey(character))
{
text[character]++;
}
else
{
text.Add(character, 1);
}
}
foreach (var character in text)
{
Console.WriteLine($"{character.Key} -> {character.Value}");
}
}
}
This code counts how many times each string occurs in the dictionary. What I need, as described above, is different. Please help, thanks!
String.Split() with no arguments splits on whitespace, so characters contains whole words (or the entire line if it has no spaces) rather than individual characters. If you want each of the characters, just get rid of the Split (and change the dictionary's key type to char to match the values):
SortedDictionary<char, int> text = new SortedDictionary<char, int>();
char[] characters = Console.ReadLine().ToArray();
// ...
https://www.ideone.com/hnMSv1
Since string implements IEnumerable<char> you actually don't even need to convert the characters into an array:
SortedDictionary<char, int> text = new SortedDictionary<char, int>();
string line = Console.ReadLine();
foreach( char character in line )
// ...
https://www.ideone.com/nLyBfC
You can use LINQ here, because string implements IEnumerable<char>:
string str = "aaabbc";
var res = str
.GroupBy(c => c)
.ToDictionary(g => g.Key, g => g.Count());
The example below shows how to do it without converting to a dictionary, by projecting to an anonymous type and sorting by character count in descending order:
var res2 = str
.GroupBy(c => c)
.Select(d => new { d.Key, Count = d.Count() })
.OrderByDescending(x => x.Count);
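For comparison, the loop from the question can be kept almost unchanged once it iterates over characters instead of words. This is a minimal sketch, assuming .NET 6 or later (top-level statements) and a fixed input string in place of Console.ReadLine(); CountCharacters is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

// CountCharacters is a hypothetical helper name.
static SortedDictionary<char, int> CountCharacters(string line)
{
    var counts = new SortedDictionary<char, int>();
    foreach (char c in line)
    {
        // TryGetValue leaves current at 0 when the character has not been seen yet.
        counts.TryGetValue(c, out int current);
        counts[c] = current + 1;
    }
    return counts;
}

foreach (var entry in CountCharacters("aaabbc"))
{
    Console.WriteLine($"{entry.Key} -> {entry.Value}");
}
```

The SortedDictionary keeps the output ordered by character, as in the question; a plain Dictionary would work the same way if ordering does not matter.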

Retrieving Numeric value before a decimal in a string value

I am working on a routine in C#.
I have a list of alphanumeric sheet numbers, and I would like to retrieve the number before the decimal point from each one to use in my routine.
FP10.01-->10
M1.01-->1
PP8.01-->8
If possible, how can something like this be achieved as either a string or integer?
You could use a regex:
Regex r = new Regex("([0-9]+)[.]");
string s = "FP10.01";
var result = Convert.ToInt32(r.Match(s).Groups[1].ToString()); //10
Or, without a regex, split on the dot and keep only the digits:
string input = "FP10.01";
string[] _input = input.Split('.');
string num = find(_input[0]);
public string find(string input)
{
char[] _input = input.ToArray();
int number;
string result = null;
foreach (var item in _input)
{
if (int.TryParse(item.ToString(), out number) == true)
{
result = result + number;
}
}
return result;
}
To accumulate the resulting elements into a list, you can do something like:
List<string> myList = new List<string>(){ "FP10.01","M1.01", "PP8.01"};
List<int> resultSet =
myList.Select(e =>
Regex.Replace(e.Substring(0, e.IndexOf('.')), @"[^\d]", string.Empty))
.Select(int.Parse)
.ToList();
This takes each element in myList, takes the substring from index 0 up to (but not including) the '.', replaces every non-digit character with string.Empty, parses the result into an int, and stores it in a list.
Another variant would be:
List<int> resultSet =
myList.Select(e => e.Substring(0, e.IndexOf('.')))
.Select(e => string.Join(string.Empty, e.Where(char.IsDigit)))
.Select(int.Parse)
.ToList();
Or, if you want the elements to be strings, you could do:
List<string> resultSet =
myList.Select(e => e.Substring(0, e.IndexOf('.')))
.Select(e => string.Join(string.Empty, e.Where(char.IsDigit)))
.ToList();
To retrieve a single element of type string then you can create a helper function as such:
public static string GetValueBeforeDot(string input){
return input.Substring(0, input.IndexOf('.'))
.Where(char.IsDigit)
.Aggregate(string.Empty, (e, a) => e + a);
}
To retrieve a single element of type int then the helper function should be:
public static int GetValueBeforeDot(string input){
return int.Parse(input.Substring(0, input.IndexOf('.'))
.Where(char.IsDigit)
.Aggregate(string.Empty, (e, a) => e + a));
}
This approach removes alphabetic characters by replacing them with an empty string. Splitting on the '.' character then leaves you with a two-element array holding the digits before the decimal point at index 0 and the digits after it at index 1.
string input = "FP10.01";
var result = Regex.Replace(input, @"([A-Za-z]+)", string.Empty).Split('.');
var beforeDecimalNumbers = result[0]; // 10
var afterDecimalNumbers = result[1]; // 01
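The approaches above assume every input contains a '.'; when it does not, IndexOf returns -1 and Substring throws, and the regex version fails when converting an empty group. A minimal sketch of a Try-style variant, assuming .NET 6 or later (top-level statements); TryGetNumberBeforeDot is a hypothetical helper name and "COVER" is a made-up input without a dot.

```csharp
using System;
using System.Text.RegularExpressions;

// TryGetNumberBeforeDot is a hypothetical helper name.
static bool TryGetNumberBeforeDot(string input, out int number)
{
    // Capture the run of digits that sits immediately before a dot.
    var match = Regex.Match(input, "([0-9]+)[.]");
    if (match.Success)
    {
        // TryParse also guards against digit runs too long for an int.
        return int.TryParse(match.Groups[1].Value, out number);
    }
    number = 0;
    return false;
}

foreach (var sheet in new[] { "FP10.01", "M1.01", "PP8.01", "COVER" })
{
    Console.WriteLine(TryGetNumberBeforeDot(sheet, out int n) ? $"{sheet} -> {n}" : $"{sheet} -> no match");
}
```

Note that this only takes the digits directly before the dot, so an input such as "A1B2.01" would yield 2, whereas the digit-filtering variants above would yield 12.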

How to get the indexes and reoccurrences of a specific character?

Suppose a string contains one or more characters that occur repeatedly, as in the following string:
1+1+1-2+2/2*4-2*3/23
In the string above, + occurs 3 times, at indexes 1, 3 and 7, and - occurs 2 times, at indexes 5 and 13, and so on for the other characters. How can I find these indexes and store them in a two-dimensional array?
The following function will return all matched indices for a given search string:
List<int> GetAllIndices(string input, string search)
{
List<int> result = new List<int>();
int index = input.IndexOf(search);
while(index != -1)
{
result.Add(index);
index++;//increment to avoid matching the same index again
if(index >= input.Length)//check if index is greater than string (causes exception)
break;
index = input.IndexOf(search, index);
}
return result;
}
It should also handle overlapping matches, for example: searching "iii" for occurrences of "ii" will return [0,1]
If you want to use this function to create a list of symbols and their indices then I would recommend the following approach:
string input = "1+1+1-2+2/2*4-2*3/23";
//create a dictionary to store the results
Dictionary<string, List<int>> results = new Dictionary<string, List<int>>();
//add results for + symbol
results.Add("+", GetAllIndices(input, "+"));
//add results for - symbol
results.Add("-", GetAllIndices(input, "-"));
//you can then access all indices for a given symbol like so
foreach(int index in results["+"])
{
//do something with index
}
You could even go a step further and wrap that in a function that searches for multiple symbols:
Dictionary<string, List<int>> GetSymbolMatches(string input, params string[] symbols)
{
Dictionary<string, List<int>> results = new Dictionary<string, List<int>>();
foreach(string symbol in symbols)
{
results.Add(symbol, GetAllIndices(input, symbol));
}
return results;
}
Which you can then use like so:
string input = "1+1+1-2+2/2*4-2*3/23";
Dictionary<string, List<int>> results = GetSymbolMatches(input, "+", "-", "*", "/");
foreach(int index in results["+"])
{
//do something with index
}
With Linq:
var allIndices = yourString.Select((c, i) => new { c, i, })
.Where(a => a.c == '+').Select(a => a.i);
To get a dictionary with all characters in the string, for example:
var allCharsAllIndices = yourString.Select((c, i) => new { c, i, })
.GroupBy(a => a.c)
.ToDictionary(g => g.Key, g => g.Select(a => a.i).ToArray());
You can also try this, replacing VALUE with the character you are looking for:
var duplicates = param1.ToCharArray().Select((item, index) => new { item, index })
.Where(x =>x.item==VALUE).GroupBy(g=>g.index)
.Select(g => new { Key = g.Key })
.ToList();
string msg = "1+1+1-2+2/2*4-2*3/23";
Dictionary<char, List<int>> list = new Dictionary<char, List<int>>();
for (int i = 0; i < msg.Length; i++)
{
if (!list.ContainsKey(msg[i]))
{
list.Add(msg[i], new List<int>());
list[msg[i]].Add(i);
}
else
list[msg[i]].Add(i);
}
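Simple is best. This version passes a start index to IndexOf instead of calling Substring, so it allocates no intermediate strings:
public static IEnumerable<int> GetIndexOfEvery(string haystack, string needle)
{
    int index = 0;
    // Search from the current position; IndexOf returns -1 when nothing is left.
    while ((index = haystack.IndexOf(needle, index)) != -1)
    {
        yield return index;
        index++; // step past this match so overlapping matches are still found
    }
}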

Dictionary<string, int> increase value

I have a Dictionary<string, int> and I am reading some strings from a list... I want to add them in the dictionary, but if the string is already in the dictionary, I want its value to increase by 1.
The code I tried is as below, but there are some strings that are increased with every input.. Is something wrong?
Dictionary<string, int> dictionary = new Dictionary<string, int>();
foreach (String recordline in tags)
{
String recordstag = recordline.Split('\t')[1];
String tagToDic = recordstag.Substring(0, (recordstag.Length-1) );
if (dictionary.ContainsKey(tagToDic) == false)
{
dictionary.Add(tagToDic, 1);
}
else
{
try
{
dictionary[tagToDic] = dictionary[tagToDic] + 1;
}
catch (KeyNotFoundException ex)
{
System.Console.WriteLine("X" + tagToDic + "X");
dictionary.Add(tagToDic, 1);
}
}
}
EDIT: To answer your comments... I am removing the last char of the string because it is always a blank space...
My input is like:
10000301 business 0 0,000
10000301 management & auxiliary services 0 0,000
10000316 demographie 0 0,000
10000316 histoire de france 0 0,000
10000347 economics 0 0,000
10000347 philosophy 1 0,500
and I want only the strings like "business" or "management & auxiliary services" etc.
You are splitting each line of the input array on tabs and selecting the second field. Then you remove the last character of that field using Substring. Hence all strings that differ only in their last character are treated as the same key and incremented together. That's why you might be seeing "some strings that are increased with every input".
EDIT: If the purpose of removing the last char is to remove a trailing space, use String.Trim instead.
Another edit: use TryGetValue instead of ContainsKey, which avoids a second lookup when incrementing the value. The code below has been updated.
Try this:
Dictionary<string, int> dictionary = new Dictionary<string, int>();
foreach(string recordline in tags)
{
string recordstag = recordline.Split('\t')[1].Trim();
int value;
if (!dictionary.TryGetValue(recordstag, out value))
dictionary.Add(recordstag, 1);
else
dictionary[recordstag] = value + 1;
}
No need for a dictionary, can be solved using this Linq query.
(Assuming you want the complete string after \t)
var q =
from s in tags.Select (t => t.Substring(t.IndexOf("\t") + 1))
group s by s into g
select new
{
g.Key,
Count = g.Count()
};
And if you need it as a dictionary just add:
var dic = q.ToDictionary (x => x.Key, x => x.Count);
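Your input string is first split, and then a substring of the result is assigned to tagToDic, so several different strings may end up with the same tagToDic.
Extension method (it must be declared in a static class):
public static void Increment(this Dictionary<string, int> dictionary, string key)
{
    // TryGetValue sets val to 0 when the key is missing,
    // so a new key ends up with a count of 1.
    dictionary.TryGetValue(key, out int val);
    dictionary[key] = val + 1;
}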
Dictionary<string, int> dictionary = new Dictionary<string, int>();
// fill with some data
dictionary.Increment("someKey");
It's probably easier just to re-add the dictionary value after you retrieve the count from the existing one.
Here's some pseudocode to handle the lookup logic.
Dictionary<string, int> _dictionary = new Dictionary<string, int>();
private void AdjustWordCount(string word)
{
int count;
bool success = _dictionary.TryGetValue(word, out count);
if (success)
{
//Remove it
_dictionary.Remove(word);
//Add it back in plus 1
_dictionary.Add(word, count + 1);
}
else //could not get, add it with a count of 1
{
_dictionary.Add(word, 1);
}
}
How about:
Dictionary<string, int> dictionary = new Dictionary<string, int>();
string delimitedTags = "some tab delimited string";
List<string> tags = delimitedTags.Split(new char[] {'\t'}, StringSplitOptions.None).ToList();
foreach (string tag in tags.Distinct())
{
dictionary.Add(tag, tags.Where(t => t == tag).Count());
}
If you have them in a list you could just group them and make your list.
// Take the second tab-separated field; Trim removes the trailing space.
list.GroupBy(recordline => recordline.Split('\t')[1].Trim(),
    (key, ienum) => new { word = key, count = ienum.Count() });
Then you can put that in a dictionary or iterate it or something.
Your dictionary code looks like it will function the way you expect.
My best guess is that your string-splitting code is not working correctly.
You'd have to give us some sample inputs to verify this though.
Anyway, your entire block of code could be simplified and rewritten with LINQ as:
var dictionary = tags
.Select(t => {
var recordstag = t.Split('\t')[1];
return recordstag.Substring(0, recordstag.Length-1);
})
.GroupBy(t => t)
.ToDictionary(k => k.Key, v => v.Count())
;
