c# : avoid branching while iterating on a dictionary


I have the code below, in which I branch for each sample in a dictionary. Is there a way, using LINQ or some other method, to avoid the branching, perhaps with a functional approach?
Dictionary<string, int> samples = new Dictionary<string, int>()
{
{"a", 1},
{"aa", 2},
{"b", 1},
{"bb", 3}
};
foreach (var sample in samples)
{
if (sample.Value == 1)
{
Console.WriteLine("sample passed");
}
else if (sample.Value == 2)
{
Console.WriteLine("sample isolated");
}
else if (sample.Value == 3)
{
Console.WriteLine("sample biased");
}
}
UPD
What if I have another type of comparison:
foreach (var sample in samples)
{
if (sample.Value <= 1)
{
Console.WriteLine("sample passed");
}
else if (sample.Value <= 2)
{
Console.WriteLine("sample isolated");
}
else if (sample.Value <= 3)
{
Console.WriteLine("sample biased");
}
}

One option would be to create a list of Actions that you wish to perform, then execute them based on the index. This way your methods can be quite varied. If you need to perform very similar actions for each option, then storing a list of values would be better than storing Actions.
List<Action> functions = new List<Action>();
functions.Add(() => Console.WriteLine("sample passed"));
functions.Add(() => Console.WriteLine("sample isolated"));
functions.Add(() => Console.WriteLine("sample biased"));
foreach (var sample in samples)
{
Action actionToExecute = functions[sample.Value - 1];
actionToExecute();
}
If you wanted to use a dictionary as your comment implies:
Dictionary<int, Action> functions = new Dictionary<int, Action>();
functions.Add(1, () => Console.WriteLine("sample passed"));
functions.Add(2, () => Console.WriteLine("sample isolated"));
functions.Add(3, () => Console.WriteLine("sample biased"));
foreach (var sample in samples)
{
Action actionToExecute = functions[sample.Value];
actionToExecute();
}

For this concrete case you can introduce another map(Dictionary or an array, as I did):
Dictionary<string, int> samples = new Dictionary<string, int>()
{
{"a", 1},
{"aa", 2},
{"b", 1},
{"bb", 3}
};
var map = new []
{
"sample passed",
"sample isolated",
"sample biased"
};
foreach (var sample in samples)
{
Console.WriteLine(map[sample.Value - 1]);
}
As for actual code, it depends heavily on your use cases and how you want to handle faulty situations.
UPD
It seems that if you use a dictionary for your map there will still be some branching, but if there are no misses, branch prediction should take care of it.
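For the range comparisons in the question's update (<= 1, <= 2, <= 3), the same lookup-table idea can be sketched with an ordered list of upper bounds. This is only an illustration under assumptions: the names SampleClassifier, Thresholds and Classify are made up here, and returning null for values above every bound is one possible choice for handling misses.

```csharp
using System;
using System.Linq;

public static class SampleClassifier
{
    // Hypothetical table: ordered (upper bound, label) pairs taken from the question's update.
    private static readonly (int UpperBound, string Label)[] Thresholds =
    {
        (1, "sample passed"),
        (2, "sample isolated"),
        (3, "sample biased"),
    };

    // Returns the label of the first bound that is >= value,
    // or null when the value is larger than every bound.
    public static string Classify(int value)
    {
        return Thresholds
            .Where(t => value <= t.UpperBound)
            .Select(t => t.Label)
            .FirstOrDefault();
    }
}
```

The loop body then becomes Console.WriteLine(SampleClassifier.Classify(sample.Value)). The comparison still happens inside Where, so the branching is moved into data rather than removed.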

So you have a Dictionary<string, int>. Every item in the dictionary is a KeyValuePair<string, int>. I assume that the string is the name of the sample (identifier), and the int is a number that says something about the sample:
if the number equals 0 or 1, the sample is qualified as Passed;
if the number equals 2, then you call it Isolated
if the number equals 3, then you call it Biased.
All higher numbers are not interesting for you.
You want to group the samples in Passed / Isolated / Biased samples.
Whenever you have a sequence of similar items and you want to make groups of items, where every element has something in common with the other elements in the group, consider using one of the overloads of Enumerable.GroupBy
Let's first define an enum to hold your qualifications, and a method that converts the integer value of the sample into the enum:
enum SampleQualification
{
Passed,
Isolated,
Biased,
}
SampleQualification FromNumber(int number)
{
switch (number)
{
case 2:
return SampleQualification.Isolated;
case 3:
return SampleQualification.Biased;
default:
return SampleQualification.Passed;
}
}
Ok, so you have your dictionary of samples, where every key is a name of the sample and the value is a number that can be converted to a SampleQualification.
Dictionary<string, int> samples = ...
var qualifiedSamples = samples // implements IEnumerable<KeyValuePair<string, int>>
// keep only samples with Value 0..3
.Where(sample => 0 <= sample.Value && sample.Value <= 3)
// Decide where the sample is Passed / Isolated / Biased
.Select(sample => new
{
Qualification = FromNumber(sample.Value),
Name = sample.Key, // the name of the sample
Number = sample.Value,
})
// Make groups of Samples with same Qualification:
.GroupBy(
// KeySelector: make groups with same qualification:
sample => sample.Qualification,
// ResultSelector: take the qualification, and all samples with this qualification
// to make one new:
(qualification, samplesWithThisQualification) => new
{
Qualification = qualification,
Samples = samplesWithThisQualification.Select(sample => new
{
Name = sample.Name,
Number = sample.Number,
})
.ToList(),
});
The result is a sequence of items. Where every item has a property Qualification, which holds Passed / Isolated / Biased. Every item also has a list of samples that have this qualification.
// Process Result
foreach (var qualifiedSample in qualifiedSamples)
{
Console.WriteLine("All samples with qualification " + qualifiedSample.Qualification);
foreach (var sample in qualifiedSample.Samples)
{
Console.WriteLine("{0} - {1}", sample.Name, sample.Number);
}
}

Related

Iterate a list and keep an index based on the name of the item

I have a list of items which have names and I need to iterate them, but I also need to know how many times this item with the same name it is. So this is an example:
-----
|1|A|
|2|B|
|3|C|
|4|C|
|5|C|
|6|A|
|7|B|
|8|C|
|9|C|
-----
So, when I'm iterating and I'm on row 1, I want to know it is the first time it is an A, when I'm on row 6, I want to know it is the second time, when I'm on row 9, I want to know it is the 5th C, etc. How can I achieve this? Is there some index I can keep track of? I was also thinking of filling a hash while iterating, but perhaps that's too much.

You can use a Dictionary<char, int> to keep a count of each character in your list.
The key is the character and the value is the number of occurrences of that character in the list.
Dictionary<char, int> occurances = new Dictionary<char, int>();
List<char> elements = new List<char>{'A', 'B','C','C','C','A','B', 'C', 'C'};
int result = 0;
foreach(char element in elements)
{
if(occurances.TryGetValue(element, out result))
occurances[element] = result + 1;
else
occurances.Add(element, 1);
}
foreach(KeyValuePair<char, int> kv in occurances)
Console.WriteLine("Key: "+ kv.Key + " Value: "+kv.Value);
Output:
Key: A Value: 2
Key: B Value: 2
Key: C Value: 5
POC: dotNetFiddler

Use dictionary to keep track of counter.
List<string> input = new List<string> { "A", "B", "C", "C", "C", "A", "B", "C", "C" };
Dictionary<string, int> output = new Dictionary<string, int>();
foreach(var item in input)
{
if (output.ContainsKey(item))
{
output[item] = output[item] + 1;
}
else
{
output.Add(item, 1);
}
}
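The loops above end with the total per name. If the occurrence number is needed for each row while iterating, as the question asks, the same dictionary can be read on every step. A minimal sketch, assuming the rows are plain strings; OccurrenceCounter and WithRunningCount are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OccurrenceCounter
{
    // Yields each item together with how many times it has appeared so far (1-based).
    public static IEnumerable<(string Item, int Occurrence)> WithRunningCount(IEnumerable<string> items)
    {
        var seen = new Dictionary<string, int>();
        foreach (var item in items)
        {
            // count stays 0 when the key has not been seen yet
            seen.TryGetValue(item, out int count);
            seen[item] = ++count;
            yield return (item, count);
        }
    }
}
```

For the example rows A, B, C, C, C, A, B, C, C, row 6 yields ("A", 2) and row 9 yields ("C", 5).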

I think you need a reverse (inverted) index rather than a row-store index.
A row-store index looks like the table you described, while a reverse index maps each term to the rows where it occurs.
Probably like this:
A 1,6
B 2,7
C 3,4,5,8,9
Search engines such as Elasticsearch and Solr store terms this way.
If you are in C#, a Dictionary<string, List<int>> is a good fit; you can keep your reverse-indexed data there.
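A rough sketch of building such a Dictionary<string, List<int>> from the rows, using 1-based row numbers as in the question's table. The class and method names (InvertedIndex, Build) are illustrative only:

```csharp
using System;
using System.Collections.Generic;

public static class InvertedIndex
{
    // Maps each name to the 1-based row numbers where it occurs.
    public static Dictionary<string, List<int>> Build(IReadOnlyList<string> rows)
    {
        var index = new Dictionary<string, List<int>>();
        for (int i = 0; i < rows.Count; i++)
        {
            if (!index.TryGetValue(rows[i], out var positions))
            {
                positions = new List<int>();
                index[rows[i]] = positions;
            }
            positions.Add(i + 1);
        }
        return index;
    }
}
```

With the example data, index["C"] holds 3, 4, 5, 8, 9, so index["C"].IndexOf(9) + 1 tells you that row 9 is the 5th C.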

The clean way is to implement your own list whose items are your own objects. With this approach you add an extra property to your object and provide your own Add() method. The new class should inherit from List and hide the Add() method of List (it is not virtual, so it cannot be overridden).
I implemented this myself; you can use it. Keep in mind that this is only one of several possible solutions. However, I think it is one of the best with respect to SOLID and OO principles.
public class CounterIterator : List<Item>
{
public new void Add(Item item)
{
base.Add(item);
foreach (var listItem in this)
{
if (listItem.Equals(item))
{
item.CountRepeat++;
}
}
}
}
public class Item
{
public Item(string value)
{
Value = value;
}
public string Value { get; private set; }
public int CountRepeat { get; set; }
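public override bool Equals(object obj)
{
var item = obj as Item;
return item != null && Value.Equals(item.Value);
}
// Keep GetHashCode consistent with Equals so Item also works in hash-based collections.
public override int GetHashCode()
{
return Value.GetHashCode();
}
}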
I tested the code above. It is an extension of List which has an added behavior. If anyone thinks it is not a correct answer, please mention me in comments. I will try to clarify the issue.

Searching for dictionary keys contained in a string array

I have a List of strings where each item is a free text describing a skill, so looks kinda like this:
List<string> list = new List<string> {"very good right now", "pretty good",
"convinced me that is good", "pretty medium", "just medium" .....}
And I want to keep a user score for these free texts. So for now, I use conditions:
foreach (var item in list)
{
if (item.Contains("good"))
{
score += 2.5;
Console.WriteLine("good skill, score+= 2.5, is now {0}", score);
}
else if (item.Contains("low"))
{
score += 1.0;
Console.WriteLine("low skill, score+= 1.0, is now {0}", score);
}
}
Suppose in the future I want to use a dictionary for the score mapping, such as:
Dictionary<string, double> dic = new Dictionary<string, double>
{ { "good", 2.5 }, { "low", 1.0 }};
What would be a good way to cross between the dictionary values and the string list? The way I see it now is do a nested loop:
foreach (var item in list)
{
foreach (var key in dic.Keys)
if (item.Contains(key))
score += dic[key];
}
But I'm sure there are better ways. Better being faster, or more pleasant to the eye (LINQ) at the very least.
Thanks.

var scores = from item in list
from word in item.Split()
join kvp in dic on word equals kvp.Key
select kvp.Value;
var totalScore = scores.Sum();
Note: your current solution checks whether the item in the list contains a key from the dictionary. But it returns true even if the key is only part of some word in the item. E.g. "follow the rabbit" contains "low". Splitting the item into words solves this issue.
Also, LINQ join uses a hash-based lookup internally to match items of the first sequence against the second. That gives you O(1) lookups instead of the O(N) you pay when enumerating all entries of the dictionary.
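The whole-word matching described above can also be sketched with a direct dictionary lookup per word instead of a join. This is an assumption-laden illustration: SkillScorer and ScoreByWords are hypothetical names, words are split on single spaces only, and matching is case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkillScorer
{
    // Sums the score of every whole word that appears as a key in the map.
    public static double ScoreByWords(IEnumerable<string> texts, IReadOnlyDictionary<string, double> scores)
    {
        return texts
            .SelectMany(text => text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Sum(word => scores.TryGetValue(word, out var value) ? value : 0.0);
    }
}
```

With this version "follow the rabbit" contributes nothing, because "low" never appears as a separate word.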

If your code finds N skill strings containing the word "good", then it adds the score 2.5 N times.
So you can just count the skill strings containing each dictionary word and multiply that count by the corresponding score.
var scores = from pair in dic
let word = pair.Key
let score = pair.Value
let count = list.Count(x => x.Contains(word))
select score * count;
var totalScore = scores.Sum();

It's not really faster, but you can use LINQ:
score = list.Select(s => dic.Where(d => s.Contains(d.Key))
.Sum(d => d.Value))
.Sum();
Note that your example loop will hit 2 different keys if the string matches both; I kept that in my solution.

Well, you aren't really using the Dictionary as a dictionary, so we can simplify this a bit with a new class:
class TermValue
{
public string Term { get; set; }
public double Value { get; set; }
public TermValue(string t, double v)
{
Term = t;
Value = v;
}
}
With that, we can be a bit more direct:
void Main()
{
var dic = new TermValue[] { new TermValue("good", 2.5), new TermValue("low", 1.0)};
List<string> list = new List<string> {"very good right now", "pretty good",
"convinced me that is good", "pretty medium", "just medium" };
double score = 0.0;
foreach (var item in list)
{
var entry = dic.FirstOrDefault(d =>item.Contains(d.Term));
if (entry != null)
score += entry.Value;
}
}
From here, we can just play a bit (the compiled code for this will probably be the same as above)
double score = 0.0;
foreach (var item in list)
{
score += dic.FirstOrDefault(d =>item.Contains(d.Term))?.Value ?? 0.0;
}
Then (in the words of the Purple One), we can go crazy:
double score = list.Aggregate(0.0,
(scre, item) =>scre + (dic.FirstOrDefault(d => item.Contains(d.Term))?.Value ?? 0.0));

Custom Ordering in C#

There is a list of Package items sorted by GUID, but I need to order them as follows:
KK %, AB, AB art, DD %, FV, ER, PP and WW
I have implemented as follows, but I wonder is there a better way of doing it?
List<PackageType> list = new List<PackageType> (8);
foreach (var a in mail.Package)
{
if (a.Name == "KK %")
list[0] = a;
else if (a.Name == "AB art")
list[1] = a;
else if (a.Name == "AB")
list[2] = a;
else if (a.Name == "DD %")
list[3] = a;
else if (a.Name == "FV")
list[4] = a;
else if (a.Name == "ER")
list[5] = a;
else if (a.Name == "PP")
list[6] = a;
else if (a.Name == "WW")
list[7] = a;
}

You can get this down to two lines (one for array definition, one for ordering):
var PackageOrder = new[] { "KK %", "AB", "AB art", "DD %", "FV", "ER", "PP", "WW"};
//...
var list = mail.Package.OrderBy(p => Array.IndexOf(PackageOrder, p.Name)).ToList();
But we can do even better.
The code so far either requires several O(n) lookups into the reference array, or a switch to a Dictionary<string,int>, which is O(1) per lookup but with a constant cost that might be disproportionate to the task. Each package item may need several of these lookups over the course of a sort operation, which means this might be less efficient than you want.
We can get around that like this:
private static string[] Names = new[] { "KK", "AB", "BC", "DD", "FV", "ER", "PP", "WW" };
//...
var list = mail.Package.
Select(p => new {Package = p, Index = Array.IndexOf(Names, p.Name)}).
OrderBy(p => p.Index).
Select(p => p.Package).ToList();
This guarantees only one lookup per package over the course of the sort. The idea is to first create a projection of the original data that also includes an index, then sort by the index, and finally project back to just the original data. Now the only question is whether to use an array or dictionary, which mainly depends on the size of the reference array (for this size data stick with the array, for more than about 15 items, switch to the dictionary; but it varies depending on the GetHashCode() performance of your type).
Of course, there's also YAGNI to consider. For large sets this will typically be much better, but for small data it might not be worth it, or if the data happens to be sorted in certain lucky ways it can make things slower. It can also make things slower if you are more constrained by memory pressure than CPU time (common on web servers). But in the general sense, it's a step in the right direction.
Finally, I question the need for an actual List<T> here. Just change the declaration to var and remove the .ToList() at the end. Wait to call ToList() or ToArray() until you absolutely need it, and work with the simple IEnumerable<T> until then. This can often greatly improve performance.
In this case (and for reference, I added this paragraph later on), it seems like you only have eight items total, meaning the extra code isn't really saving you anything. With that in mind, I'd just stick with the two-line solution at the top of this answer (when performance doesn't matter, go for less or simpler code).
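For the dictionary variant mentioned above, a sketch might look like the following. It sorts plain name strings to stay self-contained (PackageType is not shown in the question), and it sends names missing from the reference order to the end, which is a design choice of this sketch; Array.IndexOf returns -1 for missing names, which would place them first instead. CustomOrder and SortByReference are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CustomOrder
{
    // Sorts names by their position in 'order'; names not listed in 'order' go last.
    public static List<string> SortByReference(IEnumerable<string> names, IReadOnlyList<string> order)
    {
        // Build the name -> position map once, so each key lookup is a hash lookup.
        var rank = new Dictionary<string, int>();
        for (int i = 0; i < order.Count; i++) rank[order[i]] = i;

        return names
            .OrderBy(name => rank.TryGetValue(name, out var position) ? position : int.MaxValue)
            .ToList();
    }
}
```

For packages, the same key selector would be applied to p.Name inside mail.Package.OrderBy(...).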

// List<PackageType> list = ...;
var names = new[] { "KK", "AB", "BC", "DD", "FV", "ER", "PP", "WW" };
var sortedList = list.OrderBy(packageType => Array.IndexOf(names, packageType.Name));
Here's a longer version of the above that explains in more detail what's going on:
// this array contains all recognized keys in the desired order;
private static string[] Names = new[] { "KK", "AB", "BC", "DD", "FV", "ER", "PP", "WW" };
// this helper method will return the index of a `PackageType`'s `Name`
// in the above array, and thus a key by which you can sort `PackageType`s.
static int GetSortingKey(PackageType packageType)
{
var sortingKey = Array.IndexOf(Names, packageType.Name);
if (sortingKey == -1) throw new KeyNotFoundException();
return sortingKey;
}
// this is how you could then sort your `PackageType` objects:
// List<PackageType> list = ...;
IEnumerable<PackageType> sortedList = list.OrderBy(GetSortingKey);

Updated version of @stakx's answer. I think this is a better solution if these values are fixed, and the enum can also be used elsewhere. Each value in the enum has an int value by which it can be ordered.
public enum Names
{
KK, //0
AB, //1
BC, //2
DD, //3
FV, //4
ER, //5
PP, //6
WW //7
}
var packageList = list.OrderBy(p => p.Name);
UPDATE
Use example for enum in class
public class Package
{
public Names Name { get; set; }
}
If there is need to get string value of this enum, then just use this
// package is a variable of type Package
package.Name.ToString();
If you need whole list of enum names (enum key names), you can use Enum class method:
Enum.GetNames(Type enumType) which returns string array with all defined enum key names.
Enum.GetNames(typeof(Names))

An alternative.. (I haven't compiled it)
var indexPositions = new Dictionary<string, int> {
{ "KK", 0 },
{ "AB", 1 },
{ "BC", 2 },
{ "DD", 3 },
{ "FV", 4 },
{ "ER", 5 },
{ "PP", 6 },
{ "WW", 7 }
};
foreach (var package in mail.Package)
{ // access position
int index;
if (!indexPositions.TryGetValue(package.Name, out index)) {
throw new KeyNotFoundException();
}
list[index] = package;
}

In addition to the usual OrderBy with Array.IndexOf approach, the list can also be sorted in place:
string[] order = { "KK", "AB", "BC", "DD", "FV", "ER", "PP", "WW" };
list.Sort((a, b) => Array.IndexOf(order, a.Name).CompareTo(Array.IndexOf(order, b.Name)));
A bit more advanced O(n)ish alternative:
var lookup = list.ToLookup(x => x.Name);
list = order.SelectMany(x => lookup[x]).ToList();

List function, how to get an average of scores for each name- c# console application

I have a list in a C# console application. The list has items that look something like 'matt,5' 'matt,7' 'jack,4' 'jack,8' etc...
I want to combine the entries so that each name appears only once, with the numbers after it averaged, so it would be like 'matt,(5+7)/2', which would then display as 'matt,6'.
So far I have this...
currentFileReader = new StreamReader(file);
List<string> AverageList = new List<string>();
while (!currentFileReader.EndOfStream)
{
string text = currentFileReader.ReadLine();
AverageList.Add(text.ToString());
}
AverageList.GroupBy(n => n).Any(c => c.Count() > 1);
Not really sure where to go from here.

What you need is to Split each string item on ',', then group by the first element of the returned array and average the second element (after parsing it to int), something like:
List<string> AverageList = new List<string> { "matt,5", "matt,7", "jack,4", "jack,8" };
var query = AverageList.Select(s => s.Split(','))
.GroupBy(sp => sp[0])
.Select(grp =>
new
{
Name = grp.Key,
Avg = grp.Average(t=> int.Parse(t[1])),
});
foreach (var item in query)
{
Console.WriteLine("Name: {0}, Avg: {1}", item.Name, item.Avg);
}
and it will give you:
Name: matt, Avg: 6
Name: jack, Avg: 6
But, a better option would be to use a class with Name and Score properties instead of comma separated string values.
(The code above doesn't check for invalid input values).

Firstly you will want to populate your unformatted data into a List, as you can see I called it rawScores. You could then Split each line by the comma delimiting them. You can then check to see if an existing person is in your Dictionary and add his score to it, or if not create a new person.
After that you would simply have to generate the Average of the List.
Hope this helps!
var scores = new Dictionary<string, List<int>>();
var rawScores = new List<string>();
rawScores.ForEach(raw =>
{
var split = raw.Split(',');
if (scores.Keys.All(a => a != split[0]))
{
scores.Add(split[0], new List<int> {Convert.ToInt32(split[1])});
}
else
{
var existing = scores.FirstOrDefault(f => f.Key == split[0]);
existing.Value.Add(Convert.ToInt32(split[1]));
}
});
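The final averaging step is described above but not shown. A possible sketch of the whole pipeline follows, using TryGetValue for the dictionary lookup; ScoreAverages and Compute are hypothetical names, and like the code above it does not validate the input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoreAverages
{
    // Groups "name,score" lines by name and returns the average score per name.
    public static Dictionary<string, double> Compute(IEnumerable<string> rawScores)
    {
        var scores = new Dictionary<string, List<int>>();
        foreach (var raw in rawScores)
        {
            var split = raw.Split(',');
            if (!scores.TryGetValue(split[0], out var list))
            {
                list = new List<int>();
                scores[split[0]] = list;
            }
            list.Add(int.Parse(split[1]));
        }
        // Average() over a sequence of int returns a double.
        return scores.ToDictionary(pair => pair.Key, pair => pair.Value.Average());
    }
}
```

Calling ScoreAverages.Compute on the lines read from the file gives one entry per name, which can then be printed with a foreach.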

How to display all mistaken words

I have some text in richTextBox1.
I have to sort the words by their frequency and display them in richTextBox2. It seems to work.
Have to find all mistaken words and display them in richTextBox4. I'm using Hunspell.
Apparently I'm missing something. Almost all words are displayed in richTextBox4 not only the wrong ones.
Code:
foreach (Match match in wordPattern.Matches(str))
{
if (!words.ContainsKey(match.Value))
words.Add(match.Value, 1);
else
words[match.Value]++;
}
string[] words2 = new string[words.Keys.Count];
words.Keys.CopyTo(words2, 0);
int[] freqs = new int[words.Values.Count];
words.Values.CopyTo(freqs, 0);
Array.Sort(freqs, words2);
Array.Reverse(freqs);
Array.Reverse(words2);
Dictionary<string, int> dictByFreq = new Dictionary<string, int>();
for (int i = 0; i < freqs.Length; i++)
{
dictByFreq.Add(words2[i], freqs[i]);
}
Hunspell hunspell = new Hunspell("en_US.aff", "en_US.dic");
StringBuilder resultSb = new StringBuilder(dictByFreq.Count);
foreach (KeyValuePair<string, int> entry in dictByFreq)
{
resultSb.AppendLine(string.Format("{0} [{1}]", entry.Key, entry.Value));
richTextBox2.Text = resultSb.ToString();
bool correct = hunspell.Spell(entry.Key);
if (correct == false)
{
richTextBox4.Text = resultSb.ToString();
}
}

In addition to the above answer (which should work if your Hunspell.Spell method works correctly), I have a few suggestions to shorten your code. You are adding Matches to your dictionary, and counting the number of occurrences of each match. Then you appear to be sorting them in descending value of the frequency (so the highest occurrence match will have index 0 in the result). Here are a few code snippets which should make your function a lot shorter:
IOrderedEnumerable<KeyValuePair<string, int>> dictByFreq = words.OrderBy<KeyValuePair<string, int>, int>((KeyValuePair<string, int> kvp) => -kvp.Value);
This uses the .NET framework to do all your work for you. words.OrderBy takes a Func argument which provides the value to sort on. The problem with using the default values for this function is it wants to sort on the keys and you want to sort on the values. This function call will do exactly that. It will also sort them in descending order based on the values, which is the frequency with which a particular match occurred. It returns an IOrderedEnumerable object, which has to be stored. And since that is enumerable, you don't even have to put it back into a dictionary! If you really need to do other operations on it later, you can call dictByFreq.ToList(), which returns an object of type List<KeyValuePair<string, int>>.
So your whole function then becomes this:
foreach (Match match in wordPattern.Matches(str))
{
if (!words.ContainsKey(match.Value))
words.Add(match.Value, 1);
else
words[match.Value]++;
}
IOrderedEnumerable<KeyValuePair<string, int>> dictByFreq = words.OrderBy<KeyValuePair<string, int>, int>((KeyValuePair<string, int> kvp) => -kvp.Value);
Hunspell hunspell = new Hunspell("en_US.aff", "en_US.dic");
StringBuilder resultSb = new StringBuilder(); // IOrderedEnumerable has no Count property
foreach (KeyValuePair<string, int> entry in dictByFreq)
{
resultSb.AppendLine(string.Format("{0} [{1}]", entry.Key, entry.Value));
richTextBox2.Text = resultSb.ToString();
bool correct = hunspell.Spell(entry.Key);
if (correct == false)
{
richTextBox4.Text += entry.Key + Environment.NewLine; // append, so earlier words are kept
}
}

You are displaying in richTextBox4 the same text as in richTextBox2 :)
I think this should work:
foreach (KeyValuePair<string, int> entry in dictByFreq)
{
resultSb.AppendLine(string.Format("{0} [{1}]", entry.Key, entry.Value));
richTextBox2.Text = resultSb.ToString();
bool correct = hunspell.Spell(entry.Key);
if (correct == false)
{
richTextBox4.Text += entry.Key + Environment.NewLine;
}
}
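An alternative to appending to the text box inside the loop is to collect the misspelled words first and assign the text once. The sketch below takes the spell check as a delegate so it stays self-contained; passing hunspell.Spell for isCorrect is an assumption based on how the question calls it, and SpellReport and Misspelled are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SpellReport
{
    // isCorrect is a stand-in for a spell checker call such as hunspell.Spell.
    // Returns the misspelled words, one per line, in their original order.
    public static string Misspelled(IEnumerable<string> words, Func<string, bool> isCorrect)
    {
        return string.Join(Environment.NewLine, words.Where(word => !isCorrect(word)));
    }
}
```

Usage might look like richTextBox4.Text = SpellReport.Misspelled(dictByFreq.Select(e => e.Key), hunspell.Spell); which keeps the frequency order and sets the text box only once.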
