IEnumerable<string> to Dictionary<string, int> - C#

I'm using the following code to split an array of strings into a list of terms.
private List<string> GenerateTerms(string[] docs)
{
return docs.SelectMany(doc => ProcessDocument(doc)).Distinct().ToList();
}
private IEnumerable<string> ProcessDocument(string doc)
{
return doc.Split(' ')
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.Take(1000);
}
What I want to do is replace the returned list with a
Dictionary<string, int>
i.e. instead of returning a list, I want to return a dictionary.
Could anyone help? Thanks in advance.

string doc = "This is a test sentence with some words with some words repeating like: is a test";
var result = doc.Split(' ')
.GroupBy(word => word)
.OrderByDescending(g=> g.Count())
.Take(1000)
.ToDictionary(r => r.Key ,r=> r.Count());
EDIT:
I believe you want a final dictionary built from the array of strings, with words as keys and their total counts as values. Since a dictionary can't contain duplicate keys, you don't need Distinct.
You have to re-write your methods as:
private Dictionary<string,int> GenerateTerms(string[] docs)
{
List<Dictionary<string, int>> combinedDictionaryList = new List<Dictionary<string, int>>();
foreach (string str in docs)
{
//Add returned dictionaries to a list
combinedDictionaryList.Add(ProcessDocument(str));
}
//return a single dictionary from the list of dictionaries
return combinedDictionaryList
.SelectMany(dict=> dict)
.ToLookup(pair => pair.Key, pair => pair.Value)
.ToDictionary(group => group.Key, group => group.Sum(value => value));
}
private Dictionary<string,int> ProcessDocument(string doc)
{
return doc.Split(' ')
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Take(1000)
.ToDictionary(r => r.Key, r => r.Count());
}
Then you can call it like:
string[] docs = new[]
{
"This is a test sentence with some words with some words repeating like: is a test",
"This is a test sentence with some words with some words repeating like: is a test",
"This is a test sentence with some words",
"This is a test sentence with some words",
};
Dictionary<string, int> finalDictionary = GenerateTerms(docs);
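If you don't need the intermediate dictionaries, the totals can also be accumulated in a single pass. This is only a sketch under a few assumptions: TermCounter and CountTerms are made-up names, and it counts every word rather than keeping the top 1000 per document as the code above does.

```csharp
using System;
using System.Collections.Generic;

public static class TermCounter
{
    // Hypothetical helper: totals word occurrences across all documents.
    public static Dictionary<string, int> CountTerms(IEnumerable<string> docs)
    {
        var totals = new Dictionary<string, int>();
        foreach (string doc in docs)
        {
            // RemoveEmptyEntries skips the empty strings produced by repeated spaces.
            foreach (string word in doc.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                totals.TryGetValue(word, out int count); // count stays 0 when the word is new
                totals[word] = count + 1;
            }
        }
        return totals;
    }
}
```

Calling TermCounter.CountTerms(docs) with the docs array above would give one entry per distinct word, with its count summed over all four strings.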

Try this:
string[] docs = {"aaa bbb", "aaa ccc", "sss, ccc"};
var result = docs.SelectMany(doc => doc.Split())
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Take(1000)
.ToDictionary(g => g.Key, g => g.Count());
EDIT:
var result = docs.SelectMany(
doc => doc.Split()
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Take(1000))
.Select(g => new {Word = g.Key, Cnt = g.Count()})
.GroupBy(t => t.Word)
.ToDictionary(g => g.Key, g => g.Sum(t => t.Cnt));

Without any additional cruft the following should work.
return doc.Split(' ')
.GroupBy(word => word)
.ToDictionary(g => g.Key, g => g.Count());
Tailor it with Take, OrderBy, etc. as necessary for your situation.

Try something like this:
var keys = new List<string>();
var values = new List<string>();
var dictionary = keys.ToDictionary(x => x, x => values[keys.IndexOf(x)]);

Related

Calculate Mode Using LINQ C#

I'm new to LINQ and was wondering how I could print out multiple values for my mode. At the moment I can only get one value for the mode, but I want it to show all of them when several values are tied.
string[] list = TextBox1.Text.Split(new string[] { "," },
StringSplitOptions.RemoveEmptyEntries);
int[] numbers = new int[list.Length];
for (int i = 0; i < numbers.Length; i++)
{
numbers[i] = Convert.ToInt32(list[i].Trim());
}
int mode = numbers.GroupBy(v => v)
.OrderByDescending(g => g.Count())
.First()
.Key;
You need to save off the collection before taking the item(s) you want.
string[] list = TextBox1.Text.Split(new string[] { "," },
StringSplitOptions.RemoveEmptyEntries);
// Parse the strings first so the groups are keyed by int, as the declared types require.
IEnumerable<IGrouping<int, int>> modes = list.Select(s => Convert.ToInt32(s.Trim())).GroupBy(v => v);
IEnumerable<IGrouping<int, IGrouping<int, int>>> groupedModes = modes.GroupBy(v => v.Count());
var sortedGroupedModes = groupedModes.OrderByDescending(g => g.Key).ToList();
TextBox2.Text = string.Join(" ", sortedGroupedModes[0].Select(x => x.Key));
You could get all of the groups and just extract those with the highest count (including ties):
var counts = numbers.GroupBy(v => v)
.Select(g => new { g.Key, Count = g.Count() })
.OrderByDescending(g => g.Count)
.ToList();
var modes = counts.Where(g => g.Count == counts.First().Count)
.Select(g => g.Key);
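For reference, here is a self-contained sketch of the "keep every value tied for the highest count" idea. ModeFinder and Modes are names I made up for illustration. It uses Max instead of sorting, which is a slightly different route to the same result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModeFinder
{
    // Hypothetical helper: returns every value that occurs with the highest frequency.
    public static List<int> Modes(IEnumerable<int> numbers)
    {
        var counts = numbers.GroupBy(v => v)
                            .Select(g => new { Value = g.Key, Count = g.Count() })
                            .ToList();
        if (counts.Count == 0)
        {
            return new List<int>(); // no input, so no mode
        }
        int max = counts.Max(c => c.Count);
        return counts.Where(c => c.Count == max)
                     .Select(c => c.Value)
                     .ToList();
    }
}
```

The result could then be shown with something like string.Join(" ", ModeFinder.Modes(numbers)).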

Converting Tuple<List<Guid>, string> to Dictionary<Guid, List<string>>

I am trying to convert a List<Tuple<List<Guid>, string>> to a Dictionary<Guid, List<string>>. This is what I have so far:
var listOfTuples = GetListOfTuples(); // returns type List<Tuple<List<Guid>, string>>
var transformedDictionary = new Dictionary<Guid, List<string>>();
foreach (var listOfTuple in listOfTuples)
{
foreach (var key in listOfTuple.Item1)
{
if (!transformedDictionary.ContainsKey(key))
transformedDictionary[key] = new List<string> { listOfTuple.Item2 };
else transformedDictionary[key].Add(listOfTuple.Item2);
}
}
Is there a better way of doing this, perhaps using LINQ: SelectMany, GroupBy, or ToDictionary?
Update: I have tried this, but it is clearly not working:
listOfTuples.ToList()
.SelectMany(x => x.Item1,(y, z) => new { key = y.Item2, value = z })
.GroupBy(p => p.key)
.ToDictionary(x => x.Key, x => x.Select(m => m.key));
You are close. The problem is with selecting the right key and value:
var result = listOfTuples.SelectMany(t => t.Item1.Select(g => (g, str: t.Item2)))
.GroupBy(item => item.g, item => item.str)
.ToDictionary(g => g.Key, g => g.ToList());
The mistake is here (y, z) => new { key = y.Item2, value = z } - you want the key to be the Guid and therefore instead of it being Item2 it should be z which is the Guid. So you can go with the way I wrote it or just
(y, z) => new { key = z, value = y.Item2 }
Also, the .ToList() at the beginning is not needed, since you say GetListOfTuples already returns a list.
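Putting the corrected selector together, a runnable sketch might look like the following. TupleFlattener and ToDictionaryByGuid are hypothetical names used only for this example. It uses the two-argument SelectMany overload from the question with the key and value swapped into the right places.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleFlattener
{
    // Hypothetical helper: maps each Guid to all strings whose tuple lists that Guid.
    public static Dictionary<Guid, List<string>> ToDictionaryByGuid(
        IEnumerable<Tuple<List<Guid>, string>> tuples)
    {
        return tuples
            // Pair every Guid in Item1 with the string in Item2 of the same tuple.
            .SelectMany(t => t.Item1, (t, id) => new { Id = id, Text = t.Item2 })
            // Group the strings by Guid.
            .GroupBy(p => p.Id, p => p.Text)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```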

Linq Expression to Turn DataTable to Dictionary of <Key, List<Values>>

I am trying to convert a DataTable of the form
Key Value
1 A
1 B
1 C
2 X
2 Y
To a Dictionary
1 [A,B,C]
2 [X,Y]
The lambda expression I am using is
GetTable("..sql..").AsEnumerable()
.Select(r => new {Key = r.Field<int>("Key"), Val = r.Field<string>("Value")})
.GroupBy(g => g.Key)
.ToDictionary(a => a.Key, a => String.Join(",", a.Value))
But it fails with "Cannot convert lambda expression to type 'System.Collections.Generic.IEqualityComparer' because it is not a delegate type"
How can I accomplish this?
This does it:
GetTable("..sql..").AsEnumerable()
.Select(r => new {Key = r.Field<int>("Key"), Val = r.Field<string>("Value")})
.GroupBy(g => g.Key)
.ToDictionary(a => a.Key, a => String.Join(",", a.Select(x => x.Val).ToList()))
Here's another way you can do it...
GetTable("..sql..").AsEnumerable()
.GroupBy(x => x.Field<int>("Key"))
.ToDictionary(grp => grp.Key, x => x.Select(y => y.Field<string>("Value")).ToList());
var foo = GetTable("").AsEnumerable()
.ToLookup(x => x.Field<int>("Key"), x => x.Field<string>("Value"));
foreach(var x in foo)
{
foreach(var value in x)
{
Console.WriteLine(string.Format("{0} {1}", x.Key, value));
}
}

Converting a LINQ query into a Dictionary<string, string[]>

I've got a query that returns something of the following format:
{ "tesla", "model s" }
{ "tesla", "roadster" }
{ "honda", "civic" }
{ "honda", "accord" }
and I'd like to convert that to a dictionary of <string, string[]> like so:
{ "tesla" : ["model s", "roadster"], "honda" : ["civic", "accord"] }
I've tried with this:
var result = query.Select(q => new { q.Manufacturer, q.Car}).Distinct().ToDictionary(q => q.Manufacturer.ToString(), q => q.Car.ToArray());
but so far I am not having any luck. I think what this is doing is actually trying to add individual items like "tesla" : ["model s"] and "tesla" : ["roadster"] and that's why it's failing ... any easy way to accomplish what I am trying to do in LINQ?
You would need to group each item by the key first, then construct the dictionary:
result = query.Select(q => new { q.Manufacturer, q.Car}).Distinct()
.GroupBy(q => q.Manufacturer)
.ToDictionary(g => g.Key,
g => g.Select(q => q.Car).ToArray());
Of course, an ILookup<string, string> is much easier:
result = query.Select(q => new { q.Manufacturer, q.Car }).Distinct()
.ToLookup(q => q.Manufacturer, q => q.Car);
You're looking for ToLookup if you would like the results to be grouped into a dictionary-like object:
var result = query.Select(q => new { q.Manufacturer, q.Car})
.Distinct()
.ToLookup(q => q.Manufacturer.ToString(), q => q.Car);
Otherwise you will have to group the results first:
var result = query.Select(q => new { q.Manufacturer, q.Car })
.Distinct()
.GroupBy(q => q.Manufacturer)
.ToDictionary(gg => gg.Key,
gg => gg.Select(q => q.Car).ToArray());
What you want is GroupBy(), followed by ToDictionary().
Example:
var result = query.GroupBy(q => q.Manufacturer).ToDictionary(g => g.Key, g => g.Select(q => q.Car).ToArray());
What GroupBy() does is group all the elements that produce the same key from the key selector. So when you tell it to GroupBy(q => q.Manufacturer), all the elements that have the same Manufacturer are grouped together as an IGrouping<TKey, T>, which is an IEnumerable<T> with a Key property.
Use ToLookup:
var table = pairs.ToLookup(kvp => kvp.Key, kvp => kvp.Value);
foreach(var i in table["tesla"])
Console.WriteLine(i);
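One practical difference worth knowing: indexing a lookup with a key it doesn't contain returns an empty sequence, whereas a Dictionary indexer throws KeyNotFoundException. A small sketch of that behaviour follows. LookupDemo and CountFor are names invented for this example, and pairs is assumed to be a sequence of KeyValuePair<string, string>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Hypothetical helper: counts the values stored under a key in a lookup.
    public static int CountFor(IEnumerable<KeyValuePair<string, string>> pairs, string key)
    {
        ILookup<string, string> table = pairs.ToLookup(kvp => kvp.Key, kvp => kvp.Value);
        // A missing key yields an empty sequence, so this returns 0 instead of throwing.
        return table[key].Count();
    }
}
```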

How to select non-distinct elements along with their indexes

List<string> str = new List<string>() {
"Alpha", "Beta", "Alpha", "Alpha", "Gamma", "Beta", "XYZ" };
Expected output:
String | Indexes
----------------------------
Alpha | 0, 2, 3
Beta | 1, 5
Gamma and XYZ are distinct, so they are ignored.
I've done this by comparing the strings manually. Would it be possible to do it with LINQ in an easier way?
foreach (var grp in
str.Select((s, i) => new { s, i })
.ToLookup(pair => pair.s, pair => pair.i)
.Where(pair => pair.Count() > 1))
{
Console.WriteLine("{0}: {1}", grp.Key, string.Join(", ", grp));
}
Something like this should work:
var elements = str
.Select((Elem, Idx) => new {Elem, Idx})
.GroupBy(x => x.Elem)
.Where(x => x.Count() > 1);
If you want to get a Dictionary<string,List<int>> having the duplicated string as key and the indexes as value, just add
.ToDictionary(x => x.Key, x => x.Select(e => e.Idx).ToList() );
after Where()
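Assembled into one piece, that might look like the sketch below. DuplicateIndexer and DuplicateIndexes are made-up names. The only change from the snippet above is that the index is projected in GroupBy's element selector, so each group already holds ints.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateIndexer
{
    // Hypothetical helper: maps each repeated string to the positions where it occurs.
    public static Dictionary<string, List<int>> DuplicateIndexes(IEnumerable<string> items)
    {
        return items
            .Select((elem, idx) => new { elem, idx })  // attach the index to each element
            .GroupBy(x => x.elem, x => x.idx)          // group indexes by string
            .Where(g => g.Count() > 1)                 // keep only strings seen more than once
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```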
You can get the non-distinct strings by grouping, then you can get the index for each non-distinct string and group them to create an array for each string:
var distinct = new HashSet<string>(
str.GroupBy(s => s)
.Where(g => g.Count() > 1)
.Select(g => g.Key)
);
var index =
str.Select((s, i) => new {
Str = s,
Index = i
})
.Where(s => distinct.Contains(s.Str))
.GroupBy(i => i.Str).Select(g => new {
Str = g.Key,
Index = g.Select(s => s.Index).ToArray()
});
foreach (var i in index) {
Console.WriteLine("{0} : {1}", i.Str, String.Join(", ", i.Index.Select(n => n.ToString())));
}
Output:
Alpha : 0, 2, 3
Beta : 1, 5
