Counting words in a collection using LINQ - c#

I have a StringCollection object with 5 words in it, some of which are duplicates. I am trying to create a LINQ query that counts how many times each unique word occurs in the collection and outputs the counts to the console. So, for example, if my StringCollection has 'House', 'Car', 'House', 'Dog', 'Cat', then it should output like this:
House --> 2
Car --> 1
Dog --> 1
Cat --> 1
Any ideas on how to create a LINQ query to do this?

Try the following
var res = from word in col.Cast<string>()
group word by word into g
select new { Word = g.Key, Count = g.Count() };

var xs = new StringCollection { "House", "Car", "House", "Dog", "Cat" };
foreach (var g in xs.Cast<string>()
.GroupBy(x => x, StringComparer.CurrentCultureIgnoreCase))
{
Console.WriteLine("{0}: {1}", g.Key, g.Count());
}

Given that you are using StringCollection and want to ignore case, you'll need to use Enumerable.GroupBy with Enumerable.Cast:
var results = collection.Cast<string>().GroupBy(
i => i,
(word, words) => new { Word = word, Count = words.Count() },
StringComparer.CurrentCultureIgnoreCase
);
foreach(var wordPair in results)
Console.WriteLine("Word: \"{0}\" - Count: {1}", wordPair.Word, wordPair.Count);
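The reason Cast is needed can be shown in a small self-contained sketch. StringCollection only implements the non-generic IEnumerable, so the LINQ operators are not available on it until it is cast to IEnumerable<string>. The variable names (words, counts) are illustrative only, and the input deliberately mixes "House" and "house" to show the case-insensitive grouping:

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;

// StringCollection implements only the non-generic IEnumerable,
// so Cast<string>() is required before GroupBy can be called.
var words = new StringCollection { "House", "Car", "house", "Dog", "Cat" };

var counts = words.Cast<string>()
    .GroupBy(w => w, StringComparer.CurrentCultureIgnoreCase)
    .Select(g => new { Word = g.Key, Count = g.Count() })
    .ToList();

// The group key is the first spelling encountered ("House", not "house").
foreach (var c in counts)
    Console.WriteLine($"{c.Word} --> {c.Count}");
```

Groups come out in the order in which each key is first seen, which is why the output order follows the input.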

To build a single string value result...
var stringCollection = new[] { "House", "Car", "house", "Dog", "Cat" };
var result = stringCollection.Cast<string>().GroupBy(
k => k,
StringComparer.InvariantCultureIgnoreCase)
.Select(v => v.Key + " -->" + v.Count())
.Aggregate((l,r)=>l+" " + r);
//result = "House -->2 Car -->1 Dog -->1 Cat -->1"
To put each value on a different line...
var stringCollection = new[] { "House", "Car", "house", "Dog", "Cat" };
var result = stringCollection.Cast<string>().GroupBy(
k => k,
StringComparer.InvariantCultureIgnoreCase);
foreach (var value in result)
Console.WriteLine("{0} --> {1}", value.Key, value.Count());

foreach(var g in input.GroupBy(i => i.ToLower()).Select(i => new {Word = i.Key, Count = i.Count()}))
{
Console.WriteLine(string.Format("{0} -> {1}", g.Word, g.Count));
}

If you only need the number of distinct words rather than a count per word, it should be as simple as:
Console.WriteLine(stringCollection.Cast<string>().Distinct().Count());

var query =
from s in Collection
group s by s.Description into g
select new {word = g.Key, num = g.Count()};

Related

How to use linq to group a list of strings on only certain strings

example list of strings:
var test = new List<string>{
"hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"
};
I want to partition this list by "hdr*" so that each group contains elements...
"hdr1","abc","def","ghi"
"hdr2","lmn","opq",
"hdr3","rst","xyz"
I tried:
var result = test.GroupBy(g => g.StartsWith("hdr"));
but this gives me two groups
"hdr1","hdr2","hdr3"
"abc","def"..."xyz"
What is the proper LINQ statement I should use? Let me emphasize that the strings following "hdr*" could be anything. The only thing they have in common is that they follow "hdr*".
You get two groups because one group is the group of elements starting with "hdr" and the other group is the group of elements not starting with "hdr". StartsWith returns a bool, so this results in two groups having the Keys false and true.
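A minimal sketch of that behaviour, using the list from the question (the name byPrefix is just for illustration): because the key selector returns a bool, there can never be more than two keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var test = new List<string> { "hdr1", "abc", "def", "ghi", "hdr2", "lmn", "opq", "hdr3", "rst", "xyz" };

// The key is a bool, so every element lands in either the True or the False group.
var byPrefix = test.GroupBy(s => s.StartsWith("hdr", StringComparison.Ordinal)).ToList();

foreach (var g in byPrefix)
    Console.WriteLine($"{g.Key}: {string.Join(",", g)}");
// True: hdr1,hdr2,hdr3
// False: abc,def,ghi,lmn,opq,rst,xyz
```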
You can use statement blocks in LINQ. This enables us to do:
string header = null;
var groups = test
.Select(s => {
if (s.StartsWith("hdr")) header = s;
return s;
})
.Where(s => header != s)
.GroupBy(s => header);
We store the last header in header. The where clause eliminates the header itself, since the header is the group key.
The following test...
foreach (var g in groups) {
Console.WriteLine(g.Key);
foreach (var item in g) {
Console.WriteLine(" " + item);
}
}
... prints this with the given list:
hdr1
 abc
 def
 ghi
hdr2
 lmn
 opq
hdr3
 rst
 xyz
Instead, we can also create lists with the header as first element:
string header = null;
IEnumerable<List<string>> lists = test
.Select(s => {
if (s.StartsWith("hdr")) {
header = s;
}
return s;
})
.GroupBy(s => header)
.Select(g => g.ToList());
This test...
foreach (var l in lists) {
foreach (var item in l) {
Console.Write(item + " ");
}
Console.WriteLine();
}
... prints:
hdr1 abc def ghi
hdr2 lmn opq
hdr3 rst xyz
You could make a fancy extension method GroupWhen that starts a new group when it finds a matching item. Just like Enumerable.GroupBy, it will return a "list" of groups:
public static IEnumerable<IGrouping<int, T>> GroupWhen<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var i = 0;
// This method "marks" which group each item belongs in
// by creating a Tuple with the item and group number
IEnumerable<(T Item, int GroupNum)> Iterate()
{
foreach (var item in source)
{
if (predicate(item)) i++; // Start new group
yield return (item, i);
}
}
// Group items by the "mark" from above and only
// output the Item from the Tuple, since the
// GroupNum will be the 'int' key of the group
return Iterate().GroupBy(tup => tup.GroupNum, tup => tup.Item);
}
// Use like so:
var list = new List<string> {"hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"};
var groups = list.GroupWhen(s => s.StartsWith("hdr"));
Console.WriteLine(string.Join(",", groups.First()));
// hdr1,abc,def,ghi
Yes, you can do it with LINQ expressions. But I don't think it is much more readable than a foreach loop.
var test = new List<string>{
"hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"
};
int groupid = 0;
var result = test.GroupBy(t =>
{
if (t.StartsWith("hdr")) ++groupid;
return groupid;
}).ToList();
result.Select(t => string.Join(' ', t)).ToList().ForEach(Console.WriteLine);
/*
Outputs:
hdr1 abc def ghi
hdr2 lmn opq
hdr3 rst xyz
*/
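For comparison, here is a rough sketch of the plain foreach version that the LINQ solutions are being measured against. It is not taken from any of the answers; the groups variable and the rule that a list not starting with a header still opens a first group are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

var test = new List<string> { "hdr1", "abc", "def", "ghi", "hdr2", "lmn", "opq", "hdr3", "rst", "xyz" };

var groups = new List<List<string>>();
foreach (var s in test)
{
    // Open a new group at each header, or at the very first item
    // if the list does not begin with a header.
    if (s.StartsWith("hdr", StringComparison.Ordinal) || groups.Count == 0)
        groups.Add(new List<string>());

    groups[groups.Count - 1].Add(s);
}

foreach (var g in groups)
    Console.WriteLine(string.Join(" ", g));
```

Unlike the lambdas that mutate a captured variable, this version has no hidden state that depends on when or how often the query is enumerated.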

I am having trouble converting a linq list into a string list

I have two lists. The first is a list of strings, and I count the number of times each string repeats within it. In the foreach loop I convert those values and add them to my second list. However, when I print the items in the second list I get "System.Linq.Grouping`2[System.String,System.String]" for every item in the list... I cannot find a way to convert the LINQ values into string values.
//LIST #1
List<string> mlist = new List<string>(new string [] {"Line5","Line2", "Line3", "Line4",
"Line6", "Line5", "Line5", "Line5", "Line6", "Line6", "Line2" });
//LIST #2
List<string> newlist = new List<string>();
var g = mlist.GroupBy(i => i);
foreach (var grp in g)
{
Console.WriteLine("{0} {1}", grp.Key, grp.Count());
newlist.Add(grp.ToString());
}
foreach(string line in newlist)
{
Console.WriteLine(Convert.ToString(line));
}
There are many solutions to this; however, you can't just print an IGrouping<TKey, TElement> with Console.WriteLine(). You need to access its properties:
var groups = mlist.GroupBy(i => i);
foreach (var grp in groups)
    Console.WriteLine($"{grp.Key} --- {grp.Count()}");
Output
Line5 --- 4
Line2 --- 2
Line3 --- 1
Line4 --- 1
Line6 --- 3
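To see why the question's code printed a type name, here is a small sketch using the question's list. The grouping object does not override ToString(), so the default implementation returns the name of the runtime type; the exact name can differ between .NET versions, so it is only printed here, not relied on. The names typeName and readable are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var mlist = new List<string> { "Line5", "Line2", "Line3", "Line4",
    "Line6", "Line5", "Line5", "Line5", "Line6", "Line6", "Line2" };

var first = mlist.GroupBy(i => i).First();

// Default object.ToString(): the name of the grouping type, not its contents.
string typeName = first.ToString();

// Building the string from Key and Count() gives the readable form.
string readable = $"{first.Key} {first.Count()}";

Console.WriteLine(typeName);
Console.WriteLine(readable); // Line5 4
```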
Another easy way is to project to a ValueTuple, whose ToString override prints its elements in parentheses:
var groups = mlist
.GroupBy(i => i)
.Select(x => (x.Key, Count : x.Count()));
foreach (var grp in groups)
Console.WriteLine(grp);
Output
(Line5, 4)
(Line2, 2)
(Line3, 1)
(Line4, 1)
(Line6, 3)
Or customise the format:
var groups = mlist
    .GroupBy(i => i)
    .Select(x => (x.Key, Count : x.Count()));
foreach (var grp in groups)
    Console.WriteLine($"My words = {grp.Key}, occurrence = {grp.Count}");
Output
My words = Line5, occurrence = 4
My words = Line2, occurrence = 2
My words = Line3, occurrence = 1
My words = Line4, occurrence = 1
My words = Line6, occurrence = 3
You already know how to convert it into a readable string, because you did this in the Console.WriteLine call above it.
foreach (var grp in g)
{
Console.WriteLine("{0} {1}", grp.Key, grp.Count());
newlist.Add(string.Format("{0} {1}", grp.Key, grp.Count()));
}
Since this question has a LINQ tag, let's do it in one line:
List<string> newlist = g.Select(grp => $"{grp.Key} {grp.Count()}").ToList();
Try this out
List<string> mlist = new List<string>(new string [] {"Line5","Line2", "Line3", "Line4",
"Line6", "Line5", "Line5", "Line5", "Line6", "Line6", "Line2" });
//LIST #2
List<string> newlist = new List<string>();
var g = mlist.GroupBy(i => i);
foreach (var grp in g)
{
Console.WriteLine("{0} {1}", grp.Key, grp.Count());
newlist.Add($"{grp.Key} {grp.Count()}");
}
Console.WriteLine("New List");
foreach(string line in newlist)
{
Console.WriteLine(Convert.ToString(line));
}
Use this link for confirmation: https://dotnetfiddle.net/GhK6Tl

List.OrderByDescending Linq not working

Please have a look at the code below. I need the output OrderedListDesc = {7,6,5,4,1,2,3,8,9} instead of {4,5,6,7,1,2,3,8,9}.
List<long> List = new List<long>() { 1,2,4,5,3,8,6,7,9 };
List<long> ListAsc = new List<long>() { 4,5,6,7 };
List<long> ListDesc = new List<long>() { 7,6,5,4 };
var OrderedListAsc = List.OrderBy(b => ListAsc.FindIndex(a => a == b)).ToList();
foreach (var l in OrderedListAsc)
{
Console.Write(l+" ,");
}
Console.WriteLine();
var OrderedListDesc = List.OrderByDescending(b => ListDesc.FindIndex(a => a == b)).ToList();
foreach (var l in OrderedListDesc)
{
Console.Write(l + " ,");
}
It is really simple if you think about it:
The sort key for elements found in ListDesc should be the number itself (and 0 for everything else); then you get your result:
var OrderedListDesc = List.OrderByDescending(b => ListDesc.Any(a => a == b) ? b : 0).ToList();
foreach (var l in OrderedListDesc)
{
Console.Write(l + " ,");
}
If you want to see what's happening, that is, why you're getting things in the wrong order, run this:
foreach (var i in List)
{
Console.WriteLine("{0}, {1}", i, ListDesc.FindIndex(a => a == i));
}
There's no need for ListDesc anyway. Just use ListAsc:
var OrderedListDesc = List.OrderByDescending(b => ListAsc.FindIndex(a => a == b)).ToList();
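Or use ListDesc and call OrderBy rather than OrderByDescending:
var OrderedListDesc = List.OrderBy(b => ListDesc.FindIndex(a => a == b)).ToList();
Note that with OrderBy the elements missing from ListDesc (for which FindIndex returns -1) sort first, giving 1,2,3,8,9,7,6,5,4; map -1 to a large value if they must come last.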
The problem is that FindIndex returns -1 when an element is not found, so those elements appear first in ascending order. Assign the maximum value when the element is not found:
var OrderedListDesc = List.OrderBy(b =>
{
var index = ListDesc.FindIndex(a => a == b);
return index==-1? int.MaxValue : index;
}).ToList();
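The effect of the -1 sentinel can be checked side by side in a short sketch built on the question's data. The lowercase variable names (list, listDesc, naive, fixedOrder) are renamed for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<long> { 1, 2, 4, 5, 3, 8, 6, 7, 9 };
var listDesc = new List<long> { 7, 6, 5, 4 };

// Raw FindIndex keys: values not in listDesc get -1 and therefore sort first.
var naive = list.OrderBy(b => listDesc.FindIndex(a => a == b)).ToList();

// Mapping -1 to int.MaxValue pushes those values to the end.
// OrderBy is a stable sort, so they keep their original relative order.
var fixedOrder = list.OrderBy(b =>
{
    var index = listDesc.FindIndex(a => a == b);
    return index == -1 ? int.MaxValue : index;
}).ToList();

Console.WriteLine(string.Join(",", naive));      // 1,2,3,8,9,7,6,5,4
Console.WriteLine(string.Join(",", fixedOrder)); // 7,6,5,4,1,2,3,8,9
```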
A small tip (unrelated to the issue): to print comma-separated values you can simply use string.Join:
Console.WriteLine(string.Join(",", OrderedListDesc));
Output (from the original Console.Write loop):
7 ,6 ,5 ,4 ,1 ,2 ,3 ,8 ,9 ,

List to Dictionary with incremental keys in one line LINQ statement

I have this list:
var items = new List<string>() { "Hello", "I am a value", "Bye" };
I want to convert it to a dictionary with the following structure:
var dic = new Dictionary<int, string>()
{
{ 1, "Hello" },
{ 2, "I am a value" },
{ 3, "Bye" }
};
As you can see, the dictionary keys are just incremental values, but they should also reflect the positions of each element in the list.
I am looking for a one-line LINQ statement. Something like this:
var dic = items.ToDictionary(i => **Specify incremental key or get element index**, i => i);
You can do that by using the overload of Enumerable.Select which passes the index of the element. The index is zero-based, so add 1 to get keys that start at 1:
var dic = items.Select((val, index) => new { Index = index + 1, Value = val })
    .ToDictionary(i => i.Index, i => i.Value);
static void Main(string[] args)
{
var items = new List<string>() { "Hello", "I am a value", "Bye" };
int i = 1;
var dict = items.ToDictionary(A => i++, A => A);
foreach (var v in dict)
{
Console.WriteLine(v.Key + " " + v.Value);
}
Console.ReadLine();
}
Output
1 Hello
2 I am a value
3 Bye
EDIT: Out of curiosity I did a performance test with a list of 3 million strings.
1st place: A simple for loop that adds items to a dictionary, using the loop counter as the key. (Time: 00:00:00.2494029)
2nd place: This answer, using an integer variable outside of LINQ. (Time: 00:00:00.2931745)
3rd place: Yuval Itzchakov's answer, doing it all on a single line. (Time: 00:00:00.7308006)
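The for loop used in that test is not shown in the answer, so the following is only a guess at what it might have looked like, with keys starting at 1 as in the question. Pre-sizing the dictionary is an addition of this sketch, not something the answer mentions.

```csharp
using System;
using System.Collections.Generic;

var items = new List<string> { "Hello", "I am a value", "Bye" };

// Passing the expected size up front avoids resizing while the dictionary grows.
var dic = new Dictionary<int, string>(items.Count);
for (int i = 0; i < items.Count; i++)
{
    dic.Add(i + 1, items[i]); // keys start at 1, as in the question
}

Console.WriteLine(dic[1]); // Hello
Console.WriteLine(dic[3]); // Bye
```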
var items = new List<string>() { "Hello", "I am a value", "Bye" };
solution #1:
var dic2 = items.Select((item, index) => new { index, item })
    .ToDictionary(x => x.index + 1, x => x.item);
solution #2:
int counter = 1;
var dic = items.ToDictionary(x => counter++, x => x);

Get grouped comma separated values with linq

I would like a third column "items" with the values that are grouped.
var dic = new Dictionary<string, int>();
dic.Add("a", 1);
dic.Add("b", 1);
dic.Add("c", 2);
dic.Add("d", 3);
var dCounts =
(from i in dic
group i by i.Value into g
select new { g.Key, count = g.Count()});
var a = dCounts.Where(c => c.count>1 );
dCounts.Dump();
a.Dump();
This code results in:
Key  Count
1    2
2    1
3    1
I would like these results:
Key  Count  Items
1    2      a, b
2    1      c
3    1      d
var dCounts =
(from i in dic
group i by i.Value into g
select new { g.Key, count = g.Count(), Items = string.Join(",", g.Select(kvp => kvp.Key)) });
Use string.Join(",", {array}), passing in your array of keys.
You can use:
var dCounts =
from i in dic
group i by i.Value into g
select new { g.Key, Count = g.Count(), Values = g };
The result created by grouping (value g) has a property Key that gives you the key, but it also implements IEnumerable<T> that allows you to access individual values in the group. If you return just g then you can iterate over all values using foreach or process them using LINQ.
Here is a simple dump function to demonstrate this:
foreach(var el in dCounts) {
Console.Write(" - {0}, count: {1}, values:", el.Key, el.Count);
foreach(var item in el.Values) Console.Write("{0}, ", item);
}
var dCounts =
(from i in dic
group i.Key by i.Value into g
select new
{
g.Key,
count = g.Count(),
items = string.Join(",", g.ToArray())
});
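Dump() is a LINQPad helper, so outside LINQPad the result has to be printed by hand. Below is a minimal console sketch of the same grouping; the rows variable, the column widths, and the explicit OrderBy (added only to make the output order deterministic) are assumptions of this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var dic = new Dictionary<string, int> { ["a"] = 1, ["b"] = 1, ["c"] = 2, ["d"] = 3 };

var rows = dic
    .OrderBy(kvp => kvp.Key)                    // fixed order, independent of dictionary internals
    .GroupBy(kvp => kvp.Value, kvp => kvp.Key)  // key = value, elements = dictionary keys
    .Select(g => new { g.Key, Count = g.Count(), Items = string.Join(", ", g) })
    .ToList();

Console.WriteLine("Key  Count  Items");
foreach (var r in rows)
    Console.WriteLine($"{r.Key,-4} {r.Count,-6} {r.Items}");
```

Using the GroupBy overload with an element selector means each group already contains only the dictionary keys, so string.Join can consume the group directly.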
