Remove duplicated elements from a List<String>

I would like to remove the duplicate elements from a List. Some elements of the list look like this:
Book 23
Book 22
Book 19
Notebook 22
Notebook 19
Pen 23
Pen 22
Pen 19
To get rid of duplicate elements I've done this:
List<String> nodup = dup.Distinct().ToList();
I would like to keep in the list just
Book 23
Notebook 22
Pen 23
How can I do that?

You can do something like
string firstElement = dup.Distinct().ToList().First();
and add it to another list if you want.

It's not 100% clear what you want here - however...
If you want to keep the "largest" number in the list, you could do:
List<string> noDup = dup.Select(s => s.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries))
.Select(p => new { Name=p[0], Val=int.Parse(p[1]) })
.GroupBy(p => p.Name)
.Select(g => string.Join(" ", g.Key, g.Max(p => p.Val).ToString()))
.ToList();
This would transform the List<string> by parsing the numeric portion into a number, taking the max per item, and creating the output string as you have specified.
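The pipeline above throws if any entry lacks a parsable number. As a minimal sketch (not the answerer's code), the same grouping can skip malformed entries with int.TryParse. It is written as a C# 9 top-level program, and the "Eraser" entry is a hypothetical malformed item added for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question, plus one hypothetical malformed entry ("Eraser").
var dup = new List<string>
{
    "Book 23", "Book 22", "Book 19",
    "Notebook 22", "Notebook 19",
    "Pen 23", "Pen 22", "Pen 19",
    "Eraser"
};

List<string> noDup = dup
    .Select(s => s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
    // Keep only "name number" pairs whose number actually parses.
    .Where(p => p.Length == 2 && int.TryParse(p[1], out _))
    // Group the parsed numbers by name.
    .GroupBy(p => p[0], p => int.Parse(p[1]))
    // Rebuild "name max" for each group.
    .Select(g => $"{g.Key} {g.Max()}")
    .ToList();

Console.WriteLine(string.Join(", ", noDup)); // Book 23, Notebook 22, Pen 23
```

Whether malformed entries should be dropped silently or reported is a design choice; dropping them is only an assumption made here.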

You can use LINQ in combination with some String operations to group all your items by name and MAX(Number):
var q = from str in list
let Parts = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
let item = Parts[ 0 ]
let num = int.Parse(Parts[ 1 ])
group new { Name = item, Number = num } by item into Grp
select new {
Name = Grp.Key,
Value = Grp.Max(i => i.Number).ToString()
};
var highestGroups = q.Select(g =>
String.Format("{0} {1}", g.Name, g.Value)).ToList();
(Same as Reed's approach, but in query syntax, which I find more readable.)
Edit: I cannot reproduce your comment that it does not work, here is sample data:
List<String> list = new List<String>();
list.Add("Book 23");
list.Add("Book 22");
list.Add("Book 19");
list.Add("Notebook 23");
list.Add("Notebook 22");
list.Add("Notebook 19");
list.Add("Pen 23");
list.Add("Pen 22");
list.Add("Pen 19");
list.Add("sheet 3");
var q = from str in list
let Parts = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
let item = Parts[ 0 ]
let num = int.Parse(Parts[ 1 ])
group new { Name = item, Number = num } by item into Grp
select new {
Name = Grp.Key,
Value = Grp.Max(i => i.Number).ToString()
};
var highestGroups = q.Select(g => String.Format("{0} {1}", g.Name, g.Value));
MessageBox.Show(String.Join(Environment.NewLine, highestGroups));
The result:
Book 23
Notebook 23
Pen 23
sheet 3

You may want to add a custom comparer as a parameter, as you can see in the example on MSDN.
In this example I assumed Foo is a class with two members.
class Program
{
static void Main(string[] args)
{
var list = new List<Foo>()
{
new Foo("Book", 23),
new Foo("Book", 22),
new Foo("Book", 19)
};
foreach(var element in list.Distinct(new Comparer()))
{
Console.WriteLine(element.Type + " " + element.Value);
}
}
}
public class Foo
{
public Foo(string type, int value)
{
this.Type = type;
this.Value = value;
}
public string Type { get; private set; }
public int Value { get; private set; }
}
public class Comparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
if(x == null || y == null)
return x == y;
else
return x.Type == y.Type;
}
public int GetHashCode(Foo obj)
{
return obj.Type.GetHashCode();
}
}
This works on an IList, assuming that we want the first item each, not the one with the highest number. Be careful with different collection types (like ICollection or IEnumerable), as they do not guarantee you any order. Therefore any of the Foos may remain after the Distinct.
You could also override both Equals and GetHashCode of Foo instead of using a custom IEqualityComparer. However, I would not actually recommend this for a local distinct. Consumers of your class may not recognize that two instances with same value for Type are always equal, regardless of their Value.
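For comparison, here is a hedged sketch of how the same two behaviours (first item per type versus highest value per type) might look with DistinctBy and MaxBy. It assumes .NET 6 or later, where these LINQ methods exist, and uses named tuples as hypothetical stand-ins for the Foo class:

```csharp
using System;
using System.Linq;

// Hypothetical sample data; named tuples stand in for the Foo class.
var items = new[]
{
    (Type: "Book", Value: 22),
    (Type: "Book", Value: 23),
    (Type: "Pen", Value: 19),
    (Type: "Pen", Value: 23),
};

// First item seen for each Type (the same effect as Distinct with the custom comparer).
var firstPerType = items.DistinctBy(i => i.Type).ToList();

// Item with the highest Value for each Type.
var maxPerType = items
    .GroupBy(i => i.Type)
    .Select(g => g.MaxBy(i => i.Value))
    .ToList();

Console.WriteLine(string.Join(", ", firstPerType)); // (Book, 22), (Pen, 19)
Console.WriteLine(string.Join(", ", maxPerType));   // (Book, 23), (Pen, 23)
```

This keeps the equality rule local to the query, which matches the concern above about not redefining equality on the class itself.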

A bit old-fashioned, but it should work, if I understand correctly:
Dictionary<string,int> dict=new Dictionary<string,int>();
//Split accepts a single character; assume each line contains a key/value pair separated by a space, with no other whitespace
input=input.Replace("\r\n","\n");
string[] lines=input.Split('\n');
//break into categories and find the largest number in each
foreach(string line in lines)
{
string[] parts=line.Split(' ');
string key=parts[0].Trim();
int value=Convert.ToInt32(parts[1].Trim());
if (!dict.ContainsKey(key))
{
dict.Add(key, value);
}
else
{
if (dict[key]<value)
{
dict[key]=value;
}
}
}
//do something with dict
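A runnable variant of this dictionary approach could look like the sketch below. It is an illustration under assumptions, not the answerer's code: the input string is made-up sample data, and lines that do not parse are skipped, which the original does not do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input: one "name number" pair per line, mixed line endings.
string input = "Book 23\r\nBook 22\nNotebook 22\nNotebook 19\nPen 19\nPen 23";

var dict = new Dictionary<string, int>();
foreach (string line in input.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries))
{
    string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Skip lines that are not exactly "name number".
    if (parts.Length != 2 || !int.TryParse(parts[1], out int value))
        continue;

    // Store the value if the key is new or the value beats the current maximum.
    if (!dict.TryGetValue(parts[0], out int best) || value > best)
        dict[parts[0]] = value;
}

foreach (var pair in dict)
    Console.WriteLine($"{pair.Key} {pair.Value}");
```

TryGetValue does the lookup once, where ContainsKey followed by the indexer does it twice.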

Related

Group list of strings with common prefixes

Suppose I have a list of strings [city01, city01002, state02, state03, city04, statebg, countryqw, countrypo]
How do I group them in a dictionary of <string, List<Strings>> like
city - [city01, city04, city01002]
state- [state02, state03, statebg]
country - [countryqw, countrypo]
If not code, can anyone please help with how to approach or proceed?

As shown in other answers, you can use the GroupBy method from LINQ to create this grouping based on any condition you want. Before you can group your strings, you need to know the condition by which a string is grouped: it could be that it starts with one of a set of predefined prefixes, that it is grouped by what comes before the first digit, or any other condition you can describe in code. In my code example, GroupBy calls another method for every string in the list; in that method you place the logic you need, returning the key under which the given string should be grouped. You can test this example online with dotnetfiddle: https://dotnetfiddle.net/UHNXvZ
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List<string> ungroupedList = new List<string>() {"city01", "city01002", "state02", "state03", "city04", "statebg", "countryqw", "countrypo", "theFirstTown"};
var groupedStrings = ungroupedList.GroupBy(x => groupingCondition(x));
foreach (var a in groupedStrings) {
Console.WriteLine("key: " + a.Key);
foreach (var b in a) {
Console.WriteLine("value: " + b);
}
}
}
public static string groupingCondition(String s) {
if(s.StartsWith("city") || s.EndsWith("Town"))
return "city";
if(s.StartsWith("country"))
return "country";
if(s.StartsWith("state"))
return "state";
return "unknown";
}
}

You can use LINQ:
var input = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
var output = input.GroupBy(c => string.Join("", c.TakeWhile(d => !char.IsDigit(d))
.Take(4))).ToDictionary(c => c.Key, c => c.ToList());

I suppose you have a list of prefixes you are searching for in the list:
var list = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
var tofound = new List<string>() { "city", "state", "country" }; // prefixes to look for
var result = new Dictionary<string, List<string>>();
foreach (var f in tofound)
{
result.Add(f, list.FindAll(x => x.StartsWith(f)));
}
The result holds the wanted dictionary. If no values are found for a key, its value is an empty list (List<T>.FindAll never returns null).
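As a quick check of what a prefix without matches produces, here is a small sketch. The "village" prefix is hypothetical and added only for illustration; List<T>.FindAll returns an empty list in that case.

```csharp
using System;
using System.Collections.Generic;

var list = new List<string>
{
    "city01", "city01002", "state02", "state03",
    "city04", "statebg", "countryqw", "countrypo"
};

// "village" is a hypothetical prefix with no matches.
var prefixes = new List<string> { "city", "state", "country", "village" };

var result = new Dictionary<string, List<string>>();
foreach (var prefix in prefixes)
{
    // FindAll yields an empty list, not null, when nothing matches.
    result.Add(prefix, list.FindAll(x => x.StartsWith(prefix, StringComparison.Ordinal)));
}

foreach (var pair in result)
    Console.WriteLine($"{pair.Key} - [{string.Join(", ", pair.Value)}]");
```

The ordinal comparison is a choice made in this sketch; the single-argument StartsWith used above is culture-sensitive.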

Warning: This answer has a combinatorial expansion and will fail if your original string set is large. For 65 words I gave up after running for a couple of hours.
Using some IEnumerable extension methods to find Distinct sets and to find all possible combinations of sets, you can generate a group of prefixes and then group the original strings by these.
public static class IEnumerableExt {
public static bool IsDistinct<T>(this IEnumerable<T> items) {
var hs = new HashSet<T>();
foreach (var item in items)
if (!hs.Add(item))
return false;
return true;
}
public static bool IsEmpty<T>(this IEnumerable<T> items) => !items.Any();
public static IEnumerable<IEnumerable<T>> AllCombinations<T>(this IEnumerable<T> start) {
IEnumerable<IEnumerable<T>> HelperCombinations(IEnumerable<T> items) {
if (items.IsEmpty())
yield return items;
else {
var head = items.First();
var tail = items.Skip(1);
foreach (var sequence in HelperCombinations(tail)) {
yield return sequence; // Without first
yield return sequence.Prepend(head);
}
}
}
return HelperCombinations(start).Skip(1); // don't return the empty set
}
}
var keys = Enumerable.Range(0, src.Count - 1)
.SelectMany(n1 => Enumerable.Range(n1 + 1, src.Count - n1 - 1).Select(n2 => new { n1, n2 }))
.Select(n1n2 => new { s1 = src[n1n2.n1], s2 = src[n1n2.n2], Dist = src[n1n2.n1].TakeWhile((ch, n) => n < src[n1n2.n2].Length && ch == src[n1n2.n2][n]).Count() })
.SelectMany(s1s2d => new[] { new { s = s1s2d.s1, s1s2d.Dist }, new { s = s1s2d.s2, s1s2d.Dist } })
.Where(sd => sd.Dist > 0)
.GroupBy(sd => sd.s.Substring(0, sd.Dist))
.Select(sdg => sdg.Distinct())
.AllCombinations()
.Where(sdgc => sdgc.Sum(sdg => sdg.Count()) == src.Count)
.Where(sdgc => sdgc.SelectMany(sdg => sdg.Select(sd => sd.s)).IsDistinct())
.OrderByDescending(sdgc => sdgc.Sum(sdg => sdg.First().Dist)).First()
.Select(sdg => sdg.First())
.Select(sd => sd.s.Substring(0, sd.Dist))
.ToList();
var groups = src.GroupBy(s => keys.First(k => s.StartsWith(k)));

Rename List item when there is the same string multiple time

I have List of names like:
var list = new List<string> {"Allan", "Michael", "Jhon", "Smith", "George", "Jhon"};
and a combobox whose ItemsSource is my list. As you can see, Jhon appears twice in the list. When I put those names into the combobox, I want to add "2" to the second Jhon. That is, when I open the combobox, the names in it should look like:
Allan
Michael
Jhon
Smith
George
Jhon2
I have tried LINQ to do that, but I'm quite new to C#/LINQ. Could someone show me a simple way to do that?

I would do this:
var result = list.Take(1).ToList();
for (var i = 1; i < list.Count; i++)
{
var name = list[i];
var count = list.Take(i).Where(n => n == name).Count() + 1;
result.Add(count < 2 ? name : name + count.ToString());
}

Here is what I would do:
First off, separate the list into two smaller ones, one that contains all the unique names, and one that contains only duplicates:
var duplicates = myList.GroupBy(s => s)
.SelectMany(grp => grp.Skip(1));
var unique = new HashSet<string>(myList).ToList();
Then process:
var result = new List<string>();
foreach (string uniqueName in unique)
{
int index=2;
foreach (string duplicateName in duplicates.Where(dupe => dupe == uniqueName))
{
result.Add(string.Format("{0}{1}", duplicateName, index.ToString()));
index++;
}
}
What we are doing here is the following:
Iterate through unique names.
Initialize a variable index with value 2. This will be the number we add at the end of each name.
Iterate through matching duplicate names.
Modify the name string by adding the number stored at index to the end.
Add this new value to the results list.
Increment index.
Finally, add the unique names back in:
result.AddRange(unique);
The result list should now contain all the same values as the original myList, only difference being that all names that appear more than once have a number appended to their end. Per your specification, there is no name name1. Instead, counting starts from 2.

Another possibility:
var groups = list.Select((name, index) => new { name, index }).GroupBy(s => s.name).ToList();
foreach (var group in groups.Where(g => g.Count() > 1))
{
foreach (var entry in group.Skip(1).Select((g, i) => new { g, i }))
{
list[entry.g.index] = list[entry.g.index] + (entry.i + 2);
}
}

Someone might be able to give a more efficient answer, but this does the job.
The dictionary keeps track of how many times a name has been repeated in the list. Each time a new name in the list is encountered, it is added to the dictionary and is added as is to the new list. If the name already exists in the dictionary (with the key check), instead, the count is increased by one in the dictionary and this name is added to the new list with the count (from the dictionary value corresponding to the name as the key) appended to the end of the name.
var list = new List<string> {"Allan", "Michael", "Jhon", "Smith", "George", "Jhon", "George", "George"};
Dictionary<string, int> dictionary = new Dictionary<string,int>();
var newList = new List<string>();
for(int i=0; i<list.Count();i++){
if(!dictionary.ContainsKey(list[i])){
dictionary.Add(list[i], 1);
newList.Add(list[i]);
}
else{
dictionary[list[i]] += 1;
newList.Add(list[i] + dictionary[list[i]]);
}
}
for(int i=0; i<newList.Count(); i++){
Console.WriteLine(newList[i]);
}
Output:
Allan
Michael
Jhon
Smith
George
Jhon2
George2
George3

Check this solution:
public List<string> AddName(IEnumerable<string> list, string name)
{
var suffixSelector = new Regex("^(?<name>[A-Za-z]+)(?<suffix>\\d?)$",
RegexOptions.Singleline);
var namesMap = list.Select(n => suffixSelector.Match(n))
.Select(x => new {name = x.Groups["name"].Value, suffix = x.Groups["suffix"].Value})
.GroupBy(x => x.name)
.ToDictionary(x => x.Key, x => x.Count());
if (namesMap.ContainsKey(name))
namesMap[name] = namesMap[name] + 1;
return namesMap.Select(x => x.Key).Concat(
namesMap.Where(x => x.Value > 1)
.SelectMany(x => Enumerable.Range(2, x.Value - 1)
.Select(i => $"{x.Key}{i}"))).ToList();
}
It handles the case where you already have 'Jhon2' in the list.

I would do
class Program
{
private static void Main(string[] args)
{
var list = new List<string> { "Allan", "Michael", "Jhon", "Smith", "George", "Jhon" };
var duplicates = list.GroupBy(x => x).Select(r => GetTuple(r.Key, r.Count()))
.Where(x => x.Count > 1)
.Select(c => { c.Count = 1; return c; }).ToList();
var result = list.Select(v =>
{
var val = duplicates.FirstOrDefault(x => x.Name == v);
if (val != null)
{
if (val.Count != 1)
{
v = v + " " + val.Count;
}
val.Count += 1;
}
return v;
}).ToList();
Console.ReadLine();
}
private static FooBar GetTuple(string key, int count)
{
return new FooBar(key, count);
}
}
public class FooBar
{
public int Count { get; set; }
public string Name { get; set; }
public FooBar(string name, int count)
{
Count = count;
Name = name;
}
}

Trying to merge two string arrays based on comparison

Below is my class :
public class Regions
{
public int Id { get; set; }
public string[] ParentName { get; set; }
}
Now I have two instances of the above Regions class, containing some data:
var region1 = new Regions();
var region2 = new Regions();
Now ParentName contains data like below for region1 :
[0] : Abc.mp3,Pqr.mp3
[1] : Xxx.mp3
[2] : kkk.mp3
[3] : ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3
Now ParentName contains data like below for region2 :
[0] : Abc.mp3,Pqr.mp3,lmn.mp3
[1] : rrr.mp3,ggg.mp3,yyy.mp3
Now I am trying to merge the ParentName entries of region2 into region1 whenever any part of a region1 entry matches a region2 entry after splitting the records by comma, like below:
[0] : Abc.mp3,Pqr.mp3,lmn.mp3
[1] : Xxx.mp3
[2] : kkk.mp3
[3] : ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3,ggg.mp3,yyy.mp3
In the expected output above, Abc.mp3 and Pqr.mp3 match between region1 and region2; only lmn.mp3 does not, so it is appended at the end of the region1 entry.
For the last record of region1 and region2, rrr.mp3 matches (a single match is enough), so the non-matching names from region2, i.e. ggg.mp3 and yyy.mp3, are appended at the end of the region1 entry.
Output I am getting in Region1:
[0] : Abc.mp3,Pqr.mp3
[1] : Xxx.mp3
[2] : kkk.mp3
[3] : ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3
[4] : Abc.mp3,Pqr.mp3,lmn.mp3
[5] : rrr.mp3,ggg.mp3,yyy.mp3
Code :
region1.ParentName = region1.ParentName.Concat(region2.ParentName).Distinct().ToArray();
public static T[] Concat<T>(this T[] x, T[] y)
{
if (x == null) throw new ArgumentNullException("x");
if (y == null) throw new ArgumentNullException("y");
int oldLen = x.Length;
Array.Resize<T>(ref x, x.Length + y.Length);
Array.Copy(y, 0, x, oldLen, y.Length);
return x;
}

It's unclear if your names contain duplicates and how they should be handled, but here is the LINQ solution which produces the desired result with the specified inputs:
var e2Sets = region2.ParentName.Select(e2 => e2.Split(',')).ToList();
var result =
from e1 in region1.ParentName
let e1Set = e1.Split(',')
let e2AppendSet = (
from e2Set in e2Sets
where e1Set.Intersect(e2Set).Any()
from e2New in e2Set.Except(e1Set)
select e2New
).Distinct()
select string.Join(",", e1Set.Concat(e2AppendSet));
result.ToArray() will give you the desired new region1.ParentName.
How it works:
Since we basically need Cartesian product of the two input sequences, we start by preparing a list of the arrays of split strings of the second sequence, in order to avoid multiple string.Split inside the inner loop.
Then, for each element of the first sequence, we split it into an array of strings. For each split array in the second sequence that has a match (determined with the Intersect method), we select the unmatched strings using the Except method. We then flatten all the unmatched strings, apply Distinct to remove potential duplicates, concatenate the two sets, and use string.Join to produce the new comma-delimited string.
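The same steps can be sketched in method syntax as a self-contained program. The plain string arrays below stand in for region1.ParentName and region2.ParentName, using the data from the question:

```csharp
using System;
using System.Linq;

// Stand-ins for region1.ParentName and region2.ParentName.
var region1 = new[] { "Abc.mp3,Pqr.mp3", "Xxx.mp3", "kkk.mp3", "ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3" };
var region2 = new[] { "Abc.mp3,Pqr.mp3,lmn.mp3", "rrr.mp3,ggg.mp3,yyy.mp3" };

// Split the second sequence once, up front.
var e2Sets = region2.Select(e2 => e2.Split(',')).ToList();

var merged = region1.Select(e1 =>
{
    var e1Set = e1.Split(',');

    // From every region2 entry sharing at least one name, collect the names e1 lacks.
    var toAppend = e2Sets
        .Where(e2Set => e1Set.Intersect(e2Set).Any())
        .SelectMany(e2Set => e2Set.Except(e1Set))
        .Distinct();

    // Append the new names and re-join with commas.
    return string.Join(",", e1Set.Concat(toAppend));
}).ToArray();

foreach (var entry in merged)
    Console.WriteLine(entry);
```

Entries without any match pass through unchanged, because Concat with an empty sequence adds nothing.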

You could do the following:
public static void Merge(Regions first, Regions second)
{
if (ReferenceEquals(first, null))
throw new ArgumentNullException(nameof(first));
if (ReferenceEquals(second, null))
throw new ArgumentNullException(nameof(second));
first.ParentName = first.ParentName.Merge(second.ParentName).ToArray();
}
private static IEnumerable<string> Merge(this IEnumerable<string> first, IEnumerable<string> second)
{
if (ReferenceEquals(first, null))
throw new ArgumentNullException(nameof(first));
if (ReferenceEquals(second, null))
throw new ArgumentNullException(nameof(second));
foreach (var f in first)
{
yield return f.Merge(second, ',');
}
}
private static string Merge(this string first, IEnumerable<string> second, char separator)
{
Debug.Assert(first != null);
Debug.Assert(second != null);
var firstSplitted = first.Split(separator);
foreach (var s in second)
{
var sSplitted = s.Split(separator);
if (firstSplitted.Intersect(sSplitted).Any())
return string.Join(separator.ToString(), firstSplitted.Union(sSplitted));
}
return first;
}
Note that this will merge on the first match it finds; if duplicate values exist, it will only merge the first time the match is encountered.
The secret here is divide and conquer. If you are having trouble implementing a certain piece of logic, break it down into simpler steps and implement a method for each baby step. Once it's working, if you really need to, you can refactor your code to make it more concise or performant.
If you run this:
var first = new Regions();
var second = new Regions();
first.ParentName = new[] { "Abc.mp3,Pqr.mp3", "Xxx.mp3", "kkk.mp3", "ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3" };
second.ParentName = new[] { "Abc.mp3,Pqr.mp3,lmn.mp3", "rrr.mp3,ggg.mp3,yyy.mp3" };
Merge(first, second);
You will get the expected result. first.ParentName will be:
[0]: "Abc.mp3,Pqr.mp3,lmn.mp3"
[1]: "Xxx.mp3"
[2]: "kkk.mp3"
[3]: "ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3,ggg.mp3,yyy.mp3"

You can use Split() method to get parts of string and find matches and Join() method to get final string:
private static void Merge(Regions region, Regions region2)
{
List<List<string>> splittedLists = region.ParentName.Select(p => p.Split(new char[] { ',' }, StringSplitOptions.None).ToList()).ToList();
List<List<string>> splittedLists2 = region2.ParentName.Select(p => p.Split(new char[] { ',' }, StringSplitOptions.None).ToList()).ToList();
List<string> res = new List<string>();
foreach (var item in splittedLists)
{
bool wasMatch = false;
foreach (var s in item)
{
bool contains = false;
foreach (var s2 in splittedLists2.Where(s2 => s2.Contains(s)))
{
wasMatch = true;
contains = true;
res.Add(string.Join(",", item.Concat(s2).Distinct()));
}
if (contains)
{
contains = false;
break;
}
}
if (!wasMatch)
{
res.Add(string.Join(",", item));
}
}
region.ParentName = res.ToArray();
}

Fastest way to select distinct values from list based on two properties

I have this list:
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
I'm looking for the fastest way to select distinct objects based on recordNumber, but if there is more than one object with the same recordNumber (here recordNumber=3), I want to select the object based on its name (the name is provided as a parameter).
thanks

It looks like you are really after something like:
Dictionary<int, List<myobject>> myDataStructure;
That allows you to quickly retrieve by record number. If the List<myobject> with that dictionary key contains more than one entry, you can then use the name to select the correct one.
Note that if your list is not terribly long, an O(n) check that just scans the list checking for the recordNumber and name may be fast enough, in the sense that other things happening in your program could obscure the list lookup cost. Consider that possibility before over-optimizing lookup times.
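A minimal sketch of that dictionary-based lookup follows. Named tuples stand in for myobject, and Pick is a hypothetical helper name; returning null when nothing matches is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Named tuples stand in for the myobject class.
var list = new List<(string name, int recordNumber)>
{
    ("n1", 1), ("n2", 2), ("n3", 3), ("n4", 3),
};

// Build the lookup once; each key maps to every object sharing that recordNumber.
Dictionary<int, List<(string name, int recordNumber)>> byRecord = list
    .GroupBy(o => o.recordNumber)
    .ToDictionary(g => g.Key, g => g.ToList());

// Hypothetical helper: a single candidate wins outright, otherwise the name decides.
(string name, int recordNumber)? Pick(int recordNumber, string preferredName)
{
    if (!byRecord.TryGetValue(recordNumber, out var candidates))
        return null;                 // unknown recordNumber
    if (candidates.Count == 1)
        return candidates[0];        // no ambiguity, name is ignored
    foreach (var candidate in candidates)
        if (candidate.name == preferredName)
            return candidate;
    return null;                     // ambiguous and no name matched
}

Console.WriteLine(Pick(3, "n4")?.name);       // n4
Console.WriteLine(Pick(1, "anything")?.name); // n1
```

Building the dictionary costs one pass over the list, so it only pays off when many lookups follow.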

Here's the LINQ way of doing this:
Func<IEnumerable<myobject>, string, IEnumerable<myobject>> getDistinct =
(ms, n) =>
ms
.ToLookup(x => x.recordNumber)
.Select(xs => xs.Skip(1).Any()
? xs.Where(x => x.name == n).Take(1)
: xs)
.SelectMany(x => x)
.ToArray();
I just tested this with a 1,000,000 randomly created myobject list and it produced the result in 106ms. That should be fast enough for most situations.

Are you looking for
class Program
{
static void Main(string[] args)
{
List<myobject> list = new List<myobject>();
list.Add(new myobject { name = "n1", recordNumber = 1 });
list.Add(new myobject { name = "n2", recordNumber = 2 });
list.Add(new myobject { name = "n3", recordNumber = 3 });
list.Add(new myobject { name = "n4", recordNumber = 3 });
//Generates Row Number on the fly
var withRowNumbers = list
.Select((x, index) => new
{
Name = x.name,
RecordNumber = x.recordNumber,
RowNumber = index + 1
}).ToList();
//Generates Row Number with Partition by clause
var withRowNumbersPartitionBy = withRowNumbers
.OrderBy(x => x.RowNumber)
.GroupBy(x => x.RecordNumber)
.Select(g => new { g, count = g.Count() })
.SelectMany(t => t.g.Select(b => b)
.Zip(Enumerable.Range(1, t.count), (j, i) => new { Rn = i, j.RecordNumber, j.Name}))
.Where(i=>i.Rn == 1)
.ToList();
//print the result
withRowNumbersPartitionBy.ToList().ForEach(i => Console.WriteLine("Name = {0} RecordNumber = {1}", i.Name, i.RecordNumber));
Console.ReadKey();
}
}
class myobject
{
public int recordNumber { get; set; }
public string name { get; set; }
}
Result:
Name = n1 RecordNumber = 1
Name = n2 RecordNumber = 2
Name = n3 RecordNumber = 3

Are you looking for a method to do this?
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
public myobject Find(int recordNumber, string name)
{
var matches = list.Where(l => l.recordNumber == recordNumber);
if (matches.Count() == 1)
return matches.Single();
else return matches.Single(m => m.name == name);
}
This will - of course - break if there are multiple matches, or zero matches. You need to write your own edge cases and error handling!

If the name and recordNumber combination is guaranteed to be unique, then you can always use a HashSet.
You can then use RecordNumber and Name to generate the HashCode by using a method described here.
class myobject
{
//override GetHashCode
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + recordNumber.GetHashCode();
hash = hash * 23 + name.GetHashCode();
return hash;
}
}
//override Equals
}
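As an alternative sketch that avoids hand-writing GetHashCode and Equals, a value tuple of (recordNumber, name) can serve as the set key, since value tuples compare and hash by their components. HashCode.Combine is also shown; it is available in .NET Core 2.1 and later, which is an assumption about the target framework.

```csharp
using System;
using System.Collections.Generic;

// Value tuples have structural equality, so they work directly as set keys.
var seen = new HashSet<(int recordNumber, string name)>();

bool addedFirst = seen.Add((3, "n3"));     // true: new combination
bool addedSecond = seen.Add((3, "n4"));    // true: same recordNumber, different name
bool addedDuplicate = seen.Add((3, "n3")); // false: exact duplicate

// HashCode.Combine gives equal hashes for equal inputs within a single process run.
int hash1 = HashCode.Combine(3, "n3");
int hash2 = HashCode.Combine(3, "n3");

Console.WriteLine($"{addedFirst} {addedSecond} {addedDuplicate} {seen.Count}"); // True True False 2
Console.WriteLine(hash1 == hash2); // True
```

If the class itself must be the set element, HashCode.Combine(recordNumber, name) could replace the manual 17/23 arithmetic inside GetHashCode.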

Difference Between Select and SelectMany

I've been searching the difference between Select and SelectMany but I haven't been able to find a suitable answer. I need to learn the difference when using LINQ To SQL but all I've found are standard array examples.
Can someone provide a LINQ To SQL example?

SelectMany flattens queries that return lists of lists. For example
public class PhoneNumber
{
public string Number { get; set; }
}
public class Person
{
public IEnumerable<PhoneNumber> PhoneNumbers { get; set; }
public string Name { get; set; }
}
IEnumerable<Person> people = new List<Person>();
// Select gets a list of lists of phone numbers
IEnumerable<IEnumerable<PhoneNumber>> phoneLists = people.Select(p => p.PhoneNumbers);
// SelectMany flattens it to just a list of phone numbers.
IEnumerable<PhoneNumber> phoneNumbers = people.SelectMany(p => p.PhoneNumbers);
// And to include data from the parent in the result:
// pass an expression to the second parameter (resultSelector) in the overload:
var directory = people
.SelectMany(p => p.PhoneNumbers,
(parent, child) => new { parent.Name, child.Number });

SelectMany can act like a cross join in SQL, where it takes the cross product.
For example if we have
Set A={a,b,c}
Set B={x,y}
SelectMany can be used to get the following set
{ (x,a) , (x,b) , (x,c) , (y,a) , (y,b) , (y,c) }
Note that here we take all the possible combinations that can be made from the elements of set A and set B.
Here is a LINQ example you can try
List<string> animals = new List<string>() { "cat", "dog", "donkey" };
List<int> number = new List<int>() { 10, 20 };
var mix = number.SelectMany(num => animals, (n, a) => new { n, a });
mix will contain the following elements in a flat structure:
{(10,cat), (10,dog), (10,donkey), (20,cat), (20,dog), (20,donkey)}

var players = db.SoccerTeams.Where(c => c.Country == "Spain")
.SelectMany(c => c.players);
foreach(var player in players)
{
Console.WriteLine(player.LastName);
}
De Gea
Alba
Costa
Villa
Busquets
...

SelectMany() lets you collapse a multidimensional sequence in a way that would otherwise require a second Select() or loop.
More details at this blog post.

There are several overloads of SelectMany. One of them allows you to keep track of the relationship between parent and children while traversing the hierarchy.
Example: suppose you have the following structure: League -> Teams -> Player.
You can easily return a flat collection of players. However you may lose any reference to the team the player is part of.
Fortunately there is an overload for such purpose:
var teamsAndTheirLeagues =
from helper in leagues.SelectMany
( l => l.Teams
, ( league, team ) => new { league, team } )
where helper.team.Players.Count > 2
&& helper.league.Teams.Count < 10
select new
{ LeagueID = helper.league.ID
, Team = helper.team
};
The previous example is taken from Dan's IK blog. I strongly recommend you take a look at it.

I understand SelectMany to work like a join shortcut.
So you can:
var orders = customers
.Where(c => c.CustomerName == "Acme")
.SelectMany(c => c.Orders);

The SelectMany() method is used to flatten a sequence in which each element is itself a sequence.
Suppose I have a User class like this:
class User
{
public string UserName { get; set; }
public List<string> Roles { get; set; }
}
main:
var users = new List<User>
{
new User { UserName = "Reza" , Roles = new List<string>{"Superadmin" } },
new User { UserName = "Amin" , Roles = new List<string>{"Guest","Reseption" } },
new User { UserName = "Nima" , Roles = new List<string>{"Nurse","Guest" } },
};
var query = users.SelectMany(user => user.Roles, (user, role) => new { user.UserName, role });
foreach (var obj in query)
{
Console.WriteLine(obj);
}
//output
//{ UserName = Reza, role = Superadmin }
//{ UserName = Amin, role = Guest }
//{ UserName = Amin, role = Reseption }
//{ UserName = Nima, role = Nurse }
//{ UserName = Nima, role = Guest }
You can use operations on any item of sequence
int[][] numbers = {
new[] {1, 2, 3},
new[] {4},
new[] {5, 6 , 6 , 2 , 7, 8},
new[] {12, 14}
};
IEnumerable<int> result = numbers
.SelectMany(array => array.Distinct())
.OrderBy(x => x);
//output
//{ 1, 2 , 2 , 3, 4, 5, 6, 7, 8, 12, 14 }
List<List<int>> numbers = new List<List<int>> {
new List<int> {1, 2, 3},
new List<int> {12},
new List<int> {5, 6, 5, 7},
new List<int> {10, 10, 10, 12}
};
IEnumerable<int> result = numbers
.SelectMany(list => list)
.Distinct()
.OrderBy(x=>x);
//output
// { 1, 2, 3, 5, 6, 7, 10, 12 }

Select is a simple one-to-one projection from source element to a result element. SelectMany is used when there are multiple from clauses in a query expression: each element in the original sequence is used to generate a new sequence.
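To make the link to multiple from clauses concrete, here is a small sketch with hypothetical owner and pet data. It shows a query expression with two from clauses next to the equivalent SelectMany call using the result-selector overload:

```csharp
using System;
using System.Linq;

// Hypothetical data: each owner has a sequence of pets.
var owners = new[]
{
    (Name: "Ann", Pets: new[] { "Rex", "Tom" }),
    (Name: "Bob", Pets: new[] { "Kit" }),
};

// Query expression with two from clauses.
var viaQuery = (from owner in owners
                from pet in owner.Pets
                select $"{owner.Name}: {pet}").ToList();

// The equivalent method call: SelectMany with a result selector.
var viaMethod = owners
    .SelectMany(owner => owner.Pets, (owner, pet) => $"{owner.Name}: {pet}")
    .ToList();

Console.WriteLine(string.Join("; ", viaQuery));       // Ann: Rex; Ann: Tom; Bob: Kit
Console.WriteLine(viaQuery.SequenceEqual(viaMethod)); // True
```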

The formal description for SelectMany() is:
Projects each element of a sequence to an IEnumerable and flattens
the resulting sequences into one sequence.
SelectMany() flattens the resulting sequences into one sequence, and invokes a result selector function on each element therein.
class PetOwner
{
public string Name { get; set; }
public List<String> Pets { get; set; }
}
public static void SelectManyEx()
{
PetOwner[] petOwners =
{ new PetOwner { Name="Higa, Sidney",
Pets = new List<string>{ "Scruffy", "Sam" } },
new PetOwner { Name="Ashkenazi, Ronen",
Pets = new List<string>{ "Walker", "Sugar" } },
new PetOwner { Name="Price, Vernette",
Pets = new List<string>{ "Scratches", "Diesel" } } };
// Query using SelectMany().
IEnumerable<string> query1 = petOwners.SelectMany(petOwner => petOwner.Pets);
Console.WriteLine("Using SelectMany():");
// Only one foreach loop is required to iterate
// through the results since it is a
// one-dimensional collection.
foreach (string pet in query1)
{
Console.WriteLine(pet);
}
// This code shows how to use Select()
// instead of SelectMany().
IEnumerable<List<String>> query2 =
petOwners.Select(petOwner => petOwner.Pets);
Console.WriteLine("\nUsing Select():");
// Notice that two foreach loops are required to
// iterate through the results
// because the query returns a collection of arrays.
foreach (List<String> petList in query2)
{
foreach (string pet in petList)
{
Console.WriteLine(pet);
}
Console.WriteLine();
}
}
/*
This code produces the following output:
Using SelectMany():
Scruffy
Sam
Walker
Sugar
Scratches
Diesel
Using Select():
Scruffy
Sam
Walker
Sugar
Scratches
Diesel
*/
The main difference is the shape of the result: SelectMany() returns a flattened result, whereas Select() returns a list of lists.
Therefore the result of SelectMany is a list like
{Scruffy, Sam , Walker, Sugar, Scratches , Diesel}
which you can iterate item by item with a single foreach. With the result of Select you need an extra foreach loop, because the query returns a collection of lists.

Some uses of SelectMany may not be necessary. The two queries below give the same result.
Customers.Where(c=>c.Name=="Tom").SelectMany(c=>c.Orders)
Orders.Where(o=>o.Customer.Name=="Tom")
For a one-to-many relationship:
If you start from the "one" side, SelectMany is needed; it flattens the "many".
If you start from the "many" side, SelectMany is not needed. (You can still filter on the "one" side, and this is simpler than the standard join query below.)
from o in Orders
join c in Customers on o.CustomerID equals c.ID
where c.Name == "Tom"
select o

Just for an alternate view that may help some functional programmers out there:
Select is map
SelectMany is bind (or flatMap for your Scala/Kotlin people)

Without getting too technical - database with many Organizations, each with many Users:-
var orgId = "123456789";
var userList1 = db.Organizations
.Where(a => a.OrganizationId == orgId)
.SelectMany(a => a.Users)
.ToList();
var userList2 = db.Users
.Where(a => a.OrganizationId == orgId)
.ToList();
both return the same ApplicationUser list for the selected Organization.
The first "projects" from Organization to Users, the second queries the Users table directly.

It's clearer when the query returns a string (a sequence of char):
For example if the list 'Fruits' contains 'apple'
'Select' returns the string:
Fruits.Select(s=>s)
[0]: "apple"
'SelectMany' flattens the string:
Fruits.SelectMany(s=>s)
[0]: 97 'a'
[1]: 112 'p'
[2]: 112 'p'
[3]: 108 'l'
[4]: 101 'e'

Consider this example :
var array = new string[2]
{
"I like what I like",
"I like what you like"
};
// query1 returns two elements, something like this:
// the first element is a string[5]: "I" "like" "what" "I" "like"
// the second element is a string[5]: "I" "like" "what" "you" "like"
IEnumerable<string[]> query1 = array.Select(s => s.Split(' ')).Distinct();
// query2 returns a flat result, something like this:
// "I" "like" "what" "you"
IEnumerable<string> query2 = array.SelectMany(s => s.Split(' ')).Distinct();
As you can see, duplicate values like "I" or "like" have been removed from query2, because SelectMany flattens all the sequences into one before Distinct runs.
query1, however, returns a sequence of string arrays. Since it contains two different array objects (the first and the second element) and Distinct compares arrays by reference, nothing is removed.
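The following sketch makes the reference-comparison point concrete. The helper names NestedDistinctCount and FlatDistinct are invented for the example, and the second input (two identical sentences) is an added test case rather than part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DistinctSketch
{
    // Distinct over string[] elements uses the default comparer, which for
    // arrays means reference equality, so every Split result counts as unique.
    public static int NestedDistinctCount(string[] lines) =>
        lines.Select(s => s.Split(' ')).Distinct().Count();

    // After flattening, Distinct compares the words themselves by value.
    public static List<string> FlatDistinct(string[] lines) =>
        lines.SelectMany(s => s.Split(' ')).Distinct().ToList();

    static void Main()
    {
        var array = new[] { "I like what I like", "I like what you like" };
        Console.WriteLine(NestedDistinctCount(array));             // 2
        Console.WriteLine(string.Join(" ", FlatDistinct(array)));  // I like what you

        // Even identical sentences produce two separate array objects.
        var identical = new[] { "a b", "a b" };
        Console.WriteLine(NestedDistinctCount(identical));         // 2
        Console.WriteLine(string.Join(" ", FlatDistinct(identical))); // a b
    }
}
```

To deduplicate the arrays themselves by content, Distinct would need a custom IEqualityComparer<string[]>. None is supplied here.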

The SelectMany method flattens an IEnumerable<IEnumerable<T>> into an IEnumerable<T>: every element ends up at the same level, regardless of which inner sequence it came from.
var words = new [] { "a,b,c", "d,e", "f" };
var splitAndCombine = words.SelectMany(x => x.Split(','));
// returns { "a", "b", "c", "d", "e", "f" }

One more example of how SelectMany + Select can be used to collect data from objects in nested lists.
Suppose we have users with their phones:
class Phone {
public string BasePart = "555-xxx-xxx";
}
class User {
public string Name = "Xxxxx";
public List<Phone> Phones;
}
Now we need to select the BasePart of every phone of every user:
var usersArray = new List<User>(); // a list of users, each holding a list of phones
List<string> allBaseParts = usersArray.SelectMany(ua => ua.Phones).Select(p => p.BasePart).ToList();

Suppose you have an array of countries
var countries = new[] { "France", "Italy" };
If you perform Select on countries, you get the elements of the array back as an IEnumerable<string>:
IEnumerable<string> selectQuery = countries.Select(country => country);
In the above code, country is a string that refers to each country in the array. Now iterate over selectQuery to get the countries:
foreach(var country in selectQuery)
Console.WriteLine(country);
// output
//
// France
// Italy
If you want to print every character of each country, you have to use a nested foreach:
foreach (var country in selectQuery)
{
foreach (var charOfCountry in country)
{
Console.Write(charOfCountry + ", ");
}
}
// output
// F, r, a, n, c, e, I, t, a, l, y,
Now perform SelectMany on countries. This time SelectMany again receives each country as a string, but because a string is a collection of chars, SelectMany splits each country into its constituent chars and returns them all as a single IEnumerable<char>:
IEnumerable<char> selectManyQuery = countries.SelectMany(country => country);
In the above code, country is a string that refers to each country in the array as before, but the return value is the chars of each country.
In other words, SelectMany reaches two levels into nested collections and flattens the second level into a single IEnumerable<T>.
Now iterate over selectManyQuery to get the chars of each country:
foreach(var charOfCountry in selectManyQuery)
Console.Write(charOfCountry + ", ");
// output
// F, r, a, n, c, e, I, t, a, l, y,

Here is a code example with an initialized small collection for testing:
class Program
{
static void Main(string[] args)
{
List<Order> orders = new List<Order>
{
new Order
{
OrderID = "orderID1",
OrderLines = new List<OrderLine>
{
new OrderLine
{
ProductSKU = "SKU1",
Quantity = 1
},
new OrderLine
{
ProductSKU = "SKU2",
Quantity = 2
},
new OrderLine
{
ProductSKU = "SKU3",
Quantity = 3
}
}
},
new Order
{
OrderID = "orderID2",
OrderLines = new List<OrderLine>
{
new OrderLine
{
ProductSKU = "SKU4",
Quantity = 4
},
new OrderLine
{
ProductSKU = "SKU5",
Quantity = 5
}
}
}
};
//required result is the list of all SKUs in orders
List<string> allSKUs = new List<string>();
//With Select case 2 foreach loops are required
var flattenedOrdersLinesSelectCase = orders.Select(o => o.OrderLines);
foreach (var flattenedOrderLine in flattenedOrdersLinesSelectCase)
{
foreach (OrderLine orderLine in flattenedOrderLine)
{
allSKUs.Add(orderLine.ProductSKU);
}
}
//With SelectMany case only one foreach loop is required
allSKUs = new List<string>();
var flattenedOrdersLinesSelectManyCase = orders.SelectMany(o => o.OrderLines);
foreach (var flattenedOrderLine in flattenedOrdersLinesSelectManyCase)
{
allSKUs.Add(flattenedOrderLine.ProductSKU);
}
//If the required result is a flattened list which has OrderID, ProductSKU and Quantity,
//SelectMany with a result selector is very helpful to get the required result.
//It also avoids writing your own for loops which, in my experience, makes the code faster when
//hundreds of thousands of data rows must be processed
List<OrderLineForReport> ordersLinesForReport = orders.SelectMany(o => o.OrderLines,
(o, ol) => new OrderLineForReport
{
OrderID = o.OrderID,
ProductSKU = ol.ProductSKU,
Quantity = ol.Quantity
}).ToList();
}
}
class Order
{
public string OrderID { get; set; }
public List<OrderLine> OrderLines { get; set; }
}
class OrderLine
{
public string ProductSKU { get; set; }
public int Quantity { get; set; }
}
class OrderLineForReport
{
public string OrderID { get; set; }
public string ProductSKU { get; set; }
public int Quantity { get; set; }
}

The Select operator is used to select values from a collection, and the SelectMany operator is used to select values from a collection of collections, i.e. a nested collection.

I think this is the best way to understand it:
var query =
Enumerable
.Range(1, 10)
.SelectMany(ints => Enumerable.Range(1, 10), (a, b) => $"{a} * {b} = {a * b}")
.ToArray();
Console.WriteLine(string.Join(Environment.NewLine, query));
Console.Read();
Multiplication Table example.
