Linq - Join with group by merging results - c#

I'm trying to merge the results from two tables into a list and then display it in a dictionary where the key is the domain and the value is a list of URLs from both tables.
class Source1Entity { string Domain {get;} string PageUrl {get;} /* more properties */ }
class Source2Entity { string Domain {get;} string PageUrl {get;} /* more properties */ }
I've got this far:
Dictionary<string, List<string>> results =
from firstSource in context.Source1
join secondSource in context.Source2 on firstSource.Domain equals secondSource.Domain
group firstSource by firstSource.Domain into g
...

It's not entirely clear, but I suspect you actually want to treat these two tables as equivalents - so you can project to a common form and then use Concat, then call ToLookup:
var projectedSource1 = context.Source1.Select(x => new { x.Domain, x.PageUrl });
var projectedSource2 = context.Source2.Select(x => new { x.Domain, x.PageUrl });
var results = projectedSource1
.Concat(projectedSource2)
.ToLookup(x => x.Domain, x => x.PageUrl);
Then:
// For a particular domain - you'll get an empty sequence if the
// domain isn't represented
foreach (var url in results[domain])
or
foreach (var entry in results)
{
Console.WriteLine("Domain: {0}", entry.Key);
foreach (var url in entry)
{
Console.WriteLine(" Url: {0}", url);
}
}
While you could use a Dictionary for this, a Lookup is generally more suitable for single-key-multiple-value queries.
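If you do need the Dictionary<string, List<string>> from the question, the same Concat idea can feed GroupBy and ToDictionary. This is only a minimal sketch: the tuple sequences are hypothetical in-memory stand-ins for context.Source1 and context.Source2, and MergeDemo and Merge are illustrative names that do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeDemo
{
    // source1/source2 are assumed in-memory stand-ins for the two tables.
    public static Dictionary<string, List<string>> Merge(
        IEnumerable<(string Domain, string PageUrl)> source1,
        IEnumerable<(string Domain, string PageUrl)> source2)
    {
        return source1
            .Concat(source2)                          // treat both sources as one sequence
            .GroupBy(x => x.Domain, x => x.PageUrl)   // key: domain, elements: URLs
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var first = new[] { ("a.com", "a.com/1"), ("b.com", "b.com/1") };
        var second = new[] { ("a.com", "a.com/2") };

        foreach (var entry in Merge(first, second))
        {
            Console.WriteLine("{0}: {1}", entry.Key, string.Join(", ", entry.Value));
        }
    }
}
```

Unlike the lookup, indexing this dictionary with a domain that is not present throws a KeyNotFoundException, so use TryGetValue when the domain may be missing.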

You need a GroupJoin.
Try this...
join secondSource in context.Source2 on firstSource.Domain equals secondSource.Domain into group_join
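To show what a group join produces, here is a rough sketch over in-memory data. The tuple sequences stand in for the two tables, and GroupJoinDemo, MatchSecondSource and SecondUrls are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinDemo
{
    // For every row of the first source, collect the URLs of the
    // second-source rows that share its domain.
    public static List<(string Domain, string PageUrl, List<string> SecondUrls)> MatchSecondSource(
        IEnumerable<(string Domain, string PageUrl)> source1,
        IEnumerable<(string Domain, string PageUrl)> source2)
    {
        var query =
            from firstSource in source1
            join secondSource in source2
                on firstSource.Domain equals secondSource.Domain into groupJoin
            select (Domain: firstSource.Domain,
                    PageUrl: firstSource.PageUrl,
                    SecondUrls: groupJoin.Select(s => s.PageUrl).ToList());

        return query.ToList();
    }

    public static void Main()
    {
        var first = new[] { ("a.com", "a.com/1"), ("b.com", "b.com/1") };
        var second = new[] { ("a.com", "a.com/2"), ("c.com", "c.com/1") };

        foreach (var row in MatchSecondSource(first, second))
        {
            Console.WriteLine("{0} {1} -> [{2}]",
                row.Domain, row.PageUrl, string.Join(", ", row.SecondUrls));
        }
    }
}
```

Note that a group join is driven by the first source: a domain that exists only in the second source (c.com in this sketch) does not appear in the result, and a domain with several rows in the first source yields one result row per first-source row.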

Related

My LINQ query is not ordering my dictionary

I have a Dictionary
Dictionary<string, List<Employee>> employees
that contains a list of employees. And I want to allow the user to display the list of employees in alphabetical order based on the state they are in.
Console.WriteLine($"If you would like to sort this list please enter one of the following choices..\n" +
$"'STATE', 'INCOME', 'ID', 'NAME', 'TAX' otherwise enter any key.");
var sort = Console.ReadLine().ToUpper();
var employees = EmployeeRecord.employees;
List<Employee> sortedEmps = new List<Employee>();
if (sort.Contains("STATE")) {
foreach (var list in employees.Values) {
var columnQuery = list.OrderBy(x => x.stateCode).ToList();
sortedEmps.AddRange(columnQuery);
}
}
//Print out the newly ordered list
foreach (Employee r in sortedEmps) {
Console.WriteLine($"ID: {r.iD} Name: {r.name} State: {r.stateCode} Income:{r.income} Tax Due: {r.taxDue}");
}
However, it still prints out the list without ordering it. How can I get it to order alphabetically by the state code?
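Sort after you have merged all the data. Your code sorts each list on its own, so the combined list is only ordered within each dictionary entry, not as a whole.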
if (sort.Contains("STATE")) {
foreach (var list in employees.Values) {
sortedEmps.AddRange(list);
}
sortedEmps = sortedEmps.OrderBy(x => x.stateCode).ToList();
}
You can also shorten the code a little with SelectMany, as @Robert Harvey suggested:
if (sort.Contains("STATE")) {
sortedEmps = employees.Values
.SelectMany(x => x)
.OrderBy(o => o.stateCode).ToList();
}

Contains matches on pairs of items

I have a mapping table in the following form:
Id ReferenceId ReferenceType LinkId
To retrieve a set of combinations, I could run each query separately:
var pairs = new List<Pair>
{
Pair.Create(1000, "Car"),
Pair.Create(2000, "Truck"),
};
var maps = new List<Mapping>();
foreach (var pair in pairs)
{
maps.AddRange(context.Mappings.Where(x => x.ReferenceId == pair.Id && x.ReferenceType == pair.Type).ToList());
}
However, I want to combine these into a single statement to reduce my hits on the db. Is there some form of Contains statement that can work with pairs of objects? Or is it possible to append an OR clause onto an IQueryable within a loop? Any other solutions?
Not sure if it works for your LINQ provider but you could try to join with an anonymous type:
var mapQuery = from p in pairs
join m in context.Mappings
on new { p.Id, p.Type } equals new { Id = m.ReferenceId, Type = m.ReferenceType }
select m;
List<Mapping> maps = mapQuery.ToList();
You could union your queries together.
Something like this:
var pairs = new List<Pair>
{
Pair.Create(1000, "Car"),
Pair.Create(2000, "Truck"),
};
List<Mapping> result =
pairs
.Select(pair =>
context.Mappings.Where(
x => x.ReferenceId == pair.Id
&& x.ReferenceType == pair.Type))
.Aggregate(Queryable.Union)
.ToList();

Form a list of distinct words from set of repeating words lists in c#

I have a model:
public class CompanyModel1
{
public string compnName1 { get; set; }
public string compnKeyProcesses1 { get; set; }
}
then I form a list:
List<CompanyModel1> companies1 = new List<CompanyModel1>();
If I access its values:
var newpairs = companies1.Select(x => new { Name = x.compnName1, Processes = x.compnKeyProcesses1 });
foreach (var item in newpairs)
{
string CName = item.Name;
string Process = item.Processes;
}
I will get value like:
CName = "name1"
Process = "Casting, Casting, Casting, Welding, brazing & soldering"
and
CName = "name2"
Process = "Casting, Welding, Casting, Forming & Forging, Moulding"
etc.
Now I want to form a list of the distinct processes and count how many names each process appears under.
For example with these two above, I have to form a list like following:
"Casting, Welding, brazing & soldering, Forming & Forging, Moulding"
Counting gives 5 distinct processes, with the following frequency by name:
"Casting" appears in 2 names
"Welding" appears in 2 names
"brazing & soldering" appears in 1 names
"Forming & Forging" appears in 1 names
"Moulding" appears in 1 names
I think LINQ can help with this problem, maybe something like this:
var list= Process
.SelectMany(u => u.Split(new string[] { ", " }, StringSplitOptions.None))
.GroupBy(s => s)
.ToDictionary(g => g.Key, g => g.Count());
var numberOfProcess = list.Count;
var numberOfNameWithProcessOne = list["Process1"];
But how could I put that in the foreach loop, apply it to all the names and processes that I have, and get the result I want?
var processes = companies1.SelectMany(
c => c.compnKeyProcesses1.Split(new char[] { ',' }).Select(s => s.Trim()).Distinct())
.GroupBy(s => s).ToDictionary(g => g.Key, g => g.Count());
foreach(var process in processes)
{
Console.WriteLine("\"{0}\" appears in {1} names", process.Key, process.Value);
}
This selects only the distinct processes from each individual company, and then uses SelectMany to flatten them into one master list, so each process appears once per company that has it. Then we just count the occurrences of each process in the final list, and put them into a dictionary of process => count.
EDIT:
Here is another solution that groups the data in a dictionary, to allow showing the associated companies with each process. The dictionary is from Process Names -> List of Company Names.
Func<string, IEnumerable<string>> stringToListConverter = s => s.Split(new char[] { ',' }).Select(ss => ss.Trim());
var companiesDict = companies1.ToDictionary(c => c.compnName1, c => stringToListConverter(c.compnKeyProcesses1).Distinct());
var processesAll = companies1.SelectMany(c => stringToListConverter(c.compnKeyProcesses1)).Distinct();
var processesToNames = processesAll.ToDictionary(s => s, s => companiesDict.Where(d => d.Value.Contains(s)).Select(d => d.Key).ToList());
foreach(var processToName in processesToNames)
{
List<string> companyNames = processToName.Value;
Console.WriteLine("\"{0}\" appears in {1} names : {2}", processToName.Key, companyNames.Count, String.Join(", ", companyNames));
}
I've saved the stringToListConverter Func delegate to convert the process string into a list, and used that delegate in two of the queries.
This query would be more readable if the CompanyModel1 class stored the compnKeyProcesses1 field as a List<string> instead of just one big string. That way you could query the list directly instead of having to split, select, and trim every time.
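As a rough illustration of that last suggestion, the sketch below assumes a reshaped model in which the processes are already a List<string>. CompanyModel, Name, Processes and ProcessCountDemo are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of CompanyModel1 that stores processes as a list.
public class CompanyModel
{
    public string Name { get; set; }
    public List<string> Processes { get; set; }
}

public static class ProcessCountDemo
{
    // Counts, for each process, how many companies list it at least once.
    public static Dictionary<string, int> CountCompaniesPerProcess(
        IEnumerable<CompanyModel> companies)
    {
        return companies
            .SelectMany(c => c.Processes.Distinct()) // one entry per company per process
            .GroupBy(p => p)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var companies = new List<CompanyModel>
        {
            new CompanyModel
            {
                Name = "name1",
                Processes = new List<string>
                    { "Casting", "Casting", "Casting", "Welding", "brazing & soldering" }
            },
            new CompanyModel
            {
                Name = "name2",
                Processes = new List<string>
                    { "Casting", "Welding", "Casting", "Forming & Forging", "Moulding" }
            }
        };

        foreach (var pair in CountCompaniesPerProcess(companies))
        {
            Console.WriteLine("\"{0}\" appears in {1} names", pair.Key, pair.Value);
        }
    }
}
```

With the list-based model the Split and Trim calls disappear from the query, and only the per-company Distinct remains.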

LINQ Getting Distinct Data and looping through to get related data

I'm new to Linq. I want to know whether this is the best way or are there any other ways to do this.
I have a requirement where from a web service, I receive a list of items:
class Item {
string ItemName { get; set;}
string GroupName { get; set; }
}
I receive the following data:
ItemName: Item1; GroupName: A
ItemName: Item2; GroupName: B
ItemName: Item3; GroupName: B
ItemName: Item4; GroupName: A
ItemName: Item5; GroupName: A
Now I want to get all of the unique Groups in the list, and associate all the Items to that Group. So I made a class:
class Group {
string GroupName { get; set; }
List<string> Items { get; set; }
}
So that there is a single group and all associated Items will be under the List.
I made two LINQ statements:
var uniqueGroups = (from g in webservice
where g.GroupName != null
select g.GroupName).Distinct();
Then I loop through it
foreach (var gn in uniqueGroups)
{
var itemsAssociated = (from item in webservice
where item.GroupName == gn.ToString()
select new {
});
}
and then I got the items, and save them to my object.
Is this the best way to do this, or is there a LINQ statement that can do all of this in one go?
Thanks.
Sounds like you want GroupBy
var itemsByGroup = items.GroupBy(i => i.GroupName);
foreach (var group in itemsByGroup)
{
var groupName = group.Key;
var itemsForThisGroup = group;
foreach (var item in itemsForThisGroup)
{
Console.Out.WriteLine(item.ItemName);
}
}
You can try this:
//List<Item> webservice = list with items from your webservice
var result = (from i in webservice
group i by i.GroupName into groups
select new Group()
{
GroupName = groups.Key,
Items = groups.Select(g => g.ItemName).ToList()
}).ToList();
I would use:
webservice.ToLookup(k => k.GroupName);
That would eliminate the need for the extra class.
Hope this helps!
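A small sketch of the ToLookup approach, using the sample data from the question. The Item class is redeclared here with public members so the block compiles on its own, and LookupDemo and ItemNamesByGroup are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string ItemName { get; set; }
    public string GroupName { get; set; }
}

public static class LookupDemo
{
    // Key: group name; elements: the item names in that group.
    public static ILookup<string, string> ItemNamesByGroup(IEnumerable<Item> items)
    {
        return items
            .Where(i => i.GroupName != null)
            .ToLookup(i => i.GroupName, i => i.ItemName);
    }

    public static void Main()
    {
        var webservice = new List<Item>
        {
            new Item { ItemName = "Item1", GroupName = "A" },
            new Item { ItemName = "Item2", GroupName = "B" },
            new Item { ItemName = "Item3", GroupName = "B" },
            new Item { ItemName = "Item4", GroupName = "A" },
            new Item { ItemName = "Item5", GroupName = "A" }
        };

        var lookup = ItemNamesByGroup(webservice);
        foreach (var group in lookup)
        {
            Console.WriteLine("{0}: {1}", group.Key, string.Join(", ", group));
        }
    }
}
```

Indexing the lookup with a group name that does not exist returns an empty sequence rather than throwing.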
That could be done all at once with an anonymous type and Enumerable.GroupBy:
var groupItems =
webservice.Where(i => i.GroupName != null)
.GroupBy(i => i.GroupName)
.Select(grp => new { Group = grp.Key, Items = grp.ToList() });
foreach (var groupItem in groupItems)
Console.WriteLine("Groupname: {0} Items: {1}"
, groupItem.Group
, string.Join(",", groupItem.Items.Select(i => i.ItemName)));
Distinct is unnecessary here, since GroupBy always produces distinct keys; that is the nature of a group.
Here's running code: http://ideone.com/R3jjZ

Using LINQ to group a list of strings based on known substrings that they will contain

I have a known list of strings like the following:
List<string> groupNames = new List<string>(){"Group1","Group2","Group3"};
I also have a list of strings that is not known in advance that will be something like this:
List<string> dataList = new List<string>()
{
"Group1.SomeOtherText",
"Group1.SomeOtherText2",
"Group3.MoreText",
"Group2.EvenMoreText"
};
I want to write a LINQ statement that converts the dataList into either an anonymous object or a dictionary whose Key is the group name and whose Value is the list of strings in that group. The intention is to loop over the groups, then loop over each group's list, and perform different actions on the strings depending on which group they are in.
I would like a data structure that looks something like this:
var grouped = new
{
new
{
Key="Group1",
DataList=new List<string>()
{
"Group1.SomeOtherText",
"Group1.SomeOtherText2"
}
},
new
{
Key="Group2",
DataList=new List<string>()
{
"Group2.EvenMoreText"
}
}
...
};
I know I can just loop through the dataList and then check if each string contains the group name then add them to individual lists, but I am trying to learn the LINQ way of doing such a task.
Thanks in advance.
EDIT:
Just had another idea... What if my group names were in an Enum?
public enum Groups
{
Group1,
Group2,
Group3
}
How would I get that into a Dictionary<Groups, List<string>>?
This is what I am trying, but I am not sure how to form the ToDictionary part:
Dictionary<Groups,List<string>> groupedDictionary = (from groupName in Enum.GetNames(typeof(Groups))
from data in dataList
where data.Contains(groupName)
group data by groupName).ToDictionary<Groups,List<string>>(...NOT SURE WHAT TO PUT HERE....);
EDIT 2:
Found the solution to the Enum question:
var enumType = typeof(Groups);
Dictionary<Groups,List<string>> query = (from groupName in Enum.GetValues(enumType).Cast<Groups>()
from data in dataList
where data.Contains(Enum.GetName(enumType, groupName))
group data by groupName).ToDictionary(x => x.Key, x=> x.ToList());
That looks like:
var query = from groupName in groupNames
from data in dataList
where data.StartsWith(groupName)
group data by groupName;
Note that this isn't a join, as potentially there are overlapping group names "G" and "Gr" for example, so an item could match multiple group names. If you could extract a group name from each item (e.g. by taking everything before the first dot) then you could use "join ... into" to get a group join. Anyway...
Then:
foreach (var result in query)
{
Console.WriteLine("Key: {0}", result.Key);
foreach (var item in result)
{
Console.WriteLine(" " + item);
}
}
If you really need the anonymous type, you can do...
var query = from groupName in groupNames
from data in dataList
where data.StartsWith(groupName)
group data by groupName into g
select new { g.Key, DataList = g.ToList() };
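The group-join variant mentioned above could look roughly like the sketch below. It assumes the group name is everything before the first dot in each item and that the group names are distinct; PrefixGroupDemo and GroupByPrefix are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixGroupDemo
{
    public static Dictionary<string, List<string>> GroupByPrefix(
        IEnumerable<string> groupNames,
        IEnumerable<string> dataList)
    {
        var query =
            from groupName in groupNames
            join data in dataList
                // Assumption: the text before the first dot is the group name.
                on groupName equals data.Split('.')[0] into g
            select new { Key = groupName, DataList = g.ToList() };

        // ToDictionary throws if groupNames contains duplicates.
        return query.ToDictionary(x => x.Key, x => x.DataList);
    }

    public static void Main()
    {
        var groupNames = new List<string> { "Group1", "Group2", "Group3" };
        var dataList = new List<string>
        {
            "Group1.SomeOtherText",
            "Group1.SomeOtherText2",
            "Group3.MoreText",
            "Group2.EvenMoreText"
        };

        foreach (var entry in GroupByPrefix(groupNames, dataList))
        {
            Console.WriteLine("Key: {0}", entry.Key);
            foreach (var item in entry.Value)
            {
                Console.WriteLine(" " + item);
            }
        }
    }
}
```

Because this is a group join, every group name gets an entry, with an empty list if nothing matches, and an exact match on the extracted prefix means "G" no longer picks up items that belong to "Gr".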
