Linq Group By and merge rows

I have a table similar to the following format:
id | name | year | Quality| Location |
------------------------------------------
1 | Apple | year1 | Good | Asia |
2 | Apple | year2 | Better | Asia |
3 | Apple | year3 | Best | Asia |
4 | Apple | year1 | Best | Africa |
5 | Apple | year2 | Bad | Africa |
6 | Apple | year3 | Better | Africa |
7 | Apple | year1 | Best | Europe |
8 | Apple | year2 | Bad | Europe |
9 | Apple | year3 | Better | Europe |
10 | Orange | year1 | Bad | Asia |
11 | Orange | year2 | Better | Asia |
12 | Orange | year3 | Bad | Asia |
13 | Orange | year1 | Best | Africa |
14 | Orange | year2 | Better | Africa |
15 | Orange | year3 | Bad | Africa |
16 | Orange | year1 | Best | Europe |
17 | Orange | year2 | Better | Europe |
18 | Orange | year3 | Best | Europe |
19 | Mango | year1 | Bad | Asia |
20 | Mango | year2 | Better | Asia |
21 | Mango | year3 | Better | Asia |
22 | Mango | year1 | Good | Africa |
23 | Mango | year2 | Better | Africa |
24 | Mango | year3 | Good | Africa |
25 | Mango | year1 | Best | Europe |
26 | Mango | year2 | Better | Europe |
27 | Mango | year3 | Best | Europe |
I need the list in this format in LINQ:
{ Location: Asia, year: year1, Good: 1, Bad: 2, Better: 0, Best: 0 }
{ Location: Asia, year: year2, Good: 0, Bad: 0, Better: 3, Best: 0 }
{ Location: Asia, year: year3, Good: 0, Bad: 1, Better: 1, Best: 1 }
.
.
.
.
My LINQ query is this:
var result = context.Fruits
.GroupBy(f => new { f.Year, f.Location, f.Quality })
.Select(g => new
{
Year = g.Key.Year,
Location = g.Key.Location,
Quality = g.Key.Quality,
Count = g.Count()
});
This gives me something along the lines of:
{ Location: Asia, year: year1, Quality: Good, Count: 1 }
{ Location: Asia, year: year1, Quality: Bad, Count: 2 }
.
.
.
.
How do I get the required format with LINQ? Do I have to get the result and then use a foreach loop to reshape it into the format that I need?

You want to Count() the Quality property, so it does not make sense to use it as a key in the grouping. If you omit it and group only by Year and Location, you get the desired output.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
var fruits = new List<Fruit> {
new Fruit { Id = 1, Name = "Apple", Year = "year1", Quality = "Good", Location ="Asia"},
new Fruit { Id = 2, Name = "Apple", Year = "year2", Quality = "Better", Location ="Asia"},
new Fruit { Id = 3, Name = "Apple", Year = "year3", Quality = "Better", Location ="Asia"},
new Fruit { Id = 4, Name = "Apple", Year = "year1", Quality = "Best", Location ="Africa"},
new Fruit { Id = 5, Name = "Orange", Year = "year1", Quality = "Bad", Location ="Asia"},
new Fruit { Id = 6, Name = "Orange", Year = "year2", Quality = "Better", Location ="Asia"},
new Fruit { Id = 7, Name = "Orange", Year = "year3", Quality = "Bad", Location ="Asia"},
};
var result = fruits.GroupBy(f => new { f.Year,f.Location})
.Select(g => new
{
Year = g.Key.Year,
Location = g.Key.Location,
Good = g.Count(x => x.Quality == "Good"),
Better = g.Count(x => x.Quality == "Better"),
Best = g.Count(x => x.Quality == "Best"),
});
foreach(var line in result) {
Console.WriteLine(String.Format("Year: {0} - Location: {1} - Good: {2} - Better: {3} - Best: {4}", line.Year, line.Location, line.Good, line.Better, line.Best));
}
}
}
public class Fruit
{
public int Id { get; set; }
public string Name { get; set; }
public string Year { get; set; }
public string Quality { get; set; }
public string Location { get; set; }
}
Output:
Year: year1 - Location: Asia - Good: 1 - Better: 0 - Best: 0
Year: year2 - Location: Asia - Good: 0 - Better: 2 - Best: 0
Year: year3 - Location: Asia - Good: 0 - Better: 1 - Best: 0
Year: year1 - Location: Africa - Good: 0 - Better: 0 - Best: 1
Fiddle: https://dotnetfiddle.net/eg99at
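The query above leaves out the Bad column that the question asked for. A minimal sketch that adds it is shown here; the `FruitRow`, `QualityCounts`, and `FruitPivot` names are assumptions made for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; carries only the columns the pivot needs.
public class FruitRow
{
    public string Year { get; set; }
    public string Location { get; set; }
    public string Quality { get; set; }
}

// Hypothetical result type with one counter per quality value.
public class QualityCounts
{
    public string Year { get; set; }
    public string Location { get; set; }
    public int Good { get; set; }
    public int Bad { get; set; }
    public int Better { get; set; }
    public int Best { get; set; }
}

public static class FruitPivot
{
    // Groups by (Year, Location) and counts each quality inside the group.
    public static List<QualityCounts> CountByQuality(IEnumerable<FruitRow> rows)
    {
        return rows
            .GroupBy(f => new { f.Year, f.Location })
            .Select(g => new QualityCounts
            {
                Year = g.Key.Year,
                Location = g.Key.Location,
                Good = g.Count(x => x.Quality == "Good"),
                Bad = g.Count(x => x.Quality == "Bad"),
                Better = g.Count(x => x.Quality == "Better"),
                Best = g.Count(x => x.Quality == "Best"),
            })
            .ToList();
    }
}
```

Calling `FruitPivot.CountByQuality(rows)` on the three Asia/year1 rows from the question (Good, Bad, Bad) should give one entry with Good = 1 and Bad = 2, which matches the first line of the requested format.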

Related

Calculate Year-Over-Year Change in DataTable

I have DataTable mocked-up below:
+----+------+-------------+--------+
| ID | YEAR | PERSON_NAME | AMOUNT |
+----+------+-------------+--------+
| 1 | 2004 | BARBARA | 500 |
| 2 | 2004 | BOB | 100 |
| 3 | 2004 | JANE | 30 |
| 4 | 2004 | JOHN | 200 |
| 5 | 2005 | BARBARA | 505 |
| 6 | 2005 | BOB | 150 |
| 7 | 2005 | JANE | 15 |
| 8 | 2005 | JOHN | 215 |
| 10 | 2006 | BARBARA | 523 |
| 11 | 2006 | BOB | 185 |
| 12 | 2006 | JANE | 25 |
| 13 | 2006 | JOHN | 207 |
+----+------+-------------+--------+
I am trying to add a new column that will track the year-over-year change of the amounts of each person:
+----+------+-------------+--------+-------+
| ID | YEAR | PERSON_NAME | AMOUNT | Y-O-Y |
+----+------+-------------+--------+-------+
| 1 | 2004 | BARBARA | 500 | |
| 2 | 2004 | BOB | 100 | |
| 3 | 2004 | JANE | 30 | |
| 4 | 2004 | JOHN | 200 | |
| 5 | 2005 | BARBARA | 505 | 5 |
| 6 | 2005 | BOB | 150 | 50 |
| 7 | 2005 | JANE | 15 | -15 |
| 8 | 2005 | JOHN | 215 | 15 |
| 10 | 2006 | BARBARA | 523 | 18 |
| 11 | 2006 | BOB | 185 | 35 |
| 12 | 2006 | JANE | 25 | 10 |
| 13 | 2006 | JOHN | 207 | -8 |
+----+------+-------------+--------+-------+
I've achieved this easily in SQL by joining the table to itself with some ON conditions. I tried to mimic the same logic with a C# DataTable and got it to work, but in a convoluted way. I was wondering if there is a cleaner way with LINQ or DataViews, or just a compact algorithm, to achieve the same effect. Thanks!

Try the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
DataTable dt = new DataTable();
dt.Columns.Add("ID", typeof(int));
dt.Columns.Add("YEAR", typeof(int));
dt.Columns.Add("PERSON_NAME", typeof(string));
dt.Columns.Add("AMOUNT", typeof(int));
dt.Rows.Add(new object[] { 1, 2004, "BARBARA", 500 });
dt.Rows.Add(new object[] { 2, 2004, "BOB", 100 });
dt.Rows.Add(new object[] { 3, 2004, "JANE", 30 });
dt.Rows.Add(new object[] { 4, 2004, "JOHN", 200 });
dt.Rows.Add(new object[] { 5, 2005, "BARBARA", 505 });
dt.Rows.Add(new object[] { 6, 2005, "BOB", 150 });
dt.Rows.Add(new object[] { 7, 2005, "JANE", 15 });
dt.Rows.Add(new object[] { 8, 2005, "JOHN", 215 });
dt.Rows.Add(new object[] { 10, 2006, "BARBARA", 523 });
dt.Rows.Add(new object[] { 11, 2006, "BOB", 185 });
dt.Rows.Add(new object[] { 12, 2006, "JANE", 25 });
dt.Rows.Add(new object[] { 13, 2006, "JOHN", 207 });
dt.Columns.Add("Y-O-Y", typeof(int));
List<List<DataRow>> groups = dt.AsEnumerable()
.OrderBy(x => x.Field<int>("YEAR"))
.GroupBy(x => x.Field<string>("PERSON_NAME"))
.Select(x => x.ToList())
.ToList();
foreach (List<DataRow> person in groups)
{
for (int i = 1; i < person.Count(); i++)
{
person[i]["Y-O-Y"] = person[i].Field<int>("AMOUNT") - person[i - 1].Field<int>("AMOUNT");
//or
//person[i]["Y-O-Y"] = (int)person[i]["AMOUNT"] - (int)person[i - 1]["AMOUNT"];
}
}
}
}
}
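The question mentions solving this in SQL with a self-join. A rough LINQ equivalent of that idea is sketched here on plain objects instead of a DataTable. It assumes that "previous year" always means YEAR - 1 for the same person, and the `AmountRow`, `AmountChange`, and `YoyCalculator` names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input row (a plain-object stand-in for a DataRow).
public class AmountRow
{
    public int Year { get; set; }
    public string Person { get; set; }
    public int Amount { get; set; }
}

// Hypothetical output row; YearOverYear is null when there is no prior year.
public class AmountChange
{
    public int Year { get; set; }
    public string Person { get; set; }
    public int Amount { get; set; }
    public int? YearOverYear { get; set; }
}

public static class YoyCalculator
{
    // Left-joins each row to the same person's row from the previous year.
    public static List<AmountChange> Compute(IEnumerable<AmountRow> rows)
    {
        var list = rows.ToList();
        var query =
            from cur in list
            join prev in list
                on new { cur.Person, Year = cur.Year - 1 }
                equals new { prev.Person, prev.Year }
                into matches
            from prev in matches.DefaultIfEmpty()
            select new AmountChange
            {
                Year = cur.Year,
                Person = cur.Person,
                Amount = cur.Amount,
                YearOverYear = prev == null ? (int?)null : cur.Amount - prev.Amount
            };
        return query.ToList();
    }
}
```

Unlike the grouped loop above, this version leaves the change empty when a year is missing for a person rather than comparing against the nearest earlier year; which behaviour is wanted depends on the data.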

Finding multiple unique matches from List<object> where two criteria have to be different

I am having trouble selecting the first item in a list that is unique based on two fields, JOB_ID and EMPLOYEE_ID.
Each job should only be assigned to one employee (the one with the lowest OVERALL_SCORE), then move on and assign the next employee.
The List Objects are as follows:
JobMatch.cs
public int JOB_ID { get; set; }
public int JOB_MATCHES_COUNT { get; set; }
EmployeeMatch.cs
public int EMPLOYEE_ID { get; set; }
public int EMPLOYEE_MATCHES_COUNT { get; set; }
Rankings.cs
public int JOB_ID { get; set; }
public int EMPLOYEE_ID { get; set; }
public int TRAVEL_TIME_MINUTES { get; set; }
public int PRIORITY { get; set; }
public int OVERALL_SCORE { get; set; }
Rankings.cs gets an overall score based on the travel time field and
number of matches an Employee/Job has.
EmployeeMatch.cs
+-------------+-------------------+
| EMPLOYEE_ID | EMP_MATCHES_COUNT |
+-------------+-------------------+
| 3 | 1 |
| 4 | 1 |
| 2 | 3 |
| 1 | 4 |
+-------------+-------------------+
JobMatch.cs
+--------+-------------------+
| JOB_ID | JOB_MATCHES_COUNT |
+--------+-------------------+
| 1 | 1 |
| 2 | 2 |
| 3 | 2 |
| 4 | 4 |
+--------+-------------------+
Ranking.cs (shortened as to not fill the screen)
+--------+-------------+---------------+
| JOB_ID | EMPLOYEE_ID | OVERALL_SCORE |
+--------+-------------+---------------+
| 4 | 3 | 800 |
| 4 | 4 | 800 |
| 3 | 1 | 800 |
| 3 | 2 | 1200 |
| 2 | 1 | 1600 |
| 2 | 2 | 1800 |
| 4 | 1 | 2000 |
| 4 | 2 | 2100 |
| 1 | 1 | 6400 |
+--------+-------------+---------------+
Basically, the idea is to select the first unique Employee and Job in this list and then the best matches will be put into a separate list, something like the following for the above scenario:
+--------+-------------+---------------+
| JOB_ID | EMPLOYEE_ID | OVERALL_SCORE |
+--------+-------------+---------------+
| 4 | 3 | 800 |
| 3 | 1 | 800 |
| 2 | 2 | 1800 |
+--------+-------------+---------------+
I tried the following but it didn't work as intended:
var FirstOrder = (rankings.GroupBy(u => u.JOB_ID)
.Select(g => g.First())).ToList();
var SecondOrder = (FirstOrder.GroupBy(u => u.EMPLOYEE_ID)
.Select(g => g.First())).ToList();

The idea is to take the first element, then remove every element that shares its job or its employee so that the next choice is unique, as below. This relies on the list already being sorted by OVERALL_SCORE:
var rankings = new List<Rankings> {
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 3, OVERALL_SCORE= 800 },
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 4, OVERALL_SCORE= 800 },
new Rankings{ JOB_ID= 3,EMPLOYEE_ID= 1, OVERALL_SCORE= 800 },
new Rankings{ JOB_ID= 3,EMPLOYEE_ID= 2, OVERALL_SCORE= 1200 },
new Rankings{ JOB_ID= 2,EMPLOYEE_ID= 1, OVERALL_SCORE= 1600 },
new Rankings{ JOB_ID= 2,EMPLOYEE_ID= 2, OVERALL_SCORE= 1800 },
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 1, OVERALL_SCORE= 2000 },
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 2, OVERALL_SCORE= 2100 },
new Rankings{ JOB_ID= 1,EMPLOYEE_ID= 1, OVERALL_SCORE= 6400 },
};
var cpy = new List<Rankings>(rankings);
var result = new List<Rankings>();
while (cpy.Count() > 0)
{
var first = cpy.First();
result.Add(first);
cpy.RemoveAll(r => r.EMPLOYEE_ID == first.EMPLOYEE_ID || r.JOB_ID == first.JOB_ID);
}
result:
+--------+-------------+---------------+
| JOB_ID | EMPLOYEE_ID | OVERALL_SCORE |
+--------+-------------+---------------+
| 4 | 3 | 800 |
| 3 | 1 | 800 |
| 2 | 2 | 1800 |
+--------+-------------+---------------+
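The same greedy selection can also be done in a single pass by remembering which jobs and employees are already taken, instead of repeatedly removing items from a copy of the list. This is only a sketch: `RankingRow` and `GreedyAssigner` are hypothetical names, and the method sorts by score itself rather than assuming sorted input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Rankings class in the question.
public class RankingRow
{
    public int JobId { get; set; }
    public int EmployeeId { get; set; }
    public int OverallScore { get; set; }
}

public static class GreedyAssigner
{
    // Walks the rankings from lowest score upwards and keeps a row only if
    // neither its job nor its employee has been assigned yet.
    public static List<RankingRow> Assign(IEnumerable<RankingRow> rankings)
    {
        var usedJobs = new HashSet<int>();
        var usedEmployees = new HashSet<int>();
        var result = new List<RankingRow>();

        // OrderBy is a stable sort, so rows with equal scores keep their input order.
        foreach (var r in rankings.OrderBy(x => x.OverallScore))
        {
            if (usedJobs.Contains(r.JobId) || usedEmployees.Contains(r.EmployeeId))
            {
                continue;
            }
            usedJobs.Add(r.JobId);
            usedEmployees.Add(r.EmployeeId);
            result.Add(r);
        }
        return result;
    }
}
```

For the nine rows in the question this should select the same three pairs as the loop above: job 4 with employee 3, job 3 with employee 1, and job 2 with employee 2.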

Really, if you're trying to get the best score for the job, you don't need to select by unique JOB_ID/EMPLOYEE_ID, you need to sort by JOB_ID/OVERALL_SCORE, and pick out the first matching employee per JOB_ID (that's not already in the "assigned list").
You could get the items in order using LINQ:
var sorted = new List<Ranking>
(
rankings
.OrderBy( r => r.JOB_ID )
.ThenBy( r => r.OVERALL_SCORE )
);
...and then peel off the employees you want...
var best = new List<Ranking>( );
sorted.ForEach( r1 =>
{
if ( !best.Any
(
r2 =>
r1.JOB_ID == r2.JOB_ID
||
r1.EMPLOYEE_ID == r2.EMPLOYEE_ID
) )
{
best.Add( r1 );
}
} );
Instead of using LINQ to produce a sorted list, you could implement IComparable<Ranking> on Ranking and then just sort your rankings:
public class Ranking : IComparable<Ranking>
{
int IComparable<Ranking>.CompareTo( Ranking other )
{
var jobFirst = this.JOB_ID.CompareTo( other.JOB_ID );
return
jobFirst == 0?
this.OVERALL_SCORE.CompareTo( other.OVERALL_SCORE ):
jobFirst;
}
//--> other stuff...
}
Then, when you Sort() the Rankings, they'll be in JOB_ID/OVERALL_SCORE order. Implementing IComparable<Ranking> is probably faster and uses less memory.
Note that there may be an unstated objective here. Is it more important to fill the most jobs, or is it more important to find work for the most employees? The route I took does what you suggest and simply takes the best employee for each job as you go. But the only employee available for job 2 may also be the best employee for job 1, and if you put him/her on job 1, you might not have anybody left for job 2. It could get complicated :-)

Basically, you could use the Distinct method from System.Linq with a custom equality comparer, IEqualityComparer<Ranking>. LINQ provides this overload out of the box.
public class Comparer : IEqualityComparer<Ranking>
{
public bool Equals(Ranking l, Ranking r)
{
return l.JOB_ID == r.JOB_ID || l.EMPLOYEE_ID == r.EMPLOYEE_ID;
}
public int GetHashCode(Ranking obj)
{
return 1;
}
}
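The trick is in the GetHashCode method: returning a constant makes Distinct fall back to Equals for every comparison. Note that this Equals is not transitive, so the result depends on the order in which Distinct compares items. After that, it is as simple as this: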
rankings.Distinct(new Comparer())

Dividing a List into Blocks in C#

I have a list of branches, each with some number N of employees (a Branch object with a NumberEmployees property). I need to iterate over that list, sending the employees in blocks. The following table explains it better. I sort the list by number of employees; so far no problem.
+---------+-----------+
| Branch | Employees |
+---------+-----------+
|MEXICO | 800 |
|USA | 700 |
|INDIA | 500 |
|CHINA | 400 |
|AUSTRALIA| 300 |
+---------+-----------+
Now I want to iterate through the list, dividing the number of employees into blocks, something like this:
+-----------+------------+-------------+------------+
| Branch | FirstGroup | SecondGroup | ThirdGroup |
+-----------+------------+-------------+------------+
| Mexico | 267 | 267 | 267 |
| USA | 234 | 234 | 234 |
| India | 167 | 167 | 167 |
| China | 134 | 134 | 134 |
| Australia | 100 | 100 | 100 |
+-----------+------------+-------------+------------+
In the end I think the list that should result would be:
+-----------+-----------+
| Branch | Employees |
+-----------+-----------+
| Mexico | 267 |
| USA | 234 |
| India | 167 |
| China | 134 |
| Australia | 100 |
| Mexico | 267 |
| USA | 234 |
| India | 167 |
| China | 134 |
| Australia | 100 |
| Mexico | 267 |
| USA | 234 |
| India | 167 |
| China | 134 |
| Australia | 100 |
+-----------+-----------+
So far I can only order the List.
double TotalEmployees = ListBranch.Sum(item => item.EmployeeNumber);
double blockSize = TotalEmployees / ListBranch.Count();
double sizeQuery = Math.Ceiling(blockSize);
foreach (Branch branch in ListBranch.OrderByDescending(f => f. EmployeeNumber))
{
//to do
}
I appreciate your valuable help for any clues you can give me

This might do the trick for you
List<BranchEmployee> be = new List<BranchEmployee>();
be.Add(new BranchEmployee() { Branch = "MEXICO", Employee = 800 });
be.Add(new BranchEmployee() { Branch = "USA", Employee = 700 });
be.Add(new BranchEmployee() { Branch = "INDIA", Employee = 500 });
be.Add(new BranchEmployee() { Branch = "CHINA", Employee = 400 });
be.Add(new BranchEmployee() { Branch = "AUSTRALIA", Employee = 300 });
List<BranchEmployee> ExpectedBE = new List<BranchEmployee>();
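for(int i = 0; i <= 2; i++)
{
foreach(BranchEmployee smbe in be)
{
ExpectedBE.Add(new BranchEmployee()
{
Branch = smbe.Branch,
// Round up so that 800 / 3 gives 267 rather than 266, as in the expected table.
// Math.Ceiling requires "using System;".
Employee = (int)Math.Ceiling(smbe.Employee / 3.0)
});
}
}
What I see is that every group gets the same number of employees per branch: the branch's employee count divided by 3, rounded up.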
To hold the data in the shape you have shown, I created a class like this:
public class BranchEmployee
{
public string Branch { get; set; }
public int Employee { get; set; }
}
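If the number of blocks should not be hard-coded to 3, the same idea can be written with a parameter. This is a sketch under assumptions: `BranchBlock` and `BlockSplitter` are hypothetical names, employee counts are assumed to be non-negative, and each block is rounded up as in the expected table from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; plays the same role as BranchEmployee above.
public class BranchBlock
{
    public string Branch { get; set; }
    public int Employees { get; set; }
}

public static class BlockSplitter
{
    // Repeats the ordered branch list blockCount times, giving each entry
    // the branch's employee count divided by blockCount, rounded up.
    public static List<BranchBlock> Split(IEnumerable<BranchBlock> branches, int blockCount)
    {
        if (blockCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(blockCount));
        }

        var ordered = branches.OrderByDescending(b => b.Employees).ToList();

        return Enumerable.Range(0, blockCount)
            .SelectMany(_ => ordered.Select(b => new BranchBlock
            {
                Branch = b.Branch,
                // Integer ceiling division: (n + k - 1) / k.
                Employees = (b.Employees + blockCount - 1) / blockCount
            }))
            .ToList();
    }
}
```

With the five branches from the question and `blockCount` set to 3, this should produce 15 entries, starting with Mexico at 267 and USA at 234. Because every block is rounded up, the blocks of a branch can add up to slightly more than its real total (801 for Mexico).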

EF Code First Configuration Duplicating Records

So I'm attempting to populate a table with seed data in EF5. I have an Enum of all 50 states and DC. I also have a lookup table of RequestTypes with IDs 1-6. It would be something like this:
+----+----------+-------------+------------+
| Id | State | SurveyId | RequestType|
+----+----------+-------------+------------+
| 1 | Alabama | 0 | 1 |
| 2 | Alabama | 0 | 2 |
| 3 | Alabama | 0 | 3 |
| 4 | Alabama | 0 | 4 |
| 5 | Alabama | 0 | 5 |
| 6 | Alabama | 0 | 6 |
+----+----------+-------------+------------+
The model that represents this table:
public class StateSurveyAssignment{
public long Id { get; set; }
public string State { get; set; }
public long RequestTypeId { get; set; }
public long SurveyId { get; set; }
}
And the code to seed the database in the Configuration.cs:
foreach (var state in Enum.GetValues(typeof(State))) {
foreach (var type in context.RequestTypes){
context.StateSurveyAssignments.AddOrUpdate(
ssa => ssa.Id,
new StateSurveyAssignment{
State = state.ToString(),
RequestTypeId = type.Id
}
);
}
}
My problem is that instead of updating/doing nothing to unchanged records, the seed method is duplicating each row. I've attempted to manually set the Id but had no luck.
EDIT:
This is what the database duplication looks like:
+----+----------+-------------+------------+
| Id | State | SurveyId | RequestType|
+----+----------+-------------+------------+
| 1 | Alabama | 0 | 1 |
| 2 | Alabama | 0 | 2 |
| 3 | Alabama | 0 | 3 |
| 4 | Alabama | 0 | 4 |
| 5 | Alabama | 0 | 5 |
| 6 | Alabama | 0 | 6 |
| ...| ... | ... | ... |
|307 | Alabama | 0 | 1 |
|308 | Alabama | 0 | 2 |
|309 | Alabama | 0 | 3 |
|310 | Alabama | 0 | 4 |
|311 | Alabama | 0 | 5 |
|312 | Alabama | 0 | 6 |
+----+----------+-------------+------------+
My Solution
I swear I'd tried setting my own Id at some point but tried it again per the answer and it seems to have worked. My final solution:
int counter = 1;
foreach (var state in Enum.GetValues(typeof(State))) {
foreach (var type in context.RequestTypes){
context.StateSurveyAssignments.AddOrUpdate(
ssa => ssa.Id,
new StateSurveyAssignment{
Id = counter,
State = state.ToString(),
RequestTypeId = type.Id
}
);
counter++;
}
}

The problem could be that the Id property in your StateSurveyAssignment class is an identity column in the database.
Because the seed entities never set Id, AddOrUpdate compares on the default value 0, finds no existing row, and inserts a new one every time.
For example, suppose you insert the following several times using AddOrUpdate():
var model = new StateSurveyAssignment
{
State = "Alabama",
RequestTypeId = 1L,
SurveyId = 0L
};
Then each entry would have a different Id and thus you'll have duplicates.

Merging 2 lists and sum several properties using LINQ

I have a class which contains the following properties:
public class SomeClass
{
public Int32 ObjectId1 {get;set;}
public Int32 ObjectId2 {get;set;}
public Int32 ActiveThickeness {get;set;}
public Int32 ActiveFilterThickness {get;set;}
}
I also have 2 lists:
List<SomeClass> A
List<SomeClass> B
List A has data:
| ObjectId1 | ObjectId2 | ActiveThickness | ActiveFilterThickness |
-------------------------------------------------------------------
| 1 | 3 | 50 | 0 |
------------------------------------------------------------------
| 1 | 2 | 400 | 0 |
-------------------------------------------------------------------
| 4 | 603 | 27 | 0 |
-------------------------------------------------------------------
List B has data:
| ObjectId1 | ObjectId2 | ActiveThickness | ActiveFilterThickness |
-------------------------------------------------------------------
| 1 | 3 | 0 | 13671 |
------------------------------------------------------------------
| 1 | 2 | 0 | 572 |
-------------------------------------------------------------------
| 29 | 11 | 0 | 4283 |
-------------------------------------------------------------------
I want to merge A and B (using LINQ if possible) into List C of SomeClass, which contains data as follows:
| ObjectId1 | ObjectId2 | ActiveThickness | ActiveFilterThickness |
-------------------------------------------------------------------
| 1 | 3 | 50 | 13671 |
------------------------------------------------------------------
| 1 | 2 | 400 | 572 |
-------------------------------------------------------------------
| 29 | 11 | 0 | 4283 |
-------------------------------------------------------------------
| 4 | 603 | 27 | 0 |
-------------------------------------------------------------------
How can I achieve that?

Use GroupBy to group common objects and Sum to sum the required properties:
var ab = A.Concat(B).GroupBy(x => new
{
x.ObjectId1,
x.ObjectId2
});
var result = ab.Select(x => new SomeClass
{
ObjectId1 = x.Key.ObjectId1,
ObjectId2 = x.Key.ObjectId2,
ActiveFilterThickness = x.Sum(i => i.ActiveFilterThickness),
ActiveThickeness = x.Sum(i => i.ActiveThickeness)
});

See LINQ - Full Outer Join (SO).
By doing a left outer join and a right outer join, and then taking the union of those two, you should get what you're looking for.
var leftOuterJoin = from someclass1 in A
join someclass2 in B
on someclass1.ObjectId2 equals someclass2.ObjectId2
into temp
from item in temp.DefaultIfEmpty(new SomeClass(){ ObjectId1 = someclass1.ObjectId1, ... })
select new SomeClass()
{
...
};
// In a join clause the outer range variable (someclass2 here) must be on the
// left of "equals"; only it remains in scope after "into".
var rightOuterJoin = from someclass2 in B
join someclass1 in A
on someclass2.ObjectId2 equals someclass1.ObjectId2
into temp
from item in temp.DefaultIfEmpty(new SomeClass(){ ObjectId1 = someclass2.ObjectId1, ... })
select new SomeClass()
{
...
};
var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);
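A complete variant of the outer-join idea might look like the sketch here. It is not the answer's exact method: it matches on both ObjectId1 and ObjectId2, and it replaces Union with a left join plus the rows that exist only in B, because Union on a class that does not override Equals compares references and would not remove duplicates. The `Thickness` and `ThicknessMerger` names are hypothetical, and keys are assumed to be unique within each list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SomeClass in the question.
public class Thickness
{
    public int ObjectId1 { get; set; }
    public int ObjectId2 { get; set; }
    public int ActiveThickness { get; set; }
    public int ActiveFilterThickness { get; set; }
}

public static class ThicknessMerger
{
    public static List<Thickness> FullOuterMerge(List<Thickness> a, List<Thickness> b)
    {
        // Left side: every row of a, summed with its partner in b when one exists.
        var left =
            from x in a
            join y in b
                on new { x.ObjectId1, x.ObjectId2 }
                equals new { y.ObjectId1, y.ObjectId2 }
                into matches
            from y in matches.DefaultIfEmpty()
            select new Thickness
            {
                ObjectId1 = x.ObjectId1,
                ObjectId2 = x.ObjectId2,
                ActiveThickness = x.ActiveThickness + (y == null ? 0 : y.ActiveThickness),
                ActiveFilterThickness = x.ActiveFilterThickness + (y == null ? 0 : y.ActiveFilterThickness)
            };

        // Right-only side: rows of b that have no partner in a.
        var rightOnly =
            from y in b
            where !a.Any(x => x.ObjectId1 == y.ObjectId1 && x.ObjectId2 == y.ObjectId2)
            select new Thickness
            {
                ObjectId1 = y.ObjectId1,
                ObjectId2 = y.ObjectId2,
                ActiveThickness = y.ActiveThickness,
                ActiveFilterThickness = y.ActiveFilterThickness
            };

        return left.Concat(rightOnly).ToList();
    }
}
```

For the sample lists in the question this should return four rows: (1, 3) with 50 and 13671, (1, 2) with 400 and 572, (4, 603) with 27 and 0, and (29, 11) with 0 and 4283. The GroupBy answer reaches the same result with less code, so this form is mainly useful when the two sides need different handling.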
