What is the fastest way to compare two tables? - c#

For example, there are two tables with the same schema but different contents:
Table1
| field1 | field2 | field3 |
----------------------------------------
| 1 | aaaaa | 100 |
| 2 | bbbbb | 200 |
| 3 | ccccc | 300 |
| 4 | ddddd | 400 |
Table2
| field1 | field2 | field3 |
----------------------------------------
| 2 | xxxxx | 200 |
| 3 | ccccc | 999 |
| 4 | ddddd | 400 |
| 5 | eeeee | 500 |
The expected comparison result would be:
Deleted in Table2:
| 1 | aaaaa | 100 |
Mismatch:
Table1:| 2 | bbbbb | 200 |
Table2:| 2 | xxxxx | 200 |
Table1:| 3 | ccccc | 300 |
Table2:| 3 | ccccc | 999 |
Newly added in Table2:
| 5 | eeeee | 500 |
Using C#, what is the fastest way to compare two tables?
Currently my implementation is:
Check if each row in table1 has an exact match in table2;
Check if each row in table2 has an exact match in table1.
The complexity is O(n*n), so for 100k rows it takes 20 minutes to run on a server.
Many thanks

You can try something like this. It should be quite fast, because each row is looked up by key in a dictionary instead of being compared against every row of the other table:
using System.Collections.Generic;
using System.Linq;

class objType
{
    public int Field1 { get; set; }
    public string Field2 { get; set; }
    public int Field3 { get; set; }

    // Field-by-field comparison of two rows.
    public bool AreEqual(object other)
    {
        var otherType = other as objType;
        if (otherType == null)
            return false;
        return Field1 == otherType.Field1 && Field2 == otherType.Field2 && Field3 == otherType.Field3;
    }
}

var tableOne = new objType[] {
    new objType { Field1 = 1, Field2 = "aaaa", Field3 = 100 },
    new objType { Field1 = 2, Field2 = "bbbb", Field3 = 200 },
    new objType { Field1 = 3, Field2 = "cccc", Field3 = 300 },
    new objType { Field1 = 4, Field2 = "dddd", Field3 = 400 }
};
var tableTwo = new objType[] {
    new objType { Field1 = 2, Field2 = "xxxx", Field3 = 200 },
    new objType { Field1 = 3, Field2 = "cccc", Field3 = 999 },
    new objType { Field1 = 4, Field2 = "dddd", Field3 = 400 },
    new objType { Field1 = 5, Field2 = "eeee", Field3 = 500 }
};

// Index both tables by key. ToDictionary throws if Field1 is not unique within a table.
var originalIds = tableOne.ToDictionary(o => o.Field1, o => o);
var newIds = tableTwo.ToDictionary(o => o.Field1, o => o);

var deleted = new List<objType>();
var modified = new List<objType>();
foreach (var row in tableOne)
{
    if (!newIds.ContainsKey(row.Field1))
        deleted.Add(row);
    else
    {
        var otherRow = newIds[row.Field1];
        if (!otherRow.AreEqual(row))
        {
            // Store both versions: the old row first, then the new one.
            modified.Add(row);
            modified.Add(otherRow);
        }
    }
}

// Rows whose key exists only in the second table.
var added = tableTwo.Where(t => !originalIds.ContainsKey(t.Field1)).ToList();
It might be worth overriding Equals instead of using AreEqual (or making AreEqual a helper method outside the class definition), but that depends on how your project is set up.
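As a rough sketch of the Equals suggestion (an illustration, not the answerer's code): the hypothetical Row class below mirrors objType and implements IEquatable<Row>, overriding Equals and GetHashCode together so that instances also behave consistently in hash-based collections. The class name and the hash-combining constants are my own choices.

```csharp
using System;

// Hypothetical row type mirroring the three fields of objType.
public sealed class Row : IEquatable<Row>
{
    public int Field1 { get; set; }
    public string Field2 { get; set; }
    public int Field3 { get; set; }

    // Typed comparison: two rows are equal when all three fields match.
    public bool Equals(Row other)
    {
        if (other is null) return false;
        return Field1 == other.Field1
            && Field2 == other.Field2
            && Field3 == other.Field3;
    }

    // Untyped overload delegates to the typed one.
    public override bool Equals(object obj) => Equals(obj as Row);

    // Must agree with Equals: equal rows produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Field1;
            hash = hash * 31 + (Field2 == null ? 0 : Field2.GetHashCode());
            hash = hash * 31 + Field3;
            return hash;
        }
    }
}
```

With this in place, the check in the loop becomes !otherRow.Equals(row). Because the hash code is computed from mutable properties, rows should not be modified while they are stored in a hash-based collection.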

Related

Finding multiple unique matches from List<object> where two criteria have to be different

I am having trouble selecting the first item in a list that is unique based on two fields, JOB_ID and EMPLOYEE_ID.
Each job should only be assigned to one employee (the one with the lowest OVERALL_SCORE), then move on and assign the next employee.
The List Objects are as follows:
JobMatch.cs
public int JOB_ID { get; set; }
public int JOB_MATCHES_COUNT { get; set; }
EmployeeMatch.cs
public int EMPLOYEE_ID { get; set; }
public int EMPLOYEE_MATCHES_COUNT { get; set; }
Rankings.cs
public int JOB_ID { get; set; }
public int EMPLOYEE_ID { get; set; }
public int TRAVEL_TIME_MINUTES { get; set; }
public int PRIORITY { get; set; }
public int OVERALL_SCORE { get; set; }
Rankings.cs gets an overall score based on the travel time field and
number of matches an Employee/Job has.
EmployeeMatch.cs
+-------------+-------------------+
| EMPLOYEE_ID | EMP_MATCHES_COUNT |
+-------------+-------------------+
| 3 | 1 |
| 4 | 1 |
| 2 | 3 |
| 1 | 4 |
+-------------+-------------------+
JobMatch.cs
+--------+-------------------+
| JOB_ID | JOB_MATCHES_COUNT |
+--------+-------------------+
| 1 | 1 |
| 2 | 2 |
| 3 | 2 |
| 4 | 4 |
+--------+-------------------+
Ranking.cs (shortened as to not fill the screen)
+--------+-------------+---------------+
| JOB_ID | EMPLOYEE_ID | OVERALL_SCORE |
+--------+-------------+---------------+
| 4 | 3 | 800 |
| 4 | 4 | 800 |
| 3 | 1 | 800 |
| 3 | 2 | 1200 |
| 2 | 1 | 1600 |
| 2 | 2 | 1800 |
| 4 | 1 | 2000 |
| 4 | 2 | 2100 |
| 1 | 1 | 6400 |
+--------+-------------+---------------+
Basically, the idea is to select the first unique Employee and Job in this list and then the best matches will be put into a separate list, something like the following for the above scenario:
+--------+-------------+---------------+
| JOB_ID | EMPLOYEE_ID | OVERALL_SCORE |
+--------+-------------+---------------+
| 4 | 3 | 800 |
| 3 | 1 | 800 |
| 2 | 2 | 1800 |
+--------+-------------+---------------+
I tried the following but it didn't work as intended:
var FirstOrder = (rankings.GroupBy(u => u.JOB_ID)
.Select(g => g.First())).ToList();
var SecondOrder = (FirstOrder.GroupBy(u => u.EMPLOYEE_ID)
.Select(g => g.First())).ToList();
The idea is to choose the first element and then remove every element that shares its job or employee, so that the next choice is guaranteed to be unique, as below:
var rankings = new List<Rankings> {
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 3, OVERALL_SCORE= 800 },
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 4, OVERALL_SCORE= 800 },
new Rankings{ JOB_ID= 3,EMPLOYEE_ID= 1, OVERALL_SCORE= 800 },
new Rankings{ JOB_ID= 3,EMPLOYEE_ID= 2, OVERALL_SCORE= 1200 },
new Rankings{ JOB_ID= 2,EMPLOYEE_ID= 1, OVERALL_SCORE= 1600 },
new Rankings{ JOB_ID= 2,EMPLOYEE_ID= 2, OVERALL_SCORE= 1800 },
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 1, OVERALL_SCORE= 2000 },
new Rankings{ JOB_ID= 4,EMPLOYEE_ID= 2, OVERALL_SCORE= 2100 },
new Rankings{ JOB_ID= 1,EMPLOYEE_ID= 1, OVERALL_SCORE= 6400 },
};
var cpy = new List<Rankings>(rankings);
var result = new List<Rankings>();
while (cpy.Count() > 0)
{
var first = cpy.First();
result.Add(first);
cpy.RemoveAll(r => r.EMPLOYEE_ID == first.EMPLOYEE_ID || r.JOB_ID == first.JOB_ID);
}
result:
+--------+-------------+---------------+
| JOB_ID | EMPLOYEE_ID | OVERALL_SCORE |
+--------+-------------+---------------+
| 4 | 3 | 800 |
| 3 | 1 | 800 |
| 2 | 2 | 1800 |
+--------+-------------+---------------+
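The removal loop above can also be written as a single pass that remembers which jobs and employees are already taken. This is only a sketch under the same assumption (the input is already ordered best-first); RankingRow and GreedyAssignment are hypothetical names standing in for the question's Rankings class.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the question's Rankings class.
public class RankingRow
{
    public int JOB_ID { get; set; }
    public int EMPLOYEE_ID { get; set; }
    public int OVERALL_SCORE { get; set; }
}

public static class GreedyAssignment
{
    // Assumes 'ordered' is already sorted best-first, as in the question.
    public static List<RankingRow> Assign(IEnumerable<RankingRow> ordered)
    {
        var takenJobs = new HashSet<int>();
        var takenEmployees = new HashSet<int>();
        var result = new List<RankingRow>();

        foreach (var r in ordered)
        {
            // Skip rows whose job or employee has already been assigned.
            if (takenJobs.Contains(r.JOB_ID) || takenEmployees.Contains(r.EMPLOYEE_ID))
                continue;

            takenJobs.Add(r.JOB_ID);
            takenEmployees.Add(r.EMPLOYEE_ID);
            result.Add(r);
        }
        return result;
    }
}
```

Each row is visited once and the set lookups are constant time on average, so the pass avoids repeatedly scanning and shrinking a copy of the list.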
Really, if you're trying to get the best score for the job, you don't need to select by unique JOB_ID/EMPLOYEE_ID, you need to sort by JOB_ID/OVERALL_SCORE, and pick out the first matching employee per JOB_ID (that's not already in the "assigned list").
You could get the items in order using LINQ:
var sorted = new List<Ranking>
(
rankings
.OrderBy( r => r.JOB_ID )
.ThenBy( r => r.OVERALL_SCORE )
);
...and then peel off the employees you want...
var best = new List<Ranking>( );
sorted.ForEach( r1 =>
{
if ( !best.Any
(
r2 =>
r1.JOB_ID == r2.JOB_ID
||
r1.EMPLOYEE_ID == r2.EMPLOYEE_ID
) )
{
best.Add( r1 );
}
} );
Instead of using Linq to produce a sorted list, you could implement IComparable<Ranking> on Ranking and then just sort your rankings:
public class Ranking : IComparable<Ranking>
{
int IComparable<Ranking>.CompareTo( Ranking other )
{
var jobFirst = this.JOB_ID.CompareTo( other.JOB_ID );
return
jobFirst == 0?
this.OVERALL_SCORE.CompareTo( other.OVERALL_SCORE ):
jobFirst;
}
//--> other stuff...
}
Then, when you Sort() the Rankings, they'll be in JOB_ID/OVERALL_SCORE order. Implementing IComparable<Ranking> is probably faster and uses less memory.
Note that you have issues...maybe an unstated objective. Is it more important to fill the most jobs...or is more important to find work for the most employees? The route I took does what you suggest, and just take the best employee for the job as you go...but, maybe, the only employee for job 2 may be the same as the best employee for job 1...and if you put him/her on job 1, you might not have anybody left for job 2. It could get complicated :-)
Basically, you could use the Distinct method from System.Linq together with a custom equality comparer, IEqualityComparer<Ranking>. System.Linq provides this method out of the box.
public class Comparer : IEqualityComparer<Ranking>
{
public bool Equals(Ranking l, Ranking r)
{
return l.JOB_ID == r.JOB_ID || l.EMPLOYEE_ID == r.EMPLOYEE_ID;
}
public int GetHashCode(Ranking obj)
{
return 1;
}
}
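The trick here is the GetHashCode method: returning a constant puts every item in the same hash bucket, so Distinct has to call Equals for each candidate, which makes the pass quadratic. Note also that this Equals is not transitive, so the result depends on the order in which Distinct compares items. With that in mind, it is as simple as this: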
rankings.Distinct(new Comparer())

Count and Max Columns Group By in LINQ

I have a model
public class Rate
{
public int Nr{ get; set; }
public string Rate{ get; set; }
public int Order{ get; set; }
}
and a RateList = List<Rate> like this
Nr | Rate | Order
123 | A | 2
425 | A+ | 1
454 | B | 4
656 | B+ | 3
465 | A | 2
765 | B | 4
Notice that Order always matches the Rate (A+ = 1, A = 2, B+ = 3, B = 4, C+ = 5 ...)
I want to count how many times each Rate occurred and display the result ordered by Order.
The result should look like this
Rate | Count | Order
A+ | 1 | 1
A | 2 | 2
B+ | 1 | 3
B | 2 | 4
or without column Order
Rate | Count
A+ | 1
A | 2
B+ | 1
B | 2
In SQL I could do it like this if I had the above list in a table Tab
SELECT Rate, COUNT(Rate), Max(Order) from Tab group by Rate
but in LINQ?
I tried something like this
var rating= RateList.Distinct().GroupBy(x => x.Rate)
.Select(x => new { Rate = x.Key, RateCount = x.Count() })
.OrderBy(x => x.Order);
but it didn't work.
Thank you for your help.
Your SQL query is equivalent to:
var rating = rateList.GroupBy(x => x.Rate)
.Select(x => new {
Rate = x.Key,
RateCount = x.Count(e => e != null),
Max = x.Max(g => g.Order)
});
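The question also asks for the result ordered by Order, which the query above does not do. A minimal sketch that adds the ordering follows. RateRow, RateSummary and RateStats are hypothetical names; the model is renamed because a C# class cannot contain a member with the same name as the class itself, so a class Rate with a property Rate does not compile.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical model, renamed from the question's Rate class.
public class RateRow
{
    public int Nr { get; set; }
    public string Rate { get; set; }
    public int Order { get; set; }
}

// Hypothetical result type; an anonymous type would work as well.
public class RateSummary
{
    public string Rate { get; set; }
    public int Count { get; set; }
    public int Order { get; set; }
}

public static class RateStats
{
    public static List<RateSummary> Summarize(IEnumerable<RateRow> rows)
    {
        return rows
            .GroupBy(r => r.Rate)
            .Select(g => new RateSummary
            {
                Rate = g.Key,
                Count = g.Count(),            // rows in this group
                Order = g.Max(r => r.Order)   // same value for every row of a group
            })
            .OrderBy(s => s.Order)            // order after projecting, so Order is available
            .ToList();
    }
}
```

The attempt in the question failed to compile because its anonymous type had no Order member by the time OrderBy ran; carrying Order through the projection avoids that.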

LINQ select out list of values and map into one field property in the nested class structure

Animal:
+----------+---------+--------+
| animalId | animal | typeId |
+----------+---------+--------+
| 1 | snake | 1 |
| 2 | cat | 2 |
+----------+---------+--------+
AnimalType:
+--------+----------+
| typeId | type |
+--------+----------+
| 1 | reptile |
| 2 | mammal |
+--------+----------+
AnimalBody:
+--------+-------+----------+
| bodyId | body | animalId |
+--------+-------+----------+
| 1 | tail | 1 |
| 2 | eye | 1 |
| 3 | tail | 2 |
| 4 | eye | 2 |
| 5 | leg | 2 |
+--------+-------+----------+
Table relation:
Animal.typeId = AnimalType.typeId
Animal.animalId = AnimalBody.animalId
I need to output them into JSON format as below:
{
animalId: 1,
animal: "snake",
type: "reptile",
body: {
"tail", "eye"
}
},
{
animalId: 2,
animal: "cat",
type: "mammal",
body: {
"tail", "eye", "leg"
}
}
How can I achieve this with LINQ query syntax (clauses) rather than method syntax?
I have tried:
from animal in db.Animal
join animalType in db.AnimalType on animal.typeId equals animalType.typeId
select new
{
animalId = animal.animalId,
animal = animal.animal,
type = animalType.type,
body = ?
};
Assuming you want the body element to be an array of body parts, here's what you should do:
Join Animals with AnimalTypes:
var animalsWithType = db.Animals.Join(
    db.AnimalTypes,
    animal => animal.typeId,
    animalType => animalType.typeId,
    (animal, type) => new { animal, type });
Afterwards, GroupJoin animalsWithType with AnimalBody elements:
var result = animalsWithType.GroupJoin(db.AnimalBodies,
animalWithType => animalWithType.animal.animalId,
body => body.animalId,
(animalWithType, bodyParts) => new
{
animalId = animalWithType.animal.animalId,
animal = animalWithType.animal.animal,
type = animalWithType.type.type,
body = bodyParts.Select(part => part.body)
});
Now, just export the result to JSON and you should be set.
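Since the question asks for query syntax, here is a hedged sketch of the same two steps written with clauses: a join followed by a group join (join ... into). It runs against in-memory collections; all type names (AnimalRow, AnimalTypeRow, AnimalBodyRow, AnimalDto, AnimalQueries) are hypothetical, and whether a particular database LINQ provider translates the nested ToList() is an assumption you would need to check.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row types mirroring the three tables in the question.
public class AnimalRow { public int animalId { get; set; } public string animal { get; set; } public int typeId { get; set; } }
public class AnimalTypeRow { public int typeId { get; set; } public string type { get; set; } }
public class AnimalBodyRow { public int bodyId { get; set; } public string body { get; set; } public int animalId { get; set; } }

// Hypothetical output shape: one object per animal with its body parts as a list.
public class AnimalDto
{
    public int animalId { get; set; }
    public string animal { get; set; }
    public string type { get; set; }
    public List<string> body { get; set; }
}

public static class AnimalQueries
{
    public static List<AnimalDto> Build(
        IEnumerable<AnimalRow> animals,
        IEnumerable<AnimalTypeRow> types,
        IEnumerable<AnimalBodyRow> bodies)
    {
        var query =
            from a in animals
            join t in types on a.typeId equals t.typeId
            // "into parts" turns this join into a group join:
            // parts holds all body rows for the current animal.
            join b in bodies on a.animalId equals b.animalId into parts
            select new AnimalDto
            {
                animalId = a.animalId,
                animal = a.animal,
                type = t.type,
                body = (from p in parts select p.body).ToList()
            };
        return query.ToList();
    }
}
```

An animal with no body rows still appears in the result, with an empty body list, because a group join keeps every element of the outer sequence.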

Populate XtraTreeList with dynamic nodes

I have a table with two columns:
+-------------+------------+
| Level | Desc |
+-------------+------------+
| 1 | a |
+-------------+------------+
| 2 | b |
+-------------+------------+
| 2 | c |
+-------------+------------+
| 1 | d |
+-------------+------------+
| 2 | e |
+-------------+------------+
| 2 | f |
+-------------+------------+
| 3 | g |
+-------------+------------+
| 1 | h |
+-------------+------------+
| 1 | i |
+-------------+------------+
| 2 | j |
+-------------+------------+
| 2 | k |
+-------------+------------+
And I need to display this data in an XtraTreeList with two columns, nested according to the Level column, like this:
- 1 a
-- 2 b
-- 2 c
- 1 d
-- 2 e
-- 2 f
--- 3 g
- 1 h
- 1 i
-- 2 j
-- 2 k
So the Level column determines the depth of the node: level 1 is a main node, level 2 is a subnode of level 1, level 3 is a subnode of level 2, level 4 is a subnode of level 3, and so on.
I know how to populate an XtraTreeList when there is a fixed number of node levels, but I have no idea how to populate it when a node can contain 3, 4 or more levels of subnodes.
I've done this so far:
Populate TreeView:
DataTable table = new DataTable();
table.Columns.Add("Level");
table.Columns.Add("Data");
table.Rows.Add(1, "a");
table.Rows.Add(2, "b");
table.Rows.Add(2, "c");
table.Rows.Add(1, "d");
table.Rows.Add(2, "e");
table.Rows.Add(2, "f");
table.Rows.Add(3, "g");
table.Rows.Add(4, "z");
table.Rows.Add(5, "x");
table.Rows.Add(2, "h");
table.Rows.Add(3, "i");
table.Rows.Add(1, "j");
table.Rows.Add(2, "k");
TreeListNode rootNode = null;
for (int i = 0; i < table.Rows.Count; i++)
{
tl.BeginUnboundLoad();
TreeListNode parentForRootNodes = null;
if (table.Rows[i][0].ToString().Equals("1"))
{
rootNode = tl.AppendNode(new object[] { (string)table.Rows[i][1] }, parentForRootNodes);
}
if (table.Rows[i][0].ToString().Equals("2"))
{
tl.AppendNode(new object[] { (string)table.Rows[i][1] }, rootNode);
}
tl.EndUnboundLoad();
}
Create columns:
private void CreateColumns2(TreeList tl)
{
tl.BeginUpdate();
tl.Columns.Add();
tl.Columns[0].Caption = "Level";
tl.Columns[0].VisibleIndex = 0;
tl.Columns.Add();
tl.Columns[1].Caption = "Desc";
tl.Columns[1].VisibleIndex = 1;
tl.Columns.Add();
tl.EndUpdate();
}
Documentation you might like to read is here: https://documentation.devexpress.com/#windowsforms/CustomDocument198
You need at minimum three things for a tree:
Id
ParentId
Text
So the structure you've described needs to change to permit finding the parent for an item.
Once you have that, the concept goes like this:
Create an item for each node you want in the tree. I created my own class for this with the properties I wanted (Id, ParentId, Text...).
Then set the data source of the tree control.
Example:
var data = new List<TreeItem>
{
new TreeItem { Id = "L1_1", ParentId = "", Text = "ONE" },
new TreeItem { Id = "L1_2", ParentId = "", Text = "TWO" },
new TreeItem { Id = "L1_3", ParentId = "", Text = "THREE" },
new TreeItem { Id = "L2_1", ParentId = "L1_1", Text = "A" },
new TreeItem { Id = "L2_2", ParentId = "L1_1", Text = "B" },
new TreeItem { Id = "L2_3", ParentId = "L1_2", Text = "C" },
new TreeItem { Id = "L2_4", ParentId = "L1_2", Text = "D" },
new TreeItem { Id = "L2_5", ParentId = "L1_2", Text = "E" }
};
tree.Properties.DataSource = data;
}
class TreeItem
{
public string Id { get; set; }
public string ParentId { get; set; }
public string Text { get; set; }
}
The order of the items in the data source is irrelevant; what matters is that each Id is unique.
The above example produces a tree like this:
- ONE
-- A
-- B
- TWO
-- C
-- D
-- E
- THREE
I am doing this without my DevExpress installation and without a compiler, so please excuse any errors. However, the concept remains the same.
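To get from the question's Level/Desc rows to the Id/ParentId shape described above, one possible approach is to walk the rows in display order and keep track of the current chain of ancestors. This is only a sketch: LevelRow, FlatNode and LevelTree are hypothetical names, integer ids with 0 meaning "no parent" are my own convention, and it assumes each row's parent is the nearest preceding row one level up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input row: one line of the Level/Desc table.
public class LevelRow
{
    public int Level { get; set; }
    public string Desc { get; set; }
}

// Hypothetical flat node with the key/parent pair a tree control can bind to.
public class FlatNode
{
    public int Id { get; set; }
    public int ParentId { get; set; }   // 0 means "root node" in this sketch
    public int Level { get; set; }
    public string Text { get; set; }
}

public static class LevelTree
{
    public static List<FlatNode> Build(IEnumerable<LevelRow> rows)
    {
        var result = new List<FlatNode>();
        // path[i] is the id of the current ancestor at level i + 1.
        var path = new List<int>();
        int nextId = 1;

        foreach (var row in rows)
        {
            // Keep only ancestors above this row's level. If a level is
            // skipped in the data, the row attaches to the deepest ancestor.
            int depth = Math.Max(0, Math.Min(row.Level - 1, path.Count));
            path.RemoveRange(depth, path.Count - depth);

            int parentId = path.Count > 0 ? path[path.Count - 1] : 0;
            var node = new FlatNode { Id = nextId++, ParentId = parentId, Level = row.Level, Text = row.Desc };
            result.Add(node);
            path.Add(node.Id);
        }
        return result;
    }
}
```

The resulting list can then be used as the data source, with the control told which properties hold the key and the parent key (check the DevExpress documentation for the exact property names on your control).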

How to add List<T> containing item of type List<string> ,int,string into another List

I have a list with complex data
public class CAR
{
public int ID {get ; set ; }
public string Name { get ; set ; }
public string EngineType { get ; set ; }
public List<string> Months { get; set; }
}
Note that Months is a List<string>; its maximum count is 150.
List<CAR> A = new List<CAR>();
List<CAR> B = new List<CAR>();
A has the following data:
ID | Name | EngineType | Months[0] | Months[1] | Months[2] | Months[3] .. | Months[149] |
1 | Zen | 1001 | 1 | 1 | 4 | 5 .. | 6 |
2 | Benz | 2002 | 6 | 4 | 5 | 6 .. | 2 |
3 | Zen | 1001 | 3 | 1 | 7 | 5 .. | 0 |
4 | Zen | 1001 | 2 | 2 | 4 | 5 .. | 6 |
5 | Zen | 2002 | 2 | 2 | 4 | 5 .. | 6 |
6 | Benz | 2002 | 1 | 1 | 1 | 1 .. | 1 |
If EngineType and Name are the same, we add those rows together and store the result in a single row.
E.g., adding rows:
row 1 in B = 1 + 3 + 4
row 2 in B = 2 + 6
row 3 in B = 5
B should contain the following output:
ID | Name | EngineType | Months[0] | Months[1] | Months[2] | Months[3] ... | Months[149] |
- | Zen | Petrol | 6 | 4 | 15 | 15 .. | 12 |
- | Benz | Diesel | 7 | 5 | 6 | 7 .. | 3 |
- | Zen | Diesel | 2 | 2 | 4 | 5 .. | 6 |
Had the months data been separate fields of type integer, I could have done something like this:
B = from val in A
group val by new val.EngineType into g
select new CAR{
EngineType = g.Key,
Name = g.Name,
Month0 = g.Sum(p => p.Month0),
Month1 = g.Sum(p => p.Month1),
Month2 = g.Sum(p => p.Month2),
.
.
.
.
.
.
Month148 = g.Sum(p => p.Month148),
Month149 = g.Sum(p => p.Month149)
}.ToList<CAR>();
But since it is of type List<string>, is there a way to get this done?
Thanks a lot!
Use the power of LINQ:
var B = A.GroupBy(x => new { x.Name, x.EngineType })
.Select(g => new CAR
{
Name = g.Key.Name,
EngineType = g.Key.EngineType,
Months = g.SelectMany(x => x.Months.Select((y,i) => new { i, y = int.Parse(y) }))
.GroupBy(x => x.i)
.OrderBy(g2 => g2.Key)
.Select(g2 => g2.Sum(x => x.y).ToString()).ToList()
}).ToList();
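Alternatively, with plain loops. Each new group is stored as a copy of the row, so that summing into B does not modify the Months lists in A:
foreach (CAR c in A)
{
    bool blnadded = false;
    foreach (CAR d in B)
    {
        if (d.Name == c.Name && d.EngineType == c.EngineType)
        {
            // Same Name and EngineType: add this row's months to the existing group.
            for (int i = 0; i < d.Months.Count; i++)
                d.Months[i] = (Convert.ToInt32(d.Months[i]) + Convert.ToInt32(c.Months[i])).ToString();
            blnadded = true;
            break;
        }
    }
    if (!blnadded)
    {
        // First row of a new group: copy it instead of sharing the reference with A.
        B.Add(new CAR
        {
            Name = c.Name,
            EngineType = c.EngineType,
            Months = new List<string>(c.Months)
        });
    }
}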
