Union Lists using IEqualityComparer - c#

I've got two Lists of my class Nomen:
var N1 = new List<Nomen>();
var N2 = new List<Nomen>();
public class Nomen
{
public string Id;
public string NomenCode;
...
public string ProducerName;
public decimal? minPrice;
}
I need to join them. I used to do it like this:
result = N2.Union(N1, new NomenComparer()).ToList();
class NomenComparer : IEqualityComparer<Nomen>
{
public bool Equals(Nomen x, Nomen y)
{
return x.Equals(y);
}
public int GetHashCode(Nomen nomen)
{
return nomen.GetHashCode();
}
}
public override int GetHashCode()
{
return (Id + NomenCode + ProducerName).GetHashCode();
}
public bool Equals(Nomen n)
{
if (!String.IsNullOrEmpty(Id) && Id == n.Id) return true;
return (NomenCode == n.NomenCode && ProducerName == n.ProducerName);
}
As you can see, two Nomen objects count as the same for me if their Ids are equal, or if both NomenCode and ProducerName are equal.
Now my task has changed: when two items are equal, I need to keep the one with the lower minPrice. Please help me solve this problem.
I tried to do the same with LINQ, but failed:
var groups = (from n1 in N1
join n2 in N2
on new { n1.Id, n1.NomenCode, n1.ProducerName } equals new { n2.Id, n2.NomenCode, n2.ProducerName }
group new { n1, n2 } by new { n1.Id, n1.NomenCode, n1.ProducerName } into q
select new Nomen()
{
NomenCode = q.Key.NomenCode,
ProducerName = q.Key.ProducerName,
minPrice = q.Min(item => item.minPrice)
}).ToList();
Mostly because I need to join Lists by Ids OR {NomenCode, ProducerName} and I don't know how to do it.

Concat, GroupBy and then Select again? for example (less untested than before):
var nomens = N1.Concat(N2)
.GroupBy(n=>n, new NomenComparer())
.Select(group => group.Aggregate((min, n) => (n.minPrice ?? Decimal.MaxValue) < (min.minPrice ?? Decimal.MaxValue) ? n : min));
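One caveat with grouping through NomenComparer: its GetHashCode hashes Id + NomenCode + ProducerName, so two items that are equal by Id alone can land in different hash buckets and never be compared. Below is a minimal sketch of a different approach that does not rely on a comparer; it is not the method from the question. It walks both lists once and tracks matches in two dictionaries. The names NomenMerger, MergeKeepingCheapest, slotById and slotByCode are made up for illustration, and the tuple keys assume C# 7 or later. It also assumes an item never bridges two groups that were already created separately; in that case the groups stay separate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Nomen
{
    public string Id;
    public string NomenCode;
    public string ProducerName;
    public decimal? minPrice;
}

// Hypothetical helper, not part of the original question.
public static class NomenMerger
{
    // Two items match when they share a non-empty Id
    // OR the same (NomenCode, ProducerName) pair.
    // For each match group the item with the lowest minPrice is kept;
    // a null minPrice is treated as "more expensive than anything".
    public static List<Nomen> MergeKeepingCheapest(IEnumerable<Nomen> first, IEnumerable<Nomen> second)
    {
        var result = new List<Nomen>();
        var slotById = new Dictionary<string, int>();               // Id -> index in result
        var slotByCode = new Dictionary<(string, string), int>();   // (code, producer) -> index in result

        foreach (var n in first.Concat(second))
        {
            bool hasId = !string.IsNullOrEmpty(n.Id);
            var codeKey = (n.NomenCode, n.ProducerName);

            int slot = -1;
            bool found = (hasId && slotById.TryGetValue(n.Id, out slot))
                         || slotByCode.TryGetValue(codeKey, out slot);

            if (!found)
            {
                slot = result.Count;
                result.Add(n);
            }
            else if ((n.minPrice ?? decimal.MaxValue) < (result[slot].minPrice ?? decimal.MaxValue))
            {
                result[slot] = n; // the new item is cheaper, keep it instead
            }

            // Register both keys so later items can match through either one.
            if (hasId) slotById[n.Id] = slot;
            slotByCode[codeKey] = slot;
        }
        return result;
    }
}

public static class NomenDemo
{
    public static void Main()
    {
        var n1 = new List<Nomen> { new Nomen { Id = "1", NomenCode = "A", ProducerName = "P", minPrice = 10m } };
        var n2 = new List<Nomen> { new Nomen { Id = "1", NomenCode = "B", ProducerName = "Q", minPrice = 5m } };
        foreach (var n in NomenMerger.MergeKeepingCheapest(n1, n2))
            Console.WriteLine(n.Id + " " + n.NomenCode + " " + n.minPrice);
    }
}
```

Usage would then be result = NomenMerger.MergeKeepingCheapest(N1, N2); in place of the Union call.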

Linq joins with OR conditions have been answered in this SO post:
Linq - left join on multiple (OR) conditions
In short, as Jon Skeet explains in that post, you should do something like
from a in tablea
from b in tableb
where a.col1 == b.col1 || a.col2 == b.col2
select ...

Related

Looking for LINQ expression for foreach loop

I have 2 device classes,
public class Device1
{
public string DeviceName { get; set; }
public string IP { get; set; }
public bool IsExist { get; set; }
}
public class Device2
{
public string DeviceName { get; set; }
public string DeviceIP { get; set; }
}
The current value of "IsExist" is "false" for every element of the "Device1[]" array:
private static Device1[] GetDevice1Arr()
{
List<Device1> d1List = new List<Device1>() {
new Device1 { DeviceName="d1", IP="1", IsExist=false},
new Device1 { DeviceName="d2", IP="1", IsExist=false}
};
return d1List.ToArray();
}
The "Device2[]" array has no "IsExist" property:
private static Device2[] GetDevice2Arr()
{
List<Device2> d2List = new List<Device2>() {
new Device2 { DeviceName="d1", DeviceIP="3"},
new Device2 { DeviceName="d2", DeviceIP="1"},
new Device2 { DeviceName="d2", DeviceIP="2"},
new Device2 { DeviceName="d3", DeviceIP="3"}
};
return d2List.ToArray();
}
Now I compare the two arrays "Device1[]" and "Device2[]" using two nested "foreach" loops; if DeviceName and DeviceIP are the same, I set "IsExist" to "true".
I am looking for a LINQ replacement or any alternative way. Thanks!
Device1[] d1 = GetDevice1Arr();
Device2[] d2 = GetDevice2Arr();
foreach(var device1 in d1)
{
foreach(var device2 in d2)
{
if(device2.DeviceName == device1.DeviceName && device2.DeviceIP == device1.IP)
{
device1.IsExist = true;
}
}
}
You can replace the inner foreach loop with Linq, but not the outer one since Linq is for querying not updating. What you have is essentially an Any query (does any item in d2 match this condition?):
Device1[] d1 = GetDevice1Arr();
Device2[] d2 = GetDevice2Arr();
foreach(var device1 in d1)
{
device1.IsExist = d2.Any(device2 =>
device2.DeviceName == device1.DeviceName
&& device2.DeviceIP == device1.IP);
}
There may be alternate ways using Intersect, Join, Where, etc. to find the items that need to be updated, but in the end a foreach loop is the proper way to update them.
Looks like you're trying to do a join. You can do that in LINQ, but you'll still need a foreach to update IsExist on the result:
var itemsToUpdate = from d1 in GetDevice1Arr()
join d2 in GetDevice2Arr()
on new { d1.DeviceName, d1.IP }
equals new { d2.DeviceName, IP = d2.DeviceIP }
select d1;
foreach(var d1 in itemsToUpdate)
d1.IsExist = true;
One liner
d1.Where(dev1=> d2.Any(dev2=>dev2.DeviceName == dev1.DeviceName &&
dev2.DeviceIP == dev1.IP))
.ToList()
.ForEach(dev1=>dev1.IsExist = true);
Final Output
d1.Dump(); //LinqPad feature
Since both arrays are built from lists, you can use a ForEach helper for the outer loop and Any for the inner one. On arrays that is the static Array.ForEach (the instance method List<T>.ForEach is only available if you keep the lists):
Array.ForEach(d1, d => d.IsExist = d2.Any(x => x.DeviceIP == d.IP && x.DeviceName == d.DeviceName));
This, like the other solutions here, still iterates on two levels and is just shorthand for your existing code; none of them removes the nested iteration.
Another suggestion with using a join, but as an emulated 'left' join to get true or false:
Device1[] d1 = GetDevice1Arr();
Device2[] d2 = GetDevice2Arr();
foreach(var d in from dev1 in d1
join dd in d2 on new {dev1.DeviceName, dev1.IP} equals new {dd.DeviceName, IP = dd.DeviceIP} into d3
select new {dev1, Exists = d3.Any()})
d.dev1.IsExist= d.Exists;
d1.Where(x => d2.Any(y => x.IsExist = (x.DeviceName == y.DeviceName && x.IP == y.DeviceIP))).ToList();
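The answers above all scan d2 once for every element of d1. If the arrays get large, one alternative is to put the (DeviceName, DeviceIP) pairs of d2 into a HashSet first, so each lookup is a hash probe instead of a scan. This is only a sketch: DeviceMatcher and MarkExisting are made-up names, and the tuple syntax assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Device1
{
    public string DeviceName { get; set; }
    public string IP { get; set; }
    public bool IsExist { get; set; }
}

public class Device2
{
    public string DeviceName { get; set; }
    public string DeviceIP { get; set; }
}

// Hypothetical helper, not part of the original question.
public static class DeviceMatcher
{
    public static void MarkExisting(Device1[] d1, Device2[] d2)
    {
        // Build the lookup once: one pass over d2.
        var known = new HashSet<(string, string)>(d2.Select(d => (d.DeviceName, d.DeviceIP)));

        // The update itself is still a plain foreach, as the answers above recommend.
        foreach (var device in d1)
            device.IsExist = known.Contains((device.DeviceName, device.IP));
    }
}

public static class DeviceDemo
{
    public static void Main()
    {
        var d1 = new[]
        {
            new Device1 { DeviceName = "d1", IP = "1" },
            new Device1 { DeviceName = "d2", IP = "1" }
        };
        var d2 = new[]
        {
            new Device2 { DeviceName = "d1", DeviceIP = "3" },
            new Device2 { DeviceName = "d2", DeviceIP = "1" }
        };
        DeviceMatcher.MarkExisting(d1, d2);
        foreach (var d in d1)
            Console.WriteLine(d.DeviceName + "/" + d.IP + ": " + d.IsExist);
    }
}
```

For the handful of devices in the question the nested loop is perfectly fine; the set only pays off when d2 is big.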

Remove duplicate rows using linq query

I am not the greatest with LINQ, but it is the language of choice here, so I'm trying to write the query in a SQL-like way. It is a standard scenario: I have invoices, and each invoice has invoice details. When the tables are joined, the invoices that have multiple details are of course repeated. In standard SQL I could use DISTINCT or GROUP BY. I've tried to do the same with LINQ, but I either get errors or the duplicates are not filtered out.
Here is my query
var result = (from invoice in invoices
join invoiceItem in invItems on invoice.Id equals invoiceItem.InvoiceId
orderby invoice.InvoiceNo
select new InvoiceReceiveShipmentVM
{
dtInvoiced = invoice.dtInvoiced,
InvoiceNumber = invoice.InvoiceNo,
InvoiceType = invoice.InvoiceType,
InvoiceStatus = invoice.InvoiceStatus,
Lines = invoiceItem.Line,
Total = invoice.Total,
Carrier = invoice.Carrier,
});
return result.Distinct();
I've also tried:
var myList = result.GroupBy(x => x.InvoiceNumber)
.Select(g => g.First()).ToList();
return myList.Skip(fetch.Skip).Take(fetch.Take).AsQueryable();
For Distinct() to remove duplicates from in-memory results, override Equals and GetHashCode in InvoiceReceiveShipmentVM so that instances are compared by value:
public class InvoiceReceiveShipmentVM
{
public override bool Equals(object obj)
{
if (obj is InvoiceReceiveShipmentVM == false) return false;
var invoice = (InvoiceReceiveShipmentVM)obj;
return invoice.InvoiceNumber == InvoiceNumber
&& invoice.InvoiceType == InvoiceType
&& invoice.InvoiceStatus == InvoiceStatus
&& invoice.Lines == Lines
&& invoice.Total == Total
&& invoice.Carrier == Carrier;
}
public override int GetHashCode()
{
return InvoiceNumber.GetHashCode()
^ InvoiceType.GetHashCode()
^ InvoiceStatus.GetHashCode()
^ Lines.GetHashCode()
^ Total.GetHashCode()
^ Carrier.GetHashCode();
}
}
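If the goal is simply one row per invoice, another option is to avoid producing the duplicates in the first place with a group join (join ... into), which yields each invoice once together with its items. The sketch below runs on in-memory lists; Invoice, InvoiceItem, InvoiceRow, InvoiceQueries and LineCount are simplified stand-ins invented for illustration, not the real entity or view-model classes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-ins for the question's entities.
public class Invoice
{
    public int Id;
    public string InvoiceNo;
    public decimal Total;
}

public class InvoiceItem
{
    public int InvoiceId;
    public int Line;
}

public class InvoiceRow
{
    public string InvoiceNumber;
    public decimal Total;
    public int LineCount;
}

public static class InvoiceQueries
{
    public static List<InvoiceRow> OneRowPerInvoice(IEnumerable<Invoice> invoices, IEnumerable<InvoiceItem> items)
    {
        return (from invoice in invoices
                // "into" turns the join into a group join: one result per invoice.
                join item in items on invoice.Id equals item.InvoiceId into invoiceItems
                // Keep inner-join behaviour: skip invoices without any items.
                where invoiceItems.Any()
                orderby invoice.InvoiceNo
                select new InvoiceRow
                {
                    InvoiceNumber = invoice.InvoiceNo,
                    Total = invoice.Total,
                    LineCount = invoiceItems.Count()
                }).ToList();
    }
}

public static class InvoiceDemo
{
    public static void Main()
    {
        var invoices = new List<Invoice> { new Invoice { Id = 1, InvoiceNo = "A", Total = 10m } };
        var items = new List<InvoiceItem>
        {
            new InvoiceItem { InvoiceId = 1, Line = 1 },
            new InvoiceItem { InvoiceId = 1, Line = 2 }
        };
        foreach (var row in InvoiceQueries.OneRowPerInvoice(invoices, items))
            Console.WriteLine(row.InvoiceNumber + ": " + row.LineCount + " line(s)");
    }
}
```

Whether this translates cleanly to SQL depends on the LINQ provider in use, so it is worth checking the generated query before relying on it.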

What is the appropriate LINQ query to this specific case?

Given the following two classes:
public class Apple
{
public int Id { get; set; }
public string Name { get; set; }
}
public class Worm
{
public int AppleId { get; set; }
public int WormType { get; set; }
public int HungerValue { get; set; }
}
Each instance of Worm is given an AppleId equal to the Id of a randomly chosen existing Apple.
public void DoLINQ(List<Apple> apples, List<Worm> worms, string targetAppleName, List<int> wormTypes )
{
// Write LINQ Query here
}
How can we write a Linq query which
finds all the elements in 'apples' whose 'Name' matches 'targetAppleName'
AND
(does not "contain" any worm with a WormType given in 'wormTypes'
OR
only contains worms with a HungerValue equal to 500)?
Note that an instance of Apple does not actually 'contain' any elements of Worm, since the relation is the other way around. This is also what complicates things and why it is more difficult to figure out.
--Update 1--
My attempt which selects multiple apples with the same Id:
var query =
from a in apples
join w in worms
on a.Id equals w.AppleId
where (a.Name == targetAppleName) && (!wormTypes.Any(p => p == w.WormType) || w.HungerValue == 500)
select a;
--Update 2--
This is closer to a solution. Here we use two queries and then merge the results:
var query =
from a in apples
join w in worms
on a.Id equals w.AppleId
where (a.Name == targetAppleName) && !wormTypes.Any(p => p == w.WormType)
group a by a.Id into q
select q;
var query2 =
from a in apples
join w in worms
on a.Id equals w.AppleId
where (a.Name == targetAppleName) && wormTypes.Any(p => p == w.WormType) && w.HungerValue == 500
group a by a.Id into q
select q;
var merged = query.Concat(query2).Distinct();
--Update 3--
For the input we expect the LINQ query to use the parameters in the method, and those only.
For the output we want all apples which satisfy the condition described above.
You can use a let construct to find the worms of a given apple if you want to use query syntax:
var q =
from a in apples
let ws = from w in worms where w.AppleId == a.Id select w
where
(ws.All(w => w.HungerValue == 500)
|| ws.All(w => !wormTypes.Any(wt => wt == w.WormType)))
&& a.Name == targetAppleName
select a;
In method chain syntax this is equivalent to introducing an intermediary anonymous object using Select:
var q =
apples.Select(a => new {a, ws = worms.Where(w => w.AppleId == a.Id)})
.Where(t => (t.ws.All(w => w.HungerValue == 500)
|| t.ws.All(w => wormTypes.All(wt => wt != w.WormType)))
&& t.a.Name == targetAppleName).Select(t => t.a);
I wouldn't exactly call this more readable, though :-)
var result = apples.Where(apple =>
{
var wormsInApple = worms.Where(worm => worm.AppleId == apple.Id);
return apple.Name == targetAppleName
&& (wormsInApple.Any(worm => wormTypes.Contains(worm.WormType)) == false
|| wormsInApple.All(worm => worm.HungerValue == 500));
});
For each apple, create a collection of worms in that apple. Return only apples that match the required name AND (contain no worms that are in WormType OR only contain worms with a HungerValue of 500).
You were so close in your first attempt. But instead of a Join which multiplies the apples you really need GroupJoin which "Correlates the elements of two sequences based on key equality and groups the results". In query syntax it's represented by the join .. into clause.
var query =
from apple in apples
join worm in worms on apple.Id equals worm.AppleId into appleWorms
where apple.Name == targetAppleName
&& (!appleWorms.Any(worm => wormTypes.Contains(worm.WormType))
|| appleWorms.All(worm => worm.HungerValue == 500))
select apple;
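The same per-apple grouping can also be precomputed with ToLookup, which indexes the worms by AppleId once and returns an empty sequence for apples without worms. Here is a minimal sketch of that variant; AppleQueries and FindApples are made-up names, and the HashSet is only an optional speed-up for the wormTypes check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Apple
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Worm
{
    public int AppleId { get; set; }
    public int WormType { get; set; }
    public int HungerValue { get; set; }
}

// Hypothetical helper, not part of the original question.
public static class AppleQueries
{
    public static List<Apple> FindApples(List<Apple> apples, List<Worm> worms,
                                         string targetAppleName, List<int> wormTypes)
    {
        var wormsByApple = worms.ToLookup(w => w.AppleId); // index worms once
        var excludedTypes = new HashSet<int>(wormTypes);

        return apples
            .Where(a => a.Name == targetAppleName)
            .Where(a =>
            {
                var own = wormsByApple[a.Id]; // empty when the apple has no worms
                return !own.Any(w => excludedTypes.Contains(w.WormType))
                       || own.All(w => w.HungerValue == 500);
            })
            .ToList();
    }
}

public static class AppleDemo
{
    public static void Main()
    {
        var apples = new List<Apple> { new Apple { Id = 1, Name = "Gala" }, new Apple { Id = 2, Name = "Gala" } };
        var worms = new List<Worm> { new Worm { AppleId = 2, WormType = 4, HungerValue = 100 } };
        foreach (var a in AppleQueries.FindApples(apples, worms, "Gala", new List<int> { 4, 5 }))
            Console.WriteLine(a.Id);
    }
}
```

Note that an apple without any worms passes both conditions, which matches the "does not contain" wording of the question.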
Using lambda would look like this:
var result = apples.Where(a =>
a.Name == targetAppleName &&
(worms.Any(w => w.AppleId == a.Id && w.HungerValue >= 500) ||
worms.All(w => w.AppleId != a.Id)));
I think the lambda makes the code a bit cleaner and easier to read. I also suspect that Any() and All() are more efficient than a full-on join, IMHO, but I haven't tested this with any heavy data, so it is hard to speak with authority here (plus, there can't be that many apples...!)
BTW, this is the entire body of code. Kind of surprised it doesn't work for you. Maybe you missed something...?
public class Apple
{
public int Id { get; set; }
public string Name { get; set; }
}
public class Worm
{
public int AppleId { get; set; }
public int WormType { get; set; }
public int HungerValue { get; set; }
}
void Main()
{
var apples = Enumerable.Range(1, 9).Select(e => new Apple { Id = e, Name = "Apple_" + e}).ToList();
var worms = Enumerable.Range(1, 9).SelectMany(a =>
Enumerable.Range(1, 5).Select((e, i) => new Worm { AppleId = a, WormType = e, HungerValue = i %2 == 0 ? a * e * 20 : 100 })).ToList();
DoLINQ(apples, worms, "Apple_4", new[] {4, 5});
}
public void DoLINQ(IList<Apple> apples, IList<Worm> worms, string targetAppleName, IList<int> wormTypes)
{
// Write LINQ Query here
var result = apples.Where(a =>
a.Name == targetAppleName &&
(worms.All(w => w.AppleId != a.Id) || worms.Any(w => w.AppleId == a.Id && w.HungerValue >= 500)));
result.Dump(); // remark this out if you're not using LINQPad
apples.Dump(); // remark this out if you're not using LINQPad
worms.Dump(); // remark this out if you're not using LINQPad
}
I have modified your query but haven't tested it yet; have a look and try it. Hopefully it will solve your problem.
var query =
from a in apples
join w in worms
on a.Id equals w.AppleId into pt
from w in pt.DefaultIfEmpty()
where (a.Name == targetAppleName) && (w == null || !wormTypes.Any(p => p == w.WormType) || (w.HungerValue == 500))
select a;
Thanks.

LINQ query to select rows matching an array of pairs

Right now, I have a class called TrainingPlan that looks like this:
public class TrainingPlan
{
public int WorkgroupId { get; set; }
public int AreaId { get; set; }
}
I'm given an array of these instances, and need to load the matching training plans from the database. The WorkgroupId and AreaId basically form a compound key. What I'm doing now is looping through each TrainingPlan like so:
foreach (TrainingPlan plan in plans)
LoadPlan(pid, plan.AreaId, plan.WorkgroupId);
Then, LoadPlan has a LINQ query to load the individual plan:
var q = from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
where tp.PROJECTID == pid && tp.AREAID == areaid &&
tp.WORKGROUPID == workgroupid
select tp;
return q.FirstOrDefault();
The Problem:
This works, however it's very slow for a large array of plans. I believe this could be much faster if I could perform a single LINQ query to load in every TPM_TRAININGPLAN at once.
My Question:
Given an array of TrainingPlan objects, how can I load every matching WorkgroupId/AreaId combination at once? This query should translate into similar SQL syntax:
SELECT * FROM TPM_TRAININGPLANS
WHERE (AREAID, WORKGROUPID) IN ((1, 2), (3, 4), (5, 6), (7, 8));
I've used Contains to run a bulk filter similar to WHERE ... IN, and I set up a rough approximation of your scenario. In my in-memory test, the single select queries actually ran quicker than Contains did. I recommend running a similar test on your end with the database tied in to see how your results turn out, and ideally how they scale too. I'm running .NET 4.0 in Visual Studio 2012. I added ToList() calls to force evaluation, so that deferred execution does not skew the timings.
public class TrainingPlan
{
public int WorkgroupId { get; set; }
public int AreaId { get; set; }
public TrainingPlan(int workGroupId, int areaId)
{
WorkgroupId = workGroupId;
AreaId = areaId;
}
}
public class TrainingPlanComparer : IEqualityComparer<TrainingPlan>
{
public bool Equals(TrainingPlan x, TrainingPlan y)
{
//Check whether the compared objects reference the same data.
if (x.WorkgroupId == y.WorkgroupId && x.AreaId == y.AreaId)
return true;
return false;
}
public int GetHashCode(TrainingPlan trainingPlan)
{
if (ReferenceEquals(trainingPlan, null))
return 0;
int wgHash = trainingPlan.WorkgroupId.GetHashCode();
int aHash = trainingPlan.AreaId.GetHashCode();
return wgHash ^ aHash;
}
}
internal class Class1
{
private static void Main()
{
var plans = new List<TrainingPlan>
{
new TrainingPlan(1, 2),
new TrainingPlan(1, 3),
new TrainingPlan(2, 1),
new TrainingPlan(2, 2)
};
var filter = new List<TrainingPlan>
{
new TrainingPlan(1, 2),
new TrainingPlan(1, 3),
};
Stopwatch resultTimer1 = new Stopwatch();
resultTimer1.Start();
var results = plans.Where(plan => filter.Contains(plan, new TrainingPlanComparer())).ToList();
resultTimer1.Stop();
Console.WriteLine("Elapsed Time for filtered result {0}", resultTimer1.Elapsed);
Console.WriteLine("Result count: {0}",results.Count());
foreach (var item in results)
{
Console.WriteLine("WorkGroup: {0}, Area: {1}",item.WorkgroupId, item.AreaId);
}
resultTimer1.Reset();
resultTimer1.Start();
var result1 = plans.Where(p => p.AreaId == filter[0].AreaId && p.WorkgroupId == filter[0].WorkgroupId).ToList();
var result2 = plans.Where(p => p.AreaId == filter[1].AreaId && p.WorkgroupId == filter[1].WorkgroupId).ToList();
resultTimer1.Stop();
Console.WriteLine("Elapsed time for single query result: {0}",resultTimer1.Elapsed);//single query is faster
Console.ReadLine();
}
}
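For the pair matching itself, one common pattern is a two-step filter: first narrow the rows with Contains on the individual id lists, then keep only the exact (AreaId, WorkgroupId) pairs using a HashSet. The sketch below runs purely in memory. PlanRow, PlanLoader and LoadMatching are hypothetical names, and the in-memory table stands in for context.TPM_TRAININGPLAN. That the first step would translate to SQL IN clauses against a real context is an assumption to verify for your provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TrainingPlan
{
    public int WorkgroupId { get; set; }
    public int AreaId { get; set; }
}

// Hypothetical stand-in for a TPM_TRAININGPLAN row.
public class PlanRow
{
    public int ProjectId;
    public int AreaId;
    public int WorkgroupId;
}

public static class PlanLoader
{
    public static List<PlanRow> LoadMatching(IEnumerable<PlanRow> table, int pid, IEnumerable<TrainingPlan> plans)
    {
        var wanted = new HashSet<(int, int)>(plans.Select(p => (p.AreaId, p.WorkgroupId)));
        var areaIds = wanted.Select(k => k.Item1).Distinct().ToList();
        var workgroupIds = wanted.Select(k => k.Item2).Distinct().ToList();

        // Step 1: coarse filter on each column separately (may over-select).
        var candidates = table.Where(tp => tp.ProjectId == pid
                                           && areaIds.Contains(tp.AreaId)
                                           && workgroupIds.Contains(tp.WorkgroupId));

        // Step 2: exact pair match in memory.
        return candidates.AsEnumerable()
                         .Where(tp => wanted.Contains((tp.AreaId, tp.WorkgroupId)))
                         .ToList();
    }
}

public static class PlanDemo
{
    public static void Main()
    {
        var table = new List<PlanRow>
        {
            new PlanRow { ProjectId = 1, AreaId = 1, WorkgroupId = 2 },
            new PlanRow { ProjectId = 1, AreaId = 1, WorkgroupId = 4 }
        };
        var plans = new[] { new TrainingPlan { AreaId = 1, WorkgroupId = 2 } };
        foreach (var row in PlanLoader.LoadMatching(table, 1, plans))
            Console.WriteLine(row.AreaId + "," + row.WorkgroupId);
    }
}
```

The coarse step can return combinations that are not in the list (an area from one plan with a workgroup from another), which is why the second step is needed.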
It seems to me that using Intersect() may get this done the way that you want. But, I don't have an environment set up to test this myself.
var q = (from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
where pid == tp.PROJECTID
select tp)
.Intersect
(from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
where plans.Any(p => p.AreaId == tp.AREAID)
select tp)
.Intersect
(from tp in context.TPM_TRAININGPLAN.Include("TPM_TRAININGPLANSOLUTIONS")
where plans.Any(p => p.WorkgroupId == tp.WORKGROUPID)
select tp);
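My only concern is that Intersect could cause it to load more records into memory than you would want, but I'm unable to test whether that's the case. Note also that this checks AreaId and WorkgroupId independently, so it can return a row whose area comes from one plan and whose workgroup comes from another.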

How to perform this kind of Distinct operation with LINQ?

I have the following foreach loop:
List<WorkingJournal> workingJournals = new List<WorkingJournal>();
foreach (WorkRoster workRoster in workRosters)
{
bool exists = workingJournals.Any(workingJournal => workingJournal.ServicePlan.Id == workRoster.ServicePlan.Id
&& workingJournal.Nurse.Id == workRoster.Nurse.Id
&& workingJournal.Month == workRoster.Start.Month
&& workingJournal.Year == workRoster.Start.Year);
if (exists == false)
{
WorkingJournal workingJournal = new WorkingJournal
{
ServicePlan = workRoster.ServicePlan,
Nurse = workRoster.Nurse,
Month = workRoster.Start.Month,
Year = workRoster.Start.Year
};
workingJournals.Add(workingJournal);
}
}
I started writing:
from workRoster in workRosters
select new WorkingJournal
{
ServicePlan = workRoster.ServicePlan,
Nurse = workRoster.Nurse,
Month = workRoster.Start.Month,
Year = workRoster.Start.Year
};
But now I am stuck with the comparison that produces distinct WorkingJournals.
I have a feeling that a group by clause should be here but I'm not sure how it should be done.
Assuming LINQ to objects:
(from workRoster in workRosters
select new WorkingJournal
{
ServicePlan = workRoster.ServicePlan,
Nurse = workRoster.Nurse,
Month = workRoster.Start.Month,
Year = workRoster.Start.Year
}).Distinct();
Note that for this to work, Equals and GetHashCode must be implemented for the WorkingJournal class. If they are not, see Anthony's answer: How to perform this kind of Distinct operation with LINQ?
If it's LINQ to SQL you could group by the new expression, then select the group key:
from workRoster in workRosters
group workRoster by new WorkingJournal
{
ServicePlan = workRoster.ServicePlan,
Nurse = workRoster.Nurse,
Month = workRoster.Start.Month,
Year = workRoster.Start.Year
} into workRosterGroup
select workRosterGroup.Key;
If you have proper Equals and GetHashCode implementations inside your class, you can simply invoke Distinct().
var result = workRosters.Select(...).Distinct();
If you do not have such implementations, you can define an IEqualityComparer<WorkingJournal> instead. It supplies the Equals and GetHashCode logic for the type from the outside, and it can be passed to a Dictionary or HashSet as well as to the Distinct() overload in LINQ.
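class JournalComparer : IEqualityComparer<WorkingJournal>
{
public bool Equals(WorkingJournal left, WorkingJournal right)
{
// Same journal = same service plan, nurse, month and year (as in the question's loop).
if (ReferenceEquals(left, right)) return true;
if (ReferenceEquals(left, null) || ReferenceEquals(right, null)) return false;
return left.ServicePlan.Id == right.ServicePlan.Id
&& left.Nurse.Id == right.Nurse.Id
&& left.Month == right.Month
&& left.Year == right.Year;
}
public int GetHashCode(WorkingJournal obj)
{
// Must be consistent with Equals, so hash parts of the same key.
return obj.ServicePlan.Id.GetHashCode() ^ obj.Nurse.Id.GetHashCode();
}
}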
var comparer = new JournalComparer(); // implements the interface
var result = workRosters.Select(r => new WorkingJournal { ... }).Distinct(comparer);
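The group-by idea from the question also works without any comparer if the key is an anonymous type, because anonymous types compare by the values of their properties. A minimal sketch for LINQ to Objects follows. The ServicePlan, Nurse, WorkRoster and WorkingJournal classes are reduced to the members used in the question and assume int Ids; JournalBuilder and FromRosters are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced, hypothetical versions of the question's classes.
public class ServicePlan { public int Id; }
public class Nurse { public int Id; }

public class WorkRoster
{
    public ServicePlan ServicePlan;
    public Nurse Nurse;
    public DateTime Start;
}

public class WorkingJournal
{
    public ServicePlan ServicePlan;
    public Nurse Nurse;
    public int Month;
    public int Year;
}

public static class JournalBuilder
{
    public static List<WorkingJournal> FromRosters(IEnumerable<WorkRoster> workRosters)
    {
        return workRosters
            // The anonymous key has value-based Equals/GetHashCode.
            .GroupBy(r => new { PlanId = r.ServicePlan.Id, NurseId = r.Nurse.Id, r.Start.Month, r.Start.Year })
            .Select(g => new WorkingJournal
            {
                ServicePlan = g.First().ServicePlan,
                Nurse = g.First().Nurse,
                Month = g.Key.Month,
                Year = g.Key.Year
            })
            .ToList();
    }
}

public static class JournalDemo
{
    public static void Main()
    {
        var plan = new ServicePlan { Id = 1 };
        var nurse = new Nurse { Id = 1 };
        var rosters = new List<WorkRoster>
        {
            new WorkRoster { ServicePlan = plan, Nurse = nurse, Start = new DateTime(2020, 1, 5) },
            new WorkRoster { ServicePlan = plan, Nurse = nurse, Start = new DateTime(2020, 1, 20) }
        };
        Console.WriteLine(JournalBuilder.FromRosters(rosters).Count);
    }
}
```

Like the original foreach loop, this keeps the ServicePlan and Nurse of the first roster seen for each key.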
