Entity framework distinct but not on all columns - c#

I'd like to make a query through Entity Framework that unions contacts from two different tables, removes duplicates, and orders by date. The problem is that the differing dates make the same contact appear as unique. I don't want to include the date in the distinct, but I do need it afterwards for the ordering. I can't order first, drop the date and then perform the distinct, because the distinct changes the ordering. Nor can I order before the union, because that doesn't guarantee the ordering after the union.
I would like to apply the distinct to all fields except the date, which is only required for the ordering.
Ideally I would pass a comparer to Distinct, but EF can't translate that to SQL.
db.Packages.Select(p => new Recent()
{
    Name = p.Attention, Address1 = p.Address1, ... , Date = p.ShippingDate
})
.Concat(db.Letters.Select(l => new Recent()
{
    Name = l.AddressedTo, Address1 = l.Address1, ..., Date = l.MarkedDate
}))
.Distinct()
.OrderByDescending(r => r.Date);
Or, the problem expressed in SQL:
SELECT DISTINCT Attention, Address1, ShippingDate
FROM Packages
UNION ALL
SELECT AddressedTo, Address1, MarkedDate
FROM Letters
ORDER BY ShippingDate DESC

You should be able to use GroupBy to do what you want, like so (not to mention that GroupBy is more performant than Distinct in EF):
db.Packages.Select(p => new Recent()
{
    Name = p.Attention, Address1 = p.Address1, ... , Date = p.ShippingDate
})
.Concat(db.Letters.Select(l => new Recent()
{
    Name = l.AddressedTo, Address1 = l.Address1, ..., Date = l.MarkedDate
}))
.GroupBy(p => <parameters to group by - which make the record distinct>)
.Select(g => new { Contact = g.Key, LastShippingDate = g.Max(p => p.Date) });
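As a rough illustration of the GroupBy idea, here is a minimal in-memory sketch. It assumes a simplified Recent type with only Name, Address1 and Date (the real type has more fields, elided in the question), and the class and method names are hypothetical. It runs over plain collections; against EF the same query shape would start from db.Packages and db.Letters, and whether it translates cleanly depends on the EF version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the projected row type in the question.
public class Recent
{
    public string Name { get; set; } = "";
    public string Address1 { get; set; } = "";
    public DateTime Date { get; set; }
}

public static class RecentContacts
{
    // Groups on every field except Date, keeps the latest Date per contact,
    // then orders the contacts by that date, newest first.
    public static List<Recent> DistinctByContact(
        IEnumerable<Recent> packages, IEnumerable<Recent> letters)
    {
        return packages
            .Concat(letters)
            .GroupBy(r => new { r.Name, r.Address1 })
            .Select(g => new Recent
            {
                Name = g.Key.Name,
                Address1 = g.Key.Address1,
                Date = g.Max(r => r.Date)
            })
            .OrderByDescending(r => r.Date)
            .ToList();
    }
}
```

Because the date is aggregated with Max rather than left to Distinct, each contact ends up with a well-defined date, so the final ordering is predictable.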

I'd be concerned with this approach: even if it were possible, Distinct would then remove one of the items and leave you with an arbitrary date out of the two, so your sort would be totally unpredictable.

Related

Linq select in select

I am struggling with converting a MySQL query to LINQ syntax in C# (for use with Entity Framework). The MySQL query looks like this:
SELECT *
FROM Availability as tableData
WHERE ID = (
SELECT Availability.ID
FROM Availability
WHERE Availability.FrameID = tableData.FrameID
ORDER BY Availability.Date DESC limit 1)
I don't know how to convert the FROM table AS someName part.
So far the only solution I have is to execute a raw SQL query such as:
dataContext.Availability.SqlQuery("SELECT * FROM Availability as tableData WHERE ID = (SELECT ID FROM Availability WHERE FrameID = tableData.FrameID ORDER BY Availability.Date DESC limit 1)").ToArray();
But it would be nice to know if linq can provide such a query.
Thanks in advance, for your answers!
If you need only the latest record for every frame ID, then use grouping:
dataContext.Availability
.GroupBy(a => a.FrameID)
.Select(g => g.OrderByDescending(a => a.Date).FirstOrDefault());
This query produces the required result, though the generated SQL will be a little different. It will look like:
SELECT /* limit1 fields */
FROM (
    SELECT DISTINCT tableData.FrameID
    FROM Availability AS tableData) AS distinct1
OUTER APPLY (
    SELECT TOP(1) /* project1 fields */
    FROM (SELECT /* extent1 fields */
        FROM Availability AS extent1
        WHERE extent1.FrameID = distinct1.FrameID) AS project1
    ORDER BY project1.Date DESC) AS limit1
NOTE: EF does not support First() inside this projection; use FirstOrDefault() instead.
Take all the Availabilities, group them by FrameId, order each group by date descending, and take the first entry of each group.
The ToList() at the end fetches all the results and puts them in a List.
var tableDate = dataContext.Availability
.GroupBy(x => x.FrameId)
.Select(x => x.OrderByDescending(y => y.Date).FirstOrDefault())
.ToList();
Yes Linq can do this, but you need to have a starting sequence on which the linq should operate. Usually this sequence has the same type as your table, in your case Availability.
From your sql I gather that each record in the Availabilities table has at least properties Id, FrameId and Date:
class Availability
{
public int Id {get; set;}
public int FrameId {get; set;}
public DateTime Date {get; set;}
}
Of course this can also be an anonymous type. The important thing is that you somehow have a sequence of items with these properties:
IQueryable<Availability> availabilities = ...
You wrote:
I need only one record (with max Date of insert) for every FrameID
So every Availability has a FrameId, and you want for every FrameId the record with the highest Date value.
You could use Enumerable.GroupBy and group by FrameId
var groupsWithSameFrameId = availabilities.GroupBy(availability => availability.FrameId);
The result is a sequence of groups. Every group contains the sequence of all availabilities with the same FrameId. In other words: if you take a group, you'll have a group.Key with a FrameId value and a sequence of all availabilities that have this FrameId value.
We won't use the group.Key.
If you sort the elements of each group in descending order by Date and take the first element, you'll have the availability with the highest Date value:
var recordWithMaxDateOfInsert = groupsWithSameFrameId
.Select(group => group.OrderByDescending(groupElement => groupElement.Date)
.FirstOrDefault());
From every group sort all elements of the group by descending Date value and take the first element of the sorted group.
Result: from your original availabilities, you have for every frameId the availability with the highest value for date.
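The grouping approach described in these answers can be sketched over an in-memory collection as follows. The Availability class mirrors the one assumed above, and the LatestPerFrame helper is a hypothetical name used only for illustration. Since this runs in LINQ to Objects, where groups are never empty, First() is safe here; in an EF query you would likely need FirstOrDefault() instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Availability
{
    public int Id { get; set; }
    public int FrameId { get; set; }
    public DateTime Date { get; set; }
}

public static class LatestPerFrame
{
    // For every FrameId, returns the single record with the highest Date.
    public static List<Availability> Select(IEnumerable<Availability> availabilities)
    {
        return availabilities
            .GroupBy(a => a.FrameId)
            .Select(g => g.OrderByDescending(a => a.Date).First())
            .ToList();
    }
}
```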

With LINQ DISTINCT a Data Table Multiple Columns Excluding a Single Column

I have a C# DataTable that I am retrieving data into. After that I am trying to remove duplicate entries while creating a List<MyObject>.
Here is the code I am working with:
viewModelList = (from item in response.AsEnumerable()
select new
{
description = DataTableOperationHelper.GetStringValue(item, "description"),
unitCost = DataTableOperationHelper.GetDecimalValue(item, "unitcost"),
defaultChargeable = DataTableOperationHelper.GetBoolValue(item, "defaultChargeable"),
contractId = DataTableOperationHelper.GetIntValue(item, "contractID"),
consumableid = DataTableOperationHelper.GetIntValue(item, "consumableid")
})
.Distinct()
.Select(x => new ConsumablesViewModel(
x.description,
x.unitCost,
x.defaultChargeable,
x.contractId,
x.consumableid)
)
.ToList();
I just want to exclude a single column (consumableid) when applying Distinct. How can I apply Distinct to the rest of the data while excluding that single value (consumableid)?
Take a look at this answered question (LinQ distinct with custom comparer leaves duplicates).
Basically, you create an equality comparer for your type that allows you to decide what makes an object distinct.
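A minimal sketch of such a comparer is shown below. The ConsumableRow type and the comparer name are hypothetical stand-ins for the anonymous type in the question; the point is only that Equals and GetHashCode both ignore ConsumableId. The tuple-based hash assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the anonymous type in the question.
public class ConsumableRow
{
    public string Description { get; set; } = "";
    public decimal UnitCost { get; set; }
    public bool DefaultChargeable { get; set; }
    public int ContractId { get; set; }
    public int ConsumableId { get; set; }
}

// Treats two rows as equal when every field except ConsumableId matches.
public class IgnoreConsumableIdComparer : IEqualityComparer<ConsumableRow>
{
    public bool Equals(ConsumableRow x, ConsumableRow y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Description == y.Description
            && x.UnitCost == y.UnitCost
            && x.DefaultChargeable == y.DefaultChargeable
            && x.ContractId == y.ContractId;
    }

    // Must hash exactly the fields compared in Equals, otherwise
    // Distinct can leave duplicates behind.
    public int GetHashCode(ConsumableRow obj)
    {
        return (obj.Description, obj.UnitCost, obj.DefaultChargeable, obj.ContractId).GetHashCode();
    }
}
```

It would be used as rows.Distinct(new IgnoreConsumableIdComparer()), before projecting into the view model.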

Linq lambda expression many to many table select

I have three tables, two of which are in a many-to-many relationship.
Picture:
This is the data in middle mm table:
Edit:
I've got this far: I get the expected 4 rows back, but they all contain the same result (I know I need 4 rows back, but they should contain different results).
return this._mediaBugEntityDB.LotteryOffers
.Find(lotteryOfferId).LotteryDrawDates
.Join(this._mediaBugEntityDB.Lotteries, ldd => ldd.LotteryId, lot => lot.Id, (ldd, lot) =>
new Lottery
{
Name = lot.Name,
CreatedBy = lot.CreatedBy,
ModifiedOn = lot.ModifiedOn
}).AsQueryable();
My question is: how can I retrieve all the Lotteries via the many-to-many table when I am only given a LotteryOfferId?
What I want to achieve is to get data from the Lottery table by LotteryDrawDateId.
First I use the LotteryOfferId to get the draw dates from the middle table; from the middle table I get the drawDateIds to use in the LotteryDrawDate table. From that table I need to retrieve the Lottery rows by the LotteryId in the LotteryDrawDate table.
I achieve this with plain SQL (LotteryOffersLotteryDrawDates is the middle table in the DB, not seen in the model):
select Name, Lotteries.CreatedBy, Lotteries.ModifiedOn, count(Lotteries.Id) as TotalDrawDates
from Lotteries
join LotteryDrawDates on Lotteries.Id = LotteryDrawDates.LotteryId
join LotteryOffersLotteryDrawDates on LotteryDrawDates.Id = LotteryOffersLotteryDrawDates.LotteryDrawDate_Id
where LotteryOffersLotteryDrawDates.LotteryOffer_Id = 19
group by Name, Lotteries.CreatedBy, Lotteries.ModifiedOn
But Linq is different story :P
I would like to do this with lambda expressions.
Thanks
db.LotteryOffer.Where(lo => lo.Id == <lotteryOfferId>)
.SelectMany(lo => lo.LotteryDrawDates)
.Select( ldd => ldd.Lottery )
.GroupBy( l => new { l.Name, l.CreatedBy, l.ModifiedOn } )
.Select( g => new
{
g.Key.Name,
g.Key.CreatedBy,
g.Key.ModifiedOn,
TotalDrawDates = g.Count()
} );
You can do this:
var query = from lo in this._mediaBugEntityDB.LotteryOffers
where lo.lotteryOfferId == lotteryOfferId
from ld in lo.LotteryDrawDates
group ld by ld.Lottery into grp
select grp.Key;
I do this in query syntax, because (in my opinion) it is easier to see what happens. The main point is the grouping by Lottery, because you get a number of LotteryDrawDates any of which can have the same Lottery.
If you want to display the counts of LotteryDrawDates per Lottery it's better to take a different approach:
from lot in this._mediaBugEntityDB.Lotteries.Include(x => x.LotteryDrawDates)
where lot.LotteryDrawDates
.Any(ld => ld.LotteryOffers
.Any(lo => lo.lotteryOfferId == lotteryOfferId))
select lot
Now you get Lottery objects with their LotteryDrawDates collections loaded, so afterwards you can access lottery.LotteryDrawDates.Count() without lazy loading exceptions.

LINQ contains between 2 lists

I have a List<string> and a List<supplier>.
The string list contains some search terms and the supplier list contains supplier objects.
Now I need to find all the supplier names that match any of the items in the List<string>.
This is one of my failed attempts:
var query = some join with the supplier table.
query = query.Where(k => stringlist.Contains(k.companyname)).Select(...).ToList();
Any idea how to do that?
EDIT:
Maybe my question is not clear enough. I need to find a list of suppliers (not only names, the whole object) where the supplier's name matches any item in the string list.
If I do
query = query.Where(k => k.companyname.Contains("any_string")).Select(...).ToList();
it works, but this is not my requirement.
My requirement is a list of strings, not a single string.
The following query will return the distinct supplier names that exist in the list of names:
suppliers.Where(s => stringlist.Contains(s.CompanyName))
.Select(s => s.CompanyName) // remove if you need whole supplier object
.Distinct();
The generated SQL query will look like:
SELECT DISTINCT [t0].[CompanyName]
FROM [dbo].[Supplier] AS [t0]
WHERE [t0].[CompanyName] IN (@p0, @p1, @p2)
BTW, consider using better names, e.g. companyNames instead of stringlist.
You could use Intersect for this (for just matching names):
var suppliersInBothLists = supplierNames
.Intersect(supplierObjects.Select(s => s.CompanyName));
After your EDIT, for suppliers (not just names):
var suppliers = supplierObjects.Where(s => supplierNames.Contains(s.CompanyName));
var matches = yourList.Where(x => stringList.Contains(x.CompanyName)).Select(x => x.CompanyName).ToList();
Either use a join as Tim suggested, or just use a HashSet directly. This is much more efficient than using .Contains on a List as in some of the other answers.
var stringSet = new HashSet<string>(stringList);
var result = query.Where(q => stringSet.Contains(q.Name));
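For in-memory collections, the HashSet approach might look like the sketch below. The Supplier class and the SupplierSearch helper are hypothetical names for illustration, and the comparison is the default case-sensitive one. This is aimed at LINQ to Objects; for a query that runs in the database, passing a plain list to Contains is the form that typically translates to an IN clause.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical supplier type with just the property needed for matching.
public class Supplier
{
    public string CompanyName { get; set; } = "";
}

public static class SupplierSearch
{
    // Returns the whole supplier objects whose CompanyName appears in names.
    public static List<Supplier> MatchByName(
        IEnumerable<Supplier> suppliers, IEnumerable<string> names)
    {
        // A HashSet gives constant-time lookups on average, instead of
        // scanning the whole list for every supplier.
        var nameSet = new HashSet<string>(names);
        return suppliers.Where(s => nameSet.Contains(s.CompanyName)).ToList();
    }
}
```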

Order a List<T> by date, where items with a repeated string ID and a different date should appear directly below the first occurrence

I couldn't express what I am asking any better in the title.
This is what I'm looking for.
I have an unordered List of a specific object that has a DateTime property and a String property.
The String property has values like these (note that it is a string, not a number; it always has the letter K, and ordering should use just the numbers):
K07000564,
K07070000
K07069914
K07026318
K07019189
What I want is to order the list by date, but if the same String value appears elsewhere in the collection with another date, I want those items placed right after it (also ordered by date within that mini-group of IdFinders), and then continue ordering the rest.
Something like this:
Edit
I edited the example to clarify that ordering by IdFinder will not work. I need to order by date; if an IdFinder occurs more than once in the collection, its other occurrences should be shown right after the most recent one, and then ordering continues with the rest, and so on for each IdFinder.
ID Date
**K07000564** Today
K07000562 Yesterday
K07000563 The Day Before Yesterday
**K07000564** The day before the day before yesterday
Should be
K07000564 Today
K07000564 The day before the day before yesterday
K07000562 Yesterday
K07000563 The Day Before Yesterday
I achieved this in SQL Server 2008 in a project before with something like this:
WITH B
AS
(
SELECT
ID,
MAX(DATE_COLUMN) DATE_COLUMN,
ROW_NUMBER() OVER (ORDER BY MAX(DATE_COLUMN) DESC) RN
FROM MYTABLE
GROUP BY ID
)
SELECT *
FROM MYTABLE c
, B
WHERE c.ID = b.ID
ORDER BY b.rn, c.DATE_COLUMN desc;
But I'm not good with LINQ and I have no idea how to do this in LINQ.
Maybe an important note: I'm on .NET 2.0, so no LINQ is available, but I'm using Linqbridge to use LINQ.
I tried this but, as you will notice, it does not work:
oList.OrderBy(i => i.IdFinder).ThenByDescending(i => i.OperationDate);
I hope I have explained this clearly.
var result = oList.OrderByDescending(x => x.OperationDate)
.GroupBy(x => x.IdFinder)
.SelectMany(x => x);
I think this should do the trick:
var sortedList = oList
.GroupBy(x => x.IdFinder)
.Select(g =>
new
{
MaxOpDate = g.Max(x => x.OperationDate),
Items = g
})
.OrderByDescending(g => g.MaxOpDate)
.SelectMany(g => g.Items.OrderByDescending(x => x.OperationDate));
However, I haven't tested it with Linqbridge.
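The group-then-order idea can be checked against the example from the question with a small sketch like the one below. The Operation class and the GroupedOrdering helper are hypothetical names; only IdFinder and OperationDate come from the question. It is written against standard LINQ to Objects and, like the answer above, has not been tried with Linqbridge.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type with the two properties described in the question.
public class Operation
{
    public string IdFinder { get; set; } = "";
    public DateTime OperationDate { get; set; }
}

public static class GroupedOrdering
{
    // Orders groups by their most recent date (newest first), and the
    // items inside each group by date (newest first).
    public static List<Operation> SortByLatestGroup(IEnumerable<Operation> items)
    {
        return items
            .GroupBy(x => x.IdFinder)
            .OrderByDescending(g => g.Max(x => x.OperationDate))
            .SelectMany(g => g.OrderByDescending(x => x.OperationDate))
            .ToList();
    }
}
```

This mirrors the SQL in the question: the group's maximum date plays the role of the ROW_NUMBER ranking, and the inner ordering corresponds to the secondary sort on the date column.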
Try this:
oList.OrderBy(i => int.Parse(i.IdFinder.Substring(1, i.IdFinder.Length - 1))).ThenByDescending(i => i.OperationDate);
First extract the numeric value from IdFinder, then order by this value.
Note: it is assumed that i.IdFinder is always a valid "K######", where the digits form a number no greater than Int32.MaxValue.
