How to apply self join in Linq Query? - c#

Books Table
Id VendorId ASIN Price
-- -------- ---- ------
1 gold123 123 10
2 sil123 123 11
3 gold456 456 15
4 gold678 678 12
5 sil456 456 12
6 gold980 980 12
I want to write a LINQ query that returns every gold row for which no corresponding sil vendor ID exists. The last three digits of the vendor ID correspond to the ASIN column in that row.
For example, for gold123 a corresponding sil123 exists, so that row will not be returned, but for gold678 and gold980 no corresponding sil row exists, so those rows will be returned.
I tried the following:
var gold = _repository.Query<Books>().Where(x =>
x.VendorId.Contains("gold"))
.OrderBy(x => x.Id).Skip(0).Take(500).ToList();
var asinsForGold = gold.Select(x => x.ASIN).ToList();
var correspondingSilver = _repository.Query<Books>().Where(x =>
x.VendorId.Contains("sil")
&& asinsForGold.Contains(x.ASIN)).ToList();
var correspondingSilverAsins = correspondingSilver.Select(x => x.ASIN).ToList();
var goldWithoutCorrespondingSilver = gold.Where(x =>
!correspondingSilverAsins.Contains(x.ASIN));
Can we apply a self join, or is there a better way to get the result in a single query instead of two queries and several intermediate lists?

It's just another predicate, "where a corresponding silver vendor doesn't exist":
var goldWoSilver = _repository.Query<Books>()
.Where(x => x.VendorId.Contains("gold"))
.Where(x => !_repository.Query<Books>()
.Any(s => s.ASIN == x.ASIN
&& s.VendorId.Contains("sil")))
.OrderBy(x => x.Id).Skip(0).Take(500).ToList();
In many cases this is a successful recipe: start the query with the entity you want to return and only add predicates. In general, joins shouldn't be used for filtering, only to collect related data, although in that case navigation properties should be used which implicitly translate to SQL joins.
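A minimal in-memory sketch of the same "start with the entity and only add predicates" recipe, run with LINQ to Objects over the sample rows from the question. The Book class and the GoldWithoutSilver helper are illustrative names that do not appear in the original code; against a real provider the source would be _repository.Query<Books>() rather than a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Books entity from the question.
public class Book
{
    public int Id { get; set; }
    public string VendorId { get; set; }
    public string ASIN { get; set; }
    public decimal Price { get; set; }
}

public static class GoldWithoutSilverDemo
{
    // The six sample rows from the question's Books table.
    public static List<Book> SampleBooks() => new List<Book>
    {
        new Book { Id = 1, VendorId = "gold123", ASIN = "123", Price = 10 },
        new Book { Id = 2, VendorId = "sil123",  ASIN = "123", Price = 11 },
        new Book { Id = 3, VendorId = "gold456", ASIN = "456", Price = 15 },
        new Book { Id = 4, VendorId = "gold678", ASIN = "678", Price = 12 },
        new Book { Id = 5, VendorId = "sil456",  ASIN = "456", Price = 12 },
        new Book { Id = 6, VendorId = "gold980", ASIN = "980", Price = 12 },
    };

    // Start from the rows to return (gold), then filter with a
    // "no silver row shares this ASIN" predicate instead of joining.
    public static List<Book> GoldWithoutSilver(IEnumerable<Book> books)
    {
        var all = books.ToList();
        return all
            .Where(x => x.VendorId.Contains("gold"))
            .Where(x => !all.Any(s => s.ASIN == x.ASIN && s.VendorId.Contains("sil")))
            .OrderBy(x => x.Id)
            .ToList();
    }

    public static void Main()
    {
        foreach (var b in GoldWithoutSilver(SampleBooks()))
            Console.WriteLine($"{b.Id} {b.VendorId}");
    }
}
```

For the sample data this leaves the rows with Id 4 (gold678) and Id 6 (gold980), which is the result the question asks for.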

See if it helps -
var goldWithoutCorrespondingSilver = from b1 in books
join b2 in books on b1.ASIN equals b2.ASIN
where b1.VendorId.Contains("gold")
group b2 by b1.VendorId into g
where !g.Any(x => x.VendorId.Contains("sil"))
select g.FirstOrDefault();
What I have done is -
Selected records with matching ASIN
Grouped them by VendorID
Selected ones which do not have sil

Related

Selecting Distinct Count and Sum of columns received as sub-query in Entity Framework

I want to get summarized data for a report that shows the total amount and the supplier count per decision, using Entity Framework syntax. My result needs to include a SUM of Amount and a COUNT of distinct suppliers per decision.
I have a table of suppliers with the following columns:
SupplierNo | Decision | DecisionIssuedOn | Amount | SupplierGroup | SubSupplier
Raw SQL query to get above data for a specific time period is:
SELECT S.Decision, SUM(S.Amount) AS TotalAmount, COUNT(DISTINCT S.SupplierNo) AS SupplierCount
FROM (SELECT * FROM Indentors WHERE Indentors.DecisionIssuedOn BETWEEN '2018-01-01' AND '2018-12-31') S
GROUP BY S.Decision
Which gives data as:
Decision   | SupplierCount | Amount
------------------------------------
Approved   | 20            | 5000
Rejected   | 11            | 3000
In-Process | 5             | 1500
Now, from the front end, the condition parameters can be anything from the given pool of options (dropdowns), which, when selected, add a where clause to the existing query, like
WHERE Decision = 'Approved' AND SupplierGroup ='ABC' AND SubSupplier ='zxc'
The problem is I am having a hard time getting the desired result using Entity Framework lambda expressions instead of raw SQL.
What I did so far:
I checked which options were supplied by the front end to build the where clause:
IQueryable<Supplier> suppliers = this.db.suppliers.OrderByDescending(i => i.Id);
if (string.IsNullOrEmpty(selectedSupplierGroup) == false)
{
suppliers = suppliers.Where(i => i.SupplierGroup == selectedSupplierGroup);
}
if (string.IsNullOrEmpty(selectedSubSupplier) == false)
{
suppliers = suppliers.Where(i => i.SubSupplier == selectedSubSupplier);
}
if (string.IsNullOrEmpty(selectedDecision) == false)
{
suppliers = suppliers.Where(i => i.Decision == selectedDecision);
}
if (selectedDecisionIssuedOn.HasValue)
{
suppliers = suppliers.Where(i => i.DecisionIssuedOn >= selectedDecisionIssuedOn);
}
var result = suppliers
.GroupBy(i => i.Decision)
.Select(i => i.SupplierNo).Distinct().Count(); // Gives me error
The error is:
IGrouping does not contain a definition for SupplierNo, and no extension method blah blah blah...
But after that I am unable to get data as the raw query (described above) would get me. Thanks
This should give you a similar result to your SQL query. Give it a try and see how you get on:
var results = suppliers
.Where(i => i.DecisionIssuedOn >= selectedDecisionIssuedOn)
.GroupBy(i => i.Decision)
.Select(group => new
{
Decision = group.Key,
TotalAmount = group.Sum(g => g.Amount),
SupplierCount = group.Select(i => i.SupplierNo).Distinct().Count()
});
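To see the shape of that projection without a database, here is a small LINQ to Objects sketch. The Supplier and DecisionSummary classes, the Summarize helper, and the four sample rows are all made up for illustration (the question does not list any supplier rows); only the GroupBy / Sum / Distinct().Count() pattern is taken from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of the supplier entity.
public class Supplier
{
    public string SupplierNo { get; set; }
    public string Decision { get; set; }
    public decimal Amount { get; set; }
}

// Hypothetical result type, used here instead of an anonymous type.
public class DecisionSummary
{
    public string Decision { get; set; }
    public decimal TotalAmount { get; set; }
    public int SupplierCount { get; set; }
}

public static class DecisionSummaryDemo
{
    // Invented rows: supplier S1 appears twice under "Approved".
    public static List<Supplier> SampleSuppliers() => new List<Supplier>
    {
        new Supplier { SupplierNo = "S1", Decision = "Approved", Amount = 100 },
        new Supplier { SupplierNo = "S1", Decision = "Approved", Amount = 50 },
        new Supplier { SupplierNo = "S2", Decision = "Approved", Amount = 25 },
        new Supplier { SupplierNo = "S3", Decision = "Rejected", Amount = 40 },
    };

    // One row per decision: summed amount plus the number of distinct suppliers.
    public static List<DecisionSummary> Summarize(IEnumerable<Supplier> suppliers) =>
        suppliers
            .GroupBy(s => s.Decision)
            .Select(g => new DecisionSummary
            {
                Decision = g.Key,
                TotalAmount = g.Sum(s => s.Amount),
                SupplierCount = g.Select(s => s.SupplierNo).Distinct().Count()
            })
            .OrderBy(r => r.Decision, StringComparer.Ordinal)
            .ToList();

    public static void Main()
    {
        foreach (var r in Summarize(SampleSuppliers()))
            Console.WriteLine($"{r.Decision} {r.TotalAmount} {r.SupplierCount}");
    }
}
```

With these invented rows, "Approved" sums to 175 across 2 distinct suppliers and "Rejected" sums to 40 for 1 supplier. The key point is that SupplierNo is reached through the group's elements (g.Select(...)), not through the IGrouping itself, which is what the compiler error in the question was complaining about.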

Get list of items where their ID is equal to some Values - linq c#

Here is a query:
from order in db.tblCustomerBuys
where selectedProducts.Contains(order.ProductID)
select order.CustomerID;
selectedProducts is a list containing the target product IDs, for example { 1, 2, 3 }.
The query above returns the CustomerIDs of customers who have bought at least one of the selectedProducts. For example, if someone has bought product 1 or 2, their ID will be in the result.
But I need to collect the CustomerIDs of customers who have bought all of the products. For example, only someone who has bought product 1 AND 2 AND 3 should be in the result.
How do I edit this query?
the tblCustomerBuys are like this:
CustomerID - ID of Customer
ProductID - the product which the customer has bought
something like this:
CustomerID ProductID
---------------------------
110 1
110 2
112 3
112 3
115 5
Updated:
Based on the answers I should use grouping, and for other reasons I have to use this type of query:
var ID = from order in db.tblCustomerBuys
group order by order.CustomerID into g
where (selectedProducts.All(selProdID => g.Select(order => order.ProductID).Contains(selProdID)))
select g.Key;
but it will give this error:
Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator.
The updated query is the general LINQ solution of the issue.
But since your query provider does not support mixing the in memory sequences with database tables inside the query (other than Contains which is translated to SQL IN (value_list)), you need an alternative equivalent approach of All method, which could be to count the (distinct) matches and compare to the selected items count.
If the { CustomerID, ProductID } combination is unique in tblCustomerBuys, then the query could be as follows:
var selectedCount = selectedProducts.Distinct().Count();
var customerIDs =
from order in db.tblCustomerBuys
group order by order.CustomerID into customerOrders
where customerOrders.Where(order => selectedProducts.Contains(order.ProductID))
.Count() == selectedCount
select customerOrders.Key;
And if it's not unique, use the following criteria:
where customerOrders.Where(order => selectedProducts.Contains(order.ProductID))
.Select(order => order.ProductID).Distinct().Count() == selectedCount
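The count-and-compare idea can be checked in memory against the sample rows from the question. This is only a LINQ to Objects sketch: the Purchase class and the CustomersWhoBoughtAll helper are hypothetical names, and it uses the Distinct variant because the sample table contains a duplicate (112, 3) row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a tblCustomerBuys row.
public class Purchase
{
    public int CustomerID { get; set; }
    public int ProductID { get; set; }
}

public static class BoughtAllDemo
{
    // The sample rows from the question, including the duplicate (112, 3).
    public static List<Purchase> SamplePurchases() => new List<Purchase>
    {
        new Purchase { CustomerID = 110, ProductID = 1 },
        new Purchase { CustomerID = 110, ProductID = 2 },
        new Purchase { CustomerID = 112, ProductID = 3 },
        new Purchase { CustomerID = 112, ProductID = 3 },
        new Purchase { CustomerID = 115, ProductID = 5 },
    };

    // A customer qualifies when the number of distinct selected products
    // they bought equals the number of distinct selected products.
    public static List<int> CustomersWhoBoughtAll(
        IEnumerable<Purchase> purchases, IEnumerable<int> selectedProducts)
    {
        var selected = selectedProducts.Distinct().ToList();
        var selectedCount = selected.Count;

        return purchases
            .GroupBy(p => p.CustomerID)
            .Where(g => g.Where(p => selected.Contains(p.ProductID))
                         .Select(p => p.ProductID)
                         .Distinct()
                         .Count() == selectedCount)
            .Select(g => g.Key)
            .OrderBy(id => id)
            .ToList();
    }

    public static void Main()
    {
        var ids = CustomersWhoBoughtAll(SamplePurchases(), new[] { 1, 2 });
        Console.WriteLine(string.Join(", ", ids));
    }
}
```

With selectedProducts = { 1, 2 } only customer 110 qualifies; with { 3 } only customer 112 does, and the duplicate row does not distort the count because of Distinct.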
As your question is written, it is a bit difficult to understand your structure. If I have understood correctly, you have an enumerable selectedProducts, which contains several Ids. You also have an enumeration of order objects, which have two properties we care about, ProductId and CustomerId, which are integers.
In this case, this should do the job:
var result = db.tblCustomerBuys.GroupBy(order => order.CustomerId)
.Where(group => !selectedProducts.Except(group.Select(order => order.ProductId)).Any())
.Select(group => group.Key);
What we are doing here is grouping all the orders by CustomerId, so that we can treat each customer as a single value. Then we treat the product IDs in each group as a superset of selectedProducts, using a piece of LINQ trickery commonly used to check whether one enumeration is a subset of another. We filter the groups based on that, and then select the CustomerId of each group that matches.
You can use the Any method of LINQ.
Step 1: Create a list of int that stores all the required product IDs.
Step 2: Use Any to compare each order's ProductID against that list.
List<int> selectedProducts = new List<int>() { 1, 2 }; // This list will contain the required product IDs
db.tblCustomerBuys.Where(o => selectedProducts.Any(p => p == o.ProductID)).Select(o => o.CustomerID); // This will return every CustomerID that bought product 1 or 2

Efficient way of finding multiple dates per ID

I'm trying to query my MsSQL Express database to find all CompanyID's which have multiple dates associated - when I say multiple dates, I must point out they need to be over different days.
EG
ID UkDate CompanyId
1 01/01/2015 16
2 01/01/2015 16
3 03/01/2015 18
4 05/01/2015 19
5 06/01/2015 20
6 08/01/2015 20
In the example above, only the rows with CompanyId 20 would be returned, because it occurred multiple times and those occurrences were on different dates (note that although CompanyId 16 has multiple entries, both entries have the same date).
I'm not sure how to write the query for this using Linq. My object is already IQueryable<T> but, I'm not sure how to perform the query without executing the code, and then 'finishing off' the query.
I'm not near Visual Studio but the code would be (please forgive typing errors, this is from memory)
//First, grab unique CompanyIds as this removes those who didn't visit multiple times
var uniqueIds = (from d in this._database.MyTable
select d.CompanyId).Distinct();
//This is the problem because on each iteration I'm re-querying the database!
foreach(var id in uniqueIds)
{
var result = (from d in this._database.MyTable.OrderBy(a=>a.UkDate)
where d.CompanyId==id
select d);
//check for nulls
if (result.First().UkDate.Day != result.Last().UkDate.Day)
{
this.AllResultsList.AddRange(result);
}
}
Whilst it works without error, I don't feel the code is correct. It feels like a hack and inefficient, but this was my best effort. Is there a way I could reduce the number of database requests I make and achieve the same result?
It would be something along the lines of
var results = myTable.GroupBy(x => x.CompanyID)
.Where(g => g.GroupBy(g2 => g2.UkDate).Count()>1)
.Select(g => g.Key);
Live example (albeit with LinqToObjects, but the query should work against a database just fine): http://rextester.com/FPHI53553
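For reference, a self-contained LINQ to Objects sketch over the sample rows from the question. The Visit class and the helper name are hypothetical, and it returns the full rows (as the question asks) rather than only the keys. Comparing on UkDate.Date is an assumption that the column may carry a time component; whether a given database provider translates .Date is not something this sketch verifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a MyTable row.
public class Visit
{
    public int Id { get; set; }
    public DateTime UkDate { get; set; }
    public int CompanyId { get; set; }
}

public static class MultipleDatesDemo
{
    // The sample rows from the question (dates are day/month/year).
    public static List<Visit> SampleVisits() => new List<Visit>
    {
        new Visit { Id = 1, UkDate = new DateTime(2015, 1, 1), CompanyId = 16 },
        new Visit { Id = 2, UkDate = new DateTime(2015, 1, 1), CompanyId = 16 },
        new Visit { Id = 3, UkDate = new DateTime(2015, 1, 3), CompanyId = 18 },
        new Visit { Id = 4, UkDate = new DateTime(2015, 1, 5), CompanyId = 19 },
        new Visit { Id = 5, UkDate = new DateTime(2015, 1, 6), CompanyId = 20 },
        new Visit { Id = 6, UkDate = new DateTime(2015, 1, 8), CompanyId = 20 },
    };

    // Keep only companies seen on more than one distinct calendar day,
    // then flatten the surviving groups back into rows.
    public static List<Visit> RowsForCompaniesWithSeveralDays(IEnumerable<Visit> rows) =>
        rows
            .GroupBy(r => r.CompanyId)
            .Where(g => g.Select(r => r.UkDate.Date).Distinct().Count() > 1)
            .SelectMany(g => g)
            .OrderBy(r => r.Id)
            .ToList();

    public static void Main()
    {
        foreach (var v in RowsForCompaniesWithSeveralDays(SampleVisits()))
            Console.WriteLine($"{v.Id} {v.UkDate:dd/MM/yyyy} {v.CompanyId}");
    }
}
```

On the sample data only the two rows for CompanyId 20 (Id 5 and Id 6) survive; CompanyId 16 is dropped because both of its rows fall on the same day.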
var results = (from o in this._database.MyTable
group o by o.CompanyId into grouped
where (grouped.Max(s => s.UKDate) - grouped.Min(s => s.UKDate)).TotalDays > 0
select grouped.Key);
Edit (by OP)
Final result:
var results = (from o in this._database.MyTable
group o by o.CompanyId into grouped
where (Convert.ToDateTime(grouped.Max(s => s.UKDate)) - Convert.ToDateTime(grouped.Min(s => s.UKDate))).TotalDays > 0
from l in myTable
where l.CompanyID == grouped.Key
select l).ToList();
A little different version:
var result = (from o in this._database.MyTable
group o by o.CompanyId into grouped
select new {
grouped.Key,
Count = grouped.Select(c => c.UkDate).Distinct().Count()
} into filter
where filter.Count > 1
join a in this._database.MyTable on filter.Key equals a.CompanyID
select new { a.CompanyID, a.UkDate}
).ToList();
You can also try this if you want the company id and a count of the different dates:
from c in dataTable
group c by c.CompanyId into grouped
let count = grouped.Select(x => x.UkDate).Distinct().Count()
where count > 1
select new { CompanyId = grouped.Key, Count = count }

LINQ Lambda - Join, Distinct

I am still learning to develop LINQ lambda expressions.
I have a parent table Requests and a child table Sponsor that will have 0 or 1 row associated with a request. I would like to show a list of past sponsors that a user might have defined in any of his/her previous requests.
1st: I can find all previous requests entered by a user (Request.UserId == 1111);
2nd: The tables are associated by RequestId (request.RequestId == Sponsor.RequestId);
3rd: I want to limit the rows returned based on distinct Sponsor.Email (return the max Sponsor.RequestId based on distinct Sponsor.Email);
4th: I want them ordered by the latest sponsor used (order by descending Sponsor.RequestId);
One last caveat: I only want to return sponsor records where Sponsor.LastNm is not null (a previous upgrade issue).
So I am close, but I am not filtering out based on emails being the same:
db.Requests
.Where (req => req.UserID == 1111)
.Join(db.Sponsors,
req => req.RequestID,
spon => spon.RequestID,
(req, spon) => new { Requests = req, Sponsors = spon })
.Where(both => both.Sponsors.LastNm != null)
.OrderByDescending(both => both.Sponsors.RequestID);
At a minimum I need the Request.DateRequested and entire Sponsor row returned.
Request Table (only certain columns)
RequestId UserId DateRequested
12 1111 2013-10-12
34 1111 2013-10-23
56 2222 2013-10-25
87 1111 2013-11-02
99 1111 2013-11-15
Sponsor Table (only certain columns)
RequestId Email LastNm
12 abc.xyz.com
34 abc#xyz.com Doe
87 abc#xyz.com Doe
99 def#xyz.com Doe
So I would like to have the following rows returned
Request.DateRequested Sponsor
2013-11-15 99, def#xyz.com, Doe
2013-11-02 87, abc#xyz.com, Doe
I find it easier to write my LINQ queries in query syntax style. It really does improve readability for me.
var qry = from r in db.Requests
join s in db.Sponsors on r.RequestID equals s.RequestID
where r.UserID == 1111 &&
s.LastNm != null
orderby s.RequestID descending
group new { Request = r, Sponsor = s } by s.EMail into g
select g.First();
Sticking with function notation, it would be:
var qry = db.Requests
.Where(req => req.UserID == 1111)
.Join(db.Sponsors,
req => req.RequestID,
spon => spon.RequestID,
(req, spon) => new { Request = req, Sponsor = spon })
.Where(both => both.Sponsor.LastNm != null)
.OrderByDescending(both => both.Sponsor.RequestID)
.GroupBy(both => both.Sponsor.EMail)
.Select(group => group.First());
This produces the result I think you are going for. With a local replica of your data in two separate arrays,and using the following loop:
foreach (var rec in qry)
Console.WriteLine("{0}\t{1}\t{2}\t{3}", rec.Request.DateRequested, rec.Request.RequestID, rec.Sponsor.EMail, rec.Sponsor.LastNm);
I get:
11/15/2013 12:00:00 AM 99 def#xyz.com Doe
11/2/2013 12:00:00 AM 87 abc#xyz.com Doe
Also, if you have referential integrity in your database and are using EntityFramework (or OpenAccess) you can replace the join with two froms.
from r in requests
from s in r.sponsors

LINQ Sum of entries based on latest date

I have a table like so:
Code | BuildDate | BuildQuantity
---------------------------------
1 | 2013-04-10 | 4
1 | 2014-09-23 | 1
1 | 2014-08-20 | 2
7 | 2014-02-05 | 4
I want the LINQ query to pick up the LATEST Build date for each Code, pick the BuildQuantity for that Code, and so on for all the records, and sum it up. So for the data above, the result should be 1 + 4 = 5.
This is what I'm trying:
var built = (from tb in db.Builds
orderby tb.BuildDate descending
group tb by tb.Code into tbgrp
select tbgrp.Sum(c => c.BuildQuantity)).First();
This query returns 7... What am I doing wrong?
You are summing all code's BuildQuantities before you take the first, instead you want to sum the firsts.
int built = db.Builds
.GroupBy(b => b.Code)
.Sum(g => g.OrderByDescending(b => b.BuildDate).First().BuildQuantity);
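The "sum the firsts" query can be checked in memory with the four rows from the question. This is a LINQ to Objects sketch; the Build class shape and the SumOfLatestQuantities helper are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Builds row.
public class Build
{
    public int Code { get; set; }
    public DateTime BuildDate { get; set; }
    public int BuildQuantity { get; set; }
}

public static class LatestBuildSumDemo
{
    // The four sample rows from the question.
    public static List<Build> SampleBuilds() => new List<Build>
    {
        new Build { Code = 1, BuildDate = new DateTime(2013, 4, 10), BuildQuantity = 4 },
        new Build { Code = 1, BuildDate = new DateTime(2014, 9, 23), BuildQuantity = 1 },
        new Build { Code = 1, BuildDate = new DateTime(2014, 8, 20), BuildQuantity = 2 },
        new Build { Code = 7, BuildDate = new DateTime(2014, 2, 5),  BuildQuantity = 4 },
    };

    // Group by Code, take the newest row inside each group, sum its quantity.
    public static int SumOfLatestQuantities(IEnumerable<Build> builds) =>
        builds
            .GroupBy(b => b.Code)
            .Sum(g => g.OrderByDescending(b => b.BuildDate).First().BuildQuantity);

    public static void Main()
    {
        Console.WriteLine(SumOfLatestQuantities(SampleBuilds()));
    }
}
```

For the sample rows the latest build for Code 1 has quantity 1 and the only build for Code 7 has quantity 4, so the sum is 5. The 7 reported in the question is the total for Code 1 alone (4 + 1 + 2), which is what summing a whole group and then taking the first group produces.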
You are looking for the sum of the build quantity of the last entry per code. You're currently ordering before you group, which doesn't actually do anything (after the group, the ordering isn't defined)
So first off, you're looking to get the latest element per code. Let's group first. I'm more comfortable writing this with the extension methods:
IQueryable<IGrouping<int, Build>> grouped = db.Builds.GroupBy(tb => tb.Code);
we now have the elements grouped. From each group, we want to get the first element ordered descending by build date.
var firsts = grouped.Select(gr => gr.OrderByDescending(tb => tb.BuildDate)
.First());
finally, we can get the sum:
var sum = firsts.Sum(tb => tb.BuildQuantity);
plugging this all together becomes
var sum = db.Builds.GroupBy(tb => tb.Code)
.Select(gr => gr.OrderByDescending(tb => tb.BuildDate).First())
.Sum(tb => tb.BuildQuantity);
GroupBy has overloads that allow you to roll almost everything into the GroupBy call itself.
If you like compact code, you could write
var sum = db.Builds
.GroupBy(tb => tb.Code,
tb => tb,
(code, builds) => builds.OrderByDescending(tb => tb.BuildDate)
.First()
.BuildQuantity)
.Sum();
though I wouldn't recommend it from a readability point of view.
