Implement "not in" (aka "not exists") logic in LINQ - c#

Setup
1. I have two List<T>'s.
2. The data is un-normalized and from different sources, which explains the convolution in the desired logic.
3. An informal compound key in the data is fieldA, fieldB, fieldC.
4. The "fields" are strings - reference types - so their values could be null. I want to drop records where they may be matching on null. I get that null references in C# will match, but in SQL they do not. Adding a !string.IsNullOrEmpty() is easy enough.
5. This is not a question about DB design or relational algebra.
6. I have other logic which covers other criteria. Do not suggest reducing the logic shown such that it might broaden the result set. See #5 above.
The Problem
I want to find the records in listA that are not in listB based on the informal key. I then want to further refine the listA results based on a partial key match.
The SQL version of the problem:
select
listA.fieldA, listA.fieldB, matching.fieldC
from listA
left join listB keyList on
listA.fieldA = keyList.fieldA and
listA.fieldB = keyList.fieldB and
listA.fieldC = keyList.fieldC
inner join listB matching on
listA.fieldA = matching.fieldA and
listA.fieldB = matching.fieldB
where
keyList.fieldA is null

SQL to LINQ ( Case 7 - Filter data by using IN and NOT IN clause)
Note: IN and NOT IN use the same method in a LINQ query; NOT IN just puts the ! (not) operator in front of it.
You use, where <list>.Contains( <item> )
var myProducts = from p in db.Products
where productList.Contains(p.ProductID)
select p;
Or you can have a list predefined as such:
var ids = new[] { 1, 2, 3 };
var query = from item in context.items
where ids.Contains( item.id )
select item;
For the 'NOT' case, just add the '!' operator before the 'Contains' statement.
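For the compound-key case in the question, the same idea can be sketched for in-memory lists by projecting each record to a tuple key and testing set membership. This is only a sketch: the Rec record and the names NotInDemo, HasKey and NotInB are assumptions made for illustration, and it assumes that any record with a null or empty key part should be dropped, following the question's remark about !string.IsNullOrEmpty().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type standing in for the rows of listA and listB.
public record Rec(string FieldA, string FieldB, string FieldC);

public static class NotInDemo
{
    // Assumption: a row takes part only if all three key parts are non-empty,
    // so that nulls never "match" each other (as in SQL).
    static bool HasKey(Rec r) =>
        !string.IsNullOrEmpty(r.FieldA) &&
        !string.IsNullOrEmpty(r.FieldB) &&
        !string.IsNullOrEmpty(r.FieldC);

    public static List<Rec> NotInB(List<Rec> listA, List<Rec> listB)
    {
        var validB = listB.Where(HasKey).ToList();

        // Full keys present in listB, used for the "not in" test.
        var fullKeys = validB
            .Select(b => (b.FieldA, b.FieldB, b.FieldC))
            .ToHashSet();

        return (from a in listA
                where HasKey(a)
                      && !fullKeys.Contains((a.FieldA, a.FieldB, a.FieldC))
                // Inner join on the partial key (fieldA, fieldB).
                join m in validB
                    on (a.FieldA, a.FieldB) equals (m.FieldA, m.FieldB)
                select new Rec(a.FieldA, a.FieldB, m.FieldC)).ToList();
    }
}

// Usage: var rows = NotInDemo.NotInB(listA, listB);
```

The HashSet makes the full-key test a constant-time lookup, and the tuple keys compare by value, so no custom comparer is needed.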

Related

Self join in LINQ

SELECT
FW1.id, count(*)
FROM
firmware FW1
LEFT JOIN
firmware FW2 ON FW1.firmware_group_id = FW2.firmware_group_id
AND FW1.br_date < FW2.br_date
AND FW2.[public]= '1'
GROUP BY
FW1.id
I want to convert this into a LINQ query. As far as I know, the less-than symbol cannot be used in a LINQ join. Please suggest how to do it. The date is stored as a string, and I need to compare it in LINQ.
As you said, the LINQ join clause only supports equijoins. The docs are pretty clear on how to work around this:
You could use a simple cross join (a Cartesian product) and then apply the conditions of your non-equijoin in the where clause.
Or you could use a temporary variable to store a new table with only the attributes you need for your query and, as before, apply the conditions in the where clause.
In your case, a possible Linq query could be this one:
from f1 in firmwares
from f2 in firmwares
let f1_date = DateTime.Parse(f1.Dt)
let f2_date = DateTime.Parse(f2.Dt)
where f1.Group_Id == f2.Group_Id && f1_date < f2_date
group f1 by f1.Id into fres
select new {
Id = fres.Key,
Count = fres.Count()
};
However I am still thinking how to emulate the LEFT JOIN without casting it to a group join.
Of course the < symbol can be used. Just use method syntax instead of query syntax!
Every FW1 has zero or more FW2s. Every FW2 belongs to exactly one FW1. This one-to-many is implemented using foreign key firmware_group_id.
Apparently you want all FW1s, each with the number of its FW2s, that have a property public with a value equal to 1 and a property br-date larger than the value of the br-date of the FW1.
Whenever you want an item with its many sub-items (using a foreign key), like a School with its Students, a Customer with its Orders, or a Book with its Pages, you'll need Enumerable.GroupJoin:
var result = FW1.GroupJoin(FW2, // GroupJoin FW1 and FW2
fw1 => fw1.firmware_group_id, // from fw1 take the primary key firmware_group_id
fw2 => fw2.firmware_group_id, // from fw2 take the foreign key firmware_group_id
(fw1, fw2s) => new // from every fw1, with all its matching fw2s
{ // make one new object containing the following properties:
Id = fw1.Id,
// Count the fw2s of this fw1 that have public == 1
// and a date larger than fw1.date
Count = fw2s.Where(fw2 => fw2.Public == 1 && fw1.br_date < fw2.br_date)
.Count(),
});
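As a rough in-memory illustration of the GroupJoin approach, here is a small self-contained sketch. The Firmware record, its property names and the method NewerPublicCounts are assumptions for the example, and the date is assumed to be already parsed into a DateTime. Note one difference from the SQL above: count(*) over a LEFT JOIN reports 1 for a row with no match, whereas this sketch reports 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shape of a firmware row.
public record Firmware(int Id, int GroupId, DateTime BrDate, bool Public);

public static class GroupJoinDemo
{
    // For every firmware, count the public firmwares in the same group
    // that have a later BrDate. Firmwares without any match get 0.
    public static Dictionary<int, int> NewerPublicCounts(List<Firmware> firmwares) =>
        firmwares
            .GroupJoin(
                firmwares,
                f1 => f1.GroupId,   // key of the outer row
                f2 => f2.GroupId,   // key of the inner row
                (f1, f2s) => new
                {
                    f1.Id,
                    // The non-equi condition is applied inside the group.
                    Count = f2s.Count(f2 => f2.Public && f1.BrDate < f2.BrDate)
                })
            .ToDictionary(x => x.Id, x => x.Count);
}

// Usage: var counts = GroupJoinDemo.NewerPublicCounts(firmwares);
```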

Why is linq reversing order in group by

I have a linq query which seems to be reversing one column of several in some rows of an earlier query:
var dataSet = from fb in ds.Feedback_Answers
where fb.Feedback_Questions.Feedback_Questionnaires.QuestionnaireID == criteriaType
&& fb.UpdatedDate >= dateFeedbackFrom && fb.UpdatedDate <= dateFeedbackTo
select new
{
fb.Feedback_Questions.Feedback_Questionnaires.QuestionnaireID,
fb.QuestionID,
fb.Feedback_Questions.Text,
fb.Answer,
fb.UpdatedBy
};
Gets the first dataset and is confirmed working.
This is then grouped like this:
var groupedSet = from row in dataSet
group row by row.UpdatedBy
into grp
select new
{
Survey = grp.Key,
QuestionID = grp.Select(i => i.QuestionID),
Question = grp.Select(q => q.Text),
Answer = grp.Select(a => a.Answer)
};
While grouping, the resulting returnset (of type: string, list int, list string, list int) sometimes, but not always, turns the question order back to front, without inverting answer or questionID, which throws it off.
i.e. if the set is questionID 1,2,3 and question A,B,C it sometimes returns 1,2,3 and C,B,A
Can anyone advise why it may be doing this? Why only on the one column? Thanks!
edit: Got it thanks all! In case it helps anyone in future, here is the solution used:
var groupedSet = from row in dataSet
group row by row.UpdatedBy
into grp
select new
{
Survey = grp.Key,
QuestionID = grp.OrderBy(x=>x.QuestionID).Select(i => i.QuestionID),
Question = grp.OrderBy(x=>x.QuestionID).Select(q => q.Text),
Answer = grp.OrderBy(x=>x.QuestionID).Select(a => a.Answer)
};
Reversal of a grouped order is a coincidence: IQueryable<T>'s GroupBy returns groups in no particular order. Unlike in-memory GroupBy, which specifies the order of its groups, queries performed in RDBMS depend on implementation:
The query behavior that occurs as a result of executing an expression tree that represents calling GroupBy<TSource,TKey,TElement>(IQueryable<TSource>, Expression<Func<TSource,TKey>>, Expression<Func<TSource,TElement>>) depends on the implementation of the type of the source parameter.
If you would like to have your rows in a specific order, you need to add OrderBy to your query to force it.
How do I do it and maintain the relative list order, rather than apply an order to the resulting set?
One approach is to apply grouping to your data after bringing it into memory. Apply ToList() to dataSet at the end to bring the data into memory. After that, the order of the subsequent GroupBy query will be consistent with dataSet. A drawback is that the grouping is no longer done in the RDBMS.
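A minimal sketch of that in-memory approach follows. The AnswerRow and Grouped records and the method GroupInMemory are assumptions for illustration; the sketch relies on the documented behaviour of Enumerable.GroupBy, which yields groups in order of first key occurrence and keeps elements within each group in source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row and result shapes for the example.
public record AnswerRow(string UpdatedBy, int QuestionID, string Text);
public record Grouped(string Survey, List<int> QuestionIDs, List<string> Questions);

public static class GroupOrderDemo
{
    public static List<Grouped> GroupInMemory(IEnumerable<AnswerRow> dataSet)
    {
        // Materialize first; with a database-backed source this is the point
        // where the query runs and the rows arrive in memory.
        List<AnswerRow> rows = dataSet.ToList();

        // LINQ to Objects GroupBy: every projection below walks the group
        // in the same (source) order, so the columns stay aligned.
        return rows
            .GroupBy(r => r.UpdatedBy)
            .Select(g => new Grouped(
                g.Key,
                g.Select(r => r.QuestionID).ToList(),
                g.Select(r => r.Text).ToList()))
            .ToList();
    }
}

// Usage: var groupedSet = GroupOrderDemo.GroupInMemory(dataSet);
```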

Multiple joins with multiple on statements using Linq Lambda expressions [duplicate]

Suppose I have a list of {City, State}. It originally came from the database, and I have LocationID, but by now I loaded it into memory. Suppose I also have a table of fast food restaurants that has City and State as part of the record. I need to get a list of establishments that match city and state.
NOTE: I try to describe a simplified scenario; my business domain is completely different.
I came up with the following LINQ solution:
var establishments = from r in restaurants
from l in locations
where l.LocationId == id &&
l.City == r.City &&
l.State == r.State
select r;
and I feel there must be something better. For starters, I already have City/State in memory - so to go back to the database only to have a join seems very inefficient. I am looking for some way to say {r.City, r.State} match Any(MyList) where MyList is my collection of City/State.
UPDATE
I tried to update based on suggestion below:
List<CityState> myCityStates = ...;
var establishments =
from r in restaurants
join l in myCityStates
on new { r.City, r.State } equals new { l.City, l.State } into gls
select r;
and I got the following compile error:
Error CS1941 The type of one of the expressions in the join clause is incorrect. Type inference failed in the call to 'Join'.
UPDATE 2
Compiler didn't like anonymous class in the join. I made it explicit and it stopped complaining. I'll see if it actually works in the morning...
It seems to me that you need this:
var establishments =
from r in restaurants
join l in locations.Where(x => x.LocationId == id)
// plain inner join: adding "into gls" would turn this into a group join that returns every restaurant
on new { r.City, r.State } equals new { l.City, l.State }
select r;
Well, there isn't a lot more that you can do. As long as you rely on a table lookup, the only thing you can do to speed things up is to put an index on City and State.
The LINQ statement has to translate into a valid SQL statement, where "Any" would translate to something like:
SELECT * FROM Restaurants where City in ('...all cities')
I don't know if other ORMs give better performance than EF for these types of scenarios, but it might be worth investigating. EF has never had a reputation for being fast on reads.
Edit: You can also do this:
List<string> names = new List<string> { "John", "Max", "Pete" };
bool has = customers.Any(cus => names.Contains(cus.FirstName));
This will produce the necessary IN ('value1', 'value2' ...) functionality that you were looking for.
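When both collections are already in memory, the "{City, State} matches any entry in my list" test from the question can be sketched with a set of tuple keys. The CityState and Restaurant records and the method InLocations are assumptions for illustration. This is LINQ to Objects only; whether a given ORM provider can translate a tuple Contains into SQL is not something this sketch assumes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shapes for the example.
public record CityState(string City, string State);
public record Restaurant(string Name, string City, string State);

public static class CityStateDemo
{
    public static List<Restaurant> InLocations(
        IEnumerable<Restaurant> restaurants,
        IEnumerable<CityState> wanted)
    {
        // Value tuples compare element by element, so (City, State)
        // works as a composite key without a custom comparer.
        HashSet<(string, string)> keys =
            wanted.Select(l => (l.City, l.State)).ToHashSet();

        return restaurants
            .Where(r => keys.Contains((r.City, r.State)))
            .ToList();
    }
}

// Usage: var establishments = CityStateDemo.InLocations(restaurants, myCityStates);
```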

Filter items from database based on a List<>

I have a method that accepts two List<int> for which I need to get data from the database based on the List<>s.
So, I receive a List<PersonId> and List<NationalityId> for example, and I need to get a result set where records match the PersonIds and NationalityIds.
public List<PersonDTO> SearchPeople(List<int> persons, List<int> nationalities)
{
var results = (from c in myDbContect.People where .... select c).ToList();
}
Note that I think Lists might be null.
Is there an efficient way?
I was going to try:
where (persons != null && persons.Count > 0 && persons.Contains(c.PersonId))
But this would generate rather inefficient SQL, and as I add more search parameters, the linq may get very messy.
Is there an efficient way to achieve this?
The join method may be easy to read, but the issue I face is that IF the input list is empty, then it shouldn't filter. That is, if nationalities is empty, don't filter any out:
var results = (from c in entities.Persons
join p in persons on c.PersonId equals p
join n in nationalities on c.NationalityId equals n
select c).ToList();
This would return no results if any of the lists were empty. Which, is bad.
If you join an IQueryable with an IEnumerable (in this case, entities.Persons and persons), your filtering will not happen within your query. Instead, your IQueryable is enumerated, retrieving all of your records from the database, while the join is performed in memory using the IEnumerable join method.
To perform your filtering against a list within your query, there are two main options:
1. Join using an IQueryable on both sides. This might be possible if your list of ids comes from the execution of another query, in which case you can use the underlying query in your join instead of the resulting set of ids.
2. Use the Contains operator against your list. This is only practical with small lists, because each additional id requires its own query parameter. If you have many ids, you can possibly extend this approach with batching.
If you want to skip filtering when the list is empty, then you might consider using the extension method invocation instead of the LINQ syntax. This allows you to use an if statement:
IQueryable<Person> persons = entities.persons;
List<int> personIds = new List<int>();
if(personIds.Count > 0)
{
persons = persons.Where(p => personIds.Contains(p.PersonId));
}
var results = persons.ToList();
Note that the Where predicate uses option #2 above, and is only applied if there are any ids in the collection.
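The same pattern extends to several optional filters. The sketch below is an illustration only: the Person record and this SearchPeople signature are assumptions, and in a test the source can be an in-memory list wrapped with AsQueryable(). Each Where is added only when its list has entries, so a null or empty list means "do not filter".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape for the example.
public record Person(int PersonId, int NationalityId);

public static class SearchDemo
{
    public static List<Person> SearchPeople(
        IQueryable<Person> people,
        List<int> personIds,
        List<int> nationalityIds)
    {
        // Compose the query step by step; nothing runs until ToList().
        if (personIds != null && personIds.Count > 0)
        {
            people = people.Where(p => personIds.Contains(p.PersonId));
        }

        if (nationalityIds != null && nationalityIds.Count > 0)
        {
            people = people.Where(p => nationalityIds.Contains(p.NationalityId));
        }

        return people.ToList();
    }
}

// Usage: var results = SearchDemo.SearchPeople(source, persons, nationalities);
```

Because each filter is a separate statement, adding more search parameters does not make any single predicate harder to read.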
If, for example, you want all the Persons records when a list is empty, and want to filter by the nationality ID list only when it is not empty, you can do something like this:
List<int> personsIds = ...;
List<int> nationalitiesIds = ...;
// No joins here: joining against an empty list would return no rows at all.
var results = (from c in entities.Persons
where (personsIds == null || personsIds.Count == 0 || personsIds.Contains(c.PersonId))
&& (nationalitiesIds == null || nationalitiesIds.Count == 0 || nationalitiesIds.Contains(c.NationalityId))
select c).ToList();

LINQ to SQL where collection contains collection

I have a problem :( I have a many-many table between two tables(1&2), via a mapping table(3):
(1)Trees / (2)Insects
TreeID <- (3)TreeInsects -> InsectID
And then a one to many relationship:
Trees.ID -> Leaves.TreeID
And I would like to perform a query which will give me all the Leaves for a collection of insects (via trees-insect mapping table).
E.g. I have List<Insects> and I want all Leaves that have an association with any of the Insects in the List via the Tree-Insects mapping table.
This seems like a simple task, but for some reason I'm having trouble doing this!!
The best I have: but the Single() makes it incorrect:
from l in Leaves
where (from i in Insects
select i.ID)
.Contains((from ti in l.Tree.TreeInsects
select ti.InsectID).Single())
select l;
(from i in insectsCollection
select from l in Leaves
let treeInsectIDs = l.Tree.TreeInsects.Select(ti => ti.InsectID)
where treeInsectIDs.Contains(i.ID)
select l)
.SelectMany(l => l)
.Distinct();
I'm bad with sql-like syntax so I'll write with extensions.
ctx.Leaves.Where(l => ctx.TreeInsects.Where( ti => list_with_insects.Select(lwi => lwi.InsectID).Contains( ti.InsectID ) ).Any( ti => ti.TreeID == l.TreeID ) );
Try investigating the SelectMany method - I think that may be the key you need.
I would get a list of Trees that are available to that Insect, then peg on a SelectMany to the end and pull out the collection of Leaves tied to that Tree.
List<int> insectIds = localInsects.Select(i => i.ID).ToList();
//note - each leaf is evaluated, so no duplicates.
IQueryable<Leaf> query =
from leaf in myDataContext.Leaves
where leaf.Tree.TreeInsects.Any(ti => insectIds.Contains(ti.InsectId))
select leaf;
//note, each matching insect is found, then turned into a collection of leaves.
// if two insects have the same leaf, that leaf will be duplicated.
IQueryable<Leaf> query2 =
from insect in myDataContext.Insects
where insectIds.Contains(insect.ID)
from ti in insect.TreeInsects
from leaf in ti.Tree.Leaves
select leaf;
Also note, Sql Server has a parameter limit of ~2100. LinqToSql will happily generate a query with more insect IDs, but you'll get a sql exception when you try to run it. To resolve this, run the query more than once, on smaller batches of IDs.
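The batching idea can be sketched as follows. Everything here is an assumption for illustration: the simplified Leaf record (which carries the insect IDs of its tree directly), the helper names Batches and LeavesFor, and the batch size chosen by the caller. Since one leaf can match IDs in more than one batch, the sketch removes duplicates by leaf ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified leaf: TreeInsectIds stands in for
// leaf.Tree.TreeInsects.Select(ti => ti.InsectID).
public record Leaf(int Id, List<int> TreeInsectIds);

public static class BatchDemo
{
    // Splits the IDs into consecutive chunks of at most 'size' items.
    public static IEnumerable<List<int>> Batches(IReadOnlyList<int> ids, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        for (int i = 0; i < ids.Count; i += size)
        {
            yield return ids.Skip(i).Take(size).ToList();
        }
    }

    public static List<Leaf> LeavesFor(
        IQueryable<Leaf> leaves,
        IReadOnlyList<int> insectIds,
        int batchSize)
    {
        var seen = new HashSet<int>();
        var result = new List<Leaf>();

        foreach (List<int> batch in Batches(insectIds, batchSize))
        {
            // One query per batch keeps the parameter count bounded.
            var matches = leaves
                .Where(l => l.TreeInsectIds.Any(id => batch.Contains(id)))
                .ToList();

            // A leaf may match in several batches; keep it once.
            result.AddRange(matches.Where(l => seen.Add(l.Id)));
        }

        return result;
    }
}

// Usage: var leaves = BatchDemo.LeavesFor(source, insectIds, 1000);
```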
How do you get this list of insects? Is it a query too?
Anyway, if you don't mind performance (SelectMany can be slow if you have a big database), this should work:
List<Insect> insects = .... ; //(your query/method)
IEnumerable<Leaf> leaves = db.TreeInsects
.Where(p=> insects.Contains(p.Insect))
.Select(p=>p.Tree)
.SelectMany(p=>p.Leaves);
