How to implement LINQ to Entities IQueryable TakeWhile() functionality - C#

Today while working with LINQ, I learnt that TakeWhile() is not supported by LINQ to Entities. Is there an efficient way to implement this functionality? My use case is as follows:
I have an Employee entity sorted by Name, and I want to fetch records from this IQueryable until I reach the one with EmployeeId = 123.
Something like this:
IQueryable<Employee> employees = ObjectContext.Employees
.OrderBy(a => a.Name)
.TakeWhile(a => a.EmployeeId != 123);
However, TakeWhile is not supported by LINQ to Entities, so the code above throws an error.
I am trying the approach below. Please let me know if anyone has a better and more efficient approach:
Fetch the first X records,
check if the required EmployeeId is part of them,
if not, fetch the next set of X records
and Concat them with the previous set,
check again whether the EmployeeId is part of the set,
and break the loop when the matching EmployeeId is found in a set of X records.
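The loop described above could be sketched roughly as follows. This is only an illustration, not a tested LINQ to Entities implementation: the class name PagedTakeWhile, the method TakeWhilePaged and the pageSize parameter are made up for this example, and it assumes the query passed in is already ordered (LINQ to Entities requires sorted input for Skip).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagedTakeWhile
{
    // Hypothetical helper: pulls pages of `pageSize` rows from an ordered
    // query until a row fails `predicate`, and returns the rows before it.
    public static List<T> TakeWhilePaged<T>(
        IQueryable<T> orderedQuery, Func<T, bool> predicate, int pageSize)
    {
        if (pageSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageSize));

        var result = new List<T>();
        for (int skip = 0; ; skip += pageSize)
        {
            // Skip/Take run on the query provider; ToList() executes one page.
            List<T> page = orderedQuery.Skip(skip).Take(pageSize).ToList();

            foreach (T row in page)
            {
                if (!predicate(row))
                    return result;   // stop row found, done
                result.Add(row);     // checked in memory, like TakeWhile
            }

            if (page.Count < pageSize)
                return result;       // no more rows to fetch
        }
    }
}
```

A call might look like PagedTakeWhile.TakeWhilePaged(ObjectContext.Employees.OrderBy(a => a.Name), a => a.EmployeeId != 123, 50). Each page costs one round trip, so the page size trades the number of queries against the number of rows fetched past the stop row.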

Instead of walking the sorted list, store the Name that belongs to the target Id somewhere and filter on it. Maybe this will help:
// load up front the employees whose name is less than or equal to the stored name:
ObjectContext.Employees.Where(a => a.Name.CompareTo(storedName) <= 0)
// load the rest of the employees later, on demand:
ObjectContext.Employees.Where(a => a.Name.CompareTo(storedName) > 0)
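A minimal sketch of this idea, assuming an Employee class with EmployeeId and Name properties (the class shape, EmployeeQueries and BeforeEmployee are hypothetical names for illustration). It first looks up the name of the stop row and then filters on it, so both steps stay translatable. Note that it is not an exact TakeWhile replacement: employees who share the stop row's name are all excluded here.

```csharp
using System.Linq;

public sealed class Employee
{
    public int EmployeeId { get; set; }
    public string Name { get; set; }
}

public static class EmployeeQueries
{
    // Returns the employees sorted by Name that come strictly before
    // the employee with the given id.
    public static IQueryable<Employee> BeforeEmployee(
        IQueryable<Employee> employees, int stopId)
    {
        // First query: the name to stop at (null if the id is unknown).
        string stopName = employees
            .Where(e => e.EmployeeId == stopId)
            .Select(e => e.Name)
            .FirstOrDefault();

        if (stopName == null)
            return employees.OrderBy(e => e.Name); // nothing to stop at

        // Second query: everything sorted before that name.
        return employees
            .Where(e => e.Name.CompareTo(stopName) < 0)
            .OrderBy(e => e.Name);
    }
}
```

This costs two queries but never pulls rows past the stop row into memory.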

There is no TakeWhile() equivalent in SQL so you'll have to do that step in memory. You can force the SQL to be executed by casting the IQueryable<> to an IEnumerable<> prior to adding the filters that don't have SQL equivalents.
var employees = ObjectContext.Employees
.OrderBy(a => a.Name)
.AsEnumerable() // filters after this point will be done in memory on each record
.TakeWhile(a => a.EmployeeId != 123);

Related

T-SQL to LINQ to SQL using Navigation Properties

I can’t seem to come up with the right corresponding LINQ to SQL statement to generate the following T-SQL. Essentially, I'm trying to return payment information with only one of the customer's addresses... the AR address, if it exists, then the primary address, if it exists, then any address.
SELECT < payment and address columns >
FROM Payment AS p
INNER JOIN Customer AS c ON c.CustomerID = p.CustomerID
OUTER APPLY (
SELECT TOP 1 < address columns >
FROM Address AS a
WHERE a.person_id = c.PersonID
ORDER BY CASE WHEN a.BusinessType = 'AR' THEN 0
ELSE 1
END
, a.IsPrimary DESC
) AS pa
WHERE p.Posted = 1
We’re using the Repository Pattern to access the DB, so inside a method of the Payment Repository, I’ve tried:
var q = GetAll()
.Where(p => p.Posted == true)
.SelectMany(p => p.Customer
.Address
.OrderBy(a => a.BusinessType != "AR")
.ThenBy(a => a.Primary != true)
.Take(1)
.DefaultIfEmpty()
.Select(a => new
{
< only the columns I need from p and a >
}));
But when I execute .ToList(), it throws a NullReferenceException (Object reference not set to an instance of an object) on a record where the customer has no addresses set up. So, I tried:
var q1 = GetAll().Where(p => p.Posted == true);
var q2 = q1.SelectMany(p => p.Customer
.Address
.OrderBy(a => a.BusinessType != "AR")
.ThenBy(a => a.Primary != true));
var q3 = q1.SelectMany(p => q2.Where(a => a.PersonID == p.Customer.PersonID)
.Take(1)
.DefaultIfEmpty()
.Select(a => new
{
< only the columns I need from p and a >
}));
This returns the correct results, but the T-SQL it generates puts the entire T-SQL from above into the outer apply, which is then joined again on Payment and Customer. This seems somewhat inefficient and I wondered if it could be made more efficient because the T-SQL above returns in 6ms for the test case I’m using.
Additional Info:
Q: I think the problem here is that GetAll() returns IEnumerable, not IQueryable ... it would help to see this GetAll() method. - Gert Arnold
A: Actually, GetAll(), when traced all the way back, ends in a call to System.Data.Linq.DataContext.GetTable<TEntity>(), which returns Table<TEntity>, and Table<TEntity> does implement IQueryable.
However, DefaultIfEmpty() does return IEnumerable<Address>, which is what is throwing the exception, if I'm not mistaken, as I mentioned in the first L2S code section.
SOLUTION UPDATE
Okay, I knew I could fall back to simply going straight to joining the tables and foregoing the use of the navigation properties, and in this case, I now know that is how it should be done. It all makes sense now. I just had become accustomed to preferring using the navigation properties, but here, it’s best to go straight to joining tables.
The reason the T-SQL generated by the second L2S code section was so inefficient was because in order to get to the Address table, it required the inclusion of the Payment/Customer data.
When I simply go straight to joining the tables, the generated T-SQL, while not ideal, is much closer to the desired script code section. That’s because it didn’t require the inclusion of the Payment/Customer data. And that’s when the “well, duh” light bulb flashed on.
Thanks to all who helped on this path to discovery!
When trying a similar query, it turned out that this DefaultIfEmpty() call knocks down LINQ-to-SQL. The exception's stack trace shows that things go wrong in System.Data.Linq.SqlClient.SqlBinder.Visitor.IsOuterDependent, i.e. during SQL query building.
Contrary to your conclusion it's not advisable to abandon the use of navigation properties and return to explicit joins. The question is: how to use the best parts of LINQ (which includes nav properties) without troubling LINQ-to-SQL. This, by the way, is true for each ORM with LINQ support.
In this particular case I'd switch to query syntax for the main query and use the keyword let. Something like:
from p in context.Payments
let address = p.Customer
.Addresses
.OrderBy(a => a.BusinessType != "AR")
.ThenBy(a => a.Primary != true)
.FirstOrDefault()
select new
{
p.PropertyX,
address.PropertyY
...
}
This will be translated into one SQL statement and it avoids LINQ-to-SQL's apparent issue with DefaultIfEmpty.

Join vs Navigation property for sub lists in Entity Framework

I have a sql statement like this:
DECLARE @destinations table(destinationId int)
INSERT INTO @destinations
VALUES (414),(416)
SELECT *
FROM GroupOrder grp (NOLOCK)
JOIN DestinationGroupItem destItem (NOLOCK)
ON destItem.GroupOrderId = grp.GroupOrderId
JOIN @destinations dests
ON destItem.DestinationId = dests.destinationId
WHERE OrderId = 5662
I am using entity framework and I am having a hard time getting this query into Linq. (The only reason I wrote the query above was to help me conceptualize what I was looking for.)
I have an IQueryable of GroupOrder entities and a List of integers that are my destinations.
After looking at this I realize that I can probably just do two joins (like my SQL query) and get to what I want.
But it seems a bit odd to do that because a GroupOrder object already has a list of DestinationGroupItem objects on it.
I am a bit confused how to use the Navigation property on the GroupOrder when I have an IQueryable listing of GroupOrders.
Also, if possible, I would like to do this in one trip to the database. (I think I could do a few foreach loops to get this done, but it would not be as efficient as a single IQueryable run to the database.)
NOTE: I prefer fluent linq syntax over the query linq syntax. But beggars can't be choosers so I will take whatever I can get.
If you already have DestinationGroupItem as a navigation property, then you already have the equivalent of your SQL JOIN. Load the related entities with Include, and use the list's Contains method to check whether the desired DestinationId values are hit:
var destinations = new List<int> { 414, 416 };
var query = from order in dataContext.GroupOrder.Include(o => o.DestinationGroupItem) // this is the join via the navigation property
where order.OrderId == 5662 && destinations.Contains(order.DestinationGroupItem.DestinationId)
select order;
// OR
var query = dataContext.GroupOrder
.Include(o => o.DestinationGroupItem)
.Where(order => order.OrderId == 5662 && destinations.Contains(order.DestinationGroupItem.DestinationId));
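The question says each GroupOrder carries a list of DestinationGroupItem objects. If the navigation property is a collection, the filter cannot read DestinationId directly from it and would need Any instead. A minimal sketch under that assumption follows; the entity shapes, the property name DestinationGroupItems, and the helper ForDestinations are hypothetical and only there to make the example self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public class DestinationGroupItem
{
    public int DestinationId { get; set; }
}

public class GroupOrder
{
    public int GroupOrderId { get; set; }
    public int OrderId { get; set; }
    // Assumed collection navigation property.
    public List<DestinationGroupItem> DestinationGroupItems { get; set; }
        = new List<DestinationGroupItem>();
}

public static class GroupOrderQueries
{
    // Keeps the orders with the given OrderId that have at least one
    // item whose DestinationId is in `destinations`.
    public static IQueryable<GroupOrder> ForDestinations(
        IQueryable<GroupOrder> orders, int orderId, List<int> destinations)
    {
        return orders.Where(o =>
            o.OrderId == orderId &&
            o.DestinationGroupItems.Any(d => destinations.Contains(d.DestinationId)));
    }
}
```

Because the whole filter is a single Where over the IQueryable, it stays in fluent syntax and should run as one query when the provider can translate Any and Contains.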

How to write a LINQ statement to get the last of a group of records

I have 2 SQL statements that basically do the same thing, that is, retrieve the last record from a table based on a datetime field for a group of records. I am using the data-first Entity Framework model. How would I write either of these SQL statements using LINQ Lambda functions?
ie,
var u = db.AccessCodeUsage.Where(...).GroupBy(...)
rather than
var u = from a in db.AccessCodeUsage
where ...
group by ...
SQL Statements:
SELECT *
FROM AccessCodeUsage a
WHERE NOT EXISTS (SELECT 1
FROM AccessCodeUsage
WHERE LocationId = a.LocationId
AND Timestamp > a.Timestamp)
SELECT a.*
FROM AccessCodeUsage a
WHERE a.Timestamp =
(SELECT MAX(Timestamp)
FROM AccessCodeUsage
WHERE a.LocationId = LocationId
AND a.AccessCode = AccessCode
GROUP By LocationId, AccessCode)
If you need to have the method-call form, but are finding it tricky to work out, then use the other syntax first:
from a in db.AccessCodeUsage
orderby a.TimeStamp descending
group a by a.LocationId into grp
select grp.First();
Then convert to method calls by taking each clause one at a time:
db.AccessCodeUsage
.OrderByDescending(a => a.TimeStamp)
.GroupBy(a => a.LocationId)
.Select(g => g.First());
From which I can work out the second without bothering to write out the LINQ-syntax form first:
db.AccessCodeUsage
.OrderByDescending(a => a.TimeStamp)
.GroupBy(a => new {a.LocationId, a.AccessCode})
.Select(g => g.First());
(Except it doesn't include what may be a bug, in that if timestamps aren't guaranteed unique, the SQL given in the question could include some extra inappropriate results).
I can't check the SQL produced right now, but it should hopefully be equivalent in results (if not necessarily matching). There are cases where grouping doesn't translate to SQL well, but I certainly don't think this would be one.
I ended up using the following which corresponds to the first SQL statement.
// Retrieve only the latest (greatest value in timestamp field) record for each Access Code
var last = AccessCodeUsages.Where(u1 => !AccessCodeUsages
.Any(u2 => u2.LocationId == u1.LocationId &&
u2.AccessCode == u1.AccessCode &&
u2.Timestamp > u1.Timestamp));
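For comparison, here is a hedged sketch of a grouping variant that sorts inside each group rather than before GroupBy, since an ordering applied before grouping may not be honored by every query provider. The AccessCodeUsage class shape and the helper name LatestPerCode are assumptions for illustration. FirstOrDefault is used instead of First because some Entity Framework versions reject First() inside a projection.

```csharp
using System;
using System.Linq;

public sealed class AccessCodeUsage
{
    public int LocationId { get; set; }
    public string AccessCode { get; set; }
    public DateTime Timestamp { get; set; }
}

public static class AccessCodeUsageQueries
{
    // One row per (LocationId, AccessCode): the one with the greatest Timestamp.
    public static IQueryable<AccessCodeUsage> LatestPerCode(
        IQueryable<AccessCodeUsage> usages)
    {
        return usages
            .GroupBy(u => new { u.LocationId, u.AccessCode })
            // Groups are never empty, so FirstOrDefault never yields null here.
            .Select(g => g.OrderByDescending(u => u.Timestamp).FirstOrDefault());
    }
}
```

Unlike the NOT EXISTS form, this returns exactly one row per group even when two rows share the same latest timestamp.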

Linking Multiple Tables in LINQ to SQL

I would like to get the distinct list of albums sung by the artist with artistId = 1.
I am very new to LINQ to SQL and do not know how to join multiple tables. Please see the database diagram below:
Database diagram: http://a.imageshack.us/img155/8572/13690801.jpg
SingBy is the middle table between Track and Artist.
How could I achieve this?
var albums = from singer in artist
from sb in singby
from t in track
from a in album
where singer.artistId == 1 &&
sb.artistId == 1 &&
sb.trackId == t.trackId &&
a.albumId == t.albumId
select a;
I'm sure there must be a better way. You should look into creating Navigation Properties on your entities. Navigation Properties are like foreign keys.
Edit - corrected to get albums, not artists.
Now, I wrote the code as follows, and it works.
var albums = (from a in db.artists
where a.artistId == 1
join sb in db.singbies on a equals sb.artist
join t in db.tracks on sb.track equals t
join al in db.albums on t.album equals al
select al).Distinct();
return albums.ToList() as List<album>;
I tested Chad's version and it works too. I would like to know which way is better for query optimization. Thanks all.
If you have all the foreign key relationships defined, you should be able to issue a call like the one below:
dc.GetTable<Album>().Where(a => a.Track.Singby.ArtistId == 1).ToList();
This relies on LINQ to lazily load Track and Singby automatically when required. Obviously this is not optimal when you have a large set of data in the database and performance is critical. You can chain the query with a GroupBy or Distinct operation to return only the distinct set, such as:
dc.GetTable<Album>().Where(a => a.Track.Singby.ArtistId == 1).Distinct().ToList();
"I would like to get the list of albums (Distinct) which was sung by the artistId=1"
DBDataContext db = new DBDataContext();
album[] albums = db.artists.Where(a => a.artistId == 1) /* Your artist */
.SelectMany(a => a.singbies) /* Check if `singby` converted to `singbies` */
.Select(sb => sb.track) /* The tracks */
.Select(t => t.album) /* The albums */
.GroupBy(al => al.albumId) /* Group by id */ /* "Distinct" for objects */
.Select(alG => alG.First()) /* Select first of each group */
.ToArray();
IEnumerable<Album> query =
from album in myDC.Albums
let artists =
from track in album.Tracks
from singBy in track.SingBys
select singBy.Artist
where artists.Any(artist => artist.ArtistId == 1)
select album;
List<int> distinctAlbumIds = dc.Albums.Where(a => a.Track.Singby.ArtistId == 1).Select(a=> a.albumId).Distinct().ToList();
List<Album> distinctAlbums = dc.Albums.Where(a => distinctAlbumIds.Contains(a.albumId)).ToList();
Hey TTCG, above is the simplest way to do it. This is because doing a Distinct on a List of objects won't do it based on the albumId.
Either you do it in two steps as above, or, you write your own Album Comparer which specifies uniqueness based on AlbumId and pass it to the Distinct call on a List.
NOTE:
The above will only work if you've defined the constraints in your DBML, but better still in your DB.
For best practices, always define your relationships IN THE DATABASE when using Linq to SQL, as Linq to SQL is not like EF or NHibernate, in that it does not "abstract" your db, it simply reflects it. It's a tool for Data Driven Design, not Domain Driven, so define the relationships in the db.
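The comparer option mentioned above could look roughly like this. It is a sketch for an in-memory list only: the Album class shape and the name AlbumIdComparer are assumptions, and a custom comparer like this is not something a query provider would translate to SQL, so it would be applied after the rows have been loaded.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class Album
{
    public int AlbumId { get; set; }
    public string Title { get; set; }
}

// Treats two Album objects as equal when their AlbumId values match.
public sealed class AlbumIdComparer : IEqualityComparer<Album>
{
    public bool Equals(Album x, Album y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.AlbumId == y.AlbumId;
    }

    // Must agree with Equals: equal ids give equal hash codes.
    public int GetHashCode(Album obj)
    {
        return obj.AlbumId.GetHashCode();
    }
}

public static class AlbumExtensions
{
    // Removes duplicates by AlbumId, keeping the first occurrence.
    public static List<Album> DistinctById(this IEnumerable<Album> albums)
    {
        return albums.Distinct(new AlbumIdComparer()).ToList();
    }
}
```

With this in place, a loaded list can be de-duplicated with albums.DistinctById() in one step instead of the two-query approach.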

LINQ: Doing an order by!

I have some LINQ to Entities code like so:
var tablearows = Context.TableB.Include("TableA").Where(c => c.TableBID == 1).Select(c => c.TableA).ToList();
So I'm returning the rows of TableA related to TableB.TableBID = 1.
That's all good.
Now how can I sort TableA by one of its columns? There is a many-to-many relationship between the two tables.
I tried various ways with no luck, for example:
var tablearows = Context.TableB.Include("TableA").Where(c => c.TableBID == 1).Select(c => c.TableA).OrderBy(p => p.ColumnToSort).ToList();
In the above case, when I type "p." I don't have access to the columns of TableA, presumably because it's a collection of TableA objects, not a single row.
How about using SelectMany instead of Select:
var tablearows = Context.TableB.Include("TableA")
.Where(c => c.TableBID == 1)
.SelectMany(c => c.TableA)
.OrderBy(p => p.ColumnToSort)
.ToList();
EDIT:
The expression below returns a collection of TableA collections: every element is itself a collection of TableA instances, not a single TableA instance (that's why you can't get at the properties of TableA):
var tablearows = Context.TableB.Include("TableA")
.Where(c => c.TableBID == 1)
.Select(c => c.TableA);
If we change the Select to SelectMany, we get the result as one concatenated collection of TableA elements:
var tablearows = Context.TableB.Include("TableA")
.Where(c => c.TableBID == 1)
.SelectMany(c => c.TableA);
Okay, so now I've taken on board that there's a many to many relationship, I think Canavar is right - you want a SelectMany.
Again, that's easier to see in a query expression:
var tableARows = from rowB in Context.TableB.Include("TableA")
where rowB.TableBID == 1
from rowA in rowB.TableA
orderby rowA.ColumnToSort
select rowA;
The reason it didn't work is that you've got a different result type. Previously, you were getting a type like:
List<EntitySet<TableA>>
(I don't know the exact type as I'm not a LINQ to Entities guy, but it would be something like that.)
Now we've flattened all those TableA rows into a single list:
List<TableA>
Now you can't order a sequence of sets by a single column within a row - but you can order a sequence of rows by a column. So basically your intuition in the question was right when you said "presumably because it's a collection of TableA objects, not a single row" - but it wasn't quite clear what you mean by "it".
Now, is that flattening actually appropriate for you? It means you no longer know which B contributed any particular A. Is there only actually one B involved here, so it doesn't matter? If so, there's another option which may even perform better (I really don't know, but you might like to look at the SQL generated in each case and profile it):
var tableARows = Context.TableB.Include("TableA")
.Where(b => b.TableBID == 1)
.Single()
.TableA.OrderBy(a => a.ColumnToSort)
.ToList();
Note that this will fail (or at least would in LINQ to Objects; I don't know exactly what will happen in entities) if there isn't a row in table B with an ID of 1. Basically it selects the single row, then selects all As associated with that row, and orders them.
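The difference between the two result shapes can be seen with plain in-memory objects. The sketch below uses made-up RowA and RowB classes and a hypothetical helper SortedAs; it runs in LINQ to Objects and says nothing about the SQL an entity provider would generate.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class RowA
{
    public int ColumnToSort { get; set; }
}

public sealed class RowB
{
    public int TableBID { get; set; }
    public List<RowA> TableA { get; } = new List<RowA>();
}

public static class Flattening
{
    // Select would give one List<RowA> per matching B;
    // SelectMany flattens them so each element is a RowA that can be sorted.
    public static List<RowA> SortedAs(IEnumerable<RowB> tableB, int id)
    {
        return tableB
            .Where(b => b.TableBID == id)
            .SelectMany(b => b.TableA)
            .OrderBy(a => a.ColumnToSort)
            .ToList();
    }
}
```

Replacing SelectMany with Select in SortedAs would not compile, because OrderBy would then receive a List<RowA> per element and ColumnToSort would not be available on it.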
