I use queries in EF4 to pull back records and process information through various other means (not EF) based on the data within, so I frequently have detached EF objects in lists.
In this case, I have a query in EntityFramework 4.0 that is not loading a related entity, even though I am using the .Include("...") method.
using (MyDBEntities ctx = new MyDBEntities())
{
ctx.ContextOptions.LazyLoadingEnabled = false;
// Get the first X records that need to be processed
var q = (from t in ctx.DBTables
.Include("Customer")
let c = t.Customer
where t.statusID == (int)Enums.Status.PostProcessing
&& c.isActive == true
select t
).Take(batchSize).ToList();
foreach (DBTable t in q)
{
// this results in c == null
Customer c = t.Customer;
// However t.CustomerID has a value, thus I know
// that t links to a real Customer record
Console.WriteLine(t.CustomerID);
}
}
Can anyone help me understand why Customer is not loading, even though I am explicitly stating to include it?
I found the root of the issue! The culprit is the "let" clause. Whenever the query contains a "let" or a second "from" clause (like a join), the .Include(...) calls get ignored!
// -- THIS FAILS TO RETRIEVE CUSTOMER
// Get the first X records that need to be processed
var q = (from t in ctx.DBTables
.Include("Customer")
// Using a "let" like this or
let c = t.Customer
// a "from" like this immediately causes my include to be ignored.
from ca in c.CustomerAddresses
where t.statusID == (int)Enums.Status.PostProcessing
&& c.isActive == true
&& ca.ValidAddress == true
select t
).Take(batchSize).ToList();
However, I can fetch the IDs I need in one call, then make a second "go get my includes" call, and everything works just fine.
// Get the first X record IDs that need to be processed
var q = (from t in ctx.DBTables
let c = t.Customer
from ca in c.CustomerAddresses
where t.statusID == (int)Enums.Status.PostProcessing
&& c.isActive == true
&& ca.ValidAddress == true
select t.TableID
).Take(batchSize).ToList();
// Now... go "deep-load" the records I need by ID
var ret = (from t in ctx.DBTables
.Include("Customer")
where q.Contains(t.TableID)
select t);
My model has:
Several DeviceStatus attached to one mandatory Device
Several Device attached to one mandatory Panel
When I query DeviceStatus, I need to have Device and Panel attached to it in the query result.
... DeviceStatus.Device is null in the query result.
Here is the Linq Query:
using (var actiContext = new ActigraphyContext())
{
var todayStatus =
from s in actiContext.DeviceStatus.Include(s1 => s1.Device.Panel)
where DbFunctions.TruncateTime(s.TimeStamp) == DbFunctions.TruncateTime( DateTimeOffset.Now)
&& s.Device.Panel.Mac == mac
&& (s.Device.Ty == 4 || s.Device.Ty == 9)
select s;
// var tempList = todayStatus.ToList();
var todayLastStatus =
from s in todayStatus.Include(s1 => s1.Device.Panel)
let lastTimeStamp = todayStatus.Max(s1 => s1.TimeStamp)
where s.TimeStamp == lastTimeStamp
select s;
var requestResult = todayLastStatus.FirstOrDefault();
return requestResult;
}
If I uncomment the line // var tempList = todayStatus.ToList();, where tempList is not used, it works: requestResult.Device is set!
But the downside is that todayStatus.ToList() triggers a request that brings back a huge amount of data.
So how can I get the DeviceStatus with its related objects?
Note: the database behind is SQL Server 2012
When you call Include() on a LINQ query, it performs eager loading.
As documented in MSDN:
Eager loading is the process whereby a query for one type of entity also loads related entities as part of the query. Eager loading is achieved by use of the Include method.
When the entity is read, related data is retrieved along with it. This typically results in a single join query that retrieves all of the data that's needed. You specify eager loading by using the Include method.
So you need to call .ToList() to complete the query execution.
Since the data is huge, you can pick only the specific related columns you need. Note that Include only accepts a navigation path, not a projection, so do the narrowing in the select clause instead:
var todayStatus =
from s in actiContext.DeviceStatus
where DbFunctions.TruncateTime(s.TimeStamp) == DbFunctions.TruncateTime(DateTimeOffset.Now)
&& s.Device.Panel.Mac == mac
&& (s.Device.Ty == 4 || s.Device.Ty == 9)
// Assumes DeviceId, DeviceName and PanelID are properties of Device
select new
{
Status = s,
s.Device.DeviceId,
s.Device.DeviceName,
s.Device.PanelID
};
var tempList = todayStatus.ToList();
The query doesn't actually run until you do a call like ToList(), which is why uncommenting that line works. If the query is bringing back too much data, then you need to change the query to narrow down the amount of data you're bringing back.
OK, this query is a simpler way to achieve this:
using (var actiContext = new ActigraphyContext())
{
var todayLastStatus =
from s in actiContext.DeviceStatus.Include(s1 => s1.Device.Panel)
where DbFunctions.TruncateTime(s.TimeStamp) == DbFunctions.TruncateTime( DateTimeOffset.Now)
&& s.Device.Panel.Mac == mac
&& (s.Device.Ty == 4 || s.Device.Ty == 9)
orderby s.TimeStamp descending
select s;
var requestResult = todayLastStatus.Take(1).FirstOrDefault();
return requestResult;
}
But the question remains: why didn't I get the related object in my first query?
I have a database that contains 3 tables:
Phones
PhoneListings
PhoneConditions
PhoneListings has a FK from the Phones table (PhoneID), and a FK from the PhoneConditions table (conditionID).
I am working on a function that adds a Phone Listing to the user's cart, and returns all of the necessary information for the user. The phone make and model are contained in the Phones table, and the details about the condition are contained in the PhoneConditions table.
Currently I am using 3 queries to obtain all the necessary information. Is there a way to combine all of this into one query?
public ActionResult phoneAdd(int listingID, int qty)
{
ShoppingBasket myBasket = new ShoppingBasket();
string BasketID = myBasket.GetBasketID(this.HttpContext);
var PhoneListingQuery = (from x in myDB.phoneListings
where x.phonelistingID == listingID
select x).Single();
var PhoneCondition = myDB.phoneConditions
.Where(x => x.conditionID == PhoneListingQuery.phonelistingID).Single();
var PhoneDataQuery = (from ph in myDB.Phones
where ph.PhoneID == PhoneListingQuery.phonePageID
select ph).SingleOrDefault();
}
You could project the result into an anonymous class, or a Tuple, or even a custom shaped entity in a single line, however the overall database performance might not be any better:
var phoneObjects = myDB.phoneListings
.Where(pl => pl.phonelistingID == listingID)
.Select(pl => new
{
PhoneListingQuery = pl,
PhoneCondition = myDB.phoneConditions
.Single(pc => pc.conditionID == pl.phonelistingID),
PhoneDataQuery = myDB.Phones
.SingleOrDefault(ph => ph.PhoneID == pl.phonePageID)
})
.Single();
// Access phoneObjects.PhoneListingQuery / PhoneCondition / PhoneDataQuery as needed
There are also slightly more compact overloads of the LINQ Single and SingleOrDefault extensions which take a predicate as a parameter, which will help reduce the code slightly.
Edit
As an alternative to multiple retrievals from the ORM DbContext, or doing explicit manual Joins, if you set up navigation relationships between entities in your model via the navigable join keys (usually the Foreign Keys in the underlying tables), you can specify the depth of fetch with an eager load, using Include:
var phoneListingWithAssociations = myDB.phoneListings
.Include(pl => pl.PhoneConditions)
.Include(pl => pl.Phones)
.Single(pl => pl.phonelistingID == listingID);
Which will return the entity graph in phoneListingWithAssociations
(Assuming foreign keys PhoneListing.phonePageID => Phones.phoneId and
PhoneCondition.conditionID => PhoneListing.phonelistingID)
You should be able to pull it all in one query with join, I think.
But as pointed out, you might not gain a lot of speed from this, as you are just picking the first match and then moving on, not really doing any inner comparisons.
If you know there is at least one data point in each table, then you might as well pull everything at the same time. If not, then deferring to the "sub queries" as done by StuartLC is a good option.
var Phone = (from a in myDB.phoneListings
join b in myDB.phoneConditions on a.phonelistingID equals b.conditionID
join c in myDB.Phones on a.phonePageID equals c.PhoneID
where
a.phonelistingID == listingID
select new {
Listing = a,
Condition = b,
Data = c
}).FirstOrDefault();
I use FirstOrDefault because Single throws an exception if more than one element exists.
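The difference between the two operators can be checked against in-memory data. This is a minimal LINQ to Objects sketch, not the poster's code: the class and method names (SingleVsFirstDemo, Describe) are made up for illustration, and a database provider may word its exception differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleVsFirstDemo
{
    // Reports what FirstOrDefault and Single do for the given sequence.
    public static string Describe(IEnumerable<int> source)
    {
        // FirstOrDefault never throws: it yields the first element or the default (null here).
        string first = source.Select(x => x.ToString()).FirstOrDefault() ?? "null";

        string single;
        try
        {
            // Single requires exactly one element.
            single = source.Single().ToString();
        }
        catch (InvalidOperationException)
        {
            // Thrown for both an empty sequence and a sequence with more than one element.
            single = "throws";
        }
        return "FirstOrDefault=" + first + ", Single=" + single;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new int[0]));      // FirstOrDefault=null, Single=throws
        Console.WriteLine(Describe(new[] { 7 }));     // FirstOrDefault=7, Single=7
        Console.WriteLine(Describe(new[] { 7, 8 }));  // FirstOrDefault=7, Single=throws
    }
}
```

If zero or one match is acceptable but several would indicate bad data, SingleOrDefault sits in between: it returns the default for an empty sequence and still throws when there is more than one element.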
I have a workflow table that records all the steps of a process. Let's work with 2 of those statuses:
Saved (new item saved but not submitted yet)
Submitted (item submitted for review)
Now I want to create a BatchSubmit function that will submit all the unsubmitted items. For this I need to query for all the items which have a latest workflow status of "Saved". All the historical workflow entries for the item still exist, and it can go from "Submitted" back to "Saved" a few times.
Here is the table structure:
Now I want a LINQ query that will give me what I require:
from wasteInformation in wasteDB.WasteInformations
join workFlowHistory in wasteDB.WorkFlowHistories on wasteInformation.WasteInformationId equals workFlowHistory.WasteInformationId
// Join with last instance in workflow table (where workflowHistory.DateAdded is greatest)
where workFlowHistory.WorkFlowStep == "Saved"
&& wasteInformation.WasteProgrammeId == captureModel.WasteProgrammeId
&& wasteInformation.WasteSourceId == captureModel.WasteSourceId
select new
{
WasteInformationId = wasteInformation.WasteInformationId,
FinancialQuarter = wasteInformation.FinancialQuarter,
FinancialYear = wasteInformation.FinancialYear,
WasteProgrammeId = wasteInformation.WasteProgrammeId,
WasteMonth = wasteInformation.WasteMonth,
WasteYear = wasteInformation.WasteYear,
DateCaptured = wasteInformation.DateCaptured,
WasteSourceId = wasteInformation.WasteSourceId,
WasteDate = wasteInformation.WasteDate
}
The query as it is will give me all the saved entries for the item. I want it to give me the item only if that item's last entry has a WorkFlowStep of "Saved".
Edit:
I've got something that looks like it works. Still need to test it some more:
var SavedWasteInformation = wasteDB.WasteInformations.Where(wi => wi.WorkFlowHistories.FirstOrDefault(wf => wf.DateAdded == wi.WorkFlowHistories.Max(wf_in => wf_in.DateAdded)).WorkFlowStep == "Saved"
&& wi.WasteProgrammeId == captureModel.WasteProgrammeId
&& wi.WasteSourceId == captureModel.WasteSourceId);
Edit:
My solution above and Vladimir's below both seem to work, but after inspecting the execution plans Vladimir's looks like the better option:
Provided that you have a collection of WorkFlowHistories on your WasteInformation, I believe this query will select WasteInformations with their latest WorkFlowHistory (if any):
from wasteInformation in wasteDB.WasteInformations
where wasteInformation.WasteProgrammeId == captureModel.WasteProgrammeId
&& wasteInformation.WasteSourceId == captureModel.WasteSourceId
select new
{
WasteInformation = wasteInformation,
LastSavedWorkFlowHistory = wasteInformation.WorkFlowHistories
.Where(x => x.WorkFlowStep == "Saved")
.OrderByDescending(x => x.DateAdded)
.FirstOrDefault()
}
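The "latest entry is Saved" rule can be tried on in-memory data. This is only a sketch under assumptions: HistoryEntry is a hypothetical stand-in for a WorkFlowHistories row, and the code runs as LINQ to Objects rather than against the database. Note the order of operations: it picks the newest entry per item first and only then checks the step, whereas filtering on "Saved" before ordering yields the newest Saved entry even when a later "Submitted" entry exists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a WorkFlowHistories row.
public class HistoryEntry
{
    public int WasteInformationId { get; set; }
    public string WorkFlowStep { get; set; }
    public DateTime DateAdded { get; set; }
}

public static class LatestStepDemo
{
    // Returns the ids of items whose most recent history entry has the given step.
    public static List<int> IdsWithLatestStep(IEnumerable<HistoryEntry> history, string step)
    {
        return history
            .GroupBy(h => h.WasteInformationId)
            // Newest entry per item; every group has at least one element, so First() is safe.
            .Select(g => g.OrderByDescending(h => h.DateAdded).First())
            .Where(latest => latest.WorkFlowStep == step)
            .Select(latest => latest.WasteInformationId)
            .OrderBy(id => id)
            .ToList();
    }

    public static void Main()
    {
        var history = new List<HistoryEntry>
        {
            // Item 1 was saved, then submitted: it should not be returned.
            new HistoryEntry { WasteInformationId = 1, WorkFlowStep = "Saved", DateAdded = new DateTime(2024, 1, 1) },
            new HistoryEntry { WasteInformationId = 1, WorkFlowStep = "Submitted", DateAdded = new DateTime(2024, 1, 2) },
            // Item 2 went back to Saved after being submitted: it should be returned.
            new HistoryEntry { WasteInformationId = 2, WorkFlowStep = "Saved", DateAdded = new DateTime(2024, 1, 1) },
            new HistoryEntry { WasteInformationId = 2, WorkFlowStep = "Submitted", DateAdded = new DateTime(2024, 1, 2) },
            new HistoryEntry { WasteInformationId = 2, WorkFlowStep = "Saved", DateAdded = new DateTime(2024, 1, 3) },
            // Item 3 has only ever been saved.
            new HistoryEntry { WasteInformationId = 3, WorkFlowStep = "Saved", DateAdded = new DateTime(2024, 1, 5) }
        };

        Console.WriteLine(string.Join(",", IdsWithLatestStep(history, "Saved"))); // 2,3
    }
}
```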
Sometimes when I write LINQ queries and use them inside a loop, performance becomes very slow.
var query1 = from c in db.Classes
where c.TeacherId.Equals(teacherId)
select c;
// AnsweredAssignment Query
var query2 = (from c in db.AnsweredAssignments
where c.AssignmentId == assignmentId && c.Student.Class.TeacherId.Equals(teacherId)
select c).ToArray();
// Tokens Query
var query3 = (from c in db.Tokens
where c.AssignmentId == assignmentId && c.Student.Class.TeacherId.Equals(teacherId)
select c).ToArray();
// OverwrittenScores Query
var query4 = (from os in db.OverwrittenScores
where os.AssignmentId == assignmentId && os.Student.Class.TeacherId.Equals(teacherId)
select os).ToArray();
foreach (var c in query1)
{
foreach (var s in c.Students)
{
var aaItems = (from aa in query2
where aa.StudentId == s.StudentId
select aa).ToArray();
// Generate scores for objectives
var id3 = (from aa in aaItems
where !aa.IsMakeup
orderby aa.Score descending
select aa).FirstOrDefault();
if (id3 != null)
{
var aa3 = (from aa in query2
where aa.AnsweredAssignmentId == id3.AnsweredAssignmentId
select aa).SingleOrDefault();
...
}
var tokens = (from t in query3
where t.StudentId == s.StudentId
select new MonitorByGeneralScoreToAnsweredAssignment(AssignmentStatus.Pending)).ToList();
...
// does exist any overwritten score?
var osItem = query4.Where(os => os.StudentId == s.StudentId).SingleOrDefault();
...
}
}
What I'm doing now is fetching the records I'm going to use up front, instead of fetching them one by one inside the loop. Is this good practice? Sometimes I suspect I'm not doing a good job :(
Once I have the records, I keep them in memory and use LINQ to Objects (against memory) to find the record I need.
So remember that making calls to a database will always be slow. In fact, it's often the slowest part of most applications. Thus, you should strive to return a lot of stuff at once, rather than trying to get items one at a time.
Strive to rewrite your queries such that you return as much of the required information in one go as necessary. Although you might use up more memory, it's more often than not worth it for the time savings. Connecting to databases is slow!
Secondly (last I checked), Entity Framework uses reflection to set properties on your objects. Reflection is also very slow, which is why, despite EF's cool factor, I still prefer to write my queries by hand. The performance is just significantly faster (but of course it introduces another layer of complication, since now you're dealing with not one language, C#, but two, C# and SQL, which are conceptually very different).
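Once the rows are in memory, the per-student filtering inside the loop can also be made cheaper. The following is a minimal sketch, not the poster's code: the Answer class and the BestScores method are hypothetical, and it assumes the rows were already materialized with one ToArray() call. It groups them once with ToLookup, so each loop iteration does a keyed lookup instead of rescanning the whole array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shape of an AnsweredAssignment row.
public class Answer
{
    public int StudentId { get; set; }
    public int Score { get; set; }
    public bool IsMakeup { get; set; }
}

public static class LookupDemo
{
    // Best non-makeup score per student, or null when the student has none.
    public static Dictionary<int, int?> BestScores(IEnumerable<int> studentIds, IEnumerable<Answer> answers)
    {
        // Group once up front instead of filtering the full array for every student.
        ILookup<int, Answer> byStudent = answers.ToLookup(a => a.StudentId);

        var result = new Dictionary<int, int?>();
        foreach (int id in studentIds)
        {
            // The lookup indexer returns an empty sequence for a missing key.
            Answer best = byStudent[id]
                .Where(a => !a.IsMakeup)
                .OrderByDescending(a => a.Score)
                .FirstOrDefault();
            result[id] = best == null ? (int?)null : best.Score;
        }
        return result;
    }

    public static void Main()
    {
        var answers = new[]
        {
            new Answer { StudentId = 1, Score = 80, IsMakeup = false },
            new Answer { StudentId = 1, Score = 95, IsMakeup = true },
            new Answer { StudentId = 1, Score = 90, IsMakeup = false },
            new Answer { StudentId = 2, Score = 70, IsMakeup = true }
        };

        foreach (var pair in BestScores(new[] { 1, 2, 3 }, answers))
        {
            Console.WriteLine(pair.Key + ": " + (pair.Value.HasValue ? pair.Value.Value.ToString() : "none"));
        }
    }
}
```

Whether this matters in practice depends on the data size; for a handful of students the plain Where inside the loop is likely fine.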
I'm very new to LINQ to SQL and in need of a little assistance.
Basically I'm building a message board in C#. I have 3 database tables - basic info is as follows.
FORUMS: forumid, name
THREADS: threadid, forumid, title, userid
POSTS: postid, threadid, text, userid, date
Basically I want to bring back everything I need in one query. I want to list a page of THREADS (for a particular FORUM) and also display the number of POSTS in that THREAD row and when the last POST was for that THREAD.
At the moment I'm getting back all THREADS and then looping through the result set, making separate calls to the POSTS table for each thread's post count and latest post, but obviously this will cause problems in terms of hitting the database as the message board gets bigger.
My Linq To SQL so far:
public IList<Thread> ListAll(int forumid)
{
var threads =
from t in db.Threads
where t.forumid == forumid
select t;
return threads.ToList();
}
Basically I now need to get the number of POSTS in each thread and the date of the last post in each thread.
Any help would be most appreciated :)
EDIT
Hi guys. Thanks for your help so far. Basically I'm almost there. However, I left an important part out of my initial question: I need to retrieve the user name of the person making the last POST. Therefore I need to join p.userid with u.userid on the USERS table. So far I have the following, but I just need to amend it to join the POSTS table with the USERS table:
public IList<ThreadWithPostInfo> ListAll(int forumid)
{
var threads = (from t in db.Threads
where t.forumid == forumid
join p in db.Posts on t.threadid equals p.threadid into j
select new ThreadWithPostInfo() { thread = t, noReplies = j.Count(), lastUpdate = j.Max(post => post.date) }).ToList();
return threads;
}
UPDATE:
public IList<ThreadWithPostInfo> ListAll(int forumid)
{
var threads = (from t in db.Threads
from u in db.Users
where t.forumid == forumid && t.hide == "No" && t.userid == u.userid
join p in db.Posts on t.threadid equals p.threadid into j
select new ThreadWithPostInfo() { thread = t, deactivated = u.deactivated, lastPostersName = j.OrderByDescending(post => post.date).FirstOrDefault().User.username, noReplies = j.Count(), lastUpdate = j.Max(post => post.date) }).ToList();
return threads;
}
I finally figured that part of it out with thanks to all of you guys :). My only problem now is the Search Results method. At the moment it is like this:
public IList<Thread> SearchThreads(string text, int forumid)
{
var searchResults = (from t in db.Threads
from p in db.Posts
where (t.title.Contains(text) || p.text.Contains(text)) && t.hide == "No"
&& p.threadid == t.threadid
&& t.forumid == forumid
select t).Distinct();
return searchResults.ToList();
}
Note that I need to get the where clause into the new linq code:
where (t.title.Contains(text) || p.text.Contains(text)) && t.hide == "No"
so I need to incorporate this clause into the new LINQ method. Any help is gratefully received :)
SOLUTION:
I figured out a solution but I don't know if it's the best or most efficient one. Maybe you guys can tell me, because I'm still getting my head around LINQ. James, I think your answer was closest and got me near to where I wanted to be - thanks :)
public IList<ThreadWithPostInfo> SearchThreads(string text, int forumid)
{
var searchResults = (from t in db.Threads
from p in db.Posts
where (t.title.Contains(text) || p.text.Contains(text)) && t.hide == "No"
&& p.threadid == t.threadid
&& t.forumid == forumid
select t).Distinct();
//return searchResults.ToList();
var threads = (from t in searchResults
join p in db.Posts on t.threadid equals p.threadid into j
select new ThreadWithPostInfo() { thread = t, lastPostersName = j.OrderByDescending(post => post.date).FirstOrDefault().User.username, noReplies = j.Count(), lastUpdate = j.Max(post => post.date) }).ToList();
return threads;
}
Maybe too many database calls per session...
Calling the database, whether to query or to write, is a remote call, and we want to reduce the number of remote calls as much as possible. This warning is raised when the profiler notices that a single session is making an excessive number of calls to the database. This is usually an indication of a potential optimization in the way the session is used.
There are several reasons why this can be:
A large number of queries as a result of a Select N + 1
Calling the database in a loop
Updating (or inserting / deleting) a large number of entities
A large number of (different) queries that we execute to perform our task
For the first reason, you can see the suggestions for Select N + 1. Select N + 1 is a data access anti-pattern where the database is accessed in a suboptimal way. Take a look at this code sample :
// SELECT * FROM Posts
var postsQuery = from post in blogDataContext.Posts
select post;
foreach (Post post in postsQuery)
{
//lazy loading of comments list causes:
// SELECT * FROM Comments where PostId = @p0
foreach (Comment comment in post.Comments)
{
//print comment...
}
}
In this example, we can see that we are loading a list of posts (the first select) and then traversing the object graph. However, we access the collection in a lazy fashion, causing Linq to Sql to go to the database and bring the results back one row at a time. This is incredibly inefficient, and the Linq to Sql Profiler will generate a warning whenever it encounters such a case.
The solution for this example is simple. Force an eager load of the collection using the DataLoadOptions class to specify what pieces of the object model we want to load upfront.
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<Post>(p => p.Comments);
blogDataContext.LoadOptions = loadOptions;
// SELECT * FROM Posts JOIN Comments ...
var postsQuery = (from post in blogDataContext.Posts
select post);
foreach (Post post in postsQuery)
{
// comments are already loaded, so no lazy loading happens here
foreach (Comment comment in post.Comments)
{
//print comment...
}
}
Next, updating a large number of entities is discussed in Use Statement Batching, and can be achieved by using the PLINQO project, which is a set of extensions on top of Linq to Sql. PLINQO can also store query results in cache as a group: when storing items in cache, you tell PLINQO that the query result belongs to a group and specify the group's name. Grouping pays off when invalidating the cache, because the cached entries and the actions that invalidate them are not coupled to each other. Check out this example:
public ActionResult MyTasks(int userId)
{
// will be separate cache for each user id, group all with name MyTasks
var tasks = db.Task
.ByAssignedId(userId)
.ByStatus(Status.InProgress)
.FromCache(CacheManager.GetProfile().WithGroup("MyTasks"));
return View(tasks);
}
public ActionResult UpdateTask(Task task)
{
db.Task.Attach(task, true);
db.SubmitChanges();
// since we made an update to the tasks table, we expire the MyTasks cache
CacheManager.InvalidateGroup("MyTasks");
}
PLinqO supports the notion of query batching, using a feature called futures, which allow you to take several different queries and send them to the database in a single remote call. This can dramatically reduce the number of remote calls that you make and increase your application performance significantly.
cmiiw ^_^
public IList<Thread> ListAll(int forumid)
{
var threads =
from t in db.Threads
where t.forumid == forumid
select new
{
Thread = t,
Count = t.Post.Count,
Latest = t.Post.OrderByDescending(p=>p.Date).Select(p=>p.Date).FirstOrDefault()
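};
// To return these results, project into a named type; the anonymous type above cannot be returned as IList<Thread>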
}
Should be something like that
I think what you're really looking for is this:
var threadsWithPostStats = from t in db.Threads
where t.forumid == forumid
join p in db.Posts on t.threadid equals p.threadid into j
select new { Thread = t, PostCount = j.Count(), LatestPost = j.Max(post => post.date) };
Per your comment and updated question, I'm adding this restatement:
var threadsWithPostsUsers = from t in db.Threads
where t.forumid == forumid
join p in db.Posts on t.threadid equals p.threadid into threadPosts
let latestPostDate = threadPosts.Max(post => post.date)
join post in db.Posts on new { ThreadID = t.threadid, PostDate = latestPostDate } equals new { ThreadID = post.threadid, PostDate = post.date} into latestThreadPosts
let latestThreadPost = latestThreadPosts.First()
join u in db.Users on latestThreadPost.userid equals u.userid
select new { Thread = t, LatestPost = latestThreadPost, User = u };
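The group join used above (join ... into j) can be tried against in-memory data. This is a sketch only: ThreadRow, PostRow and ThreadStats are hypothetical types, and it runs as LINQ to Objects rather than LINQ to SQL. It casts the date to DateTime? inside Max so that a thread with no posts yields null instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the THREADS and POSTS tables.
public class ThreadRow
{
    public int ThreadId { get; set; }
    public string Title { get; set; }
}

public class PostRow
{
    public int ThreadId { get; set; }
    public DateTime Date { get; set; }
}

public class ThreadStats
{
    public string Title { get; set; }
    public int PostCount { get; set; }
    public DateTime? LastPost { get; set; }
}

public static class ThreadStatsDemo
{
    // One result per thread, including threads that have no posts yet.
    public static List<ThreadStats> Summarize(IEnumerable<ThreadRow> threads, IEnumerable<PostRow> posts)
    {
        return (from t in threads
                join p in posts on t.ThreadId equals p.ThreadId into j
                select new ThreadStats
                {
                    Title = t.Title,
                    PostCount = j.Count(),
                    // Max over an empty sequence of a nullable type returns null rather than throwing.
                    LastPost = j.Max(post => (DateTime?)post.Date)
                }).ToList();
    }

    public static void Main()
    {
        var threads = new[]
        {
            new ThreadRow { ThreadId = 1, Title = "Welcome" },
            new ThreadRow { ThreadId = 2, Title = "Empty thread" }
        };
        var posts = new[]
        {
            new PostRow { ThreadId = 1, Date = new DateTime(2024, 1, 1) },
            new PostRow { ThreadId = 1, Date = new DateTime(2024, 1, 3) }
        };

        foreach (ThreadStats s in Summarize(threads, posts))
        {
            Console.WriteLine(s.Title + ": " + s.PostCount + " posts, last post: "
                + (s.LastPost.HasValue ? s.LastPost.Value.ToString("yyyy-MM-dd") : "never"));
        }
    }
}
```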
Wouldn't hurt to get familiar with group by in LINQ and aggregates (Max, Min, Count).
Something like this:
var forums = (from t in db.Threads
group t by t.forumid into g
select new { forumid = g.Key, MaxDate = g.Max(d => d.ForumCreateDate) }).ToList();
Also check out this article for how to count items in a LINQ query with group by:
LINQ to SQL using GROUP BY and COUNT(DISTINCT)
LINQ aggregates:
LINQ Aggregate with Sub-Aggregates
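As a rough illustration of group by combined with Count, a distinct count and Max, here is a minimal LINQ to Objects sketch. PostRecord and ThreadSummary are hypothetical types introduced for this example, not part of the poster's schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a POSTS row.
public class PostRecord
{
    public int ThreadId { get; set; }
    public int UserId { get; set; }
    public DateTime Date { get; set; }
}

public class ThreadSummary
{
    public int ThreadId { get; set; }
    public int Posts { get; set; }
    public int DistinctPosters { get; set; }
    public DateTime Latest { get; set; }
}

public static class GroupByDemo
{
    // Aggregates posts per thread: total posts, distinct posters and latest post date.
    public static List<ThreadSummary> Summarize(IEnumerable<PostRecord> posts)
    {
        return (from p in posts
                group p by p.ThreadId into g
                orderby g.Key
                select new ThreadSummary
                {
                    ThreadId = g.Key,
                    Posts = g.Count(),
                    // Equivalent in spirit to COUNT(DISTINCT userid).
                    DistinctPosters = g.Select(x => x.UserId).Distinct().Count(),
                    // A group is never empty, so Max is safe on the non-nullable date.
                    Latest = g.Max(x => x.Date)
                }).ToList();
    }

    public static void Main()
    {
        var posts = new[]
        {
            new PostRecord { ThreadId = 1, UserId = 10, Date = new DateTime(2024, 1, 1) },
            new PostRecord { ThreadId = 1, UserId = 10, Date = new DateTime(2024, 1, 2) },
            new PostRecord { ThreadId = 1, UserId = 11, Date = new DateTime(2024, 1, 4) },
            new PostRecord { ThreadId = 2, UserId = 12, Date = new DateTime(2024, 2, 1) }
        };

        foreach (ThreadSummary s in Summarize(posts))
        {
            Console.WriteLine(s.ThreadId + ": posts=" + s.Posts + ", posters=" + s.DistinctPosters
                + ", latest=" + s.Latest.ToString("yyyy-MM-dd"));
        }
    }
}
```

Unlike the group join, a plain group by only produces rows for threads that have at least one post.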