How do I make Entity Framework not join tables? - c#

Ok so I am testing out EF once again for performance and I just want a simple result back from my database.
Example
var jobsList = from j in mf.Jobs
where j.UserID == 1001 select new { Job = j };
This unfortunately joins my User object to this list, which I don't want EF to do. How do I tell EF not to join just because there is a relationship. Basically I just want a simple row from that table.
Or do I need to use a different type of retrieval. I am still using the basic type of database retrieval below and I feel there are better ways to handle db work by now.
SqlConnection myconnection = new SqlConnection();
Edit
Basically what I am saying in a more clearer context. Is that instead of only getting the following.
Job.JobID
Job.UserID
//Extra properties
I Get
Job.JobID
Job.UserID
Job.User
//Extra properties
That User object easily consumes way more memory than is needed, plus I don't need it.
My Solution
So I am still not believing in EF too much and here is why. I turned off LazyLoading and turned it on and didn't really notice too much of a performance difference there. I then compared the amount of data that my SqlConnection type method uses compared to my EF method.
I get back the exact same result set and here are the performance differences.
For my Entity Framework method I get back a list of jobs.
MyDataEntities mf = new MyDataEntities(); // 4MB for the connection...really?
mf.ContextOptions.LazyLoadingEnabled = false;
// 9MB for the list below
var test = from j in mf.Jobs
where j.UserID == 1031
select j;
foreach (Job job in test) {
Console.WriteLine(job.JobID);
}
For my SqlConnection method that executes a Stored Proc and returns a result set.
//356 KB for the connection and the EXACT same list.
List<MyCustomDataSource.Jobs> myJobs = MyCustomDataSource.Jobs.GetJobs(1031);
I fully understand that Entity Framework is doing way way more than a standard SqlConnection, but why all this hype if it is going to take at a minimum of 25x more memory for a result set. Just doesn't seem worth it.
My solution is not to go with EF after all.

The User property is part of the job class but wont be loaded until you access it (lazy loading). So it is not actually "joined".
If you only want the two specified columns you can write
var jobsList = from j in mf.Jobs
where j.UserID == 1001
select new {
Job.JobID,
Job.UserID
};

The most probable reason for this behavior is that you have LazyLoadingEnabled property set to true.
If this is the case, the User isn't recovered in the original query. But if you try to acces this property, even if you do it through an inspection while debugging, this will be loaded from the database. But only if you try to access it.
You can check this opening a a SQL Server Profiler, and seeing what commands are begin sent to the DB.
Your code is not using eager loading or explicit loading. So this must be the reason.

I think that EF don't know that you want one result only. Try something like this.
Job jobsItem = mf.Jobs.Single(j=>j.UserID==1001)
If you don't want to use lambas...
Job JobItem = (from j in mf.Jobs where j.UserID == 1001 select j).Single()
I haven't a compiler near right now, I hope the syntax is right. Use can use var instead of Job for your variable if you preffer. It has no effect but I think Job is more readable in this case.

User actually is not attached to context until you access User property of Job. Turn off Lazy Loading if you want to get a null for User.

Entity Framework does not support lazy loading of properties. However, it has table-splitting
Emphasized the properties. Of course, Entity Framework supports lazy loading of rows

Related

Mark some child entities as "Never Load" in LINQ to Entities query

TL;DR
I want to write a query in LINQ to Entities and tell it that I'll never load the child entities of an entity. How do I do that without projecting?
Eg,
return (from a in this.Db.Assets
join at in this.Db.AssetTypes on a.AssetTypeId equals at.AssetTypeId
join ast in this.Db.AssetStatuses on a.AssetStatusId equals ast.AssetStatusId
select new {
a = a,
typeDesc = at.AssetTypeDesc,
statusDesc = ast.AssetStatusDesc
}).ToList().Select(anon => new AssetViewModel(anon.a, anon.typeDesc, anon.statusDesc)).ToList();
I want the entity called Asset pulled into a on the anonymous type I'm defining, and when I call ToList(), I don't want the Assets' children, Status and Type, to lazy load.
EDIT: After some random Visual Studio autcomplete investigation, much of this can be accomplished by turning off lazy loading in the DbContext:
this.Db.Configuration.LazyLoadingEnabled = false;
Unfortunately, if your work with the query results does have a few child tables, even with LazyLoadingEnabled turned off, things may still "work" for some subset of them iff the data for those children has already been loaded earlier in this DbContext -- that is, if those children have already had their context cached -- which can make for some surprising and temporarily confusing results.
That is to say, I want to explicitly load some children at query time and completely sever any relationship to other child entities.
Best would be some way to actively load some entities and to ignore the rest. That is, I could call ToList() and not have to worry about throwing off lots of db connections.
Context
I have a case where I'm hydrating a view model with the results of a LINQ to Entities query from an entity called Asset. The Asset table has a couple of child tables, Type and Status. Both Type and Status have Description fields, and my view model contains both descriptions in it. Let's pretend that's as complicated as this query gets.
So I'd like to pull everything from the Asset table joined to Type and Status in one database query, during which I pull the Type and Status descriptions. In other words, I don't want to lazy load that info.
WET (Woeful Entity reTranscription?)
What we're doing now, which does exactly what I want from a connection standpoint, is the usual .Select into the view model, with a tedious field matchup.
return (from a in this.Db.Assets
join at in this.Db.AssetTypes on a.AssetTypeId equals at.AssetTypeId
join ast in this.Db.AssetStatuses on a.AssetStatusId equals ast.AssetStatusId
select new AssetViewModel
{
AssetId = a.AssetId,
// *** LOTS of fields from Asset removed ***
AssetStatusDesc = ast.AssetStatusDesc,
AssetTypeDesc = at.AssetTypeDesc
}).ToList();
That's good in that the Status and Type child entities of Asset are never accessed, and there's no lazy load. The SQL is one join in one database hit for all the assets. Perfect.
The worry is all the repeated jive in // *** LOTS of fields from Asset removed ***. Currently, we've got that projection in every freakin query, which obviously isn't DRY. And it means that when the Asset table changes, it's rare that the new field is included in every projection (because humans), which stinks.
I don't see a quick way around the query, btw. If I want to do it in a single query, I have to have the joins. I could add wheres to it in separate methods, but I'm not sure how I'd skip the projection each time. Or I could add joins to the query in cascading methods, but then my projection is still "repository bound", which isn't best case if I'm using these sorts of queries elsewhere. But I'm betting I'm stupiding something here.
Dumb
When I tried adding a cast to my view model from asset and changing to something like this, which is beautiful from a code standpoint, though I get bitten by lazy loading -- two extra database hits per Asset, one for Status and one for Type.
return (from a in this.Db.Assets
select a).ToList().Select(asset => (AssetViewModel)asset).ToList();
Just as we would expect, since I'm using lines like...
AssetTypeDesc = a.AssetType.AssetTypeDesc,
... inside of the casting code. So that was dumb. Concise, reusable, but dumb. This is why we hate folks who use ORMs without checking the SQL. ;^)
Overly clever, sorta
But then I tried getting too clever, with a new constructor for the view model that took the asset entity & the two description values as strings, which ended up with the same lazy load issue (because, duh, the first ToList() before selecting the anonymous objects means we don't know how the Assets are going to be used, and we're stuck pulling back everything to be safe (I assume)).
//Use anon type to skirt "Only parameterless constructors
//and initializers are supported in LINQ to Entities,"
//issue.
return (from a in this.Db.Assets
join at in this.Db.AssetTypes on a.AssetTypeId equals at.AssetTypeId
join ast in this.Db.AssetStatuses on a.AssetStatusId equals ast.AssetStatusId
select new {
a = a,
typeDesc = at.AssetTypeDesc,
statusDesc = ast.AssetStatusDesc
}).ToList().Select(anon => new AssetViewModel(anon.a, anon.typeDesc, anon.statusDesc)).ToList();
If only there was some way to say, "cast these anonymous objects to a List, but don't lazy load the Asset's children while you're doing it." <<< That's my question, natch.
I've read some about DataLoadOptions.LoadWith(), which probably provides an okay solution, and I might end up just doing that, but that's not precisely what I'm asking. I think that's a global-esque setting (? I think just for the life of the data context, which should be the single controller interaction), which I might not necessarily want to set. I may also want ObjectTrackingEnabled = false, but I'm not grokking yet.
I also don't want to use an automapper.
Painfully, after some random Visual Studio autcomplete investigation, this might be as easy as turning off lazy loading in your DbContext:
this.Db.Configuration.LazyLoadingEnabled = false;
The wacky thing is that if your work with the query results does have a few child tables, even with LazyLoadingEnabled turned off, things may still "work" for some subset of them iff the data for those children has already been loaded earlier in this DbContext -- that is, if those children have already had their context cached -- which can make for some surprising and temporarily confusing results.
Better would be to be able to cherry pick what children are "lazy-loading eligible".
I may need to update the question to make it cover this variation of the original question.

Entity Framework accessing db methods

I'm quite new at this so forgive me for not knowing how to word this properly.
I have setup Entity Framework with a rather large database and have been trying to learn how to manipulate tables in the database.
My first method worked but I ran into an error. It would load the data into a datagridview just fine but when sliding the bar over to view the tables not on the screen it would throw an error. This is the process that triggered the error:
using (var context = new MydbEntities())
{
var query = (from a in db.Configurations
select a);
var result = query.ToList();
dataGridView1.DataSource = result;
}
Now if I change the first line to MydbEntities db = new MydbEntities(); I don't get an error. I'm trying to follow online tutorials but I thought maybe someone could help me understand the difference in these two.
Basically you met lazy-load and fact that Context got disposed when you tried to query next batch of records.
You have serveral options here, best ones in my opinion are:
Use EntityDataSource. This way DataControl will take care of EF Context instantiation and disposal. (MSDN has pretty good specs on that).
Implement custom ObjectDataSource. Use EF context within ObjectDataSource methods, instantiating and disposing it when needed.
(This article on the topic is little outdated, but you still can get the idea).

Does LINQ load every object in a request?

Let me explain my self.
With the help of LINQ i'm requesting an object to my database such :
int idGetClient=1;
Client clientToUpdate = context.Client.FirstOrDefault(p=>p.idClient == idGetClient);
in my Model, a client is related to another object called Site. So i can easily get my Site object from my SQL database just by calling :
Site siteToUpdate = clientToUpdate.Site;
So i wonder what's happening here, did LINQ ALREADY loaded up the result in my first request OR is he requesting again taking up my client information and looking up in my database ?
The second one looks the most logic to me but i want to make sure, since if what happen if the first case, it's going to cause some performance issues.
If nothing else is specified, your property Site will be lazy loaded, so just at the moment you try to access to the property, Entity will query the database server.
If a second access to the database server will cause performance issue, you can prevent that with:
int idGetClient=1;
Client clientToUpdate = context.Client
.Include("Site")
.FirstOrDefault(p=>p.idClient == idGetClient);
And you can add as many Include as you want, and you can even load properties of properties :
int idGetClient=1;
Client clientToUpdate = context.Client
.Include("Site")
.Include("Site.SubProperty")
.FirstOrDefault(p=>p.idClient == idGetClient);
The Include will force the eager loading of the specified properties.
More info here :
https://msdn.microsoft.com/en-us/data/jj574232.aspx
Presuming you are using entity framework, you can hook into the sql statements yourself and see what's going on when you access your object - e.g.
context.Database.Log = Console.Write;
It's also possible to ensure the relation is loaded by using include:
context.Client.Include("Site").FirstOrDefault(p=>p.idClient == idGetClient);
In most cases the navigation property .Site causes the Site object to be lazy loaded. That is: a specific database query is issued.
The question is a bit scarce on details, so I assume that:
.Site is a navigation property representing a relation to another table in the database.
No global configuration on early loading is done (which is possible with some linq providers).
I recommend using a SQL profiler to see what queries are actually issued to the database (see this blogpost for some reasons on why).

Queryable Linq Query Differences In Entity Framework

I have a very simple many to many table in entity framework connecting my approvals to my transactions (shown below).
I am trying to do a query inside the approval object to count the amount of transactions on the approval, which should be relatively easy.
If I do something like this then it works super fast.
int count;
EntitiesContainer dbContext = new EntitiesContainer ();
var aCnt = from a in dbContext.Approvals
where a.id == id
select a.Transactions.Count;
count = aCnt.First();
However when I do this
count = Transactions.Count;
or this
count = Transactions.AsQueryable<Transaction>().Count();
its exceedingly slow. I have traced the sql running on the server and it does indeed seem to be trying to load in all the transactions instead of just doing the COUNT query on the collection of Transactions.
Can anyone explain to me why?
Additional :
Here is how the EF model looks in regards to these two classes
UPDATE :
Thanks for all the responses, I believe where I was going wrong was to believe that the collections attached to the Approval object would execute as IQueryable. I'm going to have to execute the count against the dbContext object.
Thanks everyone.
var aCnt = from a in dbContext.Approvals
where a.id == id
select a.Transactions.Count;
EF compiles query by itself, the above query will be compiled as select count transactions
Unlike,
count = Transactions.AsQueryable<Transaction>().Count();
count = Transactions.Count;
these will select all the records from transaction and then computes the count
When you access the a.Transactions property, then you load the list of transactions (lazy loading). If you want to get the Count only, then use something like this:
dbContext.Transactions.Where(t => t.Approvals.Any(ap => ap.Id == a.Id)).Count();
where a is given Approval.
Your first method allows the counting to take place on the database server level. It will ask the database not to return the records, but to return the amount of records found. This is the most efficient method.
This is not to say that other methods can't work as efficiently, but with the other two lines, you are not making it clear in the first place that you are retrieving transactions from a join on Approvals. Instead, in the other two lines, you take the Transactions collection just by itself and do a count on that, basically forcing the collection to be filled so it can be counted.
Your first snippet causes a query to be executed on the database server. It works that because the IQueryable instance is of type ObjectQuery provided by the Entity Framework which performs the necessary translation to SQL and then execution.
The second snippet illustrates working with IEnumerable instances. Count() works on them by, in worst case, enumerating the entire collection.
In the third snippet you attempt to make the IEnumerable an IQueryable again. But the Enumerable.AsQueryable method has no way of knowing that the IEnumerable it is getting "came" from Entity Framework. The best it can do is to wrap the IEnumerable in a EnumerableQuery instance which simply dynamically compiles the expression trees given to all LINQ query operators and executes them in memory.
If you need the count to be calculated by the database server, you can either formulate the requisite query manually (that is, write what you already did in snippet one), or use the method CreateSourceQuery available to you if you're not using Code First. Note that it will really be executed on the database server, so if you have modified the collection and have not yet saved changes, the result will be different to what would be returned by calling Count directly.

sorting on related field in llblgen 2.6

I inherited an application that uses llblgen 2.6. I have a PersonAppointmentType entity that has a AppointmentType property (n:1 relation). Now I want to sort a collection of PersonAppointmentTypes on the name of the AppointmentType. I tried this so far in the Page_Load:
if (!Page.IsPostBack)
{
var p = new PrefetchPath(EntityType.PersonAppointmentTypeEntity);
p.Add(PersonAppointmentTypeEntity.PrefetchPathAppointmentType);
dsItems.PrefetchPathToUse = p;
// dsItems.SorterToUse = new SortExpression(new SortClause(PersonAppointmentTypeFields.StartDate, SortOperator.Ascending)); // This works
dsItems.SorterToUse = new SortExpression(new SortClause(AppointmentTypeFields.Name, SortOperator.Ascending));
}
I'm probably just not getting it.
EDIT:
Phil put me on the right track, this works:
if (!Page.IsPostBack)
{
dsItems.RelationsToUse = new RelationCollection(PersonAppointmentTypeEntity.Relations.AppointmentTypeEntityUsingAppointmentTypeId);
dsItems.SorterToUse = new SortExpression(new SortClause(AppointmentTypeFields.Name, SortOperator.Ascending));
}
You'll need to share more code if you want an exact solution. You didn't post the code where you actually fetch the entity (or collection). This may not seem relevant but it (probably) is, as I'm guessing you are making a common mistake that people make with prefetch paths when they are first trying to sort or filter on a related entity.
You have a prefetch path from PersonAppointmentType (PAT) to AppointType (AT). This basically tells the framework to fetch PATs as one query, then after that query completes, to fetch ATs based on the results of the PAT query. LLBLGen takes care of all of this for you, and wires the objects together once the queries have completed.
What you are trying to do is sort the first query by the entity you are fetching in the second query. If you think in SQL terms, you need a join from PAT=>AT in the first query. To achieve this, you need to add a relation (join) via a RelationPredicateBucket and pass that as part of your fetch call.
It may seem counter-intuitive at first, but relations and prefetch paths are completely unrelated (although you can use them together). You may not even need the prefetch path at all; It may be that you ONLY need the relation and sort clause added to your fetch code (depending on whether you actually want the AT Entity in your graph, vs. the ability to sort by its fields).
There is a very good explanation of Prefetch Paths and how they were here:
http://www.llblgening.com/archive/2009/10/prefetchpaths-in-depth/
Post the remainder of your fetch code and I may be able to give you a more exact answer.

Categories

Resources