I am still new to Entity Framework, so forgive me if the question is basic :)
I have a domain class that gets a list of data from the database:
public IEnumerable<Item> GetItems()
{
return context.Items.ToList();
}
This code returns all items from the database.
On the site I use paging, so I only need 10 items per page.
So I did something like this:
var model = itemsRepository.GetItems().
Where(x => x.CategoryId == categoryId).
OrderByDescending(x => x.CreatedOnDate).
Skip(0).
Take(pageSize);
As I see it, what I did here is load all items from the database and then filter them in memory.
Will I get some benefit if I add a new method to the domain and put the following code in it?
return context.Items.Where(x => x.CategoryId == categoryId).
OrderByDescending(x => x.CreatedOnDate).
Skip(0).
Take(pageSize);
Yes. You will get the benefit that in the latter case your LINQ query is translated to SQL and executed in the database. Your first example loads the entire table into memory, while the second example runs a much more efficient query in the database.
Essentially, the .ToList() breaks deferred execution - but it might also make sense for you to return IQueryable<T> rather than IEnumerable<T> and then compose on that in upper layers, depending on your requirements.
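For example, the repository could expose the unmaterialized query and let callers compose on it (a sketch based on the question's code; Item, context, and the property names are taken from the question):

```csharp
// Return IQueryable<Item> instead of calling ToList():
public IQueryable<Item> QueryItems()
{
    return context.Items;
}

// A caller chains onto the query; the whole chain is translated
// into a single SQL statement when it is finally enumerated.
var page = itemsRepository.QueryItems()
    .Where(x => x.CategoryId == categoryId)
    .OrderByDescending(x => x.CreatedOnDate)
    .Skip(0)
    .Take(pageSize)
    .ToList();
```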
Yes, you should. Assuming SQL is the backend for your context, the query constructed by your new method will pull only those 10 records and return them as an IEnumerable (deferred execution), rather than pulling everything from the database and then filtering down to the first 10 results.
I think you're better off with the second (new) method using deferred execution.
Are you seeing an improvement in performance via SQL Profiler, too?
There are some problems in your code:
Do not keep the context in a class-level field; every time you need it, create it and dispose it (with using):
using(var context = new ...)
{
// do DB stuffs
}
Do not call ToList() to fetch all items; apply normal paging first and then call ToList (something like your second sample, but with using, etc.).
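Combining both points, a paged repository method might look like this (a sketch; MyDbContext and the property names are assumed from the question):

```csharp
public List<Item> GetItemsPage(int categoryId, int pageNumber, int pageSize)
{
    // Create and dispose the context per call.
    using (var context = new MyDbContext())
    {
        return context.Items
            .Where(x => x.CategoryId == categoryId)
            .OrderByDescending(x => x.CreatedOnDate)
            .Skip(pageNumber * pageSize)
            .Take(pageSize)
            .ToList(); // materializes only the requested page
    }
}
```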
The problem with the second approach is that the domain is now coupled to the context. This defeats one of the main purposes of the repository pattern. I suggest you put the second method inside the repository, where you pass it the page number you want retrieved and it returns those items to you. In your repository, have something like:
public IEnumerable<Item> GetItemsForPage(int categoryId, int pageNumber, int pageSize)
{
    return context.Items
        .Where(x => x.CategoryId == categoryId)
        .OrderByDescending(x => x.CreatedOnDate)
        .Skip(pageNumber * pageSize) // note: not always 0
        .Take(pageSize);
}
In your domain you would call repository.GetItemsForPage(...). This gives you the benefit of deferred execution while maintaining the decoupling of the domain and the context.
I am making a call from AngularJS to a .NET function, and the time it takes for the response to reach Angular is more than 5 seconds. How can I reduce the time spent mapping the result of a SQL query? I have the following code:
List<CarDTO> result = new List<CarDTO>();
var cars = await _carsUnitOfWork.CarRepository.GetCarDefault(carFilter,true,_options.Value.priorityLabel);
result = cars.Select(car => _mapper.Map<Car, CarDTO>(car)).ToList();
The code you have provided isn't expanded enough to identify a cause, but there are a number of clues:
Check / post the code for CarRepository.GetCarDefault(). The call implies that this is returning an IEnumerable, given it is awaited. It isn't clear what all of the parameters are and how they affect the query. As your database grows, this appears to return all cars rather than supporting pagination. (What happens when you have 10,000 Car records, or 1,000,000?)
Next would be the use of Automapper's Map method. Combined with IEnumerable, this means that your repository goes through the hassle of loading all Car entities into memory, and then Automapper allocates a duplicate set of DTOs in memory, copying across data from the entities.
Lazy loading is a distinct risk with an approach like this. If the CarDTO pulls any fields from entities referenced by a Car, this will trip off additional queries for each individual car.
For best performance, I highly recommend adopting an IQueryable return type on Repository methods and leveraging Automapper's ProjectTo method rather than Map. This is equivalent to using Select in Linq, as ProjectTo will bubble down into the SQL generation to build efficient queries and return the collection of DTOs. This removes the risk of lazy loading calls as well as the double memory allocation for entities then DTOs.
Implementing this with your Unit of Work pattern is a bit of an unknown without seeing the code. However it would look something more like:
var result = await _carsUnitOfWork.CarRepository
.GetCarDefault(carFilter,true,_options.Value.priorityLabel)
.ProjectTo<CarDto>(mapperConfig)
.ToListAsync(); // Add Skip() and Take() to support pagination.
The repository method would be changed from being something like:
public async Task<IEnumerable<Car>> GetCarDefault( ... )
to
public IQueryable<Car> GetCarDefault( ... )
Rather than the method returning something like .ToListAsync(), you return the built Linq expression.
I.e. change from something like:
var result = await _context.Cars.Include(...).Where(x => ...).ToListAsync();
return result;
to
var query = _context.Cars.Where(x => ....);
return query;
The key difference is that where the existing method likely returns ToListAsync(), we remove that and return the unmaterialized IQueryable that Linq is building. Also, if the current implementation is eager loading any relations with .Include(), we exclude those. The caller performing projection doesn't need them. If the caller does need Car entity graphs (such as when updating data), the caller can append .Include() statements.
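For instance, a caller that does need the full entity graph (say, for an update) can append the eager loading itself; this is a sketch, and the Owner navigation property is purely hypothetical:

```csharp
// Caller-side eager loading on the repository's IQueryable<Car>.
var cars = await _carsUnitOfWork.CarRepository
    .GetCarDefault(carFilter, true, _options.Value.priorityLabel)
    .Include(c => c.Owner) // "Owner" is a hypothetical navigation property
    .ToListAsync();
```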
It is also worth running an SQL Profiler to look at what queries are being run against the database server. This gives you the queries to inspect and test, and highlights any unexpected queries being triggered (e.g. those caused by lazy-loading calls).
That should give you some ideas on where to start.
I need to dynamically filter data from a particular table depending on the user's permissions. E.g. a "normal" user can see only records assigned to him, but an admin can see all. I'm using Ninject to create a DB context per request, creating the context by passing additional user info to the constructor. Then I apply dynamic filtering (EF6) from the EntityFramework-Plus extensions:
public MyDbContext(bool isAdmin, string userId) : this()
{
if (!isAdmin)
{
this.Filter<MyTable>(table => table.Where(...));
}
}
This solution works as expected, i.e. calling methods like:
ctx.MyTable.Where(...)
results in the extra join declared in the filter.
But it behaves oddly when I use the Find() method. I'm using SQL Server Profiler to see what happens under the hood:
I create the context as restricted (non-admin user) - calling Find() results in an extra WHERE clause corresponding to the filter's lambda expression.
Then I create the context as admin (separate request) - calling Find() results in the same SQL expression (I expect no extra SQL clauses).
AFAIK this has something to do with query caching, since adding an extra line to the constructor seems to solve the problem:
public MyDbContext(bool isAdmin, string userId) : this()
{
// this "solves" the problem
QueryFilterManager.ClearQueryCache(this);
if (!isAdmin)
{
this.Filter<MyTable>(table => table.Where(...));
}
}
That looks like overkill, and it doesn't bring me any closer to understanding the problem. So here are my questions:
Why does this problem not affect Where() but affect Find()?
Is there any cleaner way to solve this issue? I've read about the Dynamic Filters library, but it's no good for me as it works only with the code-first model (DB-first here).
Are there better concepts for filtering data based on per-request data (like userId in my example)?
UPDATE
This is what my lambda expression looks like:
private Func<IQueryable<MyTable>, IQueryable<MyTable>> GetFilter(string userId)
{
    return t => t
        .Where(c => c.DataScopes.Any(
            x => x.AspNetGroups.Any(
                ang => ang.AspNetUsers.Any(
                    anu => anu.Id == userId))));
}
AspNetGroups is my custom table for grouping users. Data permissions are assigned to user groups.
I am using EF to get data from my MySQL database.
I have two tables: customers and project_codes.
The project_codes table has an FK to customers named id_customers. This way I am able to query which projects belong to a customer.
On my DAL, I got the following:
public List<customer> SelectAll()
{
using (ubmmsEntities db = new ubmmsEntities())
{
var result = db.customers
.Include(p => p.project_codes)
.OrderBy(c=>c.customer_name)
.ToList();
return result;
}
}
That outputs all my customers and their respective project_codes.
Recently I needed to get only the project codes, so I created a DAL method to fetch all the project codes from my database. But then I got myself thinking: "Should I have done that? Wouldn't it be best to use my SelectAll() method and, from its result, use LINQ to fetch the list of project_codes of all customers?"
So that was my first question. I mean, re-using methods as much as possible is a good thing from a maintainability perspective, right?
The second question would be: how do I get all the project_codes into a List? Doing it directly is easy, but I failed to achieve that using SelectAll() as a base.
It worked alright if I had the customer_id using:
ddlProject.DataSource = CustomerList.Where(x => x.id.Equals(customer_id))
.Select(p => p.project_codes).FirstOrDefault();
That output the project codes of that customer, but I tried different approaches (foreach, where within where, and some others at random) and they either fail with a syntax error or don't give me a list with all the project_codes. So that is another reason for me going with a specific method to get the project codes.
Even if "common sense" or best practice says it is a bad idea to re-use the method as mentioned above, I would like some directions on how to get a list of project_codes using the return of SelectAll()... you never know when it can come in handy.
Let me know your thoughts.
There's a trade-off here; you are either iterating a larger collection (and doing selects, etc) on an in-memory collection, or iterating a smaller collection but having to go to a database to do it.
You will need to profile your setup to determine which is faster, but it's entirely possible that the in-memory approach will be better (though stale, if your data could have changed!).
To get all the project_codes, you should just need:
List<customer> customers; //Fill from DAL
List<project_code> allProjects = customers.SelectMany(c => c.project_codes).ToList();
Note that I used SelectMany to flatten the hierarchy of collections, I don't think SelectAll is actually a LINQ method.
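As a runnable illustration with plain in-memory objects (the classes here are simplified stand-ins for the EF entities, not the question's actual types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Customer
{
    public string Name { get; set; }
    public List<string> ProjectCodes { get; set; }
}

class Program
{
    static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Acme",   ProjectCodes = new List<string> { "A1", "A2" } },
            new Customer { Name = "Globex", ProjectCodes = new List<string> { "G1" } }
        };

        // SelectMany flattens the per-customer lists into one sequence.
        List<string> allProjects = customers
            .SelectMany(c => c.ProjectCodes)
            .ToList();

        Console.WriteLine(string.Join(",", allProjects)); // A1,A2,G1
    }
}
```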
Usually the distinction between LINQ to SQL and LINQ to Objects isn't much of an issue, but how can I determine which is happening?
It would be useful to know when writing the code, but I fear one can only be sure at run time sometimes.
It's not micro-optimization to make the distinction between Linq-To-Sql and Linq-To-Objects. The latter requires all data to be loaded into memory before you start filtering it. Of course, that can be a major issue.
Most LINQ methods use deferred execution, which means they just build the query without executing it yet (like Select or Where). A few others execute the query and materialize the result into an in-memory collection (like ToList or ToArray). If you use AsEnumerable you are also using Linq-To-Objects, and no SQL is generated for the parts after it, which means the data must be loaded into memory (yet still using deferred execution).
So consider the following two queries. The first selects and filters in the database:
var queryLondonCustomers = from cust in db.customers
where cust.City == "London"
select cust;
whereas the second selects all and filters via Linq-To-Objects:
var queryLondonCustomers = from cust in db.customers.AsEnumerable()
where cust.City == "London"
select cust;
The latter has one advantage: you can use any .NET method since it doesn't need to be translated to SQL (e.g. !String.IsNullOrWhiteSpace(cust.City)).
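The boundary can be made explicit in one chain (a sketch against the question's db context; everything before AsEnumerable() is translated to SQL, everything after runs in memory):

```csharp
var londonCustomers = db.customers
    .Where(c => c.City == "London")   // runs in the database as SQL
    .AsEnumerable()                   // switch to Linq-To-Objects from here on
    .Where(c => !String.IsNullOrWhiteSpace(c.City)); // any .NET method is allowed now
```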
If you just get something that is an IEnumerable<T>, you can't be sure if it's actually a query or already an in-memory object. Even the try-cast to IQueryable<T> will not tell you for sure what it actually is because of the AsQueryable-method. Maybe you could try-cast it to a collection type. If the cast succeeds you can be sure that it's already materialized but otherwise it doesn't tell you if it's using Linq-To-Sql or Linq-To-Objects:
bool isMaterialized = queryLondonCustomers as ICollection<Customer> != null;
Related: EF ICollection Vs List Vs IEnumerable Vs IQueryable
The first solution that comes to my mind is checking the query provider.
If the query is materialized, meaning the data is loaded into memory, EnumerableQuery<T> is used. Otherwise a specialized query provider is used, for example System.Data.Entity.Internal.Linq.DbQueryProvider for Entity Framework.
var materialized = query
.AsQueryable()
.Provider
.GetType()
.GetGenericTypeDefinition() == typeof(EnumerableQuery<>);
However, the above are ideal cases, because someone could implement a custom query provider that behaves like EnumerableQuery.
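The check can be exercised with a plain in-memory sequence, whose AsQueryable() wrapper is an EnumerableQuery<T> acting as its own provider:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var query = new[] { 1, 2, 3 }.AsQueryable();

        // For Linq-To-Objects the provider is EnumerableQuery<T> itself.
        bool materialized = query.Provider.GetType().GetGenericTypeDefinition()
                            == typeof(EnumerableQuery<>);

        Console.WriteLine(materialized); // True
    }
}
```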
I had the same question, for different reasons.
Judging purely on your title & initial description (which is why a Google search brought me here):
Pre-compilation, given an instance that implements IQueryable, there's no way to know the implementation behind the interface.
At runtime, you need to check the instance's Provider property, as @Danny Chen mentioned.
public enum LinqProvider
{
Linq2SQL, Linq2Objects
}
public static class LinqProviderExtensions
{
public static LinqProvider LinqProvider(this IQueryable query)
{
if (query.Provider.GetType().IsGenericType && query.Provider.GetType().GetGenericTypeDefinition() == typeof(EnumerableQuery<>))
return LinqProvider.Linq2Objects;
if (typeof(ICollection<>).MakeGenericType(query.ElementType).IsAssignableFrom(query.GetType()))
return LinqProvider.Linq2Objects;
return LinqProvider.Linq2SQL;
}
}
In our case, we are adding additional filters dynamically, but we ran into issues with different case-sensitivity/null-reference handling on different providers.
Hence, at runtime we had to tweak the filters we add based on the type of provider, and we ended up adding the extension method above.
Using EF Core in .NET 6:
To see if the provider is an EF provider, use the following code:
if (queryable.Provider is Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider)
{
// Queryable is backed by EF and is not an in-memory/client-side queryable.
}
One could get the opposite by testing the provider against System.Linq.EnumerableQuery (base type of EnumerableQuery<T> - so you don't have to test generics).
This is useful if you have methods like EF.Functions.Like(...) which can only be executed in the database - and you want to branch to something else in case of client-side execution.
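A sketch of that branching (the Customer entity and its Name property are assumed for illustration; EF.Functions.Like is only translated by relational EF Core providers):

```csharp
static IQueryable<Customer> FilterByName(IQueryable<Customer> source, string pattern)
{
    if (source.Provider is Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider)
    {
        // Translated to a SQL LIKE by the relational provider.
        return source.Where(c => EF.Functions.Like(c.Name, "%" + pattern + "%"));
    }

    // Client-side fallback for in-memory queryables.
    return source.Where(c => c.Name.Contains(pattern));
}
```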
I would like to know when to use ToList(). In the following example, neither option results in an error. So which one should I use?
var employees = db.Employees.Include(e => e.Department);
return View(employees);
var employees = db.Employees.Include(e => e.Department);
return View(employees.ToList());
Seems like the code is ASP.NET MVC code, given return View(employees);
I also assume that the data is being pulled from the DB using some LINQ-to-SQL or Entity Framework-like technology.
Given those two assumptions, I'd recommend that the latter be used. i.e. with .ToList()
The reason being: if the query is lazily evaluated and you pass employees without .ToList(), you are essentially passing the query to the View, and it will execute when the query is enumerated while rendering the View. View rendering should be fast and not be blocked by a call to the database.
.ToList() avoids that by forcing execution of the query in the controller, so the View has the data available in memory for fast rendering.
Hope it answers your question.
EDIT: One caveat: there are some scenarios, for example when building APIs (like OData APIs with Web API), where you actually want to return the query rather than the materialized list. The reason is that, by design, you want the framework to build on top of that query before the filtered data is returned to the caller. In other words, the framework does some more legwork for you before the view (serialized data, typically not HTML) is actually rendered.
After the first line is executed, the employees collection is not yet loaded into memory (deferred execution); the query runs when the collection is first enumerated. Calling ToList() forces the collection to be loaded into memory.
Usage comes down to a trade-off between memory and speed: accessing data already in memory is faster than re-executing the query on each enumeration.
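The difference can be demonstrated with plain LINQ-to-Objects, no database required: the unmaterialized query re-evaluates on every enumeration, while ToList() takes a snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var source = new List<int> { 1, 2, 3 };

        var query = source.Where(n => n > 1); // nothing executed yet
        var snapshot = query.ToList();        // executed now: { 2, 3 }

        source.Add(4);

        Console.WriteLine(query.Count());  // 3 - the deferred query sees the new item
        Console.WriteLine(snapshot.Count); // 2 - ToList froze the earlier result
    }
}
```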