I'm using Autofac and EF6.
I have a service which I would like to use either with a dbContext or with a local collection. I've already tried using an InMemoryDatabase and injecting it into the service, but I need to load all related entities due to validation rules, and that leads to performance issues.
Is there any way to solve this without creating the same service again for a local store? Otherwise I'd have to maintain the logic in both of them, which can lead to different behaviour.
The goal is to preload all the data in some cases and use a local store, because accessing the db context multiple times causes poor performance.
Example of service:
public class ProductServiceDb {
    private readonly IDbContext _db;

    public ProductServiceDb(IDbContext db) {
        _db = db;
    }

    public List<Product> GetAvailable()
    {
        return _db.Products.Where(_ => _.InStock).AsNoTracking().ToList();
    }
}

public class ProductServiceLocal {
    public List<Product> Products { get; set; }

    public List<Product> GetAvailable()
    {
        return Products.Where(_ => _.InStock).ToList();
    }
}
I've also already thought about using local data, but then I'd have to check in every method whether it's local data and use the Local property in that case.
I once made this little extension method to tell EF that I'd prefer to get data from its cache:
public static IEnumerable<TEntity> PreferLocal<TEntity>(this DbSet<TEntity> dbSet,
    Expression<Func<TEntity, bool>> predicate) where TEntity : class
{
    var func = predicate.Compile();
    return dbSet.Local.Any(func) ? dbSet.Local.Where(func) : dbSet.Where(predicate);
}
As you can see, if there's no local data that meets the given predicate, the method falls back to querying the database.
I use this function in more complex business logic when I know that some entities will have to be addressed more than once, while it's not always clear when they will be fetched for the first time.
A caveat: it works well the way I use it, but it may fail badly if you rely on navigation properties being populated fully. If, for example, you first fetch a couple of products based on some predicate and then fetch order lines, not every order line may have its orderLine.Product property populated. Also, if lazy loading is enabled, EF will still query the database when collection navigation properties are addressed.
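For illustration, a call against the question's service might look like this (assuming Products is exposed as a DbSet<Product> so the extension method applies):

var availableProducts = _db.Products.PreferLocal(p => p.InStock).ToList();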
Related
I have a collection property in my code-first model:
public class Advertisment
{
    //...
    public ICollection<Comment> Comments { get; set; }
}
where Comment is another model class that represents a comment on the advertisement.
My task is simple: the program has to count the number of comments.
There are two ways:
through an additional Int32 field:
public ICollection<Comment> Comments { get; set; }
public int NumberOfComments;

public void IncrementComments() // call after a client leaves a comment
{
    NumberOfComments++;
}

public int SumComments() // just return the field
{
    return NumberOfComments;
}
through calling the Count method:
public int SumComments()
{
    return db.Comments.Where(c => c.Advertisment == this).Count(); // where db is an instance of DbContext
}
My question is about performance. The second way may be easier to develop, but in this case Entity Framework makes a request to the database to count the elements. Does that have a negative impact on performance?
db.Comments.Count() can make a DB request if you haven't queried the data since you initialized your DbContext.
If you have already queried the data, EF will count the data in your local instance.
So you have two possibilities:
1. You have already queried the data: EF only uses the Count() method on the in-memory IEnumerable.
2. You have not queried the data: EF will query the data and then use the Count() method as in 1.
=> EF will only query the data if needed.
=> I don't see any downsides to using db.Comments.Count();
Edit:
The same happens when you're using LINQ expressions on the table, as in:
db.Comments.Where(c => c.Advertisment == this).Count();
The LINQ filter has to be applied every time you use this line, but only on the local instance of your data.
Q: In my scenario I only need the number of comments, i.e. EF doesn't query the collection navigation property (no eager loading). I think in my situation it would be better to use just the single field (int NumberOfComments), because otherwise EF would query additional data for the counting. Am I wrong?
A: If you have only one server instance running, you're right, but if there are multiple instances you will get data inconsistency. EF context objects aren't designed to be long-lived objects; see the comment on this post:
EntityFramework force globally reload before each query
The data context is not designed to be long-lived. Reusing it over multiple operations is a mistake in your design. (spender)
For example: you have two servers running the same application behind a load balancer. If a comment is sent to one server, how should the other server increment its comment counter without counting the comments?
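For completeness, a minimal sketch of the Count-based approach that pushes the counting down to the database; the AdvertismentId key/foreign-key property names are assumptions, since the question's model elides them:

public int SumComments()
{
    // Translates to a single SELECT COUNT(*) ... WHERE AdvertismentId = @id,
    // so no Comment entities are materialized and the Comments navigation
    // property is never loaded.
    return db.Comments.Count(c => c.AdvertismentId == this.AdvertismentId);
}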
Sorry if my question is a bit stupid, but I got really stuck with this basic thing.
I have generic repository that has method like this:
protected List<T> GetEntities(Expression<Func<T, bool>> whereExpression)
{
    var query = Context.Set<T>().AsQueryable();

    if (whereExpression != null)
        query = query.Where(whereExpression).AsQueryable();

    List<T> result = query.ToList();
    return result;
}
Our app should allow working with data without committing to the database, but at any time the user should be able to save changes.
For adding or deleting entries I use:
Context.Set<T>().Add(entity);
and
Context.Set<T>().Remove(entity);
But after I call GetEntities, deleted entities appear again and new entities do not appear.
Obviously GetEntities queries the database directly and does not take local changes into account.
So, my question: is there any easy way to apply the where-expression to both the data loaded from the database and the local changes?
Thanks.
I sometimes use this little method to give preference to local data:
public static IQueryable<TEntity> PreferLocal<TEntity>(this DbSet<TEntity> dbSet,
    Expression<Func<TEntity, bool>> predicate)
    where TEntity : class
{
    if (dbSet.Local.AsQueryable().Any(predicate))
    {
        return dbSet.Local.AsQueryable().Where(predicate);
    }
    return dbSet.Where(predicate);
}
It first tries to find entities in the Local collection (and because it's all in-memory, the extra Any isn't expensive). If they're not there, the database is queried.
In your repository you could use it like so:
return Context.Set<T>().PreferLocal(whereExpression).ToList();
But maybe you should also reconsider the life cycle of your Context. It looks like you have a long-lived context, which isn't recommended.
I would like to know what the best code design is for storing and retrieving data from a database when working with objects and classes.
I could do this in two ways.
In the class constructor I query the database, store all the info in instance variables inside the class and retrieve it with getters/setters. This way I can always get any information I want, but in many cases I won't need all of the information all the time.
public class Group {
    public int GroupID { get; private set; }
    public string Name { get; private set; }

    public Group(int groupID)
    {
        this.GroupID = groupID;
        this.Name = // retrieve data from database
    }

    public string getName()
    {
        // this is just an example method, I know I can retrieve the name from the getter :)
        return Name;
    }
}
The other way is to create some methods and pass in the groupID as a parameter, and then query the database for the specific information I need. This could result in more queries, but I will only get the information I need.
public class Group {
    public Group()
    {
    }

    public string getName(int groupID)
    {
        // query the database for the name based on groupID
        return name;
    }
}
What do you think is the best way to go? Is there a best practice to go with here or is it up to me what I think works the best?
You don't want to do heavy DB work in the constructor. Heavy work should be done in methods.
You also don't necessarily want to couple the DB work with the entity class that holds the data. What if you want a method that returns several of those objects from the database, for example GetGroups()? You can't even construct one without doing DB work. For anything that returns multiple objects, storage and retrieval should be decoupled from the entity class.
Instead, decouple your DB work from your entity objects. One option is a data access layer with methods like GetFoo, GetFoos, etc. that query the database, populate the objects and return them.
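As a minimal sketch of that idea using plain ADO.NET (the Groups table, its columns, and a Group constructor that accepts the loaded values are all assumptions about your schema and entity):

using System.Data.SqlClient;

public class GroupRepository
{
    private readonly string _connectionString;

    public GroupRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public Group GetGroup(int groupId)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand(
            "SELECT GroupID, Name FROM Groups WHERE GroupID = @id", connection))
        {
            command.Parameters.AddWithValue("@id", groupId);
            connection.Open();

            using (var reader = command.ExecuteReader())
            {
                // The entity is only populated here; it never touches the database itself.
                return reader.Read()
                    ? new Group(reader.GetInt32(0), reader.GetString(1))
                    : null;
            }
        }
    }
}

A GetGroups() method would follow the same shape, reading every row into a list.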
If you use an ORM, see:
https://stackoverflow.com/questions/3505/what-are-your-favorite-net-object-relational-mappers-orm
Lazy loading versus eager loading, which is what this really boils down to, is best determined by usage.
Mostly this means related entities -- if you are dealing with an individual address, for instance, splitting the read for the city from the read for the state would be crazy; on the other hand, when returning a list of company employees, reading their address information is probably a waste of time and memory.
Also, these aren't mutually exclusive options -- you can have a constructor that calls the database, and a constructor that uses provided data.
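A compact sketch of that idea, reusing the question's Group class; LoadNameFromDatabase is a hypothetical stand-in for whatever data access call you actually use:

public class Group
{
    public int GroupID { get; private set; }
    public string Name { get; private set; }

    // Uses data the caller already has, so no database round trip is needed.
    public Group(int groupID, string name)
    {
        GroupID = groupID;
        Name = name;
    }

    // Loads its own data from the database when only the ID is known.
    public Group(int groupID)
        : this(groupID, LoadNameFromDatabase(groupID))
    {
    }

    private static string LoadNameFromDatabase(int groupID)
    {
        // Query the database for the group's name here (ADO.NET, an ORM, ...).
        throw new System.NotImplementedException();
    }
}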
If it is a relational database, the best way would be to use an ORM (object-relational mapper). See here for a list of them:
https://en.wikipedia.org/wiki/List_of_object-relational_mapping_software
I have a set of subclassed domain objects that I fetch with Linq and NHibernate. Here's an example of what I have:
public abstract class Car {
    public abstract bool Runs();
}

public class Junker : Car {
    public override bool Runs() {
        return false;
    }
}

public class NewCar : Car {
    public override bool Runs() {
        return true;
    }
}
What I need to do is to fetch only the cars that Run(). So, I want to do this:
var goodCars = _session.Query<Car>().Where(car => car.Runs());
... but, that doesn't work because Runs() isn't a supported query source. Here's the error I get:
Cannot parse expression 'car' as it has an unsupported type. Only query sources (that is, expressions that implement IEnumerable) and query operators can be parsed.
I've tried separating the query into two steps: 1) get all cars, 2) filter by Runs() ... but I can't do this because it breaks lazy loading (my domain model is a bit more complex than my car example). Besides, I only want to fetch the items from the database that actually fit my query.
Is there a way to do what I'm trying to do?
You cannot do this. The thing you're trying to do is impossible to translate into a SQL query, and since that is ultimately all NHibernate is doing... no.
Getting them all with one query is going to require dropping down to the db level and using some non-domain knowledge. I recommend hiding this behind a service interface:
public interface RunningCars {
    IEnumerable All();
}
and implementing it with a custom SQL query or a stored procedure.
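For instance, a rough sketch of such an implementation using NHibernate's CreateSQLQuery; the Cars table, the CarType discriminator column and its 'NewCar' value are assumptions about your mapping:

using System.Collections;
using NHibernate;

public class SqlRunningCars : RunningCars
{
    private readonly ISession _session;

    public SqlRunningCars(ISession session)
    {
        _session = session;
    }

    public IEnumerable All()
    {
        // The "which cars run" knowledge lives in this SQL, not in the domain model.
        return _session.CreateSQLQuery("SELECT * FROM Cars WHERE CarType = 'NewCar'")
                       .AddEntity(typeof(Car))
                       .List<Car>();
    }
}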
How does doing it in two steps break lazy loading? Perhaps you need to specify that you want to pre-fetch those associations during the initial query.
Also in this specific example, couldn't you just fetch all instances of NewCar?
If you convert Runs to a read-only property,
public virtual bool Runs {get; private set;}
you can map it in your HBM and query the Runs property (a query sketch follows the two points below). But since this isn't your actual model, there isn't really a way to help guide you other than to say:
change the model
you cannot query object methods
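Once Runs is mapped as a persisted column (something has to populate it, of course), the query from the question becomes translatable, for example:

var goodCars = _session.Query<Car>().Where(car => car.Runs).ToList();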
I'm currently reading the book Pro ASP.NET MVC Framework. In the book, the author suggests using a repository pattern similar to the following.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true,
            IsDbGenerated = true,
            AutoSync = AutoSync.OnInsert)]
    public int ProductId { get; set; }

    [Column] public string Name { get; set; }
    [Column] public string Description { get; set; }
    [Column] public decimal Price { get; set; }
    [Column] public string Category { get; set; }
}

public interface IProductsRepository
{
    IQueryable<Product> Products { get; }
}

public class SqlProductsRepository : IProductsRepository
{
    private Table<Product> productsTable;

    public SqlProductsRepository(string connectionString)
    {
        productsTable = new DataContext(connectionString).GetTable<Product>();
    }

    public IQueryable<Product> Products
    {
        get { return productsTable; }
    }
}
Data is then accessed in the following manner:
public ViewResult List(string category)
{
    var productsInCategory = (category == null)
        ? productsRepository.Products
        : productsRepository.Products.Where(p => p.Category == category);

    return View(productsInCategory);
}
Is this an efficient means of accessing data? Is the entire table going to be retrieved from the database and filtered in memory or is the chained Where() method going to cause some LINQ magic to create an optimized query based on the lambda?
Finally, what other implementations of the Repository pattern in C# might provide better performance when hooked up via LINQ-to-SQL?
I can understand Johannes' desire to control the execution of the SQL more tightly, and with the implementation of what I sometimes call 'lazy anchor points' I have been able to do that in my app.
I use a combination of custom LazyList<T> and LazyItem<T> classes that encapsulate lazy initialization:
LazyList<T> wraps the IQueryable functionality of an IList collection but makes the most of LinqToSql's deferred execution features, and
LazyItem<T> wraps a lazy invocation of a single item, using either the LinqToSql IQueryable or a generic Func<T> for executing other code deferred.
Here is an example. I have this model object Announcement which may have an attached image or PDF document:
public class Announcement : //..
{
    public int ID { get; set; }
    public string Title { get; set; }
    public AnnouncementCategory Category { get; set; }
    public string Body { get; set; }
    public LazyItem<Image> Image { get; set; }
    public LazyItem<PdfDoc> PdfDoc { get; set; }
}
The Image and PdfDoc classes inherit from a type File that contains the byte[] holding the binary data. This binary data is heavy and I might not always need it returned from the DB every time I want an Announcement. So I want to keep my object graph 'anchored' but not 'populated' (if you like).
So if I do something like this:
Console.WriteLine(anAnnouncement.Title);
...I can do so knowing that I have only loaded from the db the data for the immediate Announcement object. But if on the following line I need to do this:
Console.WriteLine(anAnnouncement.Image.Inner.Width);
...I can be sure that the LazyItem<T> knows how to go and get the rest of the data.
Another great benefit is that these 'lazy' classes can hide the particular implementation of the underlying repository, so I don't necessarily have to be using LinqToSql. I am using LinqToSql in the case of the app I'm cutting examples from, but it would be easy to plug in another data source (or even a completely different data layer that perhaps does not use the Repository pattern).
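The actual LazyList<T> and LazyItem<T> classes are linked at the end of this answer; purely as an illustration of the shape such a wrapper can take (a simplified sketch, not the real implementation):

using System;

public class LazyItem<T>
{
    private readonly Func<T> _loader;
    private T _inner;
    private bool _loaded;

    public LazyItem(Func<T> loader)
    {
        _loader = loader;
    }

    public T Inner
    {
        get
        {
            if (!_loaded)
            {
                // Deferred: the LinqToSql query (or any other Func<T>) only runs
                // the first time Inner is accessed.
                _inner = _loader();
                _loaded = true;
            }
            return _inner;
        }
    }
}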
LINQ but not LinqToSql
You will find that sometimes you want to do some fancy LINQ query that happens to barf when execution flows down to the LinqToSql provider. That is because LinqToSql works by translating the effective LINQ query logic into T-SQL code, and sometimes that is not possible.
For example, I have this function that I want an IQueryable result from:
private IQueryable<Event> GetLatestSortedEvents()
{
    // TODO: WARNING: HEAVY SQL QUERY! fix
    return this.GetSortedEvents().ToList()
        .Where(ModelExtensions.Event.IsUpcomingEvent())
        .AsQueryable();
}
Why that code does not translate to SQL is not important; just believe me that the conditions in that IsUpcomingEvent() predicate involve a number of DateTime comparisons that are simply far too complicated for LinqToSql to convert to T-SQL.
By using .ToList(), then the condition (.Where(..)), and then .AsQueryable(), I'm effectively telling LinqToSql that I need all of the .GetSortedEvents() items even though I'm then going to filter them. This is an instance where my filter expression will not render to SQL correctly, so I need to filter it in memory. This is what I might call the limit of LinqToSql's performance as far as deferred execution and lazy loading go - but I only have a small number of these WARNING: HEAVY SQL QUERY! blocks in my app, and I think further smart refactoring could eliminate them completely.
Finally, LinqToSql can make a fine data access provider in large apps if you want it to. I found that to get the results I want, and to abstract away and isolate certain things, I've needed to add code here and there. And where I want more control over the actual SQL performance from LinqToSql, I've added smarts to get the desired results. So IMHO LinqToSql is perfectly OK for heavy apps that need db query optimization, provided you understand how LinqToSql works. My design was originally based on Rob's Storefront tutorial, so you might find it useful if you need more explanation about my rants above.
And if you want to use those lazy classes above, you can get them here and here.
Is this an efficient means of accessing data? Is the entire table going to be retrieved from the database and filtered in memory or is the chained Where() method going to cause some LINQ magic to create an optimized query based on the lambda?
It is efficient, if you wish to say so. The repository exposes an IQueryable interface, which basically represents any LINQ data provider (in this case Linq2Sql).
Queries are executed the moment you start iterating over the result.
IQueryable therefore supports query composition. You can add any .Where() or .GroupBy() or .OrderBy() call to a query and it will be satisfied by the database.
If you put an enumeration in your query, such as .ToList(), everything after that will happen in memory (LinqToObjects).
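To make that concrete with the question's repository (the "Toys" category and the price filter are just example values):

var query = productsRepository.Products
    .Where(p => p.Category == "Toys")   // composed into the SQL query
    .OrderBy(p => p.Name);              // still composed; nothing has executed yet

var list = query.ToList();              // the SQL query runs here, in the database

var expensive = list.Where(p => p.Price > 100m); // after ToList(): LinqToObjects, filtered in memory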
But I think the repository implementation is useless. I want my repository to control query execution, which is impossible when exposing IQueryable.
Yes, Linq2Sql will generate magic to make it more efficient. It depends on you using the IQueryable interface. If you want to check, attach the SQL profiler and you can see it generate the appropriate query.
I would recommend introducing a service layer to abstract away your dependency on Linq2Sql.
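A rough sketch of what such a service layer might look like on top of the question's IProductsRepository (the IProductService name and method are just for illustration):

using System.Collections.Generic;
using System.Linq;

public interface IProductService
{
    IList<Product> GetProductsByCategory(string category);
}

public class ProductService : IProductService
{
    private readonly IProductsRepository _repository;

    public ProductService(IProductsRepository repository)
    {
        _repository = repository;
    }

    public IList<Product> GetProductsByCategory(string category)
    {
        // The Where call composes onto the repository's IQueryable, so the filter
        // is translated to SQL; ToList() is what finally executes the query.
        return _repository.Products.Where(p => p.Category == category).ToList();
    }
}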
I've also read that book recently and this is the SQL generated when I ran the sample code:
SELECT [t1].[Category]
FROM ( SELECT DISTINCT [t0].[Category]
       FROM [Products] AS [t0] ) AS [t1]
ORDER BY [t1].[Category]
I don't think you can write anything more efficient given that database. However, in most real databases your categories would be in a separate table to keep things DRY.