C# Entity Framework IQueryable Memory Leak

We're seeing memory not being released with the following code on .NET Core:
class Program
{
    static void Main(string[] args)
    {
        while (true)
        {
            var testRunner = new TestRunner();
            testRunner.RunTest();
        }
    }
}

public class TestRunner
{
    public void RunTest()
    {
        using (var context = new EasyMwsContext())
        {
            var result = context.FeedSubmissionEntries.Where(fse => TestPredicate(fse)).ToList();
        }
    }

    public bool TestPredicate(FeedSubmissionEntry e)
    {
        return e.AmazonRegion == AmazonRegion.Europe && e.MerchantId == "1234";
    }
}
If I remove the .Where with the test predicate, I get a flat line as expected; with the predicate, memory rises indefinitely.
So while I can work around the problem, I'd like to understand what is happening.
EDIT:
Altering the line to:
public void RunTest()
{
    using (var context = new EasyMwsContext())
    {
        var result = context.FeedSubmissionEntries.ToList();
    }
}
Gives the graph:
So I don't believe this is due to client-side evaluation either?
EDIT 2:
Using EF Core 2.1.4
And the object heap:
Edit 3:
Added a retention graph; it seems to be an issue with EF Core?

I ended up running into the same issue. Once I knew what the problem was I was able to find a bug report for it here in the EntityFrameworkCore repository.
The short summary is that when you include an instance method in an IQueryable, the query gets cached, and the cache entry keeps the instance alive even after your context is disposed.
At this time it doesn't look like much progress has been made towards resolving the issue. I'll be keeping an eye on it, but for now I believe the best options for avoiding the memory leak are:
Rewrite your methods so no instance methods are included in your IQueryable
Convert the IQueryable to a list with ToList() before using LINQ methods that contain instance methods (not ideal if you're trying to limit the results of a database query)
Make the method you're calling static to limit how much memory piles up
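To illustrate the first and third workarounds, here is a minimal, self-contained sketch. The entity and names are hypothetical stand-ins, not the original EasyMws types: a static member that returns an expression tree captures no `this` reference, so a cached query can't keep the object that built it alive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the FeedSubmissionEntry entity from the question.
public class FeedEntry
{
    public string Region { get; set; }
    public string MerchantId { get; set; }
}

public static class Predicates
{
    // A static expression factory holds no reference to an instance,
    // so nothing keeps a containing object alive if the query is cached.
    public static Expression<Func<FeedEntry, bool>> EuropeMerchant(string merchantId)
        => e => e.Region == "Europe" && e.MerchantId == merchantId;
}

public static class Program
{
    public static void Main()
    {
        var data = new List<FeedEntry>
        {
            new FeedEntry { Region = "Europe", MerchantId = "1234" },
            new FeedEntry { Region = "NorthAmerica", MerchantId = "1234" },
        }.AsQueryable();

        // The expression tree is translatable by a LINQ provider and captures no 'this'.
        var result = data.Where(Predicates.EuropeMerchant("1234")).ToList();
        Console.WriteLine(result.Count); // prints 1
    }
}
```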

I suspect the culprit isn't a memory leak but a rather unfortunate addition to EF Core, Client Evaluation. Like LINQ-to-SQL, when faced with a lambda/function that can't be translated to SQL, EF Core will create a simpler query that reads more data and evaluate the function on the client.
In your case, EF Core can't know what TestPredicate is so it will read every record in memory and try to filter the data afterwards.
BTW that's what happened when SO moved to EF Core on Thursday, October 4, 2018. Instead of returning a few dozen rows, the query returned ... 52 million rows:
var answers = db.Posts
    .Where(p => grp.Select(g => g.PostId).Contains(p.Id))
    ...
    .ToList();
Client evaluation is optional but on by default. EF Core logs a warning each time client evaluation is performed, but that won't help if you haven't configured EF Core logging.
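If you haven't configured EF Core logging, a minimal sketch for surfacing those warnings might look like this. This assumes EF Core 2.x with the Microsoft.Extensions.Logging console provider, and the connection string is a placeholder; `UseLoggerFactory` attaches a logger factory to the context so the client-evaluation warning reaches the console.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Console;

public class EasyMwsContext : DbContext
{
    // Sketch only: the EF Core 2.x console-provider registration pattern.
    private static readonly LoggerFactory s_loggerFactory = new LoggerFactory(
        new[] { new ConsoleLoggerProvider((category, level) => level >= LogLevel.Warning, true) });

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder
            .UseSqlServer("...connection string...")
            .UseLoggerFactory(s_loggerFactory); // client-evaluation warnings now show on the console
    }
}
```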
The safe solution is to disable client-side evaluation, as shown in the Optional behavior: throw an exception for client evaluation section of the docs, either in each context's OnConfiguring method or globally in the Startup.cs configuration:
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .UseSqlServer(...)
        .ConfigureWarnings(warnings =>
            warnings.Throw(RelationalEventId.QueryClientEvaluationWarning));
}
UPDATE
A quick way to find out what's leaking is to take two memory snapshots in the Diagnostics window and check what new objects were created and how much memory they use. It's quite possible there's a bug in client evaluation.


Entity framework very slow to load for first time after every compilation

As the title suggests, I'm having a problem with the first query against a SQL Server database using Entity Framework. I have tried looking for an answer, but no one seems to have an actual solution to this.
The tests were done in Visual Studio 2012 using Entity Framework 6, and I also used the T4 views template to pre-compile the views. The database was SQL Server 2008. We have about 400 POCOs (400 mapping files) and only about 100 rows of data in the database tables.
Below is my test code and the result.
static void Main(string[] args)
{
    Stopwatch st = new Stopwatch();
    st.Start();
    new TestDbContext().Set<Table1>().FirstOrDefault();
    st.Stop();
    Console.WriteLine("First Time " + st.ElapsedMilliseconds + " milliseconds");

    st.Reset();
    st.Start();
    new TestDbContext().Set<Table1>().FirstOrDefault();
    st.Stop();
    Console.WriteLine("Second Time " + st.ElapsedMilliseconds + " milliseconds");
}
Test results
First Time 15480 milliseconds
Second Time 10 milliseconds
On the first query EF compiles the model. This can take some serious time for a model this large.
Here are 3 suggestions: http://www.fusonic.net/en/blog/2014/07/09/three-steps-for-fast-entityframework-6.1-first-query-performance/
A summary:
Using a cached db model store
Generate pre-compiled views
Generate a pre-compiled version of Entity Framework using NGen to avoid JIT costs
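For the NGen suggestion, the step is typically a single command. The path below assumes 64-bit .NET Framework 4.x on a default Windows install; adjust it to your environment, and run from an elevated prompt in the directory containing the assembly.

```shell
rem Pre-compile the EF assembly to native code so the first query skips JIT compilation.
%WINDIR%\Microsoft.NET\Framework64\v4.0.30319\ngen.exe install EntityFramework.dll
```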
I would also make sure that I compile the application in release mode when doing the benchmarks.
Another solution is to look at splitting the DbContext. 400 entities is a lot, and it would be nicer to work with smaller chunks. I haven't tried it, but I assume it would be possible to build the models one by one, meaning no single load takes 15 s. See this post by Julie Lerman: https://msdn.microsoft.com/en-us/magazine/jj883952.aspx
With EF Core, you can cheat and load the model early after you call services.AddDbContext (you can probably do something similar with EF6 too, but I haven't tested it).
services.AddDbContext<MyDbContext>(options => ...);

var options = services.BuildServiceProvider()
    .GetRequiredService<DbContextOptions<MyDbContext>>();

Task.Run(() =>
{
    using (var dbContext = new MyDbContext(options))
    {
        var model = dbContext.Model; // force the model creation
    }
});
This creates the DbContext's model on another thread while the rest of the application initializes (and perhaps other warmups run), so it is ready sooner. When you need it, EF Core will wait for the model to finish being built if it hasn't already. The model is shared across all DbContext instances, so it's fine to fire and forget this dummy context.
You can try something like this: (it worked for me)
protected void Application_Start()
{
    Start(() =>
    {
        using (EF.DMEntities context = new EF.DMEntities())
        {
            context.DMUsers.FirstOrDefault();
        }
    });
}

private void Start(Action a)
{
    a.BeginInvoke(null, null);
}
Entity Framework - First query slow
This worked for me:
using (MyEntities db = new MyEntities())
{
    db.Configuration.AutoDetectChangesEnabled = false; // <----- trick
    db.Configuration.LazyLoadingEnabled = false;       // <----- trick

    DateTime Created = DateTime.Now;
    var obj = from tbl in db.MyTable
              where DateTime.Compare(tbl.Created, Created) == 0
              select tbl;

    dataGrid1.ItemsSource = obj.ToList();
    dataGrid.Items.Refresh();
}
If you have many tables that are not being used from C#, exclude them.
Add a partial class with the following code, and call this function from OnModelCreating:
void ExcludedTables(DbModelBuilder modelBuilder)
{
    modelBuilder.Ignore<Table1>();
    modelBuilder.Ignore<Table>();
    // And so on
}
For me, just using AsParallel() in the first query solved the problem. This runs the query on multiple processor cores (apparently). All my subsequent queries are unchanged, it is only the first one which was causing the delay.
I also tried pre-generated mapping views https://learn.microsoft.com/en-us/ef/ef6/fundamentals/performance/pre-generated-views but this did not improve startup time by much.
I don't think that is a very good solution; plain ADO.NET seems to perform a lot better. However, that's just my opinion.
Alternatively, take a look at these:
https://msdn.microsoft.com/tr-tr/data/dn582034
https://msdn.microsoft.com/en-us/library/cc853327(v=vs.100).aspx

NHibernate session-per-thread

I am creating entities with multiple threads at the same time.
When I do this sequentially (with one thread) everything is fine, but when I introduce concurrency there is almost always a new exception.
I call this method asynchronously:
public void SaveNewData()
{
    // ...do some hard work...
    var data = new Data
    {
        LastKnownName = workResult.LastKnownName,
        MappedProperty = new MappedProperty
        {
            PropertyName = "SomePropertyName"
        }
    };
    m_repository.Save(data);
}
I already got this exception:
a different object with the same identifier value was already associated with the session: 3, of entity: TestConcurrency.MappedProperty
and also this one:
Flushing during cascade is dangerous
and of course my favourite one:
Session is closed!Object name: 'ISession'.
What I think is going on: every thread gets the same NHibernate session, and things go wrong because they all try to send queries through that one session.
For the NHibernate configuration I use NHibernateIntegration with Castle Windsor.
m_repository.Save(data) looks like:
public virtual void Save(object instance)
{
    using (ISession session = m_sessionManager.OpenSession())
    {
        Save(instance, session);
    }
}
where m_sessionManager is an ISessionManager injected into the constructor by Castle. Is there any way to force this ISessionManager to give me session-per-thread, or some other concurrency-safe session handling?
So I researched, and it seems that the NHibernateIntegration facility doesn't support this session management out of the box.
I solved it by switching to the new Castle.NHibernate.Facility, which supersedes Castle.NHibernateIntegration (please note that it is currently only a beta).
Castle.NHibernate.Facility supports session-per-transaction management, so it solved my problem completely.
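Independently of the Castle facility, the session-per-thread idea itself can be sketched with `ThreadLocal<T>`. Here `FakeSession` is a hypothetical stand-in for NHibernate's ISession, just to show that each thread gets its own instance from the factory delegate:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an NHibernate ISession; the point is only to
// demonstrate that ThreadLocal<T> hands each thread its own instance.
public class FakeSession
{
    public int OwnerThreadId { get; } = Thread.CurrentThread.ManagedThreadId;
}

public static class Program
{
    // The factory delegate runs once per thread that touches .Value.
    private static readonly ThreadLocal<FakeSession> s_session =
        new ThreadLocal<FakeSession>(() => new FakeSession());

    public static void Main()
    {
        // LongRunning forces each task onto its own dedicated thread.
        var t1 = Task.Factory.StartNew(() => s_session.Value, TaskCreationOptions.LongRunning);
        var t2 = Task.Factory.StartNew(() => s_session.Value, TaskCreationOptions.LongRunning);
        Task.WaitAll(t1, t2);

        // Different threads received different "session" instances.
        Console.WriteLine(ReferenceEquals(t1.Result, t2.Result)); // prints False
    }
}
```

A real implementation would also need to dispose each thread's session when its unit of work ends, which is exactly the bookkeeping the facility does for you.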

Entity Framework - Effect of MultipleActiveResultSets on Caching

So I have a class that looks something like the following. There is a thread that does some work using an Entity Framework Code First DbContext.
The problem I'm having is that the m_DB context seems to be caching data even though it should be disposed and recreated on every processing loop.
What I've seen is that some data in a relationship isn't present in the models loaded. If I kill and restart the process suddenly the data is found just like it should.
The only thing I can think of is this app is using the MultipleActiveResultSets=true in the database connection string, but I can't find anything stating clearly that this would cause the behavior I'm seeing.
Any insight would be appreciated.
public class ProcessingService
{
    private MyContext m_DB = null;
    private bool m_Run = true;

    private void ThreadLoop()
    {
        while (m_Run)
        {
            try
            {
                if (m_DB == null)
                    m_DB = new MyContext();

                ProcessingStepOne();
                ProcessingStepTwo();
            }
            catch (Exception ex)
            {
                // Log error
            }
            finally
            {
                if (m_DB != null)
                {
                    m_DB.Dispose();
                    m_DB = null;
                }
            }
        }
    }

    private void ProcessingStepOne()
    {
        // Do some work with m_DB
    }

    private void ProcessingStepTwo()
    {
        // Do some work with m_DB
    }
}
Multiple Active Result Sets (MARS) is a feature of SQL Server 2005/2008 and ADO.NET where one connection can be used by multiple active result sets (just as the name implies). Try switching it off in the connection string and observe the behaviour of the app; I'm guessing this could be the likely cause of your problem. Read the following MSDN link for more on MARS.
MSDN - Multiple Active Result Sets
Edit:
Try:
var results = from s in context.SomeEntity.AsNoTracking()
              where s.SomeProperty == someValue
              select s;
AsNoTracking() switches off internal change tracking of entities, and it should also force Entity Framework to reload entities every time.
Whatever is said and done, you will need some amount of refactoring, since there's obviously a design flaw in your code.
I hate answering my own question, especially when I don't have a good explanation of why it fixes the problem.
I ended up removing MARS and it did resolve my issue. The best explanation I have is this:
Always read to the end of results for procedural requests regardless of whether they return results or not, and for batches that return multiple results. (http://technet.microsoft.com/en-us/library/ms131686.aspx)
My application doesn't always read through all the results returned, so my theory is that this somehow caused data to be cached and reused by the new DbContext.

First query is slow and pre-generated views aren't being hit (probably)

I'm having a bit of trouble with the time it takes EF to pull some entities. The entity in question has a boatload of props that live in 1 table, but it also has a handful of ICollection's that relate to other tables. I've abandoned the idea of loading the entire object graph as it's way too much data and instead will have my Silverlight client send out a new request to my WCF service as details are needed.
After slimming down to 1 table's worth of stuff, it's taking roughly 8 seconds to pull the data, then another 1 second to .ToList() it up (I expect this to be < 1 second). I'm using the stopwatch class to take measurements. When I run the SQL query in SQL management studio, it takes only a fraction of a second so I'm pretty sure the SQL statement itself isn't the problem.
Here is how I am trying to query my data:
public List<ComputerEntity> FindClientHardware(string client)
{
    long time1 = 0;
    long time2 = 0;
    var stopwatch = System.Diagnostics.Stopwatch.StartNew();

    // Query construction always takes about 8 seconds, give or take a few ms.
    var entities = DbSet.Where(x => x.CompanyEntity.Name == client); // .AsNoTracking() has no impact on performance
    //.Include(x => x.CompanyEntity)
    //.Include(x => x.NetworkAdapterEntities) // <-- using these 4 includes has no impact on SQL performance, but it's faster to make lists without them
    //.Include(x => x.PrinterEntities)        // I've also abandoned the idea of using these as I don't want the entire object graph (although it would be nice)
    //.Include(x => x.WSUSSoftwareEntities)
    //var entities = Find(x => x.CompanyEntity.Name == client); // <-- another test, no impact on performance, same execution time

    stopwatch.Stop();
    time1 = stopwatch.ElapsedMilliseconds;

    stopwatch.Restart();
    var listify = entities.ToList(); // 1 second with the 1 table, over 5 seconds if I use all the includes.
    stopwatch.Stop();
    time2 = stopwatch.ElapsedMilliseconds;

    var showmethesql = entities.ToString();
    return listify;
}
I'm assuming that using the .Include means eager loading, although it isn't relevant in my current case as I just want the 1 table's worth of stuff. The SQL generated by this statement (which executes super fast in SSMS) is:
SELECT
[Extent1].[AssetID] AS [AssetID],
[Extent1].[ClientID] AS [ClientID],
[Extent1].[Hostname] AS [Hostname],
[Extent1].[ServiceTag] AS [ServiceTag],
[Extent1].[Manufacturer] AS [Manufacturer],
[Extent1].[Model] AS [Model],
[Extent1].[OperatingSystem] AS [OperatingSystem],
[Extent1].[OperatingSystemBits] AS [OperatingSystemBits],
[Extent1].[OperatingSystemServicePack] AS [OperatingSystemServicePack],
[Extent1].[CurrentUser] AS [CurrentUser],
[Extent1].[DomainRole] AS [DomainRole],
[Extent1].[Processor] AS [Processor],
[Extent1].[Memory] AS [Memory],
[Extent1].[Video] AS [Video],
[Extent1].[IsLaptop] AS [IsLaptop],
[Extent1].[SubnetMask] AS [SubnetMask],
[Extent1].[WINSserver] AS [WINSserver],
[Extent1].[MACaddress] AS [MACaddress],
[Extent1].[DNSservers] AS [DNSservers],
[Extent1].[FirstSeen] AS [FirstSeen],
[Extent1].[IPv4] AS [IPv4],
[Extent1].[IPv6] AS [IPv6],
[Extent1].[PrimaryUser] AS [PrimaryUser],
[Extent1].[Domain] AS [Domain],
[Extent1].[CheckinTime] AS [CheckinTime],
[Extent1].[ActiveComputer] AS [ActiveComputer],
[Extent1].[NetworkAdapterDescription] AS [NetworkAdapterDescription],
[Extent1].[DHCP] AS [DHCP]
FROM
[dbo].[Inventory_Base] AS [Extent1]
INNER JOIN [dbo].[Entity_Company] AS [Extent2]
ON [Extent1].[ClientID] = [Extent2].[ClientID]
WHERE
[Extent2].[CompanyName] = @p__linq__0
That is basically: select all columns in this table, join a second table that has a company name, and filter with a WHERE clause of companyName == the method's input value. The particular company I'm pulling only returns 75 records.
Disabling object tracking with .AsNoTracking() has zero impact on execution time.
I also gave the Find method a go, and it had the exact same execution time. The next thing I tried was to pregenerate the views in case the issue was there. I am using code first, so I used the EF power tools to do this.
This long period of time to run this query causes too long of a delay for my users. When I hand write the SQL code and don't touch EF, it is super quick. Any ideas as to what I'm missing?
Also, maybe related or not, but since I'm doing this in WCF which is stateless I assume absolutely nothing gets cached? The way I think about it is that every new call is a firing up this WCF service library for the first time, therefore there is no pre-existing cache. Is this an accurate assumption?
Update 1
So I ran this query twice within the same unit test to check out the cold/warm query thing. The first query is horrible as expected, but the 2nd one is lightning fast clocking in at 350ms for the whole thing. Since WCF is stateless, is every single call to my WCF service going to be treated as this first ugly-slow query? Still need to figure out how to get this first query to not suck.
Update 2
You know those pre-generated views I mentioned earlier? Well... I don't think they are being hit. I put a few breakpoints in the autogenerated-by-EF-powertools ReportingDbContext.Views.cs file, and they never get hit. This coupled with the cold/warm query performance I see, this sounds like it could be meaningful. Is there a particular way I need to pregenerate views with the EF power tools in a code first environment?
Got it! The core problem was the whole cold-query thing. How do you get around the cold-query issue? By making a query. This will "warm up" Entity Framework so that subsequent query compilation is much faster. My pre-generated views did nothing to help with the query I was compiling in this question, but they do seem to work if I want to dump an entire table to an array (a bad thing).
Since I am using WCF, which is stateless, will I have to "warm up" EF for every single call? Nope! Since EF lives in the app domain and not the context, I just need to do my warm up in the init of the service. For dev purposes I self host, but in production it lives in IIS.
To do the query warm up, I made a service behavior that takes care of this for me. Create your behavior class as such:
using System;
using System.Collections.ObjectModel;
using System.ServiceModel;
using System.ServiceModel.Channels; // for those without ReSharper, here are the "usings"
using System.ServiceModel.Description;

public class InitializationBehavior : Attribute, IServiceBehavior
{
    public InitializationBehavior()
    {
    }

    public void Validate(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase)
    {
    }

    public void AddBindingParameters(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase,
        Collection<ServiceEndpoint> endpoints, BindingParameterCollection bindingParameters)
    {
        Bootstrapper.WarmUpEF();
    }

    public void ApplyDispatchBehavior(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase)
    {
    }
}
I then used this to do the warmup:
public static class Bootstrapper
{
    public static int initialized = 0;

    public static void WarmUpEF()
    {
        using (var context = new ReportingDbContext())
        {
            context.Database.Initialize(false);
        }
        initialized = 9999; // I'll explain this
    }
}
This SO question helped with the warmup code:
How do I initialize my Entity Framework queries to speed them up?
You then slap this behavior on your WCF service like so:
[InitializationBehavior]
public class InventoryService : IInventoryService
{
// implement your service
}
I launched my services project in debug mode, which in turn fired up the initialization behavior. After spamming the method that makes the query referenced in my question, my breakpoint in the behavior wasn't hit again (other than when I first self-hosted it). I verified this by checking the static initialized variable. I then published this bad boy to IIS with my verification int, and it behaved exactly the same way.
So, in short, if you are using Entity Framework 5 with a WCF service and don't want a crappy first query, warm it up with a service behavior. There are probably other/better ways of doing this, but this way works too!
Edit:
If you are using NUnit and want to warm up EF for your unit tests, setup your test as such:
[TestFixture]
public class InventoryTests
{
    [SetUp]
    public void Init()
    {
        // Warm up EF.
        using (var context = new ReportingDbContext())
        {
            context.Database.Initialize(false);
        }
        // Init other stuff
    }

    // Tests go here
}

How can I use Sql CE 4 databases for functional tests

Due to the potential differences between Linq-to-Entities (EF4) and Linq-to-Objects, I need to use an actual database to make sure my query classes retrieve data from EF correctly. Sql CE 4 seems to be the perfect tool for this however I have run into a few hiccups. These tests are using MsTest.
The problem I have is if the database doesn't get recreated (due to model changes), data keeps getting added to the database after each test with nothing getting rid of the data. This can potentially cause conflicts in tests, with more data being returned by queries than intended.
My first idea was to initialize a TransactionScope in the TestInitialize method, and dispose the transaction in TestCleanup. Unfortunately, Sql CE4 does not support transactions.
My next idea was to delete the database in TestCleanup via a File.Delete() call. Unfortunately, this seems to not work after the first test is run, as the first test's TestCleanup seems to delete the database, but every test after the first does not seem to re-create the database, and thus it gives an error that the database file is not found.
I attempted to change the TestInitialize and TestCleanup attributes to ClassInitialize and ClassCleanup for my testing class, but that errored with a NullReferenceException because the test ran prior to ClassInitialize (or so it appears; ClassInitialize is in the base class, so maybe that's causing it).
I have run out of ways to effectively use Sql CE4 for testing. Does anyone have any better ideas?
Edit: I ended up figuring out a solution. In my EF unit test base class I initiate a new instance of my data context and then call context.Database.Delete() and context.Database.Create(). The unit tests run a tad slower, but now I can unit test effectively using a real database
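A sketch of that base class might look like the following. The names (`EfTestBase`, `MyDataContext`) are hypothetical stand-ins; `Database.Delete()` and `Database.Create()` are the code-first APIs the edit above refers to.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Sketch of the approach described above: every test starts from a freshly
// recreated database, so leftover rows can't bleed between tests.
[TestClass]
public abstract class EfTestBase
{
    protected MyDataContext Context;

    [TestInitialize]
    public void BaseInitialize()
    {
        Context = new MyDataContext();
        Context.Database.Delete(); // drop any data left by the previous test
        Context.Database.Create(); // recreate the database from the current model
    }

    [TestCleanup]
    public void BaseCleanup()
    {
        Context.Dispose();
    }
}
```

The recreate-per-test cost is what makes the tests "run a tad slower", as noted above.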
Final Edit: After some emails back and forth with Microsoft, it turns out that TransactionScopes are now allowed in SqlCE with the latest release of SqlCE. However, if you are using EF4 there are some limitations in that you must explicitly open the database connection prior to starting the transaction. The following code shows a sample on how to successfully use Sql CE for unit/functional testing:
[TestMethod]
public void My_SqlCeScenario()
{
    using (var context = new MySQLCeModelContext()) // <-- derived from DbContext
    {
        ObjectContext objctx = ((IObjectContextAdapter)context).ObjectContext;
        objctx.Connection.Open(); // <-- open your connection explicitly

        using (TransactionScope tx = new TransactionScope())
        {
            var product = new Product() { Name = "Vegemite" };
            context.Products.Add(product);
            context.SaveChanges();
            // No tx.Complete() here, so the changes roll back when the scope is disposed.
        }

        objctx.Connection.Close(); // <-- close it when done!
    }
}
In your TestInitialize you should do the following:
System.Data.Entity.Database.DbDatabase.SetInitializer<YourEntityFrameworkClass>(
    new System.Data.Entity.Database.DropCreateDatabaseAlways<YourEntityFrameworkClass>());
This will cause entity framework to always recreate the database whenever the test is run.
Incidentally you can create an alternative class that inherits from DropCreateDatabaseAlways. This will allow you to seed your database with set data each time.
public class DataContextInitializer : DropCreateDatabaseAlways<YourEntityFrameworkClass>
{
    protected override void Seed(DataContext context)
    {
        context.Users.Add(new User() { Name = "Test User 1", Email = "test@test.com" });
        context.SaveChanges();
    }
}
Then in your Initialize you would change the call to:
System.Data.Entity.Database.DbDatabase.SetInitializer<YourEntityFrameworkClass>(
    new DataContextInitializer());
I found the approach in the "final edit" works for me as well. However, it's REALLY annoying. It's not just for testing, but any time you want to use TransactionScope with Entity Framework and SQL CE. I want to code once and have my app support both SQL Server and SQL CE, but anywhere I use transactions I have to do this. Surely the Entity Framework team should have handled this for us!
In the meantime, I took it one step further to make it a little cleaner in my code. Add this block to your data context (whatever class you derive from DbContext):
public MyDataContext()
{
    this.Connection.Open();
}

protected override void Dispose(bool disposing)
{
    if (this.Connection.State == ConnectionState.Open)
        this.Connection.Close();

    base.Dispose(disposing);
}

private DbConnection Connection
{
    get
    {
        var objectContextAdapter = (IObjectContextAdapter)this;
        return objectContextAdapter.ObjectContext.Connection;
    }
}
This makes it a lot cleaner when you actually use it:
using (var db = new MyDataContext())
{
    using (var ts = new TransactionScope())
    {
        // whatever you need to do
        db.SaveChanges();
        ts.Complete();
    }
}
Although I suppose that if you design your app such that all changes are committed in a single call to SaveChanges(), then the implicit transaction would be good enough. For the testing scenario, we want to roll everything back instead of calling ts.Complete(), so it's certainly required there. I'm sure there are other scenarios where we need the transaction scope available. It's a shame it isn't supported directly by EF/SQLCE.
