I am already using transactions inside my repository functions in some cases because I sometimes need to insert data into two tables at once and I want the whole operation to fail if one of the inserts fails.
Now I ran into a situation where I had to wrap calls to multiple repositories / functions in another transaction, but when one of those functions already uses a transaction internally I get the error "The connection is already in a transaction and cannot participate in another transaction."
I do not want to remove the transaction from the repository function because this would mean that I have to know for which repository functions a transaction is required which I would then have to implement in the service layer. On the other hand, it seems like I cannot use repository functions in a transaction when they already use a transaction internally. Here is an example for where I am facing this problem:
// Reverse engineered classes
public partial class TblProject
{
public TblProject()
{
TblProjectStepSequences = new HashSet<TblProjectStepSequence>();
}
public int ProjectId { get; set; }
public virtual ICollection<TblProjectStepSequence> TblProjectStepSequences { get; set; }
}
public partial class TblProjectTranslation
{
public int ProjectId { get; set; }
public string Language { get; set; }
public string ProjectName { get; set; }
public virtual TblProject Project { get; set; }
}
public partial class TblProjectStepSequence
{
public int SequenceId { get; set; }
public int ProjectId { get; set; }
public int StepId { get; set; }
public int SequencePosition { get; set; }
public virtual TblStep Step { get; set; }
public virtual TblProject Project { get; set; }
}
// Creating a project in the ProjectRepository
public async Task<int> CreateProjectAsync(TblProject project, ...)
{
    using (var transaction = this.Context.Database.BeginTransaction())
    {
        await this.Context.TblProjects.AddAsync(project);
        await this.Context.SaveChangesAsync();
        // Insert translations... (project Id is required for this)
        await this.Context.SaveChangesAsync();
        transaction.Commit();
        // The generated key is written back to the tracked entity
        return project.ProjectId;
    }
}
// Creating the steps for a project in the StepRepository
public async Task<IEnumerable<int>> CreateProjectStepsAsync(int projectId, IEnumerable<TblProjectStepSequence> steps)
{
await this.Context.TblProjectStepSequences.AddRangeAsync(steps);
await this.Context.SaveChangesAsync();
return steps.Select(step =>
{
return step.SequenceId;
}
);
}
// Creating a project with its steps in the service layer
public async Task<int> CreateProjectWithStepsAsync(TblProject project, IEnumerable<TblProjectStepSequence> steps)
{
// This is basically a wrapper around Database.BeginTransaction() and IDbContextTransaction
using (Transaction transaction = await transactionService.BeginTransactionAsync())
{
int projectId = await projectRepository.CreateProjectAsync(project);
await stepRepository.CreateProjectStepsAsync(projectId, steps);
return projectId;
}
}
Is there a way to nest multiple transactions inside each other without the inner transactions having to know that there could be an outer transaction?
I know that it might not be technically possible to actually nest those transactions, but I still need a solution that uses either the repository's internal transaction or the outer one (if one exists), so that I cannot accidentally forget to use a transaction for repository functions that require one.
You could check the CurrentTransaction property and do something like this:
var transaction = Database.CurrentTransaction ?? Database.BeginTransaction();
If there is already a transaction, use it; otherwise start a new one.
Edit: Removed the using block, see comments. More logic is needed for committing or rolling back the transaction, though.
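The extra commit/rollback logic mentioned in the edit could look roughly like the sketch below. It assumes EF Core 3.0 or later (for BeginTransactionAsync, CommitAsync and async disposal of IDbContextTransaction). TransactionHelper and RunInTransactionAsync are hypothetical names, not part of EF Core. The idea is that only the code that started the transaction commits it, while an inner call simply takes part in whatever transaction is already open.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Storage;

// Hypothetical helper, not part of EF Core.
public static class TransactionHelper
{
    public static async Task<T> RunInTransactionAsync<T>(
        this DbContext context, Func<Task<T>> work)
    {
        // An outer caller already owns a transaction on this context:
        // run inside it and leave commit/rollback to that caller.
        if (context.Database.CurrentTransaction != null)
        {
            return await work();
        }

        // No transaction yet: this call owns it.
        await using IDbContextTransaction transaction =
            await context.Database.BeginTransactionAsync();

        T result = await work();
        await transaction.CommitAsync();
        return result;
        // If work() throws, the transaction is disposed without a commit,
        // which rolls it back.
    }
}
```

A repository method would wrap its body in Context.RunInTransactionAsync(async () => { ... }), and the service layer could do the same around several repository calls, provided all of them share one DbContext instance.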
I am answering the question you asked: "How to nest transactions in EF Core 6?"
Please note that this is a direct answer, not an evaluation of what is and is not best practice. There was a lot of discussion about best practices; it is valid to ask what fits your use case best, but that does not answer the question (keep in mind that Stack Overflow is a Q&A site where people want direct answers).
Having said that, let's continue with the topic.
Try this helper function for creating a new transaction:
public CommittableTransaction CreateTransaction()
=> new System.Transactions.CommittableTransaction(new TransactionOptions()
{
IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted
});
Using the Northwind database as example database, you can use it like:
public async Task<int?> CreateCategoryAsync(Categories category)
{
if (category?.CategoryName == null) return null;
using(var trans = CreateTransaction())
{
await this.Context.Categories.AddAsync(category);
await this.Context.SaveChangesAsync();
trans.Commit();
return category?.CategoryID;
}
}
And then you can call it from another function like:
/// <summary>Create or use existing category with associated products</summary>
/// <returns>Returns null if transaction was rolled back, else CategoryID</returns>
public async Task<int?> CreateProjectWithStepsAsync(Categories category)
{
using var trans = CreateTransaction();
int? catId = GetCategoryId(category.CategoryName)
?? await CreateCategoryAsync(category);
if (!catId.HasValue || string.IsNullOrWhiteSpace(category.CategoryName))
{
trans.Rollback(); return null;
}
var product1 = new Products()
{
ProductName = "Product A1", CategoryID = catId
};
await this.Context.Products.AddAsync(product1);
var product2 = new Products()
{
ProductName = "Product A2", CategoryID = catId
};
await this.Context.Products.AddAsync(product2);
await this.Context.SaveChangesAsync();
trans.Commit();
return catId;
}
To run this with LINQPad you need an entry point (and, of course, add the NuGet package EntityFramework 6.x via F4, then create an Entity Framework Core connection):
// Main method required for LinqPad
UserQuery Context;
async Task Main()
{
Context = this;
var category = new Categories()
{
CategoryName = "Category A1"
// CategoryName = ""
};
var catId = await CreateProjectWithStepsAsync(category);
Console.WriteLine((catId == null)
? "Transaction was aborted."
: "Transaction successful.");
}
This is just a simple example: it does not check whether a product with the same name already exists, it just creates a new one. You can implement that check easily; I have shown it for the categories in the function CreateProjectWithStepsAsync:
int? catId = GetCategoryId(category.CategoryName)
?? await CreateCategoryAsync(category);
First it queries the categories by name (via GetCategoryId(...)), and if the result is null it will create a new category (via CreateCategoryAsync(...)).
Also, you need to consider the isolation level: check System.Transactions.IsolationLevel to see whether the one used here (ReadCommitted) is right for you. It matches SQL Server's default; System.Transactions itself defaults to Serializable when no level is specified.
This creates a transaction explicitly; notice that here we have a transaction within a transaction.
Note:
I have used both forms of using, the older using statement with a block and the newer using declaration. Pick the one you like more.
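For comparison, here is a minimal sketch of nesting with System.Transactions.TransactionScope rather than CommittableTransaction. It is a swapped-in technique, not the code from this answer: a scope created with TransactionScopeOption.Required joins the ambient transaction if one exists and starts a new one otherwise. AmbientTransactionDemo and its methods are made-up names, and the example touches no database; it only compares the transaction identifiers seen by the outer and the inner scope. EF Core 2.1 and later can enlist in such an ambient transaction when the provider supports System.Transactions.

```csharp
using System;
using System.Transactions;

// Hypothetical demo class; it uses only System.Transactions, no database.
public static class AmbientTransactionDemo
{
    private static TransactionScope CreateScope() =>
        new TransactionScope(
            TransactionScopeOption.Required, // join the ambient transaction or start one
            new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted },
            TransactionScopeAsyncFlowOption.Enabled); // lets the scope flow across await

    // Returns the transaction identifier seen by the outer scope and by the inner scope.
    public static (string Outer, string Inner) NestedScopeIdentifiers()
    {
        using TransactionScope outer = CreateScope();
        string outerId = Transaction.Current.TransactionInformation.LocalIdentifier;

        string innerId = InnerOperation();

        outer.Complete(); // the commit happens when the outermost scope is disposed
        return (outerId, innerId);
    }

    // Knows nothing about the caller; it just asks for a "Required" scope.
    private static string InnerOperation()
    {
        using TransactionScope inner = CreateScope();
        string id = Transaction.Current.TransactionInformation.LocalIdentifier;
        inner.Complete(); // a vote to commit; without it the whole transaction aborts
        return id;
    }
}
```

The inner method needs no knowledge of the caller: called on its own it gets its own transaction, and called inside another scope it shares that scope's transaction. Both scopes must request the same isolation level, otherwise the inner constructor throws.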
Just don't call SaveChanges multiple times.
The problem is caused by calling SaveChanges multiple times to commit changes made to the DbContext instead of calling it just once at the end. It's simply not needed. A DbContext is a multi-entity Unit-of-Work. It doesn't even keep an open connection to the database. This allows 100-1000 times better throughput for the entire application by eliminating cross-connection blocking.
A DbContext tracks all modifications made to the objects it tracks and persists/commits them when SaveChanges is called using an internal transaction. To discard the changes, simply dispose the DbContext. That's why all examples show using a DbContext in a using block - that's actually the scope of the Unit-of-Work "transaction".
There's no need to "save" parent objects first. EF Core will take care of this itself inside SaveChanges.
Using the Blog/Posts example in the EF Core documentation tutorial :
public class BloggingContext : DbContext
{
public DbSet<Blog> Blogs { get; set; }
public DbSet<Post> Posts { get; set; }
public string DbPath { get; }
// The following configures EF Core to use a local SQL Server database named "tests".
protected override void OnConfiguring(DbContextOptionsBuilder options)
=> options.UseSqlServer($"Data Source=.;Initial Catalog=tests;Trusted_Connection=True; Trust Server Certificate=Yes");
}
public class Blog
{
public int BlogId { get; set; }
public string Url { get; set; }
public List<Post> Posts { get; } = new();
}
public class Post
{
public int PostId { get; set; }
public string Title { get; set; }
public string Content { get; set; }
public int BlogId { get; set; }
public Blog Blog { get; set; }
}
The following Program.cs will add a Blog with 5 posts but only call SaveChanges once at the end :
using (var db = new BloggingContext())
{
Blog blog = new Blog { Url = "http://blogs.msdn.com/adonet" };
IEnumerable<Post> posts = Enumerable.Range(0, 5)
.Select(i => new Post {
Title = $"Hello World {i}",
Content = "I wrote an app using EF Core!"
});
blog.Posts.AddRange(posts);
db.Blogs.Add(blog);
await db.SaveChangesAsync();
}
The code never specifies or retrieves the IDs. Add is an in-memory operation, so there's no reason to use AddAsync. Add starts tracking both the blog and the related Posts in the Added state.
The contents of the tables after this are:
select * from blogs;
select * from posts;
-----------------------
BlogId  Url
1       http://blogs.msdn.com/adonet

PostId  Title          Content                        BlogId
1       Hello World 0  I wrote an app using EF Core!  1
2       Hello World 1  I wrote an app using EF Core!  1
3       Hello World 2  I wrote an app using EF Core!  1
4       Hello World 3  I wrote an app using EF Core!  1
5       Hello World 4  I wrote an app using EF Core!  1
Executing the code twice will add another blog with another 5 posts.
PostId  Title          Content                        BlogId
1       Hello World 0  I wrote an app using EF Core!  1
2       Hello World 1  I wrote an app using EF Core!  1
3       Hello World 2  I wrote an app using EF Core!  1
4       Hello World 3  I wrote an app using EF Core!  1
5       Hello World 4  I wrote an app using EF Core!  1
6       Hello World 0  I wrote an app using EF Core!  2
7       Hello World 1  I wrote an app using EF Core!  2
8       Hello World 2  I wrote an app using EF Core!  2
9       Hello World 3  I wrote an app using EF Core!  2
10      Hello World 4  I wrote an app using EF Core!  2
Using SQL Server XEvents Profiler shows that these SQL calls are made:
exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [Blogs] ([Url])
VALUES (@p0);
SELECT [BlogId]
FROM [Blogs]
WHERE @@ROWCOUNT = 1 AND [BlogId] = scope_identity();
',N'@p0 nvarchar(4000)',@p0=N'http://blogs.msdn.com/adonet'
exec sp_executesql N'SET NOCOUNT ON;
DECLARE @inserted0 TABLE ([PostId] int, [_Position] [int]);
MERGE [Posts] USING (
VALUES (@p1, @p2, @p3, 0),
(@p4, @p5, @p6, 1),
(@p7, @p8, @p9, 2),
(@p10, @p11, @p12, 3),
(@p13, @p14, @p15, 4)) AS i ([BlogId], [Content], [Title], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([BlogId], [Content], [Title])
VALUES (i.[BlogId], i.[Content], i.[Title])
OUTPUT INSERTED.[PostId], i._Position
INTO @inserted0;
SELECT [i].[PostId] FROM @inserted0 i
ORDER BY [i].[_Position];
',N'@p1 int,@p2 nvarchar(4000),@p3 nvarchar(4000),@p4 int,@p5 nvarchar(4000),@p6 nvarchar(4000),@p7 int,@p8 nvarchar(4000),@p9 nvarchar(4000),@p10 int,@p11 nvarchar(4000),@p12 nvarchar(4000),@p13 int,@p14 nvarchar(4000),@p15 nvarchar(4000)',@p1=3,@p2=N'I wrote an app using EF Core!',@p3=N'Hello World 0',@p4=3,@p5=N'I wrote an app using EF Core!',@p6=N'Hello World 1',@p7=3,@p8=N'I wrote an app using EF Core!',@p9=N'Hello World 2',@p10=3,@p11=N'I wrote an app using EF Core!',@p12=N'Hello World 3',@p13=3,@p14=N'I wrote an app using EF Core!',@p15=N'Hello World 4'
The unusual SELECT and MERGE are used to ensure that IDENTITY values are returned in the order the objects were inserted, so EF Core can assign them to the object properties. After calling SaveChanges, all Blog and Post objects will have the correct database-generated IDs.
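Applied to the classes from the question, the same idea might look like the following sketch. It assumes the scaffolded TblProject and TblProjectStepSequence classes shown in the question and a DbContext that maps them; ProjectService is a hypothetical name. The steps are attached through the navigation property, so a single SaveChangesAsync call inserts the project and its steps in one implicit transaction.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical service; TblProject and TblProjectStepSequence are the question's classes.
public class ProjectService
{
    private readonly DbContext _context; // assumption: the scaffolded context from the question

    public ProjectService(DbContext context) => _context = context;

    public async Task<int> CreateProjectWithStepsAsync(
        TblProject project, IEnumerable<TblProjectStepSequence> steps)
    {
        // Attach the children through the navigation property
        // instead of copying ProjectId onto each step by hand.
        foreach (TblProjectStepSequence step in steps)
        {
            project.TblProjectStepSequences.Add(step);
        }

        _context.Add(project);             // tracks the project and, via the navigation, its steps
        await _context.SaveChangesAsync(); // one call, one implicit transaction

        return project.ProjectId;          // populated by EF Core after the insert
    }
}
```

With this shape there is no explicit transaction to nest, because nothing is written to the database before the single SaveChangesAsync call.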
I am having a strange performance issue with EF Core 6 and MySQL and am hoping you can help me spot the problem.
Here's my setup:
EF Core 6 / MySQL
Table per Hierarchy approach. Here's the hierarchy:
public class RealEstate : Property{
}
Repository pattern with UnitOfWork. Here it is:
public interface IUnitOfWork
{
IDataAccessLayer<Property> PropertyRepository { get; }
IDataAccessLayer<RealEstate> RealEstateRepository { get; }
}
Here's my database context:
public class MeerkatContext : IdentityDbContext<AppUser>
{
public MeerkatContext(DbContextOptions<MeerkatContext> options) : base(options)
{
}
public DbSet<Property> Property { get; set; }
public DbSet<RealEstate> RealEstate { get; set; }
}
I have the following index defined on the "property" table
I have 1 million records in the table.
Here's the issue:
The following query takes less than 1 second:
var count1 = await this._unitOfWork.PropertyRepository.CountAsync(x =>
x.CountryId == 1 && !x.IsBlocked && x.IsPublic);
This one takes 10 seconds:
var count2 = await this._unitOfWork.RealEstateRepository.CountAsync(x =>
x.CountryId == 1 && !x.IsBlocked && x.IsPublic);
I am stumped. Any help would be really appreciated.
Edited to show query execution in MySQL Workbench
Thanks
Since you probably want to search for unblocked items, change the logic to help with the SQL:
AND NOT IsBlocked -- inefficient
AND IsBlocked = 0 -- efficient (using "=" instead of "NOT")
AND NotBlocked -- efficient (flip the name and logic)
In general: avoid OR and NOT when constructing SQL. (This is an oversimplification.)
Then add this 4-column index to the table:
INDEX(CountryId, IsBlocked [or NotBlocked], IsPublic, CategoryName)
Look through the rest of the schema for similar changes.
So I'm just cutting my teeth on ASP.NET Core MVC pages and Entity Framework all at once.
I have an SQLite database that has 2 tables in it.
The tables are configured like so:
Table 1 was created as:
CREATE TABLE collections (collections_id INTEGER PRIMARY KEY AUTOINCREMENT,
    datetime VARCHAR NOT NULL,
    seed INTEGER NOT NULL);
Table 2 was created as:
CREATE TABLE samples (samples_id INTEGER PRIMARY KEY AUTOINCREMENT,
    datetime VARCHAR NOT NULL,
    collections_id INTEGER NOT NULL,
    FOREIGN KEY(collections_id) REFERENCES collections(collections_id));
In my web API project, I created two models:
namespace FPSTestAPI.Models
{
[Table("Collections")]
public class CollectionModel
{
[Key, Column("Collections_id")]
public long Id { get; set; }
[Column("datetime")]
public String Timestamp { get; set; }
public long Seed { get; set; }
public List<SampleModel> Samples { get; set; }
}
[Table("Samples")]
public class SampleModel
{
[Key, Column("samples_id")]
public long Id { get; set; }
[Column("datetime")]
public String TimeStamp { get; set; }
[ForeignKey("CollectionModel"), Column("Collections_id")]
public long Collection_Id { get; set; }
}
}
So I then created a DB Context to read the information from the SQLite Database and wired this into an MVC web API I'm writing.
I created a Controller to retrieve data from the DB and it seems to be PARTIALLY working with the following code.
[HttpGet]
public ActionResult<IEnumerable<CollectionModel>> Test()
{
List<CollectionModel> collections = _context.CollectionModel.ToList(); // Line 1
collections.ForEach(c => c.Samples = _context.SampleModel.Where(s => s.Collection_Id == c.Id).ToList()); // Line 2
return collections; // Line 3
}
If I comment out the line designated Line 2 above, it works fine and returns a list of all Data_Collection elements.
However, if I uncomment the line designated Line 2 above the application seems to get hung somewhere and never returns. I have debugged the application and if I stop my execution on the return statement (Line 3) and examine the "collections" object then I can see that it is properly filled with the expected data. It just never returns.
Does anyone have any idea why this would be failing?
You have navigation properties, which can be used for retrieving data in one roundtrip to database:
[HttpGet]
public ActionResult<IEnumerable<CollectionModel>> Test()
{
var collections = _context.CollectionModel
.Include(m => m.Samples)
.ToList();
return collections;
}
Here is the script which I run manually after creating the database to generate dummy data for testing:
DECLARE @index BIGINT
SET @index = 0
SET IDENTITY_INSERT Persons ON
WHILE @index < 50000
BEGIN
    INSERT INTO Persons
    (Id, [Name], Code)
    VALUES
    (NEWID(), CONCAT('Person', @index), @index)
    SET @index = @index + 1
END
How can I run it using EF core in the moment of the database initialization or somehow using data seeding methods? All the answers around are about small amounts of data, but in my case, I work with ~ 50000 records.
If you are using EF Core 2.1 or higher, the HasData method is an ideal way to add seed data.
We can call it on the ModelBuilder object in the OnModelCreating method to add the data as part of code-first migrations. The data is then seeded the first time the database is created or initialized and the migrations are applied.
You can also combine it with Bogus to generate fake data for entities.
Below is a scenario for creating 50000 objects, as stated in the question. The program executed successfully without any issues. In fact, EF Core is intelligent enough to split the data into batches of 700-800 objects each and push them to the database.
Entity:
public class Person
{
[Key]
[DatabaseGenerated(DatabaseGeneratedOption.Identity)]
public int Id { get; set; }
[Required]
public string FirstName { get; set; }
public string LastName { get; set; }
}
OnModelCreating
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
int id = 1;
var fakePersons = new Faker<Person>().StrictMode(true)
.RuleFor(o => o.Id, f => id++)
.RuleFor(u => u.FirstName, (f, u) => f.Name.FirstName())
.RuleFor(u => u.LastName, (f, u) => f.Name.LastName());
var persons = fakePersons.Generate(50000);
modelBuilder.Entity<Person>().HasData(persons);
}
Given the following set up where there are many Teams and there are many LeagueSessions. Each Team belongs to zero or more LeagueSessions but only ever one LeagueSession is active. LeagueSessions have many teams, and the teams will be repeated. Many-to-many relationship is established between Teams and LeagueSessions with a join table called TeamsSessions.
Team model looks like this:
public class Team
{
public string Id { get; set; }
public string Name { get; set; }
public League League { get; set; }
public string LeagueID { get; set; }
public bool Selected { get; set; }
public ICollection<Match> Matches { get; set; }
public virtual ICollection<TeamSession> TeamsSessions { get; set; }
}
Team model fluent api configuration:
public class TeamConfiguration
{
    public TeamConfiguration(EntityTypeBuilder<Team> model)
    {
        // The data for this model will be generated inside ThePLeagueDataCore.DataBaseInitializer.DatabaseBaseInitializer.cs class
        // When generating data for models in here, you have to provide it with an ID, and it became mildly problematic to consistently get
        // a unique ID for all the teams. In ThePLeagueDataCore.DataBaseInitializer.DatabaseBaseInitializer.cs we can use dbContext to generate
        // unique ids for us for each team.
        model.HasOne(team => team.League)
            .WithMany(league => league.Teams)
            .HasForeignKey(team => team.LeagueID);
    }
}
Each team belongs to a single League. League model looks like this:
public class League
{
    public string Id { get; set; }
    public string Type { get; set; }
    public string Name { get; set; }
    public IEnumerable<Team> Teams { get; set; }
    public bool Selected { get; set; }
    public string SportTypeID { get; set; }
    public SportType SportType { get; set; }
    public IEnumerable<LeagueSessionSchedule> Sessions { get; set; }
}
Fluent API for the League:
public LeagueConfiguration(EntityTypeBuilder<League> model)
{
model.HasOne(league => league.SportType)
.WithMany(sportType => sportType.Leagues)
.HasForeignKey(league => league.SportTypeID);
model.HasMany(league => league.Teams)
.WithOne(team => team.League)
.HasForeignKey(team => team.LeagueID);
model.HasData(leagues);
}
SessionScheduleBase class looks like this:
public class SessionScheduleBase
{
public string LeagueID { get; set; }
public bool ByeWeeks { get; set; }
public long? NumberOfWeeks { get; set; }
public DateTime SessionStart { get; set; }
public DateTime SessionEnd { get; set; }
public ICollection<TeamSession> TeamsSessions { get; set; } = new Collection<TeamSession>();
public ICollection<GameDay> GamesDays { get; set; } = new Collection<GameDay>();
}
Note: LeagueSessionSchedule inherits from SessionScheduleBase
The TeamSession model looks like this:
public class TeamSession
{
public string Id { get; set; }
public string TeamId { get; set; }
public Team Team { get; set; }
public string LeagueSessionScheduleId { get; set; }
public LeagueSessionSchedule LeagueSessionSchedule { get; set; }
}
I then configure the relationship with the fluent API like this:
public TeamSessionConfiguration(EntityTypeBuilder<TeamSession> model)
{
model.HasKey(ts => new { ts.TeamId, ts.LeagueSessionScheduleId });
model.HasOne(ts => ts.Team)
.WithMany(t => t.TeamsSessions)
.HasForeignKey(ts => ts.TeamId);
model.HasOne(ts => ts.LeagueSessionSchedule)
.WithMany(s => s.TeamsSessions)
.HasForeignKey(ts => ts.LeagueSessionScheduleId);
}
The problem arises whenever I attempt to insert a new LeagueSessionSchedule. The way I am adding a new TeamSession object onto the new LeagueSessionSchedule is like this:
foreach (TeamSessionViewModel teamSession in newSchedule.TeamsSessions)
{
Team team = await this._teamRepository.GetByIdAsync(teamSession.TeamId, ct);
if(team != null)
{
TeamSession newTeamSession = new TeamSession()
{
Team = team,
LeagueSessionSchedule = leagueSessionSchedule
};
leagueSessionSchedule.TeamsSessions.Add(newTeamSession);
}
}
Saving the new LeagueSessionSchedule code:
public async Task<LeagueSessionSchedule> AddScheduleAsync(LeagueSessionSchedule newLeagueSessionSchedule, CancellationToken ct = default)
{
this._dbContext.LeagueSessions.Add(newLeagueSessionSchedule);
await this._dbContext.SaveChangesAsync(ct);
return newLeagueSessionSchedule;
}
Saving the new LeagueSessionSchedule object makes Entity Framework Core throw an error saying that it cannot INSERT a duplicate primary key value into the dbo.Teams table. I have no idea why it's attempting to insert into the dbo.Teams table and not into the TeamsSessions table.
ERROR:
INSERT INTO [LeagueSessions] ([Id], [Active], [ByeWeeks], [LeagueID], [NumberOfWeeks], [SessionEnd], [SessionStart])
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6);
INSERT INTO [Teams] ([Id], [Discriminator], [LeagueID], [Name], [Selected])
VALUES (@p7, @p8, @p9, @p10, @p11),
(@p12, @p13, @p14, @p15, @p16),
(@p17, @p18, @p19, @p20, @p21),
(@p22, @p23, @p24, @p25, @p26),
(@p27, @p28, @p29, @p30, @p31),
(@p32, @p33, @p34, @p35, @p36),
(@p37, @p38, @p39, @p40, @p41),
(@p42, @p43, @p44, @p45, @p46);
System.Data.SqlClient.SqlException (0x80131904): Violation of PRIMARY KEY constraint 'PK_Teams'. Cannot insert duplicate key in object 'dbo.Teams'. The duplicate key value is (217e2e11-0603-4239-aab5-9e2f1d3ebc2c).
My goal is to create a new LeagueSessionSchedule object. Along with the creation of this object, I also have to create a new TeamSession entry to the join table (or not if join table is not necessary) to then be able to pick any given team and see what session it is currently a part of.
My entire PublishSchedule method is the following:
public async Task<bool> PublishSessionsSchedulesAsync(List<LeagueSessionScheduleViewModel> newLeagueSessionsSchedules, CancellationToken ct = default(CancellationToken))
{
List<LeagueSessionSchedule> leagueSessionOperations = new List<LeagueSessionSchedule>();
foreach (LeagueSessionScheduleViewModel newSchedule in newLeagueSessionsSchedules)
{
LeagueSessionSchedule leagueSessionSchedule = new LeagueSessionSchedule()
{
Active = newSchedule.Active,
LeagueID = newSchedule.LeagueID,
ByeWeeks = newSchedule.ByeWeeks,
NumberOfWeeks = newSchedule.NumberOfWeeks,
SessionStart = newSchedule.SessionStart,
SessionEnd = newSchedule.SessionEnd
};
// leagueSessionSchedule = await this._sessionScheduleRepository.AddScheduleAsync(leagueSessionSchedule, ct);
// create game day entry for all configured game days
foreach (GameDayViewModel gameDay in newSchedule.GamesDays)
{
GameDay newGameDay = new GameDay()
{
GamesDay = gameDay.GamesDay
};
// leagueSessionSchedule.GamesDays.Add(newGameDay);
// create game time entry for every game day
foreach (GameTimeViewModel gameTime in gameDay.GamesTimes)
{
GameTime newGameTime = new GameTime()
{
GamesTime = DateTimeOffset.FromUnixTimeSeconds(gameTime.GamesTime).DateTime.ToLocalTime(),
// GameDayId = newGameDay.Id
};
// newGameTime = await this._sessionScheduleRepository.AddGameTimeAsync(newGameTime, ct);
newGameDay.GamesTimes.Add(newGameTime);
}
leagueSessionSchedule.GamesDays.Add(newGameDay);
}
// update teams sessions
foreach (TeamSessionViewModel teamSession in newSchedule.TeamsSessions)
{
// retrieve the team with the corresponding id
Team team = await this._teamRepository.GetByIdAsync(teamSession.TeamId, ct);
if(team != null)
{
TeamSession newTeamSession = new TeamSession()
{
Team = team,
LeagueSessionSchedule = leagueSessionSchedule
};
leagueSessionSchedule.TeamsSessions.Add(newTeamSession);
}
}
// update matches for this session
foreach (MatchViewModel match in newSchedule.Matches)
{
Match newMatch = new Match()
{
DateTime = match.DateTime,
HomeTeamId = match.HomeTeam.Id,
AwayTeamId = match.AwayTeam.Id,
LeagueID = match.LeagueID
};
leagueSessionSchedule.Matches.Add(newMatch);
}
try
{
leagueSessionOperations.Add(await this._sessionScheduleRepository.AddScheduleAsync(leagueSessionSchedule, ct));
}
catch(Exception ex)
{
}
}
// ensure all leagueSessionOperations did not return any null values
return leagueSessionOperations.All(op => op != null);
}
This is not a many-to-many relationship.
It is two separate one-to-many relationships, which happen to refer to the same table on one end of the relationship.
While it is true that on the database level, both use cases are represented by three tables, i.e. Foo 1->* FooBar *<-1 Bar, these two cases are treated differently by Entity Framework's automated behavior - and this is very important.
EF only handles the cross table for you if it is a direct many-to-many, e.g.
public class Foo
{
public virtual ICollection<Bar> Bars { get; set; }
}
public class Bar
{
public virtual ICollection<Foo> Foos { get; set; }
}
EF handles the cross table behind the scenes, and you are never made aware of the existence of the cross table (from the code perspective).
Importantly, at the time of writing EF Core did not support implicit cross tables (direct many-to-many mapping without a join entity was only added in EF Core 5.0). Even where it is available, you're not using it, so the answer to your problem remains the same regardless of whether you're using EF or EF Core.
However, you have defined your own cross table. While this is still representative of a many-to-many relationship in database terms, it has ceased to be a many-to-many relationship as far as EF is concerned, and any documentation you find on EF's many-to-many relationships no longer applies to your scenario.
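To make the contrast with an explicit cross table concrete, here is a minimal sketch of a direct many-to-many mapping. It assumes EF Core 5.0 or later and the Microsoft.EntityFrameworkCore.InMemory provider package; ManyToManyContext, ManyToManyDemo, and the database name are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Foo
{
    public int Id { get; set; }
    public ICollection<Bar> Bars { get; set; } = new List<Bar>();
}

public class Bar
{
    public int Id { get; set; }
    public ICollection<Foo> Foos { get; set; } = new List<Foo>();
}

// Hypothetical context. The in-memory provider is an assumption so the sketch runs without a server.
public class ManyToManyContext : DbContext
{
    public DbSet<Foo> Foos { get; set; }
    public DbSet<Bar> Bars { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseInMemoryDatabase("many-to-many-demo");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
        // No FooBar class anywhere: EF manages the cross table itself (EF Core 5.0+).
        => modelBuilder.Entity<Foo>().HasMany(f => f.Bars).WithMany(b => b.Foos);
}

public static class ManyToManyDemo
{
    // Links one Foo to one Bar and returns how many Bars the reloaded Foo has.
    public static int LinkFooToBar()
    {
        int fooId;
        using (var write = new ManyToManyContext())
        {
            var foo = new Foo();
            foo.Bars.Add(new Bar());
            write.Foos.Add(foo);
            write.SaveChanges();
            fooId = foo.Id;
        }

        using (var read = new ManyToManyContext())
        {
            return read.Foos.Include(f => f.Bars).Single(f => f.Id == fooId).Bars.Count;
        }
    }
}
```

As soon as a class such as TeamSession sits between the two ends, this mapping no longer applies and the relationship is configured as two one-to-many relationships instead.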
Unattached but indirectly added objects are assumed to be new.
By "indirectly added", I mean that it was added to the context as part of another entity (which you directly added to the context). In the following example, foo is directly added and bar is indirectly added:
var foo = new Foo();
var bar = new Bar();
foo.Bar = bar;
context.Foos.Add(foo); // directly adding foo
// ... but not bar
context.SaveChanges();
When you add (and commit) a new entity to the context, EF adds it for you. However, EF also looks at any related entities that the first entity contains. During the commit in the above example, EF will look at both the foo and bar entities and will handle them accordingly. EF is smart enough to realize that you want bar to be stored in the database since you put it inside the foo object and you explicitly asked EF to add foo to the database.
It is important to realize that you've told EF that foo should be created (since you called Add(), which implies a new item), but you never told EF what it should do with bar. It's unclear (to EF) what you expect EF to do with this, and thus EF is left guessing at what to do.
If you never explained to EF whether bar already exists or not, Entity Framework defaults to assuming it needs to create this entity in the database.
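The guess EF makes can be observed through the change tracker. The following is a minimal sketch, assuming EF Core with the Microsoft.EntityFrameworkCore.InMemory provider package; DemoContext, IndirectAddDemo, and the database name are hypothetical names, and the Foo and Bar classes are reduced to what the example needs.

```csharp
using Microsoft.EntityFrameworkCore;

public class Bar
{
    public int Id { get; set; }
}

public class Foo
{
    public int Id { get; set; }
    public Bar Bar { get; set; }
}

// Hypothetical context. The in-memory provider is an assumption so the sketch runs without a server.
public class DemoContext : DbContext
{
    public DbSet<Foo> Foos { get; set; }
    public DbSet<Bar> Bars { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseInMemoryDatabase("indirect-add-demo");
}

public static class IndirectAddDemo
{
    // Returns the state EF assigns to bar, which is only reachable through foo.
    public static EntityState StateOfIndirectlyAddedBar()
    {
        using var context = new DemoContext();

        var bar = new Bar { Id = 42 };   // pretend this row already exists in the database
        var foo = new Foo { Bar = bar };

        context.Foos.Add(foo);           // only foo is added directly

        // bar was never mentioned to the context, yet it is now tracked as Added,
        // so SaveChanges() would try to INSERT it.
        return context.Entry(bar).State;
    }
}
```

Calling IndirectAddDemo.StateOfIndirectlyAddedBar() returns EntityState.Added, even though bar already carries a key value.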
Saving the new LeagueSessionSchedule object throws an error by Entity Framework Core that it cannot INSERT a duplicate primary key value into the dbo.Teams table. I have no idea why it's attempting to add to dbo.Teams table
Knowing what you now know, the error becomes clearer. EF is trying to add this team (which was the bar object in my example) because it has no information on this team object and what its state in the database is.
There are a few solutions here.
1. Use the FK property instead of the navigational property
This is my preferred solution because it leaves no room for error. If the team ID does not yet exist, you get an error. At no point will EF try to create a team, since it doesn't even know the team's data, it only knows the (alleged) ID you're trying to create a relationship with.
Note: I am omitting LeagueSessionSchedule as it is unrelated to the current error - but it's essentially the same behavior for both Team and LeagueSessionSchedule.
TeamSession newTeamSession = new TeamSession()
{
TeamId = team.Id
};
By using the FK property instead of the navigation property, you hand EF only a key value to reference rather than a Team object whose state it would have to guess, and therefore EF no longer tries to (re)create this team.
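Applied to the loop from the question, solution 1 might look like the sketch below. It assumes EF Core with the Microsoft.EntityFrameworkCore.InMemory provider package; the entity classes are trimmed to the relevant properties, and LeagueContext, FkOnlyDemo, and the database name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Team
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class LeagueSessionSchedule
{
    public string Id { get; set; }
    public ICollection<TeamSession> TeamsSessions { get; set; } = new List<TeamSession>();
}

public class TeamSession
{
    public string TeamId { get; set; }
    public Team Team { get; set; }
    public string LeagueSessionScheduleId { get; set; }
    public LeagueSessionSchedule LeagueSessionSchedule { get; set; }
}

// Hypothetical context. The in-memory provider is an assumption so the sketch runs without a server.
public class LeagueContext : DbContext
{
    public DbSet<Team> Teams { get; set; }
    public DbSet<LeagueSessionSchedule> LeagueSessions { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseInMemoryDatabase("fk-only-demo");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<TeamSession>()
            .HasKey(ts => new { ts.TeamId, ts.LeagueSessionScheduleId });
}

public static class FkOnlyDemo
{
    // Builds a schedule whose TeamSessions carry only TeamId values and returns
    // how many Team entities the context ends up tracking.
    public static int TrackedTeamsAfterAdd(IEnumerable<string> teamIds)
    {
        using var context = new LeagueContext();
        var schedule = new LeagueSessionSchedule { Id = Guid.NewGuid().ToString() };

        foreach (var teamId in teamIds)
        {
            // Only the FK is set; the Team navigation property stays null.
            schedule.TeamsSessions.Add(new TeamSession { TeamId = teamId });
        }

        context.LeagueSessions.Add(schedule);

        // No Team object is in the graph, so nothing is queued for INSERT into Teams.
        return context.ChangeTracker.Entries<Team>().Count();
    }
}
```

With this shape the repository lookup is only needed if you want to validate the ID before saving; the context itself never sees a Team instance.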
2. Ensure that the team is tracked by the current context
Note: I am omitting LeagueSessionSchedule as it is unrelated to the current error - but it's essentially the same behavior for both Team and LeagueSessionSchedule.
context.Teams.Attach(team);
TeamSession newTeamSession = new TeamSession()
{
Team = team
};
By attaching the object to the context, you are informing it of its existence. The default state of a newly attached entity is Unchanged, meaning "this already exists in the database and has not been changed - so you don't need to update it when we commit the context".
If you have actually made changes to your team that you want to be updated during commit, you should instead use:
context.Entry(team).State = EntityState.Modified;
Setting the state through Entry() also starts tracking the entity, and the Modified state ensures that the new values will be committed to the database when you call SaveChanges().
Note that I prefer solution 1 over solution 2 because it's foolproof and much less likely to lead to unexpected behavior or runtime exceptions.
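For solution 2, the effect of attaching can be checked through the change tracker. This is a minimal sketch, assuming EF Core with the Microsoft.EntityFrameworkCore.InMemory provider package; the entities are simplified (TeamSession gets a plain int key here), and AttachContext, AttachDemo, and the database name are hypothetical.

```csharp
using Microsoft.EntityFrameworkCore;

public class Team
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class TeamSession
{
    public int Id { get; set; }
    public string TeamId { get; set; }
    public Team Team { get; set; }
}

// Hypothetical context. The in-memory provider is an assumption so the sketch runs without a server.
public class AttachContext : DbContext
{
    public DbSet<Team> Teams { get; set; }
    public DbSet<TeamSession> TeamSessions { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseInMemoryDatabase("attach-demo");
}

public static class AttachDemo
{
    // Returns the team's state right after Attach, after a TeamSession that
    // references it has been added, and after it is explicitly marked Modified.
    public static (EntityState Attached, EntityState AfterAdd, EntityState Marked) TeamStates()
    {
        using var context = new AttachContext();

        // Stands in for a team that was loaded elsewhere and is not tracked by this context.
        var team = new Team { Id = "team-1", Name = "Existing team" };

        context.Teams.Attach(team);
        var attached = context.Entry(team).State;   // Unchanged

        context.TeamSessions.Add(new TeamSession { Team = team });
        var afterAdd = context.Entry(team).State;   // still Unchanged: Add leaves tracked entities alone

        context.Entry(team).State = EntityState.Modified;
        var marked = context.Entry(team).State;     // Modified: an UPDATE would be issued on save

        return (attached, afterAdd, marked);
    }
}
```

The order matters: the team has to be attached before the entity that references it is added, otherwise it is picked up as Added along with the rest of the graph.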
String primary keys are undesirable
I'm not going to say that it doesn't work, but strings cannot be autogenerated by Entity Framework, making them undesirable as the type of your entity's PK. You will need to manually set your entity PK values.
Like I said, it's not impossible, but your code shows that you're not explicitly setting PK values:
if(team != null)
{
TeamSession newTeamSession = new TeamSession()
{
Team = team,
LeagueSessionSchedule = leagueSessionSchedule
};
leagueSessionSchedule.TeamsSessions.Add(newTeamSession);
}
If you want your PKs to be automatically generated, use an appropriate type. int and Guid are by far the most commonly used types for this.
Otherwise, you're going to have to start setting your own PK values, because if you don't (and the Id value thus defaults to null), your code is going to fail when you add a second TeamSession object using the above code (even though you're doing everything else correctly), since PK null is already taken by the first entity you added to the table.
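As an illustration of the point about key types, the sketch below uses a Guid key and never sets it by hand. It assumes EF Core with the Microsoft.EntityFrameworkCore.InMemory provider package; SessionRecord, GuidKeyContext, GuidKeyDemo, and the database name are hypothetical and not part of the model in the question.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity with a Guid primary key (picked up by the "Id" naming convention).
public class SessionRecord
{
    public Guid Id { get; set; }
    public string Note { get; set; }
}

// Hypothetical context. The in-memory provider is an assumption so the sketch runs without a server.
public class GuidKeyContext : DbContext
{
    public DbSet<SessionRecord> SessionRecords { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseInMemoryDatabase("guid-key-demo");
}

public static class GuidKeyDemo
{
    // Adds an entity without assigning Id and returns the key value EF generated for it.
    public static Guid AddWithoutSettingKey()
    {
        using var context = new GuidKeyContext();
        var record = new SessionRecord { Note = "key left unset" };

        context.SessionRecords.Add(record);   // EF fills in a fresh Guid while tracking the entity
        context.SaveChanges();

        return record.Id;
    }
}
```

Two calls to GuidKeyDemo.AddWithoutSettingKey() yield two different non-empty Guid values, so consecutive inserts do not collide on the key.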