Seeding a huge amount of data (EF Core) - C#

Here is the script I run manually after creating the database to generate dummy data for testing:
DECLARE @index BIGINT
SET @index = 0
SET IDENTITY_INSERT Persons ON
WHILE @index < 50000
BEGIN
INSERT INTO Persons
(Id, [Name], Code)
VALUES
(NEWID(), CONCAT('Person', @index), @index)
SET @index = @index + 1
END
How can I run it with EF Core at the moment the database is initialized, or via the data seeding methods? All the answers I have found deal with small amounts of data, but in my case I am working with ~50,000 records.

If you are using EF Core 2.1 or higher, the HasData method is an ideal way to add seed data.
You call it on the ModelBuilder object in the OnModelCreating method to add the data as part of code-first migrations. The data then gets seeded the first time the database is created or initialized and the migrations are applied.
You can also combine it with Bogus to generate fake data for entities.
Below is a scenario that creates the 50,000 objects stated in the question. The program executed successfully without any issues; in fact, EF Core is intelligent enough to split the data into batched queries of roughly 700-800 rows each and push them to the database.
Entity:
public class Person
{
[Key]
[DatabaseGenerated(DatabaseGeneratedOption.Identity)]
public int Id { get; set; }
[Required]
public string FirstName { get; set; }
public string LastName { get; set; }
}
OnModelCreating
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
int id = 1;
var fakePersons = new Faker<Person>().StrictMode(true)
.RuleFor(o => o.Id, f => id++)
.RuleFor(u => u.FirstName, (f, u) => f.Name.FirstName())
.RuleFor(u => u.LastName, (f, u) => f.Name.LastName());
var persons = fakePersons.Generate(50000);
modelBuilder.Entity<Person>().HasData(persons);
}
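If you would rather keep the original T-SQL script instead of materialising 50,000 HasData rows in the model, one possible alternative (a sketch, not part of the answer above; the migration class name is made up) is to run the script from an empty migration so it executes once when migrations are applied:

using Microsoft.EntityFrameworkCore.Migrations;

// Hypothetical migration, e.g. created with "dotnet ef migrations add SeedPersons"
public partial class SeedPersons : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Runs the original T-SQL batch once, when the migration is applied
        // (via "dotnet ef database update" or context.Database.Migrate()).
        // Keep the SET IDENTITY_INSERT lines from the question if your Id column is an identity column.
        migrationBuilder.Sql(@"
DECLARE @index BIGINT = 0;
WHILE @index < 50000
BEGIN
    INSERT INTO Persons (Id, [Name], Code)
    VALUES (NEWID(), CONCAT('Person', @index), @index);
    SET @index = @index + 1;
END");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        // Best-effort cleanup of the generated test rows on rollback.
        migrationBuilder.Sql("DELETE FROM Persons WHERE [Name] LIKE 'Person%';");
    }
}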


How to nest transactions in EF Core 6?

I am already using transactions inside my repository functions in some cases because I sometimes need to insert data into two tables at once and I want the whole operation to fail if one of the inserts fails.
Now I have run into a situation where I have to wrap calls to multiple repositories/functions in another transaction, but when one of those functions already uses a transaction internally I get the error "The connection is already in a transaction and cannot participate in another transaction".
I do not want to remove the transaction from the repository function because this would mean that I have to know for which repository functions a transaction is required which I would then have to implement in the service layer. On the other hand, it seems like I cannot use repository functions in a transaction when they already use a transaction internally. Here is an example for where I am facing this problem:
// Reverse engineered classes
public partial class TblProject
{
public TblProject()
{
TblProjectStepSequences = new HashSet<TblProjectStepSequence>();
}
public int ProjectId { get; set; }
public virtual ICollection<TblProjectStepSequence> TblProjectStepSequences { get; set; }
}
public partial class TblProjectTranslation
{
public int ProjectId { get; set; }
public string Language { get; set; }
public string ProjectName { get; set; }
public virtual TblProject Project { get; set; }
}
public partial class TblProjectStepSequence
{
public int SequenceId { get; set; }
public int ProjectId { get; set; }
public int StepId { get; set; }
public int SequencePosition { get; set; }
public virtual TblStep Step { get; set; }
public virtual TblProject Project { get; set; }
}
// Creating a project in the ProjectRepository
public async Task<int> CreateProjectAsync(TblProject project, ...)
{
using (var transaction = this.Context.Database.BeginTransaction())
{
await this.Context.TblProjects.AddAsync(project);
await this.Context.SaveChangesAsync();
// Insert translations... (project Id is required for this)
await this.Context.SaveChangesAsync();
transaction.Commit();
return project.ProjectId;
}
}
// Creating the steps for a project in the StepRepository
public async Task<IEnumerable<int>> CreateProjectStepsAsync(int projectId, IEnumerable<TblProjectStepSequence> steps)
{
await this.Context.TblProjectStepSequences.AddRangeAsync(steps);
await this.Context.SaveChangesAsync();
return steps.Select(step =>
{
return step.SequenceId;
}
);
}
// Creating a project with its steps in the service layer
public async Task<int> CreateProjectWithStepsAsync(TblProject project, IEnumerable<TblProjectStepSequence> steps)
{
// This is basically a wrapper around Database.BeginTransaction() and IDbContextTransaction
using (Transaction transaction = await transactionService.BeginTransactionAsync())
{
int projectId = await projectRepository.CreateProjectAsync(project);
await stepRepository.CreateProjectStepsAsync(projectId, steps);
return projectId;
}
}
Is there a way to nest multiple transactions inside each other without the inner transactions already having to know that there could be an outer transaction?
I know that it might not be possible to actually nest those transactions from a technical perspective, but I still need a solution which uses either the internal transaction of the repository or the outer one (if one exists), so that there is no way I could accidentally forget to use a transaction for repository functions which require one.
You could check the CurrentTransaction property and do something like this:
var transaction = Database.CurrentTransaction ?? Database.BeginTransaction();
If there is already a transaction use that, otherwise start a new one...
Edit: Removed the using block, see comments. More logic is needed for committing/rolling back the transaction though...
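As a rough sketch of that commit/rollback logic (SharedTransaction is an illustrative name, not an EF Core API): begin a transaction only when none exists, and commit/dispose only in the scope that owns it.

using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Storage;

public sealed class SharedTransaction : IDisposable
{
    private readonly IDbContextTransaction _transaction;
    private readonly bool _owned;

    public SharedTransaction(DbContext context)
    {
        var current = context.Database.CurrentTransaction;
        _owned = current == null;                                 // the outermost scope owns the transaction
        _transaction = current ?? context.Database.BeginTransaction();
    }

    public void Commit()
    {
        if (_owned) _transaction.Commit();                        // inner scopes are no-ops
    }

    public void Dispose()
    {
        if (_owned) _transaction.Dispose();                       // disposing without Commit rolls back
    }
}

A repository method could then do using var tx = new SharedTransaction(this.Context); ...; tx.Commit(); and still participate in an outer transaction opened by the service layer.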
I am answering the question you asked: "How to nest transactions in EF Core 6?"
Please note that this is just a direct answer, not an evaluation of what is or isn't best practice. There was a lot of discussion around best practices, which is a valid thing to weigh for your use case, but it isn't an answer to the question (keep in mind that Stack Overflow is a Q+A site where people want direct answers).
Having said that, let's continue with the topic:
Try to use this helper function for creating a new transaction:
public CommittableTransaction CreateTransaction()
=> new System.Transactions.CommittableTransaction(new TransactionOptions()
{
IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted
});
Using the Northwind database as an example, you can use it like this:
public async Task<int?> CreateCategoryAsync(Categories category)
{
if (category?.CategoryName == null) return null;
using(var trans = CreateTransaction())
{
await this.Context.Categories.AddAsync(category);
await this.Context.SaveChangesAsync();
trans.Commit();
return category?.CategoryID;
}
}
And then you can call it from another function like:
/// <summary>Create or use existing category with associated products</summary>
/// <returns>Returns null if transaction was rolled back, else CategoryID</returns>
public async Task<int?> CreateProjectWithStepsAsync(Categories category)
{
using var trans = CreateTransaction();
int? catId = GetCategoryId(category.CategoryName)
?? await CreateCategoryAsync(category);
if (!catId.HasValue || string.IsNullOrWhiteSpace(category.CategoryName))
{
trans.Rollback(); return null;
}
var product1 = new Products()
{
ProductName = "Product A1", CategoryID = catId
};
await this.Context.Products.AddAsync(product1);
var product2 = new Products()
{
ProductName = "Product A2", CategoryID = catId
};
await this.Context.Products.AddAsync(product2);
await this.Context.SaveChangesAsync();
trans.Commit();
return catId;
}
To run this with LINQPad you need an entry point (and of course, add the EF Core 6.x NuGet package via F4, then create an Entity Framework Core connection):
// Main method required for LinqPad
UserQuery Context;
async Task Main()
{
Context = this;
var category = new Categories()
{
CategoryName = "Category A1"
// CategoryName = ""
};
var catId = await CreateProjectWithStepsAsync(category);
Console.WriteLine((catId == null)
? "Transaction was aborted."
: "Transaction successful.");
}
This is just a simple example - it does not check whether any product with the same name already exists; it will just create a new one. You can implement that easily; I have shown it in the CreateProjectWithStepsAsync function for the categories:
int? catId = GetCategoryId(category.CategoryName)
?? await CreateCategoryAsync(category);
First it queries the categories by name (via GetCategoryId(...)), and if the result is null it will create a new category (via CreateCategoryAsync(...)).
Also, you need to consider the isolation level: Check out System.Transactions.IsolationLevel to see if the one used here (ReadCommitted) is the right one for you (it is the default setting).
What this does is create the transaction explicitly - and notice that here we have a transaction within a transaction.
Note:
I have used both forms of the using statement - the old block form and the new declaration form. Pick the one you like more.
Just don't call SaveChanges multiple times.
The problem is caused by calling SaveChanges multiple times to commit changes made to the DbContext instead of calling it just once at the end. It's simply not needed. A DbContext is a multi-entity Unit-of-Work. It doesn't even keep an open connection to the database. This allows 100-1000 times better throughput for the entire application by eliminating cross-connection blocking.
A DbContext tracks all modifications made to the objects it tracks and persists/commits them when SaveChanges is called using an internal transaction. To discard the changes, simply dispose the DbContext. That's why all examples show using a DbContext in a using block - that's actually the scope of the Unit-of-Work "transaction".
There's no need to "save" parent objects first. EF Core will take care of this itself inside SaveChanges.
Using the Blog/Posts example from the EF Core documentation tutorial:
public class BloggingContext : DbContext
{
public DbSet<Blog> Blogs { get; set; }
public DbSet<Post> Posts { get; set; }
public string DbPath { get; }
// The following configures EF to connect to a SQL Server database
// named "tests" on the local default instance.
protected override void OnConfiguring(DbContextOptionsBuilder options)
=> options.UseSqlServer("Data Source=.;Initial Catalog=tests;Trusted_Connection=True;Trust Server Certificate=Yes");
}
public class Blog
{
public int BlogId { get; set; }
public string Url { get; set; }
public List<Post> Posts { get; } = new();
}
public class Post
{
public int PostId { get; set; }
public string Title { get; set; }
public string Content { get; set; }
public int BlogId { get; set; }
public Blog Blog { get; set; }
}
The following Program.cs will add a Blog with 5 posts but only call SaveChanges once at the end:
using (var db = new BloggingContext())
{
Blog blog = new Blog { Url = "http://blogs.msdn.com/adonet" };
IEnumerable<Post> posts = Enumerable.Range(0, 5)
.Select(i => new Post {
Title = $"Hello World {i}",
Content = "I wrote an app using EF Core!"
});
blog.Posts.AddRange(posts);
db.Blogs.Add(blog);
await db.SaveChangesAsync();
}
The code never specifies or retrieves the IDs. Add is an in-memory operation, so there's no reason to use AddAsync. Add starts tracking both the blog and the related Posts in the Added state.
The contents of the tables after this are:
select * from blogs
select * from posts;
-----------------------
BlogId Url
1 http://blogs.msdn.com/adonet
PostId Title Content BlogId
1 Hello World 0 I wrote an app using EF Core! 1
2 Hello World 1 I wrote an app using EF Core! 1
3 Hello World 2 I wrote an app using EF Core! 1
4 Hello World 3 I wrote an app using EF Core! 1
5 Hello World 4 I wrote an app using EF Core! 1
Executing the code twice will add another blog with another 5 posts.
PostId Title Content BlogId
1 Hello World 0 I wrote an app using EF Core! 1
2 Hello World 1 I wrote an app using EF Core! 1
3 Hello World 2 I wrote an app using EF Core! 1
4 Hello World 3 I wrote an app using EF Core! 1
5 Hello World 4 I wrote an app using EF Core! 1
6 Hello World 0 I wrote an app using EF Core! 2
7 Hello World 1 I wrote an app using EF Core! 2
8 Hello World 2 I wrote an app using EF Core! 2
9 Hello World 3 I wrote an app using EF Core! 2
10 Hello World 4 I wrote an app using EF Core! 2
Using SQL Server XEvents Profiler shows that these SQL calls are made:
exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [Blogs] ([Url])
VALUES (@p0);
SELECT [BlogId]
FROM [Blogs]
WHERE @@ROWCOUNT = 1 AND [BlogId] = scope_identity();
',N'@p0 nvarchar(4000)',@p0=N'http://blogs.msdn.com/adonet'
exec sp_executesql N'SET NOCOUNT ON;
DECLARE @inserted0 TABLE ([PostId] int, [_Position] [int]);
MERGE [Posts] USING (
VALUES (@p1, @p2, @p3, 0),
(@p4, @p5, @p6, 1),
(@p7, @p8, @p9, 2),
(@p10, @p11, @p12, 3),
(@p13, @p14, @p15, 4)) AS i ([BlogId], [Content], [Title], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([BlogId], [Content], [Title])
VALUES (i.[BlogId], i.[Content], i.[Title])
OUTPUT INSERTED.[PostId], i._Position
INTO @inserted0;
SELECT [i].[PostId] FROM @inserted0 i
ORDER BY [i].[_Position];
',N'@p1 int,@p2 nvarchar(4000),@p3 nvarchar(4000),@p4 int,@p5 nvarchar(4000),@p6 nvarchar(4000),@p7 int,@p8 nvarchar(4000),@p9 nvarchar(4000),@p10 int,@p11 nvarchar(4000),@p12 nvarchar(4000),@p13 int,@p14 nvarchar(4000),@p15 nvarchar(4000)',@p1=3,@p2=N'I wrote an app using EF Core!',@p3=N'Hello World 0',@p4=3,@p5=N'I wrote an app using EF Core!',@p6=N'Hello World 1',@p7=3,@p8=N'I wrote an app using EF Core!',@p9=N'Hello World 2',@p10=3,@p11=N'I wrote an app using EF Core!',@p12=N'Hello World 3',@p13=3,@p14=N'I wrote an app using EF Core!',@p15=N'Hello World 4'
The unusual SELECT and MERGE ... OUTPUT statements are used to ensure the IDENTITY values are returned in the order the objects were inserted, so EF Core can assign them to the object properties. After calling SaveChanges, all Blog and Post objects will have the correct database-generated IDs.

EntityFramework Core deletes an entity, with its included entities, that has been read

For some reason, when I read data from my database and then call SaveChangesAsync() on my database context, the entity that was loaded with its relations gets deleted from the database.
Category tools = new("Tools");
ProductType p = new("Hammer", 25, tools);
Ware w1 = new("S1", p, new("Floor"));
Ware w2 = new("S2", p, new("Floor"));
Ware w3 = new("S3", p, new("Floor"));
Ware w4 = new("S4", p, new("Floor"));
Ware w5 = new("S5", p, new("Floor"));
p.AddWare(w1);
p.AddWare(w2);
p.AddWare(w3);
p.AddWare(w4);
p.AddWare(w5);
unitOfWork.ProductTypeRepository.Create(p);
ProductType selected = unitOfWork.ProductTypeRepository.GetByIdAsync(46).Result;
ProductType selected2 = unitOfWork.ProductTypeRepository.GetByIdAsyncWithRelationships(50).Result;
Console.WriteLine("____");
var done = unitOfWork.SaveChangesAsync().Result;
Console.WriteLine(done);
The difference above is that the data in selected is not deleted, while selected2 is.
The two methods are the following
public async Task<ProductType> GetByIdAsync(int id)
{
return await _shopDbContext.ProductTypes.SingleOrDefaultAsync(p => p.ProductTypeId == id);
}
and
public async Task<ProductType> GetByIdAsyncWithRelationships(int id)
{
return await _shopDbContext.ProductTypes
.Include(p => p.OfferProductTypes)
.ThenInclude(op => op.Offer)
.Include(p => p.Wares)
.Include(p => p.Category)
.SingleOrDefaultAsync(p => p.ProductTypeId == id);
}
From my MSSQL database server profiler I can see the following
exec sp_executesql N'SET NOCOUNT ON;
DELETE FROM [Ware]
WHERE [WareId] = @p0;
SELECT @@ROWCOUNT;
DELETE FROM [Ware]
WHERE [WareId] = @p1;
SELECT @@ROWCOUNT;
DELETE FROM [Ware]
WHERE [WareId] = @p2;
SELECT @@ROWCOUNT;
DELETE FROM [Ware]
WHERE [WareId] = @p3;
SELECT @@ROWCOUNT;
DELETE FROM [Ware]
WHERE [WareId] = @p4;
SELECT @@ROWCOUNT;
',N'@p0 int,@p1 int,@p2 int,@p3 int,@p4 int',@p0=246,@p1=247,@p2=248,@p3=249,@p4=250
and
exec sp_executesql N'SET NOCOUNT ON;
DELETE FROM [ProductTypes]
WHERE [ProductTypeId] = @p5;
SELECT @@ROWCOUNT;
',N'@p5 int',@p5=50
Meanwhile the code I have in SaveChangesAsync() in UnitOfWork
public Task<int> SaveChangesAsync()
{
var ct = _context.ChangeTracker;
foreach(var e in ct.Entries())
{
Console.WriteLine(e.Entity.GetType().Name + " : " +e.State);
}
return _context.SaveChangesAsync();
}
displays the following
____
ProductType : Unchanged
0
It does not even indicate that a ProductType has been added together with its Wares and Category, yet the database is updated.
However, sometimes it does display the changes
ProductType : Added
Category : Added
Ware : Added
Location : Added
Ware : Added
Location : Added
Ware : Added
Location : Added
Ware : Added
Location : Added
Ware : Added
Location : Added
ProductType : Deleted
Ware : Deleted
Ware : Deleted
Ware : Deleted
Ware : Deleted
Ware : Deleted
ProductType : Unchanged
Category : Unchanged
Here is a link to the git repository in MCVE form: https://github.com/BenjaminElifLarsen/Shopping---MCVE. The relevant files are the program in ShoppingCli, the repositories in Ipl.Repositories, and the UnitOfWork in Ipl.Services.
------ Update -----
I have just tried circumventing my unit of work and calling the database context directly, and any entity that is loaded together with related entities using Include() and ThenInclude() gets deleted.
OK, so this took some digging into, but I believe I've come to the logical explanation:
Your DB entities are a bit wonky, and behave in ways EF can't tolerate.
Here's what happens in a normal scenario (this isn't the exact code EF runs; it's a sort of pseudo-explanation for simplicity):
You invoke a query, EF runs it, turns the received data into objects and wires them up into a graph. Your query is essentially this:
await ctx.ProductTypes
.Include(p => p.OfferProductTypes)
.Include(p => p.Wares)
.Include(p => p.Category)
.SingleOrDefaultAsync(p => p.ProductTypeId == 4);
EF will make a ProductType, a Category, and 5 Wares. Category is the parent of ProductType; ProductType is the parent of Ware.
After it has made the ProductType and Category, EF wants to wire up a bidirectional relationship between ProductType and Category, so it sets theProductType.Category = theCategory, and because it's many:1 (ProductType:Category) EF makes a collection to hold the ProductTypes and sets theCategory.ProductTypes = theCollectionOfProductTypes. It then fills theCollectionOfProductTypes with all the ProductType instances it knows about.
In pseudocode terms EF does this:
ProductType theProductType = get(...) //gets product type 4, we don't care so much how
Category theCategory = get(...) //gets the category, we don't care so much how
List<ProductType> theCollectionOfProductTypes = new(); //it's probably not a list, but it doesn't matter
theProductType.Category = theCategory;
theCategory.ProductTypes = theCollectionOfProductTypes;
theCollectionOfProductTypes.Add(theProductType);
With me so far? Here's the rub:
When EF gave your Category the collection of ProductTypes it made, you made another (separate) collection out of it:
public IEnumerable<ProductType> ProductTypes {
get => _productType;
private set => _productType = value.ToHashSet();
}
Skipping over the part where making this an IEnumerable is a bad idea because you can't write to an IEnumerable (which makes it harder to add ProductTypes to your Category), and getting straight to the bit where you ToHashSet it: you've disconnected your graph from EF's reality. If we add a bit to hold what EF gave you:
public IEnumerable<ProductType> ProductTypes {
get => _productType;
private set {
_productType = value.ToHashSet();
_whatEfHolds = value;
}
}
private object _whatEfHolds;
You can see that your category's "list of product types" has 0 items, whereas the one EF filled up when it was wiring the graph has 1 item (the ProductType with id 4 we downloaded).
All will be fine, until we ask EF to detect changes between your graph and its own; when it pulls your Category's ProductTypes property, it will receive 0 entities and assume you removed them:
dbug: 2021-10-20 12:34:56.789 CoreEventId.CollectionChangeDetected[10804] (Microsoft.EntityFrameworkCore.ChangeTracking)
0 entities were added and 1 entities were removed from navigation 'Category.ProductTypes' on entity with key '{CategoryId: 4}'.
dbug: 2021-10-20 12:34:56.789 CoreEventId.StateChanged[10807] (Microsoft.EntityFrameworkCore.ChangeTracking)
The 'ProductType' entity with key '{ProductTypeId: 4}' tracked by 'ShopDbContext' changed state from 'Unchanged' to 'Modified'.
dbug: 2021-10-20 12:34:56.789 CoreEventId.CascadeDeleteOrphan[10003] (Microsoft.EntityFrameworkCore.Update)
An entity of type 'ProductType' with key '{ProductTypeId: 4}' changed to 'Deleted' state due to severed required relationship to its parent entity of type 'Category'.
EF marked your ProductType as deleted, and then cascaded that out to the Wares too.
What to do about it? Personally, I think you have too much logic and messing around in your DB entities; they should be much simpler affairs that just hold data, and maybe have some methods for slightly more advanced logic (though perhaps a lot of that would be better placed in your mapper) in a partial class. A Category should perhaps look more like this (note: it's called TestCategories because I scaffolded it from the DB, and that's what the DB table is called):
public partial class TestCategories
{
public TestCategories()
{
ProductTypes = new HashSet<ProductTypes>();
}
public int CategoryId { get; set; }
public string Name { get; set; }
public virtual ICollection<ProductTypes> ProductTypes { get; set; }
}
A ProductType like this:
public partial class ProductTypes
{
public ProductTypes()
{
OfferProductType = new HashSet<OfferProductType>();
Ware = new HashSet<Ware>();
}
public int ProductTypeId { get; set; }
public string Type { get; set; }
public int Price { get; set; }
public int CategoryId { get; set; }
public virtual TestCategories Category { get; set; }
public virtual ICollection<OfferProductType> OfferProductType { get; set; }
public virtual ICollection<Ware> Ware { get; set; }
}
But the main takeaway: be careful with, or avoid, doing things (like ToHashSet) to the entities EF gives you when it's populating your object graph.
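If you do want to keep an encapsulated collection, one hedged alternative (my sketch, not from the answer) is to expose a read-only view over a backing field and let EF Core populate the field directly, so the instance EF wires up is the same instance the entity exposes:

using System.Collections.Generic;

public class Category
{
    // EF Core can read/write this field directly, so there is no ToHashSet() copy
    // to disconnect the graph from the change tracker.
    private readonly List<ProductType> _productTypes = new();

    public int CategoryId { get; set; }
    public string Name { get; set; }

    public IReadOnlyCollection<ProductType> ProductTypes => _productTypes;

    public void AddProductType(ProductType productType) => _productTypes.Add(productType);
}

With recent EF Core versions the backing-field convention usually finds _productTypes for the ProductTypes navigation; if not, an explicit mapping such as modelBuilder.Entity<Category>().Navigation(c => c.ProductTypes).HasField("_productTypes") may be needed.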
None of the code you posted involved a delete action, so I'm unsure where the DELETE statements are coming from, but it's not in the code provided. I also think you're paddling up the wrong creek in trying to explain why selected and selected2 are different.
The issue isn't that your entity has been "deleted". The issue is that one of these queries can be answered using the change tracker's cache (selected), and the other cannot (selected2).
In short, EF caches the entities you work with. Whether you attach a new entity or you fetch one from the database, EF tracks it based on its type + PK value. If you request it again, then EF will give you the cached version.
unitOfWork.ProductTypeRepository.Create(p);
You have added the product type, but you have not yet saved your changes, therefore, the added product type provisionally lives in the change tracker (= cache) but not yet in the database.
_shopDbContext.ProductTypes.SingleOrDefaultAsync(p => p.ProductTypeId == id)
Here, in your first query, you are asking for your entity (and only this entity) based on its type and PK value. EF checks its cache and notices that it already has this entity cached: it's the entity you added (but have not yet saved) in the previous step. Therefore, EF gives you back a reference to the cached entity it already knew about. EF never even bothered to talk to the database.
_shopDbContext.ProductTypes
.Include(p => p.OfferProductTypes)
.ThenInclude(op => op.Offer)
.Include(p => p.Wares)
.Include(p => p.Category)
.SingleOrDefaultAsync(p => p.ProductTypeId == id);
Here, in your second query, you are asking for much more than just your main entity. You're asking for all of its related entities.
This is inherently not something that the change tracker can account for. Even if it knows about several entities (e.g. wares) that belong to the product type, it (a) cannot be sure that it knows about all related wares (maybe there are more in the database), and (b) has to look up the categories based on the product type FK, not the ware PK, which is not something EF's cache is built to do.
Because of this, EF must invariably go to the database for your data. And EF does not perform a hybrid approach, whereby it uses some of the cache and some of the real data. It's always one or the other. Since some of the data needs to be fetched from the actual database, all of the requested data is being fetched from the database.
Because EF is now hitting the database, the database returns nothing, because you never saved your new product type to the database (yet). Therefore, EF tells you that this product type doesn't exist, because the database correctly stated that it doesn't exist (in the database).
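A hedged sketch of how to avoid that mix in the question's code (reusing the unitOfWork wrapper from the question): save the newly added graph before issuing the Include query, so both reads observe the same database state.

unitOfWork.ProductTypeRepository.Create(p);
await unitOfWork.SaveChangesAsync();          // persist the new ProductType, Wares and Category first

// Query by the real database-generated key instead of a guessed id;
// the Include query now finds the entity because it actually exists in the database.
ProductType reloaded = await unitOfWork.ProductTypeRepository
    .GetByIdAsyncWithRelationships(p.ProductTypeId);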

Pipelined function as entity in EF/MVC

I have a .NET MVC app using Entity Framework, and normally I'd use a table or a view in a data entity, e.g.:
[Table("company_details", Schema = "abd")]
public class CompanyDetails
{
[Key]
[Column("cd_id_pk")]
public int CompanyDetailsId { get; set; }
etc ...
etc ...
...where company_details is an Oracle table.
However, I need to utilise a pipelined function; e.g., the SQL would be:
SELECT * FROM TABLE(abd.company_pck.f_single_rprt('1A122F', '01-Feb-2020','Y'));
This had been used in a report in Oracle Forms, but now it needs to be included in a .NET MVC app.
How can I include a pipelined function in my entity?
thanks in advance
I just tried this and it seems to work. First, create a class to map the result returned from your DbContext; in your case you would call the pipelined table function from Oracle, whereas I used a TVF in SQL Server to demonstrate. The TVF returns 3 columns of data: 2 INT and 1 NVARCHAR.
public class ReturnThreeColumnTableFunction
{
public int ColumnOne { get; set; }
public int ColumnTwo { get; set; }
public string ColumnThree { get; set; }
}
Then, based on your Oracle pipelined function (see my MSSQL TVF below):
/* SQL TableValuedFunction */
ALTER FUNCTION [dbo].[ReturnThreeColumnTableFunction]
(
@ColumnOne INT,
@ColumnTwo INT,
@ColumnThree NVARCHAR(10)
)
RETURNS TABLE
AS
RETURN
(
SELECT @ColumnOne AS ColumnOne, @ColumnTwo AS ColumnTwo, @ColumnThree AS ColumnThree
)
Then, in the DbContext class where you set up your code-first entities, be sure to register the complex type in the OnModelCreating method.
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
modelBuilder.ComplexType<ReturnThreeColumnTableFunction>();
modelBuilder.ComplexType<ReturnThreeColumnTableFunction>().Property(x => x.ColumnOne).HasColumnName("ColumnOne");
modelBuilder.ComplexType<ReturnThreeColumnTableFunction>().Property(x => x.ColumnTwo).HasColumnName("ColumnTwo");
modelBuilder.ComplexType<ReturnThreeColumnTableFunction>().Property(x => x.ColumnThree).HasColumnName("ColumnThree");
}
Then you can read the results easily using SqlQuery:
var items = context.Database.SqlQuery<ReturnThreeColumnTableFunction>("SELECT * FROM dbo.ReturnThreeColumnTableFunction(1,2,'3')");
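Translated to the Oracle pipelined function from the question, the call might look roughly like the sketch below (EF6-style SqlQuery with Oracle bind parameters; the result class shape, parameter names and exact provider syntax are assumptions):

using System;
using System.Linq;
using Oracle.ManagedDataAccess.Client;

public class CompanyReportRow
{
    // Property names and types must match the columns returned by abd.company_pck.f_single_rprt
    public string CompanyCode { get; set; }
    public DateTime ReportDate { get; set; }
    public string Flag { get; set; }
}

// Inside a method with access to the DbContext:
var rows = context.Database.SqlQuery<CompanyReportRow>(
        "SELECT * FROM TABLE(abd.company_pck.f_single_rprt(:p0, :p1, :p2))",
        new OracleParameter("p0", "1A122F"),
        new OracleParameter("p1", new DateTime(2020, 2, 1)),
        new OracleParameter("p2", "Y"))
    .ToList();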

Entity Framework inserting duplicates when seeding the database [duplicate]

EDIT----
From here I tried assigning IDs in the seeding method, and this was OK for the Languages, but not when I added an Address to the customer and assigned IDs to those addresses as well; then it created the dupes again...
Both Address & Language are declared as DbSet<...> in my Context
What I tried:
- Adding 1 Address (with Id) and adding it to 1 customer => creates a dupe
- Adding 1 Language & 1 Address (with Ids) and adding both to 1 customer => creates a dupe
- Adding 1 Customer with nothing but its name => doesn't create a dupe
- Adding 1 Language (with Id) and adding it to 1 customer => doesn't create a dupe
I have an overridden ToString() method on my Customer which returns its name. I can observe, when I look at the duplicates while debugging, that one has the name while the other shows the namespace the Customer class resides in, which seems logical since the Name is NULL in the dupe's case, but I figured I'd mention it anyway...
----EDIT
I am seeding my database with some metadata and I am seeing very strange behaviour I have never seen before. I am inserting a "Customer" entity and it gets inserted 2 times: the first insert is correct and has everything it should have, while the other one has NULL properties (string values), although some (like datetimes) have values.
I have no clue why this is happening. It occurs when I call the base.Seed(ctx); method - of that I am sure, since I stopped the web app right after this, before it reached anything else.
This Customer entity has a related Language entity as well as a collection of Addresses.
I have another post open (no suggestions yet) where the same issue occurs; it happened suddenly, and I did not make any changes myself to my model or seeding methods...
Base Entity:
public class BaseEntity
{
public int ID { get; set; }
}
Customer:
public class Customer:BaseEntity
{
public string Name { get; set; }
public Language Language { get; set; }
public ICollection<Address> Addresses { get; set; }
}
Language:
public class Language : BaseEntity
{
public string Name { get; set; }
public string LanguageCode { get; set; }
[Required]
public ICollection<Customer> Customers { get; set; }
}
Address:
public class Address : BaseEntity
{
public Customer Customer { get; set; }
}
Seeding method:
Language newLanguageNL = new Language("Dutch");
newLanguageNL.ID = 1;
Language newLanguageFR = new Language("French");
newLanguageFR.ID = 2;
Language newLanguageEN = new Language("English");
newLanguageEN.ID = 3;
ctx.Languages.Add(newLanguageNL);
ctx.Languages.Add(newLanguageEN);
ctx.Languages.Add(newLanguageFR);
Address addressBE = new Address("informatica laan", "10", "bus nr 1", "8900", "België");
addressBE.ID = 1;
Address addressBE2 = new Address("rue de l'informatique", "20", "boite nr 2", "7780", "Belgique");
addressBE2.ID = 2;
Address addressEN = new Address("techstreet", "30", "box nr 1", "4000", "Bulgaria");
addressEN.ID = 3;
ctx.Addresses.Add(addressEN);
ctx.Addresses.Add(addressBE);
ctx.Addresses.Add(addressBE2);
Customer newCustomer = new Customer("Customer name", newLanguageNL, addressBE);
// ctx.Customers.AddOrUpdate(c => c.Name, newCustomer);
ctx.Customers.Add(newCustomer);
base.Seed(ctx);
OnModelCreating:
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
// setting the Product FK relation required + related entity
modelBuilder.Entity<Entity.ProductSupplierForContract>().HasRequired(psfc => psfc.Product)
.WithMany(p => p.ProductSupplierForContracts)
.HasForeignKey(psfc => psfc.Product_Id);
// setting the Supplier FK relation required + related entity
modelBuilder.Entity<Entity.ProductSupplierForContract>().HasRequired(psfc => psfc.Supplier)
.WithMany(s => s.ProductSupplierForContracts)
.HasForeignKey(psfc => psfc.Supplier_Id);
// setting the Contract FK relation required + related entity
modelBuilder.Entity<Entity.ProductSupplierForContract>().HasOptional(psfc => psfc.Contract)
.WithMany(c => c.ProductSupplierForContracts)
.HasForeignKey(psfc => psfc.Contract_Id);
modelBuilder.Entity<Entity.PurchasePrice>()
.ToTable("PurchasePrices");
modelBuilder.Entity<Entity.SalesPrice>()
.ToTable("SalesPrices");
// Bundle in Bundle
modelBuilder.Entity<Entity.Bundle>().HasMany(b => b.ChildBundles);
}
Can anyone help me with this one, please? Thank you in advance for any feedback.
I have tried using AddOrUpdate() with no luck.
Your addresses and languages are persisted when you assign them to the customer. I think in your constructor you assign the collections to the customer, didn't you?
This isn't necessary. You can persist the customer without explicitly adding the collections; EF will map the collections by itself.
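In other words (a sketch based on this answer, reusing the constructors from the question), add only the fully built customer and let EF insert the related Language and Address through the navigation properties:

// Build the object graph in memory; no separate Add calls for Language or Address.
Language newLanguageNL = new Language("Dutch");
Address addressBE = new Address("informatica laan", "10", "bus nr 1", "8900", "België");

Customer newCustomer = new Customer("Customer name", newLanguageNL, addressBE);
ctx.Customers.Add(newCustomer);   // EF discovers the related Language and Address and inserts them too

base.Seed(ctx);                   // the initializer saves the changes after Seed completes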
I see a few issues with your code. By convention, an int column called ID is going to be an identity column, so you can't set its value explicitly without issuing SET IDENTITY_INSERT Language ON (unless you have fluent code overriding this).
AddOrUpdate is intended for these situations; you have not shown that code. Another way is shown below:
...
if (!ctx.Languages.Any(l => l.ID == 1)) // Check if already on file
{
ctx.Database.ExecuteSqlCommand("SET IDENTITY_INSERT Language ON"); // Omit if not identity column
var dutch = new Language {
ID = 1,
Name = "Dutch",
Code = "NL"
};
ctx.Languages.Add(dutch);
ctx.SaveChanges();
ctx.Database.ExecuteSqlCommand("SET IDENTITY_INSERT Language OFF"); // Omit if not identity column
}
... repeat for other languages
... similar code for other seeded tables
So, changing the relation in the Address class from a single Customer to an ICollection doesn't create a dupe (and it creates a CustomerAddress table, which I actually want as well).
It seems from the database logs (log4net) that, because of the relation, EF first inserts a Customer (NULL) for the Address reference of the customer, AND then inserts the Customer (NOT NULL) with its references... When I compare Address & Language, I see that Language has a collection of Customers as well (which Address didn't); this explains why Address was creating the duplicate customer entry. (If anyone needs any clarification on this, let me know and I'll do my best.)
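For reference, the change described above amounts to something like this sketch (not the exact code from the repository):

public class Address : BaseEntity
{
    // Collection on both sides => EF creates a many-to-many join table by convention.
    public ICollection<Customer> Customers { get; set; } = new HashSet<Customer>();

    // ... street/number/box/postal code/country properties as before
}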
This post HAS MOVED TO HERE
I want to thank everyone that has contributed in any way!

Inserts with a stateless NHibernate session are slow

I have been working on improving NHibernate insert performance for a couple of days.
I had read in many posts (such as this one) that a stateless session can insert something like 1000~2000 records per second. However, the best it has managed for me is 1243 records in more than 9 seconds:
var sessionFactory = new NHibernateConfiguration().CreateSessionFactory();
using (IStatelessSession statelessSession = sessionFactory.OpenStatelessSession())
{
statelessSession.SetBatchSize(adjustmentValues.Count);
foreach (var adj in adjustmentValues)
statelessSession.Insert(adj);
}
The class:
public partial class AdjustmentValue : PersistentObject, IFinancialValue
{
public virtual double Amount { get; set; }
public virtual bool HasManualValue { get; set; }
public virtual bool HasScaleValue { get; set; }
public virtual string Formula { get; set; }
public virtual DateTime IssueDate { get; set; }
public virtual CompanyTopic CompanyTopic { get; set; }
}
Map of the class:
public class AdjustmentValueMap : ClassMap<AdjustmentValue>
{
public AdjustmentValueMap()
{
Id(P => P.Id);
Map(p => p.Amount);
Map(p => p.IssueDate);
Map(p => p.HasManualValue);
Map(p => p.HasScaleValue);
Map(p => p.Formula);
References(p => p.CompanyTopic)
.Fetch.Join();
}
}
Am I missing something?
Any idea how to speed up the inserts?
The generated queries (from the NHProf screenshot) are individual INSERT statements, each followed by a select SCOPE_IDENTITY() call.
From the looks of your NHProf results, you are using identity as your POID. Therefore you cannot take advantage of batched writes; every insert/update/delete is a separate command. That is why it's taking so long.
If you change your POID to hilo, guid, or guid.comb and set the batch size to 500 or 1000, you will see a drastic improvement in the write times.
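A sketch of what that could look like with Fluent NHibernate (the hilo table/column names and the batch size of 1000 are illustrative choices, not values from the question):

using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using FluentNHibernate.Mapping;

public class AdjustmentValueMap : ClassMap<AdjustmentValue>
{
    public AdjustmentValueMap()
    {
        // hilo: ids are assigned client-side, so NHibernate can batch the INSERTs;
        // requires the hilo table/column to exist in the database.
        Id(p => p.Id).GeneratedBy.HiLo("hibernate_unique_key", "next_hi", "1000");
        Map(p => p.Amount);
        Map(p => p.IssueDate);
        Map(p => p.HasManualValue);
        Map(p => p.HasScaleValue);
        Map(p => p.Formula);
        References(p => p.CompanyTopic).Fetch.Join();
    }
}

// When building the session factory, enable ADO.NET batching as well:
var sessionFactory = Fluently.Configure()
    .Database(MsSqlConfiguration.MsSql2008
        .ConnectionString(connectionString)
        .AdoNetBatchSize(1000))
    .Mappings(m => m.FluentMappings.AddFromAssemblyOf<AdjustmentValueMap>())
    .BuildSessionFactory();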
I'm assuming you are using SQL Server 2008. A few things come to mind. Are you using the identity key (the select SCOPE_IDENTITY() in your sample output) as the primary key for your entities? If so, then I believe NHibernate has to execute the SCOPE_IDENTITY() call for each object as it is saved into the database, so if you are inserting 1000 objects NHibernate will generate 1000 INSERT statements and 1000 select SCOPE_IDENTITY() statements.
I'm not 100% sure, but it might also break the batching. Since you are using NHProf, what does it say? Does it show that all the statements are batched together, or can you select individual INSERT statements in the NHProf UI? If your inserts are not batched, you will most likely see the "Large number of individual writes" alert in NHProf.
Edit:
If you cannot change your identity generation, then you could use SqlBulkCopy. I have used it with NHibernate for data migration and it works. Ayende Rahien has a sample on his blog that will get you started.
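If you go the SqlBulkCopy route, a minimal sketch could look like the following (the destination table and column names are assumptions based on the mapping above; the Id column is omitted on the assumption that it is database-generated):

using System;
using System.Data;
using System.Data.SqlClient;

var table = new DataTable();
table.Columns.Add("Amount", typeof(double));
table.Columns.Add("HasManualValue", typeof(bool));
table.Columns.Add("HasScaleValue", typeof(bool));
table.Columns.Add("Formula", typeof(string));
table.Columns.Add("IssueDate", typeof(DateTime));
table.Columns.Add("CompanyTopic_id", typeof(long));   // FK column name is a guess

foreach (var adj in adjustmentValues)
    table.Rows.Add(adj.Amount, adj.HasManualValue, adj.HasScaleValue,
                   adj.Formula, adj.IssueDate, adj.CompanyTopic.Id);

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection) { DestinationTableName = "AdjustmentValue" })
    {
        // Map by column name rather than ordinal position.
        foreach (DataColumn column in table.Columns)
            bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);

        bulkCopy.WriteToServer(table);   // one bulk insert instead of thousands of round trips
    }
}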
