I've seen a number of "Bulk Insert in EF" questions, but all of them deal with a use case where a user is trying to insert a large array of items.
I have a situation where I have a new Parent entity with ~1500 new related entities attached to it. Both the parent and the child entities are mapped to their own tables in EF.
At the moment I'm using something like:
// p is already created and contains all the new child items
public void SaveBigThing(Parent p)
{
    if (p.Id == 0)
    {
        // we've got a new object to add
        db.BigObjects.Add(p);
    }
    db.SaveChanges();
}
At the moment Entity Framework creates an individual insert statement for each and every child item, which takes 50 seconds or so. I want to be able to use db.ChildEntity.AddRange(items), but I'm unsure whether there's a better way than using 2 separate operations: first create the parent to get its Id, then AddRange for all the child items.
IMHO you don't need to add the parent first in order to insert the child items. You can do it in one shot.
You could try this in EF 5; AddRange is only available in EF 6 or higher.
This will not insert the items in bulk; it still generates a query per item, but everything is saved in one shot.
EF bulk insertion reference
Another reference
public void InsertParent(Parent parentObject)
{
    // Assuming BigObjects is the parent table
    db.BigObjects.Add(parentObject);
    InsertChilds(parentObject); // Insert the children
    db.SaveChanges();
}

public void InsertChilds(Parent parentObject)
{
    // Turning off change detection saves time when adding many entities
    db.Configuration.AutoDetectChangesEnabled = false;
    foreach (var child in parentObject.Childs)
    {
        // This sets the child-parent relation
        child.Parent = parentObject;
        db.ChildEntity.Add(child);
    }
    db.Configuration.AutoDetectChangesEnabled = true;
}
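Tying this back to the original SaveBigThing, a minimal EF 6 sketch: the main win for ~1500 children is disabling change detection while the graph is added, and a separate AddRange on the children is not needed because Add(p) marks every reachable child as Added:
public void SaveBigThing(Parent p)
{
    if (p.Id == 0)
    {
        // Disable change detection while the large graph is added
        db.Configuration.AutoDetectChangesEnabled = false;
        try
        {
            // Adding the parent marks every reachable child as Added too
            db.BigObjects.Add(p);
        }
        finally
        {
            db.Configuration.AutoDetectChangesEnabled = true;
        }
    }
    // One SaveChanges call; EF fixes up p.Id and each child's foreign key
    db.SaveChanges();
}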
I am new to using ViewModels. I have a new list here and am adding items to it by looping through a database table. The issue is that all the records that come back are identical, the same record repeated over and over. What could be the issue, and is this a good way to fill a ViewModel with data and pass it, or is there a better way? Right now it returns about 500 records, all with the same data.
public class DimCustomersController : Controller
{
    private AdventureWorks_MBDEV_DW2008Entities db = new AdventureWorks_MBDEV_DW2008Entities();

    public ActionResult CustomersIndexVM()
    {
        List<DimCustomersIndexViewModel> CustomerList = new List<DimCustomersIndexViewModel>();
        DimCustomersIndexViewModel CustomerItem = new DimCustomersIndexViewModel();
        foreach (var m in db.DimCustomers.ToList()) // could do a for loop up to the count
        {
            CustomerItem.Title = m.Title;
            CustomerItem.FirstName = m.FirstName;
            CustomerItem.MiddleName = m.MiddleName;
            CustomerItem.LastName = m.LastName;
            CustomerItem.BirthDate = m.BirthDate;
            CustomerItem.MaritalStatus = m.MaritalStatus;
            CustomerItem.Suffix = m.Suffix;
            CustomerItem.Gender = m.Gender;
            CustomerItem.EmailAddress = m.EmailAddress;
            CustomerItem.AddressLine1 = m.AddressLine1;
            CustomerItem.AddressLine2 = m.AddressLine2;
            CustomerItem.Phone = m.Phone;
            // other columns go here
            CustomerList.Add(CustomerItem);
        }
        return View("CustomersIndexVM", CustomerList);
    }
}
This line needs to be inside the loop:
DimCustomersIndexViewModel CustomerItem = new DimCustomersIndexViewModel();
The reason is that you want a new view model for each customer, but instead you are currently creating only one view model and changing its properties. When you add it to the list, you are not adding a copy; you are adding the same view model you already added.
This code would work if DimCustomersIndexViewModel were a struct, because structs are just bags of values with no inherent identity, and they are copied rather than referenced. But it's a class (as it should be), with a unique identity, so you're adding a reference to the single view model to the list over and over. CustomerList[0] and CustomerList[1] and all the other items point to the same DimCustomersIndexViewModel object instance, whose properties are overwritten on every iteration and finally left equal to the very last customer.
By moving this line inside the loop, you are creating a separate DimCustomersIndexViewModel for each customer, each with its own set of properties, and CustomerList contains references to many different DimCustomersIndexViewModel object instances.
Once you have solid experience with this concept, a future step could be to use AutoMapper so that you don't have to maintain a list of all properties in your code here.
The problem is you add the same reference object during each iteration of your loop. That object never changes (you never new it up again), but you change the properties on the object. Then you add that object over and over. You need to new up that object each iteration of the loop.
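For illustration, a sketch of the corrected loop, with the view model newed up inside each iteration (only the changed part of CustomersIndexVM is shown):
foreach (var m in db.DimCustomers.ToList())
{
    // A fresh view model per customer, so each list entry is a distinct object
    DimCustomersIndexViewModel CustomerItem = new DimCustomersIndexViewModel();
    CustomerItem.Title = m.Title;
    CustomerItem.FirstName = m.FirstName;
    // ...remaining property assignments as in the question...
    CustomerList.Add(CustomerItem);
}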
Firstly, apologies for the poor title. I had absolutely no idea how to describe this question!
I have a "Relationship" entity that defines a relationship between 2 users.
public class Relationship {
    User User1 { get; set; }
    User User2 { get; set; }
    DateTime StateChangeDate { get; set; }
    // RelationshipState is an enum with int values
    RelationshipState State { get; set; }
}
RelationshipState example:
public enum RelationshipState {
    state1 = 1,
    state2 = 2,
    state3 = 3,
    state4 = 4
}
A Relationship entity is created each time the RelationshipState changes, so for any pair of users there will be many Relationship objects, with the most recent being the current one.
I'm trying to query for any Relationship object that represents a REDUCTION in RelationshipState for a particular pair of users.
That is, all the Relationship objects, across all users, that have a later date than one with a higher RelationshipState.
I'm finding it very hard to figure out how to accomplish this without iterating over the entire Relationship table.
First, create a query that returns all the combinations of users, with a child collection that lists all the status changes. For more information, google LINQ Group By.
Then, using your collection, filter out all the ones you don't want by looking at the last two status changes and seeing whether the state has gone down.
Here's an example, tested in LINQPad as a C# Program:
public enum RelationshipState {
    state1 = 1,
    state2 = 2,
    state3 = 3,
    state4 = 4
}

public class User {
    public int id { get; set; }
}

public class Relationship {
    public User User1 { get; set; }
    public User User2 { get; set; }
    public DateTime StateChangeDate { get; set; }
    // RelationshipState is an enum with int values
    public RelationshipState State { get; set; }
}
void Main()
{
    var rs = new List<Relationship>() {
        new Relationship{ User1=new User{id=1}, User2=new User{id=2}, StateChangeDate=DateTime.Parse("1/1/2013"), State=RelationshipState.state2},
        new Relationship{ User1=new User{id=1}, User2=new User{id=2}, StateChangeDate=DateTime.Parse("1/2/2013"), State=RelationshipState.state3},
        new Relationship{ User1=new User{id=1}, User2=new User{id=3}, StateChangeDate=DateTime.Parse("1/1/2013"), State=RelationshipState.state2},
        new Relationship{ User1=new User{id=1}, User2=new User{id=3}, StateChangeDate=DateTime.Parse("1/2/2013"), State=RelationshipState.state1},
        new Relationship{ User1=new User{id=2}, User2=new User{id=3}, StateChangeDate=DateTime.Parse("1/2/2013"), State=RelationshipState.state1}
    };
    var result = rs
        .GroupBy(cm => new { id1 = cm.User1.id, id2 = cm.User2.id },
                 (key, group) => new { Key1 = key, Group1 = group.OrderByDescending(g => g.StateChangeDate) })
        .Where(r => r.Group1.Count() > 1) // Remove entries with only 1 status
        //.ToList() // This might be needed for LINQ to Entities
        .Where(r => r.Group1.First().State < r.Group1.Skip(1).First().State) // Only keep relationships where the state has gone down
        .Select(r => r.Group1.First()); // Turn this back into Relationship objects
    // Use this instead if you want to know whether the state was ever higher than it is currently:
    // var result = rs
    //     .GroupBy(cm => new { id1 = cm.User1.id, id2 = cm.User2.id },
    //              (key, group) => new { Key1 = key, Group1 = group.OrderByDescending(g => g.StateChangeDate) })
    //     .Where(r => r.Group1.First().State < r.Group1.Max(g => g.State))
    //     .Select(r => r.Group1.First());

    result.Dump();
}
Create a stored procedure in the database that uses a cursor to iterate the items and pair each one off with the item before it (and then filter to decreasing state).
Barring that, you can perform an inner query that finds the previous value for each item:
// for each item, find the most recent earlier item and compare states
from item in table
let previous =
    (from innerItem in table
     where innerItem.Date < item.Date
     orderby innerItem.Date descending
     select innerItem).FirstOrDefault()
where previous != null && previous.State > item.State
select item
As inefficient as that seems, it might be worth a try. Perhaps, with the proper indexes, a good query optimizer, and a sufficiently small set of data, it won't be that bad. If it's unacceptably slow, a stored proc with a cursor is most likely going to be the best option.
I have a list of Item entities that are being processed in a batch, something like this:
foreach (var itemId in ItemIdList)
{
    var item = getById(itemId); // load line
    if (item != null)
    {
        // ...do some processing
        delete(item);
    }
}
The problem is that the same itemId can be listed in ItemIdList multiple times, so after an item is deleted and I try to load it a second time, the load line fails with the error
Unexpected row count: 0; expected: 1 (stale state exception)
I understand that the entity is not there any more, but I would have expected my get function to just return null. Here is my getById function:
var item = (from i in UnitOfWork.CurrentSession.QueryOver<T>()
            where i.Id == id
            select i).SingleOrDefault();
Why isn't SingleOrDefault just returning null?
My Item entity only has one autogenerated key and the hash function looks like this:
public override int GetHashCode()
{
    int hashCode = 0;
    hashCode = hashCode ^ Id.GetHashCode();
    return hashCode;
}
Edit:
Here is my delete method
public void Delete(T t) // T would be Item in this case
{
    UnitOfWork.CurrentSession.Delete(t);
}
I haven't seen your delete method, but since you state these items are being processed in a batch, I'll assume you wait to flush until everything has been deleted.
Since the flush doesn't occur until you've attempted to delete all the items, the row still exists in the database. Therefore, when you retrieve the item a second time, the session thinks it should have been deleted but, behold, it still exists. This is why NHibernate considers the state 'stale.'
One easy way to fix this would be to have the ItemIdList be a set (like a HashSet). This will prevent duplicate IDs from being present and should fix the problem.
On another note, if you are attempting to delete all entities within a list of ids, there are far more efficient ways than reading each one from the database, one at a time, and then deleting it.
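A sketch of both suggestions, assuming integer ids; the HQL bulk delete at the end is an alternative when no per-item processing is needed (entity and member names are taken from the question):
// De-duplicate the ids so no item is loaded again after it has been deleted
var uniqueIds = new HashSet<int>(ItemIdList);

foreach (var itemId in uniqueIds)
{
    var item = getById(itemId);
    if (item != null)
    {
        // ...do some processing
        delete(item);
    }
}

// Alternative: delete everything in one statement, no loading required
UnitOfWork.CurrentSession
    .CreateQuery("delete Item where Id in (:ids)")
    .SetParameterList("ids", uniqueIds)
    .ExecuteUpdate();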
I have two tables: Transactions and TransactionAgents. TransactionAgents has a foreign key to Transactions called TransactionID. Pretty standard.
I also have this code:
BrokerManagerDataContext db = new BrokerManagerDataContext();

var transactions = from t in db.Transactions
                   where t.SellingPrice != 0
                   select t;

var taAgents = from ta in db.TransactionAgents
               select ta;

foreach (var transaction in transactions)
{
    foreach (var agent in taAgents)
    {
        agent.AgentCommission = ((transaction.CommissionPercent / 100) * (agent.CommissionPercent / 100) * transaction.SellingPrice) - agent.BrokerageSplit;
    }
}
dataGridView1.DataSource = taAgents;
Basically, a TransactionAgent has a property/column named AgentCommission, which is null for all TransactionAgents in my database.
My goal is to perform the math you see in the foreach(var agent in taAgents) to patch up the value for each agent so that it isn't null.
Oddly, when I run this code and set a breakpoint on agent.AgentCommission = (formula), I can see the value being calculated for AgentCommission and the object being updated, but after it displays in my datagrid (used only for testing), it does not show the calculated value.
So, to me, it seems that the property isn't being permanently set on the object. What's more, if I persist this newly updated object back to the database with an update, I doubt the calculated AgentCommission will be set there either.
Even without having my tables set up the same way, can anyone look at the code and see why I am not retaining the property's value?
IEnumerable<T>s do not guarantee that updated values will persist across enumerations. For instance, a List will return the same set of objects on every iteration, so if you update a property, it will be saved across iterations. However, many other implementations of IEnumerables return a new set of objects each time, so any changes made will not persist.
If you need to store and update the results, pull the IEnumerable<T> down to a List<T> using .ToList() or project it into a new IEnumerable<T> using .Select() with the changes applied.
To specifically apply that to your code, it would look like this:
var transactions = (from t in db.Transactions
                    where t.SellingPrice != 0
                    select t).ToList();

var taAgents = (from ta in db.TransactionAgents
                select ta).ToList();

foreach (var transaction in transactions)
{
    foreach (var agent in taAgents)
    {
        agent.AgentCommission = ((transaction.CommissionPercent / 100) * (agent.CommissionPercent / 100) * transaction.SellingPrice) - agent.BrokerageSplit;
    }
}
dataGridView1.DataSource = taAgents;
Specifically, the problem is that each time you access the IEnumerable, it enumerates over the collection. In this case, the collection is a call to the database. In the first part, you're getting the values from the database and updating them. In the second part, you're getting the values from the database again and setting that as the datasource (or, pedantically, you're setting the enumerator as the datasource, and then that is getting the values from the database).
Use .ToList() or similar to keep the results in memory, and access the same collection every time.
Assuming you are using LINQ to SQL, if ObjectTrackingEnabled is false, the objects will be constructed anew every time the query is run. Otherwise, you would get the same object instances each time and your changes would survive. However, as others have shown, instead of having the query execute multiple times, cache the results in a list. Not only will you get what you want working, you'll also have fewer database round trips.
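For reference, a minimal LINQ to SQL sketch; note that ObjectTrackingEnabled must be set before the first query executes:
var db = new BrokerManagerDataContext();
db.ObjectTrackingEnabled = true; // the default; false yields read-only objects rebuilt per query

// Materialize once so every later access sees the same tracked instances
var taAgents = db.TransactionAgents.ToList();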
I found that I had to locate the item in the list that I wanted to modify, extract the copy, modify the copy (by incrementing its count property), remove the original from the list and add the modified copy.
var x = stats.Where(d => d.word == s).FirstOrDefault();
var statCount = stats.IndexOf(x);
x.count++;
stats.RemoveAt(statCount);
stats.Add(x);
It is helpful to rewrite your LINQ expression using lambdas so that we can consider the code in more explicit terms.
//Original code from question
var taAgents = from ta in db.TransactionAgents
select ta;
//Rewritten to explicitly call attention to what Select() is actually doing
var taAgents = db.TransactionAgents.Select(ta => new TransactionAgents(/* database row's data */));
In the rewritten code, we can clearly see that Select() is constructing a new object based on each row returned from the database. What's more, this object construction occurs every time the IEnumerable taAgents is iterated through.
So, explained more concretely, if there are 5 TransactionAgents rows in the database, in the following example, the TransactionAgents() constructor is called a total of 10 times.
// Assume there are 5 rows in the TransactionAgents table
var taAgents = from ta in db.TransactionAgents
select ta;
// foreach will iterate through the IEnumerable, thus calling the TransactionAgents() constructor 5 times
foreach (var ta in taAgents)
{
    Console.WriteLine($"first iteration through taAgents - element {ta}");
}
// these first 5 TransactionAgents objects are now out of scope and are destroyed by the GC

// foreach will iterate through the IEnumerable, thus calling the TransactionAgents() constructor 5 MORE times
foreach (var ta in taAgents)
{
    Console.WriteLine($"second iteration through taAgents - element {ta}");
}
// these second 5 TransactionAgents objects are now out of scope and are destroyed by the GC
As we can see, all 10 of our TransactionAgents objects were created by the lambda in our Select() method, and do not exist outside of the scope of the foreach statement.
Thanks again for all the wonderful answers you have all posted!
I have two tables in SQL. The first defines the parent, and has a primary key column called ParentId. I also have a child table that has a primary key, and a foreign key as 'ParentId'. So the two tables form a one parent - to many children relationship.
The question is: what is the most efficient way to pull the parent + child data into C# code? The data has to be read into the following objects:
public class Parent
{
public int ParentId { get; set; }
public List<Child> Children { get; set; }
// ... many more properties ... //
}
public class Child
{
public int ChildId { get; set; }
public string Description { get; set; }
// ... many more properties ... //
}
If I use the following query I will get the parent and the children at once, where each parent is repeated as many times as it has children:
SELECT
    p.ParentId AS 'ParentId',
    c.ChildId AS 'ChildId'
    -- other relevant fields --
FROM
    Parents p
INNER JOIN
    Children c
ON
    p.ParentId = c.ParentId
Using this approach I'd have to find all the unique parent rows and then read all the children. The advantage is that I make only 1 trip to the DB.
The second version of this is to read all parents separately:
SELECT * FROM Parents
and then read all children separately:
SELECT * FROM Children
and use LINQ to merge all parents with children. This approach makes 2 trips to the db.
The third and final (also most inefficient) approach is to grab all parents and, while constructing each parent object, make a trip to the DB to grab all its children. This approach takes n+1 queries: one for all parents, and n trips to get the children of each parent.
Any advice on how to do this more easily? Granted, I can't get away from using stored procedures, and I can't use LINQ2SQL or EF. Would you prefer DataTables vs DataReaders, and if so, how would you use either with approach 1 or 2?
Thanks,
Martin
I prefer pulling all results in one query and just building the tree in one loop:
SELECT p.ParentId as 'ParentId', null as 'ChildId'
FROM Parents p
UNION ALL
SELECT c.ParentId as 'ParentId', c.ChildId as 'ChildId'
FROM Children c
ORDER BY ParentId, ChildId -- parent rows (null ChildId) sort before their children
List<Parent> result = new List<Parent>();
Parent current = null;
while (dr.Read())
{
    if (string.IsNullOrEmpty(Convert.ToString(dr["ChildId"])))
    {
        // create and initialize your parent object here,
        // set it to current, and add it to result
    }
    else if (!string.IsNullOrEmpty(Convert.ToString(dr["ChildId"]))
        && dr["ParentId"].ToString().Equals(current.ParentId.ToString()))
    {
        // create and initialize the child
        // add the child to the parent's Children collection
    }
}
"Using this approach I'd have to find all the unique parent rows, and then read all the children."
You could just include an order by p.ParentId. This ensures all children of the same parent are in consecutive rows, so as you read each row you can check whether the parent has changed: if it has, create a new parent object; otherwise add the child to the previous parent. No need to search for unique parent rows.
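A minimal sketch of that read loop, assuming dr is an open data reader over the joined query ordered by p.ParentId (ParentId and ChildId as in the SQL above; Description stands in for the other child fields):
var parents = new List<Parent>();
Parent current = null;

while (dr.Read())
{
    int parentId = (int)dr["ParentId"];
    if (current == null || current.ParentId != parentId)
    {
        // The parent changed: start a new parent object
        current = new Parent { ParentId = parentId, Children = new List<Child>() };
        parents.Add(current);
    }
    // With an inner join, every row carries a child for the current parent
    current.Children.Add(new Child
    {
        ChildId = (int)dr["ChildId"],
        Description = (string)dr["Description"]
    });
}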
I usually make this decision at the table level. Some tables I need the children often, so I grab them right away. In other cases accessing the children is a rarity, so I lazy-load them.
I would guess option #2 would be more efficient bandwidth-wise than option #1 (as you're not repeating any data).
You can have both queries in a single stored procedure, and execute the procedure through code using a SqlDataAdapter (i.e. (new SqlDataAdapter(command)).Fill(myDataSet), where myDataSet would then contain the two tables).
From there you'd read the first table, creating a dictionary of the parents (a Dictionary<int, Parent>) keyed by ParentId, then simply read each row in the second table to add the children:
foreach (DataRow row in myDataSet.Tables[1].Rows)
{
    parents[(int)row["ParentId"]].Children.Add(new Child() { /* etc. */ });
}
The pseudocode is probably off a bit, but hopefully you get the general idea.
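To make the single round trip concrete, a sketch of the fill-and-index part; GetParentsAndChildren is a hypothetical stored procedure returning the two SELECTs above:
// using System.Data; using System.Data.SqlClient;
var myDataSet = new DataSet();
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("GetParentsAndChildren", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    new SqlDataAdapter(command).Fill(myDataSet); // one round trip, two result sets
}

// Index the parents by ParentId from the first result set
var parents = new Dictionary<int, Parent>();
foreach (DataRow row in myDataSet.Tables[0].Rows)
{
    int parentId = (int)row["ParentId"];
    parents[parentId] = new Parent { ParentId = parentId, Children = new List<Child>() };
}
// ...then attach each child from myDataSet.Tables[1] as shown above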