Entity Framework 4.3 - Code First - Update List Property - c#

As a follow-up to my earlier question, I now know that EF doesn't just save all of the changes of the entire entity for me automatically. If my entity has a List<Foo>, I need to update that list and save it. But how? I've tried a few things, but I can't get the list to save properly.
I have a many-to-many association between Application and CustomVariableGroup. An app can have one or more groups, and a group can belong to one or more apps. I believe I have this set up correctly with my Code First implementation because I see the many-to-many association table in the DB.
The bottom line is that the Application class has a List<CustomVariableGroup>. My simple case is that the app already exists, and now a user has selected a group to belong to the app. I want to save that change in the DB.
Attempt #1
this.Database.Entry(application).State = System.Data.EntityState.Modified;
this.Database.SaveChanges();
Result: Association table still has no rows.
Attempt #2
this.Database.Applications.Attach(application);
var entry = this.Database.Entry(application);
entry.CurrentValues.SetValues(application);
this.Database.SaveChanges();
Result: Association table still has no rows.
Attempt #3
CustomVariableGroup group = application.CustomVariableGroups[0];
application.CustomVariableGroups.Clear();
application.CustomVariableGroups.Add(group);
this.Database.SaveChanges();
Result: Association table still has no rows.
I've researched quite a bit, and I've tried more things than I've shown, and I simply don't know how to update an Application's list with a new CustomVariableGroup. How should it be done?
EDIT (Solution)
After hours of trial and error, this seems to be working. It appears that I need to get the objects from the DB, modify them, then save them.
public void Save(Application application)
{
Application appFromDb = this.Database.Applications.Single(
x => x.Id == application.Id);
CustomVariableGroup groupFromDb = this.Database.CustomVariableGroups.Single(
x => x.Id == 1);
appFromDb.CustomVariableGroups.Add(groupFromDb);
this.Database.SaveChanges();
}

While I consider this a bit of a hack, it works. I'm posting this in the hopes that it helps someone else save an entire day's worth of work.
public void Save(Application incomingApp)
{
if (incomingApp == null) { throw new ArgumentNullException("incomingApp"); }
int[] groupIds = GetGroupIds(incomingApp);
Application appToSave;
if (incomingApp.IdForEf == 0) // New app
{
appToSave = incomingApp;
// Clear groups, otherwise new groups will be added to the groups table.
appToSave.CustomVariableGroups.Clear();
this.Database.Applications.Add(appToSave);
}
else
{
appToSave = this.Database.Applications
.Include(x => x.CustomVariableGroups)
.Single(x => x.IdForEf == incomingApp.IdForEf);
}
AddGroupsToApp(groupIds, appToSave);
this.Database.SaveChanges();
}
private void AddGroupsToApp(int[] groupIds, Application app)
{
app.CustomVariableGroups.Clear();
List<CustomVariableGroup> groupsFromDb2 =
this.Database.CustomVariableGroups.Where(g => groupIds.Contains(g.IdForEf)).ToList();
foreach (CustomVariableGroup group in groupsFromDb2)
{
app.CustomVariableGroups.Add(group);
}
}
private static int[] GetGroupIds(Application application)
{
int[] groupIds = new int[application.CustomVariableGroups.Count];
int i = 0;
foreach (CustomVariableGroup group in application.CustomVariableGroups)
{
groupIds[i] = group.IdForEf;
i++;
}
return groupIds;
}

Related

C# - Creating a list by filtering a pre-existing list

I am very new to C# lists and databases, please keep this in mind.
I have a list of workouts saved in a database that also has the UserID field to make each workout added to the table unique to each user. I want to make a list view for when the user logs in, they can see only their workouts.
I have tried to do this by creating a new list that leaves out all the workouts that don't have that user's primary key/UserID:
public void Read()
{
using (UserDataContext context = new UserDataContext())
{
DatabaseWorkouts = context.Workouts.ToList(); // Loads all workouts from the database into a list
// DatabaseWorkouts = context.Workouts.FindAll(item => item.UserID != Globals.primaryKey); I thought this would work
foreach (var item in DatabaseWorkouts.ToList())
{
if (DatabaseWorkouts.Exists(item => item.UserID != Globals.primaryKey))
{
DatabaseWorkouts.Remove(item);
}
}
ItemList.ItemsSource = DatabaseWorkouts; //Displays the list on the listview in the GUI
}
}
I have run many tests with the code above, and I think it only displays the most recent workouts that meet the condition, instead of all workouts that meet the condition.
Please help
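Instead of fetching all the workouts and then removing the ones that don't belong to the user, you can directly fetch only the user's workouts. (The loop in the question misbehaves because Exists asks whether any workout in the list belongs to another user, not whether the current one does, so it keeps removing items from the front of the list until no other user's workout is left, which leaves only the most recent ones.)
Assuming that Globals.primaryKey is the targeted user's id, you can do the following: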
var userWorkouts = context.Workouts.Where(w => w.UserID == Globals.primaryKey).ToList();
ItemList.ItemsSource = userWorkouts;
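As a minimal in-memory sketch of the same idea: the Workout class, the WorkoutFilter helper and the sample data below are hypothetical stand-ins for the question's entities, not the asker's actual types. Against a real context the Where call would be translated to SQL rather than run over a list, but the filtering logic reads the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Workout entity from the question.
public class Workout
{
    public int UserID { get; set; }
    public string Name { get; set; }
}

public static class WorkoutFilter
{
    // Keeps only the workouts that belong to the given user.
    public static List<Workout> ForUser(IEnumerable<Workout> workouts, int userId)
    {
        return workouts.Where(w => w.UserID == userId).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample data (assumed): two users with interleaved workouts.
        var all = new List<Workout>
        {
            new Workout { UserID = 1, Name = "Squats" },
            new Workout { UserID = 2, Name = "Rowing" },
            new Workout { UserID = 1, Name = "Bench press" },
        };

        // Lists the workouts of user 1 only, regardless of insertion order.
        foreach (Workout w in WorkoutFilter.ForUser(all, 1))
        {
            Console.WriteLine(w.Name);
        }
    }
}
```

Because the filter looks at each workout on its own, the result does not depend on where other users' workouts sit in the list.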

How to Optimize Code Performance in .NET [closed]

I have an export job migrating data from an old database into a new database. The problem I'm having is that the user population is around 3 million and the job takes a very long time to complete (15+ hours). The machine I am using only has 1 processor so I'm not sure if threading is what I should be doing. Can someone help me optimize this code?
static void ExportFromLegacy()
{
var usersQuery = _oldDb.users.Where(x =>
x.status == "active");
int BatchSize = 1000;
var errorCount = 0;
var successCount = 0;
var batchCount = 0;
// Using MoreLinq's Batch for sequences
// https://www.nuget.org/packages/MoreLinq.Source.MoreEnumerable.Batch
foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
{
Console.WriteLine(String.Format("Batch count at {0}", batchCount));
batchCount++;
foreach(var user in batch)
{
try
{
var userData = _oldDb.userData.Where(x =>
x.user_id == user.user_id).ToList();
if (userData.Count > 0)
{
// Insert into table
var newData = new newData()
{
UserId = user.user_id, // shortened code for brevity.
};
_db.newUserData.Add(newData);
_db.SaveChanges();
// Insert item(s) into table
foreach (var item in userData.items)
{
if (!_db.userDataItems.Any(x => x.id == item.id))
{
var newItem = new Item()
{
UserId = user.user_id, // shortened code for brevity.
DataId = newData.id // id from object created above
};
_db.userDataItems.Add(newItem);
}
_db.SaveChanges();
successCount++;
}
}
}
catch(Exception ex)
{
errorCount++;
Console.WriteLine(String.Format("Error saving changes for user_id: {0} at {1}.", user.user_id.ToString(), DateTime.Now));
Console.WriteLine("Message: " + ex.Message);
Console.WriteLine("InnerException: " + ex.InnerException);
}
}
}
Console.WriteLine(String.Format("End at {0}...", DateTime.Now));
Console.WriteLine(String.Format("Successful imports: {0} | Errors: {1}", successCount, errorCount));
Console.WriteLine(String.Format("Total running time: {0}", (DateTime.Now - exportStart).ToString(@"hh\:mm\:ss")));
}
Unfortunately, the major issue is the number of database round-trips.
You make a round-trip:
For every user, you retrieve the user data by user id from the old database
For every user, you save the user data in the new database
For every user data item, you save the item in the new database
So if you have 3 million users, and every user has an average of 5 user data items, you make at least 3m + 3m + 15m = 21 million database round-trips, which is insane.
The only way to dramatically improve the performance is to reduce the number of database round-trips.
Batch - Retrieve user by id
You can quickly reduce the number of database round-trips by retrieving all the user data for a batch at once, and since you don't have to track these entities, use AsNoTracking() for even more performance gains.
var list = batch.Select(x => x.user_id).ToList();
var userDatas = _oldDb.userData
.AsNoTracking()
.Where(x => list.Contains(x.user_id))
.ToList();
foreach(var userData in userDatas)
{
....
}
This change alone should already save you a few hours.
Batch - Save Changes
Every time you save a user data or item, you perform a database round-trip.
Disclaimer: I'm the owner of the project Entity Framework Extensions
This library allows you to perform:
BulkSaveChanges
BulkInsert
BulkUpdate
BulkDelete
BulkMerge
You can either call BulkSaveChanges at the end of the batch, or build a list of entities to insert and call BulkInsert directly for even more performance.
You will, however, have to use a relation to the newData instance instead of using the ID directly.
foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
{
// Retrieve all users for the batch at once.
var list = batch.Select(x => x.user_id).ToList();
var userDatas = _oldDb.userData
.AsNoTracking()
.Where(x => list.Contains(x.user_id))
.ToList();
// Create list used for BulkInsert
var newDatas = new List<newData>();
var newDataItems = new List<Item>();
foreach(var userData in userDatas)
{
// newDatas.Add(newData);
// newDataItem.OwnerData = newData;
// newDataItems.Add(newDataItem);
}
_db.BulkInsert(newDatas);
_db.BulkInsert(newDataItems);
}
EDIT: Answer subquestion
One of the properties of a newDataItem, is the id of newData. (ex.
newDataItem.newDataId.) So newData would have to be saved first in
order to generate its id. How would I BulkInsert if there is a
dependency of an another object?
You must use navigation properties instead. With a navigation property, you never have to specify the parent id; you set the parent object instance instead.
public class UserData
{
public int UserDataID { get; set; }
// ... properties ...
public List<UserDataItem> Items { get; set; }
}
public class UserDataItem
{
public int UserDataItemID { get; set; }
// ... properties ...
public UserData OwnerData { get; set; }
}
var userData = new UserData();
var userDataItem = new UserDataItem();
// Use navigation property to set the parent.
userDataItem.OwnerData = userData;
Tutorial: Configure One-to-Many Relationship
Also, I don't see a BulkSaveChanges in your example code. Would that
have to be called after all the BulkInserts?
BulkInsert inserts directly into the database. You don't have to call SaveChanges or BulkSaveChanges; once you invoke the method, it's done ;)
Here is an example using BulkSaveChanges:
foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
{
// Retrieve all users for the batch at once.
var list = batch.Select(x => x.user_id).ToList();
var userDatas = _oldDb.userData
.AsNoTracking()
.Where(x => list.Contains(x.user_id))
.ToList();
// Create list used for BulkInsert
var newDatas = new List<newData>();
var newDataItems = new List<Item>();
foreach(var userData in userDatas)
{
// newDatas.Add(newData);
// newDataItem.OwnerData = newData;
// newDataItems.Add(newDataItem);
}
var context = new UserContext();
context.userDatas.AddRange(newDatas);
context.userDataItems.AddRange(newDataItems);
context.BulkSaveChanges();
}
BulkSaveChanges is slower than BulkInsert due to having to use some internal method from Entity Framework but still way faster than SaveChanges.
In the example, I create a new context for every batch to avoid memory issues and gain some performance. If you re-use the same context for all batches, you will end up with millions of tracked entities in the ChangeTracker, which is never a good idea.
Entity Framework is a very bad choice for importing large amounts of data. I know this from personal experience.
That being said, I found a few ways to optimize things when I tried to use it in the same way you are.
The Context will cache objects as you add them, and the more inserts you do, the slower future inserts will get. My solution was to limit each context to about 500 inserts before I disposed of that instance and created a new one. This boosted performance significantly.
I was able to make use of multiple threads to increase performance, but you will have to be very careful about resource contention. Each thread will definitely need its own Context, don't even think about trying to share it between threads. My machine had 8 cores, so threading will probably not help you as much; with a single core I doubt it will help you at all.
Turn off automatic change detection with AutoDetectChangesEnabled = false; change detection is incredibly slow. Unfortunately, this means you have to modify your code to make all changes directly through the context. No more Entity.Property = "Some Value";, it becomes something like Context.Entry(entity).Property(e => e.Property).CurrentValue = "Some Value"; (I don't remember the exact syntax), which makes the code ugly.
Any queries you do should definitely use AsNoTracking.
With all that, I was able to cut a ~20 hour process down to about 6 hours, but I still don't recommend using EF for this. It was an extremely painful project due almost entirely to my poor choice of EF to add data. Please use something else... anything else...
I don't want to give the impression that EF is a bad data access library, it is great at what it was designed to do, unfortunately this is not what it was designed for.
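The "new context every few hundred inserts" advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeContext is a hypothetical stand-in that counts rows instead of talking to a database, BatchImporter is an invented helper name, and the batch size of 500 is simply the figure mentioned above. With a real DbContext the shape of the loop would be the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DbContext; it only counts the rows it is given.
public sealed class FakeContext : IDisposable
{
    public static int Created; // how many contexts have been constructed

    private readonly List<int> _pending = new List<int>();

    public FakeContext() { Created++; }

    public void Add(int row) { _pending.Add(row); }

    // Pretends to write the pending rows and returns how many were "saved".
    public int SaveChanges()
    {
        int count = _pending.Count;
        _pending.Clear();
        return count;
    }

    public void Dispose() { }
}

public static class BatchImporter
{
    // Saves all rows, using a fresh context for every batch of batchSize rows.
    public static int Import(IEnumerable<int> rows, int batchSize)
    {
        int saved = 0;
        var batch = new List<int>(batchSize);
        foreach (int row in rows)
        {
            batch.Add(row);
            if (batch.Count == batchSize)
            {
                saved += SaveBatch(batch);
                batch.Clear();
            }
        }
        if (batch.Count > 0)
        {
            saved += SaveBatch(batch); // leftover rows that did not fill a batch
        }
        return saved;
    }

    private static int SaveBatch(List<int> batch)
    {
        // The context is disposed after each batch, so tracked entities never pile up.
        using (var context = new FakeContext())
        {
            foreach (int row in batch)
            {
                context.Add(row);
            }
            return context.SaveChanges();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int saved = BatchImporter.Import(Enumerable.Range(1, 1250), 500);
        Console.WriteLine("Saved {0} rows using {1} contexts", saved, FakeContext.Created);
    }
}
```

The point of the design is that each context only ever tracks one batch, so the cost of an insert stays roughly constant instead of growing with the total number of rows processed.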
I can think of a few options.
1) A small speed increase can be gained by moving your _db.SaveChanges() call to after the closing brace of your foreach loop:
foreach (...){
}
successCount += _db.SaveChanges();
2) Add the items to a list, and then add the list to the context
List<ObjClass> list = new List<ObjClass>();
foreach (...)
{
list.Add(new ObjClass() { ... });
}
_db.newUserData.AddRange(list);
successCount += _db.SaveChanges();
3) If it's a big amount of data, save in batches
List<ObjClass> list = new List<ObjClass>();
int cnt=0;
foreach (...)
{
list.Add(new ObjClass() { ... });
if (++cnt % 100 == 0) // bunches of 100
{
_db.newUserData.AddRange(list);
successCount += _db.SaveChanges();
list.Clear();
// Optional if a HUGE amount of data
if (cnt % 1000 == 0)
{
_db = new MyDbContext();
}
}
}
// Don't forget that!
_db.newUserData.AddRange(list);
successCount += _db.SaveChanges();
list.Clear();
4) If it's far too big, consider using bulk inserts. There are a few examples on the internet and a few free libraries around.
Ref: https://blogs.msdn.microsoft.com/nikhilsi/2008/06/11/bulk-insert-into-sql-from-c-app/
With most of these options you lose some control over error handling, as it is difficult to know which record failed.

How to update many to many in Entity Framework with AutoDetectChangesEnabled = false

Please, help me to handle this situation:
I deliberately switched off AutoDetectChangesEnabled, and I also deliberately load my entities with AsNoTracking().
I can't update a many-to-many relationship in this case.
Here is the code of the Update method:
public void Update(User user)
{
var userRoleIds = user.Roles.Select(x => x.Id);
var updated = _users.Find(user.Id);
if (updated == null)
{
throw new InvalidOperationException("Can't update a user that doesn't exist in the database");
}
updated.Name = user.Name;
updated.LastName = user.LastName;
updated.Login = user.Login;
updated.Password = user.Password;
updated.State = user.State;
var newRoles = _roles.Where(r => userRoleIds.Contains(r.Id)).ToList();
updated.Roles.Clear();
foreach (var newRole in newRoles)
{
updated.Roles.Add(newRole);
}
_context.Entry(updated).State = EntityState.Modified;
}
All simple fields, like Name and LastName, are updated. But the set of Roles for the User doesn't get updated; it stays the same.
I tried loading Roles using
_context.Entry(updated).Collection("Roles").Load();
But I can't update this loaded set in any way.
I searched for similar questions but failed to find an answer, though one almost certainly already exists.
I'm really sorry if this is a duplicate.
PS. I want to add that I don't want to delete or update the child entities at all.
A lot of existing answers suggest manually deleting and re-adding the child entities in the database as a whole, but that is not suitable for me.
Roles are independent entities; any other user can use them.
I just want to update the User_Role table in the database, but I can't.

Entity framework inserts wrong entity into db on savechanges

I am trying to write a program to scan a directory containing tv show folders, look up some details about the shows using tvrage API and then save the details to a database using entity framework.
My TVShow table pkey is the same value as taken from the tvrage database show id, and I am having issues when duplicate or similar folder names are returning the same Show info. In a situation where I have a directory containing three folders, "Alias", "Alias 1" , "Band of Brothers" I get the following output from my code
* TV SHOWS *
Alias....... NO MATCH......ADDING........DONE
Alias 1 ...... NO MATCH.....ADDING....CANT ADD, ID ALREADY EXISTS IN DB
Band of Brothers ...... NO MATCH..ADDING....
Before getting an UpdateException on the context.SaveChanges(); line
Violation of PRIMARY KEY constraint 'PK_TVShows'.
I can see using SQL Profiler that the problem is that my app is trying to insert the Alias show a second time with a duplicate key, but I can't see why. When I step through the code on the second iteration of the foreach loop (the second "Alias" folder), the code that saves the show entity to the database is bypassed.
It is only on the next iteration of the foreach loop, when I have created a new TVShow entity for "Band of Brothers", that I actually reach the code which adds a TVShow to the context and saves, at which point the app crashes. In Visual Studio I can see at the point of the crash that:
"show" entity in context.TVShows.AddObject(show) is "Band of Brothers" w/ a unique ID
context.TVShows only contains one record, the first Alias Entity
But SQL Profiler shows that Entity Framework is instead inserting Alias for a second time, and I am stumped as to why.
private void ScanForTVShowFolders( GenreDirectoryInfo drive ) {
IEnumerable<DirectoryInfo> shows = drive.DirInfo.EnumerateDirectories();
foreach (DirectoryInfo d in shows) {
//showList contains a list of existing TV show names previously queried out of DB
if (showList.Contains(d.Name)) {
System.Console.WriteLine(d.Name + ".....MATCH");
} else {
System.Console.Write(d.Name + "......NO MATCH..ADDING....");
TVShow show = LookUpShowOnline(d.Name, drive.GenreName);
if (show.Id == -1) { // id of -1 means online search failed
System.Console.Write("..........CANT FIND SHOW" + Environment.NewLine);
} else if (context.TVShows.Any(a => a.Id == show.Id)) { //catch duplicate primary key insert
System.Console.Write(".......CANT ADD, ID ALREADY EXISTS IN DB" + Environment.NewLine);
} else {
context.TVShows.AddObject(show);
context.SaveChanges();
System.Console.Write("....DONE" + Environment.NewLine);
}
}
}
private TVShow LookUpShowOnline( string name, string genre ) {
string xmlPath = String.Format("http://services.tvrage.com/feeds/search.php?show='{0}'", name);
TVShow aShow = new TVShow();
aShow.Id = -1; // -1 = Can't find
XmlDocument xmlResp = new XmlDocument();
try { xmlResp.Load(xmlPath); } catch (WebException e) { System.Console.WriteLine(e); }
XmlNode root = xmlResp.FirstChild;
if (root.NodeType == XmlNodeType.XmlDeclaration) { root = root.NextSibling; }
XmlNode tvShowXML;
//if (showXML["episode"] == null)
// return false;
tvShowXML = root["show"];
if (tvShowXML != null) {
aShow.Id = System.Convert.ToInt16(tvShowXML["showid"].InnerText);
aShow.Name = tvShowXML["name"].InnerText.Trim();
aShow.StartYear = tvShowXML["started"].InnerText.Trim();
aShow.Status = tvShowXML["status"].InnerText.Trim();
aShow.TVGenre = context.TVGenres.Where(b => b.Name.Trim() == genre).Single();
}
return aShow;
}
}
Edit
After doing some more reading, I added context.ObjectStateManager to my debug watch list, and I can see that every time I create a new TVShow entity a new record is added to _addedEntityStore. In fact, if I remove context.TVShows.AddObject(show) the code still updates the database, so manually adding to the context seems redundant.
If you are inserting objects in a foreach loop, it's better to keep the primary key counter outside the loop and increment it yourself!
e.g.: int newID = Shows.Select(d => d.Id).Max() + 1;
foreach(............)
{
show.Id = newID++;
.
.
. //remaining fields
.
context.TVShows.AddObject(show);
}
context.SaveChanges();
it works for me...!!
It turns out context.TVShows.AddObject(show) is unnecessary in my case; I was inadvertently adding every created show entity to the context when this line runs:
aShow.TVGenre = context.TVGenres.Where(b => b.Name.Trim() == genre).Single();
This is not what I wanted, I just wanted to create the object, then decide whether to add it. Will be pretty easy to fix now I know why it's happening.

How to update records from an IList in a Foreach loop?

My controller is passing through a list which I then need to loop through and update every record in the list in my database. I'm using ASP.NET MVC with a repository pattern using Linq to Sql. The code below is my save method which needs to add a record to an invoice table and then update the applicable jobs in the job table from the db.
public void SaveInvoice(Invoice invoice, IList<InvoiceJob> invoiceJobs)
{
invoiceTable.InsertOnSubmit(invoice);
invoiceTable.Context.SubmitChanges();
foreach (InvoiceJob j in invoiceJobs)
{
var jobUpdate = invoiceJobTable.Where(x => x.JobID == j.JobID).Single();
jobUpdate.InvoiceRef = invoice.InvoiceID.ToString();
invoiceJobTable.GetOriginalEntityState(jobUpdate);
invoiceJobTable.Context.Refresh(RefreshMode.KeepCurrentValues, jobUpdate);
invoiceJobTable.Context.SubmitChanges();
}
}
Note: I've stripped the code down to just the problem area.
This code doesn't work and no job records are updated, but the invoice table is updated fine. No errors are thrown and the invoiceJobs IList is definitely not null. If I change the code by removing the foreach loop and manually specifying which JobId to update, it works fine. The below works:
public void SaveInvoice(Invoice invoice, IList<InvoiceJob> invoiceJobs)
{
invoiceTable.InsertOnSubmit(invoice);
invoiceTable.Context.SubmitChanges();
var jobUpdate = invoiceJobTable.Where(x => x.JobID == 10000).Single();
jobUpdate.InvoiceRef = invoice.InvoiceID.ToString();
invoiceJobTable.GetOriginalEntityState(jobUpdate);
invoiceJobTable.Context.Refresh(RefreshMode.KeepCurrentValues, jobUpdate);
invoiceJobTable.Context.SubmitChanges();
}
I just can't get the foreach loop to work at all. Does anyone have any idea what I'm doing wrong here?
It seems like the most likely cause of this problem is that the invoiceJobs collection is empty. That is, it has no elements, so the foreach loop effectively does nothing.
You can verify this by adding the following to the top of the method (just for debugging purposes):
if (invoiceJobs.Count == 0) {
throw new ArgumentException("It's an empty list");
}
Change this
var jobUpdate = invoiceJobTable.Where(x => x.JobID == 10000).Single();
jobUpdate.InvoiceRef = invoice.InvoiceID.ToString();
invoiceJobTable.GetOriginalEntityState(jobUpdate);
invoiceJobTable.Context.Refresh(RefreshMode.KeepCurrentValues, jobUpdate);
invoiceJobTable.Context.SubmitChanges();
to
var jobUpdate = invoiceJobTable.Where(x => x.JobID == 10000).Single();
jobUpdate.InvoiceRef = invoice.InvoiceID.ToString();
invoiceJobTable.Context.SubmitChanges();
It looks like your GetOriginalEntityState call doesn't actually do anything, because you don't use the returned value. I can't see any reason why you are making the DataContext.Refresh() call. All it does is erase the changes you made, which is what makes your foreach loop "not work".
