I want to learn how others cope with the following scenario.
This is not homework or an assignment of any kind. The example classes were created to illustrate my question; however, they reflect a real-life scenario on which we would like feedback.
We retrieve all data from the database and place it into an object. An object represents a single record, and if multiple records exist in the database, we place the data into a List<> of the record object.
Let's say we have the following classes:
public class Employee
{
    public bool _Modified;
    public string _FirstName;
    public string _LastName;
    public List<Employee_Address> _Address;
}

public class Employee_Address
{
    public bool _Modified;
    public string _Address;
    public string _City;
    public string _State;
}
Please note that the getters and setters have been omitted from the classes for the sake of clarity. Before any code police accuse me of not using them, please note that they have been left out for this example only.
The database has a table for Employees and another for Employee Addresses.
Conceptually, what we do is to create a List object that represents the data in the database tables. We do a deep clone of this object which we then bind to controls on the front end. We then have two objects (Orig and Final) representing data from the database.
The user then makes changes to the "Final" object by creating, modifying, deleting records. We then want to persist these changes to the database.
Obviously we want to be as elegant as possible, only editing, creating, deleting those records that require it.
We ultimately want to compare the two List objects so that we can:
See which properties have changed so that the changes can be persisted to the database.
See which records no longer exist in the second List<> so that they can be deleted from the database.
See which new records exist in the second List<> so that we can create them in the database.
Who wants to get the ball rolling on how we can best achieve this? Keep in mind that we also need to drill down into the Employee_Address list to check for any changes, not just the top-level properties.
I hope I have made myself clear and look forward to any suggestions.
Add a nullable ObjectID field to your layer's base type. Pass it to the front end and back to see whether a particular instance is already persisted in the database.
It also has many other uses, even if you don't have any kind of Identity Map.
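A rough sketch of how a nullable ID could drive the comparison of the two lists. The names EmployeeRecord, ObjectId, ListDiff and Compare are made up for illustration and are not from the question; the sketch assumes that a null ID means the record has never been saved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type: ObjectId stays null until the row exists in the database.
public class EmployeeRecord
{
    public int? ObjectId;
    public string FirstName;
    public string LastName;
}

public class ListDiff
{
    public List<EmployeeRecord> ToInsert = new List<EmployeeRecord>();
    public List<EmployeeRecord> ToUpdate = new List<EmployeeRecord>();
    public List<int> ToDelete = new List<int>();

    // Compares the untouched copy (orig) with the copy the user edited.
    public static ListDiff Compare(List<EmployeeRecord> orig, List<EmployeeRecord> edited)
    {
        var diff = new ListDiff();
        var origById = orig.Where(e => e.ObjectId.HasValue)
                           .ToDictionary(e => e.ObjectId.Value);
        var seen = new HashSet<int>();

        foreach (var e in edited)
        {
            if (!e.ObjectId.HasValue)
            {
                diff.ToInsert.Add(e);   // no ID yet, so it was created in the UI
                continue;
            }

            seen.Add(e.ObjectId.Value);
            EmployeeRecord before;
            if (origById.TryGetValue(e.ObjectId.Value, out before) &&
                (before.FirstName != e.FirstName || before.LastName != e.LastName))
            {
                diff.ToUpdate.Add(e);
            }
        }

        // IDs that were loaded but are missing from the edited list were deleted.
        diff.ToDelete.AddRange(origById.Keys.Where(id => !seen.Contains(id)));
        return diff;
    }
}
```

The same comparison would have to be repeated for each employee's address list, keyed by the address ID, to catch nested changes.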
I would do exactly the same thing .NET does in its data classes: keep the record state (System.Data.DataRowState comes to mind) and all associated versions together in one object.
This way:
You can tell at a glance whether it has been modified, inserted, deleted, or is still the original record.
You can quickly find what has been changed by querying the new vs old versions, without having to dig in another collection to find the old version.
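As a minimal sketch of that idea, assuming you roll your own wrapper rather than using DataRow: RecordState and TrackedRecord<T> are hypothetical names, and the enum is only loosely modelled on DataRowState.

```csharp
using System;

// Hypothetical state enum, loosely modelled on System.Data.DataRowState.
public enum RecordState { Unchanged, Added, Modified, Deleted }

// Keeps the original and current versions of one record together with its state.
public class TrackedRecord<T> where T : class
{
    public T Original { get; private set; }   // null for records not yet in the database
    public T Current { get; private set; }
    public RecordState State { get; private set; }

    private TrackedRecord() { }

    // Wraps a record loaded from the database; 'copy' produces the editable version.
    public static TrackedRecord<T> Loaded(T fromDb, Func<T, T> copy)
    {
        return new TrackedRecord<T>
        {
            Original = fromDb,
            Current = copy(fromDb),
            State = RecordState.Unchanged
        };
    }

    // Wraps a record the user has just created.
    public static TrackedRecord<T> Created(T value)
    {
        return new TrackedRecord<T>
        {
            Original = null,
            Current = value,
            State = RecordState.Added
        };
    }

    // An Added record stays Added: it still needs an INSERT, not an UPDATE.
    public void MarkModified()
    {
        if (State == RecordState.Unchanged) State = RecordState.Modified;
    }

    public void MarkDeleted()
    {
        State = RecordState.Deleted;
    }
}
```

This sketch does not handle deleting a record that was never saved; a real version would probably just drop such a record instead of marking it Deleted.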
You should investigate the use of the Identity Map pattern. Coupled with Unit of Work, this allows you to maintain an object "cache" of sorts from which you can check which objects need saving to the database, and when reading, to return objects from the identity map rather than creating new objects and returning those.
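For what it's worth, a bare-bones identity map can be quite small. This is only a sketch: IdentityMap, GetOrLoad and the loader delegate are hypothetical names, and a real Unit of Work would also record which of the cached objects are new, dirty or removed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical identity map: at most one in-memory instance per database key.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _loaded = new Dictionary<TKey, TEntity>();

    public int Count
    {
        get { return _loaded.Count; }
    }

    // Returns the cached instance for 'key'; 'load' is called only on a cache miss.
    public TEntity GetOrLoad(TKey key, Func<TKey, TEntity> load)
    {
        TEntity entity;
        if (!_loaded.TryGetValue(key, out entity))
        {
            entity = load(key);
            if (entity != null)
            {
                _loaded.Add(key, entity);
            }
        }
        return entity;
    }
}
```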
Why would you want to compare two list objects? You will potentially be using a lot of memory for what is essentially duplicate data.
I'd suggest having a status property for each object that can tell you if that particular object is New, Deleted, or Changed. If you want to go further than making the property an Enum, you can make it an object that contains some sort of Dictionary that contains the changes to update, though that will most likely apply only in the case of the Changed status.
After you've added such a property, it should be easy to go through your list, add the New objects, remove the Deleted objects etc.
You may want to check how the Entity Framework does this sort of thing as well.
Related
I've got a class that I'd like to have fetch a record from a database. I need to make sure that there's at most one record. The record should be a single match for the class, based on the OrderId.
I feel like a property getter would make more sense than a method, but I know that property getters should avoid throwing Exceptions and .Single()/.SingleOrDefault() could end up throwing one. I feel like the method might make people think that it was fetching from the database each time. Either way, I'd have the result cached in a local field.
What is the best practice for something like this? I have an example of what my code is like below.
N.B.: I know that ideally we'd have a unique index on the DB column to make sure it's unique, but that's not possible with the vendor database we're using.
class OrderDetails
{
    DbOrder _order;
    string OrderId { get; set; }

    DbOrder Order // property way
    {
        get
        {
            if (this._order == null)
                this._order = dbContext.Where(x => x.OrderId == this.OrderId).SingleOrDefault();
            return _order;
        }
    }

    DbOrder GetOrder() // method way
    {
        if (this._order == null)
            this._order = dbContext.Where(x => x.OrderId == this.OrderId).SingleOrDefault();
        return _order;
    }
}
I'd say a property should raise an exception whenever one is needed, the same as anywhere else (and, just as anywhere else, exceptions should be avoided where possible).
More to the point, I think the rule is that properties shouldn't have "side effects", and although yours strictly does not, this is the closest thing I can liken it to. It seems like "a lot to do" (opening a DB connection, querying data, piping results) for a property, when a method could be more descriptive: you kind of expect a method to do more legwork ad hoc.
Take some time to think by yourself what your class OrderDetails represents:
It represents the details of an order.
or: it represents some access to some storage where order details can be fetched (and possibly changed).
If OrderDetails represented the first, it would simply be a POCO with only get and set properties.
Clearly, your OrderDetails represent the 2nd. Your class is meant to ease the access to the storage of the order details. It also hides the storage, so if the storage changes (database changes structure, or not a database anymore but in-memory data), users of your class won't have to change.
The function of your OrderDetails is also more an access to order details, because two objects of class OrderDetails would not mean two order details, but two methods to access the same order details.
If you didn't mean this, but you wanted every order detail object to represent its own order details, consider changing the class such that it contains the fetched data of the order details, not some access to fetch the data. Also make some functions that fetch the data and return an object with the fetched data.
Consider making static functions like OrderDetails.Create(...) or even better, create an order detail factory class that creates OrderDetail objects for you filled with the desired data.
If you have separated the data from the methods to fetch the data, your question will be answered: the methods to fetch the data will raise exceptions if fetching the data does not succeed. The POCO object that contains the fetched data won't have to raise exceptions: the data itself is not wrong.
If you really meant that your OrderDetail is not the detail of the order, but some access to get the order details, then getting the details of the order is clearly not a property of the access object, but some functionality of this access object.
Another reason to separate the order details from the access to the storage of the order details would be consistency of data. If your order detail has some related properties, and you get them from the storage in separate calls, how would you guarantee that your related data does not change between the first and the second call?
For instance: get the Postcode and City of the address of a person in separate calls. If the data of the person you are querying changes between your first and second call, because the person is moving, then you could get the old Postcode and the new City.
Summarized: make sure that it is clear to you which parts of your design represent the data itself and which represent some access to get the data, and design your classes accordingly. Data classes will be filled with get/set properties; access classes will be filled with functions that return the data classes.
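To make the split concrete, here is a small sketch. OrderData and OrderStore are hypothetical names, and an in-memory sequence stands in for the vendor database; the point is only that the access class is the one place that can throw, while the data class never does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Plain data class: only holds values, never touches storage.
public class OrderData
{
    public string OrderId { get; set; }
    public string Postcode { get; set; }
    public string City { get; set; }
}

// Access class: hides the storage and is the only place that throws on bad data.
public class OrderStore
{
    private readonly IEnumerable<OrderData> _rows;   // stand-in for the real database

    public OrderStore(IEnumerable<OrderData> rows)
    {
        _rows = rows;
    }

    // Returns null when no row matches; throws when the ID is not unique.
    public OrderData FetchOrder(string orderId)
    {
        // Take(2) is enough to tell "exactly one" apart from "more than one".
        var matches = _rows.Where(r => r.OrderId == orderId).Take(2).ToList();
        if (matches.Count > 1)
        {
            throw new InvalidOperationException("More than one order with id " + orderId);
        }
        return matches.Count == 1 ? matches[0] : null;
    }
}
```

Because FetchOrder returns the whole OrderData in one call, Postcode and City always come from the same read.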
I have a C# program that loads a list of products from a database into a list of Product objects. The user can add new products, edit products, and delete products through my program's interface. Pretty standard stuff. My question relates to tracking those changes and saving them back to the database. Before I get to the details, I know that using something like Entity Framework or NHibernate would solve my problem about tracking adds and edits, but I don't think it would solve my problem about tracking deletes. In addition to wanting an alternative to converting a large codebase to Entity Framework or NHibernate, I also want to know the answer to this question for my own curiosity.
In order to track edits, I'm doing something like this on the Product class where I set the IsDirty flag any time a property is changed:
class Product
{
    public bool IsDirty { get; set; }
    public bool IsNew { get; set; }

    // If the description is changed, set the IsDirty property
    public string Description
    {
        get
        {
            return _description;
        }
        set
        {
            if (value != _description)
            {
                this.IsDirty = true;
                _description = value;
            }
        }
    }
    private string _description;

    // ... and so on
}
When I create a new Product object, I set its IsNew flag, so the program knows to write it to the database the next time the user saves. Once I write a product to the database successfully, I clear its IsNew and IsDirty flags.
In order to track deletes, I made a List class that tracks deleted items:
class EntityList<T> : List<T>
{
    public List<T> DeletedItems { get; private set; }

    public EntityList()
    {
        this.DeletedItems = new List<T>();
    }

    // When an item is removed, track it in the DeletedItems list
    public new bool Remove(T item)
    {
        bool removed = base.Remove(item);
        if (removed)
        {
            // Only track items that were actually in the list
            this.DeletedItems.Add(item);
        }
        return removed;
    }
}
// ...
// When I work with a list of products, I use an EntityList instead of a List
EntityList<Product> products = myRepository.SelectProducts();
Each time I save a list of products to the database, I iterate through all of the products in the EntityList.DeletedItems property and delete those products from the database. Once the list is saved successfully, I clear the DeletedItems list.
All of this works, but it seems like I may be doing too much work, especially to track deleted items and to remember to set the IsNew flag every time I create a new Product object. I can't set the IsNew flag in Product's constructor because I don't want that flag set if I'm loading a Product object from the database. I'm also not thrilled with the fact that I have to use my EntityList class everywhere instead of using List.
It seems like this scenario is extremely common, but I haven't been able to find an elegant way of doing it through my research. So I have two questions:
1) Assuming that I'm not using something like Entity Framework, is there a better way to track adds, edits, and deletes and then persist those changes to the database?
2) Am I correct in saying that even when using Entity Framework or NHibernate, I'd still have to write some additional code to track my deleted items?
In EF the DbContext object contains all of the logic to track changes to objects that it knows about. When you call SaveChanges, it figures out which changes have happened and performs the appropriate actions to commit those changes to the database. You don't need to do anything specific with your object state other than inform the DbContext when you want to add or remove records.
Updates:
When you query a DbSet, the objects you get are tracked internally by EF. During SaveChanges the current state of those objects is compared against their original state, and those that have changed are put into a queue to be updated in the database.
Inserts:
When you add a new object to the relevant DbSet, it is flagged for insertion during the SaveChanges call. The object is enrolled in change tracking, its DB-generated fields (auto-increment IDs, for instance) are updated, etc.
Deletes:
To delete a record from the database you call Remove on the relevant DbSet and EF will perform that action during the next SaveChanges call.
So you don't need to worry about tracking those changes for the sake of the database; it's all handled for you. You might need to know for your own benefit, though: it's sometimes nice to be able to color changed records, for instance.
The above is also true for Linq2SQL and probably other ORMs and database interface layers, since their main purpose is to allow you to access data without having to write reams of code for doing things that can be abstracted out.
is there a better way to track adds, edits, and deletes and then persist those changes to the database?
Both Entity Framework and NHibernate chose not to make entities themselves responsible for notifying or tracking their changes*. So this can't be a bad choice. It certainly is a good choice from a design-pattern point of view (single responsibility).
They store snapshots of the data as they are loaded from the database in the context or session, respectively. Also, these snapshots have states telling whether they are new, updated, deleted or unchanged. And there are processes to compare actual values and the snapshots and update the entity states. When it's time to save changes, the states are evaluated and appropriate CRUD statements are generated.
This is all pretty complex to implement all by yourself. And I didn't even mention integrity of entity states and their mutual associations. But of course it's doable, once you decide to follow the same pattern. The advantage of the data layer notifying/tracking changes (and not the entities themselves) is that the DAL knows which changes are relevant for the data store. Not all properties are mapped to database tables, but the entities don't know that.
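A heavily simplified sketch of the snapshot idea, just to show the shape of it. ProductTracker, Attach and DetectChanges are hypothetical names, not the API of either ORM; only one mapped column is tracked, and new products are assumed to carry an Id that was never attached.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The entity carries no IsDirty/IsNew flags at all.
public class Product
{
    public int Id { get; set; }
    public string Description { get; set; }
}

// Hypothetical tracker owned by the data layer.
public class ProductTracker
{
    // Snapshot of the mapped value as it was when loaded, keyed by Id.
    private readonly Dictionary<int, string> _snapshots = new Dictionary<int, string>();

    // Called by the data layer for every product it loads.
    public void Attach(Product loaded)
    {
        _snapshots[loaded.Id] = loaded.Description;
    }

    // Compares the current list against the snapshots and classifies each difference.
    public void DetectChanges(IEnumerable<Product> current,
        out List<Product> added, out List<Product> modified, out List<int> deletedIds)
    {
        var list = current.ToList();

        added = list.Where(p => !_snapshots.ContainsKey(p.Id)).ToList();
        modified = list.Where(p => _snapshots.ContainsKey(p.Id)
                                   && _snapshots[p.Id] != p.Description).ToList();

        // Anything that was loaded but is no longer in the list was deleted.
        var currentIds = new HashSet<int>(list.Select(p => p.Id));
        deletedIds = _snapshots.Keys.Where(id => !currentIds.Contains(id)).ToList();
    }
}
```

With something like this, deletes fall out of the comparison, so a plain List<Product> would do and no special list class is needed.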
I'd still have to write some additional code to track my deleted items?
No. Both OR mappers have a concept of persistence ignorance. You basically just work with objects in memory, which may encompass removing them from a list (either nested in an owner entity or a list representing a database table) and the ORM knows how to sync the in-memory state of the entities with the database.
*Entity Framework used to have self-tracking entities, but they were deprecated.
I am designing a DAL.dll for a web application. The scenario is that, on the web, the user gets an entity, modifies some fields and clicks Save. My problem is how to make sure that only the modified fields are saved.
For Example, an entity:
public class POCO
{
    public int POCOID { get; set; }
    public string StringField { get; set; }
    public int IntField { get; set; }
}
and my update interface
//rows affected
int update (POCO entity);
When only IntField is modified, StringField is null, so I can ignore it. However, when only StringField is modified, IntField is 0 (default(int)), so I cannot determine whether it should be ignored or not.
Some limitations:
1. Stateless, no session, so I cannot use "get and update", a context, etc.
2. To be consistent with the data model, I cannot use nullable "int?".
Just a tip: if negative numbers are not allowed by your business requirements, you can use -1 to indicate that this value does not apply.
I don't really understand how you want to work stateless, but update only changed properties. It will never work when stateless, since you will need a before-after comparison, or anything else to track changes (like events on property setters). Special "virgin" values are not a good solution, since I think your user wants to see the actual IntField value.
Also make your database consistent with your application data - if you have standard, not-nullable int values, make the DB column int not null default 0! It is really a pain to have a database value which can't be represented by the program, so that the software "magically" turns DB null into 0. If you have a not-nullable int in your application, you can't distinguish between DB null and zero, or you have to add a property like bool IsIntFieldNull (no good!).
To reference a common object-relational mapper, NHibernate: it has an option called "dynamic-update" where only changed properties/columns are updated. This requires, however, a before-after check and stateful sessions, and there's debate on whether it helps performance, since sending the same DB query every time (with different parameter values) is better than sending multiple different queries, as opposed to unnecessary updates and network load. By default, NHibernate updates the whole row, after checking if any change has been made. If you only have ID, StringField and IntField, dynamic-update instead of full-row update might in fact be a good solution.
Mapping nullable DB columns to not-nullable application data types, such as int, is a common mistake when implementing NHibernate, since it creates self-changing DAL objects.
However, when working with an ORM or writing your own DAL, make sure you have proper database knowledge!
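If you write your own DAL, the dynamic-update idea roughly boils down to something like the sketch below. This is not NHibernate code: DynamicUpdate.Build is a hypothetical helper, it assumes both dictionaries hold the same columns, and it assumes table and column names come from your own mapping rather than from user input. Note that it still needs the "before" values, which is exactly the state a stateless design does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DynamicUpdate
{
    // Builds a parameterized UPDATE that sets only the columns whose values differ.
    // Returns null when nothing changed.
    public static string Build(string table, string keyColumn,
        IDictionary<string, object> before, IDictionary<string, object> after)
    {
        var changed = after.Keys
            .Where(c => c != keyColumn && !object.Equals(before[c], after[c]))
            .OrderBy(c => c, StringComparer.Ordinal)   // stable column order
            .ToList();

        if (changed.Count == 0)
        {
            return null;
        }

        var setClause = string.Join(", ", changed.Select(c => c + " = @" + c));
        return "UPDATE " + table + " SET " + setClause
             + " WHERE " + keyColumn + " = @" + keyColumn;
    }
}
```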
Options
Many ORMs (object-relational mappers) provide this type of functionality. You define your object to work with, say, "Entity Framework" or "NHibernate". These ORMs take care of reading and writing to the database. Internally, they have their own mechanisms to keep track of what has been modified.
Look into Delta<T> (right now it's an OData thing, so it may not be useful to use, but you can learn from it).
Make your own. Have some type of base class that all your other objects inherit from, and somehow when you set fields it records those somewhere else.
I highly recommend not relying on null or magic numbers (-1) to keep track of this. You will create a nightmare for yourself.
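A minimal sketch of the "make your own" option, assuming a hypothetical TrackedEntity base class. It records every property that was assigned, whatever the value, so an IntField explicitly set to 0 can be told apart from one that was never touched.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical base class: remembers which properties have been assigned.
public abstract class TrackedEntity
{
    private readonly HashSet<string> _assigned = new HashSet<string>();

    public bool WasAssigned(string propertyName)
    {
        return _assigned.Contains(propertyName);
    }

    public int AssignedCount
    {
        get { return _assigned.Count; }
    }

    // Call from every property setter; the compiler supplies the property name.
    protected void Set<T>(ref T field, T value, [CallerMemberName] string name = null)
    {
        field = value;
        _assigned.Add(name);
    }
}

public class Poco : TrackedEntity
{
    private string _stringField;
    private int _intField;

    public int PocoId { get; set; }   // key, not tracked

    public string StringField
    {
        get { return _stringField; }
        set { Set(ref _stringField, value); }
    }

    public int IntField
    {
        get { return _intField; }
        set { Set(ref _intField, value); }
    }
}
```

An update method could then include only the columns for which WasAssigned returns true.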
Consider if you will, the example of an Order class having a collection property of OrderLines.
public class Order
{
    public OrderLineCollection OrderLines { get; private set; }
}
Now consider a Data Access Layer that returns a collection of Order objects without the OrderLines property populated (empty collection).
To minimize round trips to the server, the system passes the ids of all the Order objects to the DAL, which returns the OrderLine objects for each Order in one go. Code in the Business Rules Layer is responsible for adding the correct OrderLine objects to the correct Order objects.
public class OrderDAL
{
    public IEnumerable<Order> GetOrdersByCustomer(int customerId)
    {
        ...
    }

    public IEnumerable<OrderLine> GetOrderLines(IEnumerable<int> orderIds)
    {
        ...
    }
}
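For illustration, the step that matches lines to orders might look something like the sketch below. Everything in it is an assumption made for the example: a plain List<OrderLine> stands in for OrderLineCollection, Order.Id and OrderLine.OrderId are invented key properties, and OrderAssembler is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OrderLine
{
    public int OrderId { get; set; }     // assumed foreign key back to the order
    public string Product { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public List<OrderLine> OrderLines { get; private set; }

    public Order()
    {
        OrderLines = new List<OrderLine>();
    }
}

public static class OrderAssembler
{
    // Groups the lines once, then hands each order its own lines.
    public static void AttachLines(IEnumerable<Order> orders, IEnumerable<OrderLine> lines)
    {
        var byOrder = lines.ToLookup(l => l.OrderId);
        foreach (var order in orders)
        {
            // A lookup yields an empty sequence for keys it does not contain.
            order.OrderLines.AddRange(byOrder[order.Id]);
        }
    }
}
```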
Is this the general way of doing this kind of thing (to reduce database round-trips)?
Should the DAL have the responsibility of returning fully populated Order objects?
Are there better ways?
And no, I cannot use an ORM tool in this particular instance!
I for one don't. I don't want to go back to the store to retrieve more data after an initial query. When loading the data, you (ought to) know for what environment you are loading it, so you will know beforehand what "navigational properties" or joins you want to make. This way, with one query, you can get all the data you want.
This is however from a stateless point of view, as I'm currently focusing on MVC and Entity Framework. I guess if you're creating an accounting program, you may have one Orders screen that displays order headers, and an Order Details screen where you want to display the details for the selected order. So in that case, yes, it can be useful to only have to retrieve the OrderLines for the selected order(s).
As usual, the answer is: it depends.
And no, I cannot use an ORM tool in this particular instance!
Why?
In order to assist users with repetitive data entry, I am trying to implement a system where many of the previous properties are remembered when adding new data.
Is it possible to use the Properties.Settings.Default.MySetting functionality or is there a better method for doing this kind of thing?
Couldn't you just make a (deep) copy of the previous object, and use that as the next object, allowing users to overwrite any fields that changed? This way, the fields would be what the user last entered, individualized per user, and updated as they changed those fields.
If you have a way of remembering the last thing entered per user, you could even preserve this between sessions.
The OP comments:
Unfortunately, making a deep copy of an object messes up the objectcontext when relationships are involved. A new object with relationships either needs to have new relational objects created or existing objects queried from the database.
So? If the relation is to another entity (a foreign key in the database), it's a uses-a relationship, and you just retain it. If it's an attribute, you copy it.
For example, let's say your form is data entry about employees, and it has a drop-down for, I dunno, employeeType, that's either "Exempt" (no overtime) or "Non-exempt" (gets overtime). You pulled the values for employeeType from the database, and you want the next employee entered to have the same values as the last entered employee, to save the data entry people keystrokes. So your deep copy would just associate the copied employee with the same database employeeType.
But for attribute data (like name), you'd make a copy.
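Something along these lines, as a sketch only. EmployeeEntry, Department and CopyForNextEntry are names invented for the example; the point is that attribute values are copied while the related employeeType object is shared, not cloned.

```csharp
using System;

// Lookup entity that already exists in the database.
public class EmployeeType
{
    public string Name { get; set; }
}

public class EmployeeEntry
{
    public string Name { get; set; }
    public string Department { get; set; }
    public EmployeeType Type { get; set; }   // uses-a relationship

    // Template for the next record: attribute values are copied,
    // the related EmployeeType is retained rather than cloned.
    public EmployeeEntry CopyForNextEntry()
    {
        return new EmployeeEntry
        {
            Name = Name,
            Department = Department,
            Type = Type
        };
    }
}
```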
It depends on what you're trying to achieve. The good thing about using the MySetting functionality is that the "Most Recent" properties can be persisted when the application is closed and are still there the next time it runs.
I'm assuming this is a WinForms application, so I'd possibly keep a cached instance of the last saved version of each backing object in a hashtable somewhere. Then, when you create a new form, look up the backing object in the hashtable and bind the required properties to the new instance of the form.
You can then serialize and persist the entire hashtable to the MySettings object if you like so it can be used each time the user accesses the application.