I am working on a project where I am wrestling with trying to move from one persistence pattern to another.
I've looked in Patterns of Enterprise Application Architecture, Design Patterns, and here at this MSDN article for help. Our current pattern is the Active Record pattern described in the MSDN article. As a first step in moving to a more modular code base we are trying to break out some of our business objects (aka tables) into multiple interfaces.
So for example, let's say I have a store application something like this:
public interface IContactInfo
{
...
}
public interface IBillingContactInfo: IContactInfo
{
...
}
public interface IShippingContactInfo: IContactInfo
{
...
}
public class Customer : IBillingContactInfo, IShippingContactInfo
{
    #region IBillingContactInfo Implementation
    ...
    #endregion

    #region IShippingContactInfo Implementation
    ...
    #endregion

    // Active Record style: the object loads and saves its own row.
    public void Load(int customerID) { ... }
    public void Save() { ... }
}
The Customer class represents a row in our Customer Table. Even though the Customer class is one row it actually implements two different interfaces: IBillingContactInfo, IShippingContactInfo.
Historically we didn't have those two interfaces. We simply passed the entire Customer object around everywhere, made whatever changes we wanted to it, and then saved it.
Here is where the problem comes in. Now that we have those two interfaces, we may have a control that takes an IContactInfo, displays it to the user, and allows the user to correct it if it is wrong. Currently our IContactInfo interface doesn't declare a Save() method, so there is no way to persist changes made through it.
Any suggestions on good design patterns to get around this limitation without a complete switch to other well-known solutions? I don't really want to go through and add a Save() method to all my interfaces, but it may be what I end up needing to do.
How many different derivatives of IContactInfo do you plan to have?
Maybe I'm missing the point, but I think you would do better with a class called ContactInfo with a BillTo and a ShipTo instance in each Customer. Since your IShippingContactInfo and IBillingContactInfo interfaces inherit from the same IContactInfo interface, your Customer class will satisfy both IContactInfo base interfaces with one set of fields. That would be a problem.
It's better to make those separate instances. Then, saving your Customer is much more straight-forward.
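A minimal sketch of that suggestion, assuming a hypothetical concrete ContactInfo class (the property names here are illustrative and not taken from the question). Each Customer holds two independent instances, so billing and shipping data cannot collapse into one set of fields:

```csharp
using System;

// Hypothetical concrete type; the property names are illustrative only.
public class ContactInfo
{
    public string Name { get; set; } = "";
    public string Street { get; set; } = "";
    public string City { get; set; } = "";
}

public class Customer
{
    // Two separate instances, so billing and shipping never share fields.
    public ContactInfo BillTo { get; } = new ContactInfo();
    public ContactInfo ShipTo { get; } = new ContactInfo();

    public void Save()
    {
        // Persist the customer, including both contact blocks, here.
    }
}
```

A control that edits contact details would then receive customer.BillTo or customer.ShipTo, and the owning Customer stays responsible for saving.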
Are you planning on serialization for persistence or saving to a database or something else?
Using a concrete type for Customer and ContactInfo would definitely cover the first two.
(A flat file would work for your original setup, but I hope you aren't planning on that.)
I think it all comes down to how many derivatives of IContactInfo you expect to have. There is nothing wrong with a bit more topography in your graph, whether that means one record with multiple portions (your example), a one-to-many relationship (my example), or a many-to-many relationship that lists the type (ShipTo, BillTo, etc.) in the join table. The many-to-many option definitely reduces the relationships between Customer and the various ContactInfo types, but it creates overhead in application development for the scenarios where you want concrete relationships.
You can easily add a Save() method constraint to the inherited interfaces by having IContactInfo extend an IPersistable interface, which mandates the Save() method. Anything that implements IContactInfo then also implements IPersistable, and therefore must have Save(). You can also do this with ILoadable and Load(int ID) or, with more semantic correctness, IRetrievable and Retrieve(int ID).
This completely depends on how you're using your ContactInfo objects though. If this doesn't make sense with relation to your usage please leave a comment/update your question and I'll revisit my answer.
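As a rough sketch of the IPersistable idea above: the Name member, the SaveCount counter, and the ContactEditor helper are assumptions made for illustration, and the counter merely stands in for a real database write.

```csharp
using System;

public interface IPersistable
{
    void Save();
}

// IContactInfo extends IPersistable, so every contact-info view can be saved.
public interface IContactInfo : IPersistable
{
    string Name { get; set; }   // hypothetical member
}

public interface IBillingContactInfo : IContactInfo { }
public interface IShippingContactInfo : IContactInfo { }

public class Customer : IBillingContactInfo, IShippingContactInfo
{
    public string Name { get; set; } = "";

    // Stands in for a real database write in this sketch.
    public int SaveCount { get; private set; }

    public void Save() => SaveCount++;
}

public static class ContactEditor
{
    // A control that only knows about IContactInfo can still persist its edits.
    public static void Rename(IContactInfo contact, string newName)
    {
        contact.Name = newName;
        contact.Save();
    }
}
```

Note that Save() here still saves the whole Customer row, which matches the existing Active Record behaviour; the interface only makes that capability visible to code that holds an IContactInfo.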
I'm studying SOLID principles and have a question about dependency management in relation to interfaces.
An example from the book I'm reading (Adaptive Code via C# by Gary McLean Hall) shows a TradeProcessor class that will get the trade data, process it, and store it in the database. The trade data is modeled by a class called TradeRecord. A TradeParser class will handle converting the trade data that is received into a TradeRecord instance(s). The TradeProcessor class only references an ITradeParser interface so that it is not dependent on the TradeParser implementation.
The author has the Parse method (in the ITradeParser interface) return an IEnumerable<TradeRecord> collection that holds the processed trade data. Doesn't that mean that ITradeParser is now dependent on the TradeRecord class?
Shouldn't the author have done something like make an ITradeRecord interface and have Parse return a collection of ITradeRecord instances? Or am I missing something important?
Here's the code (the implementation of TradeRecord is irrelevant so it is omitted):
TradeProcessor.cs
public class TradeProcessor
{
    private readonly ITradeParser tradeParser;

    public TradeProcessor(ITradeParser tradeParser)
    {
        this.tradeParser = tradeParser;
    }

    public void ProcessTrades()
    {
        // Placeholder input, wrapped in an array so it is a valid IEnumerable<string>.
        IEnumerable<string> tradeData = new[] { "Simulated trade data..." };
        var trades = tradeParser.Parse(tradeData);
        // Do something with the parsed data...
    }
}
ITradeParser.cs
public interface ITradeParser
{
IEnumerable<TradeRecord> Parse(IEnumerable<string> tradeData);
}
This is a good question that goes into the tradeoff between purity and practicality.
Yes, by pure principle, you can say that ITradeParser.Parse should return a collection of ITradeRecord interfaces. After all, why tie yourself to a specific implementation?
However, you can take this further. Should you accept an IEnumerable<string>? Or should you have some sort of ITextContainer? I32bitNumeric instead of int? This is reductio ad absurdum, of course, but it shows that we always, at some point, reach a point where we're working on something concrete (a number, a string, a TradeRecord, whatever), not an abstraction.
This also brings up the point of why we use interfaces in the first place, which is to define contracts for logic and functionality. An ITradeParser is a contract for an unknown implementation that can be replaced or updated. A TradeRecord isn't a contract for an implementation, it is the implementation. If it's a DTO, which it seems to be, there would be no difference between the interface and the implementation, which means there's no real purpose in defining this contract: it's implied by the concrete class.
The author has the Parse method (in the ITradeParser interface) return an IEnumerable<TradeRecord> collection that holds the processed trade data.
Doesn't that mean that ITradeParser is now dependent on the TradeRecord class?
Yes, ITradeParser is now tightly coupled with TradeRecord. Given the more academic approach of this question, I can see where you are coming from. But what is TradeRecord? A record, by definition, is generally a simple, non-intelligent piece of data (sometimes called POCO, DTO, or Model).
At some point, the potential gain of abstraction is less valuable than the complexities it causes. This approach is pretty common in practice - Models (as I refer to them) are sealed types that flow through the layers of an application. Layers that act upon the models are abstracted to interfaces, so that each layer may be mocked and tested separately.
For example, a client application may have a View, ViewModel, and Repository layer. Each layer knows how to work with the concrete record type. But the ViewModel could be wired up to work with a mocked IRepository, which builds up the concrete types with hardcoded, mocked data. There's no benefit to an abstracted IModel at this point - it just has straight data.
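The same idea can be sketched with the types from the question. The TradeRecord fields, the FakeTradeParser class, and the return value of ProcessTrades are assumptions added for illustration (the book's versions are not shown in the question). The model stays concrete and sealed, while the layer that produces it is swapped behind its interface:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Concrete, sealed model: plain data with no behaviour worth abstracting.
public sealed class TradeRecord
{
    public string Symbol { get; set; } = "";   // hypothetical fields
    public int Lots { get; set; }
}

public interface ITradeParser
{
    IEnumerable<TradeRecord> Parse(IEnumerable<string> tradeData);
}

// Test double: ignores its input and returns hardcoded records.
public sealed class FakeTradeParser : ITradeParser
{
    public IEnumerable<TradeRecord> Parse(IEnumerable<string> tradeData) =>
        new[] { new TradeRecord { Symbol = "GBPUSD", Lots = 2 } };
}

public class TradeProcessor
{
    private readonly ITradeParser tradeParser;

    public TradeProcessor(ITradeParser tradeParser) => this.tradeParser = tradeParser;

    // Returns the total lots so that a test has something to assert on.
    public int ProcessTrades(IEnumerable<string> tradeData) =>
        tradeParser.Parse(tradeData).Sum(t => t.Lots);
}
```

The fake builds real TradeRecord instances, so nothing is gained by hiding the record behind an ITradeRecord interface.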
I'm trying to build a new application using the Repository pattern for the first time and I'm a little confused about using a Repository. Suppose I have the following classes:
public class Ticket
{
}

public class User
{
    public List<Ticket> AssignedTickets { get; set; }
}

public class Group
{
    public List<User> GroupMembers { get; set; }
    public List<Ticket> GroupAssignedTickets { get; set; }
}
I need methods that can populate these collections by fetching data from the database.
I'm confused as to which associated Repository class I should put those methods in. Should I design my repositories so that everything returning type T goes in the repository for type T as such?
public class TicketRepository
{
public List<Ticket> GetTicketsForGroup(Group g) { }
public List<Ticket> GetTicketsForUser(User u) { }
}
public class UserRepository
{
public List<User> GetMembersForGroup(Group g) { }
}
The obvious drawback I see here is that I need to start instantiating a lot of repositories. What if my User also has assigned Widgets, Fidgets, and Lidgets? When I populate a User, I need to instantiate a WidgetRepository, a FidgetRepository, and a LidgetRepository all to populate a single user.
Alternatively, do I construct my repository so that everything requesting based on type T is lumped into the repository for type T as listed below?
public class GroupRepository
{
public List<Ticket> GetTickets(Group g) { }
public List<User> GetMembers(Group g) { }
}
public class UserRepository
{
public List<Ticket> GetTickets(User u) { }
}
The advantage I see here is that if I now need my user to have a collection of Widgets, Fidgets, and Lidgets, I just add the necessary methods to the UserRepository pattern and don't need to instantiate a bunch of different repository classes every time I want to create a user, but now I've scattered the concerns for a user across several different repositories.
I'm really not sure which way is right, if any. Any suggestions?
The repository pattern can help you to put things that change for the same reason together, as well as separate things that change for different reasons.
On the whole, I would expect a "User Repository" to be a repository for obtaining users. Ideally, it would be the only repository that you can use to obtain users, because if you change stuff, like user tables or the user domain model, you would only need to change the user repository. If you have methods on many repositories for obtaining a user, they would all need to change.
Limiting the impact of change is good, because change is inevitable.
As for instantiating many repositories, using a dependency injection tool such as Ninject or Unity to supply the repositories, or using a repository factory, can reduce new-ing up lots of repositories.
Finally, you can take a look at the concept of Domain Driven Design to find out more about the key purpose behind domain models and repositories (and also about aggregate roots, which are relevant to what you are doing).
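To illustrate how injection keeps the "many repositories" cost low, here is a minimal sketch using plain constructor injection with no container. ITicketRepository, InMemoryTicketRepository, and UserLoader are hypothetical names, and the in-memory dictionary stands in for a database:

```csharp
using System;
using System.Collections.Generic;

public class Ticket
{
    public string Title { get; set; } = "";
}

public class User
{
    public int Id { get; set; }
    public List<Ticket> AssignedTickets { get; set; } = new List<Ticket>();
}

// The only place that knows how tickets are fetched.
public interface ITicketRepository
{
    List<Ticket> GetTicketsForUser(User u);
}

// In-memory stand-in for a database-backed repository.
public class InMemoryTicketRepository : ITicketRepository
{
    private readonly Dictionary<int, List<Ticket>> ticketsByUserId;

    public InMemoryTicketRepository(Dictionary<int, List<Ticket>> ticketsByUserId) =>
        this.ticketsByUserId = ticketsByUserId;

    public List<Ticket> GetTicketsForUser(User u) =>
        ticketsByUserId.TryGetValue(u.Id, out var tickets) ? tickets : new List<Ticket>();
}

// Receives its repositories instead of new-ing them up itself.
public class UserLoader
{
    private readonly ITicketRepository tickets;

    public UserLoader(ITicketRepository tickets) => this.tickets = tickets;

    public User Load(int id)
    {
        var user = new User { Id = id };
        user.AssignedTickets = tickets.GetTicketsForUser(user);
        return user;
    }
}
```

A container such as Ninject or Unity would construct UserLoader and supply the repository automatically; the class itself would not need to change.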
Fascinating question with no right answer. This might be a better fit for programmers.stackexchange.com rather than stackoverflow.com. Here are my thoughts:
Don't worry about creating too many repositories. They are basically stateless objects so it isn't like you will use too much memory. And it shouldn't be a significant burden to the programmer, even in your example.
The real benefit of repositories is for mocking the repository for unit testing. Consider splitting them up based on what is simplest for the unit tests, to make the dependency injection simple and clear. I've seen cases where every query is a repository (they call those "queries" instead of repositories). And other cases where there is one repository for everything.
As it turns out, the first option was the more practical option in this case. There were a few reasons for this:
1) When making changes to a type and its associated repository (assume Ticket), it was far easier to modify the Ticket and TicketRepository in one place than to chase down every method in every repository that used a Ticket.
2) When I attempted to use interfaces to dictate the type of queues each repository could pull, I ran into issues where a single repository couldn't implement a generic interface using type T multiple times, with the only differentiation in the interface method implementations being the parameter type.
3) I access data from SharePoint and a database in my implementation, and created two abstract classes to provide data tools to the concrete repositories for either SharePoint or SQL Server. Assume that in the example above Users come from SharePoint while Tickets come from a database. Using my model I would not be able to use these abstract classes, as the group repository would have to inherit from both my SharePoint abstract class and my SQL abstract class. C# does not support multiple inheritance of classes. However, if I'm grouping all Ticket-related behaviours into a TicketRepository and all User-related behaviours into a UserRepository, each repository only needs access to one type of underlying data source (SQL or SharePoint, respectively).
I have a database that contains "widgets", let's say. Widgets have properties like Length and Width, for example. The original lower-level API for creating widgets is a mess, so I'm writing a higher-level set of functions to make things easier for callers. The database is strange, and I don't have good control over the timing of the creation of a widget object. Specifically, it can't be created until the later stages of processing, after certain other things have happened first. But I'd like my callers to think that a widget object has been created at an earlier stage, so that they can get/set its properties from the outset.
So, I implemented a "ProxyWidget" object that my callers can play with. It has private fields like private_Length and private_Width that can store the desired values. Then, it also has public properties Length and Width, that my callers can access. If the caller tells me to set the value of the Width property, the logic is:
If the corresponding widget object already exists in the database, then set its Width property.
If not, store the given width value in the private_Width field for later use.
At some later stage, when I'm sure that the widget object has been created in the database, I copy all the values: copy from private_Width to the database Width field, and so on (one field/property at a time, unfortunately).
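In simplified form, the approach described so far looks roughly like the sketch below. DbWidget is a hypothetical stand-in for the real database object, and Attach is an assumed name for the step that copies the buffered values across:

```csharp
using System;

// Hypothetical stand-in for the real database widget.
public class DbWidget
{
    public double Length { get; set; }
    public double Width { get; set; }
}

public class ProxyWidget
{
    private double private_Length;
    private double private_Width;
    private DbWidget dbWidget;   // stays null until the database object exists

    public double Length
    {
        get => dbWidget != null ? dbWidget.Length : private_Length;
        set
        {
            if (dbWidget != null) dbWidget.Length = value;   // write through
            else private_Length = value;                     // buffer for later
        }
    }

    public double Width
    {
        get => dbWidget != null ? dbWidget.Width : private_Width;
        set
        {
            if (dbWidget != null) dbWidget.Width = value;
            else private_Width = value;
        }
    }

    // Called once the database object has been created: copy every buffered value.
    public void Attach(DbWidget created)
    {
        created.Length = private_Length;
        created.Width = private_Width;
        dbWidget = created;
    }
}
```

Every property needs its own backing field, its own getter/setter pair, and its own line in Attach, which is where the repetition comes from.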
This works OK for one type of widget. But I have about 50 types, each with about 20 different fields/properties, and this leads to an unmaintainable mess. I'm wondering if there is a smarter approach. Perhaps I could use reflection to create the "proxy" objects and copy field/property data in a generic way, rather than writing reams of repetitive code? Factor out common code somehow? Can I learn anything from "data binding" patterns? I'm a mathematician, not a programmer, and I have an uneasy feeling that my current approach is just plain dumb. My code is in C#.
First, in my experience, manually coding a data access layer can feel like a lot of repetitive work (putting an ORM in place, such as NHibernate or Entity Framework, might somewhat alleviate this issue), and updating a legacy data access layer is awful work, especially when it consists of many parts.
Some things are unclear in your question, but I suppose it is still possible to give a high-level answer. These are meant to give you some ideas:
You can build ProxyWidget either as an alternative implementation for Widget (or whatever the widget class from the existing low-level API is called), or you can implement it "on top of", or as a "wrapper around", Widget. This is the Adapter design pattern.
public sealed class ExistingTerribleWidget { … }
public sealed class ShinyWidget // this is the wrapper that sits on top of the above
{
public ShinyWidget(ExistingTerribleWidget underlying) { … }
private ExistingTerribleWidget underlying;
… // perform all real work by delegating to `underlying` as appropriate
}
I would recommend that (at least while there is still code using the existing low-level API) you use this pattern instead of creating a completely separate Widget implementation, because otherwise, if there is ever a database schema change, you would have to update two different APIs. If you build your new ShinyWidget class as a wrapper on top of the existing API, its interface could remain unchanged and only the underlying implementation would have to be updated.
You describe ProxyWidget as having two functions: (1) allow modifications to an already persisted widget; and (2) buffer a new widget, which will be added to the database later.
You could perhaps simplify your design if you have one common base type and two sub-classes: One for new widgets that haven't been persisted yet, and one for already persisted widgets. The latter subtype possibly has an additional database ID property so that the existing widget can be identified, loaded, modified, and updated in the database:
interface IWidget { /* define all the properties required for a widget */ }
interface IWidgetTemplate : IWidget
{
IPersistedWidget Create();
bool TryLoadFrom(IWidgetRepository repository, out IPersistedWidget matching);
}
interface IPersistedWidget : IWidget
{
Guid Id { get; }
void SaveChanges();
}
This is one example of the Builder design pattern.
If you need to write similar code for many classes (for example, your 50+ database object types) you could consider using T4 text templates. This just makes writing code less repetitive; but you will still have to define your 50+ objects somewhere.
The repository pattern seems to work well when working with an initial project with several large main tables.
However as the project grows it seems a little inflexible. Say you have lots of child tables that hang off the main table, do you need a repository for each table?
E.g.
CustomerAddress Record has following child tables:
-> County
-> Country
-> CustomerType
On the UI, 3 dropdown lists need to be displayed, but it gets a bit tedious writing a repository for each of the above tables which selects the data for the dropdowns.
Is there a best practice/more efficient way of doing this?
As an example say you have a main CustomerAddress repository which I guess is the 'aggregate root' which inherits the main CRUD operations from the base repo interface.
Previously I have short-cutted the aggregate root and gone straight to the context for these kinds of tables.
e.g.
public Customer GetCustomerById(int id)
{
return Get(id);
}
public IEnumerable<Country> GetCountries()
{
return _ctx.DataContext.Countries.ToList();
}
etc...
But sometimes it doesn't feel right, as countries aren't part of the customer, but I feel like I need to tack it onto something without having to create zillions of repos for each table. A repo per table definitely doesn't seem right to me either.
First, the code you posted is not the repository pattern. Where is the collection-like interface? If it is an aggregate, it should only be returning the aggregate type.
The repository pattern doesn't offer much flexibility when it comes to selecting different types. It follows a collection interface (insert/add/update/delete/get/etc.), mirroring an in-memory collection, and it generally only retrieves one type. So if you were to use the repository pattern you would need to select all CustomerAddresses and then filter the countries out. I would suggest you move to a different pattern that allows for more flexibility, namely DAO.
If these things are always going to be maintained through CustomerAddress, then switch patterns and create a DAO class that offers some other getters for the other types of things you need.
On a more generic note, build for need.
Never just blindly create repository classes; it's a maintenance nightmare. The only time I would argue for a repo per table is when you are doing CMS-like things and need to be able to create everything.
Example:
So you have a CustomerAddress which ties together a Customer and a Country, but you have some other process that needs to be able to CRUD the Country. As a result you need the repository to manipulate Country, and if you are following DRY you don't want duplicate logic to manipulate Countries. What you would have is a Customer repository that uses the Country repository.
I'm answering my own question here because while the suggestions are certainly useful, I feel I have a better solution.
While I don't have to physically create the underlying repository for each and every table, as I have a generic repository base class with an interface (Get, Add, Remove), I still have to:
1) write the interface to access any specialised methods (generally these are queries)
2) write those implementations
I don't necessarily want to do this when all I want to retrieve is a list of countries or some simple type for populating a dropdown. Think of effort required if you have 10 reference type tables.
What I decided to do was create a new class called SimpleRepo with an ISimpleRepo interface which exposes 1-2 methods. While I don't normally like to expose the IQueryable interface out of the repo interface class, I don't mind here as I want the flexibility it provides. I can simply expose a 'Query()' method which provides the flexibility hook. I might need this for specialising the ordering or filtering.
Whenever a service needs to make use of some simple data, the ISimpleRepo<T> interface is passed in, where T is the table/class.
I now avoid the need to create an interface/class for these simple pieces of data.
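A minimal sketch of what this could look like. The in-memory implementation, the Country class, and AddressFormService are assumptions for illustration; a real SimpleRepo would return the data context's table from Query() instead of a list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ISimpleRepo<T>
{
    // The single flexibility hook: callers add their own ordering or filtering.
    IQueryable<T> Query();
}

// In-memory stand-in; a real version would expose the context's table.
public class InMemorySimpleRepo<T> : ISimpleRepo<T>
{
    private readonly List<T> items;

    public InMemorySimpleRepo(IEnumerable<T> items) => this.items = items.ToList();

    public IQueryable<T> Query() => items.AsQueryable();
}

public class Country
{
    public string Name { get; set; } = "";
}

// A service that needs simple reference data for a dropdown.
public class AddressFormService
{
    private readonly ISimpleRepo<Country> countries;

    public AddressFormService(ISimpleRepo<Country> countries) => this.countries = countries;

    public List<string> GetCountryNames() =>
        countries.Query().OrderBy(c => c.Name).Select(c => c.Name).ToList();
}
```

No Country-specific repository class or interface is written; the generic pair covers every reference table.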
Thoughts anyone?
Responding to the questioner's own answer: this doesn't make sense to me, though it's possible you had a good use case that I'm not following. On points 1 and 2: if you need specialized methods, then it looks like they belong in their own repo. Point 2: yes, that needs an implementation.
On sharing between repos, and whether the smaller repo is needed at all: I do appreciate that problem, but the people on the thread below steered me toward being okay with one repo per table, including the possibility of having a 'service layer'. They didn't give any examples of that, and I haven't tried it out yet. Currently my practice, for good or ill, has been to have the bigger repo share or instantiate the smaller one it needs:
One repository per table or one per functional section?
I find it difficult to determine the responsibility of classes: do I have to put this method in this class, or should I put it in another class? For example, imagine a simple User class with an id, forename, lastname and password. Now you have a userId and you want the forename and lastname, so you create a method like: public User GetUserById(int id){}. Next you want to show a list of all the users, so you create another method: public List<User> GetAllUsers(){}. And of course you want to update, delete and save a user. This gives us 5 methods:
public bool SaveUser(User user);
public bool UpdateUser(User user);
public bool DeleteUser(User user);
public User GetUserById(int id);
public List<User> GetAllUsers();
So my question is: do you put all these methods in the User class? Or do you create another data class (UserData class) which may connect to the database and contain all these methods?
What you are describing here is basically a choice between the Active Record Pattern or the Repository Pattern. I'd advise you to read up on those patterns and choose whichever one fits your application / experience / toolset.
I would not put those specific methods into the 'User' class.
There are 2 common approaches for this 'problem':
You put those methods in the User class, and then this means you're using the Active Record pattern.
You put those methods in a separate class (UserRepository, for instance), and then you're using the Repository pattern.
I prefer the repository-approach, since that keeps my 'User' class clean, and doesn't clutter it with data access code.
Barring additional complexity specific to a group of users (or really elaborate database access mechanics) I might make those methods static on the User class.
Those methods sound more like a UserManager (or something like that) to me. The user class should correspond to and represent only a single user, not many.
If we look at Enterprise Application design patterns, then the methods for fetching Users i.e. GetUserByID and GetAllUsers would be in separate class - you can name it UserData or UserDAO (DAO - Data Access Object).
In fact, you should design an interface for UserDAO with appropriate methods for handling User objects, such as CreateUser, UpdateUser, DeleteUser, GetUserXXX and so on.
There should be an implementation of UserDAO as per the data source, for example if your users are stored in database then you can implement the logic of accessing database in the implementation of UserDAO.
Following are the advantages of keeping the access methods in separate class:
1) The User object should be a plain object with just getters and setters. This facilitates passing the object across tiers: from the data access tier, to the business tier, to the web tier. It also helps keep the User object serializable.
2) The data access logic is loosely coupled from the User object - that means if the datasource changes, then you need not change the User object itself. This also assists in Test Driven Development where you might need to have mock objects during testing phase
3) If User object is complex object with relations with other objects such as Address or Department or Role etc. then the complexity of relationships will be encapsulated in UserDAO rather than leaking in the User Object.
4) Porting to frameworks like NHibernate or Spring.NET or .NET LINQ would become easier if the patterns are followed
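A sketch of the DAO split described above, under some assumptions: the interface is named IUserDao, GetUserById returns null when no user matches, and the in-memory dictionary stands in for a real data source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Plain object: data only, no data access code.
public class User
{
    public int Id { get; set; }
    public string Forename { get; set; } = "";
    public string Lastname { get; set; } = "";
}

public interface IUserDao
{
    void CreateUser(User user);
    bool UpdateUser(User user);
    bool DeleteUser(int id);
    User GetUserById(int id);   // null when no such user exists (assumption)
    List<User> GetAllUsers();
}

// One implementation per data source; this one keeps users in memory.
public class InMemoryUserDao : IUserDao
{
    private readonly Dictionary<int, User> users = new Dictionary<int, User>();

    public void CreateUser(User user) => users[user.Id] = user;

    public bool UpdateUser(User user)
    {
        if (!users.ContainsKey(user.Id)) return false;
        users[user.Id] = user;
        return true;
    }

    public bool DeleteUser(int id) => users.Remove(id);

    public User GetUserById(int id) =>
        users.TryGetValue(id, out var user) ? user : null;

    public List<User> GetAllUsers() => users.Values.ToList();
}
```

Swapping the data source means writing another IUserDao implementation; the User class and its callers stay as they are.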
Let us look at your scenario this way.
There are N people working in the assembly division of your company.
It is okay to go to a person and ask for his own information, but you can't expect him to tell you the details of everyone working in the assembly division. Why should he remember all those details? If you do expect that, his efficiency will go down (he has to work on assembly and also remember details about the others).
So perhaps we can appoint a manager to handle these people-management activities
(get details, add a new person, edit, delete, etc.).
Therefore you have two entities:
1) a User/Person working in your assembly division
2) a Manager
Thus, two classes. Hope this helps.
Thanks
If I understand your question correctly the User class deals with a single user. Hence the user class does not have a clue about how many users there are or anything about them. The structure holding this information is somewhere else and the methods you mention seem to belong to that structure / class.
With all else being equal either way is fine. Which to choose, though, usually depends on the overall architecture of the application or class library in which you find the User class. If the data access code seems tangled with the User object code, then it might make more sense to split it into two classes as you've considered. If the CRUD methods are one-line delegations to a DAL with maybe application-specific logic, then leaving them in the User class should be okay.
The complexity is more or less the same in both cases: it's a trade-off between a low-maintenance assembly with a few high-maintenance classes and a high-maintenance assembly with a larger number of low-maintenance classes.
I'm also assuming that the CRUD methods should be static.
Do what's easiest to get the code written right now but consider possible refactorings in the future should you find that it'll be better that way.