I want to know how most people are dealing with the repository pattern when it involves hitting the same database multiple times (sometimes transactionally) and trying to do so efficiently while maintaining database agnosticism and using multiple repositories together.
Let's say we have repositories for three different entities: Widget, Thing and Whatsit. Each repository is abstracted behind an interface, as usual for decoupling. The interfaces would then be IWidgetRepository, IThingRepository and IWhatsitRepository.
Now we have our business layer or equivalent (whatever you want to call it). In this layer we have classes that access the various repositories. Often the methods in these classes need to do batch/combined operations where multiple repositories are involved. Sometimes one method may make use of another method internally, while that method can still be called independently. What about, in this scenario, when the operation needs to be transactional?
Example:
class Bob
{
private IWidgetRepository _widgetRepo;
private IThingRepository _thingRepo;
private IWhatsitRepository _whatsitRepo;
public Bob(IWidgetRepository widgetRepo, IThingRepository thingRepo, IWhatsitRepository whatsitRepo)
{
_widgetRepo = widgetRepo;
_thingRepo= thingRepo;
_whatsitRepo= whatsitRepo;
}
public void DoStuff()
{
_widgetRepo.StoreSomeStuff();
_thingRepo.ReadSomeStuff();
_whatsitRepo.SaveSomething();
}
public void DoOtherThing()
{
_widgetRepo.UpdateSomething();
DoStuff();
}
}
How do I keep my access to that database efficient, without a constant stream of open-close-open-close on connections and inadvertent invocation of MSDTC and whatnot? If my database is something like SQLite, standard mechanisms like creating nested transactions are going to inherently fail, yet the business layer should not have to concern itself with such things.
How do you handle such issues? Does ADO.NET provide simple mechanisms to handle this, or do most people end up wrapping their own custom bits of code around ADO.NET to solve these types of problems?
Basically you want to utilize a UnitOfWork for this. I personally use NHibernate because their ISession interface is basically a Unit of Work, so it will batch together all commands and send them to the database together (there are additional smarts and object state tracking in there as well, which help with this). So the number of discrete commands sent to the database will depend on the life cycle of your ISession. In a web context I generally go with a Conversation Per Request, which means the ISession is created at the beginning of the request, and flushed (sent to the DB) at the end of the request.
There is a lot of potential here to change the life cycle of the session based on whether or not you need shorter or longer session, and you can also utilize transactions which can have a separate life cycle if you need to.
Consider that your abstracted repositories could be implemented in any number of databases: SQLite, Microsoft SQL Server, direct file access, etc -- even though you know that they are from the same database, it's not reasonable for "Bob" to attempt to make sure each repository can be transactionalized with respect to the other repositories.
Bob should perhaps talk to a DoStuffService that contains concrete implementations of your repositories. Since this service is creating concrete implementations of your repositories, it is capable of creating an appropriate transaction. This service would then be responsible for safely executing a UnitOfWork (thanks, ckramer) on behalf of Bob.
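One way to picture the "method that calls another transactional method" case is a unit of work whose scopes can nest, where only the outermost scope actually commits. The following is a minimal in-memory sketch under that assumption, not an ADO.NET or NHibernate API: UnitOfWork, Scope, Enlist and ScopedBob are hypothetical names, and the "database" is just a list of strings. In a real implementation, the outermost scope would own the single connection and transaction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical unit of work: queues operations and applies them only when
// the outermost scope ends and every scope was completed.
public sealed class UnitOfWork
{
    private readonly List<Action> _pending = new List<Action>();
    private int _depth;    // number of scopes currently open
    private bool _failed;  // true once any scope ends without Complete()

    public void Enlist(Action operation) => _pending.Add(operation);

    public Scope Begin()
    {
        _depth++;
        return new Scope(this);
    }

    public sealed class Scope : IDisposable
    {
        private readonly UnitOfWork _owner;
        private bool _completed;
        private bool _disposed;

        internal Scope(UnitOfWork owner) { _owner = owner; }

        public void Complete() => _completed = true;

        public void Dispose()
        {
            if (_disposed) return;
            _disposed = true;
            if (!_completed) _owner._failed = true;
            if (--_owner._depth > 0) return; // inner scope: the outer one decides
            if (!_owner._failed)
            {
                foreach (var operation in _owner._pending) operation();
            }
            _owner._pending.Clear();          // commit or rollback, the queue is emptied
            _owner._failed = false;
        }
    }
}

// Hypothetical variant of Bob: both methods open a scope, so DoStuff works
// on its own and also joins the scope opened by DoOtherThing.
public sealed class ScopedBob
{
    private readonly UnitOfWork _unitOfWork;
    private readonly List<string> _store; // stands in for the database

    public ScopedBob(UnitOfWork unitOfWork, List<string> store)
    {
        _unitOfWork = unitOfWork;
        _store = store;
    }

    public void DoStuff()
    {
        using (var scope = _unitOfWork.Begin())
        {
            _unitOfWork.Enlist(() => _store.Add("widget stored"));
            _unitOfWork.Enlist(() => _store.Add("whatsit saved"));
            scope.Complete();
        }
    }

    public void DoOtherThing()
    {
        using (var scope = _unitOfWork.Begin())
        {
            _unitOfWork.Enlist(() => _store.Add("widget updated"));
            DoStuff(); // joins the outer scope instead of nesting a transaction
            scope.Complete();
        }
    }
}
```

Because inner scopes never commit by themselves, this shape does not depend on the database supporting nested transactions, which is the SQLite concern raised in the question.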
I would like an opinion about how you see the Repository pattern.
In the "old" Domain conception (for example from PoEAA) the repository should be like an "in-memory collection", so it should always return the same type; if you need a projection you have to do it outside the repository, in the service layer for example, right? Or is it possible to do it directly in the "Domain" project?
E.g.
public class CustomerRepository
{
//Constructor accepts an IRepository<Customer>
public IQueryable<Customer> GetAllDebtors()
{
//Use IRepository<Customer> here to make the query
}
}
Instead, in DDD, the repository, especially combined with CQRS, can return directly the projected type because the repository becomes a denormalization service, right?
E.g.
public class CustomerDenormalizer
{
//Constructor *could* accept an IRepository<Customer>
public IQueryable<Debtor> GetAllDebtors()
{
//Use IRepository<Customer> here to make the query and the projection
}
}
IMO, the correspondence with an "in-memory" collection is overemphasized. A repository shouldn't hide the fact that it encapsulates some heavy IO - that would be a leaky abstraction. Furthermore, IQueryable<T> is also a leaky abstraction since hardly any provider will support all the operations. I'd suggest delegating as much of the projecting as possible to the database, because it is quite good at it.
A projection in CQRS is somewhat different. It is usually implemented as an event consumer which updates a data structure stored in some underlying storage mechanism, which could itself be SQL Server or a key-value store. The central difference is that, in this case, a projection responds to events from a message queue. These events could be coming from external systems.
Saying that the Repository in its "original form" has to return only objects of the same entity type is somewhat exaggerated. People have always included Count() methods or other calculations in their Repositories, for instance, and that was documented even back in Evans's blue book.
On the other hand, I'm not sure what you mean by "denormalized types" but if that's types that borrow from several different entities to expose their data as is or consolidated, or expose partial views of single domain entities, I tend to think it's not the Domain any more. Oftentimes it turns out they serve application-specific purposes such as generating Excel reports or displaying statistics or summary screens.
If this is the case, and for performance reasons I want to leverage the DB directly instead of relying on the Domain, I prefer to create a separate project where I place all this "reporting" related logic. Never mind if you still name your data access objects Repositories in there (after all, they are also illusions of in-memory collections, only collections of other types).
In the "old" Domain conception (for example from PoEAA) the repository should be like an "in-memory collection", so it should always return the same type; if you need a projection you have to do it outside the repository, in the service layer for example, right?
In my own solutions, I keep the domain model (that is, a C# expression of the language I learned from the domain expert, almost an internal DSL) separate from the applicative concerns related to the domain (such as repositories, which cope with the application's need for persistence). This means that they are coded in different projects.
In such a structure, I've tried two different approaches: queryable repositories and custom ones.
Repositories that implement IQueryable have worked very well for developers using the domain to build UI or services exposed to third parties, but required a lot of work in infrastructure. We used different approaches on this side, from Linq2NHibernate to re-linq, each with pros and cons, but each one quite expensive. If you plan to use this technique, define good metrics to ensure that the time you save during application development is worth the time you have to spend on custom infrastructure.
Custom repositories (those that expose methods returning IEnumerables) are much easier to design and implement, but they require more effort from UI and service developers. We also had a case where using a custom repository was required by domain rules, since the query objects used to obtain results were specifications that were also part of the ubiquitous language, and we were (legally) required to guarantee that the method used to query was the same one used to express such value and predicate.
However, in custom repositories, we often expose projective methods too.
Or is it possible to do it directly in the "Domain" project?
It's possible, but when you have many different (and very complex) bounded contexts to work with, it becomes a pain. This is why I use different projects for the domain classes expressing the ubiquitous language and the classes that serve applicative purposes.
Instead, in DDD, the repository, especially combined with CQRS, can return directly the projected type because the repository becomes a denormalization service, right?
Yes, they can.
That's what we do with custom repositories, even without CQRS. Furthermore, even with some repository implementing IQueryable, we occasionally exposed methods that produce projections directly.
Don't have generic Repository.
Don't have IQueryable.
Have ICustomerRepository.
Have your specific implementation of CustomerRepository leverage all the bells and whistles of the underlying storage system.
Have repository return projections.
public interface ICustomerRepository
{
IEnumerable<Customer> GetAllCustomer();
IEnumerable<Debtor> GetAllDebtors();
IEnumerable<CustomerSummary> GetCustomerSummaryByName(string name);
IEnumerable<CustomerSummary> GetCustomerSummaryById(string id);
}
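To make "have the repository return projections" concrete, here is a small in-memory sketch. Everything in it is an assumption for illustration: the shapes of Customer and CustomerSummary, the narrower ICustomerSummaryQueries interface, and the rule that a negative balance marks a debtor. A real implementation would push the filter and the projection down to the storage engine rather than run them over a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; the thread does not define its properties.
public sealed class Customer
{
    public string Id { get; set; }
    public string Name { get; set; }
    public decimal Balance { get; set; }
}

// Hypothetical read-only projection exposing a partial view of a customer.
public sealed class CustomerSummary
{
    public string Id { get; set; }
    public string Name { get; set; }
    public bool IsDebtor { get; set; }
}

public interface ICustomerSummaryQueries
{
    IEnumerable<CustomerSummary> GetCustomerSummaryByName(string name);
}

public sealed class InMemoryCustomerRepository : ICustomerSummaryQueries
{
    private readonly List<Customer> _customers;

    public InMemoryCustomerRepository(IEnumerable<Customer> customers)
    {
        _customers = customers.ToList();
    }

    // The caller receives the projected type directly and never sees
    // the full Customer entity.
    public IEnumerable<CustomerSummary> GetCustomerSummaryByName(string name)
    {
        return _customers
            .Where(c => string.Equals(c.Name, name, StringComparison.OrdinalIgnoreCase))
            .Select(c => new CustomerSummary
            {
                Id = c.Id,
                Name = c.Name,
                IsDebtor = c.Balance < 0 // assumed rule, for illustration only
            })
            .ToList();
    }
}
```

Returning IEnumerable rather than IQueryable keeps the query fully inside the repository, which is the point the list above is making.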
I have been reading a lot of articles explaining how to set up Entity Framework's DbContext so that only one is created and used per HTTP web request using various DI frameworks.
Why is this a good idea in the first place? What advantages do you gain by using this approach? Are there certain situations where this would be a good idea? Are there things that you can do using this technique that you can't do when instantiating DbContexts per repository method call?
NOTE: This answer talks about the Entity Framework's DbContext, but
it is applicable to any sort of Unit of Work implementation, such as
LINQ to SQL's DataContext, and NHibernate's ISession.
Let me start by echoing Ian: having a single DbContext for the whole application is a Bad Idea. The only situation where this makes sense is when you have a single-threaded application and a database that is solely used by that single application instance. The DbContext is not thread-safe, and since the DbContext caches data, it gets stale pretty soon. This will get you in all sorts of trouble when multiple users/applications work on that database simultaneously (which is very common, of course). But I expect you already know that and just want to know why not to just inject a new instance (i.e. with a transient lifestyle) of the DbContext into anyone who needs it. (For more information about why a single DbContext, or even one context per thread, is bad, read this answer.)
Let me start by saying that registering a DbContext as transient could work, but typically you want to have a single instance of such a unit of work within a certain scope. In a web application, it can be practical to define such a scope on the boundaries of a web request; thus a Per Web Request lifestyle. This allows you to let a whole set of objects operate within the same context. In other words, they operate within the same business transaction.
If you have no goal of having a set of operations operate inside the same context, in that case the transient lifestyle is fine, but there are a few things to watch:
Since every object gets its own instance, every class that changes the state of the system, needs to call _context.SaveChanges() (otherwise changes would get lost). This can complicate your code, and adds a second responsibility to the code (the responsibility of controlling the context), and is a violation of the Single Responsibility Principle.
You need to make sure that entities [loaded and saved by a DbContext] never leave the scope of such a class, because they can't be used in the context instance of another class. This can complicate your code enormously, because when you need those entities, you need to load them again by id, which could also cause performance problems.
Since DbContext implements IDisposable, you probably still want to Dispose all created instances. If you want to do this, you basically have two options. You need to dispose them in the same method right after calling context.SaveChanges(), but in that case the business logic takes ownership of an object it gets passed on from the outside. The second option is to Dispose all created instances on the boundary of the Http Request, but in that case you still need some sort of scoping to let the container know when those instances need to be Disposed.
Another option is to not inject a DbContext at all. Instead, you inject a DbContextFactory that is able to create a new instance (I used to use this approach in the past). This way the business logic controls the context explicitly. It might look like this:
public void SomeOperation()
{
using (var context = this.contextFactory.CreateNew())
{
var entities = this.otherDependency.Operate(
context, "some value");
context.Entities.AddRange(entities);
context.SaveChanges();
}
}
The plus side of this is that you manage the life of the DbContext explicitly and it is easy to set this up. It also allows you to use a single context in a certain scope, which has clear advantages, such as running code in a single business transaction, and being able to pass around entities, since they originate from the same DbContext.
The downside is that you will have to pass around the DbContext from method to method (which is termed Method Injection). Note that in a sense this solution is the same as the 'scoped' approach, but now the scope is controlled in the application code itself (and is possibly repeated many times). It is the application that is responsible for creating and disposing the unit of work. Since the DbContext is created after the dependency graph is constructed, Constructor Injection is out of the picture and you need to defer to Method Injection when you need to pass on the context from one class to the other.
Method Injection isn't that bad, but when the business logic gets more complex, and more classes get involved, you will have to pass it from method to method and class to class, which can complicate the code a lot (I've seen this in the past). For a simple application, this approach will do just fine though.
Because of the downsides this factory approach has for bigger systems, another approach can be useful: the one where you let the container or the infrastructure code / Composition Root manage the unit of work. This is the style that your question is about.
By letting the container and/or the infrastructure handle this, your application code is not polluted by having to create, (optionally) commit and Dispose a UoW instance, which keeps the business logic simple and clean (just a Single Responsibility). There are some difficulties with this approach. For instance, where do you Commit and Dispose the instance?
Disposing a unit of work can be done at the end of the web request. Many people however, incorrectly assume that this is also the place to Commit the unit of work. However, at that point in the application, you simply can't determine for sure that the unit of work should actually be committed. e.g. If the business layer code threw an exception that was caught higher up the callstack, you definitely don't want to Commit.
The real solution is again to explicitly manage some sort of scope, but this time do it inside the Composition Root. If you abstract all business logic behind the command / handler pattern, you can write a decorator that wraps each command handler and does this. Example:
class TransactionalCommandHandlerDecorator<TCommand>
: ICommandHandler<TCommand>
{
readonly DbContext context;
readonly ICommandHandler<TCommand> decorated;
public TransactionalCommandHandlerDecorator(
DbContext context,
ICommandHandler<TCommand> decorated)
{
this.context = context;
this.decorated = decorated;
}
public void Handle(TCommand command)
{
this.decorated.Handle(command);
context.SaveChanges();
}
}
This ensures that you only need to write this infrastructure code once. Any solid DI container allows you to configure such a decorator to be wrapped around all ICommandHandler<T> implementations in a consistent manner.
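The sketch below shows why committing inside such a decorator behaves differently from committing at the end of the request: when the wrapped handler throws, the save is simply never reached. It is self-contained and does not use Entity Framework. IUnitOfWork, CountingUnitOfWork, SaveChangesDecorator, RenameCustomer and RenameCustomerHandler are hypothetical stand-ins, and the decorator is composed by hand here rather than registered in a DI container.

```csharp
using System;

public interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

// Hypothetical stand-in for DbContext, reduced to the one call we care about.
public interface IUnitOfWork
{
    void SaveChanges();
}

public sealed class CountingUnitOfWork : IUnitOfWork
{
    public int SaveCount { get; private set; }
    public void SaveChanges() => SaveCount++;
}

public sealed class SaveChangesDecorator<TCommand> : ICommandHandler<TCommand>
{
    private readonly IUnitOfWork _unitOfWork;
    private readonly ICommandHandler<TCommand> _decorated;

    public SaveChangesDecorator(IUnitOfWork unitOfWork, ICommandHandler<TCommand> decorated)
    {
        _unitOfWork = unitOfWork;
        _decorated = decorated;
    }

    public void Handle(TCommand command)
    {
        _decorated.Handle(command); // an exception here skips the commit below
        _unitOfWork.SaveChanges();
    }
}

// Hypothetical command and handler used only to exercise the decorator.
public sealed class RenameCustomer
{
    public string NewName { get; set; }
}

public sealed class RenameCustomerHandler : ICommandHandler<RenameCustomer>
{
    public void Handle(RenameCustomer command)
    {
        if (string.IsNullOrWhiteSpace(command.NewName))
            throw new ArgumentException("A name is required.", nameof(command));
        // Business logic would modify tracked entities here.
    }
}
```

Wrapping by hand looks like `new SaveChangesDecorator<RenameCustomer>(unitOfWork, new RenameCustomerHandler())`. A container would apply the same wrapping to every handler, as long as the decorator and the handler share the same unit-of-work instance within the scope.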
There are two contradicting recommendations by Microsoft, and many people use DbContexts in completely divergent ways.
One recommendation is to "dispose DbContexts as soon as possible", because keeping a DbContext alive occupies valuable resources such as database connections.
The other states that one DbContext per request is highly recommended.
These contradict each other, because if your request is doing a lot of work unrelated to the database, your DbContext is kept alive for no reason.
It is therefore wasteful to keep your DbContext alive while your request is just waiting for unrelated work to finish.
So many people who follow rule 1 have their DbContexts inside their "Repository pattern" and create a new Instance per Database Query so X*DbContext per Request
They just get their data and dispose the context ASAP.
This is considered by MANY people an acceptable practice.
While this has the benefits of occupying your db resources for the minimum time it clearly sacrifices all the UnitOfWork and Caching candy EF has to offer.
Keeping alive a single multipurpose instance of DbContext maximizes the benefits of caching, but since DbContext is not thread-safe and each web request runs on its own thread, a DbContext per request is the longest you can keep it.
So the EF team's recommendation to use one DbContext per request is clearly based on the fact that, in a web application, a unit of work is most likely going to be within one request, and that request has one thread. One DbContext per request therefore gives the ideal combination of UnitOfWork and caching benefits.
But in many cases this is not true.
I consider Logging a separate UnitOfWork thus having a new DbContext for Post-Request Logging in async threads is completely acceptable
So in the end it turns out that a DbContext's lifetime is bounded by these two parameters: the unit of work and the thread.
Not a single answer here actually answers the question. The OP did not ask about a singleton/per-application DbContext design, he asked about a per-(web)request design and what potential benefits could exist.
I'll reference http://mehdi.me/ambient-dbcontext-in-ef6/ as Mehdi is a fantastic resource:
Possible performance gains.
Each DbContext instance maintains a first-level cache of all the entities it loads from the database. Whenever you query an entity by its primary key, the DbContext will first attempt to retrieve it from its first-level cache before defaulting to querying it from the database. Depending on your data query pattern, re-using the same DbContext across multiple sequential business transactions may result in fewer database queries being made thanks to the DbContext first-level cache.
It enables lazy-loading.
If your services return persistent entities (as opposed to returning view models or other sorts of DTOs) and you'd like to take advantage of lazy-loading on those entities, the lifetime of the DbContext instance from which those entities were retrieved must extend beyond the scope of the business transaction. If the service method disposed the DbContext instance it used before returning, any attempt to lazy-load properties on the returned entities would fail (whether or not using lazy-loading is a good idea is a different debate altogether which we won't get into here). In our web application example, lazy-loading would typically be used in controller action methods on entities returned by a separate service layer. In that case, the DbContext instance that was used by the service method to load these entities would need to remain alive for the duration of the web request (or at the very least until the action method has completed).
Keep in mind there are cons as well. That link contains many other resources to read on the subject.
Just posting this in case someone else stumbles upon this question and doesn't get absorbed in answers that don't actually address the question.
I'm pretty certain it is because the DbContext is not at all thread safe. So sharing the thing is never a good idea.
One thing that's not really addressed in the question or the discussion is the fact that DbContext can't cancel changes. You can submit changes, but you can't clear out the change tree, so if you use a per request context you're out of luck if you need to throw changes away for whatever reason.
Personally I create instances of DbContext when needed - usually attached to business components that have the ability to recreate the context if required. That way I have control over the process, rather than having a single instance forced onto me. I also don't have to create the DbContext at each controller startup regardless of whether it actually gets used. Then if I still want to have per request instances I can create them in the CTOR (via DI or manually) or create them as needed in each controller method. Personally I usually take the latter approach as to avoid creating DbContext instances when they are not actually needed.
It depends from which angle you look at it too. To me the per request instance has never made sense. Does the DbContext really belong into the Http Request? In terms of behavior that's the wrong place. Your business components should be creating your context, not the Http request. Then you can create or throw away your business components as needed and never worry about the lifetime of the context.
I agree with previous opinions. It is good to say, that if you are going to share DbContext in single thread app, you'll need more memory. For example my web application on Azure (one extra small instance) needs another 150 MB of memory and I have about 30 users per hour.
Here is a real example image: the application was deployed at 12 PM.
What I like about it is that it aligns the unit-of-work (as the user sees it - i.e. a page submit) with the unit-of-work in the ORM sense.
Therefore, you can make the entire page submission transactional, which you could not do if you were exposing CRUD methods with each creating a new context.
Another understated reason for not using a singleton DbContext, even in a single threaded single user application, is because of the identity map pattern it uses. It means that every time you retrieve data using query or by id, it will keep the retrieved entity instances in cache. The next time you retrieve the same entity, it will give you the cached instance of the entity, if available, with any modifications you have done in the same session. This is necessary so the SaveChanges method does not end up with multiple different entity instances of the same database record(s); otherwise, the context would have to somehow merge the data from all those entity instances.
The reason that is a problem is a singleton DbContext can become a time bomb that could eventually cache the whole database + the overhead of .NET objects in memory.
There are ways around this behavior by only using LINQ queries with the .AsNoTracking() extension method. Also, these days PCs have a lot of RAM. But usually that is not the desired behavior.
Another issue to watch out for with Entity Framework specifically is the combination of creating new entities, lazy loading, and then using those new entities (from the same context). If you don't use IDbSet.Create (instead of just new), lazy loading on that entity doesn't work when it's retrieved from the context it was created in. Example:
public class Foo {
public string Id {get; set; }
public string BarId {get; set; }
// lazy loaded relationship to bar
public virtual Bar Bar { get; set;}
}
var foo = new Foo {
Id = "foo id",
BarId = "some existing bar id"
};
dbContext.Set<Foo>().Add(foo);
dbContext.SaveChanges();
// some other code, using the same context
var sameFoo = dbContext.Set<Foo>().Find("foo id");
var barProp = sameFoo.Bar.SomeBarProp; // fails with a null reference even though BarId is set
I am trying to create a 3-tier WinForms application. Since this is my first attempt at a 3-tier design, I got stuck and have a few questions.
The application will support attaching multiple sqlite db files.
So I created class like this
public class Database
{
public string Name { get; set; }
public string FilePath { get; set; }
public bool isAttached { get; private set; }
}
Now I want to have collection of those objects.
Should I create another class like DatabaseList below, or is it enough to just create a List<Database>?
public class DatabaseList : List<Database>
{
...
vs
List<Database> myDatabases;
What should be created in Form1.cs?
For example I assume the collection above should be created in BusinessLayer and not in Form1.cs and only BusinessLayer class is created in Form1.cs. Is this correct?
Where to put Attach Method?
The method would be like this:
public void AttachDB(Database db)
{
MySqliteHelper.Attach(db.Name, db.FilePath);
this.Add(db);
}
Do I put the method in DatabaseList class (if this is the way to create collection) or should it be in BusinessLayer?
How to make the Attach method support additional relational databases like MS SQL Compact Edition, which also resides in a single file?
I was thinking of creating another general database helper class with the same methods as MySqliteHelper, and the AttachDB method would call that instead. Something like
MyDBHelper.Attach(db.Name, db.FilePath);
Or is this where Dependency Injections like Ninject can be helpful? I never used that before and all I am recalling from Ninject is a samurai having different weapons so it seems to me to be kinda similar to my problem having different specific database classes.
I'm going to tackle this question in parts because it covers a lot of ground.
What qualifies as a 3-tier architecture?
A 3-tier (or n-tier, tiered) architecture is basically any design where the interface doesn't directly communicate with the database, no matter how thin the actual tiers are. You could create a single class with functions to get and save data, and it would still qualify as a 3-tier architecture. That being said, what I'm going to explain below is probably the most common implementation of a 3-tier architecture.
Layer vs. Tier: What's the difference?
To understand the 3-tier architecture, it's important to first make a distinction between a layer and a tier. An application can have many physical layers and still contain only three logical tiers. If a picture really is worth a million words, the diagram below should clear that up for you.
In the diagram above, the Business/Middle Tier comprises business logic, business objects, and data access objects. The purpose of this tier is to serve as the middleman between the user interface and the database.
The Data Access Layer (DAL)
The data access layer comprises a data access component (see below) and one or more data access objects. Depending on the need, the data access objects are usually set up in one of two ways:
One Data Access Object for each Business Object
One Data Access Object shared by many Business Objects
It sounds like you're going to be dealing with several databases, so it would probably make sense to go with the one-to-one option. Doing it this way you'll have the flexibility to specify which database/connection corresponds to which business object.
Data Access Component
Your data access component should be a very generic class containing only the bare-bones methods needed to connect and interact with a database. In the diagram above, that component is represented by the dbConnection class.
Questions & Answers
What should be created in Form1.cs?
The only thing the front end deals with are the business objects and the business logic. Sometimes it's not that black and white, but that's the idea.
Where to put Attach Method?
Instead of an Attach method, pass a connection string into your data access component. A connection string can be used to attach and/or connect to pretty much any database.
How to make the Attach method to support additional relational databases like MS SQL Compact Edition which also resides in a single file?
See above.
Should I create another class like DatabaseList below or is enough to just create a List?
Honestly, this is up to you and doesn't affect the validity of the 3-tier architecture. You know the specific requirements that you're trying to meet, so do it if it makes sense. Give consideration to how your Data Access Object(s) will interact with this class though, because you will need to expose the methods for executing queries and non-queries on whatever database is selected from the list.
What you lack is thinking in terms of objects and their responsibility.
What object is responsible for creating instances of your database descriptions? Should it be Form1?
OOP tells you that if you have such doubts you can follow the Pure Fabrication principle and just create another class to be responsible for this. It is as simple as that.
So you can create a class, let's call it DatabaseManager, and put your list of databases there plus the Attach method. You probably also want this manager to be an ambient class (the same instance shared among other classes), so you can build a Singleton out of it (but this is not necessary).
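A rough sketch of such a manager follows, with the engine-specific attach step hidden behind an interface so that SQLite, SQL Server Compact or anything else can be plugged in. All names are assumptions made for illustration: DatabaseFile (a variant of the question's Database class with a settable attached flag), IDatabaseAttacher, and RecordingAttacher, which only records calls instead of touching a real database. A SQLite-backed implementation would call something like the question's MySqliteHelper.Attach inside Attach.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical description of one database file.
public sealed class DatabaseFile
{
    public DatabaseFile(string name, string filePath)
    {
        Name = name;
        FilePath = filePath;
    }

    public string Name { get; }
    public string FilePath { get; }
    public bool IsAttached { get; internal set; }
}

// Hypothetical abstraction over the engine-specific attach operation.
public interface IDatabaseAttacher
{
    void Attach(string name, string filePath);
}

// Owns the list of databases and is the only place that attaches them.
public sealed class DatabaseManager
{
    private readonly IDatabaseAttacher _attacher;
    private readonly List<DatabaseFile> _databases = new List<DatabaseFile>();

    public DatabaseManager(IDatabaseAttacher attacher)
    {
        _attacher = attacher ?? throw new ArgumentNullException(nameof(attacher));
    }

    public IReadOnlyList<DatabaseFile> Databases => _databases;

    public void Attach(DatabaseFile db)
    {
        if (db == null) throw new ArgumentNullException(nameof(db));
        _attacher.Attach(db.Name, db.FilePath); // engine-specific work happens here
        db.IsAttached = true;
        _databases.Add(db);
    }
}

// Test double that records calls instead of talking to a database.
public sealed class RecordingAttacher : IDatabaseAttacher
{
    public List<string> Calls { get; } = new List<string>();

    public void Attach(string name, string filePath) => Calls.Add(name + "=" + filePath);
}
```

Form1 would then only hold a DatabaseManager, and the choice of attacher could be made once at startup, either by hand or through a DI container.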
DI containers could probably help you to organize services and manage their lifetime but I recommend you start with a good book on this before you misuse the idea. Mark Seemann's "Dependency Injection in .NET" is fine.
You need to think in terms of modularity and abstraction. See you have multiple entities to be passed across layers.
Following are the examples:
1. Presentation will create an object of the business layer or business facade, but it will expect the logical entity from the business layer.
2. The business layer will create the DataAccess object and will expect the logical entity from DataAccess in order to perform business operations.
3. DataAccess will do whatever it needs to do to get the information from the database. Whether it has to connect to Oracle, SQL Server, SQLite, the file system or anything else, it will convert the result into (that is, initialize) the logical entity (an entity is a class consisting only of properties).
So every layer will have their own responsibility and perform the operation it is responsible for.
So I think your db related operations will go in DataAccess.
What class in my project should be responsible for keeping track of which Aggregate Roots have already been created, so as not to create two instances for the same Entity? Should my repository keep a list of all aggregates it has created? If yes, how do I do that? Do I use a singleton repository (doesn't sound right to me)? Would another option be to encapsulate this "caching" somewhere else, in some other class? What would this class look like and what pattern would it fit?
I'm not using an O/R mapper, so if there is a technology out there that handles it, I'd need to know how it does it (as in what pattern it uses) to be able to use it.
Thanks!
I believe you are thinking about the Identity Map pattern, as described by Martin Fowler.
In his full description of the pattern (in his book), Fowler discusses implementation concerns for read/write entities (those that participate in transactions) and read-only (reference data, which ideally should be read only once and subsequently cached in memory).
I suggest obtaining his excellent book, but the excerpt describing this pattern is readable on Google Books (look for "fowler identity map").
Basically, an identity map is an object that stores, for example in a hashtable, the entity objects loaded from the database. The map itself is stored in the context of the current session (request), preferably in a Unit of Work (for read/write entities). For read-only entities, the map need not be tied to the session and may be stored in the context of the process (global state).
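As a minimal sketch of that description, the map below is a dictionary keyed by entity id that a repository consults before loading. The names IdentityMap, GetOrLoad, Order and OrderRepository are hypothetical, and LoadFromDatabase merely fabricates an object where real code would run a query. The map is held by the repository instance here, so its lifetime is whatever lifetime you give the repository (for example per request).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical identity map: at most one instance per key within its lifetime.
public sealed class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _loaded = new Dictionary<TKey, TEntity>();

    public TEntity GetOrLoad(TKey id, Func<TKey, TEntity> load)
    {
        if (_loaded.TryGetValue(id, out var existing))
        {
            return existing; // already materialized: hand back the same instance
        }

        var entity = load(id);
        if (entity != null)
        {
            _loaded[id] = entity; // remember it for subsequent requests
        }
        return entity;
    }
}

// Hypothetical aggregate root.
public sealed class Order
{
    public Order(int id) { Id = id; }
    public int Id { get; }
}

public sealed class OrderRepository
{
    private readonly IdentityMap<int, Order> _map = new IdentityMap<int, Order>();

    // Counts simulated database round trips so the effect is observable.
    public int DatabaseHits { get; private set; }

    public Order GetById(int id) => _map.GetOrLoad(id, LoadFromDatabase);

    private Order LoadFromDatabase(int id)
    {
        DatabaseHits++;
        return new Order(id); // placeholder for a real query
    }
}
```

Two calls to GetById with the same id return the same object reference, and the simulated database is hit only once.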
I consider caching to be something that happens at the Service level, rather than in a repository. Repositories should be "dumb" and just do basic CRUD operations. Services can be smart enough to work with caching as necessary (which is probably more of a business rule than a low-level data access rule).
To put it simply, I don't let any code use Repositories directly - only Services can do that. Then everything else consumes the relevant services as interfaces. That gives you a nice wrapper for putting in biz logic, caching, etc.
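As a rough sketch of what I mean (all names here, `Product`, `IProductRepository`, `IProductService`, `CachingProductService` and `CountingProductRepository`, are made up for the example, and the cache is deliberately naive: no expiry, no thread safety):

```csharp
using System.Collections.Generic;

// Hypothetical entity for illustration.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// The "dumb" repository contract: basic data access only.
public interface IProductRepository
{
    Product GetById(int id);
}

// What the rest of the application consumes.
public interface IProductService
{
    Product GetProduct(int id);
}

// The service decides when to cache; the repository knows nothing about it.
public class CachingProductService : IProductService
{
    private readonly IProductRepository _repository;
    private readonly Dictionary<int, Product> _cache = new Dictionary<int, Product>();

    public CachingProductService(IProductRepository repository)
    {
        _repository = repository;
    }

    public Product GetProduct(int id)
    {
        Product product;
        if (!_cache.TryGetValue(id, out product))
        {
            product = _repository.GetById(id);
            if (product != null)
                _cache[id] = product;
        }
        return product;
    }
}

// Stand-in repository that counts calls, so the effect of the cache is visible.
public class CountingProductRepository : IProductRepository
{
    public int Calls { get; private set; }

    public Product GetById(int id)
    {
        Calls++;
        return new Product { Id = id, Name = "Product " + id };
    }
}
```

Callers depend only on `IProductService`, so the caching policy can change (or be removed) without touching either the repository or the consumers.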
I would say, if this "caching" is managed anywhere other than the Repository then you are letting concerns leak out.
The repository is your collection of items. No code consuming the repository should have to decide whether to retrieve the object from the repository or from somewhere else.
Singleton sounds like the wrong lifetime; it likely should be per-request. This is easy to manage if you are using an IoC/DI container.
It seems like the fact that you even have to consider multiple calls for the same aggregate is evidence of an architecture/design problem. I would be interested to hear an example of what those first and second calls might be, and why they require the same exact instance of your AR.
How do Services and Repositories relate to each other in DDD? I mean, I've been reading up on DDD for the past 2 days and everywhere I go, there's always a Service layer and there's always a Repository layer. How do these differ from or complement each other?
From what I've read, isn't the Repository the layer responsible for delegating interactions between the application and the data?
So, what's the need for the Service layer if it has to implement the Repository to interact with the data anyway even though the Repository probably already implements the methods needed to do so?
I'd appreciate some enlightenment on the subject.
P.S. Don't know if this will help any, but I'm working with an ASP.NET MVC 2 application in which I'm trying to implement the Repository pattern. I just finished implementing the Dependency Injection pattern (for the first time ever)...
UPDATE
Okay, with so many answers, I think I understand what the difference is. So, to review (correct me if I'm wrong):
A Repository layer interacts only with a single object out of the database or the ORM, IEmployeeRepository -> Employee.
A Service layer encapsulates more complex functionality on objects returned from Repositories, either one or multiple.
So, then I have a sub question. Is it considered bad practice to create abstract objects to be sent to my views? For example an AEmployee (A for abstract because to me I means interface) which contains properties from Employee and X or X?
Actually, one more subquestion. If a Service layer can be considered "tuned" for an application does it need to be implemented with an interface?
The Service will use a Repository to retrieve an Entity and then call methods on it (the Entity) to perform the Command/task.
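A small sketch of that flow, reusing the `IEmployeeRepository -> Employee` pairing from the question. The members (`Salary`, `GiveRaise`, `Save`), the `EmployeeService` class and the in-memory repository are assumptions made up for the example, not part of any framework:

```csharp
using System;
using System.Collections.Generic;

// Entity: holds state and the behaviour that changes it.
public class Employee
{
    public int Id { get; set; }
    public decimal Salary { get; private set; }

    public Employee(int id, decimal salary)
    {
        Id = id;
        Salary = salary;
    }

    // Business rule lives on the entity, not in the repository.
    public void GiveRaise(decimal percent)
    {
        if (percent <= 0)
            throw new ArgumentOutOfRangeException(nameof(percent));
        Salary += Salary * percent / 100m;
    }
}

// Repository: retrieval and persistence only.
public interface IEmployeeRepository
{
    Employee GetById(int id);
    void Save(Employee employee);
}

// Service: retrieves the entity, invokes the command on it, persists the result.
public class EmployeeService
{
    private readonly IEmployeeRepository _employees;

    public EmployeeService(IEmployeeRepository employees)
    {
        _employees = employees;
    }

    public void GiveRaise(int employeeId, decimal percent)
    {
        var employee = _employees.GetById(employeeId);
        if (employee == null)
            throw new InvalidOperationException("Unknown employee " + employeeId);

        employee.GiveRaise(percent);
        _employees.Save(employee);
    }
}

// In-memory stand-in so the sketch can run without a database.
public class InMemoryEmployeeRepository : IEmployeeRepository
{
    private readonly Dictionary<int, Employee> _store = new Dictionary<int, Employee>();

    public Employee GetById(int id)
    {
        Employee employee;
        return _store.TryGetValue(id, out employee) ? employee : null;
    }

    public void Save(Employee employee)
    {
        _store[employee.Id] = employee;
    }
}
```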
True, a repository works with data (SQL, web service, etc.), but that's its only job: CRUD operations, nothing more. There is no place for stored-procedure-based business logic.
The service (or business logic layer) provides the functionality: how to fulfill a business request (e.g. calculating a salary) and what you have to do to get there.
Oh, and this is a really nice DDD book:
http://www.infoq.com/minibooks/domain-driven-design-quickly
As a concrete example a Shopping Cart application might have the following services:
ShoppingCartService - manages a cart of items with add/remove/update support etc.
OrderService - takes a cart, converts it to an order and handles the payment process etc.
Each of these services needs to talk to a "data source" for CRUD operations. This is where the Repository pattern comes in handy, as it abstracts the loading and saving of data to and from the data source, be it a database, web service or even in-memory cache.
When you want to create a quick prototype of your application without having to deal with database setup, schema, stored procedures, permissions, etc. you can create a cache or fake repository in a matter of minutes.
For the example above your prototype might start off with the following:
FakeCustomerRepository
FakeAddressRepository
FakeCartRepository
FakeCartLineItemRepository
FakeOrderRepository
FakeOrderLineItemRepository
Once your prototype is ready to evolve to the next level you can implement these against a real database:
SQLCustomerRepository
SQLAddressRepository
SQLCartRepository
SQLCartLineItemRepository
SQLOrderRepository
SQLOrderLineItemRepository
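To illustrate the swap, here is a rough sketch of one of the fake repositories behind an interface. `FakeCartRepository` and `ShoppingCartService` are the names from the lists above; the `ICartRepository` interface, the `Cart` class and their members are assumptions invented for the example.

```csharp
using System.Collections.Generic;

// Hypothetical cart model, kept deliberately small.
public class Cart
{
    public int CustomerId { get; set; }
    public List<string> Items { get; } = new List<string>();
}

// The abstraction the service depends on. A SQLCartRepository would implement
// the same interface against a real database.
public interface ICartRepository
{
    Cart GetByCustomerId(int customerId); // returns null when the customer has no cart
    void Save(Cart cart);
}

// In-memory fake: enough for a prototype or a unit test, no database setup needed.
public class FakeCartRepository : ICartRepository
{
    private readonly Dictionary<int, Cart> _carts = new Dictionary<int, Cart>();

    public Cart GetByCustomerId(int customerId)
    {
        Cart cart;
        return _carts.TryGetValue(customerId, out cart) ? cart : null;
    }

    public void Save(Cart cart)
    {
        _carts[cart.CustomerId] = cart;
    }
}

// The service only sees ICartRepository, so it does not change when the
// fake is replaced by a database-backed implementation.
public class ShoppingCartService
{
    private readonly ICartRepository _carts;

    public ShoppingCartService(ICartRepository carts)
    {
        _carts = carts;
    }

    public void AddItem(int customerId, string item)
    {
        var cart = _carts.GetByCustomerId(customerId) ?? new Cart { CustomerId = customerId };
        cart.Items.Add(item);
        _carts.Save(cart);
    }
}
```

Moving from prototype to production is then a matter of passing a different `ICartRepository` implementation to the service's constructor, typically via your DI container.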
From what I can remember, the repository is the final class before the data. The service class can act on data retrieved from the repository. The repository is really just meant to get data to somebody else to do the work. The service layer can provide things such as business logic that all data must pass through. It could also provide for a translation between the application logic and the data layer. But again, this is just what I can remember.
There's no golden standard that defines a service or a repository. In my applications a repository is (as you say) an interface into a database. A service has full access to a repository - but the service exposes a subset of functionality to its consumers.
Think of the repository as more low level. The repository has to expose many ways of accessing the underlying database. A service might combine calls to a repository with other code that only makes sense at a code level (i.e. not in the database), such as access to other state in the application, or validation/business logic that can't easily be applied in a database.