I am working on an MVC-EF application that has a Service layer and a Data-access layer. The controllers interact with the service layer and which in turn interacts with the data-access layer. Data-access layer is based on a simplistic implementation of repository pattern exposing the crud operations most of the times.
I am using transactions in my service layer method by doing:
bool done = false;
using (IDatabaseTransaction transaction = this.businessDataSource.BeginTransaction())
{
try
{
// Execute some statements updating database
// ....
transaction.Commit();
done = true;
}
catch (Exception)
{
transaction.Rollback();
}
}
return done;
where BeginTransaction is a method on the repository base class as:
public IDatabaseTransaction BeginTransaction()
{
return new DatabaseTransaction(this.Context.Database.BeginTransaction());
}
The implemented service methods in for e.g. EmployeeService, would be like, AddEmployee(Employee emp), AssignEmployeeToDepartments(int empId, IList departmentNames) which have their statements enclosed in a using(IDatabaseTransaction transaction) statements as above, assuming the methods are complex with complex business logic (than it seems from the names :) ).
Now I have to implement a third method for e.g. say ImportEmployee(), which is an aggregate method of previously implemented methods AddEmployee(), and AssignEmployeeToDepartments() implemented earlier. Recalling that the earlier implemented methods are complex and hence need to be reused.
So my question is:
How can I refactor (or implement in a particular way) these service methods, so they can be used individually as well as part of another service method (which also executes its statements in a transaction)?
TIA.
EDIT:
I spent some time reading the related posts in stackoverflow and elsewhere and got few ideas how to go about solving the problem. I have described below the one that seems promising.
The key to the solution is "HttpContext.Items" (available through HttpContext.Current.Items), which by definition is a per-request cache.
Since all my transaction handling is within the service methods, whenever I need to start a new transaction within a service method, I would add a flag in HttpContext.Items by doing:
if (!HttpContext.Current.Items.Contains("HasActiveTranaction"))
HttpContext.Current.Items.Add("HasActiveTranaction", true);
checking that no transaction has already been started. If the key HasActiveTranaction exists, then it would mean this is already being executed as part of another transaction, and in that case not create a transaction. Also, doing the same check before trying to commit or rollback the transaction.
This way it makes it possible to use one service method from another, as well as use both on their own (having their own transactions).
It can be further improved upon by creating a static type TransactionHelper, with a property "HasActiveTranaction" to start with, that centralizes access to it.
Does it sound a good solution?
Related
I have two scenarios (examples below), both are perfectly legitimate methods of making a database request, however I'm not really sure which is best.
Example One - This is the method we generally use when building new applications.
private readonly IInterfaceName _repositoryInterface;
public ControllerName()
{
_repositoryInterface = new Repository(Context);
}
public JsonResult MethodName(string someParameter)
{
var data = _repositoryInterface.ReturnData(someParameter);
return data;
}
protected override void Dispose(bool disposing)
{
Context.Dispose();
base.Dispose(disposing);
}
public IEnumerable<ModelName> ReturnData(filter)
{
Expression<Func<ModelName, bool>> query = q => q.ParameterName.ToUpper().Contains(filter)
return Get(filter);
}
Example Two - I've recently started seeing this more frequently
using (SqlConnection connection = new SqlConnection(
ConfigurationManager.ConnectionStrings["ConnectionName"].ToString()))
{
var storedProcedureName = GetStoredProcedureName();
using (SqlCommand command = new SqlCommand(storedProcedureName, connection))
{
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add("#Start", SqlDbType.Int).Value = start;
using (SqlDataReader reader = command.ExecuteReader())
{
// DATA IS READ AND PARSED
}
}
}
Both examples use Entity Framework in some form (the first more so than the other), there are Model and Mapping files for every table which could be interrogated. The main thing the second example does over the first (regarding EF) is utilising Migrations as part of the Stored Procedure code generation. In addition, both implement the Repository pattern similar to that which is in the second link below.
Code First - MSDN
Contoso University - Tutorial
My understanding of Example One is that the repository and context are instantiated once the Controller is called. When making the call to the repository it returns the data but leaves the context intact until it is disposed of at the end of the method. Example Two on the other hand will call Dispose as soon as the database call is finished with (unless forced into memory, e.g. using .ToList() on an IEnumerable). If my understanding is not correct, please correct me where appropriate.
So my main question is what are the disadvantages and advantages of using one over the other? Example, is there a larger performance overhead of going with Example 2 compared to Example 1.
FYI: I've tried to search for an answer to the below but have been unsuccessful, so if you are of a similar question please feel free to point me in that direction.
You seem to be making a comparison like this:
Is it better to build a house or to install plumbing in the bathroom?
You can have both. You could have a repository (house) that uses data connections (plumbing) so it's not an "OR" situation.
There is no reason why the call to ReturnData doesn't use a SqlCommand under the hood.
Now, the real important difference that is worth considering is whether or not the repository holds a resource (memory, connection, pipe, file, etc) open for its lifetime, or just per data call.
The advantage of using a using is that resources are only opened for the duration of the call. This helps immensely with scaling of the app.
On the other hand there's an overhead to opening connections, so it's better - particularly for single threaded apps - to open a connection, do several tasks, and then close it.
So it really boils down to what type of app you're writing as to which approach you use.
Your second example isn't using entity framework. It seems you may have two different approaches to data access here although it is hard to tell from the repository snippet as it quite rightly hides the data access implementation. The second example is correctly using a "using" statement as you should on any object that implements IDisposable. It means you don't have to worry about calling dispose. This is using pure ADO.net which is what Entity Framework uses under the hood.
If the first example is using Entity framework you most likely have lazy loading in play in which case you need the DbContext to remain until the query has been executed. Entity Framework is an ORM tool. It too uses ADO.net under the hood to connect to the database but it also offers you alot more on top. A good book on both subjects should help you.
I found learning ADO.net first helps alot in understanding how Entity Framework retrieves info from the Database.
the using statement is good practice where ever you find an object that implements IDisposable. You can read more about that here : IDisposable the right way
In response to the change to the question - the answer still on the whole remains the same. In terms of performance - how fast are the queries returned? Does the performance of one work better than the other? Only your current system and set up can tell you that. Both approaches seem to be doing things the correct way.
I haven't worked with Migrations so not sure why you are getting ADO.net type queries integrating with your EF models but wouldn't be surprised by this functionality. Entity Framework as I have experienced it creates the queries for you and then executes them using the ADO.net objects from your second example. The key point is that you want to have the "using" block for SqlConnection and SqlCommand objects (although I don't think you need to nest them. everything inside the outer "using block will be disposed).
There is nothing stopping you putting a "using" block in your repository around the context but when it comes to lazily load the related Entities you will get an error as the context will have been disposed. If you need to make this change you can include the relevant elements in your query and do away with the lazy loading approach. There are performance gains in certain situations for doing this but again you need to balance this in terms to how your system is performing.
I'm reading Vaughn Vernon's book on Implementing Domain Driven design. I have also been going through the book code, C# version, from his github here.
The Java version of the book has decorators #Transactional which I believe are from the spring framework.
public class ProductBacklogItemService
{
#Transactional
public void assignTeamMemberToTask(
string aTenantId,
string aBacklogItemId,
string aTaskId,
string aTeamMemberId)
{
BacklogItem backlogItem =
backlogItemRepository.backlogItemOfId(
new TenantId(aTenantId),
new BacklogItemId(aBacklogItemId));
Team ofTeam =
teamRepository.teamOfId(
backlogItem.tennantId(),
backlogItem.teamId());
backlogItem.assignTeamMemberToTask(
new TeamMemberId(aTeamMemberId),
ofTeam,
new TaskId(aTaskId));
}
}
What would be the equivalent manual implementation in C#? I'm thinking something along the lines of:
public class ProductBacklogItemService
{
private static object lockForAssignTeamMemberToTask = new object();
private static object lockForOtherAppService = new object();
public voice AssignTeamMemberToTask(string aTenantId,
string aBacklogItemId,
string aTaskId,
string aTeamMemberId)
{
lock(lockForAssignTeamMemberToTask)
{
// application code as before
}
}
public voice OtherAppsService(string aTenantId)
{
lock(lockForOtherAppService)
{
// some other code
}
}
}
This leaves me with the following questions:
Do we lock by application service, or by repository? i.e. Should we not be doing backlogItemRepository.lock()?
When we are reading multiple repositories as part of our application service, how do we protect dependencies between repositories during transactions (where aggregate roots reference other aggregate roots by identity) - do we need to have interconnected locks between repositories?
Are there any DDD infrastructure frameworks that handle any of this locking?
Edit
Two useful answers came in to use transactions, as I haven't selected my persistence layer I am using in-memory repositories, these are pretty raw and I wrote them (they don't have transaction support as I don't know how to add!).
I will design the system so I do not need to commit to atomic changes to more than one aggregate root at the same time, I will however need to read consistently across a number of repositories (i.e. if a BacklogItemId is referenced from multiple other aggregates, then we need to protect against race conditions should BacklogItemId be deleted).
So, can I get away with just using locks, or do I need to look at adding TransactionScope support on my in-memory repository?
TL;DR version
You need to wrap your code in a System.Transactions.TransactionScope. Be careful about multi-threading btw.
Full version
So the point of aggregates is that the define a consistency boundary. That means any changes should result in the state of the aggregate still honouring it's invariants. That's not necessarily the same as a transaction. Real transactions are a cross-cutting implementation detail, so should probably be implemented as such.
A warning about locking
Don't do locking. Try and forget any notion you have of implementing pessimistic locking. To build scalable systems you have no real choice. The very fact that data takes time to be requested and get from disk to your screen means you have eventual consistency, so you should build for that. You can't really protect against race conditions as such, you just need to account for the fact they could happen and be able to warn the "losing" user that their command failed. Often you can start finding these issues later on (seconds, minutes, hours, days, whatever your domain experts tell you the SLA is) and tell users so they can do something about it.
For example, imagine if two payroll clerks paid an employee's expenses at the same time with the bank. They would find out later on when the books were being balanced and take some compensating action to rectify the situation. You wouldn't want to scale down your payroll department to a single person working at a time in order to avoid these (rare) issues.
My implementation
Personally I use the Command Processor style, so all my Application Services are implemented as ICommandHandler<TCommand>. The CommandProcessor itself is the thing looking up the correct handler and asking it to handle the command. This means that the CommandProcessor.Process(command) method can have it's entire contents processed in a System.Transactions.TransactionScope.
Example:
public class CommandProcessor : ICommandProcessor
{
public void Process(Command command)
{
using (var transaction = new TransactionScope())
{
var handler = LookupHandler(command);
handler.Handle(command);
transaction.Complete();
}
}
}
You've not gone for this approach so to make your transactions a cross-cutting concern you're going to need to move them a level higher in the stack. This is highly-dependent on the tech you're using (ASP.NET, WCF, etc) so if you add a bit more detail there might be an obvious place to put this stuff.
Locking wouldn't allow any concurrency on those code paths.
I think you're looking for a transaction scope instead.
I don't know what persistency layer you are going to use but the standard ones like ADO.NET, Entity Framework etc. support the TransactionScope semantics:
using(var tr = new TransactionScope())
{
doStuff();
tr.Complete();
}
The transaction is committed if tr.Complete() is called. In any other case it is rolled back.
Typically, the aggregate is a unit of transactional consistency. If you need the transaction to spread across multiple aggregates, then you should probably reconsider your model.
lock(lockForAssignTeamMemberToTask)
{
// application code as before
}
This takes care of synchronization. However, you also need to revert the changes in case of any exception. So, the pattern will be something like:
lock(lockForAssignTeamMemberToTask)
{
try {
// application code as before
} catch (Exception e) {
// rollback/restore previous values
}
}
When creating an interface having methods that are expected to be called in a specific order, is such dependency good practice, or should more patterns and practices be applied to "fix" it or make the situation better?
It's important users of some interfaces call methods in a specific order.
There are likely many various examples. This is the one that came to mind first:
A data source interface of which the author envisions the init method to always be called first by any caller (i.e to connect to the data source or look up preliminary meta info, etc), before any other of the operation methods are called.
interface DataAccess {
// Note to callers: this init must be called first and only once.
void InitSelf();
// operation: get the record having the given id
T Op_GetDataValue<T>(int id);
// operation: get a cound
int Op_GetCountOfData();
// operation: persist something to the data store
void Op_Persist(object o);
//etc.
}
However the caller may choose not to call the initialization method first.
In general I'm wondering if there are better ways for this situation.
You could have the other methods throw an exception if the object is uninitialized, or you could go for a more strict API. It would be more complicated to implement, but for example, InitSelf() could return an interface containing the data operations:
interface DataAccess {
DataOperations InitSelf();
}
interface DataOperations {
T Op_GetDataValue<T>(int id);
...
}
This would sort of require the consumer to initialize before performing operations, though there would be ways to circumvent that.
I'm a little confused. Implementors are not necessarily callers or users of the interface. In your example, implementors can assume that InitSelf is called before anything else, but aren't responsible for making that happen.
I think naming something InitXXX is a good indication that that is the case.
For more odd dependencies (not Init, which is very common), it would probably be better to not have the dependency.
Sometimes, it's not possible, and if you decide that it's not overkill to try to fix it, then a common thing is to separate into multiple interfaces that you get access to as you call early ones.
A common example is a database interface. You call connect, it returns a connection. You call createStatement on the connection, it returns a statement. You call setParam on the statement, you call runStatement on the statement, then you get a result, etc.
Any initialization should be done in the constructor. Clients of the DataAccess interface should not have to worry about the construction details.
I have a method that I want to be "transactional" in the abstract sense. It calls two methods that happen to do stuff with the database, but this method doesn't know that.
public void DoOperation()
{
using (var tx = new TransactionScope())
{
Method1();
Method2();
tc.Complete();
}
}
public void Method1()
{
using (var connection = new DbConnectionScope())
{
// Write some data here
}
}
public void Method2()
{
using (var connection = new DbConnectionScope())
{
// Update some data here
}
}
Because in real terms the TransactionScope means that a database transaction will be used, we have an issue where it could well be promoted to a Distributed Transaction, if we get two different connections from the pool.
I could fix this by wrapping the DoOperation() method in a ConnectionScope:
public void DoOperation()
{
using (var tx = new TransactionScope())
using (var connection = new DbConnectionScope())
{
Method1();
Method2();
tc.Complete();
}
}
I made DbConnectionScope myself for just such a purpose, so that I don't have to pass connection objects to sub-methods (this is more contrived example than my real issue). I got the idea from this article: http://msdn.microsoft.com/en-us/magazine/cc300805.aspx
However I don't like this workaround as it means DoOperation now has knowledge that the methods it's calling may use a connection (and possibly a different connection each). How could I refactor this to resolve the issue?
One idea I'm thinking of is creating a more general OperationScope, so that when teamed up with a custom Castle Windsor lifestyle I'll write, will mean any component requested of the container with OperationScopeLifetyle will always get the same instance of that component. This does solve the problem because OperationScope is more ambiguous than DbConnectionScope.
I'm seeing conflicting requirements here.
On the one hand, you don't want DoOperation to have any awareness of the fact that a database connection is being used for its sub-operations.
On the other hand, it clearly is aware of this fact because it uses a TransactionScope.
I can sort of understand what you're getting at when you say you want it to be transactional in the abstract sense, but my take on this is that it's virtually impossible (no, scratch that - completely impossible) to describe a transaction in such abstract terms. Let's just say you have a class like this:
class ConvolutedBusinessLogic
{
public void Splork(MyWidget widget)
{
if (widget.Validate())
{
widgetRepository.Save(widget);
widget.LastSaved = DateTime.Now;
OnSaved(new WidgetSavedEventArgs(widget));
}
else
{
Log.Error("Could not save MyWidget due to a validation error.");
SendEmailAlert(new WidgetValidationAlert(widget));
}
}
}
This class is doing at least two things that probably can't be rolled back (setting the property of a class and executing an event handler, which might for example cascade-update some controls on a form), and at least two more things that definitely can't be rolled back (appending to a log file somewhere and sending out an e-mail alert).
Perhaps this seems like a contrived example, but that is actually my point; you can't treat a TransactionScope as a "black box". The scope is in fact a dependency like any other; TransactionScope just provides a convenient abstraction for a unit of work that may not always be appropriate because it doesn't actually wrap a database connection and can't predict the future. In particular, it's normally not appropriate when a single logical operation needs to span more than one database connection, whether those connections are to the same database or different ones. It tries to handle this case of course, but as you've already learned, the result is sub-optimal.
The way I see it, you have a few different options:
Make explicit the fact that Method1 and Method2 require a connection by having them take a connection parameter, or by refactoring them into a class that takes a connection dependency (constructor or property). This way, the connection becomes part of the contract, so Method1 no longer knows too much - it knows exactly what it's supposed to know according to the design.
Accept that your DoOperation method does have an awareness of what Method1 and Method2 do. In fact, there is nothing wrong with this! It's true that you don't want to be relying on implementation details of some future call, but forward dependencies in the abstraction are generally considered OK; it's reverse dependencies you need to be concerned about, like when some class deep in the domain model tries to update a UI control that it has no business knowing about in the first place.
Use a more robust Unit of Work pattern (also: here). This is getting to be more popular and it is, by and large, the direction Microsoft has gone in with Linq to SQL and EF (the DataContext/ObjectContext are basically UOW implementations). This sleeves in well with a DI framework and essentially relieves you of the need to worry about when transactions start and end and how the data access has to occur (the term is "persistence ignorance"). This would probably require significant rework of your design, but pound for pound it's going to be the easiest to maintain long-term.
Hope one of those helps you.
How would you go about calling several methods in the data access layer from one method in the business logic layer so that all of the SQL commands lived in one SQL transaction?
Each one of the DAL methods may be called individually from other places in the BLL, so there is no guarantee that the data layer methods are always part of a transaction. We need this functionality so if the database goes offline in the middle of a long running process, there's no commit. The business layer is orchestrating different data layer method calls based on the results of each of the previous calls. We only want to commit (from the business layer) at the very end of the entire process.
well, firstly, you'll have to adhere to an atomic Unit of Work that you specify as a single method in your BLL. This would (for example) create the customer, the order and the order items. you'd then wrap this all neatly up inside a TransactionScope using statement. TransactionScope is the secret weapon here. below is some code that luckily enough I'm working on right now :):
public static int InsertArtist(Artist artist)
{
if (artist == null)
throw new ArgumentNullException("artist");
int artistid = 0;
using (TransactionScope scope = new TransactionScope())
{
// insert the master Artist
/*
we plug the artistid variable into
any child instance where ArtistID is required
*/
artistid = SiteProvider.Artist.InsertArtist(new ArtistDetails(
0,
artist.BandName,
artist.DateAdded));
// insert the child ArtistArtistGenre
artist.ArtistArtistGenres.ForEach(item =>
{
var artistartistgenre = new ArtistArtistGenreDetails(
0,
artistid,
item.ArtistGenreID);
SiteProvider.Artist.InsertArtistArtistGenre(artistartistgenre);
});
// insert the child ArtistLink
artist.ArtistLinks.ForEach(item =>
{
var artistlink = new ArtistLinkDetails(
0,
artistid,
item.LinkURL);
SiteProvider.Artist.InsertArtistLink(artistlink);
});
// insert the child ArtistProfile
artist.ArtistProfiles.ForEach(item =>
{
var artistprofile = new ArtistProfileDetails(
0,
artistid,
item.Profile);
SiteProvider.Artist.InsertArtistProfile(artistprofile);
});
// insert the child FestivalArtist
artist.FestivalArtists.ForEach(item =>
{
var festivalartist = new FestivalArtistDetails(
0,
item.FestivalID,
artistid,
item.AvailableFromDate,
item.AvailableToDate,
item.DateAdded);
SiteProvider.Festival.InsertFestivalArtist(festivalartist);
});
BizObject.PurgeCacheItems(String.Format(ARTISTARTISTGENRE_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(ARTISTLINK_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(ARTISTPROFILE_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(FESTIVALARTIST_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(ARTIST_ALL_KEY, String.Empty, String.Empty));
// commit the entire transaction - all or nothing
scope.Complete();
}
return artistid;
}
hopefully, you'll get the gist. basically, it's an all succeed or fail job, irrespective of any disparate databases (i.e. in the above example, artist and artistartistgenre could be hosted in two separate db stores but TransactionScope would care less about that, it works at COM+ level and manages the atomicity of the scope that it can 'see')
hope this helps
EDIT: you'll possibly find that the initial invocation of TransactionScope (on app start-up) may be slightly noticeable (i.e. in the example above, if called for the first time, can take 2-3 seconds to complete), however, subsequent calls are almost instantaneous (i.e. typically 250-750 ms). the trade off between a simple point of contact transaction vs the (unwieldy) alternatives mitigates (for me and my clients) that initial 'loading' latency.
just wanted to demonstrate that ease doesn't come without compromise (albeit in the initial stages)
What you describe is the very 'definition' of a long transaction.
Each DAL method could simply provide operations (without any specific commits). Your BLL (which is in effect where you are coordinating any calls to the DAL anyway) is where you can choose to either commit, or execute a 'savepoint'. A savepoint is an optional item which you can employ to allow 'rollbacks' within a long running transaction.
So for example, if my DAL has methods DAL1, DAL2, DAL3 are all mutative they would simply 'execute' data change operations (i.e. some type of Create, Update, Delete). From my BLL, lets assume I have BL1, and BL2 methods (BL1 is long running). BL1 invokes all the aforementoned DAL methods (i.e. DAL1...DAL3), while BL2, only invokes DAL3.
Therefore, on execution of each business logic method you might have the following:
BL1 (long-transaction) -> {savepoint} DAL1 -> {savepoint} DAL2 -> DAL3 {commit/end}
BL2 -> DAL3 {commit/end}
The idea behind the 'savepoint' is it can allow BL1 to rollback at any point if there are issues in the data operations. The long transaction is ONLY commited if all three operations successfully complete. BL2 can still call any method in the DAL, and it is responsible for controlling commits. NOTE: you could use 'savepoints' in short/regular transactions as well.
Good question. This gets to the heart of the impedance mismatch.
This is one of the strongest arguments for using stored procedures. Reason: they are designed to encapsulate multiple SQL statements in a transaction.
The same can be done procedurally in the DAL, but it results in code with less clarity, while usually resulting in moving the coupling/cohesion balance in the wrong direction.
For this reason, I implement the DAL at a higher level of abstraction than simply encapsulating tables.
just in case my comment in the original article didn't 'stick', here's what i'd added as additional info:
<-----
coincidently, just noticed another similar reference to this posted a few hours after your request. uses a similar strategy and might be worth you looking at as well:
http://stackoverflow.com/questions/494550/how-does-transactionscope-roll-back-transactions
----->