How would you go about calling several methods in the data access layer from one method in the business logic layer so that all of the SQL commands lived in one SQL transaction?
Each one of the DAL methods may be called individually from other places in the BLL, so there is no guarantee that the data layer methods are always part of a transaction. We need this functionality so if the database goes offline in the middle of a long running process, there's no commit. The business layer is orchestrating different data layer method calls based on the results of each of the previous calls. We only want to commit (from the business layer) at the very end of the entire process.
Well, firstly, you'll have to adhere to an atomic Unit of Work that you specify as a single method in your BLL. This would (for example) create the customer, the order and the order items. You'd then wrap this all neatly up inside a TransactionScope using statement. TransactionScope is the secret weapon here. Below is some code that, luckily enough, I'm working on right now :):
public static int InsertArtist(Artist artist)
{
    if (artist == null)
        throw new ArgumentNullException("artist");
    int artistid = 0;
    using (TransactionScope scope = new TransactionScope())
    {
        // insert the master Artist
        /*
           we plug the artistid variable into
           any child instance where ArtistID is required
        */
        artistid = SiteProvider.Artist.InsertArtist(new ArtistDetails(
            0,
            artist.BandName,
            artist.DateAdded));
        // insert the child ArtistArtistGenre
        artist.ArtistArtistGenres.ForEach(item =>
        {
            var artistartistgenre = new ArtistArtistGenreDetails(
                0,
                artistid,
                item.ArtistGenreID);
            SiteProvider.Artist.InsertArtistArtistGenre(artistartistgenre);
        });
        // insert the child ArtistLink
        artist.ArtistLinks.ForEach(item =>
        {
            var artistlink = new ArtistLinkDetails(
                0,
                artistid,
                item.LinkURL);
            SiteProvider.Artist.InsertArtistLink(artistlink);
        });
        // insert the child ArtistProfile
        artist.ArtistProfiles.ForEach(item =>
        {
            var artistprofile = new ArtistProfileDetails(
                0,
                artistid,
                item.Profile);
            SiteProvider.Artist.InsertArtistProfile(artistprofile);
        });
        // insert the child FestivalArtist
        artist.FestivalArtists.ForEach(item =>
        {
            var festivalartist = new FestivalArtistDetails(
                0,
                item.FestivalID,
                artistid,
                item.AvailableFromDate,
                item.AvailableToDate,
                item.DateAdded);
            SiteProvider.Festival.InsertFestivalArtist(festivalartist);
        });
        BizObject.PurgeCacheItems(String.Format(ARTISTARTISTGENRE_ALL_KEY, String.Empty, String.Empty));
        BizObject.PurgeCacheItems(String.Format(ARTISTLINK_ALL_KEY, String.Empty, String.Empty));
        BizObject.PurgeCacheItems(String.Format(ARTISTPROFILE_ALL_KEY, String.Empty, String.Empty));
        BizObject.PurgeCacheItems(String.Format(FESTIVALARTIST_ALL_KEY, String.Empty, String.Empty));
        BizObject.PurgeCacheItems(String.Format(ARTIST_ALL_KEY, String.Empty, String.Empty));
        // commit the entire transaction - all or nothing
        scope.Complete();
    }
    return artistid;
}
Hopefully you'll get the gist. Basically, it's an all-succeed-or-all-fail job, irrespective of any disparate databases (i.e. in the above example, artist and artistartistgenre could be hosted in two separate db stores, but TransactionScope couldn't care less about that: when more than one durable resource enlists, it promotes the work to a distributed transaction, coordinated by MSDTC on Windows, and manages the atomicity of the scope that it can 'see').
Hope this helps.
EDIT: you'll possibly find that the initial invocation of TransactionScope (on app start-up) is slightly noticeable (i.e. in the example above, the first call can take 2-3 seconds to complete); subsequent calls, however, are almost instantaneous (i.e. typically 250-750 ms). The trade-off between a simple single point of contact for transactions and the (unwieldy) alternatives mitigates (for me and my clients) that initial 'loading' latency.
Just wanted to demonstrate that ease doesn't come without compromise (albeit only in the initial stages).
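For what it's worth, here is a minimal sketch of why this suits DAL methods that are sometimes called on their own. A method that opens its own TransactionScope with the default Required option joins the caller's ambient transaction when there is one, and starts a fresh one otherwise. The class and method names (DemoDal, DemoBll) are hypothetical, and the "DAL" only reports which transaction it ran in rather than touching a database.

```csharp
using System;
using System.Transactions;

// Hypothetical stand-in for a data access layer; no real database is used.
public static class DemoDal
{
    // Returns the local identifier of the transaction this call ran in.
    public static string InsertSomething()
    {
        // Required (the default) joins an ambient transaction if one exists,
        // otherwise it starts a new one for just this call.
        using (var scope = new TransactionScope(TransactionScopeOption.Required))
        {
            string id = Transaction.Current.TransactionInformation.LocalIdentifier;
            scope.Complete();
            return id;
        }
    }
}

// Hypothetical business layer method orchestrating two DAL calls.
public static class DemoBll
{
    // Returns the identifiers seen by the BLL scope and by both DAL calls.
    public static string[] DoUnitOfWork()
    {
        using (var scope = new TransactionScope())
        {
            string outer = Transaction.Current.TransactionInformation.LocalIdentifier;
            string first = DemoDal.InsertSomething();
            string second = DemoDal.InsertSomething();
            scope.Complete(); // the only point where the work is really committed
            return new[] { outer, first, second };
        }
    }
}
```

Calling DemoBll.DoUnitOfWork() should yield three identical identifiers, while two standalone calls to DemoDal.InsertSomething() should each report a different one.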
What you describe is the very 'definition' of a long transaction.
Each DAL method could simply provide operations (without any specific commits). Your BLL (which is in effect where you are coordinating any calls to the DAL anyway) is where you can choose to either commit, or execute a 'savepoint'. A savepoint is an optional item which you can employ to allow 'rollbacks' within a long running transaction.
So for example, if my DAL has methods DAL1, DAL2 and DAL3 that are all mutative, they would simply 'execute' data change operations (i.e. some type of Create, Update, Delete). From my BLL, let's assume I have BL1 and BL2 methods (BL1 is long running). BL1 invokes all the aforementioned DAL methods (i.e. DAL1...DAL3), while BL2 only invokes DAL3.
Therefore, on execution of each business logic method you might have the following:
BL1 (long-transaction) -> {savepoint} DAL1 -> {savepoint} DAL2 -> DAL3 {commit/end}
BL2 -> DAL3 {commit/end}
The idea behind the 'savepoint' is that it allows BL1 to roll back to a known point if there are issues in the data operations. The long transaction is ONLY committed if all three operations complete successfully. BL2 can still call any method in the DAL, and it is responsible for controlling its own commits. NOTE: you could use 'savepoints' in short/regular transactions as well.
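The BL1/BL2 flow above can be sketched roughly as follows. This is only an in-memory illustration: FakeTransaction, SavepointBll and the savepoint names are hypothetical, and the retry-once policy is an assumption made for the example. With SQL Server the corresponding real calls would be SqlTransaction.Save(name) and SqlTransaction.Rollback(name).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a database transaction with savepoints.
public sealed class FakeTransaction
{
    private readonly List<string> pending = new List<string>();
    private readonly Dictionary<string, int> savepoints = new Dictionary<string, int>();

    public List<string> Committed { get; } = new List<string>();

    public void Execute(string operation) => pending.Add(operation);

    // Remember how many operations existed when the savepoint was taken.
    public void Save(string name) => savepoints[name] = pending.Count;

    // Discard everything executed after the named savepoint.
    public void RollbackTo(string name)
    {
        int keep = savepoints[name];
        pending.RemoveRange(keep, pending.Count - keep);
    }

    public void Commit()
    {
        Committed.AddRange(pending);
        pending.Clear();
    }
}

public static class SavepointBll
{
    // BL1: long transaction. dal2 returns true when the DAL2 work succeeded.
    public static void BL1(FakeTransaction tx, Func<bool> dal2)
    {
        tx.Save("start");
        tx.Execute("DAL1");
        tx.Save("beforeDal2");
        tx.Execute("DAL2 (attempt 1)");
        if (!dal2())
        {
            // Undo only the failed DAL2 work, keep DAL1, and try once more.
            tx.RollbackTo("beforeDal2");
            tx.Execute("DAL2 (attempt 2)");
            if (!dal2())
            {
                tx.RollbackTo("start"); // give up: nothing gets committed
                return;
            }
        }
        tx.Execute("DAL3");
        tx.Commit(); // single commit at the very end
    }

    // BL2: short transaction that only needs DAL3.
    public static void BL2(FakeTransaction tx)
    {
        tx.Execute("DAL3");
        tx.Commit();
    }
}
```

The DAL operations themselves never commit; only the business methods decide where the savepoints and the final commit go.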
Good question. This gets to the heart of the impedance mismatch.
This is one of the strongest arguments for using stored procedures. Reason: they are designed to encapsulate multiple SQL statements in a transaction.
The same can be done procedurally in the DAL, but it results in code with less clarity, while usually resulting in moving the coupling/cohesion balance in the wrong direction.
For this reason, I implement the DAL at a higher level of abstraction than simply encapsulating tables.
Just in case my comment in the original article didn't 'stick', here's what I'd added as additional info:
<-----
Coincidentally, I just noticed another similar reference to this, posted a few hours after your request. It uses a similar strategy and might be worth looking at as well:
http://stackoverflow.com/questions/494550/how-does-transactionscope-roll-back-transactions
----->
I have an application that I am refactoring while trying to follow some of the "Clean Code" principles. It reads data from multiple different data sources, manipulates/formats that data, and inserts it into another database. I have a data layer with the associated DTOs, repositories, interfaces, and helpers for each data source, as well as a business layer with the matching entities, repositories and interfaces.
My question comes down to the Import method. I basically have one method that systematically calls each business logic method to read, process and save the data. There are a lot of calls that need to be made, and even though the Import method itself is not manipulating the data at all, the method is still extremely large. Is there a better way to process this data?
ICustomer<Customer> sourceCustomerList = new CustomerRepository();
foreach (Customer customer in sourceCustomerList.GetAllCustomers())
{
    // Read Some Data
    DataObject object1 = iSourceDataType1.GetDataByCustomerID(customer.ID);
    // Format and save the Data
    iTargetDataType1.InsertDataType1(object1);
    // Read Some Data
    // Format the Data
    // Save the Data
    //...Rinse and repeat
}
You should look into the Task Parallel Library (TPL) and Dataflow:
ICustomer<Customer> sourceCustomerList = new CustomerRepository();
var customersBuffer = new BufferBlock<Customer>();
var transformBlock = new TransformBlock<Customer, DataObject>(
    customer => iSourceDataType1.GetDataByCustomerID(customer.ID)
);
// Build your pipeline with TransformBlock, ActionBlock, and many more...
// PropagateCompletion lets Complete() flow down to the linked blocks
customersBuffer.LinkTo(transformBlock, new DataflowLinkOptions { PropagateCompletion = true });
// Add all the blocks you need here....
// Then feed the first block or use a custom source
foreach (var c in sourceCustomerList.GetAllCustomers())
    customersBuffer.Post(c);
customersBuffer.Complete();
Your performance will be IO-bound, especially with the many accesses to the database(s) in each iteration. Therefore, you need to revise your architecture to minimise IO.
Is it possible to move all the records closer together (maybe in a temporary database) as a first pass, then do the record matching and formatting within the database as a second pass, before reading them out and saving them where they need to be?
(As a side note, sometimes we get carried away with DDD and OO, where everything "needs" to be an object. But that is not always the best approach.)
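As a rough illustration of minimising IO, the sketch below reads everything in one round trip and groups it in memory, instead of querying once per customer. All names here (Row, CountingSource, BulkImporter) are hypothetical, and the "source" is just a list that counts how often it is queried. Whether a single bulk read is feasible depends on the data volume.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Row
{
    public int CustomerId { get; set; }
    public string Payload { get; set; }
}

// Hypothetical source that counts how many "round trips" were made.
public sealed class CountingSource
{
    private readonly List<Row> rows;

    public int RoundTrips { get; private set; }

    public CountingSource(IEnumerable<Row> rows)
    {
        this.rows = rows.ToList();
    }

    // One round trip per customer (the shape of the original loop).
    public List<Row> GetByCustomerId(int id)
    {
        RoundTrips++;
        return rows.Where(r => r.CustomerId == id).ToList();
    }

    // One round trip for everything.
    public List<Row> GetAll()
    {
        RoundTrips++;
        return rows.ToList();
    }
}

public static class BulkImporter
{
    // Reads once, then groups in memory so per-customer lookups cost no IO.
    public static Dictionary<int, List<Row>> LoadGrouped(CountingSource source)
    {
        return source.GetAll()
            .GroupBy(r => r.CustomerId)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

The per-customer loop then becomes a dictionary lookup, and the number of round trips no longer grows with the number of customers.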
Hello, I ran into a problem using ASP.NET MVC with EF while calling a Web API on another website (which also uses Entity Framework).
The problem is this:
I want to make sure that the MVC SaveChanges() and the Web API SaveChanges() either both succeed or both fail.
Here's my dream pseudo code:
public ActionResult Operation()
{
    Code Insert Update Delete....
    bool testMvcSaveSuccess = db.TempSaveChanges(); //it does not have this command.
    if(testMvcSaveSuccess == true)
    {
        bool isApiSuccess = CallApi(); //insert data to Other Web App
        if(isApiSuccess == true)
        {
            db.SaveChanges(); //Real Save
        }
    }
}
With the above code, if there is nothing like db.TempSaveChanges(), the Web API call may succeed while the MVC SaveChanges() then fails.
So there is nothing like TempSaveChanges because there is something even better: Transactions.
Transaction is an IDisposable (can be used in a using block) and has methods like Commit and Rollback.
Small example:
private void TestTransaction()
{
    // dispose the context as well as the transaction
    using (var context = new MyContext(connectionString))
    using (var transaction = context.Database.BeginTransaction())
    {
        // do CRUD stuff here

        // here is your 'TempSaveChanges' execution
        int changesCount = context.SaveChanges();
        if (changesCount > 0)
        {
            // changes were made
            // this will do the real db changes
            transaction.Commit();
        }
        else
        {
            // no changes detected -> so do nothing
            // could use 'transaction.Rollback();' but since there are no changes, this should not be necessary
            // using block will dispose transaction and with it all changes as well
        }
    }
}
I have extracted this example from my GitHub Exercise.EntityFramework repository. Feel free to Star/Clone/Fork...
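Applied to the question's flow (save locally, call the API, then commit only if the API call worked), a minimal sketch could look like this. ILocalTransaction, RecordingTransaction and TwoStepSave are hypothetical names introduced for illustration; with EF6 the transaction object returned by context.Database.BeginTransaction() exposes Commit and Rollback and would play the role of ILocalTransaction, with context.SaveChanges() passed in as saveChanges.

```csharp
using System;

// Hypothetical minimal view of a database transaction.
public interface ILocalTransaction : IDisposable
{
    void Commit();
    void Rollback();
}

// Test double that only records what happened to it.
public sealed class RecordingTransaction : ILocalTransaction
{
    public bool Committed { get; private set; }
    public bool RolledBack { get; private set; }

    public void Commit() { Committed = true; }
    public void Rollback() { RolledBack = true; }
    public void Dispose() { }
}

public static class TwoStepSave
{
    // saveChanges: writes inside the open transaction (the "temp save").
    // callApi: the remote call; returns true on success.
    public static bool Run(ILocalTransaction tx, Func<int> saveChanges, Func<bool> callApi)
    {
        using (tx)
        {
            bool apiSucceeded;
            try
            {
                saveChanges();            // visible only inside the transaction so far
                apiSucceeded = callApi();
            }
            catch (Exception)
            {
                tx.Rollback();
                throw;
            }

            if (apiSucceeded)
                tx.Commit();              // the "real save"
            else
                tx.Rollback();
            return apiSucceeded;
        }
    }
}
```

One gap remains with this shape: Commit itself can still fail after the API call has already succeeded, so the two systems can end up out of step.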
Yes you can.
You need to override SaveChanges in the context class so that your check runs first and the regular SaveChanges is called afterwards.
Or create your own TempSaveChanges() in the context class, call it, and if it succeeds call SaveChanges from it.
What you are referring to is known as atomicity: you want several operations to either all succeed, or none of them. In the context of a database you obtain this via transactions (if the database supports it). In your case however, you need a transaction which spans across two disjoint systems. A general-purpose (some special cases have simpler solutions) robust implementation of such a transaction would have certain requirements on the two systems, and also require additional persistence.
Basically, you need to be able to gracefully recover from a sudden stop at any point during the sequence. Each of the databases you are using is most likely ACID compliant, so you can count on each DB transaction to fulfill the atomicity requirement (it either succeeds or fails). Therefore, all you need to worry about is the sequence of the two DB transactions. Your requirement on the two systems is a way to determine a posteriori whether or not some operation was performed.
Example process flow:
Operation begins
Generate unique transaction ID and persist (with request data)
Make changes to local DB and commit
Call external Web API
Flag transaction as completed (or delete it)
Operation ends
Recovery:
Get all pending (not completed) transactions from store
Check if expected change to local DB was made
Ask Web API if expected change was made
If none of the changes were made or both of the changes were made then the transaction is done: delete/flag it.
If one of the changes was made but not the other, then either revert the change that was made (revert transaction), or perform the change that was not (resume transaction) => then delete/flag it.
Now, as you can see, it quickly gets complicated, especially if "determining if changes were made" is a non-trivial operation. A common solution is to use that unique transaction ID as a means of determining which data needs attention. But at this point it gets very application-specific and depends entirely on what the specific operations are. For certain applications, you can just re-run the entire operation in the recovery step (since you have the entire request data stored in the transaction). Some special cases do not need to persist the transaction at all, since there are other ways of achieving the same thing.
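The process flow and recovery steps above can be sketched as follows, using the "resume" strategy for half-finished operations. Everything here is hypothetical: OperationLog stands in for a durable store (a real one would be a table, not a list), and the four delegates are placeholders for the application-specific checks and repairs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class PendingOperation
{
    public Guid Id { get; set; }
    public string RequestData { get; set; }
    public bool Completed { get; set; }
}

// Hypothetical store for operations that have started but not finished.
public sealed class OperationLog
{
    private readonly List<PendingOperation> items = new List<PendingOperation>();

    // Step 1 of the flow: generate a unique ID and persist the request data.
    public PendingOperation Begin(string requestData)
    {
        var op = new PendingOperation { Id = Guid.NewGuid(), RequestData = requestData };
        items.Add(op);
        return op;
    }

    public void MarkCompleted(Guid id)
    {
        items.First(o => o.Id == id).Completed = true;
    }

    public List<PendingOperation> Pending()
    {
        return items.Where(o => !o.Completed).ToList();
    }
}

public static class Recovery
{
    // localDone / remoteDone answer "was the expected change made?".
    // resumeLocal / resumeRemote perform the half that is missing.
    public static void Run(
        OperationLog log,
        Func<PendingOperation, bool> localDone,
        Func<PendingOperation, bool> remoteDone,
        Action<PendingOperation> resumeLocal,
        Action<PendingOperation> resumeRemote)
    {
        foreach (var op in log.Pending())
        {
            bool local = localDone(op);
            bool remote = remoteDone(op);
            if (local && !remote) resumeRemote(op);
            else if (!local && remote) resumeLocal(op);
            // Both or neither: nothing to repair. Either way the entry is finished.
            log.MarkCompleted(op.Id);
        }
    }
}
```

A "revert" strategy would have the same structure, with the two repair delegates undoing the half that was applied instead of applying the half that was not.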
OK, so let's clarify things a bit.
You have an MVC app A1, with its own database D1.
You then have an API, let's call it A2, with its own database D2.
You want some code in A1 which does a temp save in D1, then fires a call to A2, and if the response is successful, saves the temp data from D1 in the right place this time.
Based on your pseudo code, I would suggest you create a second table in D1 where you save your "temporary" data. So your database has an extra table and the flow is like this:
First you save your A1 data in that table. You then call A2, and the data gets saved in D2. A1 receives the confirmation and calls a method which moves the data from the second table to where it should be.
Scenarios to consider:
Saving the temp data in D1 works, but the call to A2 fails. You now clear the orphaned data with a batch job, or simply call something that deletes it when the call to A2 fails.
The call to A2 succeeds and the follow-up call to D1 fails, so now you have temp data in D1 which has failed to move to the right table. You could add a flag to each row of the second table which indicates that the call to A2 succeeded, so this data needs to move to the right place when possible. You can have a service here which runs periodically, and if it finds any data with the flag set to true, it moves the data to the right place.
There are other ways to deal with scenarios like this. You could use a queue system to manage it. Each row of data becomes a message, and you assign it a unique ID, a GUID, which is basically a CorrelationID and is the same in both systems. Even if one system goes down, when it comes back up the data will be saved and all is good in the world, and because of the common ID you can always link it up properly.
C# 6.0 in a Nutshell by Joseph Albahari and Ben Albahari (O’Reilly).
Copyright 2016 Joseph Albahari and Ben Albahari, 978-1-491-92706-9.
brings, at page 376, a discussion on disposing DataContext/ObjectContext instances.
Disposing DataContext/ObjectContext
Although DataContext/ObjectContext implement IDisposable, you can (in
general) get away without disposing instances. Disposing forces the
context’s connection to dispose—but this is usually unnecessary
because L2S and EF close connections automatically whenever you finish
retrieving results from a query. Disposing a context can actually be
problematic because of lazy evaluation. Consider the following:
IQueryable<Customer> GetCustomers (string prefix)
{
  using (var dc = new NutshellContext ("connection string"))
    return dc.GetTable<Customer>()
             .Where (c => c.Name.StartsWith (prefix));
}
...
foreach (Customer c in GetCustomers ("a"))
  Console.WriteLine (c.Name);
This will fail because the query is evaluated when we enumerate
it—which is after disposing its DataContext.
There are some caveats, though, on not disposing contexts.
(and it goes on to list them...)
At the end, to avoid the exception just described, it states:
If you want to explicitly dispose contexts, you must pass a
DataContext/ObjectContext instance into methods such as
GetCustomers to avoid the problem described.
The question:
I do not get what the author meant (no example followed).
I mean, is the author saying you can have the method still return an IQueryable<Customer>, dispose of the DataContext parameter, and keep deferred execution altogether?
How is this achieved? I can see it happening only by giving up lazy loading.
There is a conflict between the concept of lazy loading and the repository pattern. The repository pattern, which DataContext/ObjectContext are designed to support, separates the code that accesses a database from the code that consumes your business objects.
The fundamental problem with lazy loading properties is that the business objects being returned by the data layer depend on and utilize technology specific data retrieval when it may not be expected.
Some examples:
The underlying data retrieval mechanism has been disposed of when trying to access lazy loading properties later. This is what the author is trying to explain.
Customer myCustomer;
using (var dataSource = GetRepository()) {
    myCustomer = dataSource.Retrieve("John");
}
// throws exception since the connection to
// the database has been closed already
var orders = myCustomer.Orders;
You may have code somewhere in your UI which attempts to read from a certain property, which triggers a database call and slows down your UI. An SqlException may occur retrieving properties in unexpected places, leading to either unreliability or tight coupling between your data store and your consumer code.
// some business layer
Customer myCustomer = myRepository.GetCustomer("John");
...
// some UI component trying to show the customer's orders
var orders = myCustomer.Orders;
// could throw any kind of data access exception, such as SqlException
// e.g. Wifi is not working anymore, now I have to build error
// handling for that here, even though it's not very obvious to someone
// who is just accessing the Orders property
Note that in my humble opinion, this is worse than having explicit coupling between data and logic layers, since the coupling is there, but hidden from view.
It's saying that you should create the data context object outside the query methods and pass it to them for use.
Something like:
IQueryable<Customer> GetCustomers (NutshellContext dc, string prefix)
{
    return dc.GetTable<Customer>()
             .Where (c => c.Name.StartsWith (prefix));
}
Then when you call that method, pass in the data context you created. Dispose that context only once you have finished enumerating the results, never inside the query method itself.
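To make the difference concrete, here is a small self-contained sketch that needs no database. FakeContext is a hypothetical stand-in for a data context whose results can no longer be enumerated once it has been disposed; the two query methods mirror the failing version from the book and the version that takes the context as a parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a data context: enumeration fails once disposed.
public sealed class FakeContext : IDisposable
{
    private readonly string[] names = { "alice", "adam", "bob" };
    private bool disposed;

    // Iterator method: nothing runs until the caller starts enumerating.
    public IEnumerable<string> Customers()
    {
        foreach (var n in names)
        {
            if (disposed) throw new ObjectDisposedException(nameof(FakeContext));
            yield return n;
        }
    }

    public void Dispose()
    {
        disposed = true;
    }
}

public static class CustomerQueries
{
    // Broken shape: the context is disposed before the caller enumerates.
    public static IEnumerable<string> GetCustomersOwningContext(string prefix)
    {
        using (var dc = new FakeContext())
            return dc.Customers().Where(n => n.StartsWith(prefix));
    }

    // Working shape: the caller owns the context and decides when to dispose it.
    public static IEnumerable<string> GetCustomers(FakeContext dc, string prefix)
    {
        return dc.Customers().Where(n => n.StartsWith(prefix));
    }
}
```

The caller wraps both the query and its enumeration in one using block, for example `using (var dc = new FakeContext()) { var list = CustomerQueries.GetCustomers(dc, "a").ToList(); }`. Deferred execution is kept inside the block, and the context is still disposed explicitly.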
I am working on an MVC-EF application that has a service layer and a data-access layer. The controllers interact with the service layer, which in turn interacts with the data-access layer. The data-access layer is based on a simplistic implementation of the repository pattern, mostly exposing CRUD operations.
I am using transactions in my service layer method by doing:
bool done = false;
using (IDatabaseTransaction transaction = this.businessDataSource.BeginTransaction())
{
    try
    {
        // Execute some statements updating database
        // ....
        transaction.Commit();
        done = true;
    }
    catch (Exception)
    {
        transaction.Rollback();
    }
}
return done;
where BeginTransaction is a method on the repository base class as:
public IDatabaseTransaction BeginTransaction()
{
    return new DatabaseTransaction(this.Context.Database.BeginTransaction());
}
The service methods in, for example, EmployeeService would be AddEmployee(Employee emp) and AssignEmployeeToDepartments(int empId, IList departmentNames), each of which has its statements enclosed in a using (IDatabaseTransaction transaction) block as above. Assume the methods contain more complex business logic than their names suggest :).
Now I have to implement a third method, say ImportEmployee(), which aggregates the previously implemented AddEmployee() and AssignEmployeeToDepartments(). Since those methods are complex, they need to be reused rather than duplicated.
So my question is:
How can I refactor (or implement in a particular way) these service methods, so they can be used individually as well as part of another service method (which also executes its statements in a transaction)?
TIA.
EDIT:
I spent some time reading the related posts in stackoverflow and elsewhere and got few ideas how to go about solving the problem. I have described below the one that seems promising.
The key to the solution is "HttpContext.Items" (available through HttpContext.Current.Items), which by definition is a per-request cache.
Since all my transaction handling is within the service methods, whenever I need to start a new transaction within a service method, I would add a flag in HttpContext.Items by doing:
if (!HttpContext.Current.Items.Contains("HasActiveTranaction"))
    HttpContext.Current.Items.Add("HasActiveTranaction", true);
checking that no transaction has already been started. If the key HasActiveTranaction exists, then it would mean this is already being executed as part of another transaction, and in that case not create a transaction. Also, doing the same check before trying to commit or rollback the transaction.
This way it makes it possible to use one service method from another, as well as use both on their own (having their own transactions).
It can be further improved upon by creating a static type TransactionHelper, with a property "HasActiveTranaction" to start with, that centralizes access to it.
Does it sound like a good solution?
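A minimal sketch of such a helper, under some assumptions: the per-request bag is passed in as an IDictionary (HttpContext.Current.Items is one in classic ASP.NET), ISimpleTransaction and RecordingTx are hypothetical names, and the key is spelled exactly as in the post. Only the outermost caller begins, commits or rolls back; nested callers just run their work.

```csharp
using System;
using System.Collections;

// Hypothetical minimal view of a database transaction.
public interface ISimpleTransaction
{
    void Commit();
    void Rollback();
}

// Test double that counts commits and rollbacks.
public sealed class RecordingTx : ISimpleTransaction
{
    public int Commits { get; private set; }
    public int Rollbacks { get; private set; }

    public void Commit() { Commits++; }
    public void Rollback() { Rollbacks++; }
}

public static class TransactionHelper
{
    // Key spelled as in the post above.
    private const string Key = "HasActiveTranaction";

    // items: the per-request bag. begin: starts a real transaction and is
    // only invoked by the outermost caller. body: the work to perform.
    public static bool Run(IDictionary items, Func<ISimpleTransaction> begin, Action body)
    {
        if (items.Contains(Key))
        {
            // Already inside a transaction: do the work and let the outermost
            // caller decide. Exceptions propagate so that it can roll back.
            body();
            return true;
        }

        ISimpleTransaction tx = begin();
        items[Key] = true;
        try
        {
            body();
            tx.Commit();
            return true;
        }
        catch (Exception)
        {
            tx.Rollback();
            return false;
        }
        finally
        {
            items.Remove(Key); // never leave the flag behind for later code
        }
    }
}
```

Note one consequence of this design: a nested method must not swallow its own exceptions, otherwise the outer transaction would commit partial work.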
I'm reading Vaughn Vernon's book on Implementing Domain Driven design. I have also been going through the book code, C# version, from his github here.
The Java version of the book uses @Transactional annotations, which I believe are from the Spring framework.
public class ProductBacklogItemService
{
    @Transactional
    public void assignTeamMemberToTask(
        String aTenantId,
        String aBacklogItemId,
        String aTaskId,
        String aTeamMemberId)
    {
        BacklogItem backlogItem =
            backlogItemRepository.backlogItemOfId(
                new TenantId(aTenantId),
                new BacklogItemId(aBacklogItemId));
        Team ofTeam =
            teamRepository.teamOfId(
                backlogItem.tenantId(),
                backlogItem.teamId());
        backlogItem.assignTeamMemberToTask(
            new TeamMemberId(aTeamMemberId),
            ofTeam,
            new TaskId(aTaskId));
    }
}
What would be the equivalent manual implementation in C#? I'm thinking something along the lines of:
public class ProductBacklogItemService
{
    private static readonly object lockForAssignTeamMemberToTask = new object();
    private static readonly object lockForOtherAppService = new object();
    public void AssignTeamMemberToTask(string aTenantId,
        string aBacklogItemId,
        string aTaskId,
        string aTeamMemberId)
    {
        lock(lockForAssignTeamMemberToTask)
        {
            // application code as before
        }
    }
    public void OtherAppsService(string aTenantId)
    {
        lock(lockForOtherAppService)
        {
            // some other code
        }
    }
}
This leaves me with the following questions:
Do we lock by application service, or by repository? i.e. Should we not be doing backlogItemRepository.lock()?
When we are reading multiple repositories as part of our application service, how do we protect dependencies between repositories during transactions (where aggregate roots reference other aggregate roots by identity) - do we need to have interconnected locks between repositories?
Are there any DDD infrastructure frameworks that handle any of this locking?
Edit
Two useful answers came in suggesting transactions. As I haven't selected my persistence layer yet, I am using in-memory repositories; these are pretty raw and I wrote them myself (they don't have transaction support, as I don't know how to add it!).
I will design the system so I do not need to commit to atomic changes to more than one aggregate root at the same time, I will however need to read consistently across a number of repositories (i.e. if a BacklogItemId is referenced from multiple other aggregates, then we need to protect against race conditions should BacklogItemId be deleted).
So, can I get away with just using locks, or do I need to look at adding TransactionScope support on my in-memory repository?
TL;DR version
You need to wrap your code in a System.Transactions.TransactionScope. Be careful about multi-threading btw.
Full version
So the point of aggregates is that they define a consistency boundary. That means any change should result in the state of the aggregate still honouring its invariants. That's not necessarily the same as a transaction. Real transactions are a cross-cutting implementation detail, so should probably be implemented as such.
A warning about locking
Don't do locking. Try and forget any notion you have of implementing pessimistic locking. To build scalable systems you have no real choice. The very fact that data takes time to be requested and get from disk to your screen means you have eventual consistency, so you should build for that. You can't really protect against race conditions as such, you just need to account for the fact they could happen and be able to warn the "losing" user that their command failed. Often you can start finding these issues later on (seconds, minutes, hours, days, whatever your domain experts tell you the SLA is) and tell users so they can do something about it.
For example, imagine if two payroll clerks paid an employee's expenses at the same time with the bank. They would find out later on when the books were being balanced and take some compensating action to rectify the situation. You wouldn't want to scale down your payroll department to a single person working at a time in order to avoid these (rare) issues.
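One common way to detect the "losing" command without pessimistic locks is an optimistic version check, which also works for in-memory repositories. The sketch below is my own assumption of how that could look, not something taken from the book; VersionedRepository and its members are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical in-memory repository; each stored value carries a version number.
public sealed class VersionedRepository
{
    private readonly ConcurrentDictionary<string, (int Version, string State)> store =
        new ConcurrentDictionary<string, (int Version, string State)>();

    public void Add(string id, string state)
    {
        store[id] = (1, state);
    }

    public (int Version, string State) Get(string id)
    {
        return store[id];
    }

    // Succeeds only if nobody else saved since expectedVersion was read.
    public bool TrySave(string id, int expectedVersion, string newState)
    {
        if (!store.TryGetValue(id, out var current) || current.Version != expectedVersion)
            return false;
        // TryUpdate is atomic: it replaces the value only if it still equals 'current'.
        return store.TryUpdate(id, (expectedVersion + 1, newState), current);
    }
}
```

When TrySave returns false, the application service can tell the user that their command failed and let them retry against the fresh state.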
My implementation
Personally I use the Command Processor style, so all my Application Services are implemented as ICommandHandler<TCommand>. The CommandProcessor itself is the thing looking up the correct handler and asking it to handle the command. This means that the CommandProcessor.Process(command) method can have its entire contents processed in a System.Transactions.TransactionScope.
Example:
public class CommandProcessor : ICommandProcessor
{
    public void Process(Command command)
    {
        using (var transaction = new TransactionScope())
        {
            var handler = LookupHandler(command);
            handler.Handle(command);
            transaction.Complete();
        }
    }
}
You've not gone for this approach so to make your transactions a cross-cutting concern you're going to need to move them a level higher in the stack. This is highly-dependent on the tech you're using (ASP.NET, WCF, etc) so if you add a bit more detail there might be an obvious place to put this stuff.
Locking wouldn't allow any concurrency on those code paths.
I think you're looking for a transaction scope instead.
I don't know what persistence layer you are going to use, but the standard ones like ADO.NET, Entity Framework etc. support the TransactionScope semantics:
using(var tr = new TransactionScope())
{
    doStuff();
    tr.Complete();
}
The transaction is committed if tr.Complete() is called. In any other case it is rolled back.
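The commit-or-roll-back rule can be observed without any database by listening to the ambient transaction's TransactionCompleted event. This is a small sketch with a hypothetical helper name (ScopeOutcome); the scope is empty, so nothing is actually written anywhere.

```csharp
using System;
using System.Transactions;

public static class ScopeOutcome
{
    // Runs an empty scope and reports how its transaction ended.
    public static TransactionStatus Run(bool callComplete)
    {
        TransactionStatus outcome = TransactionStatus.Active;
        using (var scope = new TransactionScope())
        {
            // Record the final status once the transaction has finished.
            Transaction.Current.TransactionCompleted +=
                (sender, e) => outcome = e.Transaction.TransactionInformation.Status;

            if (callComplete)
                scope.Complete();
        } // Dispose commits or rolls back here
        return outcome;
    }
}
```

ScopeOutcome.Run(true) should report TransactionStatus.Committed, and ScopeOutcome.Run(false) should report TransactionStatus.Aborted. The same applies when an exception leaves the using block before Complete() is reached.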
Typically, the aggregate is a unit of transactional consistency. If you need the transaction to spread across multiple aggregates, then you should probably reconsider your model.
lock(lockForAssignTeamMemberToTask)
{
    // application code as before
}
This takes care of synchronization. However, you also need to revert the changes in case of any exception. So, the pattern will be something like:
lock(lockForAssignTeamMemberToTask)
{
    try {
        // application code as before
    } catch (Exception e) {
        // rollback/restore previous values
    }
}