I'm reading Vaughn Vernon's book, Implementing Domain-Driven Design. I have also been going through the book code, C# version, from his GitHub here.
The Java version of the book code has @Transactional annotations, which I believe come from the Spring framework.
public class ProductBacklogItemService
{
    @Transactional
    public void assignTeamMemberToTask(
            String aTenantId,
            String aBacklogItemId,
            String aTaskId,
            String aTeamMemberId)
    {
        BacklogItem backlogItem =
            backlogItemRepository.backlogItemOfId(
                new TenantId(aTenantId),
                new BacklogItemId(aBacklogItemId));

        Team ofTeam =
            teamRepository.teamOfId(
                backlogItem.tenantId(),
                backlogItem.teamId());

        backlogItem.assignTeamMemberToTask(
            new TeamMemberId(aTeamMemberId),
            ofTeam,
            new TaskId(aTaskId));
    }
}
What would be the equivalent manual implementation in C#? I'm thinking something along the lines of:
public class ProductBacklogItemService
{
private static object lockForAssignTeamMemberToTask = new object();
private static object lockForOtherAppService = new object();
public void AssignTeamMemberToTask(string aTenantId,
string aBacklogItemId,
string aTaskId,
string aTeamMemberId)
{
lock(lockForAssignTeamMemberToTask)
{
// application code as before
}
}
public void OtherAppsService(string aTenantId)
{
lock(lockForOtherAppService)
{
// some other code
}
}
}
This leaves me with the following questions:
Do we lock by application service, or by repository? i.e. Should we not be doing backlogItemRepository.lock()?
When we are reading multiple repositories as part of our application service, how do we protect dependencies between repositories during transactions (where aggregate roots reference other aggregate roots by identity) - do we need to have interconnected locks between repositories?
Are there any DDD infrastructure frameworks that handle any of this locking?
Edit
Two useful answers came in suggesting transactions. However, as I haven't selected my persistence layer yet, I am using in-memory repositories; these are pretty raw and I wrote them myself (they don't have transaction support because I don't know how to add it!).
I will design the system so that I do not need to commit atomic changes to more than one aggregate root at the same time. I will, however, need to read consistently across a number of repositories (i.e. if a BacklogItemId is referenced from multiple other aggregates, then we need to protect against race conditions should that BacklogItem be deleted).
So, can I get away with just using locks, or do I need to look at adding TransactionScope support on my in-memory repository?
TL;DR version
You need to wrap your code in a System.Transactions.TransactionScope. Be careful about multi-threading btw.
Full version
So the point of aggregates is that they define a consistency boundary. That means any changes should result in the state of the aggregate still honouring its invariants. That's not necessarily the same as a transaction. Real transactions are a cross-cutting implementation detail, so they should probably be implemented as such.
A warning about locking
Don't do locking. Try to forget any notion you have of implementing pessimistic locking; to build scalable systems you have no real choice. The very fact that data takes time to be requested and to get from disk to your screen means you have eventual consistency, so you should build for that. You can't really protect against race conditions as such; you just need to account for the fact that they could happen and be able to warn the "losing" user that their command failed. Often you can find these issues later on (seconds, minutes, hours, days, whatever your domain experts tell you the SLA is) and tell users so they can do something about it.
For example, imagine if two payroll clerks paid an employee's expenses at the same time with the bank. They would find out later on when the books were being balanced and take some compensating action to rectify the situation. You wouldn't want to scale down your payroll department to a single person working at a time in order to avoid these (rare) issues.
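To make that concrete: a common way to detect the "losing" command, without any locking, is an optimistic concurrency check. Below is a minimal sketch using plain ADO.NET; the table name, Version column, and method names are illustrative assumptions, not code from the answer or the book.
using System;
using System.Data.SqlClient;

public class BacklogItemAssignments
{
    // Conditional UPDATE: it only succeeds if nobody changed the row since we read it.
    // Returns false so the caller can tell the losing user their command failed.
    public bool TryAssignTeamMember(
        SqlConnection connection, Guid backlogItemId, string teamMemberId, int expectedVersion)
    {
        const string sql =
            "UPDATE BacklogItems " +
            "SET AssignedTeamMemberId = @member, Version = Version + 1 " +
            "WHERE Id = @id AND Version = @version";

        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@member", teamMemberId);
            command.Parameters.AddWithValue("@id", backlogItemId);
            command.Parameters.AddWithValue("@version", expectedVersion);

            // Zero rows affected means another command won the race.
            return command.ExecuteNonQuery() == 1;
        }
    }
}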
My implementation
Personally I use the Command Processor style, so all my Application Services are implemented as ICommandHandler<TCommand>. The CommandProcessor itself is the thing that looks up the correct handler and asks it to handle the command. This means that the CommandProcessor.Process(command) method can have its entire contents run inside a System.Transactions.TransactionScope.
Example:
public class CommandProcessor : ICommandProcessor
{
public void Process(Command command)
{
using (var transaction = new TransactionScope())
{
var handler = LookupHandler(command);
handler.Handle(command);
transaction.Complete();
}
}
}
You've not gone for this approach, so to make your transactions a cross-cutting concern you're going to need to move them a level higher in the stack. This is highly dependent on the tech you're using (ASP.NET, WCF, etc.), so if you add a bit more detail there might be an obvious place to put this stuff.
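If it helps, one hedged sketch of what "a level higher" can look like without a command processor is a decorator that owns the TransactionScope, so the application service itself stays transaction-ignorant. The interface and class names here are illustrative:
using System.Transactions;

public interface IProductBacklogItemService
{
    void AssignTeamMemberToTask(string tenantId, string backlogItemId, string taskId, string teamMemberId);
}

// Cross-cutting transaction handling as a decorator; register this with your
// container in place of the bare service.
public class TransactionalProductBacklogItemService : IProductBacklogItemService
{
    private readonly IProductBacklogItemService inner;

    public TransactionalProductBacklogItemService(IProductBacklogItemService inner)
    {
        this.inner = inner;
    }

    public void AssignTeamMemberToTask(string tenantId, string backlogItemId, string taskId, string teamMemberId)
    {
        using (var scope = new TransactionScope())
        {
            inner.AssignTeamMemberToTask(tenantId, backlogItemId, taskId, teamMemberId);
            scope.Complete(); // anything thrown before this line rolls the transaction back
        }
    }
}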
Locking wouldn't allow any concurrency on those code paths.
I think you're looking for a transaction scope instead.
I don't know what persistence layer you are going to use, but the standard ones like ADO.NET, Entity Framework, etc. support the TransactionScope semantics:
using(var tr = new TransactionScope())
{
doStuff();
tr.Complete();
}
The transaction is committed if tr.Complete() is called. In any other case it is rolled back.
Typically, the aggregate is a unit of transactional consistency. If you need the transaction to spread across multiple aggregates, then you should probably reconsider your model.
lock(lockForAssignTeamMemberToTask)
{
// application code as before
}
This takes care of synchronization. However, you also need to revert the changes in case of any exception. So, the pattern will be something like:
lock(lockForAssignTeamMemberToTask)
{
try {
// application code as before
} catch (Exception e) {
// rollback/restore previous values
}
}
Related
This is in an ASP.NET Core MVC execution context.
I have this simple entity.
public class Foo
{
public int Id { get; private set; }
public string Name{ get; private set; }
public string Code { get; private set; }
private Foo() { }
public Foo(string Name, string Code)
{
GuardClauses.IsNullOrWhiteSpace(Name,nameof(Name), "cannot be null or empty");
GuardClauses.IsNullOrWhiteSpace(Code, nameof(Code), "cannot be null or empty");
this.Name = Name;
this.Code = Code;
}
}
In my DbContext I have this index/constraint that ensures the Code is unique from a persistence point of view.
protected override void OnModelCreating(ModelBuilder builder)
{
builder.Entity<Foo>()
.HasIndex(u => u.Code)
.IsUnique();
}
I want the addNewFoo method in my service class to ensure that for all Foos in my Application the property code is unique, before adding it.
I try as much as I can to respect persistence ignorance principle, but I'm not as skilled as I wish to do that.
For starters, is it the role of a Builder to determine if the Code field is Unique?
Secondly, I know that in my validation layer I can check whether there is already an existing foo with the same Code as the foo I'm currently trying to add, but this approach isn't thread safe or transactional.
The fact is, I don't want to wait until I add my foo and get a SqlException just to find out it cannot be done.
What is the best approach to ensure uniqueness in my application, with the Fail Fast principle in mind?
Because there isn't a concrete example or description of a system, I will generalize a bit. If you provide a concrete example I can add additional info. Every solution has a Context to which it applies best, and of course there is always a trade-off.
Let's ask a couple of questions regarding the nature of this Code and what it represents:
1. Who is responsible for the Code generation: the User of the system or the System itself?
2. Can the Code be completely random (a UUID, for example)?
3. Is the Code generated by some special algorithm (an SSN, or maybe a CarPartNumber that is composed of different parts with special meanings)?
And one more very important question:
4. How frequently do we expect these uniqueness violations to occur?
If the answer to question 2 is Yes, then you don't have a problem. You can have duplicate UUIDs, but the chances are very low. You can add a Unique Constraint to your DB just in case and treat this violation as a normal error that you don't care much about, since it's going to happen once in a million years.
If the answer to question 3 is Yes, then we have a different situation. In a multi-user system you cannot avoid concurrency. There are a couple of ways to deal with the situation:
Option 1: Optimistic Offline Lock
Option 2: Pessimistic Offline Lock
Option 3: If System is generating codes, have a special service and queue code generation requests.
If you choose to use a lock, you can either lock the whole resource Foo or only lock the Code generation.
Option 1:
You will have to handle the SQLException. This one is used in most applications today because it ensures a smooth User Experience by not causing the application to stall for large amounts of time because someone has locked a resource.
You can use an abstraction, for example a Repository. Define your own application-level exception, UniqueCodeViolationException, that will be thrown by the Repository. The Repository will try/catch the SQLException, process it, and wrap it in UniqueCodeViolationException when it recognizes the relevant error codes. This won't save you the check, but at least it will hide the concrete errors and keep the processing in only one place.
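A minimal sketch of that wrapping, assuming SQL Server (where error numbers 2601 and 2627 indicate a unique index/constraint violation) and EF Core. The AppDbContext and its Foos set are illustrative assumptions, not code from the question:
using System;
using System.Data.SqlClient; // or Microsoft.Data.SqlClient, depending on your provider
using Microsoft.EntityFrameworkCore;

public class UniqueCodeViolationException : Exception
{
    public UniqueCodeViolationException(string code, Exception inner)
        : base($"A Foo with code '{code}' already exists.", inner) { }
}

public class FooRepository
{
    private readonly AppDbContext _context; // hypothetical DbContext exposing DbSet<Foo> Foos

    public FooRepository(AppDbContext context) => _context = context;

    public void Add(Foo foo)
    {
        try
        {
            _context.Foos.Add(foo);
            _context.SaveChanges();
        }
        catch (DbUpdateException ex) when (ex.InnerException is SqlException sql
                                           && (sql.Number == 2601 || sql.Number == 2627))
        {
            // Translate the persistence-level error into an application-level one.
            throw new UniqueCodeViolationException(foo.Code, ex);
        }
    }
}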
Option 2:
Sometimes you really need to ensure that there is no concurrency, so you use this option. In this case you will have to lock the process of creating a Foo to a single user, and not allow others even to open the dialog/form/page for creating a Foo while there is a lock.
This ensures consistency and avoids the problem by creating a system that is basically unusable for multiple users targeting the same Foo. It's quite possible that the application you are building will have only one person responsible for Foo creation, or it may be that concurrency is very low, so this may be a good solution.
I have friends who use this lock in an application for Insurances. Usually in their application one person goes to one office to take out an Insurance, so the possibility of concurrency in the creation of an Insurance for the same person is very low, but the cost of having multiple Insurances for the same person is very high.
Option 3:
On the other hand, if your Code is generated by the System, you can have a CodeGenerationService that deals with code generation and ensures that unique codes are generated. You can have a queue for these requests. In each generation operation, the Service can check whether the code already exists and return an error (or throw an exception).
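A rough sketch of such a service, serializing code generation behind a single gate so two concurrent requests cannot produce the same code. ICodeStore is an illustrative abstraction over wherever codes are persisted, and a plain lock like this only works within a single process:
using System;

public interface ICodeStore
{
    bool Exists(string code);    // e.g. backed by the Foos table
    void Reserve(string code);   // persist the code so later checks see it
}

public class CodeGenerationService
{
    private readonly ICodeStore _store;
    private readonly object _gate = new object();

    public CodeGenerationService(ICodeStore store) => _store = store;

    public string NextCode()
    {
        // One request at a time: check-then-reserve becomes atomic within this process.
        lock (_gate)
        {
            string code;
            do
            {
                code = Guid.NewGuid().ToString("N").Substring(0, 8).ToUpperInvariant();
            }
            while (_store.Exists(code));

            _store.Reserve(code);
            return code;
        }
    }
}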
Now to question 4. If you don't expect collisions to happen often, just add a Unique Constraint in your DB and treat a violation as a general unexpected error. Add a check for whether the Code already exists and show an error if it does.
You can still have concurrency here, so there is a slim chance that one user adds a Foo and another gets an error: "Oops... something went wrong, please try again". Since this will happen once in a hundred years, it's OK.
The last solution will make your system a lot simpler by ignoring special situations that occur only rarely.
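The fail-fast pre-check itself can be as small as this, assuming a service with an injected EF Core DbContext (_context) exposing a Foos set; the check gives a friendlier, earlier error, while the unique index remains the real guarantee:
public async Task AddNewFooAsync(string name, string code)
{
    // Fail fast: reject an obviously duplicate Code before attempting the insert.
    bool codeAlreadyUsed = await _context.Foos.AnyAsync(f => f.Code == code);
    if (codeAlreadyUsed)
        throw new InvalidOperationException($"A Foo with code '{code}' already exists.");

    _context.Foos.Add(new Foo(name, code));

    // A concurrent insert between the check and the save can still slip through;
    // it will surface here as a DbUpdateException thanks to the unique index.
    await _context.SaveChangesAsync();
}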
I'm currently debugging a very large public method that does these things.
Fetch record(s) from DB
Do some logical checks
Based on the logical checks, determine whether additional DB operations are needed, like save/update/delete
Call a service and start a scheduled service
etc, etc
Now, within these logical groups, it fetches data, etc. Our architect has advised us to write unit tests against it. If I break it down into smaller public methods that are called from the big parent method, I know that won't be good practice. I've done some research, and people suggested something about making the methods internal and then exposing them to the test assembly using InternalsVisibleTo. The method has also existed for quite some time now, and I want to avoid totally refactoring it and doing mind-blowing regression testing.
Can anyone give me advice?
update
public JobModel SaveAndUpdateJob(JobModel jobModel)
{
using (var dbSession = OpenSession())
{
var jobEntity = isNew ? new JobEntity() : dbSession.Get<JobEntity>(jobModel.Id);
jobEntity.JobUniqueCode = jobEntity.GenerateUniqueCode(jobEntity, IsPublished);
RateJobAgainstSurvey(jobModel, dbSession, jobEntity);
_emailService.SendNotification(dbSession);
}
return jobModel;
}
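For reference, the InternalsVisibleTo approach people mentioned is just an assembly-level attribute; the test project name below is a placeholder:
// In AssemblyInfo.cs (or any .cs file) of the project under test:
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("MyProject.Tests")]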
One of the questions I was asked was this: I have a database table with the following columns.
pid - unique identifier
orderid - varchar(20)
documentid - int
documentpath - varchar(250)
currentLocation - varchar(250)
newlocation - varchar(250)
status - varchar(15)
I have to write a C# app to move the files from currentLocation to newlocation and update the status column to either 'SUCCESS' or 'FAILURE'.
This was my answer
Create a List of all the records using linq
Create a command object which would be perform moving files
Using foreach, invoke a delegate to move the files -
use EndInvoke to capture any exceptions and update the DB accordingly
I was told that the command pattern and delegates did not fit the bill here - I was asked to think about and implement a more suitable GoF pattern.
Not sure what they were looking for - in this day and age, do candidates keep a lot of info in their heads when one always has Google to find an answer and come up with a solution?
I sort of agree with Aaronaught's comment above. For a problem like this, sometimes you can overthink it and try to do something more than you actually need to do.
That said, the one GoF pattern that came to mind was "Iterator." In your first statement, you said you would read all the records into a List. The one thing that could be problematic with that is if you had millions of these records. You'd probably want to process them in a more successive fashion, rather than reading the entire list into memory. The Iterator pattern would give you the ability to iterate over the list without having to know the underlying (database) storage/retrieval mechanism. The underlying implementation of the iterator could retrieve one, ten, or a hundred records at a time, and dole them out to the business logic upon request. This would provide some testing benefit as well, because you could test your other "business" logic using a different type of underlying storage (e.g. in-memory list), so that your unit tests would be independent from the database.
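A minimal sketch of that idea, assuming the records live in a table named Documents and that unprocessed rows have a NULL status (both assumptions; the type and query are illustrative). The business logic only sees an IEnumerable, so a unit test can substitute an in-memory list:
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public class FileMoveRecord
{
    public Guid Pid { get; }
    public string CurrentLocation { get; }
    public string NewLocation { get; }

    public FileMoveRecord(Guid pid, string currentLocation, string newLocation)
    {
        Pid = pid;
        CurrentLocation = currentLocation;
        NewLocation = newLocation;
    }
}

public interface IRecordSource
{
    IEnumerable<FileMoveRecord> GetPendingMoves();
}

public class SqlRecordSource : IRecordSource
{
    private readonly string _connectionString;

    public SqlRecordSource(string connectionString) => _connectionString = connectionString;

    public IEnumerable<FileMoveRecord> GetPendingMoves()
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand(
            "SELECT pid, currentLocation, newlocation FROM Documents WHERE status IS NULL",
            connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // yield return streams rows to the caller one at a time instead of
                    // materialising millions of records into a List up front.
                    yield return new FileMoveRecord(
                        reader.GetGuid(0), reader.GetString(1), reader.GetString(2));
                }
            }
        }
    }
}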
A deep understanding of patterns is something you should definitely have as a developer - you shouldn't need to go to Google to determine which pattern to "use" because you won't have enough time to really understand that pattern between when you start reading about it and when you apply it.
Patterns are mostly about understanding forces and encapsulating variation. That is, forces create certain kinds of variation and we have well understood ways of encapsulating those kinds of variation. A "pattern" is a body of understanding about which forces lead to which kinds of variation and which methods of encapsulation best address those.
I have a friend who was teaching a course on patterns and it suddenly struck him that he could solve a given problem "using" (meaning "implementing the encapsulating technique of") every pattern in his course book. It really did a great job of helping drive home the fact that finding the right technique is more important than knowing how to apply a technique.
The Command pattern, for instance, starts with an understanding that sometimes we want to vary when something happens. In these cases, we want to decouple the decision of what to do from the decision of when to do it. In this example, I don't see any indication that when your command should be executed varies at all.
In fact, I don't really see anything that varies so there might not have been any patterns in the problem at all. If your interviewers were saying there were, then they may have some learning to do as well.
Anywho... I'd recommend Design Patterns Explained by Shalloway and Trott. You'll get a deeper understanding of what patterns are really for and how they help you do your job and, the next time they tell you that you are "using" the wrong pattern, you might just be in a position to educate them. That seems to go over pretty well for me... about 20% of the time. :)
I would rather say that the interviewer wanted you to use (or mention) the SOLID object oriented design principles here, and in that process you might use some design pattern.
For instance, we could make a design like the one below, which adheres to SRP, OCP, and DIP.
internal interface IStatusRecordsToMove
{
List<IRecord> Records { get; }
}
internal interface IRecord
{
string Status { get; set; }
}
internal interface IRecordsMover
{
ITargetDb TargetDb { get; }
void Move(IStatusRecordsToMove record);
}
internal interface ITargetDb
{
void SaveAndUpdateStatus(IRecord record);
}
class ProcessTableRecordsToMove : IStatusRecordsToMove
{
public List<IRecord> Records
{
get { throw new NotImplementedException(); }
}
}
internal class ProcessRecordsMoverImpl : IRecordsMover
{
#region IRecordsMover Members
public ITargetDb TargetDb
{
get { throw new NotImplementedException(); }
}
public void Move(IStatusRecordsToMove recordsToMove)
{
foreach (IRecord item in recordsToMove.Records)
{
TargetDb.SaveAndUpdateStatus(item);
}
}
#endregion
}
internal class TargetTableBDb : ITargetDb
{
public void SaveAndUpdateStatus(IRecord record)
{
try
{
//some db object, save new record
record.Status = "Success";
}
catch(ApplicationException)
{
record.Status = "Failed";
}
finally
{
//Update IRecord Status in Db
}
}
}
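For what it's worth, the intended wiring of those interfaces would look roughly like this once the NotImplementedException stubs are filled in (purely illustrative):
public static class Program
{
    public static void Main()
    {
        // Composition root: choose concrete implementations and run the move.
        IStatusRecordsToMove recordsToMove = new ProcessTableRecordsToMove();
        IRecordsMover mover = new ProcessRecordsMoverImpl();

        // Each record ends up with Status set to "Success" or "Failed" by ITargetDb.
        mover.Move(recordsToMove);
    }
}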
I have a method that I want to be "transactional" in the abstract sense. It calls two methods that happen to do stuff with the database, but this method doesn't know that.
public void DoOperation()
{
using (var tx = new TransactionScope())
{
Method1();
Method2();
tx.Complete();
}
}
public void Method1()
{
using (var connection = new DbConnectionScope())
{
// Write some data here
}
}
public void Method2()
{
using (var connection = new DbConnectionScope())
{
// Update some data here
}
}
Because in real terms the TransactionScope means that a database transaction will be used, we have an issue where it could well be promoted to a Distributed Transaction, if we get two different connections from the pool.
I could fix this by wrapping the DoOperation() method in a ConnectionScope:
public void DoOperation()
{
using (var tx = new TransactionScope())
using (var connection = new DbConnectionScope())
{
Method1();
Method2();
tx.Complete();
}
}
I made DbConnectionScope myself for just such a purpose, so that I don't have to pass connection objects to sub-methods (this is a more contrived example than my real issue). I got the idea from this article: http://msdn.microsoft.com/en-us/magazine/cc300805.aspx
However, I don't like this workaround, as it means DoOperation now has knowledge that the methods it's calling may use a connection (and possibly a different connection each). How could I refactor this to resolve the issue?
One idea I'm thinking of is creating a more general OperationScope, so that when teamed up with a custom Castle Windsor lifestyle I'll write, any component requested from the container with an OperationScopeLifestyle will always be the same instance of that component. This does solve the problem, because OperationScope is more ambiguous than DbConnectionScope.
I'm seeing conflicting requirements here.
On the one hand, you don't want DoOperation to have any awareness of the fact that a database connection is being used for its sub-operations.
On the other hand, it clearly is aware of this fact because it uses a TransactionScope.
I can sort of understand what you're getting at when you say you want it to be transactional in the abstract sense, but my take on this is that it's virtually impossible (no, scratch that - completely impossible) to describe a transaction in such abstract terms. Let's just say you have a class like this:
class ConvolutedBusinessLogic
{
public void Splork(MyWidget widget)
{
if (widget.Validate())
{
widgetRepository.Save(widget);
widget.LastSaved = DateTime.Now;
OnSaved(new WidgetSavedEventArgs(widget));
}
else
{
Log.Error("Could not save MyWidget due to a validation error.");
SendEmailAlert(new WidgetValidationAlert(widget));
}
}
}
This class is doing at least two things that probably can't be rolled back (setting the property of a class and executing an event handler, which might for example cascade-update some controls on a form), and at least two more things that definitely can't be rolled back (appending to a log file somewhere and sending out an e-mail alert).
Perhaps this seems like a contrived example, but that is actually my point; you can't treat a TransactionScope as a "black box". The scope is in fact a dependency like any other; TransactionScope just provides a convenient abstraction for a unit of work that may not always be appropriate because it doesn't actually wrap a database connection and can't predict the future. In particular, it's normally not appropriate when a single logical operation needs to span more than one database connection, whether those connections are to the same database or different ones. It tries to handle this case of course, but as you've already learned, the result is sub-optimal.
The way I see it, you have a few different options:
Make explicit the fact that Method1 and Method2 require a connection, by having them take a connection parameter or by refactoring them into a class that takes a connection dependency (constructor or property). This way, the connection becomes part of the contract, so Method1 no longer knows too much - it knows exactly what it's supposed to know according to the design (see the sketch after these options).
Accept that your DoOperation method does have an awareness of what Method1 and Method2 do. In fact, there is nothing wrong with this! It's true that you don't want to be relying on implementation details of some future call, but forward dependencies in the abstraction are generally considered OK; it's reverse dependencies you need to be concerned about, like when some class deep in the domain model tries to update a UI control that it has no business knowing about in the first place.
Use a more robust Unit of Work pattern (also: here). This is getting to be more popular and it is, by and large, the direction Microsoft has gone in with Linq to SQL and EF (the DataContext/ObjectContext are basically UoW implementations). This slots in well with a DI framework and essentially relieves you of the need to worry about when transactions start and end and how the data access has to occur (the term is "persistence ignorance"). This would probably require significant rework of your design, but pound for pound it's going to be the easiest to maintain long-term.
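A brief sketch of the first option, with the connection opened inside the scope so both methods enlist in the same transaction and no promotion to a distributed transaction occurs. Class and table names here are illustrative:
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

// The connection is now part of the contract: whoever constructs WidgetWriter
// decides which connection its methods share.
public class WidgetWriter
{
    private readonly IDbConnection _connection;

    public WidgetWriter(IDbConnection connection) => _connection = connection;

    public void Method1()
    {
        using (var command = _connection.CreateCommand())
        {
            command.CommandText = "INSERT INTO Widgets (Name) VALUES ('example')";
            command.ExecuteNonQuery();
        }
    }

    public void Method2()
    {
        using (var command = _connection.CreateCommand())
        {
            command.CommandText = "UPDATE Widgets SET Name = 'renamed' WHERE Id = 1";
            command.ExecuteNonQuery();
        }
    }
}

public class OperationService
{
    private readonly string _connectionString;

    public OperationService(string connectionString) => _connectionString = connectionString;

    public void DoOperation()
    {
        using (var tx = new TransactionScope())
        using (var connection = new SqlConnection(_connectionString))
        {
            connection.Open(); // opened inside the scope, so it enlists in the ambient transaction
            var writer = new WidgetWriter(connection);
            writer.Method1();
            writer.Method2();
            tx.Complete();
        }
    }
}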
Hope one of those helps you.
How would you go about calling several methods in the data access layer from one method in the business logic layer so that all of the SQL commands lived in one SQL transaction?
Each one of the DAL methods may be called individually from other places in the BLL, so there is no guarantee that the data layer methods are always part of a transaction. We need this functionality so if the database goes offline in the middle of a long running process, there's no commit. The business layer is orchestrating different data layer method calls based on the results of each of the previous calls. We only want to commit (from the business layer) at the very end of the entire process.
Well, firstly, you'll have to adhere to an atomic Unit of Work that you specify as a single method in your BLL. This would (for example) create the customer, the order and the order items. You'd then wrap this all up neatly inside a TransactionScope using statement. TransactionScope is the secret weapon here. Below is some code that, luckily enough, I'm working on right now :):
public static int InsertArtist(Artist artist)
{
if (artist == null)
throw new ArgumentNullException("artist");
int artistid = 0;
using (TransactionScope scope = new TransactionScope())
{
// insert the master Artist
/*
we plug the artistid variable into
any child instance where ArtistID is required
*/
artistid = SiteProvider.Artist.InsertArtist(new ArtistDetails(
0,
artist.BandName,
artist.DateAdded));
// insert the child ArtistArtistGenre
artist.ArtistArtistGenres.ForEach(item =>
{
var artistartistgenre = new ArtistArtistGenreDetails(
0,
artistid,
item.ArtistGenreID);
SiteProvider.Artist.InsertArtistArtistGenre(artistartistgenre);
});
// insert the child ArtistLink
artist.ArtistLinks.ForEach(item =>
{
var artistlink = new ArtistLinkDetails(
0,
artistid,
item.LinkURL);
SiteProvider.Artist.InsertArtistLink(artistlink);
});
// insert the child ArtistProfile
artist.ArtistProfiles.ForEach(item =>
{
var artistprofile = new ArtistProfileDetails(
0,
artistid,
item.Profile);
SiteProvider.Artist.InsertArtistProfile(artistprofile);
});
// insert the child FestivalArtist
artist.FestivalArtists.ForEach(item =>
{
var festivalartist = new FestivalArtistDetails(
0,
item.FestivalID,
artistid,
item.AvailableFromDate,
item.AvailableToDate,
item.DateAdded);
SiteProvider.Festival.InsertFestivalArtist(festivalartist);
});
BizObject.PurgeCacheItems(String.Format(ARTISTARTISTGENRE_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(ARTISTLINK_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(ARTISTPROFILE_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(FESTIVALARTIST_ALL_KEY, String.Empty, String.Empty));
BizObject.PurgeCacheItems(String.Format(ARTIST_ALL_KEY, String.Empty, String.Empty));
// commit the entire transaction - all or nothing
scope.Complete();
}
return artistid;
}
Hopefully you'll get the gist. Basically, it's an all-succeed-or-fail job, irrespective of any disparate databases (i.e. in the above example, Artist and ArtistArtistGenre could be hosted in two separate DB stores, but TransactionScope couldn't care less about that; it works at the COM+/DTC level and manages the atomicity of the scope that it can 'see').
Hope this helps.
EDIT: You'll possibly find that the initial invocation of TransactionScope (on app start-up) may be slightly noticeable (i.e. in the example above, if called for the first time, it can take 2-3 seconds to complete); however, subsequent calls are almost instantaneous (i.e. typically 250-750 ms). The trade-off between a simple single point of contact for transactions vs the (unwieldy) alternatives mitigates (for me and my clients) that initial 'loading' latency.
Just wanted to demonstrate that ease doesn't come without compromise (albeit only in the initial stages).
What you describe is the very 'definition' of a long transaction.
Each DAL method could simply provide operations (without any specific commits). Your BLL (which is in effect where you are coordinating any calls to the DAL anyway) is where you can choose to either commit, or execute a 'savepoint'. A savepoint is an optional item which you can employ to allow 'rollbacks' within a long running transaction.
So, for example, if my DAL has methods DAL1, DAL2, and DAL3, all mutative, they would simply 'execute' data change operations (i.e. some type of Create, Update, Delete). From my BLL, let's assume I have BL1 and BL2 methods (BL1 is long-running). BL1 invokes all the aforementioned DAL methods (i.e. DAL1...DAL3), while BL2 only invokes DAL3.
Therefore, on execution of each business logic method you might have the following:
BL1 (long-transaction) -> {savepoint} DAL1 -> {savepoint} DAL2 -> DAL3 {commit/end}
BL2 -> DAL3 {commit/end}
The idea behind the 'savepoint' is that it allows BL1 to roll back at any point if there are issues in the data operations. The long transaction is ONLY committed if all three operations complete successfully. BL2 can still call any method in the DAL, and it is responsible for controlling its own commits. NOTE: you could use 'savepoints' in short/regular transactions as well.
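In ADO.NET terms with SQL Server, a savepoint maps onto SqlTransaction.Save and a named Rollback. A minimal sketch of BL1; the Dal1/Dal2/Dal3 stubs are assumptions standing in for real DAL methods that execute against the shared connection and transaction:
using System.Data.SqlClient;

public class BusinessLayer
{
    public void RunBl1(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction())
            {
                try
                {
                    Dal1(connection, transaction);
                    transaction.Save("AfterDal1"); // savepoint: a named point we can roll back to

                    Dal2(connection, transaction);
                    Dal3(connection, transaction);

                    transaction.Commit(); // all three succeeded: end of the long transaction
                }
                catch (SqlException)
                {
                    // Undo everything after the savepoint; a plain Rollback() with no name
                    // would undo Dal1's work as well, as described above.
                    transaction.Rollback("AfterDal1");
                    transaction.Commit(); // illustrative only: keep Dal1's work, discard the rest
                }
            }
        }
    }

    // Assumed DAL methods that run their statements on the shared connection/transaction.
    private void Dal1(SqlConnection c, SqlTransaction t) { /* ... */ }
    private void Dal2(SqlConnection c, SqlTransaction t) { /* ... */ }
    private void Dal3(SqlConnection c, SqlTransaction t) { /* ... */ }
}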
Good question. This gets to the heart of the impedance mismatch.
This is one of the strongest arguments for using stored procedures. Reason: they are designed to encapsulate multiple SQL statements in a transaction.
The same can be done procedurally in the DAL, but it results in code with less clarity, and usually moves the coupling/cohesion balance in the wrong direction.
For this reason, I implement the DAL at a higher level of abstraction than simply encapsulating tables.
Just in case my comment on the original post didn't 'stick', here's what I'd added as additional info:
Coincidentally, I just noticed another similar reference to this posted a few hours after your question. It uses a similar strategy and might be worth looking at as well:
http://stackoverflow.com/questions/494550/how-does-transactionscope-roll-back-transactions