I have a method that I want to be "transactional" in the abstract sense. It calls two methods that happen to do stuff with the database, but this method doesn't know that.
public void DoOperation()
{
    using (var tx = new TransactionScope())
    {
        Method1();
        Method2();
        tx.Complete();
    }
}
public void Method1()
{
    using (var connection = new DbConnectionScope())
    {
        // Write some data here
    }
}
public void Method2()
{
    using (var connection = new DbConnectionScope())
    {
        // Update some data here
    }
}
Because the TransactionScope means that, in real terms, a database transaction will be used, we have an issue: it could well be promoted to a distributed transaction if the two methods get two different connections from the pool.
I could fix this by wrapping the DoOperation() method in a ConnectionScope:
public void DoOperation()
{
    using (var tx = new TransactionScope())
    using (var connection = new DbConnectionScope())
    {
        Method1();
        Method2();
        tx.Complete();
    }
}
I made DbConnectionScope myself for just such a purpose, so that I don't have to pass connection objects to sub-methods (this is a more contrived example than my real issue). I got the idea from this article: http://msdn.microsoft.com/en-us/magazine/cc300805.aspx
However, I don't like this workaround, as it means DoOperation now has knowledge that the methods it calls may use a connection (and possibly a different connection each). How could I refactor this to resolve the issue?
One idea I'm thinking of is creating a more general OperationScope, so that, teamed up with a custom Castle Windsor lifestyle I'll write, any component requested from the container with an OperationScopeLifestyle will always get the same instance within that scope. This does solve the problem, because OperationScope is more ambiguous than DbConnectionScope.
I'm seeing conflicting requirements here.
On the one hand, you don't want DoOperation to have any awareness of the fact that a database connection is being used for its sub-operations.
On the other hand, it clearly is aware of this fact because it uses a TransactionScope.
I can sort of understand what you're getting at when you say you want it to be transactional in the abstract sense, but my take on this is that it's virtually impossible (no, scratch that - completely impossible) to describe a transaction in such abstract terms. Let's just say you have a class like this:
class ConvolutedBusinessLogic
{
    public void Splork(MyWidget widget)
    {
        if (widget.Validate())
        {
            widgetRepository.Save(widget);
            widget.LastSaved = DateTime.Now;
            OnSaved(new WidgetSavedEventArgs(widget));
        }
        else
        {
            Log.Error("Could not save MyWidget due to a validation error.");
            SendEmailAlert(new WidgetValidationAlert(widget));
        }
    }
}
This class is doing at least two things that probably can't be rolled back (setting the property of a class and executing an event handler, which might for example cascade-update some controls on a form), and at least two more things that definitely can't be rolled back (appending to a log file somewhere and sending out an e-mail alert).
Perhaps this seems like a contrived example, but that is actually my point; you can't treat a TransactionScope as a "black box". The scope is in fact a dependency like any other; TransactionScope just provides a convenient abstraction for a unit of work that may not always be appropriate because it doesn't actually wrap a database connection and can't predict the future. In particular, it's normally not appropriate when a single logical operation needs to span more than one database connection, whether those connections are to the same database or different ones. It tries to handle this case of course, but as you've already learned, the result is sub-optimal.
The way I see it, you have a few different options:
Make explicit the fact that Method1 and Method2 require a connection by having them take a connection parameter, or by refactoring them into a class that takes a connection dependency (constructor or property); a sketch of this option appears after the list. This way the connection becomes part of the contract, so DoOperation no longer knows too much - it knows exactly what it's supposed to know according to the design.
Accept that your DoOperation method does have an awareness of what Method1 and Method2 do. In fact, there is nothing wrong with this! It's true that you don't want to be relying on implementation details of some future call, but forward dependencies in the abstraction are generally considered OK; it's reverse dependencies you need to be concerned about, like when some class deep in the domain model tries to update a UI control that it has no business knowing about in the first place.
Use a more robust Unit of Work pattern. This is getting to be more popular and it is, by and large, the direction Microsoft has gone in with Linq to SQL and EF (the DataContext/ObjectContext are basically UoW implementations). This meshes well with a DI framework and essentially relieves you of the need to worry about when transactions start and end and how the data access has to occur (the term is "persistence ignorance"). This would probably require significant rework of your design, but pound for pound it's going to be the easiest to maintain long-term.
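To illustrate the first option, here is a minimal sketch; WidgetWriter, the SQL text, and the connectionFactory parameter are all made up for the example. The single connection, and the transaction around it, become an explicit part of the contract instead of an ambient scope:

using System;
using System.Data;
using System.Transactions;

public class WidgetWriter
{
    private readonly IDbConnection _connection;

    public WidgetWriter(IDbConnection connection)
    {
        _connection = connection;
    }

    public void Method1()
    {
        // Write some data here, always on the injected connection
        using (var command = _connection.CreateCommand())
        {
            command.CommandText = "INSERT INTO Widgets (Name) VALUES ('example')";
            command.ExecuteNonQuery();
        }
    }

    public void Method2()
    {
        // Update some data here, on the same connection
        using (var command = _connection.CreateCommand())
        {
            command.CommandText = "UPDATE Widgets SET Name = 'updated'";
            command.ExecuteNonQuery();
        }
    }
}

public void DoOperation(Func<IDbConnection> connectionFactory)
{
    using (var tx = new TransactionScope())
    using (var connection = connectionFactory())
    {
        connection.Open();   // opened inside the scope so it enlists in the transaction
        var writer = new WidgetWriter(connection);
        writer.Method1();
        writer.Method2();
        tx.Complete();
    }
}

Because only one connection is ever opened inside the scope, there is nothing for MSDTC to escalate.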
Hope one of those helps you.
Related
If I have a method outside its caller's scope that does a few things, and I have to call this method multiple times in multiple places, is there any way to make the entire scope of the caller available to the method without passing parameters and without using global variables? For example, if I need it to have access to a List and an Entity Framework context.
Instead of
void myMethod(string _string, List<string> _stringList, EntityContext _db)
{
    // log _string to a database table
    // add _string to _stringList
    // etc.
}
Is there a way I can just pass the _string and make the method inherit the scope as if I'm just writing the same three lines of code everywhere I call this method? It seems a lot cleaner to call myMethod("foo") than myMethod("foo", stringList, MyEntities).
I could create a class, instantiate it, and call the class, but I'm just plain curious if scope inheritance or scope passing is a thing.
Absolutely don't do that. If you have a context you need to pass, use a class to represent the context needed, but don't try to handwave it away and hide it. It makes for unmaintainable code full of interdependencies.
In fact, the "bother" or "overhead" of passing the context object around is a good thing: it points out that having dependencies between the elements of your software project is not free. If you think that writing out the extra parameter is "too much work", then you're missing the forest for the trees: the dependency thus introduced has a much higher mental overhead than the mere mechanics of typing an extra parameter. After you pass that context a few times, typing it will be second nature and have 0 real overhead. The typing is cheap and doesn't require thinking, but keeping in mind the dependency and how it figures in the design of the overall system is anything but.
So: if you are trying to argue that introducing the dependency is worth it, then you have to back it up with actions and actually pass the context object around. The real cost is in the dependency, not the typing. Otherwise, it's a case of "talk is cheap" :)
One way of decreasing the apparent "cost" of passing such context objects is to upset the balance and make the context object actually do something, besides just carrying data. You would then use the context object to manipulate the objects for you, instead of calling the methods on the objects. This sort of "inversion" is quite handy, and often results in better design. After all, the presence of the context indicates that there's an overarching common state, and that perhaps too much functionality is delegated to the "end object", making it intertwined with the common state, whereas it may make more sense in the context object, making the end object less dependent on the presence of any particular external state.
You'd want the context to have methods that require "seeing the big picture", i.e. being aware of the presence of multiple objects, whereas the "leaf objects" (the ones with myMethod) should have methods that don't require the context, or that are general enough not to force any particular context class.
In your case, myMethod perhaps instead of working directly on an EntityContext could generate a functor or a similar action-wrapping object that performs the action, and this could then be applied by the caller (e.g. the context) to execute the database action. This way later it'll be easier to centrally manage the queue of database operations, etc.
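As a rough sketch of that idea (LogContext, LogEntry, and the EntityContext members used here are assumptions, not your actual types), the leaf method hands its database work back as a deferred action and the context decides when to apply it:

using System;
using System.Collections.Generic;

public class LogContext
{
    private readonly List<string> _stringList = new List<string>();
    private readonly Queue<Action<EntityContext>> _pendingDbWork =
        new Queue<Action<EntityContext>>();

    // The leaf call stays as simple as myMethod("foo"); the database work is
    // captured as an action instead of being executed against the context immediately.
    public void MyMethod(string value)
    {
        _stringList.Add(value);
        _pendingDbWork.Enqueue(db => db.LogEntries.Add(new LogEntry { Message = value }));
    }

    // The context, which already "sees the big picture", applies the queued work
    // whenever it decides the database should actually be touched.
    public void Flush(EntityContext db)
    {
        while (_pendingDbWork.Count > 0)
            _pendingDbWork.Dequeue()(db);
        db.SaveChanges();
    }
}

The call site keeps the simple myMethod("foo") shape, and the queued database work can later be batched, logged, or wrapped in a transaction in one place.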
When I refactor large projects, this sort of "context inversion" comes in handy often, and the need for such patterns is very common. Usually, as large projects grow, the "leaf classes" start out lean, and end up acquiring functionality that belongs at a higher level. This is why using good tooling to explore the history of the repository is imperative, and it's equally important that the entire repository history is available, i.e. that it was properly imported to git. I personally use DeepGit to trace the history of the code I work on, and find such a tool indispensable. DeepGit is free as in beer for any use, and if you're not using a tool with similar functionality, you're seriously missing out, I think.
The need to pass contexts around is usually the indicator that a higher level has to be designed and introduced, and the "leafs" then need to be slimmed down, their context-using functionality moved out into the higher level. A few years down the road yet another higher level ends up being needed, although there are projects so far gone that when you just refactor them to make sense of the code base, two or three additional layers make themselves apparent!
I know of two ways this can be done. Consider you have the following method:
static void myMethod(string _stringA, string _stringB, string _stringC)
{
    Console.WriteLine($"{_stringA},{_stringB},{_stringC}");
}
The first way is to create an overload in the class. For example, you could create an overloaded method like:
static void myMethod(string _stringA)
{
    myMethod(_stringA, "stringB", "stringC");
}
The second way (which I would not advise) is to do it the functional way, as JavaScript does, by using delegates:
public delegate void MethodDelegate(string _string);

static MethodDelegate mMethod1;
static MethodDelegate mMethod2;

static void Main(string[] args)
{
    mMethod1 = delegate (string s) { myMethod(s, "method1-str-a", "method1-str-b"); };
    mMethod1("str1");

    mMethod2 = delegate (string s) { myMethod(s, "method2-str-a", "method2-str-b"); };
    mMethod2("str2");
}
I'm reading Vaughn Vernon's book Implementing Domain-Driven Design. I have also been going through the book's code, C# version, from his GitHub repository.
The Java version of the book uses @Transactional annotations, which I believe come from the Spring framework.
public class ProductBacklogItemService
{
    @Transactional
    public void assignTeamMemberToTask(
        string aTenantId,
        string aBacklogItemId,
        string aTaskId,
        string aTeamMemberId)
    {
        BacklogItem backlogItem =
            backlogItemRepository.backlogItemOfId(
                new TenantId(aTenantId),
                new BacklogItemId(aBacklogItemId));

        Team ofTeam =
            teamRepository.teamOfId(
                backlogItem.tenantId(),
                backlogItem.teamId());

        backlogItem.assignTeamMemberToTask(
            new TeamMemberId(aTeamMemberId),
            ofTeam,
            new TaskId(aTaskId));
    }
}
What would be the equivalent manual implementation in C#? I'm thinking something along the lines of:
public class ProductBacklogItemService
{
    private static object lockForAssignTeamMemberToTask = new object();
    private static object lockForOtherAppService = new object();

    public void AssignTeamMemberToTask(string aTenantId,
        string aBacklogItemId,
        string aTaskId,
        string aTeamMemberId)
    {
        lock (lockForAssignTeamMemberToTask)
        {
            // application code as before
        }
    }

    public void OtherAppsService(string aTenantId)
    {
        lock (lockForOtherAppService)
        {
            // some other code
        }
    }
}
This leaves me with the following questions:
Do we lock by application service, or by repository? i.e. Should we not be doing backlogItemRepository.lock()?
When we are reading multiple repositories as part of our application service, how do we protect dependencies between repositories during transactions (where aggregate roots reference other aggregate roots by identity) - do we need to have interconnected locks between repositories?
Are there any DDD infrastructure frameworks that handle any of this locking?
Edit
Two useful answers came in suggesting transactions. As I haven't selected my persistence layer yet, I am using in-memory repositories; these are pretty raw and I wrote them myself (they don't have transaction support, as I don't know how to add it!).
I will design the system so I do not need to commit atomic changes to more than one aggregate root at the same time. I will, however, need to read consistently across a number of repositories (i.e. if a BacklogItemId is referenced from multiple other aggregates, then we need to protect against race conditions should that BacklogItem be deleted).
So, can I get away with just using locks, or do I need to look at adding TransactionScope support on my in-memory repository?
TL;DR version
You need to wrap your code in a System.Transactions.TransactionScope. Be careful about multi-threading btw.
Full version
So, the point of aggregates is that they define a consistency boundary. That means any changes should result in the state of the aggregate still honouring its invariants. That's not necessarily the same as a transaction. Real transactions are a cross-cutting implementation detail, so should probably be implemented as such.
A warning about locking
Don't do locking. Try to forget any notion you have of implementing pessimistic locking. To build scalable systems you have no real choice. The very fact that data takes time to be requested and to get from disk to your screen means you have eventual consistency, so you should build for that. You can't really protect against race conditions as such; you just need to account for the fact that they could happen and be able to warn the "losing" user that their command failed. Often you can detect these issues later on (seconds, minutes, hours, days, whatever your domain experts tell you the SLA is) and tell users so they can do something about it.
For example, imagine if two payroll clerks paid an employee's expenses at the same time with the bank. They would find out later on when the books were being balanced and take some compensating action to rectify the situation. You wouldn't want to scale down your payroll department to a single person working at a time in order to avoid these (rare) issues.
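If you do want to detect the losing command at write time rather than later, a simple optimistic concurrency check is usually enough and needs no locks. This is only a sketch, with an assumed Version property and a bare-bones in-memory repository, not something from your code:

using System;
using System.Collections.Generic;

// A hypothetical in-memory repository showing the idea: detect the conflict,
// don't try to prevent it with locks.
public class InMemoryBacklogItemRepository
{
    private readonly Dictionary<string, BacklogItem> _items =
        new Dictionary<string, BacklogItem>();

    public void Save(BacklogItem item, int expectedVersion)
    {
        BacklogItem current;
        if (_items.TryGetValue(item.Id, out current) && current.Version != expectedVersion)
        {
            // Someone else changed this aggregate since the caller read it;
            // surface that so the "losing" user can decide what to do.
            throw new InvalidOperationException(
                "BacklogItem " + item.Id + " was modified by another command.");
        }

        item.Version = expectedVersion + 1;
        _items[item.Id] = item;
    }
}

// Stand-in aggregate for the sketch.
public class BacklogItem
{
    public string Id { get; set; }
    public int Version { get; set; }
}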
My implementation
Personally I use the Command Processor style, so all my Application Services are implemented as ICommandHandler<TCommand>. The CommandProcessor itself is the thing that looks up the correct handler and asks it to handle the command. This means that the CommandProcessor.Process(command) method can have its entire contents processed in a System.Transactions.TransactionScope.
Example:
public class CommandProcessor : ICommandProcessor
{
    public void Process(Command command)
    {
        using (var transaction = new TransactionScope())
        {
            var handler = LookupHandler(command);
            handler.Handle(command);
            transaction.Complete();
        }
    }
}
You've not gone for this approach, so to make your transactions a cross-cutting concern you're going to need to move them a level higher in the stack. This is highly dependent on the tech you're using (ASP.NET, WCF, etc.), so if you add a bit more detail there might be an obvious place to put this stuff.
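For what it's worth, one common way to keep transactions cross-cutting without adopting the full command processor is a decorator over the handler (or application service) interface. This is only a sketch using the ICommandHandler<TCommand> shape mentioned above, not something from your code:

using System.Transactions;

public interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

// Wraps any handler in a TransactionScope, so the handlers themselves
// stay ignorant of where transactions begin and end.
public class TransactionalCommandHandler<TCommand> : ICommandHandler<TCommand>
{
    private readonly ICommandHandler<TCommand> _inner;

    public TransactionalCommandHandler(ICommandHandler<TCommand> inner)
    {
        _inner = inner;
    }

    public void Handle(TCommand command)
    {
        using (var transaction = new TransactionScope())
        {
            _inner.Handle(command);   // the real application service runs here
            transaction.Complete();   // commit only if the handler succeeded
        }
    }
}

A DI container can then wrap every handler in this decorator, so the application services themselves never mention transactions.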
Locking wouldn't allow any concurrency on those code paths.
I think you're looking for a transaction scope instead.
I don't know what persistence layer you are going to use, but the standard ones like ADO.NET, Entity Framework, etc. support the TransactionScope semantics:
using (var tr = new TransactionScope())
{
    doStuff();
    tr.Complete();
}
The transaction is committed if tr.Complete() is called. In any other case it is rolled back.
Typically, the aggregate is a unit of transactional consistency. If you need the transaction to spread across multiple aggregates, then you should probably reconsider your model.
lock (lockForAssignTeamMemberToTask)
{
    // application code as before
}
This takes care of synchronization. However, you also need to revert the changes in case of any exception. So, the pattern will be something like:
lock (lockForAssignTeamMemberToTask)
{
    try
    {
        // application code as before
    }
    catch (Exception e)
    {
        // rollback/restore previous values
    }
}
When creating an interface having methods that are expected to be called in a specific order, is such dependency good practice, or should more patterns and practices be applied to "fix" it or make the situation better?
It's important that users of some interfaces call methods in a specific order.
There are likely many examples; this is the one that came to mind first:
A data source interface whose author envisions the init method always being called first by any caller (i.e. to connect to the data source or look up preliminary meta info, etc.), before any of the other operation methods are called.
interface DataAccess {
    // Note to callers: this init must be called first and only once.
    void InitSelf();

    // operation: get the record having the given id
    T Op_GetDataValue<T>(int id);

    // operation: get a count
    int Op_GetCountOfData();

    // operation: persist something to the data store
    void Op_Persist(object o);

    // etc.
}
However the caller may choose not to call the initialization method first.
In general I'm wondering if there are better ways for this situation.
You could have the other methods throw an exception if the object is uninitialized, or you could go for a more strict API. It would be more complicated to implement, but for example, InitSelf() could return an interface containing the data operations:
interface DataAccess {
    DataOperations InitSelf();
}

interface DataOperations {
    T Op_GetDataValue<T>(int id);
    ...
}
This would sort of require the consumer to initialize before performing operations, though there would be ways to circumvent that.
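For illustration, calling code under that shape might look like this (SqlDataAccess is a hypothetical implementation, and the operation methods are assumed to have moved onto DataOperations):

// The operations are only reachable through the object returned by InitSelf(),
// so initialization can't be skipped by accident.
DataAccess access = new SqlDataAccess("...connection details...");
DataOperations ops = access.InitSelf();

var value = ops.Op_GetDataValue<string>(42);
int count = ops.Op_GetCountOfData();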
I'm a little confused. Implementors are not necessarily callers or users of the interface. In your example, implementors can assume that InitSelf is called before anything else, but aren't responsible for making that happen.
I think naming something InitXXX is a good indication that that is the case.
For more odd dependencies (not Init, which is very common), it would probably be better to not have the dependency.
Sometimes, it's not possible, and if you decide that it's not overkill to try to fix it, then a common thing is to separate into multiple interfaces that you get access to as you call early ones.
A common example is a database interface. You call connect, it returns a connection. You call createStatement on the connection, it returns a statement. You call setParam on the statement, you call runStatement on the statement, then you get a result, etc.
Any initialization should be done in the constructor. Clients of the DataAccess interface should not have to worry about the construction details.
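A sketch of that suggestion, using a hypothetical SqlDataAccess and assuming InitSelf disappears from the interface because construction now covers it:

using System;
using System.Data.SqlClient;

public class SqlDataAccess : IDisposable
{
    private readonly SqlConnection _connection;

    public SqlDataAccess(string connectionString)
    {
        // The former InitSelf() work happens here, so a constructed
        // instance is always ready for the operation methods.
        _connection = new SqlConnection(connectionString);
        _connection.Open();
    }

    public int Op_GetCountOfData()
    {
        // "Data" is a placeholder table name for the example.
        using (var command = new SqlCommand("SELECT COUNT(*) FROM Data", _connection))
        {
            return (int)command.ExecuteScalar();
        }
    }

    public void Dispose()
    {
        _connection.Dispose();
    }
}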
I'm working on a fork of the Divan CouchDB library, and ran into a need to set some configuration parameters on the HttpWebRequest that's used behind the scenes. At first I started threading the parameters through all the layers of constructors and method calls involved, but then decided - why not pass in a configuration delegate?
So, in a more generic scenario, given:
class Foo {
    private parm1, parm2, ... , parmN

    public Foo(parm1, parm2, ... , parmN) {
        this.parm1 = parm1;
        this.parm2 = parm2;
        ...
        this.parmN = parmN;
    }

    public Bar DoWork() {
        var r = new externallyKnownResource();
        r.parm1 = parm1;
        r.parm2 = parm2;
        ...
        r.parmN = parmN;
        r.doStuff();
    }
}
do:
class Foo {
    private Action<externallyKnownResource> configurator;

    public Foo(Action<externallyKnownResource> configurator) {
        this.configurator = configurator;
    }

    public Bar DoWork() {
        var r = new externallyKnownResource();
        configurator(r);
        r.doStuff();
    }
}
The latter seems a lot cleaner to me, but it does expose to the outside world that class Foo uses externallyKnownResource.
Thoughts?
This can lead to cleaner looking code, but has a huge disadvantage.
If you use a delegate for your configuration, you lose a lot of control over how the objects get configured. The problem is that the delegate can do anything - you can't control what happens here. You're letting a third party run arbitrary code inside of your constructors, and trusting them to do the "right thing." This usually means you end up having to write a lot of code to make sure that everything was setup properly by the delegate, or you can wind up with very brittle, easy to break classes.
It becomes much more difficult to verify that the delegate properly sets up each requirement, especially as you go deeper into the tree. Usually, the verification code ends up much messier than the original code would have been, passing parameters through the hierarchy.
I may be missing something here, but it seems like a big disadvantage to create the externallyKnownResource object down in DoWork(). This precludes easy substitution of an alternate implementation.
Why not:
public Bar DoWork( IExternallyKnownResource r ) { ... }
IMO, you're best off accepting a configuration object as a single parameter to your Foo constructor, rather than a dozen (or so) separate parameters.
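As a rough illustration of the single-configuration-object suggestion (FooOptions is a made-up name, and ExternallyKnownResource stands in for the question's externallyKnownResource with a few invented settings):

// A plain settings bag: Foo's constructor takes one parameter no matter
// how many individual settings are added over time.
public class FooOptions
{
    public int Timeout { get; set; }
    public string UserAgent { get; set; }
    public bool AllowAutoRedirect { get; set; }
}

// Stand-in for the question's externallyKnownResource.
public class ExternallyKnownResource
{
    public int Timeout { get; set; }
    public string UserAgent { get; set; }
    public bool AllowAutoRedirect { get; set; }
    public void DoStuff() { /* talk to the real resource here */ }
}

public class Foo
{
    private readonly FooOptions options;

    public Foo(FooOptions options)
    {
        this.options = options;
    }

    public void DoWork()
    {
        var r = new ExternallyKnownResource();
        // Foo stays in control of how the resource gets configured,
        // instead of handing an arbitrary delegate access to it.
        r.Timeout = options.Timeout;
        r.UserAgent = options.UserAgent;
        r.AllowAutoRedirect = options.AllowAutoRedirect;
        r.DoStuff();
    }
}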
Edit:
There's no one-size-fits-all solution, no. But the question is fairly simple. I'm writing something that consumes an externally known entity (HttpWebRequest) that's already self-validating and has a ton of potentially necessary parameters. My options, really, are to re-create almost all of the configuration parameters this has and shuttle them in every time, or put the onus on the consumer to configure it as they see fit. – kolosy
The problem with your request is that in general it is poor class design to make the user of the class configure an external resource, even if it's a well-known or commonly used resource. It is better class design to have your class hide all of that from the user of your class. That means more work in your class, yes, passing configuration information to your external resource, but that's the point of having a separate class. Otherwise why not just have the caller of your class do all the work on your external resource? Why bother with a separate class in the first place?
Now, if this is an internal class doing some simple utility work for another class that you will always control, then you're fine. But don't expose this type of paradigm publicly.
I am refactoring a class so that the code is testable (using NUnit and RhinoMocks as testing and isolation frameworks) and have found myself with a method that is dependent on another (i.e. it depends on something which is created by that other method). Something like the following:
public class Impersonator
{
    private ImpersonationContext _context;

    public void Impersonate()
    {
        ...
        _context = GetContext();
        ...
    }

    public void UndoImpersonation()
    {
        if (_context != null)
            _someDepend.Undo();
    }
}
Which means that to test UndoImpersonation, I need to set it up by calling Impersonate (Impersonate already has several unit tests to verify its behaviour). This smells bad to me but in some sense it makes sense from the point of view of the code that calls into this class:
public void ExerciseClassToTest(Impersonator c)
{
    try
    {
        if (NeedImpersonation())
        {
            c.Impersonate();
        }
        ...
    }
    finally
    {
        c.UndoImpersonation();
    }
}
I wouldn't have worked this out if I hadn't tried to write a unit test for UndoImpersonation and found myself having to set up the test by calling the other public method. So, is this a bad smell, and if so, how can I work around it?
Code smell has got to be one of the most vague terms I have ever encountered in the programming world. For a group of people that pride themselves on engineering principles, it ranks right up there in terms of unmeasurable rubbish, and is about as useless a measure as LOCs per day for programmer efficiency.
Anyway, that's my rant, thanks for listening :-)
To answer your specific question, I don't believe this is a problem. If you test something that has pre-conditions, you need to ensure the pre-conditions have been set up first for the given test case.
One of the tests should be what happens when you call it without first setting up the pre-conditions - it should either fail gracefully or set up its own pre-condition if the caller hasn't bothered to do so.
Well, there is a bit too little context to tell, but it looks like _someDepend should be initialized in the constructor.
Initializing fields in an instance method is a big NO for me. A class should be fully usable (i.e. all methods work) as soon as it is constructed; so the constructor(s) should initialize all instance variables. See e.g. the page on single step construction in Ward Cunningham's wiki.
The reason initializing fields in an instance method is bad is mainly that it imposes an implicit ordering on how you can call methods. In your case, TheMethodIWantToTest will do different things depending on whether DoStuff was called first. This is generally not something a user of your class would expect, so it's bad :-(.
That said, sometimes this kind of coupling may be unavoidable (e.g. if one method acquires a resource such as a file handle, and another method is needed to release it). But even that should be handled within one method if possible.
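A small illustration of keeping acquisition and release inside one method (not code from the question, just a sketch):

using System.IO;

public static class FileWork
{
    // The file handle is acquired and released inside a single method, so callers
    // never need to remember a matching "close" call or a particular call order.
    public static string ReadFirstLine(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadLine();
        } // the reader (and the underlying handle) is disposed here, even if an exception is thrown
    }
}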
What applies to your case is hard to tell without more context.
Provided you don't consider mutable objects a code smell by themselves, having to put an object into the state needed for a test is simply part of the set-up for that test.
This is often unavoidable, for instance when working with remote connections - you have to call Open() before you can call Close(), and you don't want Open() to automatically happen in the constructor.
However you want to be very careful when doing this that the pattern is something readily understood - for instance I think most users accept this kind of behaviour for anything transactional, but might be surprised when they encounter DoStuff() and TheMethodIWantToTest() (whatever they're really called).
It's normally best practice to have a property that represents the current state - again look at remote or DB connections for an example of a consistently understood design.
The big no-no is for this to ever happen for properties. Properties should never care what order they are called in. If you have a simple value that does depend on the order of methods then it should be a parameterless method instead of a property-get.
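As a small illustration of the state-property idea (purely a sketch, reusing the _context field from the question):

public class Impersonator
{
    private ImpersonationContext _context;

    // Safe as a property: it only reports current state and doesn't
    // care in what order, or how often, it is read.
    public bool IsImpersonating
    {
        get { return _context != null; }
    }
}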
Yes, I think there is a code smell in this case. Not because of dependencies between methods, but because of the vague identity of the object. Rather than having an Impersonator which can be in different persona states, why not have an immutable Persona?
If you need a different Persona, just create a new one rather than changing the state of an existing object. If you need to do some cleanup afterwards, make Persona disposable. You can keep the Impersonator class as a factory:
using (var persona = impersonator.CreatePersona(...))
{
    // do something with the persona
}
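A possible shape for this, with illustrative names, and assuming the ImpersonationContext from the question knows how to undo itself (which the original code doesn't actually show):

using System;

public sealed class Persona : IDisposable
{
    private readonly ImpersonationContext _context;

    internal Persona(ImpersonationContext context)
    {
        _context = context;
    }

    public void Dispose()
    {
        // Cleanup lives with the object that needed it, so there is no
        // separate "undo" method whose call order callers must remember.
        _context.Undo();
    }
}

public class Impersonator
{
    public Persona CreatePersona(string userName)
    {
        // Acquire whatever the impersonation needs and hand it to the Persona.
        return new Persona(GetContext(userName));
    }

    private ImpersonationContext GetContext(string userName)
    {
        // ...acquire the impersonation context for userName here...
        throw new NotImplementedException();
    }
}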
To answer the title: having methods call each other (chaining) is unavoidable in object-oriented programming, so in my view there is nothing wrong with testing a method that calls another. A unit test can be a class, after all; it's a "unit" you're testing.
The level of chaining depends on the design of your object - you can either fork or cascade.
Forking:
classToTest1.SomeDependency.DoSomething()
Cascading:
classToTest1.DoSomething() (which internally would call SomeDependency.DoSomething)
But as others have mentioned, definitely keep your state initialisation in the constructor which from what I can tell, will probably solve your issue.