I need to add a method that calculates a weighted sum of a worker's salary and his superior's salary. I would like something like this:
class CompanyFinanse
{
public decimal WeightedSumOfWorkerSalaryAndSuperior(Worker WorkerA, Worker Superior)
{
return WorkerA.Salary + Superior.Salary * 2;
}
}
Is this a good design, or should I put this method somewhere else? I'm just starting to design the project and am thinking about a good, object-oriented way to organize methods in classes, so I would like to start with OOP in mind from the beginning. Best practice needed!
I would either put it in the worker class, or have a static function in a finance library. I don't think a Finance object really makes sense, I think it would be more of a set of business rules than anything, so it would be static.
public class Worker {
public Worker Superior {get;set;}
public decimal WeightedSalary {
get {
return (Superior.Salary * 2) + (this.Salary);
}
}
public decimal Salary {get;set;}
}
or
public static class Finance {
public static decimal WeightedSumOfWorkerSalaryAndSuperior(Worker WorkerA, Worker Superior) {
return WorkerA.Salary + Superior.Salary * 2; }
}
For your design to be Object Oriented, you should start by thinking of the purpose of the entire application. If there is only one method in your application (weighted sum), then there isn't too much design to go on.
If this is a finance application, maybe you could have a Salary class which contains a worker's salary and some utility functions.
For the method you pointed out, if the Worker class has a reference to his Superior, you could make this method part of the Worker class.
Without more information on the purpose of the application, it's difficult to give good guidance.
So it may be impossible to give you a complete answer about "best practices" without knowing more about your domain, but I can tell you that you may be setting yourself up for disaster by thinking about the implementation details this early.
If you're like me then you were taught that good OOD/OOP is meticulously detailed and involves BDUF. It wasn't until later in my career that I found out this is the reason so many projects become egregiously unmaintainable later on down the road. Assumptions are made about how the project might work, instead of allowing the design to emerge naturally from how the code is actually going to be used.
Simply stated: you need to be doing BDD / TDD (Behavior/Test Driven Development).
Start with a rough domain model sketched out, but avoid too much detail.
Pick a functional area that you want to get started with. Preferably at the top of the model, or one the user will be interacting with.
Brainstorm on expected functionality that the unit should have and make a list.
Start the TDD cycle on that unit and then refactor aggressively as you go.
What you will end up with is exactly what you do need, and nothing you don't (most of the time). You gain the added benefit of having full test coverage so you can refactor later on down the road without worrying about breaking stuff :)
I know I haven't given you any code here, but that is because anything I give you will probably be wrong, and then you will be stuck with it. Only you know how the code is actually going to be used, and you should start by writing the code in that way. TDD focuses on how the code should look, and then you can fill in the implementation details as you go.
A full explanation of this is beyond the scope of this post, but there are a myriad of resources available online as well as a number of books that are excellent resources for beginning the practice of TDD. These two guys should get you off to a good start.
Martin Fowler
Kent Beck
Following up on the answer by brien, I suggest looking at the practice of CRC cards (Class-Responsibility-Collaboration). There are many sources of information, including:
this tutorial from Cal Poly,
this orientation on the Agile Modeling web site, and
The CRC Card Book, which discusses the practice and its use with multiple languages.
Understanding which class should "own" a particular behavior (and/or which classes should collaborate in implementing a given use case), is in general a top-down kind of discussion driven by the overall design of what your system is doing for its users.
It is easy to find out whether your code needs improvement. There is a code smell in your snippet. You should address that.
It is good that you have a very descriptive name for the method, but it is too long. It seems that if you keep the method in this Finanse class, you inevitably need all of those words in the name to convey what the method is intended to do.
It basically means that this method may not belong to this class.
One way to address this code smell is to see if you could get a shorter method name if we have the method on other class. I see you have Worker and Salary classes.
Assuming those are the only classes left and you don't want to add up more classes, I would put this on Salary. Salary knows how to calculate weighted salary given another salary (Superior salary in this case) as input. You don't need more than two words for the method name now.
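To make the suggestion above concrete, here is a minimal sketch of what putting the calculation on Salary might look like. The Salary type, its Amount property, and the method name WeightedWith are assumptions for illustration; only the formula (own salary plus twice the superior's) comes from the question.

```csharp
using System;

// Hypothetical Salary type; the question only shows a decimal Salary property.
public sealed class Salary
{
    public decimal Amount { get; }

    public Salary(decimal amount)
    {
        Amount = amount;
    }

    // Own salary plus the superior's salary counted twice,
    // mirroring the formula from the question.
    public decimal WeightedWith(Salary superior)
    {
        if (superior == null) throw new ArgumentNullException(nameof(superior));
        return Amount + superior.Amount * 2;
    }
}
```

A call such as worker.Salary.WeightedWith(superior.Salary) then reads without repeating "worker", "salary" and "superior" in the method name, which is the point of moving it.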
@Shawn's answer is one variation of addressing this code smell. (I think you could call it the 'long method name' code smell.)
Recently we had a discussion regarding Data and Behavior separation in classes. The concept of separation of Data and Behavior is implemented by placing the Domain Model and its behavior into separate classes.
However, I am not convinced of the supposed benefits of this approach, even though it might have been coined by a "great" (I think it is Martin Fowler, though I am not sure). I present a simple example here. Suppose I have a Person class containing data for a Person and its methods (behavior).
class Person
{
string Name;
DateTime BirthDate;
//constructor
Person(string Name, DateTime BirthDate)
{
this.Name = Name;
this.BirthDate = BirthDate;
}
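int GetAge()
{
DateTime today = DateTime.Today;
int age = today.Year - BirthDate.Year;
if (BirthDate.Date > today.AddYears(-age)) age--; //birthday not reached yet this year
return age; //whole years; for illustration only
}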
}
Now, separate out the behavior and data into separate classes.
class Person
{
string Name;
DateTime BirthDate;
//constructor
Person(string Name, DateTime BirthDate)
{
this.Name = Name;
this.BirthDate = BirthDate;
}
}
class PersonService
{
Person personObject;
//constructor
PersonService(string Name, DateTime BirthDate)
{
this.personObject = new Person(Name, BirthDate);
}
//overloaded constructor
PersonService(Person personObject)
{
this.personObject = personObject;
}
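int GetAge()
{
DateTime today = DateTime.Today;
int age = today.Year - personObject.BirthDate.Year;
if (personObject.BirthDate.Date > today.AddYears(-age)) age--; //birthday not reached yet this year
return age; //whole years; for illustration only
}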
}
This is supposed to be beneficial, improve flexibility, and provide loose coupling. I do not see how. In my view it introduces extra code and a performance penalty, since each time we have to initialize objects of two classes. I also see more problems in extending this code. Consider what happens when we introduce inheritance in the above case: we have to inherit from both classes.
class Employee: Person
{
Double Salary;
Employee(string Name, DateTime BirthDate, Double Salary): base(Name, BirthDate)
{
this.Salary = Salary;
}
}
class EmployeeService: PersonService
{
Employee employeeObject;
//constructor
EmployeeService(string Name, DateTime BirthDate, Double Salary)
: this(new Employee(Name, BirthDate, Salary))
{
}
//overloaded constructor; PersonService has no parameterless constructor, so chain to base
EmployeeService(Employee employeeObject): base(employeeObject)
{
this.employeeObject = employeeObject;
}
}
Note that even if we segregate the behavior into a separate class, we still need an object of the Data class for the Behavior class methods to work on. So in the end our Behavior class contains both the data and the behavior, albeit with the data in the form of a model object.
You might say that you can add some interfaces to the mix, so we could have an IPersonService and an IEmployeeService. But introducing interfaces for each and every class and inheriting from interfaces does not seem OK to me.
So can you tell me what I have achieved by separating the data and behavior in the above case that I could not have achieved by having them in the same class?
I agree, the separation as you implemented is cumbersome. But there are other options. What about an ageCalculator object that has method getAge(person p)? Or person.getAge(IAgeCalculator calc). Or better yet calc.getAge(IAgeble a)
There are several benefits that accrue from separating these concerns. Assuming that you intended for your implementation to return years, what if a person / baby is only 3 months old? Do you return 0? .25? Throw an exception? What if I want the age of a dog? Age in decades or hours? What if I want the age as of a certain date? What if the person is dead? What if I want to use the Martian orbit for the year? Or the Hebrew calendar?
None of that should affect classes that consume the person interface but make no use of birthdate or age. By decoupling the age calculation from the data it consumes, you get increased flexibility and increased chance of reuse. (Maybe even calculate age of cheese and person with same code!)
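As a rough sketch of the calc.getAge(...) idea, the code below separates the age policy from the data it reads. The names IAgeable, IAgeCalculator and WholeYearsAgeCalculator, and the explicit asOf parameter, are assumptions made for illustration; "whole Gregorian years" is only one of the many policies the questions above hint at.

```csharp
using System;

// Hypothetical abstraction: anything with a start date can have an age.
public interface IAgeable
{
    DateTime BirthDate { get; }
}

public sealed class Person : IAgeable
{
    public string Name { get; }
    public DateTime BirthDate { get; }

    public Person(string name, DateTime birthDate)
    {
        Name = name;
        BirthDate = birthDate;
    }
}

public interface IAgeCalculator
{
    int GetAge(IAgeable subject, DateTime asOf);
}

// One possible policy: whole Gregorian years completed by the reference date.
public sealed class WholeYearsAgeCalculator : IAgeCalculator
{
    public int GetAge(IAgeable subject, DateTime asOf)
    {
        if (subject == null) throw new ArgumentNullException(nameof(subject));
        int age = asOf.Year - subject.BirthDate.Year;
        // Step back one year if the birthday has not yet occurred by asOf.
        if (subject.BirthDate.Date > asOf.Date.AddYears(-age)) age--;
        return age;
    }
}
```

A different policy (months, a different calendar, age at death) would be another IAgeCalculator implementation, and Person would not need to change.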
As usual, the optimal design will vary greatly with context. It would be a rare situation, however, in which performance influenced my decision on this type of problem. Other parts of the system are likely to be factors several orders of magnitude greater, such as the speed of light between browser and server, database retrieval, or serialization. Time and dollars are better spent refactoring toward simplicity and maintainability than on theoretical performance concerns. To that end, I find separating the data and behavior of domain models to be helpful. They are, after all, separate concerns, no?
Even with such priorities, things are muddled. Now the class that wants the person's age has another dependency, the calc class. Ideally, fewer class dependencies are desirable. Also, who is responsible for instantiating calc? Do we inject it? Create a calcFactory? Or should it be a static method? How does the decision affect testability? Has the drive toward simplicity actually increased complexity?
There seems to be a disconnect between OO's insistence on combining behavior with data and the single responsibility principle. When all else fails, write it both ways and then ask a coworker, "which one is simpler?"
Actually, Martin Fowler says that in the domain model, data and behavior should be combined. Take a look at AnemicDomainModel.
I realize I am about a year late on replying to this but anyway... lol
I have separated the Behaviors out before but not in the way you have shown.
It is when you have Behaviors that should have a common interface yet allow for different (unique) implementation for different objects that separating out the behaviors makes sense.
If I was making a game, for example, some behaviors available for objects might be the ability to walk, fly, jump and so forth.
By defining interfaces such as IWalkableBehavior, IFlyableBehavior and IJumpableBehavior and then writing concrete classes that implement them, you get great flexibility and code reuse.
For IWalkableBehavior you might have...
CannotWalk : IWalkableBehavior
LimitedWalking : IWalkableBehavior
UnlimitedWalking : IWalkableBehavior
Similar pattern for IFlyableBehavior and IJumpableBehavior.
These concrete classes would implement the behavior for CannotWalk, LimitedWalking and UnlimitedWalking.
In your concrete classes for the objects (such as an enemy) you would have a local instance of these Behaviors. For example:
IWalkableBehavior _walking = new CannotWalk();
Others might use new LimitedWalking() or new UnlimitedWalking();
When the time comes to handle the behavior of an enemy, say the AI finds the player is within a certain range of the enemy (and this could be a behavior as well, say IReactsToPlayerProximity), it may then naturally attempt to move the enemy closer to "engage" the player.
All that is needed is for the _walking.Walk(int xdist) method to be called and it will automagically be sorted out. If the object is using CannotWalk then nothing will happen because the Walk() method would be defined as simply returning and doing nothing. If using LimitedWalking the enemy may move a very short distance toward the player and if UnlimitedWalking the enemy may move right up to the player.
I might not be explaining this very clearly but basically what I mean is to look at it the opposite way. Instead of encapsulating your object (what you are calling Data here) into the Behavior class encapsulate the Behavior into the object using Interfaces and this gives you the "loose coupling" allowing you to refine the behaviors as well as easily extend each "behavioral base" (Walking, Flying, Jumping, etc) with new implementations yet your objects themselves know no difference. They just have a Walking behavior even if that behavior is defined as CannotWalk.
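The description above could be sketched roughly as follows. The interface and class names come from the answer; having Walk return the distance actually moved, the maxStep cap, and the Enemy.MoveToward method are assumptions added so the example is self-contained.

```csharp
using System;

public interface IWalkableBehavior
{
    // Returns how far the object actually moved for a requested distance.
    int Walk(int xdist);
}

public sealed class CannotWalk : IWalkableBehavior
{
    // Simply returns and does nothing.
    public int Walk(int xdist) => 0;
}

public sealed class LimitedWalking : IWalkableBehavior
{
    private readonly int _maxStep; // hypothetical cap per call

    public LimitedWalking(int maxStep) { _maxStep = maxStep; }

    // Moves at most _maxStep units in the requested direction.
    public int Walk(int xdist) => Math.Min(Math.Abs(xdist), _maxStep) * Math.Sign(xdist);
}

public sealed class UnlimitedWalking : IWalkableBehavior
{
    public int Walk(int xdist) => xdist;
}

public class Enemy
{
    private readonly IWalkableBehavior _walking;

    public int X { get; private set; }

    public Enemy(IWalkableBehavior walking, int startX = 0)
    {
        _walking = walking ?? throw new ArgumentNullException(nameof(walking));
        X = startX;
    }

    // The enemy does not know which walking implementation it holds.
    public void MoveToward(int targetX)
    {
        X += _walking.Walk(targetX - X);
    }
}
```

Swapping new CannotWalk() for new LimitedWalking(3) changes how far the enemy gets without touching the Enemy class itself.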
Funnily enough, OOP is often described as combining data and behavior.
What you're showing here is something I consider an anti-pattern: the "anemic domain model." It does suffer from all the problems you've mentioned, and should be avoided.
Different levels of an application might have a more procedural bent, which lends itself to a service model like the one you've shown, but that would usually only be at the very edge of a system. And even so, that would internally be implemented by traditional object design (data + behavior). Usually, this is just a headache.
Age is intrinsic to a person (any person). Therefore it should be a part of the Person object.
hasExperienceWithThe40mmRocketLauncher() is not intrinsic to a person, but perhaps to the interface MilitaryService that can either extend or aggregate the Person object. Therefore it should not be a part of the Person object.
In general, the goal is to avoid adding methods to the base object ("Person") just because it's the easiest way out, as you introduce exceptions to normal Person behavior.
Basically, if you see yourself adding stuff like "hasServedInMilitary" to your base object, you are in trouble. Next you will be doing loads of statements such as if (p.hasServedInMilitary()) blablabla. This is really logically the same as doing instanceOf() checks all the time, and indicates that Person and "Person who has seen military service" are really two different things, and should be disconnected somehow.
Taking a step back, OOP is about reducing the number of if and switch statements, and instead letting the various objects handle things as per their specific implementations of abstract methods/interfaces. Separating the Data and Behavior promotes this, but there's no reason to take it to extremes and separate all data from all behavior.
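One way to picture the "aggregate the Person object" option is the sketch below. MilitaryServiceRecord, RecordTraining and HasExperienceWith are hypothetical names chosen for illustration; the only idea taken from the answer is that military-specific questions live outside Person.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public string Name { get; }
    public DateTime BirthDate { get; }

    public Person(string name, DateTime birthDate)
    {
        Name = name;
        BirthDate = birthDate;
    }
}

// Hypothetical aggregate: military history wraps a Person
// instead of adding flags and checks to Person itself.
public class MilitaryServiceRecord
{
    private readonly HashSet<string> _weaponsTrained = new HashSet<string>();

    public Person Person { get; }

    public MilitaryServiceRecord(Person person)
    {
        Person = person ?? throw new ArgumentNullException(nameof(person));
    }

    public void RecordTraining(string weapon) => _weaponsTrained.Add(weapon);

    public bool HasExperienceWith(string weapon) => _weaponsTrained.Contains(weapon);
}
```

Code that only cares about people keeps working with Person, and code that cares about service history asks for a MilitaryServiceRecord, so no hasServedInMilitary-style checks leak into the base object.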
The approach you have described is consistent with the strategy pattern. It facilitates the following design principles:
The open/closed principle
Classes should be open for extension but closed for modification
Composition over Inheritance
Behaviours are defined as separate interfaces and specific classes that implement these interfaces. This allows better decoupling between the behaviour and the class that uses the behaviour. The behaviour can be changed without breaking the classes that use it, and the classes can switch between behaviours by changing the specific implementation used without requiring any significant code changes.
The answer is really that it's good in the right situation. As a developer part of your job is to determine the best solution for the problems presented and try to position the solution to be able to accommodate future needs.
I don't often follow this pattern, but if the compiler or environment is designed specifically to support the separation of data and behavior, there are many optimizations that can be achieved in how the platform handles and organizes your scripts.
It’s in your best interest to familiarize yourself with as many design patterns as possible rather than custom building your entire solution every time, and don’t be too judgmental just because a pattern doesn’t immediately make sense. You can often use existing design patterns to achieve flexible and robust solutions throughout your code. Just remember they are all meant as a starting point, so you should always be prepared to customize them to accommodate the individual scenarios you encounter.
The more I dive into functional programming, the more often I read the recommendation to favor static methods over non-static ones. You can read about that recommendation in this book, for example:
http://www.amazon.de/Functional-Programming-Techniques-Projects-Programmer/dp/0470744588
Of course that makes sense if you think about functional purity. A static function stands there and says: "I do not need any state!"
However, how does that influence testability? Doesn't a system with a lot of static methods become a pain to test (since static methods are hard to mock)? Or do mocks play a minor role in functional programming, and if so, why?
EDIT
Since there are doubts about whether the book really makes that recommendation, I will quote a little more. I hope that's OK with Oliver Sturm.
Use Static Methods
Static methods is one of the basic ideas worth considering as a general guideline. It is supported by many object oriented programmers, and from a functional point of view, functions can be made static most of the time. Any pure function can be made static.
(...)
Some may argue that the idea of always passing around all parameters means you're not exploiting the ideas of object orientation as much as you could. That may in fact be true, but then perhaps it is because object orientation concepts don't give as much consideration to issues of parallel execution as they should.
(...)
Finally, a guideline to recommend: when you have written a method that does not require access to any field in the class it lives in, make it static!
Btw, there have been good answers so far. Thanks for that!
One way of looking at this is that for functional programming you only need to mock state (by providing suitable inputs) that is required by the specific function. For OO programming you need to mock all of the state required for the inner working of the class.
Functional programs also have the side benefit that you can guarantee that repeating the same test with the same input will give the same result. In classic OO you have to guarantee not just the same input, but the same overall state.
In well-architected OO code, the difference will be minimal (as classes will have well-defined responsibilities), but the requirements for a functional test are still a strict subset of those for the equivalent OO test.
(I realise that functional programming styles can make use of OO via immutable objects - please read mentions of OO above as 'object oriented programming with mutable state')
Edit:
As pointed out by Fredrik, the important part about functional methods is not that they are static, but that they do not mutate the state of the program. A 'pure' function is a mapping from a set of inputs to a set of outputs (same input always gives same result), and has no other effect.
I think that static methods per se are not the problem; the problem comes when they start to operate on static data. As long as the static method takes its input as arguments, operates on it and returns a result, I see no problems testing it.
Even when I am not pursuing a functional approach in my code, I tend to make methods static whenever I can. But I think very carefully before introducing static state, or a static type.
All "state" in pure functional programming comes from the inputs. To unit test functional programs you create test inputs and observe the outputs. If your methods can not be tested by giving them test inputs and observing the output they are not functional enough.
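A small sketch of what "create test inputs and observe the outputs" can look like for a static, pure method. PriceMath, ApplyDiscount and the framework-free check are all hypothetical and only serve to illustrate the point; a real project would more likely use a test framework.

```csharp
using System;

public static class PriceMath
{
    // Pure: the result depends only on the arguments and nothing is mutated.
    public static decimal ApplyDiscount(decimal price, decimal percent)
    {
        if (percent < 0m || percent > 100m)
            throw new ArgumentOutOfRangeException(nameof(percent));
        return price * (1m - percent / 100m);
    }
}

public static class PriceMathChecks
{
    // No setup, no mocks, no shared state: feed inputs, compare outputs.
    public static bool ApplyDiscountLooksRight()
    {
        return PriceMath.ApplyDiscount(200m, 25m) == 150m
            && PriceMath.ApplyDiscount(80m, 0m) == 80m
            && PriceMath.ApplyDiscount(80m, 100m) == 0m;
    }
}
```

Because the method touches no static data, the check can run in any order and any number of times with the same result.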
In functional programming you would want to mock functions instead of objects. So if you want to test function f without depending on some ComplicatedAndLongFunction in
f(x)
{
myx = g(x);
y = ComplicatedAndLongFunction(myx);
myy = h(y)
return myy;
}
you may want to decouple f from the ComplicatedAndLongFunction by injecting the latter into f:
f(x, calc)
{
myx = g(x);
y = calc(myx);
myy = h(y)
return myy;
}
so you can specify the behavior of calc in your test.
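In C#, the injected calc can be an ordinary delegate, so a test can pass a lambda where production code passes the real function. This is a sketch only: the bodies of G and H are invented stand-ins for the g and h of the pseudocode, and the int types are an assumption.

```csharp
using System;

public static class Pipeline
{
    // Stand-ins for g and h; their bodies are assumptions for illustration.
    private static int G(int x) => x + 1;
    private static int H(int y) => y * 2;

    // calc is injected, so a test can pass a trivial function instead of
    // the real ComplicatedAndLongFunction.
    public static int F(int x, Func<int, int> calc)
    {
        if (calc == null) throw new ArgumentNullException(nameof(calc));
        int myx = G(x);
        int y = calc(myx);
        return H(y);
    }
}
```

In a test, something like Pipeline.F(3, v => v * 10) exercises F with a known stand-in, and a lambda that records its argument can confirm what F passed to calc, all without a mocking framework.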
This raises the question (in my head at least) if there are mocking frameworks that make it easy to specify expectations on functions without having to revert to objects.
I am an entry-level .NET developer and use it to develop web sites. I started with classic ASP and last year jumped ship with the help of a short C# book.
As I developed, I learned more and started to see that, coming from classic ASP, I had always used C# like a scripting language.
For example, in my last project I needed to encode video on the web server and wrote code like
public class Encoder
{
public static bool Encode(string videopath) {
...snip...
return true;
}
}
While searching samples related to my project I’ve seen people doing this
public class Encoder
{
public static Encode(string videopath) {
EncodedVideo encoded = new EncodedVideo();
...snip...
encoded.EncodedVideoPath = outputFile;
encoded.Success = true;
...snip...
}
}
public class EncodedVideo
{
public string EncodedVideoPath { get; set; }
public bool Success { get; set; }
}
As I understand it, the second example is more object oriented, but I don’t see the point of using the EncodedVideo object.
Am I doing something wrong? Is it really necessary to use this sort of code in a web app?
Someone once explained OO to me using a soda can.
A Soda can is an object, an object has many properties. And many methods. For example..
SodaCan.Drink();
SodaCan.Crush();
SodaCan.PourSomeForMyHomies();
etc...
The purpose of OO Design is theoretically to write a line of code once, and have abstraction between objects.
This means that Coder.Consume(SodaCan.contents); is relative to your question.
An encoded video is not the same thing as an encoder. An encoder returns an encoded video, and an encoded video may use an encoder, but they are two separate objects. Because they are two different entities serving different functions, they simply work together.
Much like me consuming a soda can does not mean that I am a soda can.
Neither example is really complete enough to evaluate. The second example seems to be more complex than the first, but without knowing how it will be used it's difficult to tell.
Object Oriented design is at its best when it allows you to either:
1) Keep related information and/or functions together (instead of using parallel arrays or the like).
Or
2) Take advantage of inheritance and interface implementation.
Your second example MIGHT be keeping the data together better, if it returns the EncodedVideo object AND the success or failure of the method needs to be kept track of after the fact. In this case you would be replacing a combination of a boolean "success" variable and a path with a single object, clearly documenting the relation of the two pieces of data.
Another possibility not touched on by either example is using inheritance to better organize the encoding process. You could have a single base class that handles the "grunt work" of opening the file, copying the data, etc. and then inherit from that class for each different type of encoding you need to perform. In this case much of your code can be written directly against the base class, without needing to worry about what kind of encoding is actually being performed.
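The base-class idea above might be sketched like this. VideoEncoderBase, EncodeCore and the two subclasses are hypothetical names, and the string manipulation merely stands in for real file handling and encoding, which the question elides.

```csharp
using System;

// Hypothetical base class: shared "grunt work" lives here,
// format-specific work lives in subclasses.
public abstract class VideoEncoderBase
{
    public string Encode(string videoPath)
    {
        if (string.IsNullOrWhiteSpace(videoPath))
            throw new ArgumentException("A video path is required.", nameof(videoPath));

        // Stands in for opening the file, copying data, and so on.
        string prepared = videoPath.Trim();
        return EncodeCore(prepared);
    }

    // Each encoding type supplies only the part that differs.
    protected abstract string EncodeCore(string videoPath);
}

public sealed class FlvEncoder : VideoEncoderBase
{
    protected override string EncodeCore(string videoPath) => videoPath + ".flv";
}

public sealed class Mp4Encoder : VideoEncoderBase
{
    protected override string EncodeCore(string videoPath) => videoPath + ".mp4";
}
```

Calling code can hold a VideoEncoderBase reference and call Encode without knowing which concrete encoder it was given.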
Actually the first looks better to me, but shouldn't return anything (or return an encoded video object).
Usually we assume methods complete successfully without exceptional errors - if exceptional errors are encountered, we throw an exception.
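A sketch of that convention applied to the question's classes: Encode returns the result object on success and throws on failure, so no Success flag is needed. The specific checks and the ".encoded" output extension are assumptions for illustration; the real encoding work is elided just as in the question.

```csharp
using System;
using System.IO;

public class EncodedVideo
{
    public string EncodedVideoPath { get; set; }
}

public class Encoder
{
    // Returns the result on success; failure is signalled by an
    // exception rather than a bool.
    public EncodedVideo Encode(string videoPath)
    {
        if (string.IsNullOrWhiteSpace(videoPath))
            throw new ArgumentException("A video path is required.", nameof(videoPath));
        if (!File.Exists(videoPath))
            throw new FileNotFoundException("Source video not found.", videoPath);

        // The actual encoding is omitted; the output name is hypothetical.
        string outputFile = Path.ChangeExtension(videoPath, ".encoded");
        return new EncodedVideo { EncodedVideoPath = outputFile };
    }
}
```

Callers that can recover from a failure catch the exception; everyone else just uses the returned EncodedVideo.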
Object oriented programming is fundamentally about organization. You can program in an OO way even without an OO language like C#. By grouping related functions and data together, it is easier to deal with increasingly complex projects.
You aren't necessarily doing something wrong. The question of which paradigm works best is highly debatable and isn't likely to have a clear winner, as there are so many different ways to measure "good" code, e.g. maintainability, scalability, performance, reusability, modularity, etc.
It isn't necessary, but it can be useful in some cases. Take a look at various MVC examples to see OO code. Generally, OO code has the advantage of being re-usable so that what was written for one application can be used for others over and over again. For example, look at log4net for example of a logging framework that many people use.
The way you structure an OO program--which objects you use and how you arrange them--really depends on many factors: the age of the project, the overall size of the project, the complexity of the problem, and a bit of personal taste.
The best advice I can think of that will wrap all the reasons for OO into one quick lesson is something I picked up learning design patterns: "Encapsulate the parts that change." The value of OO is to reuse elements that will be repeated without writing additional code. But obviously you only care to "wrap up" code into objects if it will actually be reused or modified in the future, thus you should figure out what is likely to change and make objects out of it.
In your example, the reason to use the second setup may be that you can reuse the EncodedVideo object elsewhere in the program. Any time you need to deal with an EncodedVideo, you don't concern yourself with "how do I encode and use video"; you just use the object you have and trust it to handle the logic. It may also be valuable to encapsulate the encoding logic if it's complex and likely to change. Then you isolate changes to just one place in the code, rather than the many potential places where you might have used the object.
(Brief aside: The particular example you posted isn't valid C# code. In the second example, the static method has no return type, though I assume you meant to have it return the EncodedVideo object.)
This is a design question, so the answer depends on what you need, meaning there's no right or wrong answer. The first method is simpler, but in the second case you encapsulate the encoding logic in the EncodedVideo class and can easily change the logic (based on the incoming video type, for instance) in your Encoder class.
I think the first example seems more simple, except I would avoid using statics whenever possible to increase testability.
public class Encoder
{
private string videoPath;
public Encoder(string videoPath) {
this.videoPath = videoPath;
}
public bool Encode() {
...snip...
return true;
}
}
Is OOP necessary? No.
Is OOP a good idea? Yes.
You're not necessarily doing something wrong. Maybe there's a better way, maybe not.
OOP, in general, promotes modularity, extensibility, and ease of maintenance. This goes for web applications, too.
In your specific Encoder/EncodedVideo example, I don't know if it makes sense to use two discrete objects to accomplish this task, because it depends on a lot of things.
For example, is the data stored in EncodedVideo only ever used within the Encode() method? Then it might not make sense to use a separate object.
However, if other parts of the application need to know some of the information that's in EncodedVideo, such as the path or whether the status is successful, then it's good to have an EncodedVideo object that can be passed around in the rest of the application. In this case, Encode() could return an object of type EncodedVideo rather than a bool, making that data available to the rest of your app.
Unless you want to reuse the EncodedVideo class for something else, then (from what code you've given) I think your method is perfectly acceptable for this task. Unless there's unrelated functionality in EncodedVideo and the Encoder classes or it forms a massive lump of code that should be split down, then you're not really lowering the cohesion of your classes, which is fine. Assuming you don't need to reuse EncodedVideo and the classes are cohesive, by splitting them you're probably creating unnecessary classes and increasing coupling.
Remember: 1. the OO philosophy can be quite subjective and there's no single right answer, 2. you can always refactor later :p
I have for some time tried to anthropomorphise (meaning human readable) the names I give to interfaces. To me this is the same as giving an interface a role-based name: trying to capture the purpose of the interface in the name.
I was having a discussion with other developers who think this is a little strange and childish.
What do the folks of SO think?
Examples (C# syntax):
public interface IShowMessages
{
void Show(string message);
void Show(string title, string message);
}
public class TraceMessenger : IShowMessages
{
}
public interface IHaveMessageParameters
{
IList<string> Parameters { get; }
}
public class SomeClass : IHaveMessageParameters
{
}
IThinkItsATerribleIdea
Of course you should always choose identifiers which are human readable. As in: transport the meaning which they convey even to somebody who is not as familiar with the problem to be solved by the code as you are.
However, using long identifiers does not make your identifiers more 'readable'. To any reasonably experienced programmer, 'tmp' conveys as much information as 'temporaryVariable' does. Same goes for 'i' vs. 'dummyCounter' etc..
In your particular example, the interface names are actually quite annoying since somebody who's used to developing object oriented systems will read the inheritance as 'is a'. And 'SomeClass is a IHaveMessageParameters' sounds silly.
Try using IMessagePrinter and IMessageParameterProvider instead.
Yes, that sounds like a good idea.
What's the alternative?
Code should be human-readable. Any fool can write code a computer can understand. The difficult part is writing code a human can understand.
Humans have to maintain the code, so it's pretty darn important that it is as easy to maintain as possible - that includes that the code should be as readable as possible.
Interfaces describe behavior, so I name them so as to communicate the behavior they are mandating. This generally means that the name is a verb (or adjective) or some form of action-describing phrase. Combined with the "I" for interface, this looks like what you are doing...
ICanMove, IControllable, ICanPrint, ISendMessages, etc...
Using adjectives, as in IControllable, IDisposable, IEnumerable, etc., communicates the same thought as a verb form and is terser, so I use this form as well...
Finally, more important (or at least equally important) than what you name the interface is keeping the interfaces you design as small and logically contained as possible. You should strive to have each interface represent as small and logically connected a set of methods/properties as possible. When an interface has so much in it that there is no obvious name that would describe all the behavior it mandates, it's a sign that there is too much in it, and that it needs to be refactored into two or more smaller interfaces. So, naming interfaces in the way you are proposing helps to enforce this type of organizational design, which is a good thing.
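To illustrate the "small interfaces" point with the question's own names, here is a rough sketch. IShowMessages and IHaveMessageParameters come from the question (trimmed to one Show overload); the Shown list and the Notifier consumer are assumptions added so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

public interface IShowMessages
{
    void Show(string message);
}

public interface IHaveMessageParameters
{
    IList<string> Parameters { get; }
}

// Implements only the message-showing role. It records messages instead of
// tracing them so the sketch has no external dependencies.
public class TraceMessenger : IShowMessages
{
    public List<string> Shown { get; } = new List<string>();

    public void Show(string message) => Shown.Add(message);
}

// Hypothetical consumer that depends on the one narrow interface it needs.
public static class Notifier
{
    public static void Warn(IShowMessages target, string text)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        target.Show("WARNING: " + text);
    }
}
```

Because each interface names a single behavior, Notifier can accept anything that shows messages without also demanding parameters it never uses.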
There's nothing strange about using simple human-readable names. But using the I for interface to also stand for the first-person I as though it's talking about itself... is a little unusual, yes.
But the bottom line is, whatever works for you and is understood by you and your team is fine. You gotta go with what works.
In my opinion this approach just adds a greater burden on the developers to come up with such names, since it integrates the I as part of a sentence. I don't find IDisposable, for example, to be more difficult to read than ICanBeDisposed.
In the OP's examples, the anthropomorphic way compares well against alternatives - eg: IShowMessages vs. something like IMessageShower. But - this is not always the case. Interfaces I have used when programming game objects include: IOpenClosable and ILockable. Alternatives like ICanBeOpenedAndClosed and ICanBeLocked would be more verbose. Or you could simply do IAmOpenClosable and IAmLockable - but then you'd be adding the "Am" just for the anthropomorphic effect with no real information benefit. I am all for minimizing verbosity if the same amount of information is conveyed.
So long as the semantics of what is trying to be achieved aren't lost and terseness isn't irreparably compromised (IDoLotsOfThingsWhichIncludesTheFollowingColonSpace...), I wouldn't generally mind somebody other than myself doing it. Still, there are plenty of contexts in which terseness is paramount, and in those this would be unacceptable.
Intentionally using the 'I for Interface' convention in the first person seems a bit silly to be honest. What starts out as a cute pun becomes impossible to follow consistently, and ends up clouding meaning later on. That said, your standalone example reads clearly enough and I wouldn't have a problem with it.
My company is on a Unit Testing kick, and I'm having a little trouble with refactoring Service Layer code. Here is an example of some code I wrote:
public class InvoiceCalculator : IInvoiceCalculator
{
    public void CalculateInvoice(Invoice invoice)
    {
        foreach (InvoiceLine il in invoice.Lines)
        {
            UpdateLine(il);
        }
        //do a ton of other stuff here
    }
    private void UpdateLine(InvoiceLine line)
    {
        line.Amount = line.Qty * line.Rate;
        //do a bunch of other stuff, including calls to other private methods
    }
}
In this simplified case (it is reduced from a 1,000 line class that has 1 public method and ~30 private ones), my boss says I should be able to test my CalculateInvoice and UpdateLine separately (UpdateLine actually calls 3 other private methods, and performs database calls as well). But how would I do this? His suggested refactoring seemed a little convoluted to me:
//Tiny part of original code
public class InvoiceCalculator : IInvoiceCalculator
{
    private readonly ILineUpdater _lineUpdater;
    public InvoiceCalculator(ILineUpdater lineUpdater)
    {
        _lineUpdater = lineUpdater;
    }
    public void CalculateInvoice(Invoice invoice)
    {
        foreach (InvoiceLine il in invoice.Lines)
        {
            _lineUpdater.UpdateLine(il);
        }
        //do a ton of other stuff here
    }
}
public class LineUpdater : ILineUpdater
{
    public void UpdateLine(InvoiceLine line)
    {
        line.Amount = line.Qty * line.Rate;
        //do a bunch of other stuff
    }
}
I can see how the dependency is now broken, and I can test both pieces, but this would also create 20-30 extra classes from my original class. We only calculate invoices in one place, so these pieces wouldn't really be reusable. Is this the right way to go about making this change, or would you suggest I do something different?
Thank you!
Jess
This is an example of Feature Envy:
line.Amount = line.Qty * line.Rate;
It should probably look more like:
var amount = line.CalculateAmount();
There isn't anything wrong with lots of little classes; it's not about re-usability as much as it's about adaptability. When you have many single-responsibility classes, it's easier to see the behavior of your system and change it when your requirements change. Big classes have intertwined responsibilities, which makes them very difficult to change.
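A minimal sketch of what moving that calculation onto the line could look like. The decimal types for Qty and Rate are an assumption, since the question doesn't show the real InvoiceLine.

```csharp
// Sketch only: property types are assumed, not taken from the question.
public class InvoiceLine
{
    public decimal Qty { get; set; }
    public decimal Rate { get; set; }

    // The line owns its own arithmetic, so callers no longer
    // reach into Qty and Rate to do the multiplication themselves.
    public decimal CalculateAmount()
    {
        return Qty * Rate;
    }
}
```

A rule this small can then be tested on InvoiceLine directly, without going through the calculator at all.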
IMO this all depends on how 'significant' that UpdateLine() method really is. If it's just an implementation detail (e.g. it could easily be inlined inside the CalculateInvoice() method and the only thing that would suffer is readability), then you probably don't need to unit test it separately from the master class.
On the other hand, if the UpdateLine() method has some value to the business logic, and you can imagine a situation where you would need to change this method independently from the rest of the class (and therefore test it separately), then you should go ahead and refactor it into a separate LineUpdater class.
You probably won't end up with 20-30 classes this way, because most of those private methods are really just implementation details and do not deserve to be tested separately.
Well, your boss's way is the more correct one in terms of unit testing:
He is now able to test CalculateInvoice() without testing the UpdateLine() function. He can pass a mock object instead of a real LineUpdater object and test only CalculateInvoice(), not a whole bunch of code.
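To make that concrete, here is a rough sketch with a hand-written fake instead of a mocking library. RecordingLineUpdater is a made-up name, and Invoice and InvoiceLine are cut down to guessed shapes; the IInvoiceCalculator interface is left out to keep it short.

```csharp
using System.Collections.Generic;

// Simplified stand-ins for the real types (shapes are assumptions).
public class InvoiceLine
{
    public decimal Qty { get; set; }
    public decimal Rate { get; set; }
    public decimal Amount { get; set; }
}

public class Invoice
{
    public List<InvoiceLine> Lines { get; } = new List<InvoiceLine>();
}

public interface ILineUpdater
{
    void UpdateLine(InvoiceLine line);
}

public class InvoiceCalculator
{
    private readonly ILineUpdater _lineUpdater;

    public InvoiceCalculator(ILineUpdater lineUpdater)
    {
        _lineUpdater = lineUpdater;
    }

    public void CalculateInvoice(Invoice invoice)
    {
        foreach (InvoiceLine il in invoice.Lines)
        {
            _lineUpdater.UpdateLine(il);
        }
    }
}

// Hypothetical fake: it only records which lines it was asked to update,
// so no real update logic (or database) runs during the test.
public class RecordingLineUpdater : ILineUpdater
{
    public List<InvoiceLine> UpdatedLines { get; } = new List<InvoiceLine>();

    public void UpdateLine(InvoiceLine line)
    {
        UpdatedLines.Add(line);
    }
}
```

A test then builds an Invoice with two lines, passes a RecordingLineUpdater to the calculator, calls CalculateInvoice(), and checks that UpdatedLines holds both lines. Nothing inside the real LineUpdater gets executed.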
Is it right? It depends. Your boss wants to write real unit tests. Testing your first example would not be unit testing; it would be integration testing.
What are the advantages of unit tests over integration tests?
1) Unit tests allow you to test only one method or property, without it being affected by other methods, the database and so on.
2) Unit tests execute faster (for example, you said UpdateLine uses the database) because they don't exercise all the nested methods. Nested methods can be database calls, so if you have thousands of tests they can run slowly (several minutes).
3) If your methods make database calls, then sometimes you need to set up the database (fill it with the data the test needs), and that may not be easy. You may have to write a couple of pages of code just to prepare the database for a test. With unit tests, you separate database calls from the methods being tested (using mock objects).
But! I am not saying that unit tests are better. They are just different. As I said, unit tests allow you to test a unit in isolation and quickly. Integration tests are easier and allow you to test the results of different methods and layers working together. Honestly, I prefer integration tests more :)
Also, I have a couple of suggestions for you:
1) I don't think having an Amount field is a good idea. It seems redundant because its value can be calculated from two other public fields. If you want to have it anyway, I would make it a read-only property which returns Qty * Rate.
2) Usually, a class which consists of 1000 lines is a sign that it's badly designed and should be refactored.
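For suggestion 1, a small sketch of what I mean by a read-only property. The decimal types are my assumption, since the real InvoiceLine isn't shown.

```csharp
// Sketch only: property types are assumed.
public class InvoiceLine
{
    public decimal Qty { get; set; }
    public decimal Rate { get; set; }

    // No setter: the value is derived on every read,
    // so it can never drift out of sync with Qty or Rate.
    public decimal Amount
    {
        get { return Qty * Rate; }
    }
}
```

With this shape there is no line.Amount = ... assignment left in UpdateLine at all, which is one less thing to test there.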
Now, I hope you better understand the situation and can decide. Also, if you understand the situation you can talk to your boss and you can decide together.
Yeah, nice one. I'm not sure whether the InvoiceLine object also has some logic included; if so, you would probably need an IInvoiceLine as well.
I sometimes have the same questions. On one hand you want to do things right and unit test your code. But when database calls and maybe file writing are involved, it takes a lot of extra work to set up the first test: all the test objects that step in when file writing and database IO are about to happen, the interfaces, the asserts. And you also want to test that the data layer doesn't contain any errors. So a test which is more 'process' than 'unit' is often easier to build.
If you have a project that will be changed a lot (in the future) and that has lots of dependencies on this code (maybe other programs read the file or database data), it can be nice to have a solid unit test for all parts of your code, and the time invested is worthwhile.
But if the project is, as my latest client says, 'let's get it live and maybe we'll tweak a bit next year and next year there will be something new', then I wouldn't be so hard on myself about getting all the unit tests up and running.
Michel
Your boss' example looks reasonable to me.
Some of the key considerations I try to keep in mind when designing for any scenario are:
1) Single Responsibility Principle: a class should only change for one reason.
2) Does each new class justify its existence? Have classes been created just for the sake of it, or do they encapsulate a meaningful portion of logic?
3) Are you able to test each piece of code in isolation?
In your scenario, just looking at the names, it would appear that you were wandering away from Single Responsibility - You have an IInvoiceCalculator, yet this class is then also responsible for updating InvoiceLines. Not only have you made it very difficult to test the update behaviour, but you now need to change your InvoiceCalculator class both when calculation business rules change and when the rules around updating change.
Then there is the question about the updating logic - does the logic justify a separate class? That really depends and it is hard to say without seeing the code, but certainly the fact that your boss wants that logic under test would suggest that it is more than a simple one-liner call off to a datalayer.
You say that this refactoring creates a great number of extra classes (I'm taking that you mean across all your business entities, since I only see a couple of new classes and their interfaces in your example), but you have to consider what you get from this. It looks like you gain full testability of your code, the ability to introduce new calculation and new update logic in isolation, and a clearer encapsulation of what are separate pieces of business logic.
The gains above are of course subject to a cost-benefit analysis, but since your boss is asking for them, it sounds like he is happy that they will pay off against the extra work to implement the code this way.
The final, third point about testing in isolation is also a key benefit of the way your boss has designed this - the closer your public methods are to the code that does the actual work, the easier it is to inject stubs or mocks for parts of your system that are not under test. For example, if you are testing an update method that calls off to a datalayer, you do not want to test the datalayer, so you would usually inject a mock. If you need to pass that mocked datalayer through all your calculator logic first, your test setup is going to be far more complicated, since the mock now needs to meet many other potential requirements not related to the actual test in question.
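Here is a rough sketch of that datalayer example, with a hand-written stub rather than a mocking framework. IRateRepository, FixedRateRepository and the ProductCode property are all hypothetical; the question doesn't show what the real database calls look like.

```csharp
// Simplified line type; ProductCode is an invented property.
public class InvoiceLine
{
    public string ProductCode { get; set; }
    public decimal Qty { get; set; }
    public decimal Rate { get; set; }
    public decimal Amount { get; set; }
}

// Hypothetical datalayer abstraction standing in for the real database calls.
public interface IRateRepository
{
    decimal GetRate(string productCode);
}

public class LineUpdater
{
    private readonly IRateRepository _rates;

    public LineUpdater(IRateRepository rates)
    {
        _rates = rates;
    }

    public void UpdateLine(InvoiceLine line)
    {
        // The only datalayer call this class makes goes through the interface.
        line.Rate = _rates.GetRate(line.ProductCode);
        line.Amount = line.Qty * line.Rate;
    }
}

// Stub used in tests: always answers with the rate it was constructed with.
public class FixedRateRepository : IRateRepository
{
    private readonly decimal _rate;

    public FixedRateRepository(decimal rate)
    {
        _rate = rate;
    }

    public decimal GetRate(string productCode)
    {
        return _rate;
    }
}
```

Because LineUpdater is tested directly, the stub only has to answer the one question UpdateLine asks. If the same stub had to be threaded through the whole calculator, it would also need to satisfy every other datalayer call made along the way.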
While this approach is extra work initially, I'd say that the majority of the work is the time to think through the design. After that, and after you get up to speed on the more injection-based style of code, the raw implementation time of software that is structured in that way is actually comparable.
Your boss's approach is a great example of dependency injection, and of how it allows you to use a mock ILineUpdater to conduct your tests efficiently.