I'm taking my first steps with unit testing and am unsure about two paradigms that seem to contradict each other:
Every single unit test should be self-contained and not depend on others.
Don't repeat yourself.
To be more concrete, I've got an importer which I want to test. The importer has an "Import" function that takes raw data (e.g. out of a CSV) and returns an object of a certain kind, which is also stored in a database through an ORM (LinqToSQL in this case).
Now I want to test several things: e.g. that the returned object is not null, that its mandatory fields are not null or empty, and that its attributes have the correct values. I wrote 3 unit tests for this. Should each test run the import and get the job, or does this belong in a general setup logic? On the other hand, believing this blog post, the latter would be a bad idea as far as my understanding goes. Also, wouldn't this violate self-containment?
My class looks like this:
[TestFixture]
public class ImportJob
{
    private TransactionScope scope;
    private CsvImporter csvImporter;
    private readonly string[] row = { "" };

    public ImportJob()
    {
        CsvReader reader = new CsvReader(new StreamReader(
            @"C:\SomePath\unit_test.csv", Encoding.Default),
            false, ';');
        reader.MissingFieldAction = MissingFieldAction.ReplaceByEmpty;
        int fieldCount = reader.FieldCount;
        row = new string[fieldCount];
        reader.ReadNextRecord();
        reader.CopyCurrentRecordTo(row);
    }

    [SetUp]
    public void SetUp()
    {
        scope = new TransactionScope();
        csvImporter = new CsvImporter();
    }

    [TearDown]
    public void TearDown()
    {
        scope.Dispose();
    }

    [Test]
    public void ImportJob_IsNotNull()
    {
        Job j = csvImporter.ImportJob(row);
        Assert.IsNotNull(j);
    }

    [Test]
    public void ImportJob_MandatoryFields_AreNotNull()
    {
        Job j = csvImporter.ImportJob(row);
        Assert.IsNotNull(j.Customer);
        Assert.IsNotNull(j.DateCreated);
        Assert.IsNotNull(j.OrderNo);
    }

    [Test]
    public void ImportJob_MandatoryFields_AreValid()
    {
        Job j = csvImporter.ImportJob(row);
        Customer c = csvImporter.GetCustomer("01-01234567");
        Assert.AreEqual(j.Customer, c);
        Assert.That(j.DateCreated.Date == DateTime.Now.Date);
        Assert.That(j.OrderNo == row[(int)Csv.RechNmrPruef]);
    }

    // etc. ...
}
As can be seen, I'm repeating the line

Job j = csvImporter.ImportJob(row);

in every unit test, since tests should be self-contained. But this violates the DRY principle and may one day cause performance issues.
What's the best practice in this case?
Your test classes are no different from usual classes, and should be treated as such: all good practices (DRY, code reuse, etc.) should apply there as well.
That depends on how much of the scenario is common to your tests. In the blog post you referred to, the main complaint was that the SetUp method did different setup for the three tests, and that can't be considered best practice. In your case you've got the same setup for each test/scenario, so you should use a shared SetUp instead of duplicating the code in each test. If you later find tests that do not share this setup, or that require a different setup shared between a set of tests, refactor those tests into a new test case class. You could also have shared setup methods that aren't marked with [SetUp] but get called at the beginning of each test that needs them:
[Test]
public void SomeTest()
{
    setupSomeSharedState();
    ...
}
A way of finding the right mix could be to start without a SetUp method at all; once you find yourself duplicating setup code across tests, refactor it into a shared method.
You could put the
Job j = csvImporter.ImportJob(row);
in your setup. That way you're not repeating code.
You actually should run that line of code for each and every test; otherwise, tests will start failing because of things that happened in other tests, and that becomes hard to maintain.
The performance problem isn't caused by DRY violations; you actually should set up everything for each and every test. Also, these aren't unit tests, they're integration tests: you rely on an external file to run them. You could make ImportJob read from a stream instead of opening a file directly; then you could test with a MemoryStream.
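For illustration, a minimal sketch of that idea, assuming a hypothetical ImportJob overload that accepts a TextReader rather than a file path:

[Test]
public void ImportJob_FromStream_IsNotNull()
{
    // Inline CSV data replaces the on-disk unit_test.csv; ';' separator as before.
    string csv = "01-01234567;some;more;fields";
    using (var reader = new StreamReader(
        new MemoryStream(Encoding.Default.GetBytes(csv))))
    {
        Job j = csvImporter.ImportJob(reader); // assumed TextReader overload
        Assert.IsNotNull(j);
    }
}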
Whether you move
Job j = csvImporter.ImportJob(row);
into the SetUp function or not, it will still be executed before every test. If you have the exact same line at the top of each test, then it is only logical to move that line into SetUp.
The blog entry you posted complained about test values being set up in a function disconnected from (and possibly not even on the same screen as) the test itself. Your case is different: the test data is driven by an external text file, so that complaint doesn't match your specific use case either.
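In this fixture's terms, that refactoring might look like the following sketch: the imported Job becomes a field, and the import still runs freshly before every test.

private Job job;

[SetUp]
public void SetUp()
{
    scope = new TransactionScope();
    csvImporter = new CsvImporter();
    job = csvImporter.ImportJob(row); // still executed once per test
}

[Test]
public void ImportJob_IsNotNull()
{
    Assert.IsNotNull(job);
}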
In one of my projects, the team agreed not to implement any initialization logic in unit test constructors. We have the SetUp, TestFixtureSetUp, and SetUpFixture (since version 2.4 of NUnit) attributes, which are enough for almost all cases where initialization is needed. We force developers to use one of these attributes and to state explicitly whether the initialization code runs before each test, before all tests in a fixture, or before all tests in a namespace.
However, I disagree that unit tests should always conform to all the good practices expected in regular development. That is desirable, but it is not a rule. My point is that in real life the customer doesn't pay for unit tests; the customer pays for the overall quality and functionality of the product. The customer is not interested in whether you deliver a bug-free product by covering 100% of the code with unit tests and automated GUI tests, or by employing three manual testers per developer who click on every piece of the screen after each build.
Unit tests don't add business value to the product; they let you save on development and testing effort and force developers to write better code. So it is always up to you: will you spend additional time refactoring your unit tests to make them perfect, or will you spend the same amount of time adding new features for the customers of your product? Don't forget, either, that unit tests should be as simple as possible. So how do you find the golden mean?
I suppose this depends on the project: the PM or team lead needs to plan and estimate the quality, completeness, and code coverage of the unit tests just as they estimate all the other business features of the product. In my opinion, it is better to have copy-paste unit tests that cover 80% of the production code than very well designed, perfectly separated unit tests that cover only 20%.
Related
NUnit has a OneTimeSetUp attribute. I am trying to think of some scenarios where I could use it. However, I cannot find any GitHub examples online (even my link does not have an example).
Say I had some code like this:
[TestFixture]
public class MyFixture
{
    IProduct Product;

    [OneTimeSetUp]
    public void OneTimeSetUp()
    {
        IFixture fixture = new Fixture().Customize(new AutoMoqCustomization());
        Product = fixture.Create<Product>();
    }

    [Test]
    public void Test1()
    {
        // Use Product here
    }

    [Test]
    public void Test2()
    {
        // Use Product here
    }

    [Test]
    public void Test3()
    {
        // Use Product here
    }
}
Is this a valid scenario for initialising a variable inside OneTimeSetUp, i.e. because in this case Product is used by all of the Test methods?
The fixture setup for Moq seems reasonable to me.
However, reusing the same Product is suspect: it can be modified and accidentally reused, which could lead to unstable tests. I would get a new Product for each test run, at least if it's the system under test.
Yes. No. :-)
Yes... [OneTimeSetUp] works as you expect: it executes once, before all your tests, and all your tests will use the same Product.
No... You should probably not use it this way, assuming that the initialization of Product is not extremely costly; it likely is not, since it's based on a mock.
OneTimeSetUp is best used only for extremely costly steps like creating a database. In fact, it tends to have very little use in purely unit testing.
You might think, "Well if it's even a little more efficient, why not use it all the time?" The answer is that you are writing tests, which implies you don't actually know what the code under test will do. One of your tests may change the common product, possibly even put it in an invalid state. Your other tests will start to fail, making it tricky to figure out the source of the problem. Almost invariably, you want to start out using SetUp, which will create a separate product for each one of your tests.
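In the fixture above, that would be a sketch like this, so every test receives a freshly created Product:

[SetUp]
public void SetUp()
{
    IFixture fixture = new Fixture().Customize(new AutoMoqCustomization());
    Product = fixture.Create<Product>(); // fresh instance before each test
}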
Yes, you use OneTimeSetUp (and OneTimeTearDown) to do all the setup that is shared among the tests in a fixture. It runs once per fixture, so no matter how many tests the fixture contains, the code executes only once. Typically you initialize instances held in private fields for the other tests to use, so that you don't end up with lots of duplicated setup code.
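A minimal skeleton of that pattern, with TestDatabase standing in as a hypothetical expensive resource:

[TestFixture]
public class DatabaseTests
{
    private TestDatabase db; // hypothetical, costly to create

    [OneTimeSetUp]
    public void CreateDatabase()
    {
        db = TestDatabase.Create(); // runs once, before any test in this fixture
    }

    [OneTimeTearDown]
    public void DropDatabase()
    {
        db.Dispose(); // runs once, after all tests in this fixture
    }
}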
I have written a set of NUnit classes which use the SetUp and TearDown attributes. Then I read this: http://jamesnewkirk.typepad.com/posts/2007/09/why-you-should-.html. I can understand the author's complaint about having to scroll up and down when reading the unit tests. However, I also see the benefit of SetUp and TearDown. For example, in a recent test class I did this:
private Product _product1;
private Product _product2;
private IList<Product> _products;

[SetUp]
public void Setup()
{
    _product1 = new Product();
    _product2 = new Product();
    _products = new List<Product>();
    _products.Add(_product1);
    _products.Add(_product2);
}
Here _product1, _product2 and _products are used by every test, so it seems to violate DRY to put them in every method. Should they nevertheless be put in every test method?
This question is very subjective, but I don't believe it is a code smell. The example in the linked blog post was very arbitrary and did not use the variables from SetUp in every test. If you look at your code and you are not using the SetUp variables in every test, then yes, it is probably a code smell.
If your tests are grouped well, however, then each test fixture tests a group of functionality that often needs the same setup. In that case the DRY principle wins out in my book. James argued in his post that he needed to look in three methods to see the state of the data, but I would counter that too much setup code in a test method obscures the purpose of the test.
Copying setup code as in your example also makes your tests harder to maintain. If your Product class changes in the future and requires additional construction, you will need to change it in every test rather than in one place. That much setup code in each test method would also make your test class very long and hard to scan.
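One compromise, sketched here, is a named factory method: construction lives in one place, yet each test that calls it stays explicit about its setup.

// If Product construction changes later, only this helper needs updating.
private static IList<Product> CreateTwoProducts()
{
    return new List<Product> { new Product(), new Product() };
}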
I have two projects: one is the "Main" project and the other is the "Test" project.
The policy is that all methods in the Main project must have at least one accompanying test in the test project.
What I want is a new unit test in the Test project that verifies this remains the case. If there are any methods that do not have corresponding tests (and this includes method overloads) then I want this test to fail.
I can sort out appropriate messaging when the test fails.
My guess is that I can get every method (using reflection?), but I'm not sure how to then verify that each of those methods is referenced somewhere in this Test project (while ignoring references in other projects).
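For the enumeration half, reflection does make listing the Main project's public methods straightforward; a sketch (SomeMainType is any hypothetical type from the Main project):

using System.Linq;
using System.Reflection;

var mainMethods = typeof(SomeMainType).Assembly
    .GetTypes()
    .Where(t => t.IsPublic)
    .SelectMany(t => t.GetMethods(
        BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly));
// Matching these against actual call sites in the Test project is the hard part,
// and as the answers below argue, probably not worth doing at all.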
You can use any existing tool to measure code coverage, but... don't do it! Seriously. The aim should not be 100% coverage but software that can evolve easily. From your test project you could invoke every single existing method by reflection and swallow all the exceptions; that would push your coverage to around 100%, but what good would it be?
Read about TDD. Start creating testable software that has meaningful tests that will save you when something goes wrong. It's not about coverage, it's about being safe.
This is an example of a meta-test that sounds good in principle, but once you get into the detail you should quickly realise it is a bad idea. As has already been suggested, the correct approach is to encourage whoever owns the policy to amend it. As you've quoted it, the policy is sufficiently specific that people can satisfy the requirement without really achieving anything of value.
Consider:
public void TestMethod1Exists()
{
    try
    {
        var classUnderTest = new ClassToTest();
        classUnderTest.Method1();
    }
    catch (Exception)
    {
        // Swallow everything: this test can never fail.
    }
}
The test contains a call to Method1 on ClassToTest, so the requirement of having a test for that method is satisfied, but nothing useful is being tested. As long as the method exists (which it must if the code compiled), the test will pass.
The intent of the policy is presumably to try to ensure that written code is being tested. Looking at some very basic code:
public string IsSet(bool flag)
{
    if (flag)
    {
        return "YES";
    }
    return "NO";
}
As methods go, this is pretty simple (it could easily be reduced to one line), but even so it contains two routes through the method. Having a test that merely ensures this method is called gives you a false sense of security: you would know it is being called, but you would not know whether all of its code paths are being tested.
An alternative that has been suggested is to just use code coverage tools. These can be useful and give a much better idea of how well exercised your code is, but again they only indicate the extent of the coverage, not its quality. So, let's say I had some tests for the IsSet method above:
public void TestWhenTrue()
{
    var classUnderTest = new ClassToTest();
    // Checks only that a string comes back, not which string.
    Assert.IsInstanceOfType(classUnderTest.IsSet(true), typeof(string));
}

public void TestWhenFalse()
{
    var classUnderTest = new ClassToTest();
    Assert.IsInstanceOfType(classUnderTest.IsSet(false), typeof(string));
}
I'm passing sufficient parameters to exercise both code paths, so coverage for the IsSet method should be 100%. But all I am testing is that the method returns a string; I'm not testing the value of that string, so the tests themselves add little (if any) value.
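For contrast, versions of those tests that pin down the actual values would close that gap:

[TestMethod]
public void TestWhenTrue_ReturnsYes()
{
    var classUnderTest = new ClassToTest();
    Assert.AreEqual("YES", classUnderTest.IsSet(true)); // asserts the value, not just the type
}

[TestMethod]
public void TestWhenFalse_ReturnsNo()
{
    var classUnderTest = new ClassToTest();
    Assert.AreEqual("NO", classUnderTest.IsSet(false));
}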
Code coverage is a useful metric, but only as part of a larger picture of code quality. Peer reviews and sharing best practice around how to test effectively within your team are less concretely measurable, but they will have a more significant impact on the quality of your test code.
I have written a class using TDD. It contains a method (the method under test) which takes a simple value object as a parameter (a range).
Code:
The method under test looks like this:
public List<string> In(IRange range)
{
    var result = new List<string>();
    for (int i = range.From; i <= range.To; i++)
    {
        //...
    }
    return result;
}
Furthermore I have a unit test to verify my method under test:
[TestMethod]
public void In_SimpleNumbers_ReturnsNumbersAsList()
{
    var range = CreateRange(1, 2);
    var expected = new List<string>() { "1", "2" };

    var result = fizzbuzz.In(range);

    CollectionAssert.AreEqual(expected, result);
}

private IRange CreateRange(int from, int to)
{
    return new Fakes.StubIRange()
    {
        FromGet = () => { return from; },
        ToGet = () => { return to; }
    };
}
Question:
I have read Roy Osherove's book on unit testing ("The Art of Unit Testing"). In it he says:
"external dependencies (filesystem, time, memory etc.) should be
replaced by stubs"
What does he mean by an external dependency? Is my value object (the range) also an external dependency which should be faked? Should I fake all the dependencies a class has? Can someone give me some advice?
TL;DR
Do the simplest thing possible that solves your problem.
The longer I have been using TDD, the more I appreciate the value of being pragmatic. Writing super-duper isolated unit tests is not a value in itself. The tests are there to help you write high-quality code that is easy to understand and solves the right problem.
Adding an interface for the range class is a good idea if you need to be able to switch to another concrete range implementation without modifying the code that depends on it.
However, if you do not have that need, adding an interface serves no real purpose; it just adds complexity, which actually takes you further from the goal of writing easy-to-understand code that solves the problem.
Be careful not to think too much about what might change in the future. YAGNI is a good principle to follow. If you have been doing TDD, you won't have any problem refactoring the code when an actual need arises, since you have solid tests to rely on.
In general terms I would not consider a proper value object to be a dependency. If it is complex enough that you feel uncomfortable letting other code use it under test, it sounds like it is actually more of a service.
A test should run in isolation (completely in memory) without touching any external systems, such as the file system, database, web services, mail services, or the system clock: anything that is slow, hard to set up, or nondeterministic (such as the ever-changing system time). To be able to do this, you should abstract away those external dependencies, which allows you to mock them in your tests.
A unit test, however, goes one step further. In a unit test you often want to test only the logic of a single method or a single class. You're not interested in verifying how multiple components integrate; you just want to verify that the logic of that single class is correct and that it communicates correctly with other components.
To be able to do this, you need to fake those other components (the class's dependencies). So in general you should indeed fake all dependencies (that contain behavior) a class has.
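As a concrete sketch of both points (IClock and ReportGenerator are hypothetical names), the nondeterministic system clock is hidden behind an abstraction and faked with Moq in the test:

// Hypothetical abstraction over the system clock.
public interface IClock
{
    DateTime Now { get; }
}

[Test]
public void Report_UsesFakedClock()
{
    var clock = new Mock<IClock>();
    clock.Setup(c => c.Now).Returns(new DateTime(2020, 1, 1)); // deterministic time

    var sut = new ReportGenerator(clock.Object); // hypothetical class under test
    // ...assert against behavior that depends on the clock
}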
I have a FileExtractor class with a Start method that performs several steps.
I've created a test class called "WhenExtractingInvalidFile.cs" within a folder called "FileExtractorTests" and added some test methods to it, as below, each of which should verify one step of the Start() method:
[TestMethod]
public void Should_remove_original_file()
{
}

[TestMethod]
public void Should_add_original_file_to_errorStorage()
{
}

[TestMethod]
public void Should_log_error_locally()
{
}
This way, it'd nicely organize the behaviors and the expectations that should be met.
The problem is that most of the logic of these test methods is the same, so should I create one test method that verifies all the steps, or keep them separate as above?
[TestMethod]
public void Should_remove_original_file_then_add_original_file_to_errorStorage_then_log_error_locally()
{
}
What's the best practice?
While it's commonly accepted that the Act section of a test should contain only one call, there's still a lot of debate over the "one assert per test" practice.
I tend to adhere to it because:
When a test fails, I immediately know (from the test name) which of the multiple things we want to verify about the method under test went wrong.
Tests that involve mocking are already harder to read than regular tests, and they can easily get arcane when you're asserting against multiple mocks in the same test.
If you don't follow it, I'd at least recommend including meaningful Assert messages in order to minimize head scratching when a test fails.
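For example (errorStorage and originalFile are illustrative names):

// A message that names the expectation saves head scratching on failure.
Assert.IsTrue(errorStorage.Contains(originalFile),
    "Start() should have added the original file to error storage");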
I would do the same thing you did, but with an _on_start postfix, like this: Should_remove_original_file_on_start. The combined method will give you at most one assert failure, even though several aspects of Start could be broken.
DRY - don't repeat yourself.
Think about the scenario where your test fails. What would be most useful from a test-organization point of view?
The second dimension to look at is maintenance. Do you want to run and maintain one test or n tests? Don't overburden development by writing many tests that add little value (I think TDD is a bit overrated for this reason). A test is more valuable if it exercises a larger path through the code rather than a short one.
In this particular scenario I would create a single test. If it fails often and you are not reaching the root cause of the problem fast enough, refactor it into multiple tests.