How do you add sample (dummy) data to your unit tests? - c#

In bigger projects my unit tests usually require some "dummy" (sample) data to run with. Some default customers, users, etc. I was wondering what your setup looks like.
How do you organize/maintain this data?
How do you apply it to your unit tests (any automation tool)?
Do you actually require test data or do you think it's useless?
My current solution:
I differentiate between Master data and Sample data, where the former will be available when the system goes into production (installed for the first time) and the latter consists of typical use cases I require for my tests to run (and to play with during development).
I store all this in an Excel file (because it's so damn easy to maintain) where each worksheet contains a specific entity (e.g. users, customers, etc.) and is flagged either master or sample.
I have 2 test cases which I (mis)use to import the necessary data (a rough sketch follows the list):
InitForDevelopment (Create Schema, Import Master data, Import Sample data)
InitForProduction (Create Schema, Import Master data)
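A rough sketch of those two cases with MSTest. SchemaBuilder, ExcelDataImporter, and DataKind are hypothetical helpers standing in for whatever schema-creation and Excel-import code you already have:

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class DataInitialization
{
    [TestMethod]
    public void InitForDevelopment()
    {
        // Hypothetical helpers: create the schema, then import both data sets.
        SchemaBuilder.CreateSchema();
        ExcelDataImporter.Import("TestData.xlsx", DataKind.Master);
        ExcelDataImporter.Import("TestData.xlsx", DataKind.Sample);
    }

    [TestMethod]
    public void InitForProduction()
    {
        // Production-like setup: schema plus master data only.
        SchemaBuilder.CreateSchema();
        ExcelDataImporter.Import("TestData.xlsx", DataKind.Master);
    }
}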

I use the repository pattern and have a dummy repository that's instantiated by the unit tests in question; it provides a known set of data that includes examples both within and outside the valid range for various fields.
This means I can exercise my code unchanged, supplying the fake repository from the unit test when testing, or the production repository at runtime (via dependency injection with Castle).
I don't know of a good web reference for this but I learnt much from Steven Sanderson's Professional ASP.NET MVC 1.0 book published by Apress. The MVC approach naturally provides the separation of concern that's necessary to allow your testing to operate with fewer dependencies.
The basic elements are that your repository implements an interface for data access, and that same interface is then implemented by a fake repository that you construct in your test project.
In my current project I have an interface thus:
using System.Linq;

namespace myProject.Abstract
{
    public interface ISeriesRepository
    {
        IQueryable<Series> Series { get; }
    }
}
This is implemented both as my live data repository (using LINQ to SQL) and as a fake repository, thus:
using System;
using System.Collections.Generic;
using System.Linq;

namespace myProject.Tests.Repository
{
    class FakeRepository : ISeriesRepository
    {
        private static IQueryable<Series> fakeSeries = new List<Series> {
            new Series { id = 1, name = "Series1", openingDate = new DateTime(2001,1,1) },
            new Series { id = 2, name = "Series2", openingDate = new DateTime(2002,1,30) },
            ...
            new Series { id = 10, name = "Series10", openingDate = new DateTime(2001,5,5) }
        }.AsQueryable();

        public IQueryable<Series> Series
        {
            get { return fakeSeries; }
        }
    }
}
Then the class that's consuming the data is instantiated passing the repository reference to the constructor:
using System;
using System.Linq;

namespace myProject
{
    public class SeriesProcessor
    {
        private readonly ISeriesRepository seriesRepository;

        public SeriesProcessor(ISeriesRepository seriesRepository)
        {
            this.seriesRepository = seriesRepository;
        }

        public IQueryable<Series> GetCurrentSeries()
        {
            return from s in seriesRepository.Series
                   where s.openingDate.Date <= DateTime.Now.Date
                   select s;
        }
    }
}
Then in my tests I can approach it thus:
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace myProject.Tests
{
    [TestClass]
    public class SeriesTests
    {
        [TestMethod]
        public void Meaningful_Test_Name()
        {
            // Arrange
            SeriesProcessor processor = new SeriesProcessor(new FakeRepository());

            // Act
            IQueryable<Series> currentSeries = processor.GetCurrentSeries();

            // Assert
            Assert.AreEqual(10, currentSeries.Count());
        }
    }
}
Then look at Castle Windsor for the inversion of control approach in your live project, to allow your production code to automatically instantiate your live repository through dependency injection. That should get you closer to where you need to be.
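For illustration, a minimal Windsor registration could look like the sketch below; SqlSeriesRepository is an assumed name for the live LINQ to SQL implementation:

using Castle.MicroKernel.Registration;
using Castle.Windsor;

public static class ContainerBootstrapper
{
    public static IWindsorContainer Configure()
    {
        var container = new WindsorContainer();

        // SqlSeriesRepository is assumed to be the live LINQ to SQL implementation.
        container.Register(
            Component.For<ISeriesRepository>().ImplementedBy<SqlSeriesRepository>(),
            Component.For<SeriesProcessor>());

        return container;
    }
}

// Usage at application start:
// var processor = ContainerBootstrapper.Configure().Resolve<SeriesProcessor>();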

In our company we have been discussing exactly this problem for weeks and months.
To follow the guidelines of unit testing:
Each test must be atomic and must not depend on any other test (no data sharing); that means each test must set up its own data at the beginning and clean it up at the end.
Our product is so complex (5 years of development, over 100 tables in the database) that it is nearly impossible to maintain this in an acceptable way.
We tried database scripts which create and delete the data before/after each test (there are automatic methods which call them).
I would say you are on the right track with Excel files.
Some ideas from me to improve it a little:
If you have a database behind your software, google for "NDbUnit". It's a framework for inserting and deleting data in databases for unit tests.
If you have no database, XML may be a little more flexible than a format like Excel.
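A rough sketch of what an NDbUnit-based fixture can look like; the class and flag names below are from memory and should be checked against the current NDbUnit API:

using NDbUnit.Core;
using NDbUnit.Core.SqlClient;

public class DatabaseFixture
{
    // Assumed connection string for a dedicated test database.
    private readonly INDbUnitTest _database =
        new SqlDbUnitTest(@"Data Source=.;Initial Catalog=MyAppTests;Integrated Security=True");

    public void ResetTestData()
    {
        // The .xsd describes the tables, the .xml holds the rows to (re)insert.
        _database.ReadXmlSchema(@"TestData\Schema.xsd");
        _database.ReadXml(@"TestData\SampleData.xml");

        // CleanInsertIdentity: delete existing rows, then insert the XML rows,
        // keeping the identity values from the file.
        _database.PerformDbOperation(DbOperationFlag.CleanInsertIdentity);
    }
}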

Not directly answering the question but one way to limit the amount of tests that need to use dummy data is to use a mocking framework to create mocked objects that you can use to fake the behavior of any dependencies you have in a class.
I find that by using mocked objects rather than a specific concrete implementation, you can drastically reduce the amount of real data you need, as mocks don't process the data you pass into them. They just perform exactly as you want them to.
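For example, with Moq you can stub just the one call your unit needs instead of loading a full data set. ICustomerRepository, Customer, and CustomerService below are invented names used only for illustration:

using Moq;

var repo = new Mock<ICustomerRepository>();
repo.Setup(r => r.GetById(42)).Returns(new Customer { Id = 42, Name = "Test customer" });

var service = new CustomerService(repo.Object);
var discount = service.CalculateDiscount(42);

// The mock never touched a database and needed exactly one fake record.
repo.Verify(r => r.GetById(42), Times.Once());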
I'm still sure you probably need dummy data in a lot of instances so apologies if you're already using or are aware of mocking frameworks.

Just to be clear, you need to differentiate between UNIT testing (test a module with no implied dependencies on other modules) and app testing (test parts of the application).
For the former, you need a mocking framework (I'm only familiar with Perl ones, but I'm sure they exist in Java/C#). A sign of a good framework would be the ability to take a running app, RECORD all the method calls/returns, and then mock the selected methods (e.g. the ones you are not testing in this specific unit test) using recorded data.
For good unit tests you MUST mock every external dependency - e.g., no calls to filesystem, no calls to DB or other data access layers unless that is what you are testing, etc...
For the latter, the same mocking framework is useful, plus the ability to create test data sets (that can be reset for each test). The data to be loaded for the tests can reside in any offline storage that you can load from - BCP files for Sybase DB data, XML, whatever tickles your fancy. We use both BCP and XML.
Please note that this sort of "load test data into DB" testing is SIGNIFICANTLY easier if your overall company framework allows - or rather enforces - a "What is the real DB table name for this table alias" API. That way, you can cause your application to look at cloned "test" DB tables instead of real ones during testing - on top of such table aliasing API's main purpose of enabling one to move DB tables from one database to another.
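A bare-bones version of such a table-aliasing seam might look like this (the names are illustrative only):

// Illustrative only: a seam that lets tests redirect a logical table name
// to a cloned "test" table without changing the calling code.
public interface ITableNameResolver
{
    string Resolve(string tableAlias);
}

public class ProductionTableNameResolver : ITableNameResolver
{
    public string Resolve(string tableAlias) => tableAlias;            // real table names
}

public class TestTableNameResolver : ITableNameResolver
{
    public string Resolve(string tableAlias) => "test_" + tableAlias;  // cloned test tables
}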


How to write an integration test in NUnit?

We are two students writing our bachelor thesis and we have developed a Windows application which should aid a restaurant in various communication processes. Fundamentally, it should be able to present information about an order from the moment a guest sends it until it is served.
We omitted testing during development but have decided to write unit tests now. Nevertheless, we have found that the most suitable tests we can write for our system now are integration tests, because all the methods in our classes are bound to SQL stored procedures via LINQ to SQL. We are aware of the use of stubs to fake out a dependency on a database, but since our database is already implemented together with all the functions, we figured it would give us more value to test several methods together as an integration test.
As seen in the code below we have tried to follow the guidelines for a unit test, but is this the right way to write an integration test?
[Test]
public void SendTotalOrder_SendAllItemsToProducer_OneSentOrder()
{
    //Arrange
    Order order = new Order();
    Guest guest = new Guest(1, order);
    Producer producer = new Producer("Thomas", "Guldborg", "Beverage producer");
    DataGridView dataGridView = new DataGridView { BindingContext = new BindingContext() };
    order.MenuItemId = 1;
    order.Quantity = 1;

    //Act
    guest.AddItem();
    dataGridView.DataSource = guest.SendOrderOverview();
    guest.SendOrder(dataGridView);
    dataGridView.DataSource = producer.OrderOverview();
    var guestTableOrder = producer.OrderOverview()
        .Where(orders => orders.gtid == guest.GuestTableId)
        .Select(producerOrder => producerOrder.gtid)
        .Single();

    //Assert
    Assert.That(guestTableOrder, Is.EqualTo(guest.GuestTableId));
}
Yes, generally speaking, this is how to write a unit test/integration test. You observe some important guidelines:
Distinct Arrange-Act-Assert steps
The test name describes these steps (maybe it should have something like "ShouldSendOneOrder" at the end, "Should" is commonly used to describe the Assert).
One Assert per test.
I assume you also obey other guidelines:
Tests are independent: they don't change persistent state, so they don't influence other tests.
Test realistic use cases: don't arrange constellations that violate business logic, don't do impossible acts. Or: mimic the real application.
However, I also see things that raise eyebrows.
It's not clear which act you test. I think some "acts" belong to the arrange step.
A method like producer.OrderOverview() makes me suspect that domain objects execute database interaction. If so, this would violate persistence ignorance. I think there should be a service that presents this method (but see below).
It's not clear why dataGridView.DataSource = producer.OrderOverview(); is necessary for the test. If it is, this only aggravates the most serious point:
Business logic and UI are entangled!!
Methods like guest.SendOrderOverview() and producer.OrderOverview() are smelly: why should a domain object know how to present its content? That's something a presenter (MVP), a controller (MVC), or a view model (MVVM) should be responsible for.
A method like guest.SendOrder(dataGridView) is evil. It ties the domain layer to the UI framework. This fixed dependency is evil enough, but of course you also need values from the grid view inside this method. So the business logic needs intimate knowledge of some UI component. This violates the "tell, don't ask" principle. guest.SendOrder should have simple parameters that tell it how to do its task, and the domain shouldn't have any reference to any UI framework.
You really should address the latter point. Make it your goal to run this test without any interaction with the DataGridView; one possible direction is sketched below.
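Purely as a sketch: let SendOrder take plain order lines instead of a DataGridView, so the test never needs WinForms at all. The OrderLine type is invented here for illustration:

using System.Collections.Generic;

// Illustrative refactoring: the domain object receives plain data, not a UI control.
public class OrderLine
{
    public int MenuItemId { get; set; }
    public int Quantity { get; set; }
}

public class Guest
{
    // ... existing members ...

    public void SendOrder(IEnumerable<OrderLine> lines)
    {
        // Persist/forward the order lines; no reference to any UI framework.
    }
}

// The test's Act step then becomes something like:
// guest.SendOrder(new[] { new OrderLine { MenuItemId = 1, Quantity = 1 } });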
If you keep the SQL bound up inside the class, your test is not a big problem.
You can use this approach while the program logic is very simple, but I suggest you study the Repository pattern as the logic becomes more complex.

How to write an integration test against a database

Let's say I have a class like the one below.
I'm not sure how I would write a unit/integration test against it. Does it need refactoring?
Would it simply be a case of adding Add/Find methods (which it would have in reality), calling Add in the test, then calling Delete and then Find?
using System.Data.SqlClient;

public class Repository
{
    public void DeleteProduct(int id)
    {
        var connstring = ""; // Get from web.config
        using (SqlConnection conn = new SqlConnection(connstring))
        {
            conn.Open();
            SqlCommand command = new SqlCommand("DELETE FROM PRODUCTS WHERE ID = @ID", conn);
            command.Parameters.AddWithValue("@ID", id);
            command.ExecuteNonQuery();
        }
    }
}
The golden rule is not to test the framework's code. Unless this method has some custom logic, there is nothing to test.
I think what you are trying to achieve is to separate the repository to make unit testing easy. The best way of doing this would be to create an interface for your repository and mock it.
If you really want to create some integration tests, then you have to create a test database where you can run your nuclear bomb experiments.
My suggestion - write an Integration test for repositories (since you are using a framework for data access), unless there is more than CRUD that you are doing in the repository.
Add/Find are all individual Repository methods, and they need to be tested themselves.
I would recommend using Setup to create seed data that you know you can act on - in this case, insert records into the Products table (see the sketch after these steps).
Then Act: call Repository.DeleteProduct(<product id created in setup>).
Assert that the product created in setup is deleted (query the database again to check).
If you are using an ORM, this test would also test your mappings for Product.
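Putting those Arrange/Act/Assert steps together, a sketch of such an integration test (NUnit-style) could look like this. It assumes a dedicated test database, a PRODUCTS table with an identity ID and a NAME column, and that the Repository class reads its connection string from config:

using System.Data.SqlClient;
using NUnit.Framework;

[TestFixture]
public class RepositoryIntegrationTests
{
    // Assumed connection string pointing at a dedicated test database.
    private const string ConnString =
        @"Data Source=.;Initial Catalog=MyAppTests;Integrated Security=True";

    private int _productId;

    [SetUp]
    public void SeedProduct()
    {
        using (var conn = new SqlConnection(ConnString))
        {
            conn.Open();
            var cmd = new SqlCommand(
                "INSERT INTO PRODUCTS (NAME) VALUES ('Test product'); SELECT CAST(SCOPE_IDENTITY() AS INT);",
                conn);
            _productId = (int)cmd.ExecuteScalar();
        }
    }

    [Test]
    public void DeleteProduct_RemovesSeededRow()
    {
        new Repository().DeleteProduct(_productId);

        using (var conn = new SqlConnection(ConnString))
        {
            conn.Open();
            var cmd = new SqlCommand("SELECT COUNT(*) FROM PRODUCTS WHERE ID = @ID", conn);
            cmd.Parameters.AddWithValue("@ID", _productId);
            Assert.AreEqual(0, (int)cmd.ExecuteScalar());
        }
    }
}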
I have never added unit test for database calls. This is definitely more of an integration test. There is nothing observable for you to check.
I know that Java had some tools for this that plugged into JUnit. It requires that you write XML files that mimic the before and after state, and then it compares the contents of the table to the XML file. I am sure that .NET will have something similar. However I am not sure that it is worth it. I found those tests to be incredibly brittle and they provide very little value.
I would suggest take the pragmatic approach and don't write test for database objects. Rather test those object that interact with your database objects.

Single class with two databases

I have a two part application. One part is a web application (C# 4.0) which runs on a hosted machine with a hosted MSSQL database. That's nice and standard. The other part is a Windows Application that runs locally on our network and accesses both our main database (Advantage) and the web database. The website has no way to access the Advantage database.
Currently this setup works just fine (provided the network is working), but we're now in the process of rebuilding the website and upgrading it from a Web Forms / .NET 2.0 / VB site to an MVC3 / .NET 4.0 / C# site. As part of the rebuild, we're adding a number of new tables where the internal database has all the data, and the web database has a subset thereof.
In the internal application, tables in the database are represented by classes which use reflection and attribute flags to populate themselves. For example:
[AdvantageTable("warranty")]
public class Warranty : AdvantageTable
{
[Advantage("id", IsKey = true)]
public int programID;
[Advantage("w_cost")]
public decimal cost;
[Advantage("w_price")]
public decimal price;
public Warranty(int id)
{
this.programID = id;
Initialize();
}
}
The AdvantageTable class's Initialize() method uses reflection to build a query based on all the keys and their values, and then populates each field based on the database column specified. Updates work similarly - We call AdvantageTable.Update() on whichever object, and it handles all the database writes. It works quite well, hides all the standard CRUD, and lets us rapidly create new classes when we add a new table. We'd rather not change it, but I'm not going to entirely rule it out if there's a solution that would require it.
The web database needs to have this table, but doesn't need the cost data. I could create a separate class that's backed by the web database (via stored procedures, reflection, LINQ to SQL, ADO data objects, etc.), but there may be other functionality in the Warranty object which I want to behave the same way regardless of whether it's called from the website or the internal app, without the need to maintain two sets of code. For example, we might change the logic for deciding which warranty applies to a product - I want to create and test that in only one place, not two.
So my question is: Can anyone think of a good way to allow this class to sometimes be populated from the Advantage database and sometimes the web database? It's not just a matter of connection strings, because they have two very different methods of access (even aside from the reflection). I considered adding [Web("id")] type tags to the Advantage tags, and only putting them on the fields which exist in the web database to designate its columns, then having a switch of some kind to control which set of logic is used for reading/writing, but I have the feeling that that would get painful (Is this method web-safe? How do I set the flag before instantiating it?). So I have no ideas I like and suspect there's a solution I'm not even aware exists. Any input?
I think the fundamental issue is that you want to put business logic in the Warranty object, which is a data layer object. What you really want to do is have a common data contract (could be an interface in this case) that both data sources support, with logic encapsulated in a separate class/layer that can operate with either data source. This side-steps the issue of having a single data class attempt to operate with two different data sources by establishing a common data contract that your business layer can use, regardless of how the data is pulled.
So, with your example, you might have an AdvantageWarranty and WebWarranty, both of which implement IWarranty. You have a separate WarrantyValidator class that can operate on any IWarranty to tell you whether the warranty is still valid for given conditions. Incidentally, this gives you a nice way to stub out your data if you want to unit test your business logic in the WarrantyValidator class.
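As a sketch of that shape (the type and member names are invented to match the discussion):

// Common contract both data sources implement.
public interface IWarranty
{
    int ProgramId { get; }
    decimal Price { get; }
    // Cost is deliberately absent: the web source doesn't carry it.
}

// Populated via the Advantage reflection code.
public class AdvantageWarranty : IWarranty
{
    public int ProgramId { get; set; }
    public decimal Price { get; set; }
}

// Populated via LINQ to SQL against the web database.
public class WebWarranty : IWarranty
{
    public int ProgramId { get; set; }
    public decimal Price { get; set; }
}

// Business logic lives here, independent of where the data came from.
public class WarrantyValidator
{
    public bool AppliesTo(IWarranty warranty, decimal productPrice)
    {
        return warranty.Price <= productPrice; // placeholder rule for illustration
    }
}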
The solution I eventually came up with was two-fold. First, I used LINQ to SQL to generate objects for each web table. Then, I derived a new class from AdvantageTable called AdvantageWebTable<TABLEOBJECT>, which contains the web-specific code, and added web-specific attributes. So now the class looks like this:
[AdvantageTable("warranty")]
public class Warranty : AdvantageWebTable<WebObjs.Warranty>
{
[Advantage("id", IsKey = true)][Web("ID", IsKey = true)]
public int programID;
[Advantage("w_cost")][Web("Cost")]
public decimal cost;
[Advantage("w_price")][Web("Price")]
public decimal price;
public Warranty(int id)
{
this.programID = id;
Initialize();
}
}
There are also hooks for populating web-only fields right before saving to the web database, and there will be (but isn't yet, since I haven't needed it) a LoadFromWeb() function which uses reflection to populate the fields.

How to Test Functions w/ Complex Data Interactions

Currently, I am working on a system that performs quite a few reporting-style functions, consuming many different data points and transforming them into larger, sometimes flattened outputs. Most of my app is built upon a variation of the repository pattern. Because of this, I have a suite of mock repositories that I use for testing scenarios. The problem I am running into is that the interaction between these data points is so complex that it is quickly becoming a maintenance nightmare to maintain the "mock data". Here is a mock example:
public class SomeReportingEntity
{
    private IProductRepo ProductRepo;
    private IManagerRepo ManagerRepo;
    private ILocationRepo LocationRepo;
    private IOrdersService OrdersService;
    private IEmployeeRepo EmployeeRepo;

    public SomeReportingEntity(IProductRepo ipr, IManagerRepo imr, ILocationRepo ilr,
                               IOrdersService ios, IEmployeeRepo ier)
    {
        // Load these into the private fields...
    }

    // This is the function that I want to test...
    public List<SomeReportingEntity> GetManagerSalesByRegionReport()
    {
        // Make a complex join on all sub collections. These
        // sub collections are all under test individually.
        var managerSalesByRegionItems = from x in ProductRepo.CurrentProducts()
                                        join y in OrdersService.FutureOrders() on ...
                                        join z in EmployeeRepo.ActiveEmployees() on ...
                                        join a in LocationRepo.GetAllRegions() on ...
                                        join b in ManagerRepo.GetActiveManagers() on ...
                                        select new SomeReportingEntity { ... };
        return managerSalesByRegionItems.ToList();
    }
}
Admittedly, this is a very contrived example but the basic idea that I want to emphasize is that I have several repositories that I am joining and I need to create many tests to ensure that this complex query does as expected. Due to the fact that the joining operations are so complex, it makes the mock data VERY difficult to keep in line - especially as I have to add more associations and test additional points. In addition, I need to be able to enter specific record states into the mocks (such as an employee lacking an assigned manager) to verify that query handles those situations appropriately.
So here are my questions:
What is the best way to "mock" this data so that it is not such a matinenance nightmare? I have had many people suggest building an in-memory database to support this.
Am I really suffering from an architecture issue here? In reporting scenarios, I find myself in this pattern quite a bit, where I take many disassociated data points and merge them into a new, hybrid entity. With the advent of LINQ, it is very easy to do and has high clarity of intent, but sometimes it feels like I am cheating a little.
The first thing you want to do is make a centralized object that knows how to retrieve the data from the different repositories. Since this is reporting only, it's easier because you don't have to worry about change tracking.
From a logistical standpoint, one thing I would consider is making a local database to hold the remote data (update periodically using agents). This would remove some of the issues of calling remote services and aggregating their data on the fly. You would also be able to pre-process some of the data at the start.
When I use the repository pattern, I couple it with the Unit Of Work pattern. The Unit of Work is the guy that does all the legwork for you. Theoretically, your UoW could bring in the data from the multiple services and present it to the repositories based on configuration.
For testing, you can use the InMemoryUnitOfWork to provide all the data in one single place.
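A stripped-down sketch of what that could look like; the names here are illustrative and simply reuse the repository interfaces from the question:

// Illustrative unit-of-work contract exposing the repositories the report needs.
public interface IUnitOfWork
{
    IProductRepo Products { get; }
    IManagerRepo Managers { get; }
    ILocationRepo Locations { get; }
    IOrdersService Orders { get; }
    IEmployeeRepo Employees { get; }
}

// Test double that hands back pre-canned, in-memory repositories.
public class InMemoryUnitOfWork : IUnitOfWork
{
    public IProductRepo Products { get; set; }
    public IManagerRepo Managers { get; set; }
    public ILocationRepo Locations { get; set; }
    public IOrdersService Orders { get; set; }
    public IEmployeeRepo Employees { get; set; }
}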
I've been working on a data-heavy project myself. What has worked for us is to use the repository itself to hydrate objects and then serialize them to XML. We pull the XML file into our test project and use that as the starting point for our automated tests. It's nice because it ensures that your mock data looks like real data.
Our tests tend to look like this...
var object1 = XmlUtil.LoadObject1("filename1");
var object2 = XmlUtil.LoadObject2("filename2");
var result = SomeConverter.Convert(object1, object2);
Assert("somevalue", result.Property1);
If you need to do inline lookups, you can add a mock repository that would provide the same level of dependency injection.
The downside of this approach is if the data schema changes. Sometimes a test can become obsolete if the data schema has changed. If your schema is still under a lot of flux, I would keep your automated test suite small until the schema settles down. Focus on unit tests until you know that the schema is relatively stable.
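A minimal version of that XML round-trip, using the standard XmlSerializer (the LoadObject1/LoadObject2 wrappers in the example above could delegate to a generic helper like this):

using System.IO;
using System.Xml.Serialization;

// Minimal helper for the "hydrate once, replay in tests" approach.
public static class XmlUtil
{
    public static void Save<T>(T value, string path)
    {
        using (var stream = File.Create(path))
        {
            new XmlSerializer(typeof(T)).Serialize(stream, value);
        }
    }

    public static T Load<T>(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            return (T)new XmlSerializer(typeof(T)).Deserialize(stream);
        }
    }
}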
You have to decide exactly what you want to test.
One way to do this might be to pretend you're using TDD. Pretend that your GetManagerSalesByRegionReport method does not exist (or actually delete it). You'll have to:
Write a failing unit test. What's the simplest thing for it to test? That you can call the method and that it doesn't throw an exception when there's nothing wrong with the data.
You'll need to create the method, empty. It should return void since your test doesn't need it to return anything.
Your test should now pass.
Add a test to ensure that a List of the appropriate type is returned, even if none of the sub-repositories have data.
You'll have to change the method to return your list type; start by returning null. Your test will still fail, so change it to return an empty List and it will pass (a sketch of this kind of test appears after these steps).
What's left? Those are INNER joins, so you won't get any data back unless all the repositories contain at least one row. So, test for that: create a test where each repo contains one row and ensure the returned list contains the appropriate number of rows. Then, test for the appropriate properties per returned row. Then test that no data is returned if any of the repos contain no rows.
Then, maybe test what happens if some of the repos contain more than one row.
Then, I don't know what would be left to test.
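For instance, the "empty repositories still yield an empty list" step might be expressed roughly like this; the Empty* repository fakes are assumed to exist alongside the mock repositories you already maintain:

[TestMethod]
public void GetManagerSalesByRegionReport_ReturnsEmptyList_WhenAllRepositoriesAreEmpty()
{
    // Arrange: every dependency answers with an empty collection.
    var entity = new SomeReportingEntity(
        new EmptyProductRepo(), new EmptyManagerRepo(), new EmptyLocationRepo(),
        new EmptyOrdersService(), new EmptyEmployeeRepo());

    // Act
    var report = entity.GetManagerSalesByRegionReport();

    // Assert
    Assert.IsNotNull(report);
    Assert.AreEqual(0, report.Count);
}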

How to handle setting up complex unit tests and have them only test the unit

I have a method that takes 5 parameters. This method is used to take a bunch of gathered information and send it to my server.
I am writing a unit test for this method, but I am hitting a bit of a snag. Several of the parameters are Lists<> of classes that take some doing to set up correctly. I have methods that set them up correctly in other units (production code units). But if I call those, then I am kind of breaking the whole idea of a unit test (to only hit one "unit").
So.... what do I do? Do I duplicate the code that sets up these objects in my Test Project (in a helper method) or do I start calling production code to setup these objects?
Here is hypothetical example to try and make this clearer:
File: UserDemographics.cs
class UserDemographics
{
    // A bunch of user demographics here,
    // and values that get set as a user gets added to a group.
}
File: UserGroups.cs
class UserGroups
{
    // A bunch of variables that change based on
    // the demographics of the users put into them.
    public void AddUserDemographicsToGroup(UserDemographics userDemographics)
    {}
}
File: UserSetupEvent.cs
class UserSetupEvent
{
    // An event to record the registering of a user.
    // Is highly dependent on UserDemographics and semi-dependent on UserGroups.
    public void SetupUserEvent(List<UserDemographics> userDemographics,
                               List<UserGroup> userGroups)
    {}
}
file: Communications.cs
class Communications
{
    public void SendUserInfoToServer(SendingEvent sendingEvent,
                                     List<UserDemographics> userDemographics,
                                     List<UserGroup> userGroups,
                                     List<UserSetupEvent> userSetupEvents)
    {}
}
So the question is: To unit test SendUserInfoToServer should I duplicate SetupUserEvent and AddUserDemographicsToGroup in my test project, or should I just call them to help me setup some "real" parameters?
You need test duplicates.
You're correct that unit tests should not call out to other methods, so you need to "fake" the dependencies. This can be done in one of two ways:
Manually written test duplicates
Mocking
Test duplicates allow you to isolate your method under test from its dependencies.
I use Moq for mocking. Your unit test should send in "dummy" parameter values, or statically defined values you can use to test control flow:
public class MyTestObject
{
    public IEnumerable<Thingie> GetTestThingies()
    {
        yield return new Thingie() { id = 1 };
        yield return new Thingie() { id = 2 };
        yield return new Thingie() { id = 3 };
    }
}
If the method calls out to any other classes/methods, use mocks (aka "fakes"). Mocks are dynamically-generated objects based on virtual methods or interfaces:
Mock<IRepository> repMock = new Mock<IRepository>();
MyPage obj = new MyPage(); // let's pretend this is ASP.NET
obj.IRepository = repMock.Object;
repMock.Setup(r => r.FindById(1)).Returns(new MyTestObject().GetTestThingies().First());
var thingie = obj.GetThingie(1);
The Mock object above uses the Setup method to return the same result for the call defined in the r => r.FindById(1) lambda. This is called an expectation. This allows you to test only the code in your method, without actually calling out to any dependent classes.
Once you've set up your test this way, you can use Moq's features to confirm that everything happened the way it was supposed to:
//did we get the instance we expected?
Assert.AreEqual(thingie.Id, new MyTestObject().GetTestThingies().First().Id);
//was a method called?
repMock.Verify(r => r.FindById(1));
The Verify method allows you to test whether a method was called. Together, these facilities allow you focus your unit tests on a single method at a time.
Sounds like your units are too tightly coupled (at least from a quick view of your problem). What makes me curious is, for instance, the fact that your UserGroups takes a UserDemographics and your UserSetupEvent takes a list of UserGroup plus a list of UserDemographics (again). Shouldn't the List<UserGroup> already include the UserDemographics passed in its constructor, or am I misunderstanding it?
Somehow it seems like a design problem of your class model which in turn makes it difficult to unit test. Difficult setup procedures are a code smell indicating high coupling :)
Bringing in interfaces is what I would prefer. Then you can mock the used classes and you don't have to duplicate code (which violates the Don't Repeat Yourself principle) and you don't have to use the original implementations in the unit tests for the Communications class.
You should use mock objects; basically, your unit test should probably just generate some fake data that looks like real data instead of calling into the real code. This way you can isolate the test and have predictable test results.
You can make use of a tool called NBuilder to generate test data. It has a very good fluent interface and is very easy to use. If your tests need to build lists, this works even better. You can read more about it here.
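For example, an NBuilder call for one of the lists above might look roughly like the sketch below; the fluent method names are from memory and the Age property is invented, so check both against the NBuilder docs:

using System.Linq;
using FizzWare.NBuilder;

// Builds 10 UserDemographics objects with auto-filled properties,
// tweaking the first three to cover a specific edge case.
var demographics = Builder<UserDemographics>.CreateListOfSize(10)
    .TheFirst(3)
        .With(d => d.Age = 17)   // assumed property, for illustration only
    .Build()
    .ToList();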
