Unit testing, project with many database calls

I have a question on unit testing. Currently I have a large project that calls many stored procedures (SPs), and most of its methods return nothing. It is really a large wrapper around many SQL calls. There is not much logic in the C# code, since nearly all of it lives in the SPs; the project also has sections of inline SQL.
I need to unit test this C# project, but it is becoming clear that the unit tests would be pointless: they would call many SPs, all of which would be mocked. I am worried that I am thinking about this incorrectly.
Has anyone had this problem, and what did they do? Should I be doing database unit tests instead? Any insight would be a great help.
Thanks.

A unit test should not touch the data access layer; a test that does is an integration or system test. What you can unit test is that your project actually calls your data access layer. This gives you peace of mind during refactoring that, for example, clicking a button still calls the data access layer.
// Arrange
var dataAccessMock = new Mock<IDataAccessMock>();
dataAccessMock.Setup(da => da.ExecuteSomething());
// Pass the mocked instance (.Object), not the Mock<T> wrapper itself.
IYourApplication app = new YourApplication(dataAccessMock.Object);
// Act
app.SomeProcessThatCallsExecuteSomething("1234567890");
// Assert
dataAccessMock.Verify(da => da.ExecuteSomething(), Times.Once());
Note: this example uses Moq.
After this is tested to your liking, you can focus on integration tests that verify your stored procedures work as intended. For this you will potentially need to do quite a bit of work to attach a database in a known state, run your stored procedures, and then revert or trash the database so the tests are repeatable.

You should split your testing strategy into integration testing and unit testing.
For integration testing you can rely on your existing database. You will typically write more high-level tests here and verify that your application interacts with your database correctly.
For unit testing you should only pick selected scenarios that actually make sense for mocking out. These are typically scenarios where a lot of business logic "sits on top" of your database logic and you want to verify that business logic.
Over time you can mock out more and more database calls, but to begin with, identify the critical spots.

You have discovered one reason that business logic should generally go in the business, rather than data access, layer. Certainly there are exceptions dictated by performance and sometimes security concerns, but they should remain exceptions.
Having said that, you can still develop a strategy to test your sprocs (though depending on how extensive they are, it may or may not be correct to call those tests "unit tests").
You can use a unit testing framework either way.
In the initialization section, restore a testing copy of the database to a known state, e.g. by loading it from a previously saved copy.
Then, execute unit tests that exercise the stored procedures. Since the stored procedures generally do not return anything, your unit test code will have to select values from the database to check whether the expected changes were made or not.
It may be necessary, depending on possible interactions between stored procedures, to restore the database between each test, or between groups of related tests.

The data/persistence layer is often the most neglected code from a unit testing perspective (true unit testing using test doubles: mocks, stubs, fakes, etc.). If you are connecting to a database then you are integration testing. I find value in data/persistence layers that a) are well architected and, as a side effect, easy to test (interfaces, a good data access framework abstraction, etc.) and b) are actually unit and integration tested properly.

Related

Testing Database Exists TDD

I am in need of some testing advice.
I know that it's generally bad practice to hit a database in Unit Tests except in exceptional circumstances.
I'm taking a TDD approach to an MVC project using EF. My first test is:
void DatabaseShouldExist() { ... }
I would like to know... Is this an exceptional circumstance?
I want to check that EF generated the DB, and my next test will be to check that it contains the correct seed data.
How would you go about testing this?
Should it be tested?

You want to test behaviour, not whether a DB exists on its own.
As suggested in the comments, start with business logic.
TDD starts small and is iterative; don't dive into DB logic in test 1.
Simplistic example (for an app that stores movies):
Test 1 - shouldAddAMovieToList()
Test 2 - shouldBeAbleToRetrieveAMovieFromList()
Test 3 - shouldPersistAMovieBetweenSessions() // Could be DB here
When using TDD, pick something simple first. The DB part should come into play a bit later on.
Personally I would avoid testing against a DB in a unit test and save this for integration tests. The DAO pattern is good for this, as you can persist in memory or simply mock the DB side in unit tests.
Unit tests should try to adhere to the FIRST principles. Introducing a database can slow tests down and prevent them from being independent (unless the DB is cleared each time). At the very least, try to use an in-memory database for unit tests.
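As a minimal sketch of the DAO idea for the movie example, the business logic can depend on an interface and the unit tests can supply an in-memory implementation. The names `IMovieStore`, `InMemoryMovieStore`, and `MovieCatalog` are hypothetical and chosen only for illustration; a database-backed implementation of the same interface would be exercised by integration tests.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical persistence abstraction (not from the question).
public interface IMovieStore
{
    void Add(string title);
    IReadOnlyList<string> GetAll();
}

// In-memory implementation intended for unit tests only.
public sealed class InMemoryMovieStore : IMovieStore
{
    private readonly List<string> _titles = new List<string>();

    public void Add(string title) => _titles.Add(title);

    public IReadOnlyList<string> GetAll() => _titles.AsReadOnly();
}

// Business logic under test; it knows only the interface.
public sealed class MovieCatalog
{
    private readonly IMovieStore _store;

    public MovieCatalog(IMovieStore store)
    {
        _store = store ?? throw new ArgumentNullException(nameof(store));
    }

    public void AddMovie(string title)
    {
        if (string.IsNullOrWhiteSpace(title))
            throw new ArgumentException("Title is required.", nameof(title));
        _store.Add(title.Trim());
    }

    // Case-insensitive lookup over whatever the store returns.
    public bool Contains(string title)
    {
        foreach (var stored in _store.GetAll())
        {
            if (string.Equals(stored, title, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

Tests 1 and 2 from the list above could then construct `new MovieCatalog(new InMemoryMovieStore())` and never touch a database; only Test 3 needs a persistent implementation.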

Writing unit tests when the repository holds your most important code

I have an EAV system that stores entities in a SQL database, fetches them out and stores them in the cache. The application is written using the repository pattern because at some point in the future we will probably switch to using a NOSQL database for serving some or all of the data. I use Ninject to fetch the correct repository at runtime.
A large part of the system's functionality is around storing, retrieving and querying data in an efficient and timely manner. There is not a huge amount of functionality that doesn't fall into the realm of data access or user interface.
I've read up on unit testing - I understand the theory but haven't put it into practice yet for a few reasons:
An entity consists of fieldsets, fields, values, each of which has many properties. Creating any large number of these in code in order to test would require a lot of effort.
Some of the most crucial parts of my code are in the repositories. For instance all of the data access goes through a single highly optimised method that fetches entities from the database or cache.
Using a test database feels like I'm breaking one of the key tenets of unit testing - no external dependencies.
In addition to this, the way the repositories are built feels like it's tied to how the data is stored in SQL. Entities go in one table, fields in another, values in another, etc., so I have a repository for each. It is my understanding, though, that in a document store database the entity, its fields and values would all exist as a single object, removing the need for multiple repositories. I've considered making my data access more granular in order to move sections of code outside of the repository, but this would compound the problem by forcing me to write the repository interfaces in a way that is designed for retrieving data from SQL.
Question: Based on the above, should I accept that I cannot write unit tests for large parts of my code and just test the things I can?

should I accept that I cannot write unit tests for large parts of my code?
No, you shouldn't accept that. In fact, this is never the case - with enough effort, you can unit test pretty much anything.
Your problem boils down to this: your code relies upon a database, but you cannot use it, because it is an external dependency. You can address this problem by using mock objects - special objects constructed inside your unit test code that present themselves as implementations of database interfaces, and feed your program the data that is required to complete a particular unit test. When your program sends requests to these objects, your unit test code can verify that the requests are correct. When your program expects a particular response, your unit tests give it the response as required by your unit test scenario.
Mocking may be non-trivial, especially in situations when requests and responses are complex. Several libraries exist to help you out with this in .NET, making the task of coding your mock objects almost independent of the structure of the real object. However, the real complexity is often in the behavior of the system that you are mocking - in your case, that's the database. The effort of coding up this complexity is entirely on you, and it does consume a very considerable portion of your coding time.
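To illustrate, here is a minimal hand-written test double, assuming a hypothetical `IEntityRepository` interface and a hypothetical `CachedEntityReader` class under test (neither comes from the question). The double feeds canned data to the code under test and records every request so the test can check that the cache avoids repeated repository calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical repository interface; the real one would wrap SQL access.
public interface IEntityRepository
{
    string LoadName(int id);
}

// Hand-written test double: serves canned data and records each request.
public sealed class RecordingEntityRepository : IEntityRepository
{
    private readonly Dictionary<int, string> _canned;

    public List<int> RequestedIds { get; } = new List<int>();

    public RecordingEntityRepository(Dictionary<int, string> canned)
    {
        _canned = canned;
    }

    public string LoadName(int id)
    {
        RequestedIds.Add(id);
        return _canned[id];
    }
}

// Code under test: caches names so the repository is hit once per id.
public sealed class CachedEntityReader
{
    private readonly IEntityRepository _repository;
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();

    public CachedEntityReader(IEntityRepository repository)
    {
        _repository = repository;
    }

    public string GetName(int id)
    {
        if (!_cache.TryGetValue(id, out var name))
        {
            name = _repository.LoadName(id);
            _cache[id] = name;
        }
        return name;
    }
}
```

A test would call `GetName` twice for the same id and then assert that `RequestedIds` contains a single entry. A mocking library can generate the double instead of writing it by hand, but the scenario data still has to come from the test.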

It appears that when you say "unit test", you really mean "integration test", because in the unit test world there is no database. If you expect to get data from or insert data into an external resource, you just fake it (using mocks, stubs, fakes, spies, etc.).
should I accept that I cannot write unit tests for large parts of my code and just test the things I can?
Hard to tell without seeing your code, but it sounds like you can easily unit test it. This is based on your use of interfaces and the repository pattern. As long as a unit test is independent of other tests, tests only a single piece of functionality, is small and simple, and does not depend on any external resources, you are good to go.
Do not confuse this with integration and other types of testing. Those may involve real data and may be a bit trickier to write.

If you're using the proper Repository pattern testing is easy, because
Business layer knows ONLY about the repository interface which deals ONLY with objects known by the layer and doesn't expose ANYTHING related to the actual Db (like EF). Here's where you're using fakes implementing the repository interface.
Testing the db access means you're testing the Repo implementation, you get test objects in and out. It's natural for the Repository to be coupled with the db.
Test for repo should be something like this
public void add_get_object()
{
    var entity = CreateSomeTestEntity();
    _repo.Save(entity);
    var loaded = _repo.Get(entity.Id);
    // assert those 2 entities are identical, but NOT using reference
    // either make it implement IEquatable<Entity> or test each property manually
}
These repo tests can be reused with ANY repository implementation (in memory, EF, RavenDB, etc.), because the implementation shouldn't matter; what matters is that the repo does what it's required to do (saving/loading business objects).
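One way to get that reuse, sketched here without any test framework, is an abstract contract class whose subclasses supply the repository under test. `Entity`, `IEntityStore`, `InMemoryEntityStore`, and `RepositoryContract` are hypothetical names; with a real framework the `AddGetObject` method would carry the framework's test attribute.

```csharp
using System;
using System.Collections.Generic;

// Business object with value equality, so tests need not compare references.
public sealed class Entity : IEquatable<Entity>
{
    public Guid Id { get; }
    public string Name { get; }

    public Entity(Guid id, string name) { Id = id; Name = name; }

    public bool Equals(Entity other) =>
        other != null && Id == other.Id && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as Entity);

    public override int GetHashCode() => Id.GetHashCode() ^ (Name?.GetHashCode() ?? 0);
}

// Hypothetical repository interface known to the business layer.
public interface IEntityStore
{
    void Save(Entity entity);
    Entity Get(Guid id);
}

public sealed class InMemoryEntityStore : IEntityStore
{
    private readonly Dictionary<Guid, Entity> _items = new Dictionary<Guid, Entity>();

    // Store a copy so a test cannot pass through reference equality alone.
    public void Save(Entity entity) =>
        _items[entity.Id] = new Entity(entity.Id, entity.Name);

    public Entity Get(Guid id) => _items[id];
}

// Contract shared by every implementation; subclasses choose the repository.
public abstract class RepositoryContract
{
    protected abstract IEntityStore CreateRepository();

    public void AddGetObject()
    {
        var repo = CreateRepository();
        var entity = new Entity(Guid.NewGuid(), "sample");
        repo.Save(entity);
        var loaded = repo.Get(entity.Id);
        if (!entity.Equals(loaded))
            throw new InvalidOperationException("Loaded entity differs from the saved one.");
    }
}

public sealed class InMemoryRepositoryContract : RepositoryContract
{
    protected override IEntityStore CreateRepository() => new InMemoryEntityStore();
}
```

An EF-backed or document-store-backed repository would get its own small subclass overriding `CreateRepository`, and would run as an integration test against a test database.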

Is there any reason to use Mock objects when unit testing an Entity Framework DAL?

From what I understand, Mocking is used to remove dependency of another service so that you can properly test the execution of business logic without having to worry about those other services working or not.
However, if what you are testing IS that particular service (i.e. Entity Framework), Implementation-style unit tests against a preset test database are really the only tests that will tell you anything useful.
Am I missing anything? Does mocking have any place in my testing of an Entity Framework DAL?

You are correct in your assertion about mocking:
Mocking is used to remove dependency of another service so that you can properly test the execution of business logic without having to worry about those other services working or not.
In my words: the idea behind unit testing is to test a single code path through a single method. When that method hands execution over to another object there is a dependency. When control passes to an unmocked dependency you are no longer unit testing, but are instead integration testing.
Data access layer testing is typically integration testing. As you speculate, you can use a predictable data set to ensure your DAL returns the results you expect. I would not expect a DAL to have any dependencies that require mocking. You're testing that the values returned by your DAL match what you would expect given your data set.
All of the above said, it is not your responsibility to test the Entity Framework. If you find yourself testing the way EF works rather than your specific DAL implementation, you are writing the wrong tests. Put another way: you test your code, let someone else test theirs.
Finally, three years ago I asked a similar question which elicited answers which greatly improved my understanding of testing. While not identical to your question I'd recommend reading through the responses.

In my opinion, mocking objects has nothing to do with the layer you are about to test. You have a component that you want to test and it has dependencies (that you can mock). So go ahead and mock.

One can assume EF works. You would test your code that interacts with EF. In this case, you fake EF in order to test your interacting code.
You basically shouldn't be testing EF. Leave that to Microsoft.

One thing you might consider is that EF can work with a local file and not your actual repository and you can switch between the two with connection strings. This example would create the .sdf file in an AppData folder or in the bin folder if it's a console application.
<connectionStrings>
  <add name="SecurityContext"
       connectionString="Data Source=|DataDirectory|YourDBContext.sdf"
       providerName="System.Data.SqlServerCe.4.0" />
</connectionStrings>
I like this when I'm starting a project or testing. You can load the DB with data and such and presto: EF has a mocked DB for you to run tests against without touching production data.

Differences between database/entity framework and in memory lists when mocking in unit tests

I have been doing a lot of unit testing lately with mocking. The one thing that strikes me as a bit of a problem are the differences between querying against an in memory list (via a mock of my repository) and querying directly against the database via entity framework.
Some of these situations might be:
Testing a filter parameter which would be case insensitive against a database but case sensitive against an in-memory collection, leading to a false fail.
LINQ statements that might pass against an in-memory collection but would fail against Entity Framework because they aren't supported, leading to a false pass.
What is the correct way to handle or account for these differences so that there are no false passes or fails in tests? I really like mocking, as it makes things so much quicker and easier to test. But it seems to me that the only way to get a really accurate test would be to test against the Entity Framework/database environment.
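To make the first situation concrete, here is a minimal sketch that needs no database. `NameFilter` is a hypothetical helper; it assumes the production database column uses a case-insensitive collation, which is common but not universal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameFilter
{
    // Ordinal comparison: what a plain == does in LINQ to Objects.
    public static List<string> CaseSensitive(IEnumerable<string> names, string wanted) =>
        names.Where(n => n == wanted).ToList();

    // Explicit case-insensitive comparison, mimicking a database column
    // with a case-insensitive collation (an assumption about the database).
    public static List<string> CaseInsensitive(IEnumerable<string> names, string wanted) =>
        names.Where(n => string.Equals(n, wanted, StringComparison.OrdinalIgnoreCase)).ToList();
}
```

For the names "Alice", "ALICE", and "Bob" with the filter "alice", the first method finds nothing while the second finds two rows. The explicit comparison belongs on the in-memory fake's side only: an EF provider may not translate `StringComparison` overloads to SQL, which is itself an instance of the second situation in the list.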

Besides the unit tests you do you should also create integration tests which run against a real database setup as encountered in production.
I'm not an expert in EF, but with NHibernate, for example, you can create a configuration which points to an in-memory instance of SQLite and run your quick tests against it (i.e. during a development cycle where you want to get through the test suite as fast as possible). When you want to run your integration tests against a real database, you simply change the NHibernate config to point to a real database setup and run the same tests again.
It would be surprising if you could not achieve something similar with EF.

You can use DevMagicFake, this framework will fake the DB for you and can also generate data so you can test your application without testing the DB

First and most important, you can define any behavior and data you like within your mock. Second is speed: from a unit testing perspective, test speed counts. Database connections are the bottleneck most of the time, which is why you mock them in tests.
To implement testing properly you need to work on your overall architecture first.
For instance, I sometimes use the repository pattern to access the data layer. It's described really well in Eric Evans's DDD book.
So let's say if your repository is defined as below
interface IRepository: IQueryable, ICollection
you can handle linq queries pretty straightforward.
Further reading Repository

I would make my mocks more granular, so that you don't actually query against a larger set in a mock repository.
I typically have setters on my mock repository that I set in each test to control the output of the mocked repository.
This way you don't have to rely on writing queries against a generic mock, and your focus can be on testing the logic in the method under test
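A minimal sketch of this approach, with hypothetical names throughout (`Order`, `IOrderRepository`, `StubOrderRepository`, `LoyaltyService`): the stub contains no query logic at all, and each test sets exactly the output it needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Order
{
    public decimal Total { get; set; }
}

// Narrow, intention-revealing repository method instead of a generic query.
public interface IOrderRepository
{
    IReadOnlyList<Order> GetOrdersForCustomer(int customerId);
}

// Stub whose output each test sets directly; no query logic lives in it.
public sealed class StubOrderRepository : IOrderRepository
{
    public IReadOnlyList<Order> OrdersToReturn { get; set; } = new List<Order>();

    public IReadOnlyList<Order> GetOrdersForCustomer(int customerId) => OrdersToReturn;
}

public sealed class LoyaltyService
{
    private readonly IOrderRepository _orders;

    public LoyaltyService(IOrderRepository orders)
    {
        _orders = orders;
    }

    // Logic under test: a customer is "gold" once total spend reaches 1000.
    public bool IsGoldCustomer(int customerId) =>
        _orders.GetOrdersForCustomer(customerId).Sum(o => o.Total) >= 1000m;
}
```

Because filtering by customer happens behind the interface, the differences between LINQ to Objects and the database never enter the unit test; the real query is left to an integration test of the repository implementation.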

Unit-Testing: Database set-up for tests

I'm writing unit-tests for an app that uses a database, and I'd like to be able to run the app against some sample/test data - but I'm not sure of the best way to setup the initial test data for the tests.
What I'm looking for is a means to run the code-under-test against the same database (or schematically identical) that I currently use while debugging - and before each test, I'd like to ensure that the database is reset to a clean slate prior to inserting the test data.
I realize that using an IRepository pattern would allow me to remove the complexity of testing against an actual database, but I'm not sure that will be possible in my case.
Any suggestions or articles that could point me in the right direction?
Thanks!
--EDIT--
Thanks everyone, those are some great suggestions! I'll probably go the route of mocking my data access layer, combined with some simple set-up classes to generate exactly the data I need per test.

Here's the general approach I try to use. I conceive of tests at about three or four levels: unit tests, interaction tests, integration tests, acceptance tests.
At the unit test level, it's just code. Any database interaction is mocked out, either manually or using one of the popular frameworks, so loading data is not an issue. They run quick, and make sure the objects work as expected. This allows for very quick write-test/write code/run test cycles. The mock objects serve up the data that is needed by each test.
Interaction tests cover non-trivial interactions between classes. Again, no database is required; it's mocked out.
Now at the integration level, I'm testing integration of components, and that's where real databases, queues, services, yada yada, get thrown in. If I can, I'll use one of the popular in-memory databases, so initialization is not an issue. It always starts off empty, and I use utility classes to scrub the database and load exactly the data I want before each test, so that there's no coupling between the tests.
The problem I've hit using in-memory databases is that they often don't support all the features I need. For example, perhaps I require an outer join, and the in-memory DB doesn't support that. In that case, I'll typically test against a local conventional database such as MySQL, again, scrubbing it before each test. Since the app is deployed to production in a separate environment, that data is untouched by the testing cycle.

The best way I've found to handle this is to use a static test database with known data, and use transactions to ensure that your tests don't change anything.
In your test setup you would start a transaction, and in your test cleanup, you would roll the transaction back. This lets you modify data in your tests but also makes sure everything gets restored to its original state when the test completes.
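One way to sketch this in .NET is with `System.Transactions.TransactionScope`. The base class below is hypothetical; it relies on the convention (used by xUnit, for example) that a test class is constructed before each test and disposed after it. With other frameworks the same two steps would go in the setup and cleanup methods.

```csharp
using System;
using System.Transactions;

// Base class for database tests: open an ambient transaction before each
// test and dispose it without calling Complete(), which rolls it back.
public abstract class RollbackTestBase : IDisposable
{
    private readonly TransactionScope _scope;

    protected RollbackTestBase()
    {
        _scope = new TransactionScope(TransactionScopeOption.RequiresNew);
    }

    public void Dispose()
    {
        // Complete() was never called, so disposing discards all changes.
        _scope.Dispose();
    }
}

// Hypothetical test class; real tests would open connections and run SQL here.
public sealed class SampleDatabaseTest : RollbackTestBase
{
    public bool HasAmbientTransaction() => Transaction.Current != null;
}
```

This assumes the data provider enlists connections opened inside the scope in the ambient transaction, which should be checked for the provider and connection string in use. Code under test that commits its own transactions or opens connections on other threads may escape the rollback.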

I know you're using C#, but in the Java world there's the Spring framework. It allows you to run database manipulations in a transaction and roll that transaction back afterwards. This means that you operate against a real database without changing its state once the test finishes. Perhaps this could be a hint for further investigation in C#.

Mocking is of course the best way to unit test your code.
As far as integration tests go, I have had some issues using in-memory databases like SQLite, mainly because of small differences in behaviour and/or syntax.
I have been using a local instance of MySQL for integration tests in several projects. A recurring problem is the server setup and the creation of test data.
I have created a small NuGet package called MySql.Server (see more at https://github.com/stumpdk/MySql.Server) that simply sets up a local instance of MySQL every time you run your tests.
With this instance running, you can easily set up table structures and sample data for your tests without worrying about either your production environment or local server setup.

I don't think there is an easy way to do this. You just have to create pre-test SQL setup scripts and post-test teardown scripts, and then trigger those scripts for each run. A lot of people suggest SQLite for unit test setup.

I found it best to have my tests go to a different db so I could wipe it clean and put in the data I wanted for the test.
You may want to have the database be something that can be set within the program, then your test can tell the classes to change the database.

This code clears all data from all user tables in MS SQL Server:
private DateTime _timeout;

public void ClearDatabase(SqlConnection connection)
{
    _timeout = DateTime.Now + TimeSpan.FromSeconds(30);
    do
    {
        SqlCommand command = connection.CreateCommand();
        // sp_MSforeachtable is an undocumented system procedure; '?' is replaced by each table name.
        command.CommandText = "exec sp_MSforeachtable 'DELETE FROM ?'";
        try
        {
            command.ExecuteNonQuery();
            return;
        }
        catch (SqlException)
        {
            // Presumably a delete can fail while rows in other tables still
            // reference the table (foreign keys), so retry until the timeout.
        }
    } while (!TimeOut());
    if (TimeOut())
        Assert.Fail("Fail to clear DB");
}

private bool TimeOut()
{
    return DateTime.Now > _timeout;
}

If you are thinking about using a real database, then most likely we're talking about integration tests here, i.e. tests that check app behavior as a composition of different components, in contrast to unit tests, where components are supposed to be tested in isolation.
With the testing scope defined, I wouldn't recommend using things like in-memory databases or mocking libraries as the other authors suggested. The problem is that in-memory databases usually behave slightly differently or offer a reduced set of features, and with mocking there is no database at all, so in a general sense you'll be testing some other application and not the one you'll be delivering to your customers.
I'd rather suggest minimizing the number of integration tests by covering just the crucial parts of your logic and leaving the rest to unit testing, while using a real database with a setup as close to production as possible. Test runs can be too slow and a real pain if there are a lot of integration tests.
You might also use some tricks to speed up test execution:
Split tests into read and write tests according to the data mutations they introduce, and run the read tests in parallel and without any cleanup (e.g. HTTP GET requests are safe to run in parallel if the system under test is a web app and the tests are more like end-to-end tests);
Use a single insert/delete script for all the data and optimize it as much as possible. You might find the Reseed library, which I'm currently developing, helpful; it can generate both insert and delete scripts for you, which is basically what you asked for. Or check out Respawn, which can be used for database cleanup;
Use database snapshots for the restore, which might be faster than a full insert/delete cycle;
Wrap each test in a transaction and roll it back afterwards (this one is also not 100% honest and is somewhat fragile);
Parallelize your tests by using a pool of databases instead of a single one. Docker and Testcontainers could be suitable here.
