How to use unit tests in projects with many levels of indirection

How to use unit tests in projects with many levels of indirection - c#

I was looking over a fairly modern project created with a big emphasis on unit testing. In accordance with old adage "every problem in object oriented programming can be solved by introducing new layer of indirection" this project was sporting multiple layers of indirection. The side-effect was that fair amount of code looked like following:
public bool IsOverdraft)
{
balanceProvider.IsOverdraft();
}
Now, because of the empahsis on unit testing and maintaining high code coverage, every piece of code had unit tests written against it.Therefore this little method would have three unit tests present. Those would check:
If balanceProvider.IsOverdraft() returns true then IsOverdraft should return true
If balanceProvider.IsOverdraft() returns false then IsOverdraft should return false
If balanceProvider throws an exception then IsOverdraft should rethrow the same exception
To make things worse, the mocking framework used (NMock2) accepted method names as string literals, as follows:
NMock2.Expect.Once.On(mockBalanceProvider)
.Method("IsOverdraft")
.Will(NMock2.Return.Value(false));
That obviously made "red, green, refactor" rule into "red, green, refactor, rename in test, rename in test, rename in test". Using differnt mocking framework like Moq, would help with refactoring, but it would require a sweep trough all existing unit tests.
What is the ideal way to handle this situation?
A) Keep smaller levels of layers, so that those forwarding calls do not happen anymore.
B) Do not test those forwarding methods, as they do not contain business logic. For purposes of coverage marked them all with ExcludeFromCodeCoverage attribute.
C) Test only if proper method is invoked, without checking return values, exceptions, etc.
D) Suck it up, and keep writing those tests ;)

Either B or C. That's the problem with such general requirements ("every method must have unit test, every line of code needs to be covered") - sometimes, benefit they provide is not worth the cost. If it's something you came up with, I suggest rethinking this approach. The "we must have 95% code coverage" might be appealing on paper but in practice it quickly spawns problems like the one you have.
Also, the code you're testing is something I'd call trivial code. Having 3 tests for it is most likely overkill. For that single line of code, you'll have to maintain like 40 more. Unless your software is mission critical (which might explain high-coverage requirement), I'd skip those tests.
One of the (IMHO) most pragmatic advices on this topic was provided by Kent Beck some time ago on this very site and I expanded a bit on those thoughts with in my blog posts - What should you test?

Honestly, I think we should write tests only to document our code in an helpful manner. We should not write tests just for the sake of code coverage. (Code coverage is just a great tool to figure out what it is NOT covered so that we can figure out if we did forget important unit tests cases or if we actually have some dead code somewhere).
If I write a test, but the test ends up just being a "duplication" of the implementation or worse...if it's harder to understand the test than the actual implementation....then really such a test should not exists. Nobody is interested in reading such tests. Tests should not contain implementation details. Test are about "what" should happen not "how" it will be done. Since you've tagged your question with "TDD", I would add that TDD is a design practice. So if I already know 100% sure in advance what will be the design of what i'm going to implement, then there is no point for me to use TDD and write unit tests (But I will always have in all cases a high level acceptance test that will cover that code). That will happen often when the thing to design is really simple, like in your example. TDD is not about testing and code coverage, but really about helping us to design our code and document our code. There is no point to use a design tool or a documentation tool for designing/documenting simple/obvious things.
In your example, it's far easier to understand what's going on by reading directly the implementation than the test. The test doesn't add any value in term of documentation. So I'd happily erase it.
On top of that such tests are horridly brittle, because they are tightly coupled to the implementation. That's a nightmare on the long term when you need to refactor stuff since any time you will want to change the implementation they will break.
What I'd suggest to do, is to not write such tests but instead have higher level component tests or fast integration tests/acceptance tests that would exercise these layers without knowing anything at all about the inner working.

I think one of the most important things to keep in mind with unit tests is that it doesn't necessarily matter how the code is implemented today, but rather what happens when the tested code, direct or indirect, is modified in the future.
If you ignore those methods today and they are critical to your application's operation, then someone decides to implement a new balanceProvider at some point down the road or decides that the redirection no longer makes sense, you will most likely have a failure point.
So, if this were my application, I would first look to reduce the forward-only calls to a bare minimum (reducing the code complexity), then introduce a mocking framework that does not rely on string values for method names.

A couple of things to add to the discussion here.
Switch to a better mocking framework immediately and incrementally. We switched from RhinoMock to Moq about 3 years ago. All new tests used Moq, and often when we change a test class we switch it over. But areas of the code that haven't changed much or have huge test casses are still using RhinoMock and that is OK. The code we work with from day to day is much better as a result of making the switch. All test changes can happen in this incremental way.
You are writing too many tests. An important thing to keep in mind in TDD is that you should only write code to satisfy a red test, and you should only write a test to specify some unwritten code. So in your example, three tests is overkill, because at most two are needed to force you to write all of that production code. The exception test does not make you write any new code, so there is no need to write it. I would probably only write this test:
[Test]
public void IsOverdraftDelegatesToBalanceProvider()
{
var result = RandomBool();
providerMock.Setup(p=>p.IsOverdraft()).Returns(result);
Assert.That(myObject.IsOverDraft(), Is.EqualTo(result);
}
Don't create useless layers of indirection. Mostly, unit tests will tell you if you need indirection. Most indirection needs can be solved by the dependency inversion principle, or "couple to abstractions, not concretions". Some layers are needed for other reasons (I make WCF ServiceContract implementations a thin pass through layer. I also don't test that pass through). If you see a useless layer of indirection, 1) make sure it really is useless, then 2) delete it. Code clutter has a huge cost over time. Resharper makes this ridiculously easy and safe.
Also, for meaningful delegation or delegation scenarios you can't get rid of but need to test, something like this makes it a lot easier.

I'd say D) Suck it up, and keep writing those tests ;) and try to see if you can replace NMock with MOQ.
It might not seem necessary and even though it's just delegation now, but the tests are testing that it's calling the right method with right parameters, and the method itself is not doing anything funky before returning values. So it's a good idea to cover them in tests. But to make it easier use MOQ or similiar framework that'll make it so much easier to refactor.

Related

Should invalid cases be in one test? [duplicate]

What Makes a Good Unit Test? says that a test should test only one thing. What is the benefit from that?
Wouldn't it be better to write a bit bigger tests that test bigger block of code? Investigating a test failure is anyway hard and I don't see help to it from smaller tests.
Edit: The word unit is not that important. Let's say I consider the unit a bit bigger. That is not the issue here. The real question is why make a test or more for all methods as few tests that cover many methods is simpler.
An example: A list class. Why should I make separate tests for addition and removal? A one test that first adds then removes sounds simpler.

Testing only one thing will isolate that one thing and prove whether or not it works. That is the idea with unit testing. Nothing wrong with tests that test more than one thing, but that is generally referred to as integration testing. They both have merits, based on context.
To use an example, if your bedside lamp doesn't turn on, and you replace the bulb and switch the extension cord, you don't know which change fixed the issue. Should have done unit testing, and separated your concerns to isolate the problem.
Update: I read this article and linked articles and I gotta say, I'm shook: https://techbeacon.com/app-dev-testing/no-1-unit-testing-best-practice-stop-doing-it
There is substance here and it gets the mental juices flowing. But I reckon that it jibes with the original sentiment that we should be doing the test that context demands. I suppose I'd just append that to say that we need to get closer to knowing for sure the benefits of different testing on a system and less of a cross-your-fingers approach. Measurements/quantifications and all that good stuff.

I'm going to go out on a limb here, and say that the "only test one thing" advice isn't as actually helpful as it's sometimes made out to be.
Sometimes tests take a certain amount of setting up. Sometimes they may even take a certain amount of time to set up (in the real world). Often you can test two actions in one go.
Pro: only have all that setup occur once. Your tests after the first action will prove that the world is how you expect it to be before the second action. Less code, faster test run.
Con: if either action fails, you'll get the same result: the same test will fail. You'll have less information about where the problem is than if you only had a single action in each of two tests.
In reality, I find that the "con" here isn't much of a problem. The stack trace often narrows things down very quickly, and I'm going to make sure I fix the code anyway.
A slightly different "con" here is that it breaks the "write a new test, make it pass, refactor" cycle. I view that as an ideal cycle, but one which doesn't always mirror reality. Sometimes it's simply more pragmatic to add an extra action and check (or possibly just another check to an existing action) in a current test than to create a new one.

Tests that check for more than one thing aren't usually recommended because they are more tightly coupled and brittle. If you change something in the code, it'll take longer to change the test, since there are more things to account for.
[Edit:]
Ok, say this is a sample test method:
[TestMethod]
public void TestSomething() {
// Test condition A
// Test condition B
// Test condition C
// Test condition D
}
If your test for condition A fails, then B, C, and D will appear to fail as well, and won't provide you with any usefulness. What if your code change would have caused C to fail as well? If you had split them out into 4 separate tests, you would know this.

Haaa... unit tests.
Push any "directives" too far and it rapidly becomes unusable.
Single unit test test a single thing is just as good practice as single method does a single task. But IMHO that does not mean a single test can only contain a single assert statement.
Is
#Test
public void checkNullInputFirstArgument(){...}
#Test
public void checkNullInputSecondArgument(){...}
#Test
public void checkOverInputFirstArgument(){...}
...
better than
#Test
public void testLimitConditions(){...}
is question of taste in my opinion rather than good practice. I personally much prefer the latter.
But
#Test
public void doesWork(){...}
is actually what the "directive" wants you to avoid at all cost and what drains my sanity the fastest.
As a final conclusion, group together things that are semantically related and easilly testable together so that a failed test message, by itself, is actually meaningful enough for you to go directly to the code.
Rule of thumb here on a failed test report: if you have to read the test's code first then your test are not structured well enough and need more splitting into smaller tests.
My 2 cents.

Think of building a car. If you were to apply your theory, of just testing big things, then why not make a test to drive the car through a desert. It breaks down. Ok, so tell me what caused the problem. You can't. That's a scenario test.
A functional test may be to turn on the engine. It fails. But that could be because of a number of reasons. You still couldn't tell me exactly what caused the problem. We're getting closer though.
A unit test is more specific, and will firstly identify where the code is broken, but it will also (if doing proper TDD) help architect your code into clear, modular chunks.
Someone mentioned about using the stack trace. Forget it. That's a second resort. Going through the stack trace, or using debug is a pain and can be time consuming. Especially on larger systems, and complex bugs.
Good characteristics of a unit test:
Fast (milliseconds)
Independent. It's not affected by or dependent on other tests
Clear. It shouldn't be bloated, or contain a huge amount of setup.

Using test-driven development, you would write your tests first, then write the code to pass the test. If your tests are focused, this makes writing the code to pass the test easier.
For example, I might have a method that takes a parameter. One of the things I might think of first is, what should happen if the parameter is null? It should throw a ArgumentNull exception (I think). So I write a test that checks to see if that exception is thrown when I pass a null argument. Run the test. Okay, it throws NotImplementedException. I go and fix that by changing the code to throw an ArgumentNull exception. Run my test it passes. Then I think, what happens if it's too small or too big? Ah, that's two tests. I write the too small case first.
The point is I don't think of the behavior of the method all at once. I build it incrementally (and logically) by thinking about what it should do, then implement code and refactoring as I go to make it look pretty (elegant). This is why tests should be small and focused because when you are thinking about the behavior you should develop in small, understandable increments.

Having tests that verify only one thing makes troubleshooting easier. It's not to say you shouldn't also have tests that do test multiple things, or multiple tests that share the same setup/teardown.
Here should be an illustrative example. Let's say that you have a stack class with queries:
getSize
isEmpty
getTop
and methods to mutate the stack
push(anObject)
pop()
Now, consider the following test case for it (I'm using Python like pseudo-code for this example.)
class TestCase():
def setup():
self.stack = new Stack()
def test():
stack.push(1)
stack.push(2)
stack.pop()
assert stack.top() == 1, "top() isn't showing correct object"
assert stack.getSize() == 1, "getSize() call failed"
From this test case, you can determine if something is wrong, but not whether it is isolated to the push() or pop() implementations, or the queries that return values: top() and getSize().
If we add individual test cases for each method and its behavior, things become much easier to diagnose. Also, by doing fresh setup for each test case, we can guarantee that the problem is completely within the methods that the failing test method called.
def test_size():
assert stack.getSize() == 0
assert stack.isEmpty()
def test_push():
self.stack.push(1)
assert stack.top() == 1, "top returns wrong object after push"
assert stack.getSize() == 1, "getSize wrong after push"
def test_pop():
stack.push(1)
stack.pop()
assert stack.getSize() == 0, "getSize wrong after push"
As far as test-driven development is concerned. I personally write larger "functional tests" that end up testing multiple methods at first, and then create unit tests as I start to implement individual pieces.
Another way to look at it is unit tests verify the contract of each individual method, while larger tests verify the contract that the objects and the system as a whole must follow.
I'm still using three method calls in test_push, however both top() and getSize() are queries that are tested by separate test methods.
You could get similar functionality by adding more asserts to the single test, but then later assertion failures would be hidden.

If you are testing more than one thing then it is called an Integration test...not a unit test. You would still run these integration tests in the same testing framework as your unit tests.
Integration tests are generally slower, unit tests are fast because all dependencies are mocked/faked, so no database/web service/slow service calls.
We run our unit tests on commit to source control, and our integration tests only get run in the nightly build.

If you test more than one thing and the first thing you test fails, you will not know if the subsequent things you are testing pass or fail. It is easier to fix when you know everything that will fail.

Smaller unit test make it more clear where the issue is when they fail.

The GLib, but hopefully still useful, answer is that unit = one. If you test more than one thing, then you aren't unit testing.

Regarding your example: If you are testing add and remove in the same unit test, how do you verify that the item was ever added to your list? That is why you need to add and verify that it was added in one test.
Or to use the lamp example: If you want to test your lamp and all you do is turn the switch on and then off, how do you know the lamp ever turned on? You must take the step in between to look at the lamp and verify that it is on. Then you can turn it off and verify that it turned off.

I support the idea that unit tests should only test one thing. I also stray from it quite a bit. Today I had a test where expensive setup seemed to be forcing me to make more than one assertion per test.
namespace Tests.Integration
{
[TestFixture]
public class FeeMessageTest
{
[Test]
public void ShouldHaveCorrectValues
{
var fees = CallSlowRunningFeeService();
Assert.AreEqual(6.50m, fees.ConvenienceFee);
Assert.AreEqual(2.95m, fees.CreditCardFee);
Assert.AreEqual(59.95m, fees.ChangeFee);
}
}
}
At the same time, I really wanted to see all my assertions that failed, not just the first one. I was expecting them all to fail, and I needed to know what amounts I was really getting back. But, a standard [SetUp] with each test divided would cause 3 calls to the slow service. Suddenly I remembered an article suggesting that using "unconventional" test constructs is where half the benefit of unit testing is hidden. (I think it was a Jeremy Miller post, but can't find it now.) Suddenly [TestFixtureSetUp] popped to mind, and I realized I could make a single service call but still have separate, expressive test methods.
namespace Tests.Integration
{
[TestFixture]
public class FeeMessageTest
{
Fees fees;
[TestFixtureSetUp]
public void FetchFeesMessageFromService()
{
fees = CallSlowRunningFeeService();
}
[Test]
public void ShouldHaveCorrectConvenienceFee()
{
Assert.AreEqual(6.50m, fees.ConvenienceFee);
}
[Test]
public void ShouldHaveCorrectCreditCardFee()
{
Assert.AreEqual(2.95m, fees.CreditCardFee);
}
[Test]
public void ShouldHaveCorrectChangeFee()
{
Assert.AreEqual(59.95m, fees.ChangeFee);
}
}
}
There is more code in this test, but it provides much more value by showing me all the values that don't match expectations at once.
A colleague also pointed out that this is a bit like Scott Bellware's specunit.net: http://code.google.com/p/specunit-net/

Another practical disadvantage of very granular unit testing is that it breaks the DRY principle. I have worked on projects where the rule was that each public method of a class had to have a unit test (a [TestMethod]). Obviously this added some overhead every time you created a public method but the real problem was that it added some "friction" to refactoring.
It's similar to method level documentation, it's nice to have but it's another thing that has to be maintained and it makes changing a method signature or name a little more cumbersome and slows down "floss refactoring" (as described in "Refactoring Tools: Fitness for Purpose" by Emerson Murphy-Hill and Andrew P. Black. PDF, 1.3 MB).
Like most things in design, there is a trade-off that the phrase "a test should test only one thing" doesn't capture.

When a test fails, there are three options:
The implementation is broken and should be fixed.
The test is broken and should be fixed.
The test is not anymore needed and should be removed.
Fine-grained tests with descriptive names help the reader to know why the test was written, which in turn makes it easier to know which of the above options to choose. The name of the test should describe the behaviour which is being specified by the test - and only one behaviour per test - so that just by reading the names of the tests the reader will know what the system does. See this article for more information.
On the other hand, if one test is doing lots of different things and it has a non-descriptive name (such as tests named after methods in the implementation), then it will be very hard to find out the motivation behind the test, and it will be hard to know when and how to change the test.
Here is what a it can look like (with GoSpec), when each test tests only one thing:
func StackSpec(c gospec.Context) {
stack := NewStack()
c.Specify("An empty stack", func() {
c.Specify("is empty", func() {
c.Then(stack).Should.Be(stack.Empty())
})
c.Specify("After a push, the stack is no longer empty", func() {
stack.Push("foo")
c.Then(stack).ShouldNot.Be(stack.Empty())
})
})
c.Specify("When objects have been pushed onto a stack", func() {
stack.Push("one")
stack.Push("two")
c.Specify("the object pushed last is popped first", func() {
x := stack.Pop()
c.Then(x).Should.Equal("two")
})
c.Specify("the object pushed first is popped last", func() {
stack.Pop()
x := stack.Pop()
c.Then(x).Should.Equal("one")
})
c.Specify("After popping all objects, the stack is empty", func() {
stack.Pop()
stack.Pop()
c.Then(stack).Should.Be(stack.Empty())
})
})
}

The real question is why make a test or more for all methods as few tests that cover many methods is simpler.
Well, so that when some test fails you know which method fails.
When you have to repair a non-functioning car, it is easier when you know which part of the engine is failing.
An example: A list class. Why should I make separate tests for addition and removal? A one test that first adds then removes sounds simpler.
Let's suppose that the addition method is broken and does not add, and that the removal method is broken and does not remove. Your test would check that the list, after addition and removal, has the same size as initially. Your test would be in success. Although both of your methods would be broken.

Disclaimer: This is an answer highly influenced by the book "xUnit Test Patterns".
Testing only one thing at each test is one of the most basic principles that provides the following benefits:
Defect Localization: If a test fails, you immediately know why it failed (ideally without further troubleshooting, if you've done a good job with the assertions used).
Test as a specification: the tests are not only there as a safety net, but can easily be used as specification/documentation. For instance, a developer should be able to read the unit tests of a single component and understand the API/contract of it, without needing to read the implementation (leveraging the benefit of encapsulation).
Infeasibility of TDD: TDD is based on having small-sized chunks of functionality and completing progressive iterations of (write failing test, write code, verify test succeeds). This process get highly disrupted if a test has to verify multiple things.
Lack of side-effects: Somewhat related to the first one, but when a test verifies multiple things, it's more possible that it will be tied to other tests as well. So, these tests might need to have a shared test fixture, which means that one will be affected by the other one. So, eventually you might have a test failing, but in reality another test is the one that caused the failure, e.g. by changing the fixture data.
I can only see a single reason why you might benefit from having a test that verifies multiple things, but this should be seen as a code smell actually:
Performance optimisation: There are some cases, where your tests are not running only in memory, but are also dependent in persistent storage (e.g. databases). In some of these cases, having a test verify multiple things might help in decreasing the number of disk accesses, thus decreasing the execution time. However, unit tests should ideally be executable only in memory, so if you stumble upon such a case, you should re-consider whether you are going in the wrong path. All persistent dependencies should be replaced with mock objects in unit tests. End-to-end functionality should be covered by a different suite of integration tests. In this way, you do not need to care about execution time anymore, since integration tests are usually executed by build pipelines and not by developers, so a slightly higher execution time has almost no impact to the efficiency of the software development lifecycle.

Use virtual protected factory methods to mock objects created within a class?

Rhino Mocks is tightly coupled with the design pattern of using dependency injection and constructor injection, but I typically don't follow the dependency-injection paradigm and don't like to re-architect my solution just for my test tool.
Take this scenario:
class MyClass{
public void MyMethod(...){
var x = new Something(...);
x.A();
x.B();
x.C();
}
}
Would it be quite typical and acceptable to instead do the following, since this is not a case where I would generally wish to inject the dependency - it can be considered part of MyClass' behaviour/logic.
class MyClass{
public void MyMethod(...){
var x = NewSomething(...);
x.A();
x.B();
x.C();
}
virtual protected Something NewSomething(...){
return new Something(...);
}
}
Now I can (I think) extend MyClass either as a concrete class in my test project, or using Rhino... right? Is this a)correct b)a reasonably sensible, commonplace way of doing things?
Another approach I can see other than DI could be that I actually have a ClassFactory class in my project which creates all instances as needed; then I find a way to mock/stub that in my tests. But this seems 'smelly' to me, though I'm aware it is a pattern some people use.

I have used this quite a few times when trying to make legacy code more testable, although it really likes to come back and bite you later.
Basically, mocks/fakes/testdoubles are your enemy. You should hate them and avoid them (Edit note: I'm not saying don't use them, I'm saying use them only when you HAVE to). It all follows from the paradigm that all code is bad code, and we should write as little code as necessary to complete the task. Having a bunch of test doubles overriding virtual methods makes your code very rigid. It makes it really painful to change a method signature, even if your production code only invokes the method in a single place, because your test doubles will also break. It also makes it painful to later on clean up the mess and actually inject the dependency (and yes, I would argue injecting stuff is Objectively Better(tm)).
What it comes down to is basically: Yes, doing this will make your code more testable, but without any of the benefits you usually get by testing. You wn't get better design, you'll have rigid code etc.
I won't really point any fingers though, since I like I said have used this on occasion just to get a test up to see if something works. It can be a temporary solution that is "good enough", but my final answer is "probably don't if you can at all avoid it (and still have tested code)".

You would be sacrificing the Single Responsability Pattern (SRP) with the method that you're suggesting, and arguably making your code harder to read, understand and maintain, there's a smell right there, before we even reach the talking about tests.
If you plan to run tests, and at the same time not follow SOLID principles , then where are you saving on time, readability, or agility? (I honestly would love to know).
What the DI principle allows you to do is to run your tests completely in isolation from your dependencies, which is exactly what you want to be doing.
The SOLID principles of OOP have many good arguments going for them, but I'm always open to learn and be wrong, but I must say, I may have been blinded by several years of SOLID code, and tens of thousands of green unit tests securing my projects.

Recommendation for testing a long, complex method

I've got a fairly long and intricate C# method - just shy of 200 lines - that I'm trying to figure out how to test effectively. I've already got about 50 unit tests for this particular method, but I'm not satisfied with them, for two reasons: (1) Experience has shown that they've missed some problematic scenarios, and (2) the tests are complicated enough that I'm having trouble confirming that they're actually testing what I want them to test.
The strategy that I'm adopting to ameliorate this problem is to refactor the method into half a dozen smaller methods, which should individually be easier to test. So far, so good - nothing unusual about this.
But I'm worried about the fact that these new methods - which I should normally make private, as I can't foresee them being used by any other production classes - either (a) need to be public, so that they can be tested, or (b) if I leave them private, I need to jump through weird reflection-style hoops to test them. Since the class in question isn't intended for external consumption, I'm not horribly worried about exposing these ostensibly private methods as public, but it still strikes me as having a weird code smell that I'd prefer to avoid.
What have other folks done in similar scenarios? What sort of strategies should I be adopting to help with this?

Spliiting the method up is a good start.
You don't need to make them public. Make the methods internal and use the InternalsVisibleTo-Attribute to grant your unit test assembly access to them.
If you have a Visual Studio version that supports it, use the "Analyze code coverage" feature to check if you have tested every line.

Why do we need mocking frameworks?

I have worked with code which had NUnit test written. But, I have never worked with mocking frameworks. What are they? I understand dependency injection and how it helps to improve the testability. I mean all dependencies can be mocked while unit testing. But, then why do we need mocking frameworks? Can't we simply create mock objects and provide dependencies. Am I missing something here?
Thanks.

It makes mocking easier
They usually
allow you to express testable
assertions that refer to the
interaction between objects.
Here you have an example:
var extension = MockRepository
.GenerateMock<IContextExtension<StandardContext>>();
var ctx = new StandardContext();
ctx.AddExtension(extension);
extension.AssertWasCalled(
e=>e.Attach(null),
o=>o.Constraints(Is.Equal(ctx)));
You can see that I explicitly test that the Attach method of the IContextExtension was called and that the input parameter was said context object. It would make my test fail if that did not happen.

You can create mock objects by hand and use them during testing using Dependency Injection frameworks...but letting a mocking framework generate your mock objects for you saves time.
As always, if using the framework adds too much complexity to be useful then don't use it.

Sometimes when working with third-party libraries, or even working with some aspects of the .NET framework, it is extremely difficult to write tests for some situations - for example, an HttpContext, or a Sharepoint object. Creating mock objects for those can become very cumbersome, so mocking frameworks take care of the basics so we can spend our time focusing on what makes our applications unique.

Using a mocking framework can be a much more lightweight and simple solution to provide mocks than actually creating a mock object for every object you want to mock.
For example, mocking frameworks are especially useful to do things like verify that a call was made (or even how many times that call was made). Making your own mock objects to check behaviors like this (while mocking behavior is a topic in itself) is tedious, and yet another place for you to introduce a bug.
Check out Rhino Mocks for an example of how powerful a mocking framework can be.

Mock objects take the place of any large/complex/external objects your code needs access to in order to run.
They are beneficial for a few reasons:
Your tests are meant to run fast and easily. If your code depends on, say, a database connection then you would need to have a fully configured and populated database running in order to run your tests. This can get annoying, so you create a replace - a "mock" - of the database connection object that just simulates the database.
You can control exactly what output comes out of the Mock objects and can therefore use them as controllable data sources to your tests.
You can create the mock before you create the real object in order to refine its interface. This is useful in Test-driven Development.

The only reason to use a mocking library is that it makes mocking easier.
Sure, you can do it all without the library, and that is fine if it's simple, but as soon as they start getting complicated, libraries are much easier.
Think of this in terms of sorting algorithms, sure anyone can write one, but why? If the code already exists and is simple to call... why not use it?

You certainly can mock your dependencies manually, but with a framework it takes a lot of the tedious work away. Also the assertions usually available make it worth it to learn.

Mocking frameworks allow you to isolate units of code that you wish to test from that code's dependencies. They also allow you to simulate various behaviors of your code's dependencies in a test environment that might be difficult to setup or reproduce otherwise.
For example if I have a class A containing business rules and logic that I wish to test, but this class A depends on a data-access classes, other business classes, even u/i classes, etc., these other classes can be mocked to perform in a certain manner (or in no manner at all in the case of loose mock behavior) to test the logic within your class A based on every imaginable way that these other classes could conceivably behave in a production environment.
To give a deeper example, suppose that your class A invokes a method on a data access class such as
public bool IsOrderOnHold(int orderNumber) {}
then a mock of that data access class could be setup to return true every time or to return false every time, to test how your class A responds to such circumstances.

I'd claim you don't. Writing test doubles isn't a large chore in 9 times out of 10. Most of the time it's done almost entirely automatically by just asking resharper to implement an interface for you and then you just add the minor detail needed for this double (because you aren't doing a bunch of logic and creating these intricate super test doubles, right? Right?)
"But why would I want my test project bloated with a bunch of test doubles" you may ask. Well you shouldn't. the DRY principle holds for tests as well. Create GOOD test doubles that are reusable and have descriptive names. This makes your tests more readable too.
One thing it DOES make harder is to over-use test doubles. I tend to agree with Roy Osherove and Uncle Bob, you really don't want to create a mock object with some special configuration all that often. This is in itself a design smell. Using a framework it's soooo easy to just use test doubles with intricate logic in just about every test and in the end you find that you haven't really tested your production code, you have merely tested the god-awful frankenstein's monsteresque mess of mocks containing mocks containing more mocks. You'll never "accidentally" do this if you write your own doubles.
Of course, someone will point out that there are times when you "have" to use a framework, not doing so would be plain stupid. Sure, there are cases like that. But you probably don't have that case. Most people don't, and only for a small part of the code, or the code itself is really bad.
I'd recommend anyone (ESPECIALLY a beginner) to stay away from frameworks and learn how to get by without them, and then later when they feel that they really have to they can use whatever framework they think is the most suitable, but by then it'll be an informed desicion and they'll be far less likely to abuse the framework to create bad code.

Well mocking frameworks make my life much easier and less tedious so I can spend time on actually writing code. Take for instance Mockito (in the Java world)
//mock creation
List mockedList = mock(List.class);
//using mock object
mockedList.add("one");
mockedList.clear();
//verification
verify(mockedList).add("one");
verify(mockedList).clear();
//stubbing using built-in anyInt() argument matcher
when(mockedList.get(anyInt())).thenReturn("element");
//stubbing using hamcrest (let's say isValid() returns your own hamcrest matcher):
when(mockedList.contains(argThat(isValid()))).thenReturn("element");
//following prints "element"
System.out.println(mockedList.get(999));
Though this is a contrived example if you replace List.class with MyComplex.class then the value of having a mocking framework becomes evident. You could write your own or do without but why would you want to go that route.

I first grok'd why I needed a mocking framework when I compared writing test doubles by hand for a set of unit tests (each test needed slightly different behaviour so I was creating subclasses of a base fake type for each test) with using something like RhinoMocks or Moq to do the same work.
Simply put it was much faster to use a framework to generate all of the fake objects I needed rather than writing (and debugging) my own fakes by hand.

How to unit test code that is highly complex behind the public interface

I'm wondering how I should be testing this sort of functionality via NUnit.
Public void HighlyComplexCalculationOnAListOfHairyObjects()
{
// calls 19 private methods totalling ~1000 lines code + comments + whitespace
}
From reading I see that NUnit isn't designed to test private methods for philosophical reasons about what unit testing should be; but trying to create a set of test data that fully executed all the functionality involved in the computation would be nearly impossible. Meanwhile the calculation is broken down into a number of smaller methods that are reasonably discrete. They are not however things that make logical sense to be done independently of each other so they're all set as private.

You've conflated two things. The Interface (which might expose very little) and this particular Implementation class, which might expose a lot more.
Define the narrowest possible Interface.
Define the Implementation class with testable (non-private) methods and attributes. It's okay if the class has "extra" stuff.
All applications should use the Interface, and -- consequently -- don't have type-safe access to the exposed features of the class.
What if "someone" bypasses the Interface and uses the Class directly? They are sociopaths -- you can safely ignore them. Don't provide them phone support because they violated the fundamental rule of using the Interface not the Implementation.

To solve your immediate problem, you may want to take a look at Pex, which is a tool from Microsoft Research that addresses this type of problem by finding all relevant boundary values so that all code paths can be executed.
That said, had you used Test-Driven Development (TDD), you would never had found yourself in that situation, since it would have been near-impossible to write unit tests that drives this kind of API.
A method like the one you describe sounds like it tries to do too many things at once. One of the key benefits of TDD is that it drives you to implement your code from small, composable objects instead of big classes with inflexible interfaces.

As mentioned, InternalsVisibleTo("AssemblyName") is a good place to start when testing legacy code.
Internal methods are still private in the sense that assemblys outside of the current assembly cannot see the methods. Check MSDN for more infomation.
Another thing would be to refactor the large method into smaller, more defined classes. Check this question I asked about a similiar problem, testing large methods.

Personally I'd make the constituent methods internal, apply InternalsVisibleTo and test the different bits.
White-box unit testing can certainly still be effective - although it's generally more brittle than black-box testing (i.e. you're more likely to have to change the tests if you change the implementation).

HighlyComplexCalculationOnAListOfHairyObjects() is a code smell, an indication that the class that contains it is potentially doing too much and should be refactored via Extract Class. The methods of this new class would be public, and therefore testable as units.
One issue to such a refactoring is that the original class held a lot of state that the new class would need. Which is another code smell, one that indicates that state should be moved into a value object.

I've seen (and probably written) many such hair objects. If it's hard to test, it's usually a good candidate for refactoring. Of course, one problem with that is that the first step to refactoring is making sure it passes all tests first.
Honestly, though, I'd look to see if there isn't some way you can break that code down into a more manageable section.

Get the book Working Effectively with Legacy Code by Michael Feathers. I'm about a third of the way through it, and it has multiple techniques for dealing with these types of problems.

Your question implies that there are many paths of execution throughout the subsystem. The first idea that pops into mind is "refactor." Even if your API remains a one-method interface, testing shouldn't be "impossible".

trying to create a set of test data
that fully executed all the
functionality involved in the
computation would be nearly impossible
If that's true, try a less ambitious goal. Start by testing specific, high-usage paths through the code, paths that you suspect may be fragile, and paths for which you've had reported bugs.
Refactoring the method into separate sub-algorithms will make your code more testable (and might be beneficial in other ways), but if your problem is a ridiculous number of interactions between those sub-algorithms, extract method (or extract to strategy class) won't really solve it: you'll have to build up a solid suite of tests one at a time.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.