Say, I have the following test:
[Test]
public void MyTest( [RandomNumbers( Count=100, Minimum=0, Maximum=1000 )] int number )
{
...
}
And at some point during my regular build process, it failed. I got an e-mail notification and set off to investigate.
Now, when I open the test in Visual Studio and click "Run Tests", it passes. I do it again, and it passes again. And again. And again. Obviously, the failure was related to that particular sequence of random numbers.
So the question is: How do I re-run this test with that exact sequence?
(provided I have the full Gallio report)
UPDATE:
(following a comment about it being a bad idea)
First, I'm not actually asking whether it's a good idea. The question is different.
Second, when the system being tested is complex enough, and the input data space is of multiple independent dimensions, properly breaking that space into equivalency regions presents a significant challenge in both mental effort and time, which is just not worth it, provided smaller components of the system have already been tested on their own. At the same time, if I can just poke the system here and there, why not do so?
Third, I am actually not a newbie in this area. I have always used this technique with other test frameworks (such as csUnit and NUnit), and it proved very successful in catching subtle bugs. At the time, there was no such concept as generated data, so we used our own custom crutches, in the form of System.Random with a predetermined seed. That seed was generated as part of fixture initialization (usually based on the current time) and carefully written to the log. This way, when the test failed, I could take the seed from the log, plug it into the test fixture, and get exactly the same set of test data, and thus exactly the same failure to debug.
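For reference, that crutch looked roughly like this (a sketch with NUnit 2-style attributes; the logging and seed handling are the point, not the exact API):
using System;
using NUnit.Framework;

[TestFixture]
public class MyFixture
{
    private Random random;

    [TestFixtureSetUp]
    public void Init()
    {
        // Derive the seed from the current time and write it to the log,
        // so a failing run can be replayed later with the same seed.
        int seed = (int)DateTime.Now.Ticks;
        Console.WriteLine("Random seed: {0}", seed);
        random = new Random(seed);
        // To reproduce a failure, replace the line above with:
        // random = new Random(seedTakenFromTheLog);
    }

    [Test]
    public void MyTest()
    {
        int number = random.Next(0, 1000);
        // ... exercise the system with 'number'
    }
}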
And fourth, if it is such a bad idea, why does the RandomNumbers factory exist in the first place?
There is currently no built-in way in Gallio/MbUnit to generate the same sequence of random numbers again. But I think this could be a useful feature, and I have opened an issue for that request. I'm going to update this answer when it's ready.
What I propose is the following:
Display the actual seed of the inner random generator as an annotation in the test report.
Expose a Seed property on the [RandomNumbers] and [RandomStrings] attributes, and on the fluent data generators as well.
That way you could easily re-generate the exact same sequence of values by feeding the generator the same seed number.
UPDATE: This feature is now available in Gallio v3.3.8 and later.
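Assuming the Seed property ships as proposed above, replaying the failing sequence would then look something like this (the value 123456 is illustrative; take the actual seed from the annotation in the report):
[Test]
public void MyTest( [RandomNumbers( Count=100, Minimum=0, Maximum=1000, Seed=123456 )] int number )
{
    // With the same seed, the generator yields the same 100 values,
    // so the failure recorded in the report can be reproduced.
}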
Now, we all agree with what Péter said: using random numbers as input for unit tests is rarely a good idea. The corollary is that it is sometimes convenient and perfectly appropriate. And that is exactly why we decided to implement that feature in MbUnit. IMHO, a common scenario that fits well with random test input is stochastic analysis of hash code computations.
Related
We are starting with unit tests now to raise our code quality a lot and (of course) for various other reasons.
Sadly we are very late to the party, so we have classes with ~50 methods which are completely untested until now. And we have a lot of classes!
Is it possible / how is it possible to:
a) see which methods are untested?
b) force a developer who adds a method to a class to write a unit test for it (e.g. I manage to test all 50 methods and tomorrow someone adds a 51st method)?
c) raise a warning or a error when a method is untested?
a) see which methods are untested?
This is fairly easy with a Code Coverage tool. There's one built into Visual Studio, but there are other options as well.
b) force a developer who adds a method to a class to write a unit test for it (e.g. I manage to test all 50 methods and tomorrow someone adds a 51st method)?
Don't make hard rules based on coverage!
All experience (including mine) shows that if you set hard code coverage targets, the only result you'll get is that developers will begin to game the system. The result will be worse code.
Here's an example; consider this Reverse method:
public static string Reverse(string s)
{
char[] charArray = s.ToCharArray();
Array.Reverse(charArray);
return new string(charArray);
}
You can write a single test to get 100% coverage of this method. However, as given here, the implementation is brittle: what happens if s is null? The method will throw a NullReferenceException.
A better implementation would be to check s for null, and add a branch that handles that case, but unless you write a test for it, code coverage will drop.
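For illustration, the guarded version plus the extra test might look like this (a sketch; the test uses NUnit's Assert.Throws):
public static string Reverse(string s)
{
    // Explicit guard instead of an accidental NullReferenceException.
    if (s == null)
        throw new ArgumentNullException(nameof(s));

    char[] charArray = s.ToCharArray();
    Array.Reverse(charArray);
    return new string(charArray);
}

[Test]
public void ReverseThrowsOnNullInput()
{
    Assert.Throws<ArgumentNullException>(() => Reverse(null));
}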
If you have developers who don't want to write tests, but you demand a hard code coverage target, such developers will leave the Reverse function as is, instead of improving it.
I've seen such gaming of the system in the wild. It will happen if you institute a rule without buy-in from developers.
You'll need to teach developers how their jobs could become easier if they add good unit tests.
Once developers understand this, they'll add tests themselves, and you don't need any rules.
As long as developers don't understand this, the rule will only hurt you even more, because the 'tests' you'll get out of it will be written without understanding.
c) raise a warning or a error when a method is untested?
As explained above, code coverage is a poor tool if used for an absolute metric. On the other hand, it can be useful if you start monitoring trends.
In general, code coverage should increase until it reaches some plateau where it's impractical to increase it more.
Sometimes, it may even drop, which can be okay, as long as there's a good reason for it. Monitoring coverage trends will tell you when the coverage drops, and then you can investigate. Sometimes, there's a good reason (such as when someone adds a Humble Object), but sometimes, there isn't, and you'll need to talk to the developer(s) responsible for the drop.
To see which methods are untested, there are code coverage tools (either built-in or external, like NCrunch or dotCover). Which one fits depends on your needs regarding result formats, reports, etc.
Those tools can be run either within VS or, for example, during the build on a build server (like TeamCity or Bamboo). The build can be set to fail when code coverage drops (which partly answers your question c)).
In my opinion you shouldn't force developers to write unit tests with some plugin. It should be part of your process, and it can be raised during code review. What's more, how would you know when to force someone to write a test? I mean, you don't know when development of a piece of code is finished, so it is hard to tell someone "write your test right now".
EDIT:
If you want a free code coverage tool for TeamCity, the one I know is OpenCover - you can easily find information about plugin installation and usage.
That's funny: your section b) ("force a developer who adds...") was a question I posted just a few minutes ago, and I got so many downvotes I removed the question, because apparently it's not the right approach to force someone to write tests; it should be a "culture" you set in your organization.
So to your question (which is a bit wide, but still can get some direction):
a. You can use existing code coverage tools/libraries, or implement your own, to achieve this functionality; this is a good post which mentions a few:
What can I use for good quality Code Coverage for C#/.NET?
You can build customization on top of such libraries to help with section c (raise a warning or an error when a method is untested) as well.
b. You should try implementing TDD if you can, and write the tests before you write the method; this way you won't have to force anything.
If you can't do that, because the method cannot be changed due to time and resource considerations, do TED (Test Eventually Developed).
True, it won't be a unit test, but you'll still have a test which has value.
I am looking to automatically generate unit tests in MonoDevelop/.Net.
I've tried NUnit, but it doesn't generate the tests. In eclipse, the plug-in randoop does this, however it targets Java and JUnit.
How can I automatically generate unit tests in MonoDevelop and/or for .Net? Or perhaps there is an existing tool out there I am unaware of...
Calling methods with different (random) input is just one part of the process. You also need to define the correct result for each input, and I don't think a tool can do that for you.
randoop only seems to check a few very basic properties around equals, which is not of great use IMO and might also lead to a false impression of correctness ("Hey look, all tests pass, my software is OK" ...)
Also, just randomly generating code (and input) carries the risk of non-deterministic test results. You might or might not get tests that really find flaws in your code.
That said, a quick googling gave the following starting points for approaches you might want to take:
You might be interested in using test case generators (this CodeProject article describes a very simple one). They support you in generating the "boilerplate" code and can make sure you don't miss any classes/methods you want to test. Of course, the generated tests need to be adapted by defining proper (i.e. meaningful) input and (correct) output values. Googling for "NUnit test generators" will give you other links, including commercial software, which I don't want to repeat here ...
NUnit (and other testing frameworks) supports parameterized tests: these can be used to test a whole class of input scenarios. For NUnit, I found the Random attribute, which lets you generate random input (in a certain range) for your methods. Remember what I wrote above about random test inputs: the results of these tests will not be reproducible, which renders them useless for automated or regression testing.
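For example, a minimal sketch using NUnit's Random attribute (min, max, count):
[Test]
public void Abs_IsNeverNegative([Random(-1000, 1000, 10)] int value)
{
    // Ten random values in the given range are generated per test run,
    // which is exactly why a failure may not reproduce on the next run.
    Assert.GreaterOrEqual(Math.Abs(value), 0);
}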
That said, also look at this question (and certainly others on SO), which may support my argument against automatic unit test generation.
I have a bit of a paradox.
I'm trying to use TDD to build tests for my password hashing methods before I build the implementation. But I don't know how to come up with the expected values beforehand without first building the implementation.
Of course, with simple hashing implementation, I can probably find a site to create the expected values based on the known password/salt.
I'm betting the solution is to make an exception for TDD and forgo building my tests first. Rather, build my implementation to come up with the proper salt/hash values, then build my tests against those values to prevent regression.
But I thought I would post this to see if there's a solution I'm not thinking of.
Or, maybe there's someone out there that can generate hashes in their head in order to build the tests first.
I don't know whether you are writing your own hashing function, like SHA-1 (in that case, just don't do it), or whether you are using external hash and random functions to generate the salt, etc. In the second case you don't have to know your output: you can simply mock your hash and random providers and check that they are called with your input and/or partial results.
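A rough sketch of that idea, assuming a hypothetical PasswordHasher that takes its hash and salt providers through interfaces (Moq is used for the mocks; all type names here are illustrative, not from the question):
using Moq;
using NUnit.Framework;

public interface IHashProvider { byte[] ComputeHash(byte[] data); }
public interface ISaltGenerator { byte[] GenerateSalt(); }

[TestFixture]
public class PasswordHasherTests
{
    [Test]
    public void Hash_UsesSaltGeneratorAndHashProvider()
    {
        var saltGenerator = new Mock<ISaltGenerator>();
        saltGenerator.Setup(g => g.GenerateSalt()).Returns(new byte[] { 1, 2, 3 });

        var hashProvider = new Mock<IHashProvider>();
        hashProvider.Setup(h => h.ComputeHash(It.IsAny<byte[]>())).Returns(new byte[] { 42 });

        var hasher = new PasswordHasher(saltGenerator.Object, hashProvider.Object);  // hypothetical class under test
        hasher.Hash("secret");

        // Verify the collaborators were called, without knowing the actual hash value.
        saltGenerator.Verify(g => g.GenerateSalt(), Times.Once());
        hashProvider.Verify(h => h.ComputeHash(It.IsAny<byte[]>()), Times.Once());
    }
}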
So, basically, you should know the "expected" output when you are doing TDD. In your case the expected output is the result of the hashing function.
If you are implementing a known hashing algorithm, it's not a problem to take the expected results from their sites (published test vectors) or to produce them manually.
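For example, a known-answer test against a published vector (the FIPS 180 vector for SHA-256 of "abc", computed here with the BCL's SHA256 class) could look like this:
using System;
using System.Security.Cryptography;
using System.Text;
using NUnit.Framework;

[TestFixture]
public class HashVectorTests
{
    [Test]
    public void Sha256_MatchesPublishedTestVector()
    {
        // Published test vector: SHA-256("abc")
        const string expected = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";

        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.ASCII.GetBytes("abc"));
            string actual = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            Assert.AreEqual(expected, actual);
        }
    }
}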
In case you are developing your own algorithm, you should probably get the expected output by implementing a prototype of the algorithm first. Even if you don't know whether the values are right, you make the assumption that they are and use them in the tests. If the implementation of the hash function later changes, those tests become red.
Well, to start, I'll tell you what I have to do.
I have to make a program so students can upload some C++ code for an exercise. The uploaded code needs to be compared with the "best code" for that exercise, and from that comparison the server gives back some feedback on whether the student uploaded good or bad code.
E.g.: the exercise is to make an arraylist from 1 to 10, and the student uploads his code. The server then compares it with some other code and gives feedback.
This is easier said than done, because it can't be just a file comparison, given the different variable names a user can write. That's why I was thinking of using external compilers to get some output and compare this output with the output of the "best code". Or, in more detail, to get a hook into the compiler so I can check every method and every variable.
Or is there any other idea how I can check or compare this?
Or is there already a program that exists for this?
Many thanks,
Michael
There are systems that assess the output of the whole program (like ejudge, which is used as a contest system), and, in my opinion, it's easier to use them, because specifying conditions on the program code itself is not a trivial task (if your students are writing non-trivial programs).
You can use formal specification languages like ACSL to set input and output conditions and prove that the program works correctly.
It might be more practical to evaluate the code by testing its behaviour against some automated tests, like a boundary test suite (or even mutation testing if you want things to get exciting), and then, to spot odd implementations or interesting architectures, to look at code metrics like the number of functions, lines of code, compile size, etc.
This approach would be a lot more scalable, and there are a lot of free tools, especially for C++ and Java, that make it easy to set up an automated test system.
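A very small harness along those lines might look like this (a sketch in C#; the executable paths are placeholders for the compiled reference and student submissions):
using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public class SubmissionOutputTests
{
    // Runs a compiled submission with the given stdin and captures its stdout.
    private static string Run(string exePath, string input)
    {
        var psi = new ProcessStartInfo(exePath)
        {
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (var process = Process.Start(psi))
        {
            process.StandardInput.Write(input);
            process.StandardInput.Close();
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }

    [TestCase("")]           // boundary case: no input
    [TestCase("1 2 3 4 5")]  // typical case
    public void StudentOutput_MatchesReferenceOutput(string input)
    {
        string expected = Run("reference.exe", input);  // path to the "best code" binary (placeholder)
        string actual = Run("student.exe", input);      // path to the uploaded submission (placeholder)
        Assert.AreEqual(expected, actual);
    }
}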
Comparing code to determine correctness is not necessarily a correct approach, depending on how it is done, and would be very difficult to scale up.
When using Assert(...), if a logical test fails, the unit test is aborted and the rest of the unit test isn't run. Is there a way to have the logical test fail but just produce a warning or something, and still run the rest of the unit test?
An example of the context: I have a test that creates some students, teachers and classes, creates relationships, then places them into a database. Then some SSIS packages are run on this database, which take the existing data and convert it into another database schema in another database. The test then needs to check the new database for certain things, like the correct number of rows, actions, etc.
Obviously other tests cover deletes and modifications, but all of them follow the same structure: create data in the source DB, run the SSIS packages, verify the data in the target DB.
It sounds like you are attempting to test too many things in a single test.
If a precondition isn't met, then presumably the rest of the test will not pass either. I'd prefer to end the test as soon as I know things aren't what I expect.
The concepts of unit testing are red = fail, green = pass. I know MSTest also allows for a yellow, but it isn't going to do what you want: you can do an Assert.Inconclusive to get a yellow light. I have used this when I worked on a code base that had a lot of integration tests relying on specific database data. Rather than have the tests fail, I started having the results be inconclusive. The code might have worked just fine, but the data was missing, and there was no reason to believe the data would always be there (they were not good tests, IMO).
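A minimal sketch of that approach (TestDatabase.HasSeedData is a hypothetical guard standing in for whatever data check you need):
[Test]
public void OrdersReport_MatchesExpectedTotals()
{
    if (!TestDatabase.HasSeedData())
    {
        // Yellow instead of red: the code wasn't proven wrong, the data just wasn't there.
        Assert.Inconclusive("Required seed data is missing; skipping instead of failing.");
    }

    // ... real assertions run only when the data is present
}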
If you are using Gallio/MbUnit, you can use Assert.Multiple to achieve what you want. It captures the failing assertions but does not stop the execution of the test immediately. All the failing assertions are collected and reported later at the end of the test.
[Test]
public void MultipleAssertSample()
{
Assert.Multiple(() =>
{
Assert.Fail("Boum!");
Assert.Fail("Paf!");
Assert.Fail("Crash!");
});
}
The test in the example above obviously fails, but what's interesting is that all 3 failures are shown in the test report. Execution does not stop at the first failure.
I know your question was asked several years ago, but recently (around 2017 or 2018) NUnit 3 added support for warnings. You can express a boolean check as a warning much as you would with Assert.Fail, but instead of that single line failing the whole test, the test runner will log the warning and continue with the test.
Read about it here: https://docs.nunit.org/articles/nunit/writing-tests/Warnings.html
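A minimal sketch based on that documentation:
[Test]
public void WarningsExample()
{
    int retries = 3;  // illustrative value

    // Logs a warning if the condition is false, but execution continues.
    Warn.Unless(retries <= 1, "Service needed several retries");

    Assert.Warn("Just a note that ends up in the test results");  // unconditional warning

    Assert.AreEqual(3, retries);  // an ordinary assert still fails the test on mismatch
}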
It behaves similarly to Multiple (described by @Yann Trevin above; Multiple is also available in NUnit 3.0). The nice difference, though, shows up in integration tests, where the flexibility of stand-alone warning asserts shines. Contrast that with a group of asserts inside a Multiple block: once the Multiple block has completed with failures, the test does not continue.
Integration tests, especially ones that can run for hours to test how well a bunch of microservices play together, are expensive to re-run. Also, if you happen to have multiple teams (external, internal, outurnal, infernal, and eternal) and timezones committing code virtually all the time, it may be challenging to get a new product to run a start-to-end integration test all the way to the end of its workflow once all the pieces are hosted together. (Note: it's important to assemble teams with ample domain knowledge and at least enough software engineering knowledge to put together solid "contracts" for how each API will behave, and to manage them well. Doing so should help alleviate the mismatches implied above.)
The simple black/white, pass/fail testing is absolutely correct for unit testing.
But as systems become more abstract, layered service upon service and agent upon agent, the ability to know a system's robustness and reliability becomes more important. We already know the small blocks of code work as intended; the unit tests and code coverage tell us so. But when they must all run on top of someone else's infrastructure (AWS, Azure, Google Cloud), unit testing isn't good enough.
How many times did a service have to retry to come through? How much did a service call cost? Will the system meet its SLA under certain loads? These are the things integration tests can help find out, using the type of assert you were asking about, @dnatoli.
Given the number of years since your question, you're almost certainly an expert by now.
From what you've explained in the question, this is more of an acceptance test (than a unit test). Unit testing frameworks are designed to fail fast; that is why Assert behaves the way it does (and it's a good thing).
Coming back to your problem: you should take a look at an acceptance testing framework like FitNesse, which supports what you want, i.e. show me the steps that failed but continue execution until the end.
However, if you MUST use a unit-testing framework, use a collecting variable/parameter to simulate this behaviour, e.g. (see the sketch after this list):
Maintain a List<string> within the test
Append a descriptive error message for every failed step
At the end of the test, assert that the collecting variable is empty
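A minimal sketch of that pattern (GetTargetRowCount is a hypothetical helper standing in for your checks against the target database):
[Test]
public void MigrationProducesExpectedTargetData()
{
    var errors = new List<string>();  // the collecting variable

    // Each check appends a message instead of asserting immediately.
    if (GetTargetRowCount("Students") != 100)
        errors.Add("Unexpected row count in Students");
    if (GetTargetRowCount("Teachers") != 10)
        errors.Add("Unexpected row count in Teachers");

    // A single assert at the end reports every failed step at once.
    if (errors.Count > 0)
        Assert.Fail(string.Join(Environment.NewLine, errors));
}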
I had a similar problem when I wanted a more meaningful failure report. I was comparing collections, and all I used to get was "wrong number of elements" - no idea what the real reason for the failure was. Unfortunately I ended up writing the comparison code manually - checking all the conditions and then doing a single assert at the end with a good error message.
Unit testing is black and white - either you pass the tests or you don't, either you are breaking the logic or not, either your data in the DB is correct or not (although unit testing against a DB is no longer unit testing per se).
What are you going to do with the warning? Is that pass or fail? If it's pass then what's the point of the unit testing in this case? If it's fail.. well.. just fail then.
I suggest spending a little time figuring out what should be unit tested and how it should be unit tested. "Unit testing" is a cliché term used by many people for very, very different things.
Unit testing might be more black and white than something like integration testing, but if you were using a tool like SpecFlow, you might need the testing framework to give you a warning or an inconclusive assert...
Why do some unit testing frameworks allow you to assert inconclusive, if it's all black and white?
Imagine that the test data you're passing into your unit test is created from a random data generator... Maybe you have several of these... For some data conditions you are sure it's a failure; for another data condition you might be unsure...
The point of a warning or an inconclusive assert is to tell the engineer to take a look at this corner case and add more code to try to catch it next time...
I don't think the assumption that your tests will always be perfect, or black and white, is correct. In 15 years of testing I've run into too many cases where the result isn't simply passed or failed: it passes... it fails... and "I don't know yet"... You've got to consider that failing a test should mean you know it failed...
False failures are really bad in automated tests... They create a lot of noise... You're better off saying "I don't know" when you don't know whether you are failing...