I am looking for a way to programmatically create unit tests using MSTest. I would like to loop through a series of configuration data and create tests dynamically based on the information. The configuration data will not be available at compile time and may come from an external data source such as a database or an XML file. Scenario: Load configuration data into a test harness and loop through the data while creating a new test for each element. I would like each dynamically created test to be reported (success/fail) separately.
You can use Data Driven Testing, depending on how complex your data is. If you are just substituting values and testing to make sure that your code can handle the same inputs, that might be the way to go, but this doesn't really sound like what you are after. (You could make this more complex; after all, all you are doing is pulling in values from a data source and then making a programmatic decision based on them.)
All MS Test really does is run a series of tests and then produce the results (in an xml file) which is then interpreted by the calling application. It's just a wrapper for executing methods that you designate through attributes.
What it sounds like you're asking is to write C# code dynamically, and have it execute in the harness.
If you really want to run this through MS test you could:
Build a method (or series of methods) which looks at the XML file
Write out the C# code (I would maybe look at T4 Templates for this) (Personally, I would use F# to do this, but I'm more partial to functional languages, and this would be easier for me).
Call csc.exe (the C# compiler)
Invoke MS Test
You could also write MSIL code into the running application directly, and try to get MS Test to execute it, which for some might be fun, but that could be time consuming and is not guaranteed to work (I haven't tried it, so I don't know what the pitfalls would be).
Based on this, it might be easier to quickly build your own harness which will interpret your XML file, dynamically build out your test scenarios, and produce the same results file. (After all, the results are what's important, not how you got there.) Since you said the data won't be available at compile time, I would guess that you aren't interested in viewing the results in the Visual Studio window.
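To make the "build your own harness" idea concrete, here is a minimal sketch. It is written in Python's unittest rather than MSTest/C#, purely to show the shape of the approach: read the XML at run time, create one test per element, and let the runner report each one separately. The XML layout, the add function and all other names are assumptions made up for illustration.

```python
import unittest
import xml.etree.ElementTree as ET

# Hypothetical configuration; in practice it would be loaded from a file
# or database at run time instead of being embedded here.
CONFIG_XML = """
<cases>
  <case name="adds_small_numbers" a="1" b="2" expected="3"/>
  <case name="adds_negative_numbers" a="-4" b="1" expected="-3"/>
</cases>
"""


def add(a, b):
    """Stand-in for the code under test."""
    return a + b


def make_test(a, b, expected):
    # Bind the values now so each generated test keeps its own data.
    def test(self):
        self.assertEqual(add(a, b), expected)
    return test


def build_test_class(xml_text):
    """Create a TestCase subclass with one test method per <case> element."""
    methods = {}
    for case in ET.fromstring(xml_text).iter("case"):
        methods["test_" + case.get("name")] = make_test(
            int(case.get("a")), int(case.get("b")), int(case.get("expected"))
        )
    return type("GeneratedTests", (unittest.TestCase,), methods)


def run_generated_tests(xml_text):
    """Run the generated tests; the runner reports each one by name."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        build_test_class(xml_text)
    )
    return unittest.TextTestRunner(verbosity=2).run(suite)
```

Calling run_generated_tests(CONFIG_XML) lists test_adds_small_numbers and test_adds_negative_numbers as separate results, which is the per-element reporting the question asks for.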
Personally, I wouldn't use XML as your Domain Specific Language (DSL). Parsing it is easy, because .NET already does that for you, but it limits how you can describe what a test should do. XML is meant for conveying data, and although code is technically a form of data, XML doesn't have the expressive power of a more formal language. This is just my personal opinion though, and there are many ways to skin a cat.
Related
I was just wondering: given an input file (Excel, XML, etc.), can we generate unit test code in C#? Consider, for example, that I need to validate a database. In the input Excel file, I will list which attributes to set, which to retrieve, the expected values, etc. I can also provide the queries to run. Given these inputs, can I create a unit test method in C# through some tool, script, or other program? Sorry if this sounds dumb. Thank you for the help.
A unit test should test whether your software works correctly and as expected, not whether your data is correct. To be precise, you should test the software that imported the data into your database. When the data is already in the database, you can write a validation script or something similar, but that has nothing to do with a unit test (although the script itself can of course be tested for correctness).
You should, however, test whether the queries your software runs against the database are correct and whether they work as expected, with both arbitrary and real-world data.
Even when code generation is involved, you do not want to check whether the process of generating the source code works correctly (unless you wrote your own code generator). Simply assume the generator works as expected and continue with the parts you can handle yourself.
I had a similar question some time back, though not in the context of unit tests. This code that can be generated from another file/database table is called Boilerplate Code.
So if you ask whether this can be done, the answer is yes. But if you wonder whether this should be done, the answer is no. Unit tests are not ideal boilerplate code. They are mutable... On catching an edge case that you did not consider earlier you may have to add a few more tests.
Also, unit tests are often used to not just test the code but to drive code development. This method is known as Test Driven Development (abbr. TDD). It'd be a mess to "drive" your development from boilerplate tests.
The current system we are adopting at work is to write some extremely complex queries which perform multiple calculations and have multiple joins / sub-queries. I don't think I am experienced enough to say if this is correct or not so I am agreeing and attempting to function with this system as it has clear benefits.
The problem we are having at the moment is that the person writing the queries makes a lot of mistakes and assumes everything is correct. We have now assigned a tester to analyse all of the queries but this still proves extremely time consuming and stressful.
I would like to know how we could create an automated procedure (without specifically writing it with code if possible as I can work out how to do that the long way) to verify a set of 10+ different inputs, verify the output data and say if the calculations are correct.
I know I could write a script in C# against specific data in the database (the db is SQL Server) and verify all the values coming out, but I would like to know what the official "standard" is, as my experience is lacking in this area and I would like to improve.
I am happy to add more information if required, add a comment if necessary. Thank you.
Edit: I am using c#
The standard approach to testing code that runs SQL queries is to unit-test it. (There are higher-level kinds of testing than unit testing, but it sounds like your problem is with a small, specific part of your application so don't worry about higher-level testing yet.) Don't try to test the queries directly, but test the result of the queries. That is, write unit tests for each of the C# methods that runs a query. Each unit test should insert known data into the database, call the method, and assert that it returns the expected result.
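As a rough illustration of "insert known data, call the method, assert on the result", here is a minimal sketch. It uses Python with an in-memory SQLite database instead of C# and SQL Server so that it is self-contained; the orders table and the total_for_customer function are hypothetical stand-ins for your own schema and query methods.

```python
import sqlite3
import unittest


def total_for_customer(conn, customer_id):
    """Hypothetical method under test: runs a query and returns its result."""
    row = conn.execute(
        "SELECT COALESCE(SUM(quantity * unit_price), 0) "
        "FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


class TotalForCustomerTests(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test keeps the tests isolated.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE orders "
            "(customer_id INTEGER, quantity INTEGER, unit_price INTEGER)"
        )
        # Known data: the expected totals can be worked out by hand.
        self.conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(1, 2, 10), (1, 1, 5), (2, 3, 7)],
        )

    def tearDown(self):
        self.conn.close()

    def test_sums_only_the_requested_customer(self):
        # 2 * 10 + 1 * 5 = 25 for customer 1.
        self.assertEqual(total_for_customer(self.conn, 1), 25)

    def test_customer_without_orders_totals_zero(self):
        self.assertEqual(total_for_customer(self.conn, 99), 0)
```

Keeping the data set small enough to verify by hand is what makes the assertion meaningful: the expected value comes from you, not from the query being tested.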
The two most common approaches to unit testing in C# are to use the Visual Studio unit test tools or NUnit. How to write unit tests is a big topic. Roy Osherove's "Art of Unit Testing" should be a good place to get started.
The other answer to this question, while generally correct for testing code, does not address the issue of testing your database at all.
It sounds like you're after database unit tests. The idea is that you create a temporary, isolated database environment with your desired schema and test data, then you validate that your queries are returning appropriate data.
I'm trying to write a visualiser for some code which generates graphics for barcodes and labels. The way I want to do this is by recording the methods+parameters being run to a file, so I can play them back and see the visual output generated at each stage (so a kind of visual debugger to help me fix issues with measurements in the drawing)
I have access to the methods, and I can put anything I like in them - but I'm stuck on the best way to record the method signature being called and the parameters, especially since a lot of them are overloads etc.
Is there anything simple that will help me serialize/record actual method call information? (with a view to replay it back, so I need to programmatically load the information and call it) Perhaps something reflection-related?
Note: I'm an intern on the project I'm working on, and I'm probably not allowed to introduce new assemblies etc. into the build, so I think aspect-based things requiring libraries are out. (At the same time, I'm not just asking a question I should be figuring out myself; this is more an additional thing I'm doing during my lunch break to help my main task.)
It might be a good idea to start from an existing profiler as a base - e.g. from http://code.google.com/p/slimtune/
Note that profilers themselves are quite complicated - for .Net they require some C++/COM knowledge - but if you start from a base like slimtune, then hopefully you'll be able to avoid this core code and will instead be able to focus on your own visualisation requirements.
Recording the method name itself is easy, parameters will be more difficult. I think the only way to generically retrieve the parameters is to use reflection--the alternative is to have an ungodly amount of logging code where you explicitly log every parameter.
Also consider that you'll need all parameters to be serializable, and depending on how you want the file to be used (by a program vs. human readable) you might have to implement quite a bit of boilerplate serialization code.
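The record-and-replay idea can be sketched without any extra libraries. The example below is in Python, where a decorator plays the role that reflection or explicit logging would play in C#, and it does not address overload resolution because Python has no overloads. The Canvas class and the draw_bar and draw_text functions are hypothetical, and every argument is assumed to be JSON-serializable.

```python
import functools
import json

CALL_LOG = []  # recorded calls, in the order they happened


def recorded(func):
    """Append each call's name and arguments to CALL_LOG, then run it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_LOG.append(
            {"name": func.__name__, "args": list(args), "kwargs": kwargs}
        )
        return func(*args, **kwargs)
    return wrapper


class Canvas:
    """Hypothetical drawing target; it only collects the shapes drawn."""
    def __init__(self):
        self.shapes = []


canvas = Canvas()


@recorded
def draw_bar(x, width, height=50):
    canvas.shapes.append(("bar", x, width, height))


@recorded
def draw_text(text, x, y):
    canvas.shapes.append(("text", text, x, y))


def replay(log_json, functions):
    """Look up each recorded name and call it again with the stored arguments."""
    for call in json.loads(log_json):
        functions[call["name"]](*call["args"], **call["kwargs"])
```

After a drawing run, json.dumps(CALL_LOG) gives the file contents. To replay without recording again, pass the undecorated functions, for example replay(log_json, {"draw_bar": draw_bar.__wrapped__, "draw_text": draw_text.__wrapped__}); a visualiser could pause after each call to show the intermediate output.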
You should really consider existing profiling tools and testing tools rather than thinking of inventing something new. It sounds like performance tests or integration tests may be more valuable than a "playback" utility.
I am developing an ETL process that extracts business data from one database into a data warehouse. The application is NOT using NHibernate, LINQ to SQL or Entity Framework. The application has its own generated data access classes that generate the necessary SQL statements to perform CRUD.
As one can imagine, developers who write code that generates custom SQL can easily make mistakes.
I would like to write a program that generates test data (Arrange), then performs the ETL process (Act) and validates the data warehouse (Assert).
I don't think it is hard to write such a program. However, what worries me is that in the past my company attempted something similar and ended up with a bunch of unmaintainable unit tests that constantly failed because of the many changes to the database schema as new features were added.
My plan is to write an integration test that runs on the build machine, rather than unit tests, to ensure the ETL process works. The test data cannot be generated totally at random because business logic determines how data is loaded into the data warehouse. We have a custom development tool that generates new data access classes when there is a change in the database definition.
I would love any advice from the community on writing such an integration test so that it is easy to maintain. Some ideas I have:
Save a backup of the test database in version control (TFS). Developers will need to modify the backup database when there are data changes to the source or the data warehouse.
Developers maintain the test data manually through the test program (C# in this case). This program would provide a basic framework for developers to generate their test data.
When the test database is initialized, it generates random data. Developers will need to write code to override certain randomly generated data to ensure the test passes.
I welcome any suggestions
Thanks
Hey dsum,
Although I don't really know the whole architecture of your ETL, I would say that integration testing should only be one more step in your testing process.
Even if the first attempt at unit testing ended up in a mess, keep in mind that in many cases a single unit test is the best place to check something. Or do you want to split the whole integration test for a three-way case (or something else buried deep down) just to guarantee the right flow in each of the three conditions?
Messy unit tests are only the result of messy production code. Don't feel offended; that's just my opinion. Unit tests force coders to keep a clean coding style and make the whole thing much more maintainable.
So my point is that you shouldn't rely only on integration testing of the whole thing, because unit tests (if they are used in the right way) can focus on problems in more detail.
Regards,
MacX
First, let's say I think that's a good plan, and I have done something similar using Oracle & PL/SQL some years ago. IMHO your problem is mainly an organizational one, not a technical:
You must have someone who is responsible to extend and maintain the test code.
Responsibility for maintaining the test data must be clear (and provide mechanisms for easy test data maintenance; same applies to any verification data you might need)
The whole team should know that no code will go into the production environment as long as the test fails. If the test fails, first priority of the team should be to fix it (the code or the test, whatever is right). Train them not to work on any new feature as long as the test breaks!
After a bug fix, it should be easy for the person who fixed it to verify that the part of the integration which failed before no longer fails. That means it should be possible to run the whole test (or at least parts of it) quickly and easily from any developer machine. Speed can become a problem for an ETL process if your test is too big, so focus on testing a lot of things with as little data as possible. Perhaps you can also break the whole test into smaller pieces which can be executed step by step.
If you want to maintain test data while performing data integration testing in ETL, you could also follow these steps, since integration testing covers both the ETL process and the related applications. For example:
1. Set up test data in the source system.
2. Execute the ETL process to load the test data into the target.
3. View or process the data in the target system.
4. Validate the data and the application functionality that uses the data.
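The four steps above map directly onto an Arrange/Act/Assert test. The sketch below uses Python and two in-memory SQLite databases so that it runs anywhere; the customers and dim_customer tables and the run_etl function are hypothetical placeholders for the real source system, warehouse and ETL job.

```python
import sqlite3


def run_etl(source, target):
    """Hypothetical ETL step: copy active customers, upper-casing their names."""
    rows = source.execute(
        "SELECT id, name FROM customers WHERE active = 1"
    ).fetchall()
    target.executemany(
        "INSERT INTO dim_customer (id, name) VALUES (?, ?)",
        [(customer_id, name.upper()) for customer_id, name in rows],
    )
    target.commit()


def make_source():
    # Step 1: set up a small, hand-picked data set in the source system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, active INTEGER)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "alice", 1), (2, "bob", 0), (3, "carol", 1)],
    )
    return conn


def make_target():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dim_customer (id INTEGER, name TEXT)")
    return conn


def test_etl_loads_only_active_customers():
    source, target = make_source(), make_target()
    run_etl(source, target)  # Step 2: execute the ETL process.
    # Steps 3 and 4: read the target and validate what was loaded.
    loaded = target.execute(
        "SELECT id, name FROM dim_customer ORDER BY id"
    ).fetchall()
    assert loaded == [(1, "ALICE"), (3, "CAROL")]
```

Three source rows are enough here to cover both the filter and the transformation, which is in line with the advice above to test many things with as little data as possible.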
I'm in the process of writing an HTML screen scraper. What would be the best way to create unit tests for this?
Is it "ok" to have a static html file and read it from disk on every test?
Do you have any suggestions?
To guarantee that the test can be run over and over again, you should have a static page to test against. (Ie. from disk is OK)
If you write a test that touches the live page on the web, that's probably not a unit test but an integration test. You could have those too.
For my ruby+mechanize scrapers I've been experimenting with integration tests that transparently test against as many possible versions of the target page as possible.
Inside the tests I'm overloading the scraper HTTP fetch method to automatically re-cache a newer version of the page, in addition to an "original" copy saved manually. Then each integration test runs against:
the original manually-saved page (somewhat like a unit test)
the freshest version of the page we have
a live copy from the site right now (which is skipped if offline)
... and raises an exception if the number of fields returned by them is different, e.g. they've changed the name of a thumbnail class, but still provides some resilience against tests breaking because the target site is down.
Files are ok but: your screen scraper processes text. You should have various unit tests that "scrapes" different pieces of text hard coded within each unit test. Each piece should "provoke" the various parts of your scraper method.
This way you completely remove dependencies on anything external, both files and web pages. Your tests will also be easier to maintain individually, since they no longer depend on external files, and they will execute (slightly) faster ;)
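A minimal sketch of that style, in Python: each test carries its own small piece of HTML, chosen to provoke one behaviour of the scraper. The LinkScraper class is a hypothetical stand-in for your scraper, built on the standard library's html.parser.

```python
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Hypothetical scraper: collects the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def scrape_links(html):
    scraper = LinkScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.links


# Each test embeds the exact text it needs; nothing is read from disk or the web.
def test_finds_links():
    html = '<p><a href="/a">A</a> <a href="/b">B</a></p>'
    assert scrape_links(html) == ["/a", "/b"]


def test_ignores_anchor_without_href():
    assert scrape_links('<a name="top">Top</a>') == []


def test_handles_unclosed_tags():
    assert scrape_links('<div><a href="/x">X') == ["/x"]
```

Because the input sits next to the assertion, a failing test shows immediately which HTML shape broke the scraper.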
To create your unit tests, you need to know how your scraper works and what sorts of information you think it should be extracting. Using simple web pages as unit tests could be OK depending on the complexity of your scraper.
For regression testing, you should absolutely keep files on disk.
But if your ultimate goal is to scrape the web, you should also keep a record of common queries and the HTML that comes back. This way, when your application fails, you can quickly capture all past queries of interest (using say wget or curl) and find out if and how the HTML has changed.
In other words, regression test both against known HTML and against unknown HTML from known queries. If you issue a known query and the HTML that comes back is identical to what's in your database, you don't need to test it twice.
Incidentally, I've had much better luck screen scraping ever since I stopped trying to scrape raw HTML and started instead to scrape the output of w3m -dump, which is ASCII and is so much easier to deal with!
You need to think about what it is you are scraping.
Static Html (html that is not bound to change drastically and break your scraper)
Dynamic Html (Loose term, html that may drastically change)
Unknown (html that you pull specific data from, regardless of format)
If the html is static, then I would just use a couple different local copies on disk. Since you know the html is not bound to change drastically and break your scraper, you can confidently write your test using a local file.
If the html is dynamic (again, loose term), then you may want to go ahead and use live requests in the test. If you use a local copy in this scenario and the test passes you may expect the live html to do the same, whereas it may fail. In this case, by testing against the live html every time, you immediately know if your screen scraper is up to par or not, before deployment.
Now if you simply don't care what format the html is, the order of the elements, or the structure because you are simply pulling out individual elements based on some matching mechanism (Regex/Other), then a local copy may be fine, but you may still want to lean towards testing against live html. If the live html changes, specifically parts of what you are looking for, then your test may pass if you're using a local copy, but come deployment may fail.
My opinion would be to test against live html if you can. This will prevent your local tests from passing when the live html may fail, and vice versa. I don't think there is a best practice with screenscrapers, because screenscrapers in themselves are unusual little buggers. If a website or web service does not expose an API, a screenscraper is sort of a cheesy workaround for getting the data you want.
What you're suggesting sounds sensible. I'd perhaps have a directory of suitable test HTML files, plus data on what to expect for each one. You can further populate that with known problematic pages as/when you come across them, to form a complete regression test suite.
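One possible layout for such a suite, sketched in Python: every saved page sits next to a small JSON file describing what the scraper should extract from it, and one loop checks them all. The file naming scheme, the JSON shape and the scrape_title function are assumptions for illustration only.

```python
import json
import re
from pathlib import Path


def scrape_title(html):
    """Hypothetical scraper: returns the text of the <title> element, or None."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def run_regression_suite(directory):
    """Check every *.html fixture against the *.json expectation beside it.

    Returns a list of (file name, expected, actual) tuples for mismatches.
    """
    failures = []
    for page in sorted(Path(directory).glob("*.html")):
        expected = json.loads(
            page.with_suffix(".json").read_text(encoding="utf-8")
        )
        actual = scrape_title(page.read_text(encoding="utf-8"))
        if actual != expected["title"]:
            failures.append((page.name, expected["title"], actual))
    return failures
```

Adding a newly discovered problem page then means dropping two files into the directory, with no change to the test code.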
You should also perform integration tests for actually talking HTTP (including not just successful page fetches, but also 404 errors, unresponsive servers etc.)
I would say that depends on how many different tests you need to run.
If you need to check for a large number of different things in your unit test, you might be better off generating HTML output as part of your test initialization. It would still be file-based, but you would have an extensible pattern:
Initialize HTML file with fragments for Test A
Execute Test A
Delete HTML file
That way when you add test ZZZZZ down the road, you would have a consistent way of providing test data.
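The initialize / execute / delete pattern fits naturally into a test framework's setup and teardown hooks. Here is a minimal sketch using Python's unittest; the count_rows function and the fragments attribute are hypothetical names, and each new test only has to supply its own fragments.

```python
import os
import tempfile
import unittest


def count_rows(path):
    """Hypothetical scraper step: count <tr> elements in an HTML file."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().lower().count("<tr")


class GeneratedHtmlTest(unittest.TestCase):
    fragments = []  # each subclass supplies the fragments for its own test

    def setUp(self):
        # Initialize: write the HTML file from this test's fragments.
        fd, self.path = tempfile.mkstemp(suffix=".html")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(
                "<html><body><table>"
                + "".join(self.fragments)
                + "</table></body></html>"
            )

    def tearDown(self):
        # Delete the HTML file so tests leave nothing behind.
        os.remove(self.path)


class TestA(GeneratedHtmlTest):
    fragments = ["<tr><td>1</td></tr>", "<tr><td>2</td></tr>"]

    def test_counts_rows(self):
        self.assertEqual(count_rows(self.path), 2)
```

A later test would be another small subclass with different fragments, so the way test data is provided stays consistent as the suite grows.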
If you are just running a limited number of tests, and it will stay that way, a few pre-written static HTML files should be fine.
Certainly do some integration tests as Rich suggests.
You're creating an external dependency, which is going to be fragile.
Why not create a TestContent project, populated with a bunch of resources files? Copy 'n paste your source HTML into the resource file(s) and then you can reference them in your unit tests.
Sounds like you have several components here:
Something that fetches your HTML content
Something that strips away the chaff and produces just the text that must be scraped
Something that actually looks at the content and transforms it into your database/whatever
You should test (and probably implement) these parts of the scraper independently.
There's no reason you shouldn't be able to get content from any where (i.e. no HTTP).
There's no reason you wouldn't want to strip away the chaff for purposes other than scraping.
There's no reason to only store data into your database via scraping.
So.. there's no reason to build and test all these pieces of your code as a single large program.
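A minimal sketch of that separation, in Python: fetching, cleaning and extraction are three small functions, and the fetcher is passed in so that tests never need HTTP. All function names, the price format and the example URL are assumptions for illustration; the regex-based cleanup is deliberately simplistic.

```python
import re


def fetch(url, opener):
    """Get raw HTML. `opener` is any callable url -> str, so tests need no HTTP."""
    return opener(url)


def strip_chaff(html):
    """Drop script/style blocks and all tags, leaving whitespace-normalized text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"<[^>]+>", " ", html)
    return " ".join(text.split())


def extract_prices(text):
    """Turn cleaned text into records; here, every price written like $12.50."""
    return [float(m) for m in re.findall(r"\$(\d+(?:\.\d{2})?)", text)]


def scrape(url, opener):
    """Compose the three independent parts into the full scraper."""
    return extract_prices(strip_chaff(fetch(url, opener)))
```

Each piece can be tested on its own with plain strings, and a whole-pipeline test can pass a stub such as lambda url: "<li>Tea $3.50</li>" as the opener.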
Then again... maybe we're over complicating things?
You should probably query a static page on disk for all but one or two tests. But don't forget those tests that touch the web!
I don't see why it matters where the html originates from as far as your unit tests are concerned.
To clarify: your unit test is processing the html content; where that content comes from is immaterial, so reading it from a file is fine for your unit tests. As you say in your comment, you certainly don't want to hit the network for every test, as that is just overhead.
You also might want to add an integration test or two to check you're processing urls correctly though (i.e. you are able to connect and process external urls).