I am looking for a tool that can take a unit test, like
IPerson p = new Person();
p.Name = "Sklivvz";
Assert.AreEqual("Sklivvz", p.Name);
and generate, automatically, the corresponding stub class and interface
interface IPerson // inferred from IPerson p = new Person();
{
string Name
{
get; // inferred from Assert.AreEqual("Sklivvz", p.Name);
set; // inferred from p.Name = "Sklivvz";
}
}
class Person: IPerson // inferred from IPerson p = new Person();
{
private string name; // inferred from p.Name = "Sklivvz";
public string Name // inferred from p.Name = "Sklivvz";
{
get
{
return name; // inferred from Assert.AreEqual("Sklivvz", p.Name);
}
set
{
name = value; // inferred from p.Name = "Sklivvz";
}
}
public Person() // inferred from IPerson p = new Person();
{
}
}
I know ReSharper and Visual Studio do some of these, but I need a complete tool -- command line or whatnot -- that automatically infers what needs to be done.
If there is no such tool, how would you write it (e.g. extending ReSharper, from scratch, using which libraries)?
What you appear to need is a parser for your language (Java), and a name and type resolver. ("Symbol table builder").
After parsing the source text, a compiler usually has a name resolver, that tries to record the definition of names and their corresponding types, and a type checker, that verifies that each expression has a valid type.
Normally the name/type resolver complains when it can't find a definition. What you want it to do is to find the "undefined" thing that is causing the problem, and infer a type for it.
For
IPerson p = new Person();
the name resolver knows that "Person" and "IPerson" aren't defined. If it were
Foo p = new Bar();
there would be no clue that you wanted an interface, just that Foo is some kind of abstract parent of Bar (e.g., a class or an interface). So the decision as to which it is must be known to the tool ("whenever you find such a construct, assume Foo is an interface ..."). You could use a heuristic: IFoo and Foo together mean IFoo should be an interface, and somewhere somebody has to define Foo as a class realizing that interface. Once the tool has made this decision, it would need to update its symbol tables so that it can move on to other statements:
For
p.Name = "Sklivvz";
given that p must be an interface (by the previous inference), Name must be a property member, and it appears its type is String from the assignment.
With that, the statement:
Assert.AreEqual("Sklivvz", p.Name);
names and types resolve without further issue.
The content of the IFoo and Foo entities is sort of up to you; you didn't have to use get and set but that's personal taste.
This won't work so well when you have multiple entities in the same statement:
x = p.a + p.b ;
We know a and b are likely fields, but you can't guess which numeric type, if indeed they are numeric, or whether they are strings (using + on strings is legal in both Java and C#).
For C++ you don't even know what "+" means; it might be an operator on the Bar class.
So what you have to do is collect constraints, e.g., "a is some indefinite number or string", etc. and as the tool collects evidence, it narrows the set of possible constraints. (This works like those word problems: "Joe has seven sons. Jeff is taller than Sam. Harry can't hide behind Sam. ... who is Jeff's twin?" where you have to collect the evidence and remove the impossibilities). You also have to worry about the case where you end up with a contradiction.
You could rule out the p.a + p.b case, but then you can't write your unit tests with impunity. There are standard constraint solvers out there if you want impunity. (What a concept.)
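As a toy illustration of that constraint-narrowing process (the type-name sets here are made up; this is not any real tool's API):
using System.Collections.Generic;
// Each observed usage contributes a constraint on a member's possible types;
// the inferred type is whatever survives intersecting all the constraints.
var possible = new HashSet<string> { "int", "double", "string" }; // from "x = p.a + p.b"
possible.IntersectWith(new[] { "string" });                       // from Assert.AreEqual("Sklivvz", p.a)
// One survivor means the type is decided; an empty set means a contradiction.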
OK, we have the ideas, now, can this be done in a practical way?
The first part of this requires a parser and a bendable name and type resolver. You need a constraint solver or at least a "defined value flows to undefined value" operation (trivial constraint solver).
Our DMS Software Reengineering Toolkit with its Java Front End could probably do this. DMS is a tool builder's tool, for people that want to build tools that process computer languages in arbitrary ways. (Think of "computing with program fragments rather than numbers".)
DMS provides general purpose parsing machinery, and can build a tree for whatever front end it is given (e.g., Java, and there's a C# front end).
The reason I chose Java is that our Java front end has all that name and type resolution machinery, and it is provided in source form so it can be bent. If you stuck to the trivial constraint solver, you could probably bend the Java name resolver to figure out the types. DMS will let you assemble trees that correspond to code fragments, and coalesce them into larger ones; as your tool collected facts for the symbol table, it could build the primitive trees.
Somewhere, you have to decide you are done. How many unit tests does the tool have to see before it knows the entire interface? (I guess it eats all the ones you provide.)
Once complete, it assembles the fragments for the various members and builds an AST for an interface; DMS can use its prettyprinter to convert that AST back into source code like you've shown.
I suggest Java here because our Java front end has name and type resolution. Our C# front end does not. This is a "mere" matter of ambition; somebody has to write one, but that's quite a lot of work (at least it was for Java and I can't imagine C# is really different).
But the idea works fine in principle using DMS.
You could do this with some other infrastructure that gave you access to a parser and a bendable name and type resolver. That might not be so easy to get for C#; I suspect MS may give you a parser, and access to name and type resolution, but not any way to change that. Maybe Mono is the answer?
You still need a way to generate code fragments and assemble them. You might try to do this by string hacking; my (long) experience with gluing program bits together is that if you do it with strings you eventually make a mess of it. You really want pieces that represent code fragments of known type, that can only be combined in ways the grammar allows; DMS does that, so no mess.
It's amazing how no one really gave anything towards what you were asking.
I don't know the answer, but I will give my thoughts on it.
If I were to attempt to write something like this myself, I would probably look at a ReSharper plugin. The reason I say that is because, as you stated, ReSharper can do it, but in individual steps. So I would write something that went line by line and applied the appropriate ReSharper creation methods chained together.
Now, by no means do I even know how to do this, as I have never built anything for ReSharper, but that is what I would try to do. It makes logical sense that it could be done.
And if you do write up some code, PLEASE post it, as I could find that useful as well, being able to generate the entire skeleton in one step. Very useful.
If you plan to write your own implementation, I would definitely suggest that you take a look at the NVelocity (C#) or Velocity (Java) template engines.
I have used these in a code generator before and have found that they make the job a whole lot easier.
It's doable - at least in theory. What I would do is use something like csparser to parse the unit test (you cannot compile it, unfortunately) and then take it from there. The only problem I can see is that what you are doing is wrong in terms of methodology - it makes more sense to generate unit tests from entity classes (indeed, Visual Studio does precisely this) than doing it the other way around.
I think a real solution to this problem would be a very specialized parser. Since that's not so easy to do, I have a cheaper idea. Unfortunately, you'd have to change the way you write your tests (namely, just the creation of the object):
dynamic p = someFactory.Create("MyNamespace.Person");
p.Name = "Sklivvz";
Assert.AreEqual("Sklivvz", p.Name);
A factory object would be used. If it can find the named object, it will create it and return it (this is the normal test execution). If it doesn't find it, it will create a recording proxy (a DynamicObject) that will record all calls and at the end (maybe on tear down) could emit class files (maybe based on some templates) that reflect what it "saw" being called.
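A minimal sketch of what such a recording proxy might look like (the class and member names here are hypothetical, not from an existing library):
using System.Collections.Generic;
using System.Dynamic;
// Records every property set/get made through the dynamic reference, so a
// generator can later emit a matching interface and class.
public class RecordingProxy : DynamicObject
{
    private readonly Dictionary<string, object> values = new Dictionary<string, object>();
    public Dictionary<string, string> ObservedMembers { get; } = new Dictionary<string, string>();
    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        values[binder.Name] = value;
        ObservedMembers[binder.Name] = value?.GetType().Name ?? "object"; // inferred member type
        return true;
    }
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        if (!ObservedMembers.ContainsKey(binder.Name))
            ObservedMembers[binder.Name] = "object"; // read before any write: type unknown
        values.TryGetValue(binder.Name, out result);
        return true; // always succeed so the recording run keeps going
    }
}
The factory would return such a proxy when it can't find the named type, and tear-down would walk ObservedMembers to emit the interface and class files.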
Some disadvantages that I see:
Need to run the code in "two" modes, which is annoying.
In order for the proxy to "see" and record calls, they must be executed; so code in a catch block, for example, has to run.
You have to change the way you create your object under test.
You have to use dynamic; you'll lose compile-time safety in subsequent runs and it has a performance hit.
The only advantage that I see is that it's a lot cheaper to create than a specialized parser.
I like CodeRush from DevExpress. They have a huge, customizable templating engine. And, best of all for me, there are no dialog boxes. They also have functionality to create methods, interfaces, and classes that don't yet exist from the code that references them.
Try looking at Pex, a Microsoft Research project on unit testing, which is still under research:
research.microsoft.com/en-us/projects/Pex/
I think what you are looking for is a fuzzing toolkit (https://en.wikipedia.org/wiki/Fuzz_testing).
Although I have never used it, you might give Randoop.NET a chance to generate 'unit tests': http://randoop.codeplex.com/
Visual Studio ships with some features that can be helpful for you here:
Generate Method Stub. When you write a call to a method that doesn't exist, you'll get a little smart tag on the method name, which you can use to generate a method stub based on the parameters you're passing.
If you're a keyboard person (I am), then right after typing the close parenthesis, you can do:
Ctrl-. (to open the smart tag)
ENTER (to generate the stub)
F12 (go to definition, to take you to the new method)
The smart tag only appears if the IDE thinks there isn't a method that matches. If you want to generate when the smart tag isn't up, you can go to Edit->Intellisense->Generate Method Stub.
Snippets. Small code templates that make it easy to generate bits of common code. Some are simple (try "if[TAB][TAB]"). Some are complex ('switch' will generate cases for an enum). You can also write your own. For your case, try "class" and "prop".
See also "How to change “Generate Method Stub” to throw NotImplementedException in VS?" for information on snippets in the context of GMS.
autoprops. Remember that properties can be much simpler:
public string Name { get; set; }
create class. In Solution Explorer, right-click on the project name or a subfolder, select Add->Class. Type the name of your new class. Hit ENTER. You'll get a class declaration in the right namespace, etc.
Implement interface. When you want a class to implement an interface, write the interface name part, activate the smart tag, and select either option to generate stubs for the interface members.
These aren't quite the 100% automated solution you're looking for, but I think it's a good mitigation.
I find that whenever I need a code generation tool like this, I am probably writing code that could be made a little bit more generic so I only need to write it once. In your example, those getters and setters don't seem to be adding any value to the code - in fact, it is really just asserting that the getter/setter mechanism in C# works.
I would refrain from writing (or even using) such a tool before understanding what the motivations for writing these kinds of tests are.
BTW, you might want to have a look at NBehave?
I use Rhino Mocks for this, when I just need a simple stub.
http://www.ayende.com/wiki/Rhino+Mocks+-+Stubs.ashx
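For illustration, assuming the IPerson interface already exists, a property-backed stub with Rhino Mocks' AAA syntax looks roughly like this:
// Rhino Mocks stubs give interface properties automatic backing storage,
// so the original test body reads almost unchanged.
IPerson p = MockRepository.GenerateStub<IPerson>();
p.Name = "Sklivvz";
Assert.AreEqual("Sklivvz", p.Name);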
Related
Is there any solution similar to [PostSharp] - [Infuse - A Precompiler for C#] that lets me modify code at compile time?
Below is some pseudo-code.
[InterceptCallToConstructors]
void Method1(){
Person Eric = new Person("Eric Bush");
}
InterceptCallToConstructors(ConstructorMethodArgs args){
if(args.Type == typeof(Person))
if(PersonInstances++ > 10 ) args.ReturnValue = null;
}
In this example, Eric should not receive a new Person instance if more than 10 Persons have been created.
After some research I found two solutions: PostSharp and Infuse.
With Infuse it's very complicated and hard to detect how many instances of Person are made; however, with PostSharp it's one line of code.
I have tried the AOP route with PostSharp, but PostSharp currently doesn't support an aspect that intercepts calls to constructors.
As far as I have read, Roslyn doesn't support modifying code at compile time.
This would be a "custom preprocessor" answer, that modifies the source code to achieve OP's effect.
Our DMS Software Reengineering Toolkit with its C# Front End can do this.
DMS provides for source to source transformations, with the transformations coded as
if you see *this*, replace it by *that*
This is written in the form:
rule xxx pattern_parameters
this_pattern
-> that_pattern ;
The "->" is pronounced "replace by: :-}
DMS operates on ASTs, so includes a parsing step (text to ASTs), a tree transformation step, and a prettyprinting step that produces the final answer (ASTs to text).
OP seems to want to modify the constructor call site (he can't modify the constructor; there's no way to get it to return "null"). To accomplish OP's task, he would provide DMS the following source-to-source transformation specification:
default domain CSharp~v5; -- says we are working with C# syntax (and need the C# front end)
rule intercept_constructor(c: IDENTIFIER, a:arguments): expression
" new \c (\a) "
-> " \c.PersonInstances==10?null:(PersonInstances++,new \c (\a)) "
if c == "Person"; -- one might want to force c to be on some qualified path
What the rule does is find matching constructor call syntax of arbitrary form and replace it by a conditional expression that checks OP's precondition, returning null if there are too many Person instances (we fix a bug in OP's spec here; he appears to increment the count whether a new Person instance is created or not, surely not his intention). We have to qualify where PersonInstances lives; it can't just be floating around in the ether. In this example I'm proposing it as a static member of the class.
The details: each rule has a name ("intercept_constructor", stolen from OP). It refers to a syntactic category ("expression") with syntactic shape "new \c (\a)", forcing it to match only constructor calls that are expressions. The quotes in the rule are meta-quotes; they distinguish the syntax of the rule language from the syntax of the targeted language (C# in this case). The backslashes are meta-escapes; \c in meta-quotes is the same thing in the rule as c outside the meta-quotes, and similarly for \a.
In a really big system there may be several Person classes. We want to make sure we get the right one; one might need to qualify the referenced class as a specific one by providing a path. OP hints at this with the annotation. If one wanted to check that an annotation existed on the containing method, one would need a custom predicate to ask for that. DMS provides complete facilities for coding such a predicate, including complete access to the AST, so the predicate can climb up or down in its search for a matching annotation.
If you're running on top of the KRuntime (-> ASP.NET 5) you can hook into the compilation by implementing the ICompileModule assembly neutral interface.
I'd recommend looking at:
the aop example in the repo
this nice writeup
I just read this post, and it makes the case against using implicit typing when starting out with test-driven development/design.
His post says that TDD can be "slowed down" when using implicit typing for the return type when unit testing a method. Also, he seems to want the return type specified by the test in order to drive development (which makes sense to me).
A given unit test with implicit typing might look like this:
public void Test_SomeMethod()
{
MyClass myClass = new MyClass();
var result = myClass.MethodUnderTest();
Assert.AreEqual(someCondition, result);
}
So my questions are:
Does using implicit typing help or hinder writing unit tests for TDD? Is there anyone out there that can share their experience using this technique when writing unit tests?
I ask this because I have not yet done TDD (but will be starting soon) and want to know if there is a way to write generic or semi-generic unit tests that would still work when a return type might change.
I see his point but I don't really think it's the right reason to not use var here. Remember, TDD works roughly according to the following:
Write a new test.
If the test fails to compile (and it should fail!), write enough code until the test compiles.
Run all tests.
If a test fails, write enough code until all tests pass.
Refactor.
Whether or not we use var, the test will fail to compile either way, because the method under test won't exist yet! Once we start coding up NewMethod, his points are rather moot.
Rather, the right reason not to use var here is that the code gives no indication of what the type of result is. This is a matter of opinion, but var is okay here:
var dict = new Dictionary<Foo, List<Bar>>();
and for anonymous types, but not here:
var m = M();
because it's completely unclear without going to the declaration of M (or using IntelliSense) what the return type of M is.
Yes and No
In Visual Studio presently, TDD is a bit of a pain, especially when using implicit typing. var means no IntelliSense; then, when you enter the name of a type that may not exist yet, it has the tendency to auto-complete with something that is similar to what you are typing, often the name of the test fixture.
Visual Studio 2010 has a consume-first mode, which makes it better suited to test-driven development. Currently (in 2008 and earlier) you'll find you have to hit Escape to hide IntelliSense.
As for the use of var, it's purely syntactic sugar. It makes the following much nicer, in my opinion:
var type = new MyType();
It's clear that the variable type is of type MyType. var is great for generics and follows the principle of DRY - Don't Repeat Yourself.
var type = MethodCall();
var result = ReturnResult();
On the other hand, this makes for hard-to-read code, whether you follow TDD or not. Good unit tests should flow and be easy to read. If you have to think, or hover the mouse over a method to see the return type, that is the sign of a bad, hard-to-read test.
From a tooling perspective, I'd say it's nicer to avoid the var. I use Eclipse and Java, but I know that extensions like CodeRush and Resharper offer many of the features that I'm discussing here. When in my test I call a method that doesn't exist yet, I can "quick fix" it to create the method in the desired class. The return type of the automatically created method depends on its context; if I am expecting back a String, the return type of the method will be String. But if the assignment is to a var (which Java doesn't have - but if it did), the IDE wouldn't know enough to make the return type anything other than var (or maybe Object).
Not everyone uses the IDE in this way in TDD, but I find it very helpful. The more information I can give the IDE in my test, the less typing I have to do to make the test pass.
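A small C# illustration of that point (person and GetDisplayName are made-up names):
// With an explicit type on the left, a "create method" quick fix can infer
// that GetDisplayName should return string.
string name = person.GetDisplayName();
// With var there is nothing to infer from, so the generated stub's return
// type is at best a guess (object, or whatever the IDE defaults to).
var name2 = person.GetDisplayName();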
I need to derive an important value given 7 potential inputs. Uncle Bob urges me to avoid functions with that many parameters, so I've extracted the class. All parameters now being properties, I'm left with a calculation method with no arguments.
“That”, I think, “could be a property, but I'm not sure if that's idiomatic C#.”
Should I expose the final result as a property, or as a method with no arguments? Would the average C# programmer find properties confusing or offensive? What about the Alt.Net crowd?
decimal consumption = calculator.GetConsumption(); // obviously derived
decimal consumption = calculator.Consumption; // not so obvious
If the latter: should I declare interim results as [private] properties, also? Thanks to heavy method extraction, I have several interim results. Many of these shouldn't be part of the public API. Some of them could be interesting, though, and my expressions would look cleaner if I could access them as properties:
decimal interim2 = this.ImportantInterimValue * otherval;
Happy Experiment Dept.:
While debugging my code in VS2008, I noticed that I kept hovering my mouse over the method calls that compute interim results, expecting a hover-over with their return value. After turning all methods into properties, I found that exposing interim results as properties greatly assisted debugging. I'm well pleased with that, but have lingering concerns about readability.
The interim value declarations look messier. The expressions, however, are easier to read without the brackets. I no longer feel compelled to start the method name with a verb. To contrast:
// Clean method declaration; compulsive verby name; callers need
// parenthesis despite lack of any arguments.
decimal DetermineImportantInterimValue() {
return this.DetermineOtherInterimValue() * this.SomeProperty;
}
// Messier property declaration; clean name; clean access syntax
decimal ImportantInterimValue {
get {
return this.OtherInterimValue * this.SomeProperty;
}
}
I should perhaps explain that I've been coding in Python for a decade. I've been left with a tendency to spend extra time making my code easier to call than to write. I'm not sure the Python community would regard this property-oriented style as acceptably “Pythonic”, however:
def determineImportantInterimValue(self):
"The usual way of doing it."
return self.determineOtherInterimValue() * self.someAttribute
importantInterimValue = property(
lambda self: self.otherInterimValue * self.someAttribute,
doc = "I'm not sure if this is Pythonic...")
The important question here seems to be this:
Which one produces more legible, maintainable code for you in the long run?
In my personal opinion, isolating the individual calculations as properties has a couple of distinct advantages over a single monolithic method call:
You can see the calculations as they're performed in the debugger, regardless of the class method you're in. This is a boon to productivity while you're debugging the class.
If the calculations are discrete, the properties will execute very quickly, which means (in my opinion), they observe the rules for property design. It's absurd to think that a guideline for design should be treated as a straightjacket. Remember: There is no silver bullet.
If the calculations are marked private or internal, they do not add unnecessary complexity to consumers of the class.
If all of the properties are discrete enough, compiler inlining may resolve the performance issues for you.
Finally, if the final method that returns your final calculation is far and away easier to maintain and understand because you can read it, that is an utterly compelling argument in and of itself.
One of the best things you can do is think for yourself and dare to challenge the preconceived One Size Fits All notions of our peers and predecessors. There are exceptions to every rule. This case may very well be one of them.
Postscript:
I do not believe that we should abandon standard property design in the vast majority of cases. But there are cases where deviating from The Standard(TM) is called for, because it makes sense to do so.
Personally, I would prefer that you make your public API a method instead of a property. Properties are supposed to be as 'fast' as possible in C#. More details in this discussion: Properties vs Methods
Internally, GetConsumption can use any number of private properties to arrive at the result; the choice is yours.
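A rough sketch of that arrangement (all names here are hypothetical):
public class ConsumptionCalculator
{
    // Two of the seven inputs, for brevity.
    public decimal BaseRate { get; set; }
    public decimal Usage { get; set; }
    // The public API stays a method, signalling "this does work".
    public decimal GetConsumption()
    {
        return ImportantInterimValue * Usage;
    }
    // Interim results as private properties still show up nicely in the debugger.
    private decimal ImportantInterimValue
    {
        get { return BaseRate * 2m; }
    }
}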
I usually go by what the method or property will do. If it is something that is going to take a little time, I'll use a method. If it's very quick or has a very small number of operations going on behind the scenes, I'll make it a property.
I tend to use methods to denote any action on the object, or anything that changes the state of an object. So, in this case, I would name the function CalculateConsumption(), which computes the value from the other properties.
You say you are deriving a value from seven inputs; you have implemented seven properties, one for each input, and you have a property getter for the result. Some things you might want to consider are:
What happens if the caller fails to set one or more of the seven "input" properties? Does the result still make sense? Will an exception be thrown (e.g. divide by zero)?
In some cases the API may be less discoverable. If I must call a method that takes seven parameters, I know that I must supply all seven parameters to get the result. And if some of the parameters are optional, different overloads of the method make it clear which ones.
In contrast, it may not be so clear that I have to set seven properties before accessing the "result" property, and it could be easy to forget one.
When you have a method with several parameters, you can more easily have richer validation. For example, you could throw an ArgumentException if "parameter A and parameter B are both null".
If you use properties for your inputs, each property will be set independently, so you can't perform the validation when the inputs are being set - only when the result property is being dereferenced, which may be less intuitive.
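For example, a method can validate related inputs together at the call site, which is awkward to do with independently set properties (the names here are hypothetical):
// Related-parameter validation is natural in a method signature.
public decimal CalculateConsumption(decimal? baseRate, decimal? usage)
{
    if (baseRate == null && usage == null)
        throw new ArgumentException("Either baseRate or usage must be supplied.");
    return (baseRate ?? 0m) * (usage ?? 1m);
}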
I was recently watching a webcast about how to create a fluent DSL and I have to admit, I don't understand the reasons why one would use such an approach (at least for the given example).
The webcast presented an image resizing class, that allows you to specify an input-image, resize it and save it to an output-file using the following syntax (using C#):
Sizer sizer = new Sizer();
sizer.FromImage(inputImage)
.ToLocation(outputImage)
.ReduceByPercent(50)
.OutputImageFormat(ImageFormat.Jpeg)
.Save();
I don't understand how this is better than a "conventional" method that takes some parameters:
sizer.ResizeImage(inputImage, outputImage, 0.5, ImageFormat.Jpeg);
From a usability point of view, this seems a lot easier to use, since it clearly tells you what the method expects as input. In contrast, with the fluent interface, nothing stops you from omitting/forgetting a parameter/method-call, for example:
sizer.ToLocation(outputImage).Save();
So on to my questions:
1 - Is there some way to improve the usability of a fluent interface (i.e. tell the user what he is expected to do)?
2 - Is this fluent interface approach just a replacement for the non-existent named method parameters in C#? Would named parameters make fluent interfaces obsolete, e.g. something similar to what Objective-C offers:
sizer.Resize(from:input, to:output, resizeBy:0.5, ..)
3 - Are fluent interfaces over-used simply because they are currently popular?
4 - Or was it just a bad example that was chosen for the webcast? In that case, tell me what the advantages of such an approach are, where does it make sense to use it.
BTW: I know about jquery, and see how easy it makes things, so I'm not looking for comments about that or other existing examples.
I'm more looking for some (general) comments to help me understand (for example) when to implement a fluent interface (instead of a classical class-library), and what to watch out for when implementing one.
2 - Is this fluent interface approach just a replacement for the non-existent named method parameters in C#? Would named parameters make fluent interfaces obsolete, e.g. something similar to what Objective-C offers:
Well yes and no. The fluent interface gives you a larger amount of flexibility. Something that could not be achieved with named params is:
sizer.FromImage(i)
.ReduceByPercent(x)
.Pixalize()
.ReduceByPercent(x)
.OutputImageFormat(ImageFormat.Jpeg)
.ToLocation(o)
.Save();
The FromImage, ToLocation and OutputImageFormat in the fluent interface smell a bit to me. Instead, I would have done something along these lines, which I think is much clearer:
new Sizer("bob.jpeg")
.ReduceByPercent(x)
.Pixalize()
.ReduceByPercent(x)
.Save("file.jpeg",ImageFormat.Jpeg);
Fluent interfaces have the same problems many programming techniques have: they can be misused, overused or underused. I think that when this technique is used effectively it can create a richer and more concise programming model. Even StringBuilder supports it:
var sb = new StringBuilder();
sb.AppendLine("Hello")
.AppendLine("World");
I would say that fluent interfaces are slightly overdone and I would think that you have picked just one such example.
I find fluent interfaces particularly strong when you are constructing a complex model with it. With model I mean e.g. a complex relationship of instantiated objects. The fluent interface is then a way to guide the developer to correctly construct instances of the semantic model. Such a fluent interface is then an excellent way to separate the mechanics and relationships of a model from the "grammar" that you use to construct the model, essentially shielding details from the end user and reducing the available verbs to maybe just those relevant in a particular scenario.
Your example seems a bit like overkill.
I have lately built a fluent interface on top of the SplitterContainer from Windows Forms. Arguably, the semantic model of a hierarchy of controls is somewhat complex to construct correctly. By providing a small fluent API, a developer can now declaratively express how his SplitterContainer should work. Usage goes like:
var s = new SplitBoxSetup();
s.AddVerticalSplit()
.PanelOne().PlaceControl(()=> new Label())
.PanelTwo()
.AddHorizontalSplit()
.PanelOne().PlaceControl(()=> new Label())
.PanelTwo().PlaceControl(()=> new Panel());
form.Controls.Add(s.TopControl);
I have now reduced the complex mechanics of the control hierarchy to a couple of verbs that are relevant for the issue at hand.
Hope this helps
Consider:
sizer.ResizeImage(inputImage, outputImage, 0.5, ImageFormat.Jpeg);
What if you used less clear variable names:
sizer.ResizeImage(i, o, x, ImageFormat.Jpeg);
Imagine you've printed this code out. It's harder to infer what these arguments are, as you don't have access to the method signature.
With the fluent interface, this is clearer:
sizer.FromImage(i)
.ToLocation(o)
.ReduceByPercent(x)
.OutputImageFormat(ImageFormat.Jpeg)
.Save();
Also, the order of methods is not important. This is equivalent:
sizer.FromImage(i)
.ReduceByPercent(x)
.OutputImageFormat(ImageFormat.Jpeg)
.ToLocation(o)
.Save();
In addition, perhaps you might have defaults for the output image format, and the reduction, so this could become:
sizer.FromImage(i)
.ToLocation(o)
.Save();
This would require overloaded constructors to achieve the same effect.
It's one way to implement things.
For objects that do nothing but manipulate the same item over and over again, there's nothing really wrong with it. Consider C++ Streams: they're the ultimate in this interface. Every operation returns the stream again, so you can chain together another stream operation.
If you're doing LINQ, and doing manipulation of an object over and over, this makes some sense.
However, in your design, you have to be careful. What should the behavior be if you want to deviate halfway through? (i.e.,
var obj1 = original.Shrink(0.50); // obj1 is now 50% of original
var obj2 = original.Shrink(0.75); // is obj2 now 75% of obj1, or is it 75% of the original?
If obj2 was 75% of the original object, then that means you're making a full copy of the object every time (and has its advantages in many cases, like if you're trying to make two instances of the same thing, but slightly differently).
If the methods simply manipulate the original object, then this kind of syntax is somewhat disingenuous. Those are manipulations on the object instead of manipulations to create a changed object.
Not all classes work like this, nor does it make sense to do this kind of design. For example, this style of design would have little to no usefulness in the design of a hardware driver or the core of a GUI application. As long as the design involves nothing but manipulating some data, this pattern isn't a bad one.
You should read Domain-Driven Design by Eric Evans to get some idea of why a DSL is considered a good design choice.
Book is full of good examples, best practice advices and design patterns. Highly recommended.
It's possible to use a variation on a fluent interface to enforce certain combinations of optional parameters (e.g. require that at least one parameter from a group is present, and require that if a certain parameter is specified, some other parameter must be omitted). For example, one could provide functionality similar to Enumerable.Range, but with a syntax like IntRange.From(5).Upto(19), IntRange.From(5).LessThan(10).StepBy(2), or IntRange.From(3).Count(19).StepBy(17). Compile-time enforcement of overly complex parameter requirements may require the definition of an annoying number of intermediate-value structures or classes, but the approach can in some cases prove useful in simpler situations.
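A minimal sketch of that idea, using an intermediate struct so that From(...) must be completed with Upto(...) before the range can be enumerated (the names are illustrative, not an existing API):
using System.Collections.Generic;
public static class IntRange
{
    // From() returns an intermediate value; only Upto() yields a usable sequence.
    public static FromStep From(int start) { return new FromStep(start); }
    public struct FromStep
    {
        private readonly int start;
        internal FromStep(int start) { this.start = start; }
        public IEnumerable<int> Upto(int endInclusive)
        {
            for (int i = start; i <= endInclusive; i++)
                yield return i;
        }
    }
}
// Usage: IntRange.From(5).Upto(19) compiles; IntRange.From(5) alone cannot be enumerated.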
Further to #sam-saffron's suggestion regarding the flexibility of a Fluent Interface when adding a new operation:
If we needed to add a new operation, such as Pixalize(), then, in the 'method with multiple parameters' scenario, this would require a new parameter to be added to the method signature. This may then require a modification to every invocation of this method throughout the codebase in order to add a value for this new parameter (unless the language in use would allow an optional parameter).
Hence, one possible benefit of a Fluent Interface is limiting the impact of future change.
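As a sketch, adding the new step to a hypothetical Sizer builder is purely additive (the flag and method body are assumptions, not from the webcast's code):
// Existing chains that never call Pixalize() keep compiling and behaving as before.
public Sizer Pixalize()
{
    this.applyPixelation = true; // assumed private flag, consumed later by Save()
    return this;
}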
This question is specifically regarding C#, but I am also interested in answers for C++ and Java (or even other languages if they've got something cool).
I am replacing switch statements with polymorphism in some "C using C# syntax" code I've inherited. I've been puzzling over the best way to create these objects. I have two fall-back methods I tend to use. I would like to know if there are other viable alternatives I should be considering, or just get a sanity check that I'm actually going about this in a reasonable way.
The techniques I normally use:
Use an all-knowing method/class. This class will either populate a data structure (most likely a Map) or construct on-the-fly using a switch statement.
Use a blind-and-dumb class that uses a config file and reflection to create a map of instances/delegates/factories/etc. Then use map in a manner similar to above.
???
Is there a #3, #4... etc that I should strongly consider?
Some details... please note, the original design is not mine and my time is limited as far as rewriting/refactoring the entire thing.
Previous pseudo-code:
public string[] HandleMessage(object input) {
object parser = null;
string command = null;
if(input is XmlMessage) {
parser = new XmlMessageParser();
((XmlMessageParser)parser).setInput(input);
command = ((XmlMessageParser)parser).getCommand();
} else if(input is NameValuePairMessage) {
parser = new NameValuePairMessageParser();
((NameValuePairMessageParser)parser).setInput(input);
command = ((NameValuePairMessageParser)parser).getCommand();
} else if(...) {
//blah blah blah
}
string[] result = new string[3];
switch(command) {
case "Add":
result = Utility.AddData(parser);
break;
case "Modify":
result = Utility.ModifyData(parser);
break;
case ... //blah blah
break;
}
return result;
}
What I plan to replace that with (after much refactoring of the other objects) is something like:
public ResultStruct HandleMessage(IParserInput input) {
IParser parser = this.GetParser(input.Type); //either Type or a property
Map<string,string> parameters = parser.Parse(input);
ICommand command = this.GetCommand(parameters); //in future, may need multiple params
return command.Execute(parameters); //to figure out which object to return.
}
The question is what should the implementation of GetParser and GetCommand be?
Putting a switch statement there (or an invocation of a factory that consists of switch statements) doesn't seem like it really fixes the problem. I'm just moving the switch somewhere else... which maybe is fine, as it's no longer in the middle of my primary logic.
You may want to put your parser instantiators on the objects themselves, e.g.,
public interface IParserInput
{
...
IParser GetParser()
ICommand GetCommand()
}
Any parameters that GetParser needs should, theoretically, be supplied by your object.
What will happen is that the object itself will return those, and what happens with your code is:
public ResultStruct HandleMessage(IParserInput input)
{
IParser parser = input.GetParser();
Map<string,string> parameters = parser.Parse(input);
ICommand command = input.GetCommand();
return command.Execute(parameters);
}
Now this solution is not perfect. If you do not have access to the IParserInput objects, it might not work. But at least the responsibility of providing information on the proper handler now falls with the parsee, not the handler, which seems to be more correct at this point.
You can have an
public interface IParser<SomeType> : IParser{}
And set up StructureMap to look up a parser for "SomeType".
It seems that commands are related to the parser in the existing code; if you find that clean for your scenario, you might want to leave it as is and just ask the parser for the command.
Update 1: I re-read the original code. I think for your scenario it will probably be the least change to define an IParser as above, which has the appropriate GetCommand and SetInput.
The command/input piece, would look something along the lines:
public string[] HandleMessage<MessageType>(MessageType input) {
var parser = StructureMap.GetInstance<IParser<MessageType>>();
parser.SetInput(input);
var command = parser.GetCommand();
//do something about the rest
}
P.S. Actually, your implementation makes me feel that the old code, even without the ifs and the switch, had issues. Can you provide more info on what is supposed to happen in GetCommand in your implementation? Does the command actually vary with the parameters? I am unsure what to suggest for that part because of it.
I don't see any problems with a message handler like you have it. I certainly wouldn't go with the config file approach - why create a config file outside the debugger when you can have everything available at compile time?
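For instance, the "all-knowing" registration can be a plain dictionary that the compiler checks; the message and parser types below are the ones from the original code, and IParser is from the planned refactoring:
// Everything is resolved at compile time: no config file, no reflection,
// and the debugger can step straight into each factory lambda.
private static readonly Dictionary<Type, Func<IParser>> ParserFactories =
    new Dictionary<Type, Func<IParser>>
    {
        { typeof(XmlMessage), () => new XmlMessageParser() },
        { typeof(NameValuePairMessage), () => new NameValuePairMessageParser() },
    };
private IParser GetParser(Type messageType)
{
    return ParserFactories[messageType]();
}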
The third alternative would be to discover the possible commands at runtime in a decentralized way.
For example, Spring can do this in Java using so-called “classpath scanning”, reflection and annotations — Spring parses all classes in the package(s) you specify, picks the ones annotated with @Controller, @Resource etc. and registers them as beans.
Classpath scanning in Java relies on directory entries being added to JAR archives (so that Spring can enumerate the contents of various classpath directories).
I don't know about C#, but there should be a similar technique there: probably you can enumerate a list of classes in your assembly, and pick some of them based on some criteria (naming convention, annotation, whatever).
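In C#, a rough equivalent of that scanning might look like this (the naming convention used as the lookup key is just an example):
// Scan the current assembly for concrete IParser implementations and key them
// by the message type they handle, assuming a "<MessageType>Parser" convention.
static Dictionary<string, IParser> DiscoverParsers()
{
    return Assembly.GetExecutingAssembly()
        .GetTypes()
        .Where(t => !t.IsAbstract && typeof(IParser).IsAssignableFrom(t))
        .ToDictionary(
            t => t.Name.Replace("Parser", string.Empty),
            t => (IParser)Activator.CreateInstance(t));
}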
Now, this is just a third option to have in mind for the sake of having a third option in mind. I doubt it should actually be used in practice. Your first alternative (just write a piece of code that knows about all the classes) should be the default choice unless you have a compelling reason to do otherwise.
In the decentralised world of OOP, where each class is a little piece of the puzzle, there has to be some “integration code” that knows how to put these pieces together. There's nothing wrong about having such “all-knowing” classes (as long as you limit them to application-level and subsystem-level integration code only).
Whichever way you choose (hard-code the possible choices in a class, read a config file or use reflection to discover the choices), it's all the same story, does not really matter, and can easily be changed at any time.
Have fun!