DebuggerDisplay resolve to string at runtime - c#

Is there a way to access the string shown by DebuggerDisplayAttribute at runtime?
For our business objects i try to get automatic debugger information on exception handling. the actual object that was used while the exception was caught should be serialized to text to enhance the exception message. Since some attributes have other business objects as type, this could get really long if used recursively. Therefore i'd like to serialize to just the information that is already defined in DebuggerDisplay attributes of the class. The ToString() implementation of the classes can differ and are not usable for this task.
So is it possible to get the string that is shown in the debugger at runtime?

I don't think so (at least not without some effort on your part) - I've just done a bit of digging around and found this an article about Debugger Display Best Practices. It's not directly related, but it does highlight one thing:
Each property {expression hole} must
be evaluated individually and done so
once for every instance of this type
in every debugger display window.
I expect that it's using the debugger to do the evaluation once the code has been broken into (kind of similar to how you would use the immediate window to evaluate a statement when you're at a breakpoint).
The long and short of it is that the resulting debugger display value for an object is not available to you at runtime, unless you're willing to parse each of the expression holes and use reflection to evaluate them yourself.
The article suggests that the most efficient way to provide debugger output is to have a private method do a String.Format over all the properties you want to display. You might want to consider making this a public method (maybe on an interface) and use this to retrieve your exception information from.

Probably there is some way to extract that information, but wouldn't it be easier to redefine those classes with a property like this:
[DebuggerDisplay("{InfoProperty}")]
class X {
public string InfoProperty {
get { return "Debug and display info here"; }
}
}
Then you include that InfoProperty in your error messages / logs instead of digging the way the data for display is reconstructed by Visual Studio.
Of course I am assuming that you can modify the business object classes, which might not be the case...

Technically, sure, it's possible - you could access the DebuggerDisplayAttribute at runtime with Reflection and write some code that parses the string and again uses Reflection to get the values.
This won't work if you've got anything but properties and fields inside those curly braces, though.
In any case, I strongly suggest you heed Mike or Paolo's advice -if there are hundred of classes you need to change - then find a way to change them automatically - either with something like Resharper's Structural
Search and Replace, or a Regular Expression - it shouldn't take too long.

Related

Should I throw on null parameters in private/internal methods?

I'm writing a library that has several public classes and methods, as well as several private or internal classes and methods that the library itself uses.
In the public methods I have a null check and a throw like this:
public int DoSomething(int number)
{
if (number == null)
{
throw new ArgumentNullException(nameof(number));
}
}
But then this got me thinking, to what level should I be adding parameter null checks to methods? Do I also start adding them to private methods? Should I only do it for public methods?
Ultimately, there isn't a uniform consensus on this. So instead of giving a yes or no answer, I'll try to list the considerations for making this decision:
Null checks bloat your code. If your procedures are concise, the null guards at the beginning of them may form a significant part of the overall size of the procedure, without expressing the purpose or behaviour of that procedure.
Null checks expressively state a precondition. If a method is going to fail when one of the values is null, having a null check at the top is a good way to demonstrate this to a casual reader without them having to hunt for where it's dereferenced. To improve this, people often use helper methods with names like Guard.AgainstNull, instead of having to write the check each time.
Checks in private methods are untestable. By introducing a branch in your code which you have no way of fully traversing, you make it impossible to fully test that method. This conflicts with the point of view that tests document the behaviour of a class, and that that class's code exists to provide that behaviour.
The severity of letting a null through depends on the situation. Often, if a null does get into the method, it'll be dereferenced a few lines later and you'll get a NullReferenceException. This really isn't much less clear than throwing an ArgumentNullException. On the other hand, if that reference is passed around quite a bit before being dereferenced, or if throwing an NRE will leave things in a messy state, then throwing early is much more important.
Some libraries, like .NET's Code Contracts, allow a degree of static analysis, which can add an extra benefit to your checks.
If you're working on a project with others, there may be existing team or project standards covering this.
If you're not a library developer, don't be defensive in your code
Write unit tests instead
In fact, even if you're developing a library, throwing is most of the time: BAD
1. Testing null on int must never be done in c# :
It raises a warning CS4072, because it's always false.
2. Throwing an Exception means it's exceptional: abnormal and rare.
It should never raise in production code. Especially because exception stack trace traversal can be a cpu intensive task. And you'll never be sure where the exception will be caught, if it's caught and logged or just simply silently ignored (after killing one of your background thread) because you don't control the user code. There is no "checked exception" in c# (like in java) which means you never know - if it's not well documented - what exceptions a given method could raise. By the way, that kind of documentation must be kept in sync with the code which is not always easy to do (increase maintenance costs).
3. Exceptions increases maintenance costs.
As exceptions are thrown at runtime and under certain conditions, they could be detected really late in the development process. As you may already know, the later an error is detected in the development process, the more expensive the fix will be. I've even seen exception raising code made its way to production code and not raise for a week, only for raising every day hereafter (killing the production. oops!).
4. Throwing on invalid input means you don't control input.
It's the case for public methods of libraries. However if you can check it at compile time with another type (for example a non nullable type like int) then it's the way to go. And of course, as they are public, it's their responsibility to check for input.
Imagine the user who uses what he thinks as valid data and then by a side effect, a method deep in the stack trace trows a ArgumentNullException.
What will be his reaction?
How can he cope with that?
Will it be easy for you to provide an explanation message ?
5. Private and internal methods should never ever throw exceptions related to their input.
You may throw exceptions in your code because an external component (maybe Database, a file or else) is misbehaving and you can't guarantee that your library will continue to run correctly in its current state.
Making a method public doesn't mean that it should (only that it can) be called from outside of your library (Look at Public versus Published from Martin Fowler). Use IOC, interfaces, factories and publish only what's needed by the user, while making the whole library classes available for unit testing. (Or you can use the InternalsVisibleTo mechanism).
6. Throwing exceptions without any explanation message is making fun of the user
No need to remind what feelings one can have when a tool is broken, without having any clue on how to fix it. Yes, I know. You comes to SO and ask a question...
7. Invalid input means it breaks your code
If your code can produce a valid output with the value then it's not invalid and your code should manage it. Add a unit test to test this value.
8. Think in user terms:
Do you like when a library you use throws exceptions for smashing your face ? Like: "Hey, it's invalid, you should have known that!"
Even if from your point of view - with your knowledge of the library internals, the input is invalid, how you can explain it to the user (be kind and polite):
Clear documentation (in Xml doc and an architecture summary may help).
Publish the xml doc with the library.
Clear error explanation in the exception if any.
Give the choice :
Look at Dictionary class, what do you prefer? what call do you think is the fastest ? What call can raises exception ?
Dictionary<string, string> dictionary = new Dictionary<string, string>();
string res;
dictionary.TryGetValue("key", out res);
or
var other = dictionary["key"];
9. Why not using Code Contracts ?
It's an elegant way to avoid the ugly if then throw and isolate the contract from the implementation, permitting to reuse the contract for different implementations at the same time. You can even publish the contract to your library user to further explain him how to use the library.
As a conclusion, even if you can easily use throw, even if you can experience exceptions raising when you use .Net Framework, that doesn't mean it could be used without caution.
Here are my opinions:
General Cases
Generally speaking, it is better to check for any invalid inputs before you process them in a method for robustness reason - be it private, protected, internal, protected internal, or public methods. Although there are some performance costs paid for this approach, in most cases, this is worth doing rather than paying more time to debug and to patch the codes later.
Strictly Speaking, however...
Strictly speaking, however, it is not always needed to do so. Some methods, usually private ones, can be left without any input checking provided that you have full guarantee that there isn't single call for the method with invalid inputs. This may give you some performance benefit, especially if the method is called frequently to do some basic computation/action. For such cases, doing checking for input validity may impair the performance significantly.
Public Methods
Now the public method is trickier. This is because, more strictly speaking, although the access modifier alone can tell who can use the methods, it cannot tell who will use the methods. More over, it also cannot tell how the methods are going to be used (that is, whether the methods are going to be called with invalid inputs in the given scopes or not).
The Ultimate Determining Factor
Although access modifiers for methods in the code can hint on how to use the methods, ultimately, it is humans who will use the methods, and it is up to the humans how they are going to use them and with what inputs. Thus, in some rare cases, it is possible to have a public method which is only called in some private scope and in that private scope, the inputs for the public methods are guaranteed to be valid before the public method is called.
In such cases then, even the access modifier is public, there isn't any real need to check for invalid inputs, except for robust design reason. And why is this so? Because there are humans who know completely when and how the methods shall be called!
Here we can see, there is no guarantee either that public method always require checking for invalid inputs. And if this is true for public methods, it must also be true for protected, internal, protected internal, and private methods as well.
Conclusions
So, in conclusion, we can say a couple of things to help us making decisions:
Generally, it is better to have checks for any invalid inputs for robust design reason, provided that performance is not at stake. This is true for any type of access modifiers.
The invalid inputs check could be skipped if performance gain could be significantly improved by doing so, provided that it can also be guaranteed that the scope where the methods are called are always giving the methods valid inputs.
private method is usually where we skip such checking, but there is no guarantee that we cannot do that for public method as well
Humans are the ones who ultimately use the methods. Regardless of how the access modifiers can hint the use of the methods, how the methods are actually used and called depend on the coders. Thus, we can only say about general/good practice, without restricting it to be the only way of doing it.
The public interface of your library deserves tight checking of preconditions, because you should expect the users of your library to make mistakes and violate the preconditions by accident. Help them understand what is going on in your library.
The private methods in your library do not require such runtime checking because you call them yourself. You are in full control of what you are passing. If you want to add checks because you are afraid to mess up, then use asserts. They will catch your own mistakes, but do not impede performance during runtime.
Though you tagged language-agnostic, it seems to me that it probably doesn't exist a general response.
Notably, in your example you hinted the argument: so with a language accepting hinting it'll fire an error as soon as entering the function, before you can take any action.
In such a case, the only solution is to have checked the argument before calling your function... but since you're writing a library, that cannot have sense!
In the other hand, with no hinting, it remains realistic to check inside the function.
So at this step of the reflexion, I'd already suggest to give up hinting.
Now let's go back to your precise question: to what level should it be checked?
For a given data piece it'd happen only at the highest level where it can "enter" (may be several occurrences for the same data), so logically it'd concern only public methods.
That's for the theory. But maybe you plan a huge, complex, library so it might be not easy to ensure having certainty about registering all "entry points".
In this case, I'd suggest the opposite: consider to merely apply your controls everywhere, then only omit it where you clearly see it's duplicate.
Hope this helps.
In my opinion you should ALWAYS check for "invalid" data - independent whether it is a private or public method.
Looked from the other way... why should you be able to work with something invalid just because the method is private? Doesn't make sense, right? Always try to use defensive programming and you will be happier in life ;-)
This is a question of preference. But consider instead why are you checking for null or rather checking for valid input. It's probably because you want to let the consumer of your library to know when he/she is using it incorrectly.
Let's imagine that we have implemented a class PersonList in a library. This list can only contain objects of the type Person. We have also on our PersonList implemented some operations and therefore we do not want it to contain any null values.
Consider the two following implementations of the Add method for this list:
Implementation 1
public void Add(Person item)
{
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Implementation 2
public void Add(Person item)
{
if(item == null)
{
throw new ArgumentNullException("Cannot add null to PersonList");
}
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Let's say we go with implementation 1
Null values can now be added in the list
All opoerations implemented on the list will have to handle theese null values
If we should check for and throw a exception in our operation, consumer will be notified about the exception when he/she is calling one of the operations and it will at this state be very unclear what he/she has done wrong (it just wouldn't make any sense to go for this approach).
If we instead choose to go with implementation 2, we make sure input to our library has the quality that we require for our class to operate on it. This means we only need to handle this here and then we can forget about it while we are implementing our other operations.
It will also become more clear for the consumer that he/she is using the library in the wrong way when he/she gets a ArgumentNullException on .Add instead of in .Sort or similair.
To sum it up my preference is to check for valid argument when it is being supplied by the consumer and it's not being handled by the private/internal methods of the library. This basically means we have to check arguments in constructors/methods that are public and takes parameters. Our private/internal methods can only be called from our public ones and they have allready checked the input which means we are good to go!
Using Code Contracts should also be considered when verifying input.

When to use DebuggerDisplayAttribute

What are some best practices around the DebuggerDisplayAttribute? What guides your decisions on when and how to apply the attribute to your code? For example..
Do you find DebuggerDisplayAttribute more useful on some types of objects (i.e. custom data structures) rather than others?
Do you define it on public types, internal types, or both?
Will you generally add it to the initial implementation, or wait for a tester/user to request it?
When is it better to define DebuggerDisplayAttribute and when does it make more sense to override .ToString()?
Do you have guidelines on how much data you expose in the attribute, or limits on the amount of computation to include?
Do any inheritance rules apply that would make it more beneficial to apply on base classes?
Is there anything else to consider when deciding when or how to use it?
It's subjective and I'd hesitate to say there are any best practices, but:
Do you find DebuggerDisplayAttribute more useful on some types of objects (i.e. custom data structures) rather than others?
By far the most common use is types that represent business entities - and I'll commonly display ID + name. Also any types that will be stored in collections in the application.
Other than that I add it whenever I find myself frequently searching for properties in the debugger.
2.Do you define it on public types, internal types, or both?
Both.
3.Will you generally add it to the initial implementation, or wait for a tester/user to request it?
Testers/users won't ever see it - it's only used while debugging.
4.When is it better to define DebuggerDisplayAttribute and when does it make more sense to override .ToString()?
Override ToString() when you want the representation at runtime, either for logging or application-specific purposes. Use DebuggerDisplayAttribute if you only need it for debugging.
5.Do you have guidelines on how much data you expose in the attribute, or limits on the amount of computation to include?
As it's not used at runtime, the only constraint is that it should be fast enough not to impede the debugging experience (especially when called multiple times for elements of a collection).
You don't need to be concerned about exposing sensitive data as you would with runtime logging (e.g. by overriding .ToString), because such data will be visible anyway in the debugger.
6.Do any inheritance rules apply that would make it more beneficial to apply on base classes?
No, apply it on the classes you need it.
7.Is there anything else to consider when deciding when or how to use it?
Nothing else I can think of.
Debugging mode without DebuggerDisplay Attribute
Debugging mode with DebuggerDisplay Attribute
[DebuggerDisplay("{Name,nq}")]//nq suffix means no quotes
public class Product {
public int Id { get; set; }
public string Name { get; set; }
//Other members of Northwind.Product
}
DebuggerDisplay attribute best practices
Tell the debugger what to show using the DebuggerDisplay Attribute (C#, Visual Basic, F#, C++/CLI)
Debugger/Diagnostics Tips & Tricks in Visual Studio 2019
Although the attribute is quite old, you should watch the applause and the narrator's reaction :) By the way, if you want to see more debugger tricks, you might want to see this demo at your free time.
I use it a lot when I know that a code section will require a lot of debugging. It saves some time when browsing the objects in the debugger, especially if you are using expressions like "{ChildCollection.Count}". It gives you a quick idea of the data you are looking at.
I almost always put it on class that will end up in collections so that it is really quick to see each item and not just a bunch of MyNamespace.MyClass elements that you have to expand.
My opinion is that ToString() is used to provide an end-user representation of the data. DebuggerDisplay is for developers, you can decide to show the element ID, some additional internal/private properties.
DebuggerDisplay has value for any class which doesn't have a meaningful .ToString() implementation, but I personally haven't seen anyone proactively write the attributes until they are needed.
In general, Omer's link to best practices looks like sound advice; however I personally would lean away from the suggestion to have a dedicated DebuggerDisplay() method--even though it's private it seems to offer little benefit over the attribute aside from removing magic strings.

Is it good form to expose derived values as properties?

I need to derive an important value given 7 potential inputs. Uncle Bob urges me to avoid functions with that many parameters, so I've extracted the class. All parameters now being properties, I'm left with a calculation method with no arguments.
“That”, I think, “could be a property, but I'm not sure if that's idiomatic C#.”
Should I expose the final result as a property, or as a method with no arguments? Would the average C# programmer find properties confusing or offensive? What about the Alt.Net crowd?
decimal consumption = calculator.GetConsumption(); // obviously derived
decimal consumption = calculator.Consumption; // not so obvious
If the latter: should I declare interim results as [private] properties, also? Thanks to heavy method extraction, I have several interim results. Many of these shouldn't be part of the public API. Some of them could be interesting, though, and my expressions would look cleaner if I could access them as properties:
decimal interim2 = this.ImportantInterimValue * otherval;
Happy Experiment Dept.:
While debugging my code in VS2008, I noticed that I kept hovering my mouse over the method calls that compute interim results, expecting a hover-over with their return value. After turning all methods into properties, I found that exposing interim results as properties greatly assisted debugging. I'm well pleased with that, but have lingering concerns about readability.
The interim value declarations look messier. The expressions, however, are easier to read without the brackets. I no longer feel compelled to start the method name with a verb. To contrast:
// Clean method declaration; compulsive verby name; callers need
// parenthesis despite lack of any arguments.
decimal DetermineImportantInterimValue() {
return this.DetermineOtherInterimValue() * this.SomeProperty;
}
// Messier property declaration; clean name; clean access syntax
decimal ImportantInterimValue {
get {
return this.OtherInterimValue * this.SomeProperty;
}
}
I should perhaps explain that I've been coding in Python for a decade. I've been left with a tendency to spend extra time making my code easier to call than to write. I'm not sure the Python community would regard this property-oriented style as acceptably “Pythonic”, however:
def determineImportantInterimValue(self):
"The usual way of doing it."
return self.determineOtherInterimValue() * self.someAttribute
importantInterimValue = property(
lambda self => self.otherInterimValue * self.someAttribute,
doc = "I'm not sure if this is Pythonic...")
The important question here seems to be this:
Which one produces more legible, maintainable code for you in the long run?
In my personal opinion, isolating the individual calculations as properties has a couple of distinct advantages over a single monolothic method call:
You can see the calculations as they're performed in the debugger, regardless of the class method you're in. This is a boon to productivity while you're debugging the class.
If the calculations are discrete, the properties will execute very quickly, which means (in my opinion), they observe the rules for property design. It's absurd to think that a guideline for design should be treated as a straightjacket. Remember: There is no silver bullet.
If the calculations are marked private or internal, they do not add unnecessary complexity to consumers of the class.
If all of the properties are discrete enough, compiler inlining may resolve the performance issues for you.
Finally, if the final method that returns your final calculation is far and away easier to maintain and understand because you can read it, that is an utterly compelling argument in and of itself.
One of the best things you can do is think for yourself and dare to challenge the preconceived One Size Fits All notions of our peers and predecessors. There are exceptions to every rule. This case may very well be one of them.
Postscript:
I do not believe that we should abandon standard property design in the vast majority of cases. But there are cases where deviating from The Standard(TM) is called for, because it makes sense to do so.
Personally, I would prefer if you make your public API as a method instead of property. Properties are supposed to be as 'fast' as possible in C#. More details on this discussion: Properties vs Methods
Internally, GetConsumption can use any number of private properties to arrive at the result, choice is yours.
I usually go by what the method or property will do. If it is something that is going to take a little time, I'll use a method. If it's very quick or has a very small number of operations going on behind the scenes, I'll make it a property.
I use to use methods to denote any action on the object or which changes the state of an object. so, in this case I would name the function as CalculateConsumption() which computes the values from other properties.
You say you are deriving a value from seven inputs, you have implemented seven properties, one for each input, and you have a property getter for the result. Some things you might want to consider are:
What happens if the caller fails to set one or more of the seven "input" properties? Does the result still make sense? Will an exception be thrown (e.g. divide by zero)?
In some cases the API may be less discoverable. If I must call a method that takes seven parameters, I know that I must supply all seven parameters to get the result. And if some of the parameters are optional, different overloads of the method make it clear which ones.
In contrast, it may not be so clear that I have to set seven properties before accessing the "result" property, and could be easy to forget one.
When you have a method with several parameters, you can more easily have richer validation. For example, you could throw an ArgumentException if "parameter A and parameter B are both null".
If you use properties for your inputs, each property will be set independently, so you can't perform the validation when the inputs are being set - only when the result property is being dereferenced, which may be less intuitive.

Should you use accessor properties from within the class, or just from outside of the class? [duplicate]

This question already has answers here:
What is the best way to access properties from the same class, via accessors or directly? [closed]
(5 answers)
Closed 9 years ago.
I have a class 'Data' that uses a getter to access some array. If the array is null, then I want Data to access the file, fill up the array, and then return the specific value.
Now here's my question:
When creating getters and setters should you also use those same accessor properties as your way of accessing that array (in this case)? Or should you just access the array directly?
The problem I am having using the accessors from within the class is that I get infinite loops as the calling class looks for some info in Data.array, the getter finds the array null so goes to get it from the file, and that function ends up calling the getter again from within Data, array is once again null, and we're stuck in an infinite loop.
EDIT:
So is there no official stance on this? I see the wisdom in not using Accessors with file access in them, but some of you are saying to always use accessors from within a class, and others are saying to never use accessors from with the class............................................
I agree with krosenvold, and want to generalize his advice a bit:
Do not use Property getters and setters for expensive operations, like reading a file or accessing the network. Use explicit function calls for the expensive operations.
Generally, users of the class will not expect that a simple property retrieval or assignment may take a lot of time.
This is also recommended in Microsoft's Framework Design Guidelines.;
Do use a method, rather than a
property, in the following situations.
The operation is orders of magnitude
slower than a field set would be. If
you are even considering providing an
asynchronous version of an operation
to avoid blocking the thread, it is
very likely that the operation is too
expensive to be a property. In
particular, operations that access the
network or the file system (other than
once for initialization) should most
likely be methods, not properties.
I think its a good idea to always use the accessors. Then if you need any special logic when getting or setting the property, you know that everything is performing that logic.
Can you post the getter and setter for one of these properties? Maybe we can help debug it.
I have written a getter that opens a file and always regretted it later. Nowdays I would never solve that problem by lazy-constructing through the getter - period. There's the issue of getters with side-effects where people don't expect all kinds of crazy activity to be going on behind the getter. Furthermore you probably have to ensure thread safety, which can further pollute this code. Unit-Testing can also become slightly harder each time you do this.
Explicit construction is a much better solution than all sorts of lazy-init getters. It may be because I'm using DI frameworks that give me all of this as part of the standard usage patterns. I really try to treat construction logic as distinctly as possible and not hide too much, it makes code easier to understand.
No. I don't believe you should, the reason: maintainable code.
I've seen people use properties within the defining class and at first all looks well. Then someone else comes along and adds features to the properties, then someone else comes along and tries to change the class, they don't fully understand the class and all hell breaks loose.
It shouldn't because maintenance teams should fully understand what they are trying to change but they are often looking at a different problem or error and the encapsulated property often escapes them. I've see this a lot and so never use properties internally.
They can also be a performance hog, what should be a simple lookup can turn nasty if someone puts database code in the properties - and I have seen people do that too!
The KISS principle is still valid after all these years...!
Aside from the point made by others, whether to use an accessor or a field directly may need to be informed by semantics. Some times the semantics of an external consumer accessing a property is different from the mechanical necessity of accessing its value by internal code.
Eric Lippert recently blogged on this subject in a couple of posts:-
automatic-vs-explicit-properties
future-proofing-a-design
If using an Get method leads to this kind of error, you should access the value directly. Otherwise, it is good practice to use your accessors. If you should modify either the getter or setter to take specific actions in the future, you'll break your object if you fail to use that path.
I guess what you are trying to implement is some sort of a lazy-loading property, where you load the data only when it is accessed for the first time.
In such a case I would use the following approach to prevent the infinite loop:
private MyData _data = null;
public MyData Data
{
get
{
if (_data == null)
_data = LoadDataFromFile();
return _data;
}
}
private MyData LoadDataFromFile()
{
// ...
}
In other words:
don't implement a setter
always use the property to access the data (never use the field directly)
You should always use the accessors, but the function that reads the value from the file (which should be private, and called something like getValueFromFile) should only be called when the value has to be read from the file, and should just read the file and return the value(s). That function might even be better off in another class, dedicated to reading values from your data file.
If I am understanding it right, you are trying to access a property from within it's implementation (by using a method that calls the same property in the property's implementation code). I am not sure if there any official standards regarding this, but I would consider it a bad practice, unless there would be a specific need to do it.
I always prefer using private members within a class instead of properties, unless I need the functionality property implementation provides.

Generating classes automatically from unit tests?

I am looking for a tool that can take a unit test, like
IPerson p = new Person();
p.Name = "Sklivvz";
Assert.AreEqual("Sklivvz", p.Name);
and generate, automatically, the corresponding stub class and interface
interface IPerson // inferred from IPerson p = new Person();
{
string Name
{
get; // inferred from Assert.AreEqual("Sklivvz", p.Name);
set; // inferred from p.Name = "Sklivvz";
}
}
class Person: IPerson // inferred from IPerson p = new Person();
{
private string name; // inferred from p.Name = "Sklivvz";
public string Name // inferred from p.Name = "Sklivvz";
{
get
{
return name; // inferred from Assert.AreEqual("Sklivvz", p.Name);
}
set
{
name = value; // inferred from p.Name = "Sklivvz";
}
}
public Person() // inferred from IPerson p = new Person();
{
}
}
I know ReSharper and Visual Studio do some of these, but I need a complete tool -- command line or whatnot -- that automatically infers what needs to be done.
If there is no such tool, how would you write it (e.g. extending ReSharper, from scratch, using which libraries)?
What you appear to need is a parser for your language (Java), and a name and type resolver. ("Symbol table builder").
After parsing the source text, a compiler usually has a name resolver, that tries to record the definition of names and their corresponding types, and a type checker, that verifies that each expression has a valid type.
Normally the name/type resolver complains when it can't find a definition. What you want it to do is to find the "undefined" thing that is causing the problem, and infer a type for it.
For
IPerson p = new Person();
the name resolver knows that "Person" and "IPerson" aren't defined. If it were
Foo p = new Bar();
there would be no clue that you wanted an interface, just that Foo is some kind of abstract parent of Bar (e.g., a class or an interface). So the decision as which is it must be known to the tool ("whenever you find such a construct, assume Foo is an interface ..."). You could use a heuristic: IFoo and Foo means IFoo should be an interface, and somewhere somebody has to define Foo as a class realizing that interface. Once the
tool has made this decision, it would need to update its symbol tables so that it can
move on to other statements:
For
p.Name = "Sklivvz";
given that p must be an Interface (by the previous inference), then Name must be a field member, and it appears its type is String from the assignment.
With that, the statement:
Assert.AreEqual("Sklivvz", p.Name);
names and types resolve without further issue.
The content of the IFoo and Foo entities is sort of up to you; you didn't have to use get and set but that's personal taste.
This won't work so well when you have multiple entities in the same statement:
x = p.a + p.b ;
We know a and b are likely fields, but you can't guess what numeric type if indeed they are numeric, or if they are strings (this is legal for strings in Java, dunno about C#).
For C++ you don't even know what "+" means; it might be an operator on the Bar class.
So what you have to do is collect constraints, e.g., "a is some indefinite number or string", etc. and as the tool collects evidence, it narrows the set of possible constraints. (This works like those word problems: "Joe has seven sons. Jeff is taller than Sam. Harry can't hide behind Sam. ... who is Jeff's twin?" where you have to collect the evidence and remove the impossibilities). You also have to worry about the case where you end up with a contradiction.
You could rule out p.a+p.b case, but then you can't write your unit tests with impunity. There are standard constraint solvers out there if you want impunity. (What a concept).
OK, we have the ideas, now, can this be done in a practical way?
The first part of this requires a parser and a bendable name and type resolver. You need a constraint solver or at least a "defined value flows to undefined value" operation (trivial constraint solver).
Our DMS Software Reengineering Toolkit with its Java Front End could probably do this. DMS is a tool builder's tool, for people that want to build tools that process computer langauges in arbitrary ways. (Think of "computing with program fragments rather than numbers").
DMS provides general purpose parsing machinery, and can build an tree for whatever front end it is given (e.g., Java, and there's a C# front end).
The reason I chose Java is that our Java front end has all that name and type resolution machinery, and it is provided in source form so it can be bent. If you stuck to the trivial constraint solver, you could probably bend the Java name resolver to figure out the types. DMS will let you assemble trees that correspond to code fragments, and coalesce them into larger ones; as your tool collected facts for the symbol table, it could build the primitive trees.
Somewhere, you have to decide you are done. How many unit tests the tool have to see
before it knows the entire interface? (I guess it eats all the ones you provide?).
Once complete, it assembles the fragments for the various members and build an AST for an interface; DMS can use its prettyprinter to convert that AST back into source code like you've shown.
I suggest Java here because our Java front end has name and type resolution. Our C# front end does not. This is a "mere" matter of ambition; somebody has to write one, but that's quite a lot of work (at least it was for Java and I can't imagine C# is really different).
But the idea works fine in principle using DMS.
You could do this with some other infrastructure that gave you access to a parser and an a bendable name and type resolver. That might not be so easy to get for C#; I suspect MS may give you a parser, and access to name and type resolution, but not any way to change that. Maybe Mono is the answer?
You still need a was to generate code fragments and assemble them. You might try to do this by string hacking; my (long) experience with gluing program bits together is that if you do it with strings you eventually make a mess of it. You really want pieces that represent code fragments of known type, that can only be combined in ways the grammar allows; DMS does that thus no mess.
Its amazing how no one really gave anything towards what you were asking.
I dont know the answer, but I will give my thoughts on it.
If I were to attempt to write something like this myself I would probably see about a resharper plugin. The reason I say that is because as you stated, resharper can do it, but in individual steps. So I would write something that went line by line and applied the appropriate resharper creation methods chained together.
Now by no means do I even know how to do this, as I have never built anything for resharper, but that is what I would try to do. It makes logical sense that it could be done.
And if you do write up some code, PLEASE post it, as I could find that usefull as well, being able to generate the entire skeleton in one step. Very useful.
If you plan to write your own implementation I would definately suggest that you take a look at the NVelocity (C#) or Velocity (Java) template engines.
I have used these in a code generator before and have found that they make the job a whole lot easier.
It's doable - at least in theory. What I would do is use something like csparser to parse the unit test (you cannot compile it, unfortunately) and then take it from there. The only problem I can see is that what you are doing is wrong in terms of methodology - it makes more sense to generate unit tests from entity classes (indeed, Visual Studio does precisely this) than doing it the other way around.
I think a real solution to this problem would be a very specialized parser. Since that's not so easy to do, I have a cheaper idea. Unfortunately, you'd have to change the way you write your tests (namely, just the creation of the object):
dynamic p = someFactory.Create("MyNamespace.Person");
p.Name = "Sklivvz";
Assert.AreEqual("Sklivvz", p.Name);
A factory object would be used. If it can find the named object, it will create it and return it (this is the normal test execution). If it doesn't find it, it will create a recording proxy (a DynamicObject) that will record all calls and at the end (maybe on tear down) could emit class files (maybe based on some templates) that reflect what it "saw" being called.
Some disadvantages that I see:
Need to run the code in "two" modes, which is annoying.
In order for the proxy to "see" and record calls, they must be executed; so code in a catch block, for example, has to run.
You have to change the way you create your object under test.
You have to use dynamic; you'll lose compile-time safety in subsequent runs and it has a performance hit.
The only advantage that I see is that it's a lot cheaper to create than a specialized parser.
I like CodeRush from DevExpress. They have a huge customizable templating engine. And the best for me their is no Dialog boxes. They also have functionality to create methods and interfaces and classes from interface that does not exist.
Try looking at the Pex , A microsoft project on unit testing , which is still under research
research.microsoft.com/en-us/projects/Pex/
I think what you are looking for is a fuzzing tool kit (https://en.wikipedia.org/wiki/Fuzz_testing).
Al tough I never used, you might give Randoop.NET a chance to generate 'unit tests' http://randoop.codeplex.com/
Visual Studio ships with some features that can be helpful for you here:
Generate Method Stub. When you write a call to a method that doesn't exist, you'll get a little smart tag on the method name, which you can use to generate a method stub based on the parameters you're passing.
If you're a keyboard person (I am), then right after typing the close parenthesis, you can do:
Ctrl-. (to open the smart tag)
ENTER (to generate the stub)
F12 (go to definition, to take you to the new method)
The smart tag only appears if the IDE thinks there isn't a method that matches. If you want to generate when the smart tag isn't up, you can go to Edit->Intellisense->Generate Method Stub.
Snippets. Small code templates that makes it easy to generate bits of common code. Some are simple (try "if[TAB][TAB]"). Some are complex ('switch' will generate cases for an enum). You can also write your own. For your case, try "class" and "prop".
See also "How to change “Generate Method Stub” to throw NotImplementedException in VS?" for information snippets in the context of GMS.
autoprops. Remember that properties can be much simpler:
public string Name { get; set; }
create class. In Solution Explorer, RClick on the project name or a subfolder, select Add->Class. Type the name of your new class. Hit ENTER. You'll get a class declaration in the right namespace, etc.
Implement interface. When you want a class to implement an interface, write the interface name part, activate the smart tag, and select either option to generate stubs for the interface members.
These aren't quite the 100% automated solution you're looking for, but I think it's a good mitigation.
I find that whenever I need a code generation tool like this, I am probably writing code that could be made a little bit more generic so I only need to write it once. In your example, those getters and setters don't seem to be adding any value to the code - in fact, it is really just asserting that the getter/setter mechanism in C# works.
I would refrain from writing (or even using) such a tool before understanding what the motivations for writing these kinds of tests are.
BTW, you might want to have a look at NBehave?
I use Rhino Mocks for this, when I just need a simple stub.
http://www.ayende.com/wiki/Rhino+Mocks+-+Stubs.ashx

Categories

Resources