Nested Validation - c#

I'm wondering what the best approach is to take with validation.
If I have a complex object object composed of primitives, associations, and collections of other custom objects, should an IsValid() method validate the child objects as well as the required fields/keys of the root object?
If yes, should this be in some sort of abstract class, or is it best to use interfaces? With abstract, I would need to cast my child object interfaces to their concrete class definition in order to use an abstract method, whereas with interface validation I believe I can keep my children as interfaces as I call their validation methods.
Also, I'm not using MVC, but MVP with web forms (and trying to employ DDD principles).
Thanks.
UPDATE
I have a Aggregate root of ScheduledMeeting:
class ScheduledMeeting : BaseValidation
{
ScheduledMeetingID {get;set;}
ITimeSlot TimeSlot {get;set;}
IList<IMeetingAssignee> Assignees{get;set;}
DateTime meetingDate {get;set;}
AssignEmployees(IList<IEmployees> employees){}
}
Currently, there exists an abstract class of BaseValidation, which looks similar to the following:
public bool isValid(bool validateKeys)
{
if (validateKeys)
{
ValidateRequiredFields();
ValidateKeys();
}
else
{
ValidateRequiredFields();
}
return true;
}
where ValidateRequiredFields() and ValidateKeys() are overridden in implementing objects.
If I'm to use the above and cascade to IMeetingAssigned, I would need to loop in both ValidateKeys() and ValidateRequiredKeys() in ScheduledMeeting, casting IMeetingAssigned to concrete MeetingAssigned before then calling either ValidateKeys() or ValidateRequiredKeys() in this object (as it would also implement BaseValidation), and so on, all the way down.
UPDATE 2
I am stuck with .NET 3.5, and so cannot implement Code Contracts, etc. (as far as I'm aware).

Don't let your objects get into an invalid state in the first place, saves you a lot of 'IsValid' trouble.

Since you're employing Domain Driven Design I assume you have identified and modelled your classes as Aggregates.
To answer your question: Yes, the Aggregate Root is responsible of making sure that itself and everything that's contained is valid at any given point in time. The Aggregate Root should never be in an invalid state.
Update from the comments: Get rid of your validation interface. An Aggregate should never be in an invalid state in the first place. Whenever the state of your AR is about to change, make sure the resulting state wouldn't brake any invariants. If it would, the AR should reject the modification.
Enforce the invariants, don't validate after the fact.

Related

Can you have an interface be dependent on a class?

I'm studying SOLID principles and have a question about dependency management in relation to interfaces.
An example from the book I'm reading (Adaptive Code via C# by Gary McLean Hall) shows a TradeProcessor class that will get the trade data, process it, and store it in the database. The trade data is modeled by a class called TradeRecord. A TradeParser class will handle converting the trade data that is received into a TradeRecord instance(s). The TradeProcessor class only references an ITradeParser interface so that it is not dependent on the TradeParser implementation.
The author has the Parse method (in the ITradeParser interface) return an IEnumerable<TradeRecord> collection that holds the processed trade data. Doesn't that mean that ITradeParser is now dependent on the TradeRecord class?
Shouldn't the author have done something like make an ITradeRecord interface and have Parse return a collection of ITradeRecord instances? Or am I missing something important?
Here's the code (the implementation of TradeRecord is irrelevant so it is omitted):
TradeProcessor.cs
public class TradeProcessor
{
private readonly ITradeParser tradeParser;
public TradeProcessor(ITradeParser tradeParser)
{
this.tradeParser = tradeParser;
}
public void ProcessTrades()
{
IEnumerable<string> tradeData = "Simulated trade data..."
var trades = tradeParser.Parse(tradeData);
// Do something with the parsed data...
}
}
ITradeParser.cs
public interface ITradeParser
{
IEnumerable<TradeRecord> Parse(IEnumerable<string> tradeData);
}
This is a good question that goes into the tradeoff between purity and practicality.
Yes, by pure principal, you can say that ITradeParser.Parse should return a collection of ITraceRecord interfaces. After all, why tie yourself to a specific implementation?
However, you can take this further. Should you accept an IEnumerable<string>? Or should you have some sort of ITextContainer? I32bitNumeric instead of int? This is reductio ad absurdum, of course, but it shows that we always, at some point, reach a point where we're working on something, a concrete object (number, string, TraceRecord, whatever), not an abstraction.
This also brings up the point of why we use interfaces in the first place, which is to define contracts for logic and functionality. An ITradeProcessor is a contract for an unknown implementation that can be replaced or updated. A TradeRecord isn't a contract for implementation, it is the implementation. If it's a DTO object, which it seems to be, there would be no difference between the interface and the implementation, which means there's no real purpose in defining this contract - it's implied in the concrete class.
The author has the Parse method (in the ITradeParser interface) return an IEnumerable collection that holds the processed trade data.
Doesn't that mean that ITradeParser is now dependent on the TradeRecord class?
Yes, ITradeParser is now tightly coupled with TradeRecord. Given the more academic approach of this question, I can see where you are coming from. But what is TradeRecord? A record, by definition, is generally a simple, non-intelligent piece of data (sometimes called POCO, DTO, or Model).
At some point, the potential gain of abstraction is less valuable than the complexities it causes. This approach is pretty common in practice - Models (as I refer to them) are sealed types that flow through the layers of an application. Layers that act upon the models are abstracted to interfaces, so that each layer may be mocked and tested separately.
For example, a client application may have a View, ViewModel, and Repository layer. Each layer knows how to work with the concrete record type. But the ViewModel could be wired up to work with a mocked IRepository, which builds up the concrete types with hardcoded, mocked data. There's no benefit to an abstracted IModel at this point - it just has straight data.

Best way to implement an "Inheritance Square"

Here is my class diagram.
The problem is in AgendaInstance (see red dot). I'm trying to inherit (reuse) Agenda.Tasks to contain its own tasks, which are of type TaskInstance, a subtype of Task.
I can put this.Tasks.Add(new TaskInstance()); inside AgendaInstance. That code works, but the problem comes in when I try to serialize or bind. Since Tasks is statically bound to Task all that gets serialized (e.g., to xml) or bound (e.g., to a grid row) are the properties of Task, not TaskInstance.
Is there a design pattern I can use here to overcome this issue? I don't want to shadow (new) Tasks in AgendaInstance. That would defeat the purpose of having an inheritance hierarchy. My midi-chlorians tell me there's a solution that is higher than directly dealing with serialization or binding specifics; it's a "deeper" issue that lends itself to a more fundamental solution. I'm going to fiddle around with generics but perhaps you know of an even better way or a better pattern.
90% of my experience with xml serialization is bad. They tend to break inheritance model and does not support interfaces. Therefore, it resulting you to hack and tinker the existing class to suit the serialization. XmlIgnore and duplicated properties usually come to hand when dealing with it.
Therefore usually I create another class for the serialization purpose only. Ex: AgendaSerializable, with TaskSerializeable as Tasks. The benefit is: you keep your inheritance and data model clean, while you need to handle with data conversion as the cons.
may the force be with you.
You could make Agenda generic on the Task, like this:
class Agenda<T> where T : Task {
public IList<T> Tasks {get; private set;}
...
}
class AgendaInstance<TaskInstance> {
...
}
Now there is only one Task property in the hierarchy. However, Agenda requires a type parameter to be instantiated, so what used to be a "plain" Agenda becomes Agenda<Task>.

Proxy objects to simulate a soon-to-be-created database

I have a database that contains "widgets", let's say. Widgets have properties like Length and Width, for example. The original lower-level API for creating wdigets is a mess, so I'm writing a higher-level set of functions to make things easier for callers. The database is strange, and I don't have good control over the timing of the creation of a widget object. Specifically, it can't be created until the later stages of processing, after certain other things have happened first. But I'd like my callers to think that a widget object has been created at an earlier stage, so that they can get/set its properties from the outset.
So, I implemented a "ProxyWidget" object that my callers can play with. It has private fields like private_Length and private_Width that can store the desired values. Then, it also has public properties Length and Width, that my callers can access. If the caller tells me to set the value of the Width property, the logic is:
If the corresponding widget object already exists in the database, then set
its Width property
If not, store the given width value in the private_Width field for later use.
At some later stage, when I'm sure that the widget object has been created in the database, I copy all the values: copy from private_Width to the database Width field, and so on (one field/property at a time, unfortunately).
This works OK for one type of widget. But I have about 50 types, each with about 20 different fields/properties, and this leads to an unmaintainable mess. I'm wondering if there is a smarter approach. Perhaps I could use reflection to create the "proxy" objects and copy field/property data in a generic way, rather than writing reams of repetitive code? Factor out common code somehow? Can I learn anything from "data binding" patterns? I'm a mathematician, not a programmer, and I have an uneasy feeling that my current approach is just plain dumb. My code is in C#.
First, in my experience, manually coding a data access layer can feel like a lot of repetitive work (putting an ORM in place, such as NHibernate or Entity Framework, might somewhat alleviate this issue), and updating a legacy data access layer is awful work, especially when it consists of many parts.
Some things are unclear in your question, but I suppose it is still possible to give a high-level answer. These are meant to give you some ideas:
You can build ProxyWidget either as an alternative implementation for Widget (or whatever the widget class from the existing low-level API is called), or you can implement it "on top of", or as a "wrapper around", Widget. This is the Adapter design pattern.
public sealed class ExistingTerribleWidget { … }
public sealed class ShinyWidget // this is the wrapper that sits on top of the above
{
public ShinyWidget(ExistingTerribleWidget underlying) { … }
private ExistingTerribleWidget underlying;
… // perform all real work by delegating to `underlying` as appropriate
}
I would recommend that (at least while there is still code using the existing low-level API) you use this pattern instead of creating a completely separate Widget implementation, because if ever there is a database schema change, you will have to update two different APIs. If you build your new EasyWidget class as a wrapper on top of the existing API, it could remain unchanged and only the underlying implementation would have to be updated.
You describe ProxyWidget having two functions (1) Allow modifications to an already persisted widget; and (2) Buffer for a new widget, which will be added to the database later.
You could perhaps simplify your design if you have one common base type and two sub-classes: One for new widgets that haven't been persisted yet, and one for already persisted widgets. The latter subtype possibly has an additional database ID property so that the existing widget can be identified, loaded, modified, and updated in the database:
interface IWidget { /* define all the properties required for a widget */ }
interface IWidgetTemplate : IWidget
{
IPersistedWidget Create();
bool TryLoadFrom(IWidgetRepository repository, out IPersistedWidget matching);
}
interface IPersistedWidget : IWidget
{
Guid Id { get; }
void SaveChanges();
}
This is one example for the Builder design pattern.
If you need to write similar code for many classes (for example, your 50+ database object types) you could consider using T4 text templates. This just makes writing code less repetitive; but you will still have to define your 50+ objects somewhere.

Validation strategies

I have a business model with many classes in, some logical entities within this model consist of many different classes (Parent-child-grandchild.) On these various classes I define constraints which are invariant, for example that the root of the composite should have a value for Code.
I currently have each class implement an interface like so...
public interface IValidatable
{
IEnumerable<ValidationError> GetErrors(string path);
}
The parent would add a validation error if Code is not set, and then execute GetErrors on each child, which in turn would call GetErrors on each grandchild.
Now I need to validate different constraints for different operations, for example
Some constraints should always be checked because they are invariant
Some constraints should be checked when I want to perform operation X on the root.
Some additional constraints might be checked when performing operation Y.
I have considered adding a "Reason" parameter to the GetErrors method but for a reason I can't quite put my finger on this doesn't feel right. I have also considered creating a visitor and having a concrete implementation to validate for OperationX and another for OperationY but dislike this because some of the constraint checks would be required for multiple operations but not all of them (e.g. a Date is required for OperationX+OperationY but not OperationZ) and I wouldn't like to have to duplicate the code which checks.
Any suggestions would be appreciated.
You have an insulation problem here, as your classes are in charge of doing their own validation, yet the nature of that validation depends on the type of operation you're doing. This means the classes need to know about the kinds of operations that can be performed on them, which creates a fairly tight coupling between the classes and the operations that use them.
One possible design is to create a parallel set of classes like this:
public interface IValidate<T>
{
IEnumerable<ValidationError> GetErrors(T instance, string path);
}
public sealed class InvariantEntityValidator : IValidate<Entity>
{
public IEnumerable<ValidationError> GetErrors(Entity entity, string path)
{
//Do simple (invariant) validation...
}
}
public sealed class ComplexEntityValidator : IValidate<Entity>
{
public IEnumerable<ValidationError> GetErrors(Entity entity, string path)
{
var validator = new InvariantEntityValidator();
foreach (var error in validator.GetErrors(entity, path))
yield return error;
//Do additional validation for this complex case
}
}
You'll still need to resolve how you want to associate the validation classes with the various classes being validated. It sounds like this should occur at the level of the operations somehow, as this is where you know what type of validation needs to occur. It's difficult to say without a better understanding of your architecture.
I would do a kind of attribute-based validation:
public class Entity
{
[Required, MaxStringLength(50)]
public string Property1 { get; set; }
[Between(5, 20)]
public int Property2 { get; set; }
[ValidateAlways, Between(0, 5)]
public int SomeOtherProperty { get; set; }
[Requires("Property1, Property2")]
public void OperationX()
{
}
}
Each property which is passed to the Requires-attribute needs to be valid to perform the operation.
The properties which have the ValidateAlways-attribute, must be valid always - no matter what operation.
In my pseudo-code Property1, Property2 and SomeOtherProperty must be valid to execute OperationX.
Of course you have to add an option to the Requires-attribute to check validation attributes on a child, too. But I'm not able to suggest how to do that without seeing some example code.
Maybe something like that:
[Requires("Property1, Property2, Child2: Property3")]
If needed you can also reach strongly typed property pointers with lambda expressions instead of strings (Example).
I would suggest using the Fluent Validation For .Net library. This library allows you to setup validators pretty easily and flexibly, and if you need different validations for different operations you can use the one that applies for that specific operation (and change them out) very easily.
I've used Sptring.NET's validation engine for exactly the same reason - It allows you to use Conditional Validators. You just define rules - what validation to apply and under what conditions and Spring does the rest. The good thing is that your business logic is no longer polluted by interfaces for validation
You can find more information in documentation at springframework.net I will just copy the sample for the their doc to show how it looks like:
<v:condition test="StartingFrom.Date >= DateTime.Today" when="StartingFrom.Date != DateTime.MinValue">
<v:message id="error.departureDate.inThePast" providers="departureDateErrors, validationSummary"/>
</v:condition>
In this example the StartingFrom property of the Trip object is compared to see if it is later than the current date, i.e. DateTime but only when the date has been set (the initial value of StartingFrom.Date was set to DateTime.MinValue).
The condition validator could be considered "the mother of all validators". You can use it to achieve almost anything that can be achieved by using other validator types, but in some cases the test expression might be very complex, which is why you should use more specific validator type if possible. However, condition validator is still your best bet if you need to check whether particular value belongs to a particular range, or perform a similar test, as those conditions are fairly easy to write.
If you're using .Net 4.0, you can use Code Contracts to control some of this.
Try to read this article from top to bottom, I've gained quite a few ideas from this.
http://codebetter.com/jeremymiller/2007/06/13/build-your-own-cab-part-9-domain-centric-validation-with-the-notification-pattern/
It is attribute based domain validation with a Notification that wraps those validations up to higher layers.
I would separate the variant validation logic out, perhaps with a Visitor as you mentioned. By separating the validation from the classes, you will be keeping your operations separate from your data, and that can really help to keep things clean.
Think about it this way too -- if you mix-in your validation and operations with your data classes, think of what are things going to look like a year from now, when you are doing an enhancement and you need to add a new operation. If each operation's validation rules and operation logic are separate, it's largely just an "add" -- you create a new operation, and a new validation visitor to go with it. On the other hand, if you have to go back and touch alot of "if operation == x" logic in each of your data classes, then you have some added risk for regression bugs/etc.
I rely on proven validation strategy which has implemented in .net framework in 'System.ComponentModel.DataAnnotations Namespace' and for example used in ASP.NET MVC.
This one provides unobtrusive way to apply validation rules (using attributes or implementing IValidatableObject) and facilities to validate rules.
Scott Allen described this way in great article 'Manual Validation with Data Annotations'.
if you want to apply validation attributes on interface then look at MetadataTypeAttribute
to get better understanding how it works look at MS source code

When to use Properties and Methods?

I'm new to the .NET world having come from C++ and I'm trying to better understand properties. I noticed in the .NET framework Microsoft uses properties all over the place. Is there an advantage for using properties rather than creating get/set methods? Is there a general guideline (as well as naming convention) for when one should use properties?
It is pure syntactic sugar. On the back end, it is compiled into plain get and set methods.
Use it because of convention, and that it looks nicer.
Some guidelines are that when it has a high risk of throwing Exceptions or going wrong, don't use properties but explicit getters/setters. But generally even then they are used.
Properties are get/set methods; simply, it formalises them into a single concept (for read and write), allowing (for example) metadata against the property, rather than individual members. For example:
[XmlAttribute("foo")]
public string Name {get;set;}
This is a get/set pair of methods, but the additional metadata applies to both. It also, IMO, simply makes it easier to use:
someObj.Name = "Fred"; // clearly a "set"
DateTime dob = someObj.DateOfBirth; // clearly a "get"
We haven't duplicated the fact that we're doing a get/set.
Another nice thing is that it allows simple two-way data-binding against the property ("Name" above), without relying on any magic patterns (except those guaranteed by the compiler).
There is an entire book dedicated to answering these sorts of questions: Framework Design Guidelines from Addison-Wesley. See section 5.1.3 for advice on when to choose a property vs a method.
Much of the content of this book is available on MSDN as well, but I find it handy to have it on my desk.
Consider reading Choosing Between Properties and Methods. It has a lot of information on .NET design guidelines.
properties are get/set methods
Properties are set and get methods as people around here have explained, but the idea of having them is making those methods the only ones playing with the private values (for instance, to handle validations).
The whole other logic should be done against the properties, but it's always easier mentally to work with something you can handle as a value on the left and right side of operations (properties) and not having to even think it is a method.
I personally think that's the main idea behind properties.
I always think that properties are the nouns of a class, where as methods are the verbs...
First of all, the naming convention is: use PascalCase for the property name, just like with methods. Also, properties should not contain very complex operations. These should be done kept in methods.
In OOP, you would describe an object as having attributes and functionality. You do that when designing a class. Consider designing a car. Examples for functionality could be the ability to move somewhere or activate the wipers. Within your class, these would be methods. An attribute would be the number of passengers within the car at a given moment. Without properties, you would have two ways to implement the attribute:
Make a variable public:
// class Car
public int passengerCount = 4;
// calling code
int count = myCar.passengerCount;
This has several problems. First of all, it is not really an attribute of the vehicle. You have to update the value from inside the Car class to have it represent the vehicles true state. Second, the variable is public and could also be written to.
The second variant is one widley used, e. g. in Java, where you do not have properties like in c#:
Use a method to encapsulate the value and maybe perform a few operations first.
// class Car
public int GetPassengerCount()
{
// perform some operation
int result = CountAllPassengers();
// return the result
return result;
}
// calling code
int count = myCar.GetPassengerCount();
This way you manage to get around the problems with a public variable. By asking for the number of passengers, you can be sure to get the most recent result since you recount before answering. Also, you cannot change the value since the method does not allow it. The problem is, though, that you actually wanted the amount of passengers to be an attribute, not a function of your car.
The second approach is not necessarily wrong, it just does not read quite right. That's why some languages include ways of making attributes look like variables, even though they work like methods behind the scenes. Actionscript for example also includes syntax to define methods that will be accessed in a variable-style from within the calling code.
Keep in mind that this also brings responsibility. The calling user will expect it to behave like an attribute, not a function. so if just asking a car how many passengers it has takes 20 seconds to load, then you probably should pack that in a real method, since the caller will expect functions to take longer than accessing an attribute.
EDIT:
I almost forgot to mention this: The ability to actually perform certain checks before letting a variable be set. By just using a public variable, you could basically write anything into it. The setter method or property give you a chance to check it before actually saving it.
Properties simply save you some time from writing the boilerplate that goes along with get/set methods.
That being said, a lot of .NET stuff handles properties differently- for example, a Grid will automatically display properties but won't display a function that does the equivalent.
This is handy, because you can make get/set methods for things that you don't want displayed, and properties for those you do want displayed.
The compiler actually emits get_MyProperty and set_MyProperty methods for each property you define.
Although it is not a hard and fast rule and, as others have pointed out, Properties are implemented as Get/Set pairs 'behind the scenes' - typically Properties surface encapsulated/protected state data whereas Methods (aka Procedures or Functions) do work and yield the result of that work.
As such Methods will take often arguments that they might merely consume but also may return in an altered state or may produce a new object or value as a result of the work done.
Generally speaking - if you need a way of controlling access to data or state then Properties allow the implementation that access in a defined, validatable and optimised way (allowing access restriction, range & error-checking, creation of backing-store on demand and a way of avoiding redundant setting calls).
In contrast, methods transform state and give rise to new values internally and externally without necessarily repeatable results.
Certainly if you find yourself writing procedural or transformative code in a property, you are probably really writing a method.
Also note that properties are available via reflection. While methods are, too, properties represent "something interesting" about the object. If you are trying to display a grid of properties of an object-- say, something like the Visual Studio form designer-- then you can use reflection to query the properties of a class, iterate through each property, and interrogate the object for its value.
Think of it this way, Properties encapsulate your fields (commoningly marked private) while at the same time provides your fellow developers to either set or get the field value. You can even perform routine validation in the property's set method should you desire.
Properties are not just syntactic sugar - they are important if you need to create object-relational mappings (Linq2Sql or Linq2Entities), because they behave just like variables while it is possible to hide the implementation details of the object-relational mapping (persistance). It is also possible to validate a value being assigned to it in the getter of the property and protect it against assigning unwanted values.
You can't do this with the same elegance with methods. I think it is best to demonstrate this with a practical example.
In one of his articles, Scott Gu creates classes which are mapped to the Northwind database using the "code first" approach. One short example taken from Scott's blog (with a little modification, the full article can be read at Scott Gu's blog here):
public class Product
{
[Key]
public int ProductID { get; set; }
public string ProductName { get; set; }
public Decimal? UnitPrice { get; set; }
public bool Discontinued { get; set; }
public virtual Category category { get; set; }
}
// class Category omitted in this example
public class Northwind : DbContext
{
public DbSet<Product> Products { get; set; }
public DbSet<Category> Categories { get; set; }
}
You can use entity sets Products, Categories and the related classes Product and Category just as if they were normal objects containing variables: You can read and write them and they behave just like normal variables. But you can also use them in Linq queries, persist them (store them in the database and retrieve them).
Note also how easy it is to use annotations (C# attributes) to define the primary key (in this example ProductID is the primary key for Product).
While the properties are used to define a representation of the data stored in the database, there are some methods defined in the entity set class which control the persistence: For example, the method Remove() marks a given entity as deleted, while Add() adds a given entity, SaveChanges() makes the changes permanent. You can consider the methods as actions (i.e. you control what you want to do with the data).
Finally I give you an example how naturally you can use those classes:
// instantiate the database as object
var nw = new NorthWind();
// select product
var product = nw.Products.Single(p => p.ProductName == "Chai");
// 1. modify the price
product.UnitPrice = 2.33M;
// 2. store a new category
var c = new Category();
c.Category = "Example category";
c.Description = "Show how to persist data";
nw.Categories.Add(c);
// Save changes (1. and 2.) to the Northwind database
nw.SaveChanges();

Categories

Resources