Validating Primitive Arguments and "Complex Data"
Validating Arguments
When writing a method, validate its arguments before performing any other operations. For example, let's say we've got a class representing people:
public class Person
{
public readonly string Name;
public readonly int Age;
public Person(string name, int age)
{
this.Name = name;
this.Age = age;
}
}
What's wrong with this Person class? name and age aren't validated before their values are set as fields of Person. What do I mean by "validated"? Both arguments should be checked to make sure their values are acceptable. For example, what if name's value is an empty string? Or age's value is -10?
Validating the arguments is performed by throwing ArgumentExceptions or derived exceptions when the values are unacceptable. For example:
public Person(string name, int age)
{
if (String.IsNullOrEmpty(name))
{
throw new ArgumentNullException
("name", "Cannot be null or empty.");
}
if (age <= 0 || age > 120)
{
throw new ArgumentOutOfRangeException
("age", "Must be greater than 0 and no more than 120.");
}
this.Name = name;
this.Age = age;
}
This properly validates the arguments Person's constructor receives.
Tedium ad Nauseam
Because you've been validating arguments for a long time (right?), you're probably tired of writing these if (....) throw Argument... statements in all of your methods.
What can we do to avoid writing String.IsNullOrEmpty a bazillion times throughout your code?
You can look into Code Contracts in .NET 4.0.
You may also want to look at the FluentValidation Library on CodePlex if you don't want to wait for code contracts.
Ultimately, you still need to put the rules that govern argument values somewhere. It's just a matter of deciding whether you prefer an imperative style (e.g. string.IsNullOrEmpty) or a declarative one.
Validating your inputs is a key practice for writing solid code, but it certainly can be repetitive and verbose.
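One imperative way to cut the repetition is to move the checks into a small static helper. The following is only a sketch: the Guard class and its method names are made up for this example and are not part of .NET.

```csharp
using System;

// Hypothetical helper; the name Guard and its methods are assumptions for illustration.
public static class Guard
{
    public static void NotNullOrEmpty(string value, string paramName)
    {
        if (value == null)
        {
            throw new ArgumentNullException(paramName);
        }
        if (value.Length == 0)
        {
            throw new ArgumentException("Cannot be empty.", paramName);
        }
    }

    public static void InRange(int value, int min, int max, string paramName)
    {
        if (value < min || value > max)
        {
            throw new ArgumentOutOfRangeException(
                paramName, value, "Must be between " + min + " and " + max + ".");
        }
    }
}

public class Person
{
    public readonly string Name;
    public readonly int Age;

    public Person(string name, int age)
    {
        // Each rule is now a single line instead of an if/throw block.
        Guard.NotNullOrEmpty(name, "name");
        Guard.InRange(age, 1, 120, "age");
        Name = name;
        Age = age;
    }
}
```

The rules still live in code, but each one is written once and reused by every method that needs it.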
You may be helped by using more complex types rather than primitives.
For example, if you take the time to define something like a PersonName class, you can have the validation there, and you don't have to keep validating it on every other object that needs to have a name on it.
Obviously this is only going to help the problem if you have multiple objects that use the same field types.
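As a rough sketch of that idea, assuming a PersonName type that simply wraps a non-empty string (the members shown are illustrative, not a prescribed design):

```csharp
using System;

// Hypothetical wrapper type; the validation rule is an assumption for illustration.
public sealed class PersonName
{
    private readonly string _value;

    public PersonName(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentException("Cannot be null or empty.", "value");
        }
        _value = value;
    }

    public string Value
    {
        get { return _value; }
    }

    public override string ToString()
    {
        return _value;
    }
}

public class Person
{
    public readonly PersonName Name;

    public Person(PersonName name)
    {
        // Only a null check is left here; the content rules live in PersonName.
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
        Name = name;
    }
}
```

Any other class that takes a PersonName gets the same guarantee without repeating the string checks.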
You can try using the Castle Validation Framework => http://www.castleproject.org/activerecord/documentation/v1rc1/usersguide/validation.html
OR
You can use a simple validation framework that I created. Both frameworks use attribute-based validation. Check out the link below:
http://www.highoncoding.com/Articles/424_Creating_a_Domain_Object_Validation_Framework.aspx
There are options based on PostSharp. code-o-matic is one of them. It lets you write code like this:
public Person(
[NotNull, NotEmpty] string name,
[NotNull, NotEmpty] int age
)
{
this.Name = name;
this.Age = age;
}
I use this every day at work.
I'll give a solution in the D programming language. I don't know how powerful C# generics and variadics are because I don't use C#, but maybe you could adapt this:
void validate(T...)(T args) { // args is variadic.
foreach(arg; args) { // Iterate over variadic argument list.
static if(isSomeString!(typeof(arg))) { // Introspect to see arg's type.
if(arg.isNullOrEmpty) {
throw new ArgException(
"Problem exists between keyboard and chair.");
}
} else static if(isOtherTypeWithBoilerPlateValidation!(typeof(arg))) {
// Do more boilerplate validation.
}
}
}
Usage:
class Foo {
SomeType myMethod(T arg1, U arg2, V arg3) {
validate(arg1, arg2, arg3);
// Do non-boilerplate validation.
// Method body.
}
}
Related
I am currently doing a project for school, and to get full marks the specification says:
More able candidates should use validation within the field Set method/Property of a class and throw back error messages where relevant. It is expected that a validation process calls methods from a static class.
Could someone please explain what the exam board mean by this?
Also,
More able candidates should be encouraged to make good use of try catch, get/set, and the use of specific or custom exceptions.
I've been doing validation like this:
if (isValidString(txtUsername.Text, "^[a-zA-Z0-9]*$") && (txtPassword.Text.Length > 5))
Does that mean I need to change something?
EDIT:
So, if I put my validation in the set method, will that tick this off?
It is expected that a validation process calls methods from a static class.
or is that something else?
More able candidates should use validation within the field Set method/Property of a class and throw back error messages where relevant. It is expected that a validation process calls methods from a static class.
I won't give you a full code sample, so that you can figure it out by yourself. However, a property (like public string Name { get; set; }) can have logic if you use a backing field.
For example:
public class Student
{
private string _name;
public string Name
{
get
{
return _name;
}
set
{
_name = value;
}
}
}
What they are asking you to do is validate the value inside the property's set method, instead of validating the input elsewhere and then assigning it to the property.
More able candidates should be encouraged to make good use of try catch, get/set, and the use of specific or custom exceptions.
This overlaps somewhat with what I showed above. However, they are also asking you to throw specific/custom exceptions rather than general ones.
A basic example:
public void AssignName(string name)
{
if (name == null)
{
// WRONG:
// Exception is the base class; it doesn't tell the caller anything meaningful.
// throw new Exception("name is null!!!");
// Correct:
// ArgumentNullException tells you that a null value was passed where it isn't valid.
throw new ArgumentNullException("name");
}
}
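The example above uses a specific built-in exception. A custom exception is just a class derived from an existing exception type. Here is a minimal sketch, assuming a made-up rule that a name must start with a letter; InvalidNameException, NameRules and NameDemo are hypothetical names.

```csharp
using System;

// Hypothetical custom exception for illustration.
public class InvalidNameException : ArgumentException
{
    public InvalidNameException(string message, string paramName)
        : base(message, paramName)
    {
    }
}

// Static class holding the validation rule (the rule itself is an assumption).
public static class NameRules
{
    public static void Check(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
        if (name.Length == 0 || !char.IsLetter(name[0]))
        {
            throw new InvalidNameException("Name must start with a letter.", "name");
        }
    }
}

public static class NameDemo
{
    public static string Describe(string name)
    {
        try
        {
            NameRules.Check(name);
            return "ok";
        }
        catch (InvalidNameException ex)
        {
            // Catch the specific type so unrelated failures are not swallowed.
            return "invalid: " + ex.ParamName;
        }
    }
}
```

Because the catch block names InvalidNameException, a null argument still surfaces as an ArgumentNullException rather than being hidden.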
How bad a practice is this? My professor is currently asking me to do this, which goes against everything I have been told. Can anybody give me examples of why you should not validate this way? (Using regular expressions in the get/set methods in an ASP web page.)
More Information:
Here is the code of what he wants us to do:
In the Property:
public String FName
{
get
{
return _fName;
}
set
{
if (validateName(value.ToString()))
_fName = value;
}
}
The method I'm calling:
public static bool validateName(String name)
{
bool isGood = true;
Regex regex = new Regex("^[A-Z]");
if (!regex.IsMatch(name))
isGood = false;
return isGood;
}
In general it's not good, because validation implies that failure is possible.
So the questions are:
How do you intend to handle faults during constructor execution?
What if you get an exception in the constructor? What state is the object left in after that?
That's why it's a bad practice in general. The good path to follow is to:
Construct object
Run validation
But these are guidelines, and you're free to break them when it suits you. So, in defence of your professor, I should say that he probably asked this:
Either to get you thinking
Or to teach you something
So follow his path and try to understand why he asked you to write the code that way.
It depends on what you mean by validation. Guard clauses are quite common practice in constructors, e.g.
if(param1 == null)
throw new ArgumentNullException("param1");
It helps make sure that your object is in a consistent state for use later on (preventing you having to check at the time of use).
You can also use guard clauses on properties (what your case seems to be) and methods too, to ensure your object is always in a consistent state.
In reply to your update, I'd find that really annoying, for example:
var a = new yourObject();
a.FirstName = "123";
What my code doesn't know is that I've failed validation so I haven't changed the first name property at all!
Edit:
You can also simplify your validation method:
public static bool validateName(String name)
{
Regex regex = new Regex("^[A-Z]");
return regex.IsMatch(name);
}
I agree with your instructor.
In general, you should validate a value in any place it is possible to set it prior to "accepting" it. The general rule is that whatever method that attempts to set the value should receive immediate feedback when it attempts to set it.
For your example, I would place the validator inside of the setter of your FName public property, and if your constructor also accepts a FName value, then simply call the FName setter within your constructor to fully encapsulate the behavior of the property, be it validation behavior or any other business rules that the property implements:
public class User
{
public User(string firstName, string lastName)
{
FirstName = firstName;
LastName = lastName;
}
private string _firstName;
public string FirstName
{
get { return _firstName; }
set
{
if (!IsValid(value))
{
// throw / handle appropriately
}
else
{
_firstName = value;
}
}
}
Also: stay away from abbreviations! Do not use FName; use FirstName.
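To make the "throw / handle appropriately" branch concrete, here is one possible sketch. It assumes the rule is "non-empty and starts with a capital letter" and chooses to throw ArgumentException; both the rule and the exception choice are assumptions for illustration.

```csharp
using System;

public class User
{
    private string _firstName;

    public User(string firstName)
    {
        // Going through the property gives the constructor the same validation.
        FirstName = firstName;
    }

    public string FirstName
    {
        get { return _firstName; }
        set
        {
            if (!IsValid(value))
            {
                throw new ArgumentException(
                    "First name must be non-empty and start with a capital letter.",
                    "value");
            }
            _firstName = value;
        }
    }

    // Assumed rule, for illustration only.
    private static bool IsValid(string name)
    {
        return !string.IsNullOrEmpty(name) && char.IsUpper(name[0]);
    }
}
```

With this shape the caller gets immediate feedback: a rejected assignment throws, and the previously stored value is left untouched.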
The purpose of a constructor is to assign values to the members of a type. By convention, validation is not the responsibility of the constructor.
Validation of any information depends on the business rules of the application you are building. If you are creating a modular application where every component is meant for a specific purpose, it is better to create a separate class or set of classes (depending on the size of the application) to perform all business validations. Those validations are then invoked according to the validation rules imposed on a given piece of data.
I have a method that has 2 ref parameters:
public void ReplaceSomething(ref int code, ref string name)
{
...
}
I want to avoid this, as it is not a good design (and scales poorly). What are my options?
I've thought about using an anonymous object, but that doesn't seem like a good idea, either.
Object something = new { code = 1, name = "test" };
ReplaceSomething(something);
Are the code and the name closely linked together? If so, consider creating a type to put the two of them together. Then you can return a value of that type.
Alternatively, you might consider returning a Tuple<int, string>.
(In both cases you can accept an input parameter of the same type, of course. As you haven't shown any of your code, it's not really clear whether you use the existing values of the parameters, or whether they could basically be out parameters.)
Why don't you want to use ref arguments? That seems like a perfectly good way to change some caller values.
The other approach would be to implement a return value. Maybe you need to better explain what the problem is?
If these values are tightly coupled and "belong together" you could define a custom class that holds your properties and either return a new instance of it (assuming it's immutable) or update its properties:
class Code
{
public int Value {get;set;}
public string Name {get;set;}
}
public Code UpdateCode(Code code)
{
...
}
If you need to return these values, you can either use a tuple
public Tuple<int, string> ReplaceSomething(int code, string name)
{
...
}
Or create your own class-wrapper that holds the values as properties
public Foo ReplaceSomething(int code, string name)
{
var foo = new Foo(){...};
return foo;
}
class Foo
{
public int IntValue{get;set;}
public string StringValue{get;set;}
}
Why would you change it? ref parameters make sense at times, and if this is one of those times - use them. You could introduce a new class that contains that pair of values, which only makes sense if those values come together often.
I say, keep it.
Based on your question, I could be way off. What do you mean by replacing ref? Are you looking to overload?
public void ReplaceSomething(int code, string name)
{
// ...
}
public void ReplaceSomething()
{
ReplaceSomething(1, "test");
}
Edit:
OK, so you need to return the code and the name. What calculations need to be made? Jon Skeet's answer about a tuple could be right, or you might need a POCO that contains the code, the name, and the replaced value:
public Replaced ReplaceSomething(int code, string name)
{
var replaced = new Replaced();
replaced.code = code;
replaced.name = name;
string r = null; // placeholder for the result of the replacement
// do some replacement calculations
replaced.replaced = r;
return replaced;
}
public class Replaced {
public string name {get; set;}
public int code {get; set;}
public string replaced {get; set;}
}
I have several applications within my domain that accept similar inputs in text fields. Each application implements its own validation. I want to bring that functionality into a class library so that rather than re-inventing the wheel on each project, our developers can quickly implement the validation library, and move on.
I'm not the best when it comes to OO design. What I need is the ability for a user to enter an arbitrary string, and then for the validation library to check it against the known types to make sure that it matches one of them. Should I build an interface and make each type of string a class that implements that interface? (seems wrong since I won't know the type when I read in the string). I could use some help identifying a pattern for this.
Thanks.
I've always been a fan of Fluent Validation for .NET. If it's more robust than you need, its functionality is easy enough to mimic on your own.
If you're interested, here's a link to my very simple validation class. It's similar in usage to Fluent Validation, but uses lambdas to create the validation assertions. Here's a quick example of how to use it:
public class Person
{
public Person(int age){ Age = age; }
public int Age{ get; set;}
}
public class PersonValidator : AbstractValidator
{
public PersonValidator()
{
RuleFor(p => p.Age >= 0,
() => new ArgumentOutOfRangeException(
"Age must be greater than or equal to zero."
));
}
}
public class Example
{
void exampleUsage()
{
var john = new Person(28);
var jane = new Person(-29);
var personValidator = new PersonValidator();
var johnsResult = personValidator.Validate(john);
var janesResult = personValidator.Validate(jane);
displayResult(johnsResult);
displayResult(janesResult);
}
void displayResult(ValidationResult result)
{
if(result.IsValid)
Console.WriteLine("Is valid");
else
Console.WriteLine(result.Exception.GetType());
}
}
(see source code for a more thorough example).
Output:
Is valid
System.ArgumentOutOfRangeException
Each application implements its own validation. I want to bring that functionality into a class library so that rather than re-inventing the wheel on each project, our developers can quickly implement the validation library, and move on.
Your problem seems similar to custom NUnit constraints.
NUnit supports what it calls a constraint-based assertion model, and allows the user to create custom constraints that say whether or not a given object satisfies the criteria of that constraint.
Using an object-based constraint model is superior to a purely function-based constraint model:
It lets you aggregate sub-constraints to evaluate a higher level constraint.
It lets you provide diagnostic information as to why a specific constraint doesn't match your input data.
This sounds fancy, but constraints are just functions that take a parameter of your desired type and return true if it matches and false if it doesn't.
Adapting it to your problem
What I need is the ability for a user to enter an arbitrary string, and then for the validation library to check it against the known types to make sure that it matches one of them.
You don't actually have to build assertions out of your constraints. You could evaluate constraints without throwing exceptions, and do your classifications first.
But I don't recommend you do any automatic classification. I recommend you attach a specific constraint to a specific input, rather than trying to match all available constraints. Pass in the string to that constraint, and call it done.
If you need to do this for higher level objects, build a constraint for the higher level object that uses specific (existing) constraints for each of its sub fields, as well as doing cross-field constraint validation.
When you're done, you can aggregate all constraint violations to the top level, and have your validation logic throw an exception containing all the violations.
BTW, I wouldn't use the exact same interface NUnit does:
It is a confusing design
I'd prefer an approach that used generics all the way through
I'd prefer an approach that allowed you to return an IEnumerable<ConstraintViolation> or IEnumerable<string>, rather than taking some sort of output writer class as a dependency
But I'd definitely steal the base concept :)
Implementation
Here's an example implementation of what I'm talking about:
public class ConstraintViolation
{
public ConstraintViolation(IConstraintBase source, string description)
{
Source = source;
Description = description;
}
public IConstraintBase Source { get; }
public string Description { get; set; }
}
public interface IConstraintBase
{
string Name { get; }
string Description { get; }
}
public interface IConstraint<T> : IConstraintBase
{
IEnumerable<ConstraintViolation> GetViolations(T value);
}
And here's an example constraint to validate the length of a string (a weak example, but see my comments about this below):
public class StringLengthConstraint : IConstraint<string>
{
public StringLengthConstraint(int maximumLength)
: this(minimumLength: 0, maximumLength: maximumLength)
{
}
public StringLengthConstraint(int minimumLength, int maximumLength,
bool isNullAllowed = false)
{
MinimumLength = minimumLength;
MaximumLength = maximumLength;
IsNullAllowed = isNullAllowed;
}
public int MinimumLength { get; private set; }
public int MaximumLength { get; private set; }
public bool IsNullAllowed { get; private set; }
public IEnumerable<ConstraintViolation> GetViolations(string value)
{
if (value == null)
{
if (!IsNullAllowed)
{
yield return CreateViolation("Value cannot be null");
}
}
else
{
int length = value.Length;
if (length < MinimumLength)
{
yield return CreateViolation(
"Value is shorter than minimum length {0}",
MinimumLength);
}
if (length > MaximumLength)
{
yield return CreateViolation("Value is longer than maximum length {0}",
MaximumLength);
}
}
}
public string Name
{
get { return "String Length"; }
}
public string Description
{
get
{
return string.Format("Ensure a string is an acceptable length"
+ " - Minimum: {0}"
+ ", Maximum: {1}"
+ "{2}"
, MinimumLength
, MaximumLength
, IsNullAllowed ? "" : ", and is not null"
);
}
}
private ConstraintViolation CreateViolation(string description,
params object[] args)
{
return new ConstraintViolation(this, string.Format(description, args));
}
}
Here's how to use it when doing validation of a single field:
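var violations = new StringLengthConstraint(10).GetViolations(value).ToList();
if (violations.Any())
{
// ArgumentException takes the message first, then the parameter name.
throw new ArgumentException(
string.Join(", ", violations.Select(v => v.Description)), "value");
}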
Justification
The string length constraint is a lot of code to do something stupidly simple, especially if you're doing this just once. But there are advantages to this approach:
It is reusable
Write this or use it once, and I'd agree this is a pain.
But most of the code here is there to make this reusable. For example, you can select this out of a list of constraints for a string type. Or you can display a list of constraints or constraint violations on a UI, with tooltips, etc. Or you can use it in a unit testing framework; with an adapter class it could plug directly into NUnit.
This model supports aggregating constraints and violations
Through LINQ
Through object composition
LINQ:
var violations = new SomeConstraint(someData).GetViolations(value)
.Concat(new SomeOtherConstraint(someData).GetViolations(value))
;
Object composition:
// ...
public IEnumerable<ConstraintViolation> GetViolations(SomeType value)
{
if(value == 42)
{
yield return new ConstraintViolation(this, "Value cannot be 42");
}
foreach(var subViolation in subConstraint.GetViolations(value))
{
yield return subViolation;
}
}
private SomeSubConstraint subConstraint;
You need to do the following:
Parse your string and figure out (somehow) what type of string it is. I'd prefer to know the type before validation (by assigning types to fields), because if a string is malformed you may end up assigning it the wrong type.
Validate your string based on the validation rules applicable to the given field type. These validators should implement a common interface, so you can validate any type of string. Usually you have not only field-type-specific validation but also field-specific validation, so those validators should implement the same interface too.
Everything else depends on your app-specific logic.
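The two steps above could be sketched roughly as follows. Everything here is hypothetical: the IStringValidator interface, the FieldType values and the deliberately simplistic patterns are assumptions made for the example, not a recommended rule set.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Common interface implemented by every validator (hypothetical name).
public interface IStringValidator
{
    bool IsValid(string input);
}

// The field type is assigned to each field up front rather than guessed from the input.
public enum FieldType
{
    PostalCode,
    Email
}

public class RegexValidator : IStringValidator
{
    private readonly Regex _regex;

    public RegexValidator(string pattern)
    {
        _regex = new Regex(pattern);
    }

    public bool IsValid(string input)
    {
        return input != null && _regex.IsMatch(input);
    }
}

public static class ValidatorRegistry
{
    private static readonly Dictionary<FieldType, IStringValidator> Validators =
        new Dictionary<FieldType, IStringValidator>
        {
            // Simplistic patterns, for illustration only.
            { FieldType.PostalCode, new RegexValidator(@"^\d{5}$") },
            { FieldType.Email, new RegexValidator(@"^[^@\s]+@[^@\s]+$") }
        };

    public static bool Validate(FieldType type, string input)
    {
        return Validators[type].IsValid(input);
    }
}
```

A field-specific validator would implement the same IStringValidator interface and could be run in addition to the one registered for the field's type.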
I have two structs, both having e.g. "Id":
public struct User
{
public int Id;
public string Email;
}
public struct Computer
{
public int Id;
public string Name;
}
I'd like to make a template method to rewrite Id from one IList of Computers, Users and such to another.
I've tried below, but VS complains T does not contain a definition for Id:
private static void RewriteIListIds<T>(ref IList<T> pre, IList<T> post)
{
if (post != null && post.Count > 0)
{
Assert.IsTrue(pre != null && pre.Count > 0);
for (int i = 0; i < post.Count; i++)
{
T preElement = pre[i];
T postElement = post[i];
preElement.Id = postElement.Id;
pre[i] = preElement;
}
}
}
EDIT:
Interesting ideas, but I probably should have mentioned that I'm testing a service which I really don't want to change, and most probably can't.
EDIT2:
Just for future reference and to be more clear: I've probably made this problem more generic than it should be. The User and Computer structs are what a Web Service (currently configured as SOAP) returns in an IList. [DataContract] and [DataMember] were removed from the example above to make the problem a bit more generic.
No. C# generics aren't C++ templates, basically. I would suggest that:
You stop exposing fields publicly
You stop using mutable structs
You stop using ref when you don't need to (see my article on parameter passing for more details)
You extract an interface with a read/write Id property
You implement that interface on two classes for User and Computer
You add a constraint of where T : IFoo to your generic method where IFoo is your new interface (with a better name, of course)
You can then remove the pre[i] = preElement; line of your method too...
Meta: Don't refer to a field as an attribute; the word attribute has a very specific meaning in .NET which is not the same as a field or property.
(Apologies for the slightly curt response - I don't have time to explain each point in detail right now.)
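Put together, those suggestions might look something like the sketch below. IHasId is a hypothetical name standing in for IFoo, and the extra class constraint is my own addition so that the elements can be modified in place without writing them back to the list.

```csharp
using System.Collections.Generic;

// Hypothetical interface name; it stands in for IFoo from the list above.
public interface IHasId
{
    int Id { get; set; }
}

public class User : IHasId
{
    public int Id { get; set; }
    public string Email { get; set; }
}

public class Computer : IHasId
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class IdRewriter
{
    // No ref needed: the list reference itself is never reassigned.
    public static void RewriteIListIds<T>(IList<T> pre, IList<T> post)
        where T : class, IHasId
    {
        if (post == null || post.Count == 0)
        {
            return;
        }
        for (int i = 0; i < post.Count; i++)
        {
            // T is a reference type, so this updates the element held by the list.
            pre[i].Id = post[i].Id;
        }
    }
}
```

This only applies when the types can be changed; it does not help if User and Computer are fixed by a service contract.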
I agree with Jon: you probably don't have to do this, and it can be done in some other way.
But if you really have to you can tell the method how the type gets or sets its id.
public delegate int IdGetter<in T>(T holder);
public delegate T IdSetter<T>(T holder, int newId);
private static void RewriteIListIds<T>(IList<T> pre, IList<T> post,
IdGetter<T> getId, IdSetter<T> setId)
{
if (post != null && post.Count > 0)
{
for (int i = 0; i < post.Count; i++)
{
T preElement = pre[i];
T postElement = post[i];
int id = getId(postElement);
preElement = setId(preElement, id);
pre[i] = preElement;
}
}
}
To use it
RewriteIListIds<User>(aList, bList, u => u.Id, (u,id) => {u.Id = id; return u;});
I assume that you have a C++ background? The C# feature you are using is called "generics".
Generics are not templates.
[...] You can think of templates as a fancy-pants search-and-replace
mechanism. When you say DoIt<string> in a template, the compiler
conceptually searches out all uses of “T”, replaces them with
“string”, and then compiles the resulting source code. Overload
resolution proceeds with the substituted type arguments known, and the
generated code then reflects the results of that overload resolution.
[...] That’s not how generic types work; generic types are, well, generic.
We do the overload resolution once and bake in the result. We do not
change it at runtime when someone, possibly in an entirely different
assembly, uses string as a type argument to the method
On another note, it is strongly recommended not to create mutable structs in C#.
You need to constrain your generic so that the compiler knows something about type T.
One way to do this is to create a parent object that contains the Id value, let's call it Thing, then have your two objects inherit from that. Then you can declare your generic so that T is constrained to children of Thing:
public class Thing
{
public int Id;
}
public class User: Thing
{
public string Email;
}
public class Computer : Thing
{
public string Name;
}
private static void RewriteIListIds<T>(ref IList<T> pre, IList<T> post) where T: Thing
Now the compiler knows that T must contain all the members of Thing, so it can assume there will be an Id field. If you don't want to use inheritance, then you can do the same thing with an interface and have each of your objects implement that interface.