Related
I'm writing a library that has several public classes and methods, as well as several private or internal classes and methods that the library itself uses.
In the public methods I have a null check and a throw like this:
public int DoSomething(int number)
{
if (number == null)
{
throw new ArgumentNullException(nameof(number));
}
}
But then this got me thinking, to what level should I be adding parameter null checks to methods? Do I also start adding them to private methods? Should I only do it for public methods?
Ultimately, there isn't a uniform consensus on this. So instead of giving a yes or no answer, I'll try to list the considerations for making this decision:
Null checks bloat your code. If your procedures are concise, the null guards at the beginning of them may form a significant part of the overall size of the procedure, without expressing the purpose or behaviour of that procedure.
Null checks expressively state a precondition. If a method is going to fail when one of the values is null, having a null check at the top is a good way to demonstrate this to a casual reader without them having to hunt for where it's dereferenced. To improve this, people often use helper methods with names like Guard.AgainstNull, instead of having to write the check each time.
Checks in private methods are untestable. By introducing a branch in your code which you have no way of fully traversing, you make it impossible to fully test that method. This conflicts with the point of view that tests document the behaviour of a class, and that that class's code exists to provide that behaviour.
The severity of letting a null through depends on the situation. Often, if a null does get into the method, it'll be dereferenced a few lines later and you'll get a NullReferenceException. This really isn't much less clear than throwing an ArgumentNullException. On the other hand, if that reference is passed around quite a bit before being dereferenced, or if throwing an NRE will leave things in a messy state, then throwing early is much more important.
Some libraries, like .NET's Code Contracts, allow a degree of static analysis, which can add an extra benefit to your checks.
If you're working on a project with others, there may be existing team or project standards covering this.
If you're not a library developer, don't be defensive in your code
Write unit tests instead
In fact, even if you're developing a library, throwing is most of the time: BAD
1. Testing null on int must never be done in c# :
It raises a warning CS4072, because it's always false.
2. Throwing an Exception means it's exceptional: abnormal and rare.
It should never raise in production code. Especially because exception stack trace traversal can be a cpu intensive task. And you'll never be sure where the exception will be caught, if it's caught and logged or just simply silently ignored (after killing one of your background thread) because you don't control the user code. There is no "checked exception" in c# (like in java) which means you never know - if it's not well documented - what exceptions a given method could raise. By the way, that kind of documentation must be kept in sync with the code which is not always easy to do (increase maintenance costs).
3. Exceptions increases maintenance costs.
As exceptions are thrown at runtime and under certain conditions, they could be detected really late in the development process. As you may already know, the later an error is detected in the development process, the more expensive the fix will be. I've even seen exception raising code made its way to production code and not raise for a week, only for raising every day hereafter (killing the production. oops!).
4. Throwing on invalid input means you don't control input.
It's the case for public methods of libraries. However if you can check it at compile time with another type (for example a non nullable type like int) then it's the way to go. And of course, as they are public, it's their responsibility to check for input.
Imagine the user who uses what he thinks as valid data and then by a side effect, a method deep in the stack trace trows a ArgumentNullException.
What will be his reaction?
How can he cope with that?
Will it be easy for you to provide an explanation message ?
5. Private and internal methods should never ever throw exceptions related to their input.
You may throw exceptions in your code because an external component (maybe Database, a file or else) is misbehaving and you can't guarantee that your library will continue to run correctly in its current state.
Making a method public doesn't mean that it should (only that it can) be called from outside of your library (Look at Public versus Published from Martin Fowler). Use IOC, interfaces, factories and publish only what's needed by the user, while making the whole library classes available for unit testing. (Or you can use the InternalsVisibleTo mechanism).
6. Throwing exceptions without any explanation message is making fun of the user
No need to remind what feelings one can have when a tool is broken, without having any clue on how to fix it. Yes, I know. You comes to SO and ask a question...
7. Invalid input means it breaks your code
If your code can produce a valid output with the value then it's not invalid and your code should manage it. Add a unit test to test this value.
8. Think in user terms:
Do you like when a library you use throws exceptions for smashing your face ? Like: "Hey, it's invalid, you should have known that!"
Even if from your point of view - with your knowledge of the library internals, the input is invalid, how you can explain it to the user (be kind and polite):
Clear documentation (in Xml doc and an architecture summary may help).
Publish the xml doc with the library.
Clear error explanation in the exception if any.
Give the choice :
Look at Dictionary class, what do you prefer? what call do you think is the fastest ? What call can raises exception ?
Dictionary<string, string> dictionary = new Dictionary<string, string>();
string res;
dictionary.TryGetValue("key", out res);
or
var other = dictionary["key"];
9. Why not using Code Contracts ?
It's an elegant way to avoid the ugly if then throw and isolate the contract from the implementation, permitting to reuse the contract for different implementations at the same time. You can even publish the contract to your library user to further explain him how to use the library.
As a conclusion, even if you can easily use throw, even if you can experience exceptions raising when you use .Net Framework, that doesn't mean it could be used without caution.
Here are my opinions:
General Cases
Generally speaking, it is better to check for any invalid inputs before you process them in a method for robustness reason - be it private, protected, internal, protected internal, or public methods. Although there are some performance costs paid for this approach, in most cases, this is worth doing rather than paying more time to debug and to patch the codes later.
Strictly Speaking, however...
Strictly speaking, however, it is not always needed to do so. Some methods, usually private ones, can be left without any input checking provided that you have full guarantee that there isn't single call for the method with invalid inputs. This may give you some performance benefit, especially if the method is called frequently to do some basic computation/action. For such cases, doing checking for input validity may impair the performance significantly.
Public Methods
Now the public method is trickier. This is because, more strictly speaking, although the access modifier alone can tell who can use the methods, it cannot tell who will use the methods. More over, it also cannot tell how the methods are going to be used (that is, whether the methods are going to be called with invalid inputs in the given scopes or not).
The Ultimate Determining Factor
Although access modifiers for methods in the code can hint on how to use the methods, ultimately, it is humans who will use the methods, and it is up to the humans how they are going to use them and with what inputs. Thus, in some rare cases, it is possible to have a public method which is only called in some private scope and in that private scope, the inputs for the public methods are guaranteed to be valid before the public method is called.
In such cases then, even the access modifier is public, there isn't any real need to check for invalid inputs, except for robust design reason. And why is this so? Because there are humans who know completely when and how the methods shall be called!
Here we can see, there is no guarantee either that public method always require checking for invalid inputs. And if this is true for public methods, it must also be true for protected, internal, protected internal, and private methods as well.
Conclusions
So, in conclusion, we can say a couple of things to help us making decisions:
Generally, it is better to have checks for any invalid inputs for robust design reason, provided that performance is not at stake. This is true for any type of access modifiers.
The invalid inputs check could be skipped if performance gain could be significantly improved by doing so, provided that it can also be guaranteed that the scope where the methods are called are always giving the methods valid inputs.
private method is usually where we skip such checking, but there is no guarantee that we cannot do that for public method as well
Humans are the ones who ultimately use the methods. Regardless of how the access modifiers can hint the use of the methods, how the methods are actually used and called depend on the coders. Thus, we can only say about general/good practice, without restricting it to be the only way of doing it.
The public interface of your library deserves tight checking of preconditions, because you should expect the users of your library to make mistakes and violate the preconditions by accident. Help them understand what is going on in your library.
The private methods in your library do not require such runtime checking because you call them yourself. You are in full control of what you are passing. If you want to add checks because you are afraid to mess up, then use asserts. They will catch your own mistakes, but do not impede performance during runtime.
Though you tagged language-agnostic, it seems to me that it probably doesn't exist a general response.
Notably, in your example you hinted the argument: so with a language accepting hinting it'll fire an error as soon as entering the function, before you can take any action.
In such a case, the only solution is to have checked the argument before calling your function... but since you're writing a library, that cannot have sense!
In the other hand, with no hinting, it remains realistic to check inside the function.
So at this step of the reflexion, I'd already suggest to give up hinting.
Now let's go back to your precise question: to what level should it be checked?
For a given data piece it'd happen only at the highest level where it can "enter" (may be several occurrences for the same data), so logically it'd concern only public methods.
That's for the theory. But maybe you plan a huge, complex, library so it might be not easy to ensure having certainty about registering all "entry points".
In this case, I'd suggest the opposite: consider to merely apply your controls everywhere, then only omit it where you clearly see it's duplicate.
Hope this helps.
In my opinion you should ALWAYS check for "invalid" data - independent whether it is a private or public method.
Looked from the other way... why should you be able to work with something invalid just because the method is private? Doesn't make sense, right? Always try to use defensive programming and you will be happier in life ;-)
This is a question of preference. But consider instead why are you checking for null or rather checking for valid input. It's probably because you want to let the consumer of your library to know when he/she is using it incorrectly.
Let's imagine that we have implemented a class PersonList in a library. This list can only contain objects of the type Person. We have also on our PersonList implemented some operations and therefore we do not want it to contain any null values.
Consider the two following implementations of the Add method for this list:
Implementation 1
public void Add(Person item)
{
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Implementation 2
public void Add(Person item)
{
if(item == null)
{
throw new ArgumentNullException("Cannot add null to PersonList");
}
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Let's say we go with implementation 1
Null values can now be added in the list
All opoerations implemented on the list will have to handle theese null values
If we should check for and throw a exception in our operation, consumer will be notified about the exception when he/she is calling one of the operations and it will at this state be very unclear what he/she has done wrong (it just wouldn't make any sense to go for this approach).
If we instead choose to go with implementation 2, we make sure input to our library has the quality that we require for our class to operate on it. This means we only need to handle this here and then we can forget about it while we are implementing our other operations.
It will also become more clear for the consumer that he/she is using the library in the wrong way when he/she gets a ArgumentNullException on .Add instead of in .Sort or similair.
To sum it up my preference is to check for valid argument when it is being supplied by the consumer and it's not being handled by the private/internal methods of the library. This basically means we have to check arguments in constructors/methods that are public and takes parameters. Our private/internal methods can only be called from our public ones and they have allready checked the input which means we are good to go!
Using Code Contracts should also be considered when verifying input.
I was wondering what is the best practice for accessing the owner instance when using composition (not aggregation)
public class Manager
{
public List<ElementToManage> Listelmt;
public List<Filter> ListeFilters;
public void LoadState(){}
}
public class Filter
{
public ElementToManage instance1;
public ElementToManage instance2;
public object value1;
public object value2;
public LoadState()
{
//need to access the property Listelmt in the owner instance (manager instance)
//instance1 = Listelmt.SingleOrDefault(...
}
}
So far I'm thinking about two possibilities:
Keep a reference to the owner in the Filter instance.
Declare an event in the Filter class. The manager instance subscribe to it, and the filter throw it when needed.
I feel more like using the second possibility. It seems more OOP to me, and there is less dependencies between the classes ( any refactoring later will be easier),
But debugging and tracing may be a bit harder on the long run.
Regarding business layer classes, i don't remember seeing events for this purpose.
Any insight would be greatly appreciated
There is no concept of an "owner" of a class instance, there should not be any strong coupling between the Filter instance and the object that happens to have an instance of it.
That being the case an event seems appropriate: It allows for loose coupling while enabling the functionality you want. If you went with option #1 on the other hand you would limit the overall usefulness of the Filter class - now it can only be contained in Manager classes, I don't think that is what you would want.
Overall looking at your code you might want to pass in the relevant data the method LoadState operates on so it doesn't have to "reach out".
I recomend the reference to owner of filter instance. The event can be handled by more handlers and can change result of previous handler(s). And you propadly don't want change the owner during lifetime of Filter without notification the Filter instance.
My short answer : Neither.
First option to keep a reference to the owner is problematic for several reasons. Filter class no longer has a single responsibility. Filter and Manager are tightly coupled. etc.
Second option is only a little better, and yes I've used events in similar scenearios, it rarely if ever ends well.
It's difficult to give a definite advice without more specific details. Some thoughts:
1) Are you sure your classes are as they should be? Maybe there should be a class to compose a single ElementToManage and a single Filter ?
2) Who is responsible for creating a Filter? For example, if it is Manager, maybe the Manager can give the list as a construction parameter? Maybe you can create a FilterFactory class that does any needed initializations.
3) Who calls filter.LoadState()? Maybe the needed list could be passed as a parameter to the LoadState() method.
4) I frequently use an "Initialization Design Pattern" (my terminology) For example I'll have a BinaryTree where parent and child will point to each other. The Factory constructs the nodes in a plain state, and than calls an initialize method with other needed objects. The class becomes complicated because I probably need to ensure that an uninitialized object raises an error for every other usage, and need to ensure that an object is initialized only once, is initialized only through the Factory, etc. But if it works, it is usually the best solution, in my opinion.
5) I'm still trying to learn "Dependency Injection" and getting nowhere, I guess it may have something to do with your question. I wonder if someone will come with an answer involving Dependency Injection.
I am entry level .Net developer and using it to develop web sites. I started with classic asp and last year jumped on the ship with a short C# book.
As I developed I learned more and started to see that coming from classic asp I always used C# like scripting language.
For example in my last project I needed to encode video on the webserver and wrote a code like
public class Encoder
{
Public static bool Encode(string videopath) {
...snip...
return true;
}
}
While searching samples related to my project I’ve seen people doing this
public class Encoder
{
Public static Encode(string videopath) {
EncodedVideo encoded = new EncodedVideo();
...snip...
encoded.EncodedVideoPath = outputFile;
encoded.Success = true;
...snip...
}
}
public class EncodedVideo
{
public string EncodedVideoPath { get; set; }
public bool Success { get; set; }
}
As I understand second example is more object oriented but I don’t see the point of using EncodedVideo object.
Am I doing something wrong? Does it really necessary to use this sort of code in a web app?
someone once explained OO to me as a a soda can.
A Soda can is an object, an object has many properties. And many methods. For example..
SodaCan.Drink();
SodaCan.Crush();
SocaCan.PourSomeForMyHomies();
etc...
The purpose of OO Design is theoretically to write a line of code once, and have abstraction between objects.
This means that Coder.Consume(SodaCan.contents); is relative to your question.
An encoded video is not the same thing as an encoder. An encoder returns an encoded video. and encoded video may use an encoder but they are two seperate objects. because they are two different entities serving different functions, they simply work together.
Much like me consuming a soda can does not mean that I am a soda can.
Neither example is really complete enough to evaluate. The second example seems to be more complex than the first, but without knowing how it will be used it's difficult to tell.
Object Oriented design is at it's best when it allows you to either:
1) Keep related information and/or functions together (instead of using parallel arrays or the like).
Or
2) Take advantage of inheritance and interface implementation.
Your second example MIGHT be keeping the data together better, if it returns the EncodedVideo object AND the success or failure of the method needs to be kept track of after the fact. In this case you would be replacing a combination of a boolean "success" variable and a path with a single object, clearly documenting the relation of the two pieces of data.
Another possibility not touched on by either example is using inheritance to better organize the encoding process. You could have a single base class that handles the "grunt work" of opening the file, copying the data, etc. and then inherit from that class for each different type of encoding you need to perform. In this case much of your code can be written directly against the base class, without needing to worry about what kind of encoding is actually being performed.
Actually the first looks better to me, but shouldn't return anything (or return an encoded video object).
Usually we assume methods complete successfully without exceptional errors - if exceptional errors are encountered, we throw an exception.
Object oriented programming is fundamentally about organization. You can program in an OO way even without an OO language like C#. By grouping related functions and data together, it is easier to deal with increasingly complex projects.
You aren't necessarily doing something wrong. The question of what paradigm works best is highly debatable and isn't likely to have a clear winner as there are so many different ways to measure "good" code,e.g. maintainable, scalable, performance, re-usable, modular, etc.
It isn't necessary, but it can be useful in some cases. Take a look at various MVC examples to see OO code. Generally, OO code has the advantage of being re-usable so that what was written for one application can be used for others over and over again. For example, look at log4net for example of a logging framework that many people use.
The way your structure an OO program--which objects you use and how you arrange them--really depends on many factors: the age of the project, the overall size of the project, complexity of the problem, and a bit for just personal taste.
The best advice I can think of that will wrap all the reasons for OO into one quick lesson is something I picked up learning design patterns: "Encapsulate the parts that change." The value of OO is to reuse elements that will be repeated without writing additional code. But obviously you only care to "wrap up" code into objects if it will actually be reused or modified in the future, thus you should figure out what is likely to change and make objects out of it.
In your example, the reason to use the second set up may be that you can reuse the EncodedVideo object else where in the program. Anytime you need to deal with EncodedVideo, you don't concern yourself with the "how do I encode and use video", you just use the object you have and trust it to handle the logic. It may also be valuable to encapsulate the encoding logic if it's complex, and likely to change. Then you isolate changes to just one place in the code, rather than many potential places where you might have used the object.
(Brief aside: The particular example you posted isn't valid C# code. In the second example, the static method has no return type, though I assume you meant to have it return the EncodedVideo object.)
This is a design question, so answer depends on what you need, meaning there's no right or wrong answer. First method is more simple, but in second case you incapsulate encoding logic in EncodedVideo class and you can easily change the logic (based on incoming video type, for instance) in your Encoder class.
I think the first example seems more simple, except I would avoid using statics whenever possible to increase testability.
public class Encoder
{
private string videoPath;
public Encoder(string videoPath) {
this.videoPath = videoPath;
}
public bool Encode() {
...snip...
return true;
}
}
Is OOP necessary? No.
Is OOP a good idea? Yes.
You're not necessarily doing something wrong. Maybe there's a better way, maybe not.
OOP, in general, promotes modularity, extensibility, and ease of maintenance. This goes for web applications, too.
In your specific Encoder/EncodedVideo example, I don't know if it makes sense to use two discrete objects to accomplish this task, because it depends on a lot of things.
For example, is the data stored in EncodedVideo only ever used within the Encode() method? Then it might not make sense to use a separate object.
However, if other parts of the application need to know some of the information that's in EncodedVideo, such as the path or whether the status is successful, then it's good to have an EncodedVideo object that can be passed around in the rest of the application. In this case, Encode() could return an object of type EncodedVideo rather than a bool, making that data available to the rest of your app.
Unless you want to reuse the EncodedVideo class for something else, then (from what code you've given) I think your method is perfectly acceptable for this task. Unless there's unrelated functionality in EncodedVideo and the Encoder classes or it forms a massive lump of code that should be split down, then you're not really lowering the cohesion of your classes, which is fine. Assuming you don't need to reuse EncodedVideo and the classes are cohesive, by splitting them you're probably creating unnecessary classes and increasing coupling.
Remember: 1. the OO philosophy can be quite subjective and there's no single right answer, 2. you can always refactor later :p
i have a design problem.. it may seem that i'm giving you too much details, but those are important.
say i have a very large input form, with a complicated input, that requires quiet complicated validations, includes validations of relations between different inputs. being probably a very burdensome form for the user, i'd like to give him the ultimate experience, and i really don't want to be restricted by programing difficulties here.
i thought that idealic every control should have an empty value at start except those of course, that have default values (the problem is DateTimePicker and such are not supporting empty value).
now the user can fill in any of the controls, in any order he would like. once he has leave the control, the program will validate the control's value, and any of the others validations which are concern with that control, and with other controls that are all non-empty (have been filed in already).
if there are any validation errors, the control is painted in some color, and in some side panel it will specify the errors (in a user friendly language of course, rather than exceptions' descriptions).
if there are errors that concerns to more than one control, only the last one that has been changed is painted.
i'd really like to keep to as many OOP concepts here..
so i have my logic classes, that are dealing with calculating the output and stuff like that. obviously those have nothing to do with the gui. now all of these complicated validations should be also in the logic classes' properties etc. but should be used in the gui as well, so i think there should be something like static validate methods (within the logic classes), that will be used in the gui, and in the logic classes them self.
the problem is, a logic class might contain up to 20 maybe 30 fields to validate... will that static method take 30 parameters? is that okay or is there more acceptable solution?
i'm a bit lost for anything beyond that.. but i'm quite sure there already are some conventions for these situations... i know it has something to do with design patterns, but i have no idea what design patterns there are, which are dealing with such cases, and where should i read about them.
my question basically is how do i integrate the validation of the logic classes and the gui, in the neatest way.
if i already in that, i don't want to open a new question for these:
as i mentioned, i need a method here, that get all the input, all the fields of the class, and somehow perform all the validation checks on the non-null values (if there is a validation check that concern to a few parameters, and even one of them is null, the validation shall not be execute). if you have any interesting ideas, i'd like to hear.
another problem i bump into, is the non-emptyale controls, such as DateTimePicker.... it's really ugly that it will have a certain value, while it should not... don't you think?
p.s.
sorry about my english.. i was too tired to write it perfectly..
EDIT1 working with windows
will that static method take 30
parameters?
Yes but what if you pass your object into your static validation method instead of all its properties individually ex.
public static class YourClassRules
{
public List<SomeSortOfValidationItem> Validate(YourClass obj)
{
var results = new List<SomeSortOfValidationItem>()
if (obj.YourProperty.Length >= 200)
{
results.Add(new SormSortOfValidationItem("YourProperty", "Length must be less than...");
}
//etc.
}
}
my question basically is how do i
integrate the validation of the logic
classes and the gui, in the neatest
way.
There are several different frameworks available. It would be helpful to know if your doing windows or web. Then we could make some recomendations.
another problem i bump into, is the
non-emptyale controls, such as
DateTimePicker.
Are you having issues with the controls or the properties that are bound to the controls. I often use DateTime? or Nullable which will allow for a null value.
Hope this helps.
DataAnnotations can be very easy to implement and very effective. Read this answer for an alternative that can extend further. Also, this question has some great gems regarding validation models too.
Spring has a very good DataBinding and validation API. Since there is a Spring.NET version, I'd recommend looking into it.
Why use a GlobalClass? What are they for? I have inherited some code (shown below) and as far as I can see there is no reason why strUserName needs this. What is all for?
public static string strUserName
{
get { return m_globalVar; }
set { m_globalVar = value; }
}
Used later as:
GlobalClass.strUserName
Thanks
You get all the bugs of global state and none of the yucky direct variable access.
If you're going to do it, then your coder implemented it pretty well. He/She probably thought (correctly) that they would be free to swap out an implementation later.
Generally it's viewed as a bad idea since it makes it difficult to test the system as a whole the more globals you have in it.
My 2 cents.
When you want to use a static member of a type, you use it like ClassName.MemberName. If your code snippet is in the same class as the member you're referring (in this example, you're coding in a GlobalClass member, and using strUserName) you can omit the class name. Otherwise, it's required as the compiler wouldn't have any knowledge of what class you're referring to.
This is a common approach when dealing with Context in ASP.Net; however, the implementation would never use a single variable. So if this is a web app I could see this approach being used to indicate who the current user is (Although there are better ways to do this).
I use a simillar approach where I have a MembershipService.CurrentUser property which then pulls a user out from either SessionState or LogicalCallContext (if its a web or client app).
But in these cases these aren't global as they are scoped within narrow confines (Like the http session state).
One case where I have used a global like this would be if I have some data which is static and never changes, and is loaded from the DB (And there's not enough of the data to justify storing it in a cache). You could just store it in a static variable so you don;t have to go back to the DB.
One a side note why was the developer using Hungarian notation to name Properties? even when there was no intellisense and all the goodness our IDEs provide we never used hungarian notation on Properties.
#Jayne, #Josh, it's hard to tell - but the code in the question could also be a static accessor to a static field - somewhat different than #Josh's static helper example (where you use instance or context variables within your helper).
Static Helper methods are a good way to conveniently abstract stateless chunks of functionality. However in the example there is potential for the global variable to be stateful - Demeter's Law guides us that you should only play with state that you own or are given e.g. by parameters.
http://www.c2.com/cgi/wiki?LawOfDemeter
Given the rules there occasional times when it is necessary to break them. You should trade the risk of using global state (primarily the risk of creating state/concurrency bugs) vs. the necessity to use globals.
Well if you want a piece of data to be available to any other class running in the jvm then the Global Class is the way to go.
There are only two slight problems;
One. The implmentation shown is not thread safe. The set... method of any global class should be marked critical or wrapped in a mutex.
Even in the niave example above consider what happens if two threads run simultaniously:
set("Joe") and set("Frederick") could result in "Joederick" or "Fre" or some other permutation.
Two. It doesnt scale well. "Global" refers to a single jvm. A more complex runtime environment like Jboss could be runnning several inter communicating jvms. So the global userid could be 'Joe' or 'Frederick' depending on which jvm your EJB is scheduled.