C# Extension Methods - How far is too far?


Rails introduced some core extensions to Ruby like 3.days.from_now which returns, as you'd expect, a date three days in the future. With extension methods in C# we can now do something similar:
using System;

static class Extensions
{
    public static TimeSpan Days(this int i)
    {
        return new TimeSpan(i, 0, 0, 0, 0);
    }

    public static DateTime FromNow(this TimeSpan ts)
    {
        return DateTime.Now.Add(ts);
    }
}

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(
            3.Days().FromNow()
        );
    }
}
Or how about:
using System;
using System.Collections.Generic;
using System.Linq;

static class Extensions
{
    // Inclusive range: 10.To(20) yields 10 through 20.
    public static IEnumerable<int> To(this int from, int to)
    {
        return Enumerable.Range(from, to - from + 1);
    }
}

class Program
{
    static void Main(string[] args)
    {
        foreach (var i in 10.To(20))
        {
            Console.WriteLine(i);
        }
    }
}
Is this fundamentally wrong, or are there times when it is a good idea, like in a framework like Rails?

I like extension methods a lot but I do feel that when they are used outside of LINQ that they improve readability at the expense of maintainability.
Take 3.Days().FromNow() as an example. This is wonderfully expressive and anyone could read this code and tell you exactly what it does. That is a truly beautiful thing. As coders it is our joy to write code that is self-describing and expressive so that it requires almost no comments and is a pleasure to read. This code is exemplary in that respect.
However, as coders we are also responsible to posterity, and those who come after us will spend most of their time trying to comprehend how this code works. We must be careful not to be so expressive that debugging our code requires leaping around amongst a myriad of extension methods.
Extension methods veil the "how" to better express the "what". I guess that makes them a double edged sword that is best used (like all things) in moderation.

First, my gut feeling: 3.Minutes.from_now looks totally cool, but does not demonstrate why extension methods are good. This also reflects my general view: cool, but I've never really missed them.
Question: Is 3.Minutes a timespan, or an angle?
Namespaces referenced through a using directive "normally" only affect types. Now they suddenly decide what 3.Minutes means.
So the best is to "not let them escape".
All public extension methods in a likely-to-be-referenced namespace end up being "kind of global" - with all the potential problems associated with that. Keep them internal to your assembly, or put them into a separate namespace that is added to each file separately.
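A minimal sketch of the "don't let them escape" advice. The namespace MyApp.TimeExtensions and the class names are hypothetical, chosen only for illustration: the extension class is internal, and it lives in a namespace that each file has to import explicitly.

```csharp
using System;

// Hypothetical opt-in namespace: a file only sees Minutes()
// if it says "using MyApp.TimeExtensions;".
namespace MyApp.TimeExtensions
{
    // "internal" keeps the extension invisible outside this assembly.
    internal static class IntTimeExtensions
    {
        public static TimeSpan Minutes(this int value)
        {
            return TimeSpan.FromMinutes(value);
        }
    }
}

namespace MyApp
{
    using MyApp.TimeExtensions; // remove this line and 3.Minutes() no longer compiles

    internal static class Program
    {
        private static void Main()
        {
            TimeSpan wait = 3.Minutes();
            Console.WriteLine(wait.TotalSeconds); // 180
        }
    }
}
```

With this layout, a reader who wonders whether 3.Minutes() is a time span or an angle only has to look at the using directives at the top of the file.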

Personally I like int.To, I am ambivalent about int.Days, and I dislike TimeSpan.FromNow.
I dislike what I see as a bit of a fad for 'fluent' interfaces that let you write pseudo English code but do it by implementing methods with names that can be baffling in isolation.
For example, this doesn't read well to me:
TimeSpan.FromSeconds(4).FromNow()
Clearly, it's a subjective thing.

I agree with siz and lean conservative on this issue. Rails has that sort of stuff baked in, so it's never really that confusing. When you write your own "days" and "fromnow" methods, there is no guarantee that your code is bug free. Also, you are adding a dependency to your code. If you put your extension methods in their own file, you need that file in every project. If you put them in their own project, you need to reference that project wherever you need them.
All that said, for really simple extension methods (like Jeff's usage of "left" or thatismatt's usage of days.fromnow above) that exist in other frameworks/worlds, I think it's ok. Anyone who is familiar with dates should understand what "3.Days().FromNow()" means.

I'm on the conservative side of the spectrum, at least for the time being, and am against extension methods. It is just syntactic sugar that, to me, is not that important. I think it can also be a nightmare for junior developers if they are new to C#. I'd rather encapsulate the extensions in my own objects or static methods.
If you are going to use them, just please don't overuse them to a point that you are making it convenient for yourself but messing with anyone else who touches your code. :-)

Each language has its own perspective on what a language should be. Rails and Ruby are designed with their own, very distinct opinions. PHP has clearly different opinions, as does C(++/#)...as does Visual Basic (though apparently we don't like their style).
The balance is having many, easily-read, built-in functions vs. the nitty-gritty control over everything. I wouldn't want SO many functions that you have to go to a lookup every time you want to do anything (and there's got to be a performance overhead to a bloated framework), but I personally love Rails, because what it has saves me a lot of time developing.
I guess what I'm saying here is that if you were designing a language, take a stance, go from there, and build in the functions you (or your target developer would) use most often.

My personal preference would be to use them sparingly for now and to wait to see how Microsoft and other big organizations use them. If we start seeing a lot of code, tutorials, and books use code like 3.Days().FromNow(), it makes sense to use it a lot. If only a small number of people use it, then you run the risk of having your code be overly difficult to maintain because not enough people are familiar with how extensions work.
On a related note, I wonder how the performance compares between a normal for loop and the foreach one? It would seem like the second method would involve a lot of extra work for the computer, but I'm not familiar enough with the concept to know for sure.

Related

How does the functional programming recommendation for static methods influence testability?

The more I dive into functional programming, the more often I read the recommendation to favor static methods over non-static ones. You can read about that recommendation in this book for example:
http://www.amazon.de/Functional-Programming-Techniques-Projects-Programmer/dp/0470744588
Of course that makes sense if you think about functional purity. A static function stands there and says: "I do not need any state!"
However, how does that influence testability? I mean, doesn't a system with a lot of static methods become a pain to test (since static methods are hard to mock)? Or do mocks play a minor role in functional programming, and if so, why?
EDIT
Since there are doubts about whether the book really makes that recommendation, I will quote a little more. I hope that's OK with Oliver Sturm.
Use Static Methods
Static methods is one of the basic ideas worth considering as a general guideline. It is supported by many object oriented programmers, and from a functional point of view, functions can be made static most of the time. Any pure function can be made static.
(...)
Some may argue that the idea of always passing around all parameters means you're not exploiting the ideas of object orientation as much as you could. That may in fact be true, but then perhaps it is because object orientation concepts don't give as much consideration to issues of parallel execution as they should.
(...)
Finally, a guideline to recommend: when you have written a method that does not require access to any field in the class it lives in, make it static!
Btw, there have been good answers so far. Thanks for that!

One way of looking at this is that for functional programming you only need to mock state (by providing suitable inputs) that is required by the specific function. For OO programming you need to mock all of the state required for the inner working of the class.
Functional programs also have the side benefit that you can guarantee that repeating the same test with the same input will give the same result. In classic OO you have to guarantee not just the same input, but the same overall state.
In well-architected OO code, the difference will be minimal (as classes will have well-defined responsibilities), but the requirements for a functional test are still a strict subset of those for the equivalent OO test.
(I realise that functional programming styles can make use of OO via immutable objects - please read mentions of OO above as 'object oriented programming with mutable state')
Edit:
As pointed out by Fredrik, the important part about functional methods is not that they are static, but that they do not mutate the state of the program. A 'pure' function is a mapping from a set of inputs to a set of outputs (same input always gives same result), and has no other effect.

I think that static methods per se are not the problem; the problem comes when they start to operate on static data. As long as the static method takes its input as arguments, operates on it and returns a result, I see no problem testing it.
Even when I am not pursuing a functional approach in my code, I tend to make methods static whenever I can. But I think very carefully before introducing static state, or a static type.

All "state" in pure functional programming comes from the inputs. To unit test functional programs you create test inputs and observe the outputs. If your methods can not be tested by giving them test inputs and observing the output they are not functional enough.
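As a rough illustration of "create test inputs and observe the outputs", here is a sketch with a hypothetical pure static method, Pricing.ApplyDiscount (not from the thread). The test needs no mocks and no setup because the method reads nothing except its arguments.

```csharp
using System;

static class Pricing
{
    // Pure: the result depends only on the arguments; no fields are read or written.
    public static decimal ApplyDiscount(decimal price, decimal percent)
    {
        return price - (price * percent / 100m);
    }
}

static class PricingTests
{
    static void Main()
    {
        // Test input in, observed output out. Nothing to mock.
        decimal result = Pricing.ApplyDiscount(200m, 25m);
        if (result != 150m) throw new Exception("expected 150, got " + result);
        Console.WriteLine("ok");
    }
}
```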

In functional programming you would want to mock functions instead of objects. So if you want to test function f without depending on some ComplicatedAndLongFunction in
f(x)
{
    myx = g(x);
    y = ComplicatedAndLongFunction(myx);
    myy = h(y);
    return myy;
}
you may want to decouple f from ComplicatedAndLongFunction by injecting the latter into f:
f(x, calc)
{
    myx = g(x);
    y = calc(myx);
    myy = h(y);
    return myy;
}
so you can specify the behavior of calc in your test.
This raises the question (in my head at least) if there are mocking frameworks that make it easy to specify expectations on functions without having to revert to objects.
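In C# the injected function can simply be a Func delegate, and the "mock" can be a lambda written in the test itself. The sketch below assumes int-valued stand-ins for g and h (named G and H here); those bodies are invented for illustration and are not part of the answer above.

```csharp
using System;

static class Pipeline
{
    // Hypothetical stand-ins for g and h from the pseudocode.
    static int G(int x) { return x + 1; }
    static int H(int y) { return y * 2; }

    // The expensive step arrives as a delegate instead of being hard-wired.
    public static int F(int x, Func<int, int> calc)
    {
        int myx = G(x);
        int y = calc(myx);
        return H(y);
    }
}

static class PipelineTests
{
    static void Main()
    {
        int seen = 0;

        // A hand-written stub replaces ComplicatedAndLongFunction in the test
        // and records the argument it was called with.
        int result = Pipeline.F(1, v => { seen = v; return 10; });

        // G(1) = 2 reaches the stub; the stub's 10 goes through H, giving 20.
        if (seen != 2) throw new Exception("stub received " + seen);
        if (result != 20) throw new Exception("F returned " + result);
        Console.WriteLine("ok");
    }
}
```

For simple cases like this, the captured local variable does the job of an expectation, so no mocking framework is involved.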

Use of Syntactic Sugar / Built in Functionality

I was busy looking deeper into things like multi-threading and deadlocking etc. The book is aimed at both pseudo-code and C code and I was busy looking at implementations for things such as Mutex locks and Monitors.
This brought to mind the following; in C# and in fact .NET we have a lot of syntactic sugar for doing things. For instance (.NET 3.5):
lock (obj)
{
    body
}
Is identical to:
var temp = obj;
Monitor.Enter(temp);
try
{
    body
}
finally
{
    Monitor.Exit(temp);
}
There are other examples of course, such as the using() {} construct etc. My question is when is it more applicable to "go it alone" and literally code things oneself than to use the "syntactic sugar" in the language? Should one ever use their own ways rather than those of people who are more experienced in the language you're coding in?
I recall having to not use a Process object in a using block to help with some multi-threaded issues and infinite looping before. I still feel dirty for not having the using construct in there.
Thanks,
Kyle

Stick to the syntactic sugar as much as possible. It's concise, more maintainable, less error-prone, well understood, and they created it for a reason.
If you must have manual control over something (e.g. manipulating an IEnumerator<T> instead of using foreach), then yes, ditch the syntactic sugar. Otherwise, being idiomatic is a good thing.
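One case where dropping foreach is reasonable, sketched with a hypothetical helper named Differences: the first element has to be consumed separately before the loop starts, which is awkward with foreach but direct with an explicit IEnumerator<T>.

```csharp
using System;
using System.Collections.Generic;

static class PairwiseDemo
{
    // Yields the difference between each element and the one before it.
    public static IEnumerable<int> Differences(IEnumerable<int> source)
    {
        // The enumerator is still disposed deterministically via using.
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            // Manual control: take the first element before looping.
            if (!e.MoveNext()) yield break;
            int previous = e.Current;

            while (e.MoveNext())
            {
                yield return e.Current - previous;
                previous = e.Current;
            }
        }
    }

    static void Main()
    {
        foreach (int d in Differences(new[] { 1, 4, 9, 16 }))
        {
            Console.WriteLine(d); // prints 3, 5, 7 on separate lines
        }
    }
}
```

Note that the caller still uses foreach; only the one method that needs the extra control gives up the sugar.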

The biggest cost of software development is maintenance over the long term, so the answer is always to do the thing that will give you the easiest and most cost-effective maintenance path (with all the exceptions that might prove the rule, perf for example). If you can use syntactic sugar to make your code more readable, then that's your answer. If the syntactic sugar gets in the way, then don't use it.

In C#, this LINQ statement:
var filteredCities =
    from city in cities
    where city.StartsWith("L") && city.Length < 15
    orderby city
    select city;
is syntactic sugar for (and equivalent to):
var filteredCities =
    cities.Where(c => c.StartsWith("L") && c.Length < 15)
          .OrderBy(c => c)
          .Select(c => c);
If you know C# well, the latter version is far easier to pick apart than the former; you can see exactly what it is doing under the hood.
However, for typical everyday use, most people find the sugared version cleaner to look at, and easier to read.

Your example of not being able to use a using construct is my most common deviation from the new approaches made available in .Net languages and the framework. There are just a lot of cases where the scope of an IDisposable object is a bit outside of a single function.
However, knowing about what these shortcuts do is still as important as ever. I do think many people simply won't dispose an object if they can't wrap it in a using, because they don't understand what it does and what it's making easier.
So I do wish there was something like a tooltip helptext for some of these wonderful shortcuts, that indicated something important is happening - maybe even a different keyword coloring.
Edit:
I've been thinking about this, and I've decided that I believe using is just a misleading keyword to have chosen. foreach does exactly what it sounds like, whereas using doesn't imply, to me, what's actually going on. Anybody have any thoughts on this? What if the keyword had been disposing instead; do you think it'd be any clearer?
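For anyone unsure what using is making easier, the sketch below puts the sugared form next to a hand-written try/finally that does roughly the same thing for a reference type. Resource is a hypothetical class that only records whether it was disposed.

```csharp
using System;

// Hypothetical disposable that just remembers whether Dispose was called.
sealed class Resource : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

static class UsingDemo
{
    static void Main()
    {
        // Sugared form: Dispose runs when the block exits, even on an exception.
        Resource a = new Resource();
        using (a)
        {
            Console.WriteLine("working with a");
        }

        // Roughly the hand-written equivalent for a reference type.
        Resource b = new Resource();
        try
        {
            Console.WriteLine("working with b");
        }
        finally
        {
            if (b != null) b.Dispose();
        }

        Console.WriteLine(a.Disposed && b.Disposed); // True
    }
}
```

When the object's lifetime spans more than one method, the second form (or an owning class that itself implements IDisposable) is what remains available.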

Getting my head around object oriented programming

I am an entry-level .NET developer and use it to develop web sites. I started with classic ASP and last year jumped ship with a short C# book.
As I developed, I learned more and started to see that, coming from classic ASP, I had always used C# like a scripting language.
For example, in my last project I needed to encode video on the web server and wrote code like
public class Encoder
{
    public static bool Encode(string videopath) {
        ...snip...
        return true;
    }
}
While searching samples related to my project I’ve seen people doing this
public class Encoder
{
    public static Encode(string videopath) {
        EncodedVideo encoded = new EncodedVideo();
        ...snip...
        encoded.EncodedVideoPath = outputFile;
        encoded.Success = true;
        ...snip...
    }
}
public class EncodedVideo
{
    public string EncodedVideoPath { get; set; }
    public bool Success { get; set; }
}
As I understand it, the second example is more object oriented, but I don't see the point of using the EncodedVideo object.
Am I doing something wrong? Is it really necessary to use this sort of code in a web app?

Someone once explained OO to me using a soda can.
A soda can is an object. An object has many properties and many methods. For example:
SodaCan.Drink();
SodaCan.Crush();
SodaCan.PourSomeForMyHomies();
etc...
The purpose of OO design is, in theory, to write a line of code once and to have abstraction between objects.
This means that Coder.Consume(SodaCan.contents); is relevant to your question.
An encoded video is not the same thing as an encoder. An encoder returns an encoded video, and an encoded video may use an encoder, but they are two separate objects. They are two different entities serving different functions; they simply work together.
Much like me consuming a soda can does not mean that I am a soda can.

Neither example is really complete enough to evaluate. The second example seems to be more complex than the first, but without knowing how it will be used it's difficult to tell.
Object Oriented design is at its best when it allows you to either:
1) Keep related information and/or functions together (instead of using parallel arrays or the like).
Or
2) Take advantage of inheritance and interface implementation.
Your second example MIGHT be keeping the data together better, if it returns the EncodedVideo object AND the success or failure of the method needs to be kept track of after the fact. In this case you would be replacing a combination of a boolean "success" variable and a path with a single object, clearly documenting the relation of the two pieces of data.
Another possibility not touched on by either example is using inheritance to better organize the encoding process. You could have a single base class that handles the "grunt work" of opening the file, copying the data, etc. and then inherit from that class for each different type of encoding you need to perform. In this case much of your code can be written directly against the base class, without needing to worry about what kind of encoding is actually being performed.

Actually the first looks better to me, but shouldn't return anything (or return an encoded video object).
Usually we assume methods complete successfully without exceptional errors - if exceptional errors are encountered, we throw an exception.

Object oriented programming is fundamentally about organization. You can program in an OO way even without an OO language like C#. By grouping related functions and data together, it is easier to deal with increasingly complex projects.

You aren't necessarily doing something wrong. The question of what paradigm works best is highly debatable and isn't likely to have a clear winner, as there are so many different ways to measure "good" code, e.g. maintainability, scalability, performance, re-usability, modularity, etc.
It isn't necessary, but it can be useful in some cases. Take a look at various MVC examples to see OO code. Generally, OO code has the advantage of being re-usable so that what was written for one application can be used for others over and over again. For example, look at log4net for example of a logging framework that many people use.

The way you structure an OO program--which objects you use and how you arrange them--really depends on many factors: the age of the project, the overall size of the project, the complexity of the problem, and a bit of personal taste.
The best advice I can think of that will wrap all the reasons for OO into one quick lesson is something I picked up learning design patterns: "Encapsulate the parts that change." The value of OO is to reuse elements that will be repeated without writing additional code. But obviously you only care to "wrap up" code into objects if it will actually be reused or modified in the future, thus you should figure out what is likely to change and make objects out of it.
In your example, the reason to use the second setup may be that you can reuse the EncodedVideo object elsewhere in the program. Anytime you need to deal with EncodedVideo, you don't concern yourself with "how do I encode and use video"; you just use the object you have and trust it to handle the logic. It may also be valuable to encapsulate the encoding logic if it's complex and likely to change. Then you isolate changes to just one place in the code, rather than many potential places where you might have used the object.
(Brief aside: The particular example you posted isn't valid C# code. In the second example, the static method has no return type, though I assume you meant to have it return the EncodedVideo object.)

This is a design question, so the answer depends on what you need, meaning there's no right or wrong answer. The first method is simpler, but in the second case you encapsulate the encoding logic in the EncodedVideo class and you can easily change the logic (based on incoming video type, for instance) in your Encoder class.

I think the first example seems simpler, except I would avoid using statics whenever possible to increase testability.
public class Encoder
{
    private string videoPath;

    public Encoder(string videoPath) {
        this.videoPath = videoPath;
    }

    public bool Encode() {
        ...snip...
        return true;
    }
}

Is OOP necessary? No.
Is OOP a good idea? Yes.
You're not necessarily doing something wrong. Maybe there's a better way, maybe not.
OOP, in general, promotes modularity, extensibility, and ease of maintenance. This goes for web applications, too.
In your specific Encoder/EncodedVideo example, I don't know if it makes sense to use two discrete objects to accomplish this task, because it depends on a lot of things.
For example, is the data stored in EncodedVideo only ever used within the Encode() method? Then it might not make sense to use a separate object.
However, if other parts of the application need to know some of the information that's in EncodedVideo, such as the path or whether the status is successful, then it's good to have an EncodedVideo object that can be passed around in the rest of the application. In this case, Encode() could return an object of type EncodedVideo rather than a bool, making that data available to the rest of your app.
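A sketch of that last suggestion, combined with the earlier advice to signal failure with an exception rather than a Success flag. The encoding step itself is elided in the question, so a placeholder (changing the file extension to a made-up ".encoded") stands in for it here; that placeholder and the argument check are assumptions for illustration only.

```csharp
using System;
using System.IO;

public class EncodedVideo
{
    public string EncodedVideoPath { get; set; }
}

public static class Encoder
{
    // Returns the result object; failure is reported by throwing, not by a flag.
    public static EncodedVideo Encode(string videoPath)
    {
        if (string.IsNullOrEmpty(videoPath))
        {
            throw new ArgumentException("A source path is required.", "videoPath");
        }

        // Placeholder for the real encoding work, which the question elides.
        string outputFile = Path.ChangeExtension(videoPath, ".encoded");

        return new EncodedVideo { EncodedVideoPath = outputFile };
    }
}

static class Program
{
    static void Main()
    {
        EncodedVideo video = Encoder.Encode("clip.avi");
        Console.WriteLine(video.EncodedVideoPath); // clip.encoded
    }
}
```

Callers that need the path later can hold on to the EncodedVideo instance, and callers that only care about success can simply let the exception propagate.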

Unless you want to reuse the EncodedVideo class for something else, then (from what code you've given) I think your method is perfectly acceptable for this task. Unless there's unrelated functionality in EncodedVideo and the Encoder classes or it forms a massive lump of code that should be split down, then you're not really lowering the cohesion of your classes, which is fine. Assuming you don't need to reuse EncodedVideo and the classes are cohesive, by splitting them you're probably creating unnecessary classes and increasing coupling.
Remember: 1. the OO philosophy can be quite subjective and there's no single right answer, 2. you can always refactor later :p

Anthropomorphising interfaces - good or bad idea?

I have for some time tried to anthropomorphise (meaning human readable) the names I give to interfaces. To me this is the same as giving an interface a role-based name, trying to capture the purpose of the interface in the name.
I was having a discussion with other developers who think this is a little strange and childish.
What do the folks of SO think?
Examples (C# syntax):
public interface IShowMessages
{
    void Show(string message);
    void Show(string title, string message);
}
public class TraceMessenger : IShowMessages
{
}
public interface IHaveMessageParameters
{
    IList<string> Parameters { get; }
}
public class SomeClass : IHaveMessageParameters
{
}

IThinkItsATerribleIdea

Of course you should always choose identifiers which are human readable. As in: transport the meaning which they convey even to somebody who is not as familiar with the problem to be solved by the code as you are.
However, using long identifiers does not make your identifiers more 'readable'. To any reasonably experienced programmer, 'tmp' conveys as much information as 'temporaryVariable' does. Same goes for 'i' vs. 'dummyCounter' etc.
In your particular example, the interface names are actually quite annoying since somebody who's used to developing object oriented systems will read the inheritance as 'is a'. And 'SomeClass is an IHaveMessageParameters' sounds silly.
Try using IMessagePrinter and IMessageParameterProvider instead.

Yes, that sounds like a good idea.
What's the alternative?
Code should be human-readable. Any fool can write code a computer can understand. The difficult part is writing code a human can understand.
Humans have to maintain the code, so it's pretty darn important that it is as easy to maintain as possible - that includes that the code should be as readable as possible.

Interfaces describe behavior, and so I name them so as to communicate the behavior they are mandating. This 'generally' means that the name is a verb (or adjective) or some form of action-describing phrase. Combined with the "I" for interface, this looks like what you are doing...
ICanMove, IControllable, ICanPrint, ISendMessages, etc...
Using adjectives, as in IControllable, IDisposable, IEnumerable, etc., communicates the same thought as a verb form and is terser, so I use this form as well...
Finally, more important (or at least equally important) than what you name the interface is keeping the interfaces you design as small and logically contained as possible. You should strive to have each interface represent as small and logically connected a set of methods/properties as possible. When an interface has so much in it that there is no obvious name that would describe all the behavior it mandates, it's a sign that there is too much in it, and that it needs to be refactored into two or more smaller interfaces. So, naming interfaces in the way you are proposing helps to enforce this type of organizational design, which is a good thing.

There's nothing strange about using simple human-readable names. But using the I for interface to also stand for the first-person I as though it's talking about itself... is a little unusual, yes.
But the bottom line is, whatever works for you and is understood by you and your team is fine. You gotta go with what works.

In my opinion this approach just puts a greater burden on developers to come up with such names, since it integrates the I as part of a sentence. I don't find IDisposable, for example, to be more difficult to read than ICanBeDisposed.

In the OP's examples, the anthropomorphic way compares well against alternatives - eg: IShowMessages vs. something like IMessageShower. But - this is not always the case. Interfaces I have used when programming game objects include: IOpenClosable and ILockable. Alternatives like ICanBeOpenedAndClosed and ICanBeLocked would be more verbose. Or you could simply do IAmOpenClosable and IAmLockable - but then you'd be adding the "Am" just for the anthropomorphic effect with no real information benefit. I am all for minimizing verbosity if the same amount of information is conveyed.

So long as the semantics of what is trying to be achieved aren't lost and terseness isn't irreparably compromised (IDoLotsOfThingsWhichIncludesTheFollowingColonSpace...), I wouldn't generally mind somebody other than myself doing it. Still, there are plenty of contexts in which terseness is paramount, in which this would be unacceptable.

Intentionally using the 'I for Interface' convention in the first person seems a bit silly to be honest. What starts out as a cute pun becomes impossible to follow consistently, and ends up clouding meaning later on. That said, your standalone example reads clearly enough and I wouldn't have a problem with it.

Ab-using languages

Some time ago I had to address a certain C# design problem when I was implementing a JavaScript code-generation framework. One of the solutions I came up with was using the “using” keyword in a totally different (hackish, if you please) way. I used it as syntactic sugar (well, originally it is that anyway) for building a hierarchical code structure. Something that looked like this:
CodeBuilder cb = new CodeBuilder();
using (cb.Function("foo"))
{
    // Generate some function code
    cb.Add(someStatement);
    cb.Add(someOtherStatement);
    using (cb.While(someCondition))
    {
        cb.Add(someLoopStatement);
        // Generate some more code
    }
}
It works because the Function and While methods return an IDisposable object that, upon dispose, tells the builder to close the current scope. Such a thing can be helpful for any tree-like structure that needs to be hard-coded.
Do you think such “hacks” are justified? Because you can say that in C++, for example, many of the features such as templates and operator overloading get over-abused and this behavior is encouraged by many (look at boost for example). On the other side, you can say that many modern languages discourage such abuse and give you specific, much more restricted features.
My example is, of course, somewhat esoteric, but real. So what do you think about the specific hack and of the whole issue? Have you encountered similar dilemmas? How much abuse can you tolerate?
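The question does not show how the builder is implemented, so the following is only a guess at the mechanism described: each Function or While call writes an opening line and returns a small IDisposable whose Dispose writes the closing brace. Everything beyond the names CodeBuilder, Function, While and Add is an assumption, including the string-typed condition and the two-space indentation.

```csharp
using System;
using System.Text;

sealed class CodeBuilder
{
    private readonly StringBuilder _output = new StringBuilder();
    private int _depth;

    public IDisposable Function(string name) { return Open("function " + name + "()"); }
    public IDisposable While(string condition) { return Open("while (" + condition + ")"); }

    // Appends one line at the current indentation level.
    public void Add(string statement)
    {
        _output.Append(new string(' ', _depth * 2)).AppendLine(statement);
    }

    private IDisposable Open(string header)
    {
        Add(header + " {");
        _depth++;
        return new Scope(this);
    }

    private void Close()
    {
        _depth--;
        Add("}");
    }

    public override string ToString() { return _output.ToString(); }

    // Disposing the scope closes the block it opened.
    private sealed class Scope : IDisposable
    {
        private CodeBuilder _owner;
        public Scope(CodeBuilder owner) { _owner = owner; }

        public void Dispose()
        {
            if (_owner == null) return; // a second Dispose call does nothing
            _owner.Close();
            _owner = null;
        }
    }
}

static class Program
{
    static void Main()
    {
        var cb = new CodeBuilder();
        using (cb.Function("foo"))
        {
            cb.Add("init();");
            using (cb.While("running"))
            {
                cb.Add("step();");
            }
        }
        Console.Write(cb); // nested, indented function/while blocks
    }
}
```

The nesting of the generated braces follows the nesting of the using blocks, which is the whole appeal of the trick. It also shows the cost: nothing in the signature of Function tells a reader that forgetting the using leaves a block unclosed.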

I think this is something that has blown over from languages like Ruby that have much more extensive mechanisms to let you create languages within your language (google for "dsl" or "domain specific languages" if you want to know more). C# is less flexible in this respect.
I think creating DSLs in this way is a good thing. It makes for more readable code. Using blocks can be a useful part of a DSL in C#. In this case I think there are better alternatives. The use of using in this case strays a bit too far from its original purpose. This can confuse the reader. I like Anton Gogolev's solution better, for example.

Offtopic, but just take a look at how pretty this becomes with lambdas:
var codeBuilder = new CodeBuilder();
codeBuilder.DefineFunction("Foo", x =>
{
    codeBuilder.While(condition, y =>
    {
    });
});

It would be better if the disposable object returned from cb.Function(name) were the object to which the statements are added. It is fine for this function builder to pass the calls through internally to private/internal functions on the CodeBuilder, as long as the sequence is clear to public consumers.
That holds so long as the Dispose implementation makes the following code cause a runtime error:
CodeBuilder cb = new CodeBuilder();
var function = cb.Function("foo");
using (function)
{
    // Generate some function code
    function.Add(someStatement);
}
function.Add(something); // this should throw
Then the behaviour is intuitive and relatively reasonable, and the correct usage (below) is encouraged because it prevents this from happening:
CodeBuilder cb = new CodeBuilder();
using (var function = cb.Function("foo"))
{
    // Generate some function code
    function.Add(someStatement);
}
I have to ask why you are using your own classes rather than the provided CodeDomProvider implementations though. (There are good reasons for this, notably that the current implementation lacks many of the C# 3.0 features, but since you don't mention it yourself...)
Edit: I would second Anton's suggestion to use lambdas. The readability is much improved (and you have the option of allowing Expression Trees).

If you go by the strictest definitions of IDisposable then this is an abuse. It's meant to be used as a method for releasing native resources in a deterministic fashion by a managed object.
The use of IDisposable has evolved to essentially be used by "any object which should have a deterministic lifetime". I'm not saying this is right or wrong, but that's how many APIs and users are choosing to use IDisposable. Given that definition it's not an abuse.

I wouldn't consider it terribly bad abuse, but I also wouldn't consider it good form because of the cognitive wall you're building for your maintenance developers. The using statement implies a certain class of lifetime management. This is fine in its usual uses and in slightly customized ones (like @heeen's reference to an RAII analogue), but those situations still keep the spirit of the using statement intact.
In your particular case, I might argue that a more functional approach like @Anton Gogolev's would be more in the spirit of the language as well as maintainable.
As to your primary question, I think each such hack must ultimately stand on its own merits as the "best" solution for a particular language in a particular situation. The definition of best is subjective, of course, but there are definitely times (especially when the external constraints of budgets and schedules are thrown into the mix) where a slightly more hackish approach is the only reasonable answer.

I often "abuse" using blocks. I think they provide a great way of defining scope. I have a whole series of objects that I use for capturing and restoring state (e.g. of combo boxes or the mouse pointer) during operations that may change the state. I also use them for creating and dropping database connections.
E.g.:
using(_cursorStack.ChangeCursor(System.Windows.Forms.Cursors.WaitCursor))
{
...
}

I wouldn't call it abuse. Looks more like a fancied up RAII technique to me. People have been using these for things like monitors.
