Use of Syntactic Sugar / Built in Functionality - c#

I was busy looking deeper into things like multi-threading and deadlocking etc. The book is aimed at both pseudo-code and C code and I was busy looking at implementations for things such as Mutex locks and Monitors.
This brought to mind the following; in C# and in fact .NET we have a lot of syntactic sugar for doing things. For instance (.NET 3.5):
lock(obj)
{
body
}
Is identical to:
var temp = obj;
Monitor.Enter(temp);
try
{
body
}
finally
{
Monitor.Exit(temp);
}
There are other examples of course, such as the using() {} construct etc. My question is when is it more applicable to "go it alone" and literally code things oneself than to use the "syntactic sugar" in the language? Should one ever use their own ways rather than those of people who are more experienced in the language you're coding in?
I recall having to not use a Process object in a using block to help with some multi-threaded issues and infinite looping before. I still feel dirty for not having the using construct in there.
Thanks,
Kyle

Stick to the syntactic sugar as much as possible. It's concise, more maintainable, less error-prone, well understood, and they created it for a reason.
If you must have manual control over something (e.g. manipulating an IEnumerator<T> instead of using foreach), then yes, ditch the syntactic sugar. Otherwise, being idiomatic is a good thing.
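To make the IEnumerator<T> case concrete, here is a minimal sketch of roughly what foreach expands to, plus one situation where driving the enumerator by hand is useful (consuming two items per step). The class and method names are invented for this example, and the decision to drop an unpaired trailing item is an arbitrary assumption.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorDemo
{
    // Sugared form: the compiler emits the GetEnumerator/MoveNext/Current/Dispose calls.
    public static int SumWithForeach(IEnumerable<int> items)
    {
        int sum = 0;
        foreach (int item in items)
        {
            sum += item;
        }
        return sum;
    }

    // Roughly what the foreach above expands to.
    public static int SumByHand(IEnumerable<int> items)
    {
        int sum = 0;
        using (IEnumerator<int> e = items.GetEnumerator())
        {
            while (e.MoveNext())
            {
                sum += e.Current;
            }
        }
        return sum;
    }

    // Manual control: read two items per iteration, which foreach cannot express directly.
    // An unpaired trailing item is dropped (an arbitrary choice for this sketch).
    public static List<string> Pairs(IEnumerable<int> items)
    {
        var result = new List<string>();
        using (IEnumerator<int> e = items.GetEnumerator())
        {
            while (e.MoveNext())
            {
                int first = e.Current;
                if (!e.MoveNext()) break;
                result.Add(first + "-" + e.Current);
            }
        }
        return result;
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3, 4, 5 };
        Console.WriteLine(SumWithForeach(data));                      // 15
        Console.WriteLine(SumByHand(data));                           // 15
        Console.WriteLine(string.Join(", ", Pairs(data).ToArray()));  // 1-2, 3-4
    }
}
```

The two Sum methods behave the same, which is the argument for the sugar; only Pairs needs the enumerator directly.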

The biggest cost of software development is maintenance over the long term, so the answer is always: do the thing that gives you the easiest and most cost-effective maintenance path (with the usual exceptions that prove the rule, performance for example). If syntactic sugar makes your code more readable, then that's your answer; if it gets in the way, don't use it.

In C#, this linq statement:
var filteredCities =
from city in cities
where city.StartsWith("L") && city.Length < 15
orderby city
select city;
is syntactic sugar for (and equivalent to):
var filteredCities =
cities.Where(c => c.StartsWith("L") && c.Length < 15)
.OrderBy(c => c)
.Select(c => c);
If you know C# well, the latter version is far easier to pick apart than the former; you can see exactly what it is doing under the hood.
However, for typical everyday use, most people find the sugared version cleaner to look at, and easier to read.
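A small runnable sketch can confirm that the two forms give the same result. The city list and the class name are made up for the example. The method-syntax version leaves out the trailing .Select(c => c), which is an identity projection and does not change the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    // Sugared query expression.
    public static List<string> WithQuerySyntax(IEnumerable<string> cities)
    {
        var filtered =
            from city in cities
            where city.StartsWith("L") && city.Length < 15
            orderby city
            select city;
        return filtered.ToList();
    }

    // The same query written as extension-method calls.
    public static List<string> WithMethodSyntax(IEnumerable<string> cities)
    {
        return cities
            .Where(c => c.StartsWith("L") && c.Length < 15)
            .OrderBy(c => c)
            .ToList();
    }

    public static void Main()
    {
        // Sample data invented for this sketch.
        string[] cities = { "London", "Paris", "Lima", "Llanfairpwllgwyngyll", "Lagos" };

        Console.WriteLine(string.Join(", ", WithQuerySyntax(cities).ToArray()));  // Lagos, Lima, London
        Console.WriteLine(string.Join(", ", WithMethodSyntax(cities).ToArray())); // Lagos, Lima, London
    }
}
```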

Your example of not being able to use a using construct is my most common deviation from the new approaches made available in .Net languages and the framework. There are just a lot of cases where the scope of an IDisposable object is a bit outside of a single function.
However, knowing about what these shortcuts do is still as important as ever. I do think many people simply won't dispose an object if they can't wrap it in a using, because they don't understand what it does and what it's making easier.
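For readers who have not looked under the hood, this is a rough sketch of what using does and of one common way to handle a disposable whose lifetime is longer than a single method. StringReader stands in for any IDisposable resource, and the LineSource class is hypothetical.

```csharp
using System;
using System.IO;

public static class UsingDemo
{
    // Sugared form.
    public static string ReadWithUsing(string text)
    {
        using (var reader = new StringReader(text))
        {
            return reader.ReadLine();
        }
    }

    // Roughly what the compiler generates for the method above.
    public static string ReadByHand(string text)
    {
        StringReader reader = new StringReader(text);
        try
        {
            return reader.ReadLine();
        }
        finally
        {
            if (reader != null)
            {
                ((IDisposable)reader).Dispose();
            }
        }
    }
}

// When the resource must outlive one method, the owning class holds it in a field
// and becomes IDisposable itself, so the caller can wrap the owner in a using block.
public sealed class LineSource : IDisposable
{
    private readonly StringReader _reader;

    public LineSource(string text) { _reader = new StringReader(text); }

    public string Next() { return _reader.ReadLine(); }

    public void Dispose() { _reader.Dispose(); }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(UsingDemo.ReadWithUsing("a\nb")); // a
        Console.WriteLine(UsingDemo.ReadByHand("a\nb"));    // a

        using (var source = new LineSource("x\ny"))
        {
            Console.WriteLine(source.Next()); // x
            Console.WriteLine(source.Next()); // y
        }
    }
}
```

The point of the second half is that skipping using does not mean skipping Dispose; the call simply moves to whoever owns the object.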
So I do wish there was something like a tooltip helptext for some of these wonderful shortcuts, that indicated something important is happening - maybe even a different keyword coloring.
Edit:
I've been thinking about this, and I've decided that using is just a misleading keyword to have chosen. foreach does exactly what it sounds like, whereas using doesn't imply, to me, what's actually going on. Does anybody have any thoughts on this? What if the keyword had been disposing instead; do you think it'd be any clearer?

Related

Do you like languages that let you put the "then" before the "if"?

I was reading through some C# code of mine today and found this line:
if (ProgenyList.ItemContainerGenerator.Status != System.Windows.Controls.Primitives.GeneratorStatus.ContainersGenerated) return;
Notice that you can tell without scrolling that it's an "if" statement that works with ItemContainerGenerator.Status, but you can't easily tell that if the "if" clause evaluates to "true" the method will return at that point.
Realistically I should have moved the "return" statement to a line by itself, but it got me thinking about languages that allow the "then" part of the statement first. If C# permitted it, the line could look like this:
return if (ProgenyList.ItemContainerGenerator.Status != System.Windows.Controls.Primitives.GeneratorStatus.ContainersGenerated);
This might be a bit "argumentative", but I'm wondering what people think about this kind of construct. It might serve to make lines like the one above more readable, but it also might be disastrous. Imagine this code:
return 3 if (x > y);
Logically we can only return if x > y, because there's no "else", but part of me looks at that and thinks, "are we still returning if x <= y? If so, what are we returning?"
What do you think of the "then before the if" construct? Does it exist in your language of choice? Do you use it often? Would C# benefit from it?
Let's reformat that a bit and see:
using System.Windows.Controls.Primitives;
...
if (ProgenyList.ItemContainerGenerator.Status != GeneratorStatus.ContainersGenerated)
{
return;
}
Now how hard is it to see the return statement? Admittedly in SO you still need to scroll over to see the whole of the condition, but in an IDE you wouldn't have to... partly due to not trying to put the condition and the result on the same line, and partly due to the using directive.
The benefit of the existing C# syntax is that the textual order reflects the execution order - if you want to know what will happen, you read the code from top to bottom.
Personally I'm not a fan of "return if..." - I'd rather reformat code for readability than change the ordering.
I don't like the ambiguity this invites. Consider the following code:
doSomething(x)
if (x > y);
doSomethingElse(y);
What is it doing? Yes, the compiler could figure it out, but it would look pretty confusing for a programmer.
Yes.
It reads better. Ruby has this as part of its syntax - the term being 'statement modifiers'
irb(main):001:0> puts "Yay Ruby!" if 2 == 2
Yay Ruby!
=> nil
irb(main):002:0> puts "Yay Ruby!" if 2 == 3
=> nil
To close, I need to stress that you need to 'use this with discretion'. The ruby idiom is to use this for one-liners. It can be abused - however I guess this falls into the realm of responsible development - don't constrain the better developers by building in restrictions to protect the poor ones.
It looks ugly to me. The existing syntax is much better.
if (x > y) return 3;
I think it's probably OK if the scope were limited to just return statements. As I said in my comment, imagine if this were allowed:
{
doSomething();
doSomethingElse();
// 50 lines...
lastThink();
} if (a < b);
But even just allowing it only on return statements is probably a slippery slope. People will ask, "return x if (a); is allowed, so why not something like doSomething() if (a);?" and then you're on your way down the slope :)
I know other languages do get away with it, but C#'s philosophy is more about making The One Right Way™ easy, and having more than one way to do something is usually avoided (though with exceptions). Personally, I think it works pretty well, because I can look at someone else's code and know that it's pretty much in the same style that I'd write it in.
I don't see any problem with
return 3 if (x > y);
It probably bothers you because you are not accustomed to the syntax. It is also nice to be able to say
return 3 unless y <= x
This is a nice syntax option, but I don't think that c# needs it.
I think Larry Wall was very smart when he put this feature into Perl. The idea is that you want to put the most important part at the beginning where it's easy to see. If you have a short statement (i.e. not a compound statement), you can put it before the if/while/etc. If you have a long (i.e. compound) statement, it goes in braces after the condition.
Personally I like languages that let me choose.
That said, if you refactor as well as reformat, it probably doesn't matter what style you use, because they will be equally readable:
using System.Windows.Controls.Primitives;
...
var isContainersGenerated =
ProgenyList.ItemContainerGenerator.Status == GeneratorStatus.ContainersGenerated;
if (!isContainersGenerated) return;
//alternatively
return if (!isContainersGenerated);
The concern when reading such code is that you assume a statement will execute, only to find out later that it might not.
For example if you read "doSomething(x)", you're thinking "okay so this calls doSomething(x)" but then you read the "if" after it and have to realise that the previous call is conditional on the if statement.
When the "if" is first you know immediately that the following code might happen and can treat it as such.
We tend to read sequentially, so reading and going in your mind "the following might happen" is a lot easier than reading and then realising everything you just read needs to be reparsed and that you need to evaluate everything to see if it's within the scope of your new if statement.
Both Perl and Ruby have this and it works fine. Personally I'm fine with as much functionality as you want to throw at me. The more choices I have to write my code the better the overall quality, right? "The right tool for the job" and all that.
Realistically though, it's kind of a moot point since it's pretty late for such a fundamental addition in C#'s lifecycle. We're at the point where any minor syntax change would take a lot of work to implement due to the size of the compiler code and its syntax parsing algorithm. For a new addition to be even considered it would have to bring quite a bit of functionality, and this is just a (not so) different way of saying the same thing.
Humans read beginning to end. In analyzing code flow, limits of the short term memory make it more difficult to read postfix conditions due to additional backtracking required. For short expressions, this may not be a problem, but for longer expressions it will incur significant overhead for users that are not seasoned in the language they are reading.
I agree that it's confusing. I had never heard of this construct before, so I think the correct way of putting the "then" before the "if" would be to always include the result of the else, like
return (x > y) ? 3 : null;
Otherwise there is no point in using imperative constructs like
return 3 if (x > y);
return 4 if (x == y);
return 5 if (x < y);
IMHO it's kind of weird, because I have no idea where I would use it...
It's like a lot of things really, it makes perfect sense when you use it in a limited context(a one liner), and makes absolutely no sense if you use it anywhere else.
The problem with that of course is that it'd be almost impossible to restrict the use to where it makes sense, and allowing its use where it doesn't make sense is just odd.
I know that there's a movement coming out of scripting languages to try and minimize the number of lines of code, but when you're talking about a compiled language, readability is really the key and as much as it might offend your sense of style, the 4 line model is clearer than the reversed if.
I think it's a useful construct and a programmer would use it to emphasize what is important in the code and to de-emphasize what is not important. It is about writing intention-revealing code.
I use something like this (in coffeescript):
index = bla.find 'a'
return if index is -1
The most important thing in this code is to get out (return) if nothing is found - notice the words I just used to explain the intention were in the same order as that in the code.
So this construct helps me to code in a way which reflects my intention slightly better.
It shouldn't be too surprising that the order required by correct English, or by traditional programming-language grammar, isn't always the most effective or simplest way to convey meaning.
Sometimes you need to let everything hang out and truly reassess what is really the best way to do something.
It's considered grammatically incorrect to put the answer before the question, why would it be any different in code?

Go To Statement Considered Harmful?

If the statement above is correct, then why when I use reflector on .Net BCL I see it is used a lot?
EDIT: let me rephrase: are all the GO-TO's I see in reflector written by humans or compilers?
I think the following excerpt from the Wikipedia Article on Goto is particularly relevant here:
"Probably the most famous criticism of GOTO is a 1968 letter by Edsger Dijkstra called Go To Statement Considered Harmful. In that letter Dijkstra argued that unrestricted GOTO statements should be abolished from higher-level languages because they complicated the task of analyzing and verifying the correctness of programs (particularly those involving loops). An alternative viewpoint is presented in Donald Knuth's Structured Programming with go to Statements which analyzes many common programming tasks and finds that in some of them GOTO is the optimal language construct to use."
So, on the one hand we have Edsger Dijkstra (an incredibly talented computer scientist) arguing against the use of the GOTO statement, and specifically arguing against the excessive use of the GOTO statement on the grounds that it is a much less structured way of writing code.
On the other hand, we have Donald Knuth (another incredibly talented computer scientist) arguing that using GOTO, especially using it judiciously can actually be the "best" and most optimal construct for a given piece of program code.
Ultimately, IMHO, I believe both men are correct. Dijkstra is correct in that overuse of the GOTO statement certainly makes a piece of code less readable and less structured, and this is certainly true when viewing computer programming from a purely theoretical perspective.
However, Knuth is also correct as, in the "real world", where one must take a pragmatic approach, the GOTO statement when used wisely can indeed be the best choice of language construct to use.
The above isn't really correct - it was a polemical device used by Dijkstra at a time when gotos were about the only flow control structure in use. In fact, several people have produced rebuttals, including Knuth's classic "Structured Programming Using Goto" paper (title from memory). And there are some situations (error handling, state machines) where gotos can produce clearer code (IMHO), than the "structured" alternatives.
These goto's are very often generated by the compiler, especially inside enumerators.
The compiler always knows what she's doing.
If you find yourself in the need to use goto, you should make sure it is the only option. Most often you'll find there's a better solution.
Other than that, there are very few instances the use of goto can be justified, such as when using nested loops. Again, there are other options in this case still. You could break out the inner loop in a function and use a return statement instead. You need to look closely if the additional method call is really too costly.
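As a sketch of the "move the loops into a function and use return" option, the hypothetical helper below searches a jagged array and leaves both loops with a single return. The names and the sample grid are assumptions made for the illustration.

```csharp
using System;

public static class NestedLoopDemo
{
    // Returns { row, column } of the first cell equal to target, or null if there is none.
    public static int[] FindFirst(int[][] grid, int target)
    {
        for (int row = 0; row < grid.Length; row++)
        {
            for (int col = 0; col < grid[row].Length; col++)
            {
                if (grid[row][col] == target)
                {
                    return new[] { row, col }; // leaves both loops at once, no goto or flag needed
                }
            }
        }
        return null;
    }

    public static void Main()
    {
        int[][] grid =
        {
            new[] { 1, 2, 3 },
            new[] { 4, 5, 6 },
            new[] { 7, 8, 9 }
        };

        int[] hit = FindFirst(grid, 5);
        Console.WriteLine(hit == null ? "not found" : hit[0] + "," + hit[1]); // 1,1
    }
}
```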
In response to your edit:
No, not all gotos are compiler generated, but a lot of them result from compiler generated state machines (enumerators), switch case statements or optimized if else structures. There are only a few instances you'll be able to judge whether it was the compiler or the original developer. You can get a good hint by looking at the function/class name, a compiler will generate "forbidden" names to avoid name clashes with your code. If everything looks normal and the code has not been optimized or obfuscated the use of goto is probably intended.
Keep in mind that the code you are seeing in Reflector is a disassembly -- Reflector is looking at the compiled byte codes and trying to piece together the original source code.
With that, you must remember that rules against gotos apply to high-level code. All the constructs that are used to replace gotos (for, while, break, switch etc) all compile down to code using JMPs.
So, Reflector looks at code much like this:
A:
if (!(a > b))
goto B;
DoStuff();
goto A;
B: ...
And must realize that it was actually coded as:
while (a > b)
DoStuff();
Sometimes the code being read is too complicated for it to recognize the pattern.
Go To statement itself is not harmful, it is even pretty useful sometimes. Harmful are users who tend to put it in inappropriate places in their code.
When compiled down to assembly code, all control structures are converted to (un)conditional jumps. However, the optimizer may be too powerful, and when the disassembler cannot identify which control structure a jump pattern corresponds to, the always-correct statement, i.e. goto label; will be emitted.
This has nothing to do with the harm(ful|less)ness of goto.
What about a double loop, or many nested loops, that you have to break out of? For example:
foreach (KeyValuePair<DateTime, ChangedValues> changedValForDate in _changedValForDates)
{
foreach (KeyValuePair<string, int> TypVal in changedValForDate.Value.TypeVales)
{
RefreshProgress("Daten werden geändert...", count++, false);
if (IsProgressCanceled)
{
goto TheEnd; //I like goto :)
}
}
}
TheEnd:
Without goto, the same thing has to be done with break:
foreach(KeyValuePair<DateTime, ChangedValues> changedValForDate in _changedValForDates)
{
foreach (KeyValuePair<string, int> TypVal in changedValForDate.Value.TypeVales)
{
RefreshProgress("Daten werden geändert...", count++, false);
if (IsProgressCanceled)
{
break; //I miss goto :|
}
}
if (IsProgressCanceled)
{
break; //I really miss goto now :|
}//waaaAA !! so many brakets, I hate'm
}
The general rule is that you don't need to use goto. As with any rule there are of course exceptions, but as with any exceptions they are few.
The goto command is like a drug. If it's used in limited amounts only in special situations, it's good. If you use too much all the time, it will ruin your life.
When you are looking at the code using Reflector, you are not seeing the actual code. You are seeing code that is recreated from what the compiler produced from the original code. When you see a goto in the recreated code, it's not certain that there was a goto in the original code. There might be a more structured command to control the flow, like a break or a continue, which has been implemented by the compiler in the same way as a goto, so that Reflector can't tell the difference.
Goto is considered harmful for humans to use, but for computers it's okay, because no matter how madly we humans use goto, the compiler always knows how to read the code.
Believe me...
Reading others' code with gotos in it is HARD. Reading your own code with gotos in it is HARDER.
That is why you see it used in low level (machine languages) and not in high level (human languages e.g. C#,Python...) ;)
"C provides the infinitely-abusable goto statement, and labels to branch to. Formally, the goto is never necessary, and in practice it is almost always easy to write code without it. We have not used goto in this book."
-- K&R (2nd Ed.) : Page 65
I sometimes use goto when I want to perform a termination action:
static void DoAction(params int[] args)
{
foreach (int arg in args)
{
Console.WriteLine(arg);
if (arg == 93) goto exit;
}
//edit:
if (args.Length > 3) goto exit;
//Do another gazillion actions you might wanna skip.
//etc.
//etc.
exit:
Console.Write("Delete resource or whatever");
}
So instead of hitting return, I send it to the last line that performs another final action I can refer to from various places in the snippet instead of just terminating.
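For comparison, the same "always run a final action" shape can be sketched with try/finally instead of a label. This is an alternative, not the author's code: the Log list replaces the Console calls so the result can be inspected, and unlike goto exit, the finally block also runs when an exception is thrown.

```csharp
using System;
using System.Collections.Generic;

public static class CleanupDemo
{
    // Records what happened; stands in for the Console output of the original snippet.
    public static readonly List<string> Log = new List<string>();

    public static void DoAction(params int[] args)
    {
        try
        {
            foreach (int arg in args)
            {
                Log.Add(arg.ToString());
                if (arg == 93) return;      // early exit, finally still runs
            }

            if (args.Length > 3) return;    // another early exit

            Log.Add("more work");           // the actions that may be skipped
        }
        finally
        {
            Log.Add("cleanup");             // the termination action
        }
    }

    public static void Main()
    {
        DoAction(1, 93, 2);
        Console.WriteLine(string.Join(", ", Log.ToArray())); // 1, 93, cleanup
    }
}
```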
In decompiled code, virtually all gotos that you see will be synthetic. Don't worry about them; they're an artifact of how the code is represented at the low level.
As to valid reasons for putting them in your own code? The main one I can think of is where the language you are using does not provide a control construct suitable for the problem you are tackling; languages which make it easy to make custom control flow systems typically don't have goto at all. It's also always possible to avoid using them at all, but rearranging arbitrarily complex code into a while loop and lots of conditionals with a whole battery of control variables... that can actually make the code even more obscure (and slower too; compilers usually aren't smart enough to pick apart such complexity). The main goal of programming should be to produce a description of a program that is both clear to the computer and to the people reading it.
Whether it's harmful or not is a matter of personal likes and dislikes. I personally don't like them, and find them very unproductive as they undermine the maintainability of the code.
Now, how gotos affect our reading of the code is one thing; how the jitter behaves when it encounters one is another. From Eric Lippert's blog, I'd like to quote:
We first run a pass to transform loops into gotos and labels.
So, in fact, the compiler transforms pretty much every flow control structure into a goto/label pattern while emitting IL. When Reflector reads the IL of the assembly, it recognizes the pattern and transforms it back into the appropriate flow control structure.
In some cases, when the emitted code is too complicated for Reflector to understand, it just shows you C# code that uses labels and gotos, equivalent to the IL it's reading. This is the case, for example, when implementing IEnumerable<T> methods with yield return and yield break statements. Those kinds of methods get transformed into their own classes implementing the IEnumerable<T> interface using an underlying state machine. I believe you'll find lots of these cases in the BCL.
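To give a feel for that transformation, here is an iterator method next to a hand-written approximation of the kind of state machine it becomes. The hand-written class is an illustration only: the code the compiler really generates differs in naming and detail, and CountToEnumerator is a name invented here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class IteratorDemo
{
    // Sugared form: the compiler rewrites this method into a hidden class.
    public static IEnumerable<int> CountTo(int max)
    {
        for (int i = 1; i <= max; i++)
        {
            yield return i;
        }
    }

    public static void Main()
    {
        foreach (int n in CountTo(3)) Console.Write(n + " ");   // 1 2 3
        Console.WriteLine();

        using (var e = new CountToEnumerator(3))
        {
            while (e.MoveNext()) Console.Write(e.Current + " "); // 1 2 3
        }
        Console.WriteLine();
    }
}

// Hand-written approximation of a compiler-generated iterator class.
public sealed class CountToEnumerator : IEnumerator<int>
{
    private readonly int _max;
    private int _state;    // 0 = not started, 1 = running, 2 = finished
    private int _current;

    public CountToEnumerator(int max) { _max = max; }

    public int Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext()
    {
        // Each call resumes where the previous one stopped, based on the stored state.
        switch (_state)
        {
            case 0:
                _current = 1;
                _state = 1;
                break;
            case 1:
                _current++;
                break;
            default:
                return false;
        }

        if (_current > _max)
        {
            _state = 2;
            return false;
        }
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

A decompiler looking at a class like this sees a switch over a state field and jumps between cases, which is why such methods tend to show up as gotos.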
GOTO can be useful, if it's not overused as stated above. Microsoft even uses it in several instances within the .NET Framework itself.

C# foreach vs functional each [closed]

Which one of these do you prefer?
foreach(var zombie in zombies)
{
zombie.ShuffleTowardsSurvivors();
zombie.EatNearbyBrains();
}
or
zombies.Each(zombie => {
zombie.ShuffleTowardsSurvivors();
zombie.EatNearbyBrains();
});
The first. It's part of the language for a reason.
Personally, I'd only use the second, functional approach to flow control if there is a good reason to do so, such as using Parallel.ForEach in .NET 4. It has many disadvantages, including:
It's slower. It introduces a delegate invocation for each element, just as if you had written foreach (..) { myDelegate(); }
It's non-standard, so most developers will find it more difficult to understand.
If you close over any locals, you're going to force the compiler to make a closure. This can lead to strange issues if there's threading involved, plus adds completely unnecessary bloat to the assembly.
I see no reason to write your own syntax for a flow control construct that already exists in the language.
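The Each method in the question is not part of the BCL, so the sketch below assumes a typical hand-rolled extension method. It shows where the per-element delegate invocation and the closure mentioned above come from; all names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Hypothetical Each extension; an assumption about how such a helper is usually written.
    public static void Each<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");

        foreach (T item in source)
        {
            action(item); // one delegate invocation per element
        }
    }
}

public static class EachDemo
{
    public static int SumWithEach(IEnumerable<int> numbers)
    {
        int total = 0;                       // captured by the lambda, so the compiler builds a closure
        numbers.Each(n => { total += n; });
        return total;
    }

    public static int SumWithForeach(IEnumerable<int> numbers)
    {
        int total = 0;                       // plain local, no closure needed
        foreach (int n in numbers)
        {
            total += n;
        }
        return total;
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4 };
        Console.WriteLine(SumWithEach(numbers));    // 10
        Console.WriteLine(SumWithForeach(numbers)); // 10
    }
}
```

One practical difference: inside the lambda there is no break, and return only ends the current element's callback, so it acts like continue.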
Here you're doing some very imperative things like writing a statement rather than an expression (as presumably the Each method returns no value) and mutating state (which one can only assume the methods do, as they also appear to return no value) yet you're trying to pass them off as 'functional programming' by passing a collection of statements as a delegate. This code could barely be further from the ideals and idioms of functional programming, so why try to disguise it as such?
As much as I like multi-paradigm languages such as C#, I think they are easiest to understand and maintain when paradigms are mixed at a higher level (e.g. an entire method written in either a functional or an imperative style) rather than when multiple paradigms are mixed within a single statement or expression.
If you're writing imperative code just be honest about it and use a loop. It's nothing to be ashamed of. Imperative code is not an inherently bad thing.
Second form.
In my opinion, the fewer language constructs and keywords you have to use, the better. C# has enough extraneous crud in it as it is.
Generally the less you have to type, the better. Seriously, how could you not want to use "var" in situations like this? Surely if being explicit was your only goal, you'd still be using hungarian notation... you have an IDE that gives you type information whenever you hover over... or of course Ctrl+Q if you're using Resharper...
@T.E.D.: The performance implications of a delegate invocation are a secondary concern. If you're doing this a thousand times, sure, run dotTrace and see whether it's acceptable.
@Reed Copsey: re non-standard, if a developer can't work out what ".Each" is doing then you've got more problems, heh. Hacking the language to make it nicer is one of the great joys of programming.
The lambda version is actually not slower. I just did a quick test and the delegate version is about 30% faster.
Here is the codez:
class Blah {
public void DoStuff() {
}
}
List<Blah> blahs = new List<Blah>();
DateTime start = DateTime.Now;
for(int i = 0; i < 30000000; i++) {
blahs.Add(new Blah());
}
TimeSpan elapsed = (DateTime.Now - start);
Console.WriteLine(string.Format(System.Globalization.CultureInfo.CurrentCulture, "Allocation - {0:00}:{1:00}:{2:00}.{3:000}",
elapsed.Hours,
elapsed.Minutes,
elapsed.Seconds,
elapsed.Milliseconds));
start = DateTime.Now;
foreach(var bl in blahs) {
bl.DoStuff();
}
elapsed = (DateTime.Now - start);
Console.WriteLine(string.Format(System.Globalization.CultureInfo.CurrentCulture, "foreach - {0:00}:{1:00}:{2:00}.{3:000}",
elapsed.Hours,
elapsed.Minutes,
elapsed.Seconds,
elapsed.Milliseconds));
start = DateTime.Now;
blahs.ForEach(bl=>bl.DoStuff());
elapsed = (DateTime.Now - start);
Console.WriteLine(string.Format(System.Globalization.CultureInfo.CurrentCulture, "lambda - {0:00}:{1:00}:{2:00}.{3:000}",
elapsed.Hours,
elapsed.Minutes,
elapsed.Seconds,
elapsed.Milliseconds));
OK, So I've run more tests and here are the results.
The order of execution (foreach then lambda, or lambda then foreach) didn't make much difference; the lambda version was still faster:
foreach - 00:00:00.561
lambda - 00:00:00.389
lambda - 00:00:00.317
foreach - 00:00:00.337
The difference in performance is a lot less for arrays of classes. Here are the numbers for Blah[30000000]:
lambda - 00:00:00.317
foreach - 00:00:00.337
Here is the same test but Blah being a struct:
Blah[] version
lambda - 00:00:00.676
foreach - 00:00:00.437
List version:
lambda - 00:00:00.461
foreach - 00:00:00.391
Optimized build, Blah is a struct using an array.
lambda - 00:00:00.426
foreach - 00:00:00.079
Conclusion: There is no blanket answer for performance of foreach vs lambda. The answer is It depends. Here is a more scientific test for List<T>. As far as I can tell it's pretty damn efficient. If you are really concerned with performance use for(int i... loop. For iterating over a collection of a thousand customer records (example) it really doesn't matter all that much.
As far as deciding between which version to use I would put potential performance hit for lambda version way at the bottom.
Conclusion #2: for T[] (where T is a value type), the foreach loop is about 5 times faster in this test in an optimized build. That's the only significant difference between a Debug and a Release build. So there you go: for arrays of value types use foreach; for everything else, it doesn't matter.
This question contains some useful discussion, as well as a link to an MSDN blog post, on the philosophical aspects of the topic.
I think extension methods are cool, but I think break and edit-and-continue are cooler.
I'd think the second form would be tougher to optimize, as there's no way for the compiler to unroll the loop any differently for this one call than it does for anybody else's call to the Each method.
Since it was asked, I'll elaborate. The method's implementation is quite liable to be compiled separately from the code that invokes it. This means that the compiler does not know exactly how many loops it is going to have to perform.
If you use the "foreach" form then that information may be available to the compiler when it is creating the code for the loop (it also may not be available, in which case there is no difference).
For example, if the compiler happens to know (from previous code in the same file) that the list has exactly 20 items in it, it can replace the entire loop with 20 references.
However, when the compiler creates code for the "Each" method off in its source file, it has no idea how big the caller's list is going to be. It has to support any size. The best it can do is try to find some kind of optimum unrolling for its CPU, and add extra code to loop through that and do a proper loop if it is too small for the unrolling. For a typical small loop this might even end up being slower. Of course for small loops you don't care as much....unless they happen to be inside a big loop.
As another poster mentioned, this is (and should be) a secondary concern. The important thing is which is easier to read and/or maintain, but I don't see a huge difference there.
I don't prefer either, because of what I consider to be an unneeded use of 'var'. I would write it as:
foreach(Zombie zombie in zombies){
}
But as to the Functional or foreach, for me I most definitely prefer foreach, because there doesn't seem to be a good reason for the latter.

Ab-using languages

Some time ago I had to address a certain C# design problem when I was implementing a JavaScript code-generation framework. One of the solutions I came up with was using the “using” keyword in a totally different (hackish, if you please) way. I used it as syntactic sugar (well, originally it is that anyway) for building a hierarchical code structure. Something that looked like this:
CodeBuilder cb = new CodeBuilder();
using(cb.Function("foo"))
{
// Generate some function code
cb.Add(someStatement);
cb.Add(someOtherStatement);
using(cb.While(someCondition))
{
cb.Add(someLoopStatement);
// Generate some more code
}
}
It works because the Function and While methods return an IDisposable object that, upon disposal, tells the builder to close the current scope. Such a thing can be helpful for any tree-like structure that needs to be hard-coded.
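A minimal sketch of how such a builder could be put together is shown below. It is not the original framework: statements and conditions are plain strings here, the emitted text is only JavaScript-like, and the Scope class and BuilderDemo are names invented for the illustration.

```csharp
using System;
using System.Text;

public sealed class CodeBuilder
{
    private readonly StringBuilder _output = new StringBuilder();
    private int _depth;

    public IDisposable Function(string name)
    {
        return Open("function " + name + "() {");
    }

    public IDisposable While(string condition)
    {
        return Open("while (" + condition + ") {");
    }

    public void Add(string statement)
    {
        AppendLine(statement);
    }

    public override string ToString()
    {
        return _output.ToString();
    }

    // Writes the opening line, indents, and hands back a token that closes the scope.
    private IDisposable Open(string header)
    {
        AppendLine(header);
        _depth++;
        return new Scope(this);
    }

    private void Close()
    {
        _depth--;
        AppendLine("}");
    }

    private void AppendLine(string text)
    {
        _output.Append(new string(' ', _depth * 2)).Append(text).Append('\n');
    }

    // Disposing the token is what "closes the current scope".
    private sealed class Scope : IDisposable
    {
        private readonly CodeBuilder _owner;
        private bool _closed;

        public Scope(CodeBuilder owner) { _owner = owner; }

        public void Dispose()
        {
            if (_closed) return;   // a second Dispose call is harmless
            _closed = true;
            _owner.Close();
        }
    }
}

public static class BuilderDemo
{
    public static string Build()
    {
        var cb = new CodeBuilder();
        using (cb.Function("foo"))
        {
            cb.Add("var i = 0;");
            using (cb.While("i < 3"))
            {
                cb.Add("i++;");
            }
        }
        return cb.ToString();
    }

    public static void Main()
    {
        Console.Write(Build());
    }
}
```

Because using guarantees Dispose even on an exception, every opened scope gets its closing brace, which is the property the trick relies on.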
Do you think such “hacks” are justified? Because you can say that in C++, for example, many of the features such as templates and operator overloading get over-abused and this behavior is encouraged by many (look at boost for example). On the other side, you can say that many modern languages discourage such abuse and give you specific, much more restricted features.
My example is, of course, somewhat esoteric, but real. So what do you think about the specific hack and of the whole issue? Have you encountered similar dilemmas? How much abuse can you tolerate?
I think this is something that has blown over from languages like Ruby that have much more extensive mechanisms to let you create languages within your language (google for "dsl" or "domain specific languages" if you want to know more). C# is less flexible in this respect.
I think creating DSLs in this way is a good thing. It makes for more readable code. Using blocks can be a useful part of a DSL in C#. In this case I think there are better alternatives. The use of using in this case strays a bit too far from its original purpose. This can confuse the reader. I like Anton Gogolev's solution better, for example.
Offtopic, but just take a look at how pretty this becomes with lambdas:
var codeBuilder = new CodeBuilder();
codeBuilder.DefineFunction("Foo", x =>
{
codeBuilder.While(condition, y =>
{
});
});
It would be better if the disposable object returned from cb.Function(name) were the object to which the statements are added. It is fine for this function builder to pass the calls through internally to private/internal functions on the CodeBuilder, as long as the sequence is clear to public consumers.
This works so long as the Dispose implementation makes the following code cause a runtime error:
CodeBuilder cb = new CodeBuilder();
var function = cb.Function("foo");
using(function)
{
// Generate some function code
function.Add(someStatement);
}
function.Add(something); // this should throw
Then the behaviour is intuitive and relatively reasonable, and the correct usage (below) is encouraged and prevents this from happening:
CodeBuilder cb = new CodeBuilder();
using(var function = cb.Function("foo"))
{
// Generate some function code
function.Add(someStatement);
}
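A rough sketch of what such a self-guarding scope object could look like follows. The names ScopedCodeBuilder and FunctionScope, and the string-based Add, are made up for illustration; the only point is that Add throws once the scope has been disposed:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; the original CodeBuilder API is not shown.
public sealed class FunctionScope : IDisposable
{
    private readonly List<string> _lines;
    private bool _disposed;

    internal FunctionScope(List<string> lines, string name)
    {
        _lines = lines;
        _lines.Add("function " + name + "() {");
    }

    public void Add(string statement)
    {
        // Adding to a closed scope is a programming error, so fail loudly.
        if (_disposed)
            throw new ObjectDisposedException("FunctionScope");
        _lines.Add("  " + statement);
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _lines.Add("}"); // closing the scope emits the closing brace
    }
}

public sealed class ScopedCodeBuilder
{
    private readonly List<string> _lines = new List<string>();

    public FunctionScope Function(string name)
    {
        return new FunctionScope(_lines, name);
    }

    public override string ToString()
    {
        return string.Join("\n", _lines);
    }
}
```

When the variable is declared inside the using statement itself, it is out of scope after the block, so the late Add call cannot even be written.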
I have to ask why you are using your own classes rather than the provided CodeDomProvider implementations, though. (There are good reasons for this, notably that the current implementation lacks many of the C# 3.0 features, but since you don't mention it yourself...)
Edit: I would second Anton's suggestion to use lambdas. The readability is much improved (and you have the option of allowing Expression Trees).
If you go by the strictest definitions of IDisposable then this is an abuse. It's meant to be used as a method for releasing native resources in a deterministic fashion by a managed object.
The use of IDisposable has evolved to essentially be used by "any object which should have a deterministic lifetime". I'm not saying this is right or wrong, but that's how many APIs and users are choosing to use IDisposable. Given that definition, it's not an abuse.
I wouldn't consider it terribly bad abuse, but I also wouldn't consider it good form because of the cognitive wall you're building for your maintenance developers. The using statement implies a certain class of lifetime management. This is fine in its usual uses and in slightly customized ones (like @heeen's reference to an RAII analogue), but those situations still keep the spirit of the using statement intact.
In your particular case, I might argue that a more functional approach like @Anton Gogolev's would be more in the spirit of the language as well as maintainable.
As to your primary question, I think each such hack must ultimately stand on its own merits as the "best" solution for a particular language in a particular situation. The definition of best is subjective, of course, but there are definitely times (especially when the external constraints of budgets and schedules are thrown into the mix) where a slightly more hackish approach is the only reasonable answer.
I often "abuse" using blocks. I think they provide a great way of defining scope. I have a whole series of objects that I use for capturing and restoring state (e.g. of combo boxes or the mouse pointer) during operations that may change the state. I also use them for creating and dropping database connections.
E.g.:
using(_cursorStack.ChangeCursor(System.Windows.Forms.Cursors.WaitCursor))
{
...
}
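The _cursorStack class is not shown in this answer, so the following is only a guess at the general shape of such a capture-and-restore helper. RestoreOnDispose and CursorStack are hypothetical names, and a plain string stands in for the real Cursor type to keep the sketch free of Windows Forms:

```csharp
using System;

// Generic helper: runs the supplied action at most once, on Dispose.
public sealed class RestoreOnDispose : IDisposable
{
    private Action _restore;

    public RestoreOnDispose(Action restore)
    {
        _restore = restore;
    }

    public void Dispose()
    {
        Action restore = _restore;
        _restore = null; // repeated Dispose calls do nothing
        if (restore != null) restore();
    }
}

// Hypothetical stand-in for the _cursorStack object used above.
public sealed class CursorStack
{
    public string Current { get; private set; }

    public CursorStack(string initial)
    {
        Current = initial;
    }

    public IDisposable ChangeCursor(string cursor)
    {
        string previous = Current; // capture the state to restore later
        Current = cursor;
        return new RestoreOnDispose(() => Current = previous);
    }
}
```

Since using compiles to try/finally, the previous state is restored even when the body throws.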
I wouldn't call it abuse. Looks more like a fancied up RAII technique to me. People have been using these for things like monitors.

C# Extension Methods - How far is too far?

Rails introduced some core extensions to Ruby like 3.days.from_now which returns, as you'd expect, a date three days in the future. With extension methods in C# we can now do something similar:
static class Extensions
{
public static TimeSpan Days(this int i)
{
return new TimeSpan(i, 0, 0, 0, 0);
}
public static DateTime FromNow(this TimeSpan ts)
{
return DateTime.Now.Add(ts);
}
}
class Program
{
static void Main(string[] args)
{
Console.WriteLine(
3.Days().FromNow()
);
}
}
Or how about:
static class Extensions
{
public static IEnumerable<int> To(this int from, int to)
{
return Enumerable.Range(from, to - from + 1);
}
}
class Program
{
static void Main(string[] args)
{
foreach (var i in 10.To(20))
{
Console.WriteLine(i);
}
}
}
Is this fundamentally wrong, or are there times when it is a good idea, like in a framework like Rails?
I like extension methods a lot but I do feel that when they are used outside of LINQ that they improve readability at the expense of maintainability.
Take 3.Days().FromNow() as an example. This is wonderfully expressive and anyone could read this code and tell you exactly what it does. That is a truly beautiful thing. As coders it is our joy to write code that is self-describing and expressive so that it requires almost no comments and is a pleasure to read. This code is paramount in that respect.
However, as coders we are also responsible to posterity, and those who come after us will spend most of their time trying to comprehend how this code works. We must be careful not to be so expressive that debugging our code requires leaping around amongst a myriad of extension methods.
Extension methods veil the "how" to better express the "what". I guess that makes them a double edged sword that is best used (like all things) in moderation.
First, my gut feeling: 3.Minutes.from_now looks totally cool, but does not demonstrate why extension methods are good. This also reflects my general view: cool, but I've never really missed them.
Question: Is 3.Minutes a timespan, or an angle?
Namespaces referenced through a using statement "normally" only affect types, now they suddenly decide what 3.Minutes means.
So the best is to "not let them escape".
All public extension methods in a likely-to-be-referenced namespace end up being "kind of global" - with all the potential problems associated with that. Keep them internal to your assembly, or put them into a separate namespace that is added to each file separately.
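To illustrate the separate-namespace suggestion, here is a small sketch. The namespace and class names are invented for the example; the idea is simply that 3.Minutes() only resolves in code that has explicitly imported the extension namespace:

```csharp
using System;

namespace MyCompany.TimeExtensions // hypothetical opt-in namespace
{
    public static class IntTimeExtensions
    {
        public static TimeSpan Minutes(this int value)
        {
            return TimeSpan.FromMinutes(value);
        }
    }
}

namespace MyCompany.App
{
    // Without this directive, 3.Minutes() below does not compile.
    using MyCompany.TimeExtensions;

    public static class Demo
    {
        public static TimeSpan ThreeMinutes()
        {
            return 3.Minutes();
        }
    }
}
```

Marking the extension class internal instead of public is the other option mentioned above: the methods then stay invisible outside the assembly.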
Personally I like int.To, I am ambivalent about int.Days, and I dislike TimeSpan.FromNow.
I dislike what I see as a bit of a fad for 'fluent' interfaces that let you write pseudo English code but do it by implementing methods with names that can be baffling in isolation.
For example, this doesn't read well to me:
TimeSpan.FromSeconds(4).FromNow()
Clearly, it's a subjective thing.
I agree with siz and lean conservative on this issue. Rails has that sort of stuff baked in, so it's not really that confusing ever. When you write your "days" and "fromnow" methods, there is no guarantee that your code is bug free. Also, you are adding a dependency to your code. If you put your extension methods in their own file, you need that file in every project. If you put them in their own project, you need to reference that project wherever you use them.
All that said, for really simple extension methods (like Jeff's usage of "left" or thatismatt's usage of days.fromnow above) that exist in other frameworks/worlds, I think it's ok. Anyone who is familiar with dates should understand what "3.Days().FromNow()" means.
I'm on the conservative side of the spectrum, at least for the time being, and am against extension methods. It is just syntactic sugar that, to me, is not that important. I think it can also be a nightmare for junior developers if they are new to C#. I'd rather encapsulate the extensions in my own objects or static methods.
If you are going to use them, just please don't overuse them to a point that you are making it convenient for yourself but messing with anyone else who touches your code. :-)
Each language has its own perspective on what a language should be. Rails and Ruby are designed with their own, very distinct opinions. PHP has clearly different opinions, as does C(++/#)...as does Visual Basic (though apparently we don't like their style).
The balance is having many, easily-read, built-in functions vs. the nitty-gritty control over everything. I wouldn't want SO many functions that you have to go to a lookup every time you want to do anything (and there's got to be a performance overhead to a bloated framework), but I personally love Rails, because what it has saves me a lot of time developing.
I guess what I'm saying here is that if you were designing a language, take a stance, go from there, and build in the functions you (or your target developer would) use most often.
My personal preference would be to use them sparingly for now and to wait to see how Microsoft and other big organizations use them. If we start seeing a lot of code, tutorials, and books use code like 3.Days().FromNow(), then it makes sense to use it a lot. If only a small number of people use it, then you run the risk of having your code be overly difficult to maintain because not enough people are familiar with how extensions work.
On a related note, I wonder how the performance compares between a normal for loop and the foreach one? It would seem like the second method would involve a lot of extra work for the computer, but I'm not familiar enough with the concept to know for sure.
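One way to find out is to measure. The foreach version iterates an enumerator object created by Enumerable.Range, while the for loop only increments a local variable, so some extra work is plausible; whether it matters is something only a measurement on your own machine can say. The sketch below is an assumption-laden starting point (the LoopComparison class and its method names are invented here), not a rigorous benchmark:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class RangeExtensions
{
    // Same extension method as in the question.
    public static IEnumerable<int> To(this int from, int to)
    {
        return Enumerable.Range(from, to - from + 1);
    }
}

// Hypothetical helper class for comparing the two loop styles.
public static class LoopComparison
{
    public static long SumWithFor(int first, int last)
    {
        long sum = 0;
        for (int i = first; i <= last; i++) sum += i;
        return sum;
    }

    public static long SumWithForeach(int first, int last)
    {
        long sum = 0;
        foreach (int i in first.To(last)) sum += i;
        return sum;
    }

    // Times a single run; repeat and average for anything serious.
    public static TimeSpan Time(Func<long> work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

For example, compare LoopComparison.Time(() => LoopComparison.SumWithFor(1, 10000000)) with the same call using SumWithForeach, in a release build and after a warm-up run.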
