When using extremely short-lived objects that I only need to call one method on, I'm inclined to chain the method call directly to new. A very common example of this is something like the following:
string noNewlines = new Regex("\\n+").Replace(" ", oldString);
The point here is that I have no need for the Regex object after I've done the one replacement, and I like to be able to express this as a one-liner. Is there any non-obvious problem with this idiom? Some of my coworkers have expressed discomfort with it, but without anything that seemed to be like a good reason.
(I've marked this as both C# and Java, since the above idiom is common and usable in both languages.)
This particular pattern is fine -- I use it myself on occasion.
But I would not use this pattern as you have in your example. There are two alternate approaches that are better.
Better approach: Use the static method Regex.Replace(string,string,string). There is no reason to obfuscate your meaning with the new-style syntax when a static method is available that does the same thing.
Best approach: If you use the same static (not dynamically-generated) Regex from the same method, and you call this method a lot, you should store the Regex object as a private static field on the class containing the method, since this avoids parsing the expression on each call to the method.
I don't see anything wrong with this; I do this quite frequently myself.
The only exception to the rule might be for debugging purposes, it's sometimes necessary to be able to see the state of the object in the debugger, which can be difficult in a one-liner like this.
If you don't need the object afterwards, I don't see a problem - I do it myself from time to time as well. However, it can be quite hard to spot, so if your coworkers are expressing discomfort, you might need to put it into a variable so there are no hard feelings on the team. Doesn't really hurt you.
You just have to be careful when you're chaining methods of objects that implement IDisposable. Doing a single-line chain doesn't really leave room for calling Dispose or the using {...} block.
For example:
DialogResult result = New SomeCfgDialog(some_data).ShowDialog();
There is no instance variable on which to call Dispose.
Then there is potential to obfuscate intent, hurt rather than improve readability and make it tougher to examine values while debugging. But those are all issues particular to the object and the situation and the number of methods chained. I don't think that there is a single reason to avoid it. Sometimes doing this will make the code more concise and readable and other times it might hurt for some of the reasons mentioned above.
As long as you're sure that the object is never needed again (or you're not creating multiple instances of an identical object), then there's no problem with it.
If the rest of your team isn't comfortable with it, though, you might want to re-think the decision. The team should set the standards and you should follow them. Be consistent. If you want to change the standard, discuss it. If they don't agree, then fall in line.
I think thats ok, and would welcome comments/reasons to the contrary. When the object is not short lived (or uses unmanaged resources - ie COM) then this practice can get you into trouble.
The issue is readability.
Putting the "chained" methods on a separate line seems to be the preferred convention with my team.
string noNewlines = new Regex("\\n+")
.Replace(" ", oldString);
One reason to avoid this style is that your coworkers might want to inspect the object in a debug mode. If you compound the similar instantiation the readability goes down a lot. For example :
String val = new Object1("Hello").doSomething(new Object2("interesting").withThis("input"));
Generally I prefer using a static method for the specific example you have mentioned.
The only potential problem I could see is - if, for some reason, new Regex were NULL because it was not instantiated correctly, you would get a Null Pointer Exception. However, I highly doubt that since Regex is always defined...
If you don't care about the object you invoke the method on, that's a sign that the method should probably be static.
In C#, I'd probably write an extension method to wrap the regex, so that I could write
string noNewlines = oldString.RemoveNewlines();
The extension method would look something like
using System.Text.RegularExpressions;
namespace Extensions
{
static class SystemStringExtensions
{
public static string RemoveNewlines(this string inputString)
{
// replace newline characters with spaces
return Regex.Replace(inputString, "\\n+", " ");
}
}
}
I find this much easier to read than your original example. It's also quite reusable, as stripping newline characters is one of the more common activities.
Related
Basically, say I have a method:
string MyMethod(string someVar);
And I need to use the return value in another method, is it advisable to do:
string myString = AnotherMethod(MyMethod(someString));
rather than:
string anotherString = MyMethod(someString);
string returnValue = AnotherMethod(anotherString);
Personally, I would use the longer version - it is more readable and makes debugging easier.
Calling a method in the parameter list of another method can be confusing to the reader.
There is a minor impact on memory usage with the longer version, as you have one additional variable that needs to be allocated, but this would be minimal.
For testing and debugging purposes (setting breakpoints/inspecting the results from each call separately) the longer version is the one I would prefer.
Regarding speed/efficiency there should be no measurable diference IMHO.
i would also prefer 2nd option.
cause it is less complex to understand and use too.
The longer one is more readable and easy to debug.
It is quite easy to set a breakpoint on the line of
string anotherString = MyMethod(someString);
Also, you can add watch on the varieable "anotherString" as well.
One use for the var type in c# seems to be shortcuts/simplifying/save unnecessary typing. One thing that I've considered is this:
MyApp.Properties.Settings.Default.Value=1;
That's a ton of unnecessary code. An alternative to this is declaring:
using MyApp.Properties;
-- or --
using appsettings = MyAppp.Properties.Settings;
leading to: appsettings.Default.Value=1 or Settings.Default.Value=1
Abit better, but I'd like it shorter still:
var appsettings = MyFirstCSharpApp.Properties.Settings.Default;
appsettings.Value=1;
Finally, it's short enough, and it could be used for other annoyingly long calls too but is this an accepted way of doing it? I'm considering whether the "shortcut var" will always be pointing to the existing instance of whatever I'm making a shortcut too? (obviously not just the settings as in this example)
It's acceptable code in that the compiler will take it and know what to do with it. It's acceptable code logically in that it shortens code later. It's really not any different than actually defining the variable type (int/bool/whatever) rather than saying var. When you're in the studio, putting your mouse of the variable gives you its compiled type, so there shouldn't be any real issue with it. Some might call it lazy, but laziness is the mother of invention. As long as your code doesn't become unreadable, I can't see how it would be much of a problem.
There is nothing wrong with that, as long as the code is clear.
In Fact, var is more and more used exactly for that : shortening the code.
Specially in the case of
MyClass myClass = new MyClass();
Which is very clear enough using
var myClass = new MyClass();
And btw, ReSharper helps you enforce that var is used everywhere it can be !
Seems fine to me, it can enhance readability especially if you're using that long .notated syntax many times.
As a side, if you're using an indexed property or an autogenerated property (which does work on the fly for the GETTER) multiple times, then there can be a performance hit for each access of this property. That's microoptimisation though, so probably shouldn't worry too much about that.
Just be careful to know that the static variables you are referencing do not change from underneath you. For instance, the following would break your "shortcut":
var appsettings = MyFirstCSharpApp.Properties.Settings.Default;
MyFirstCSharpApp.Properties.Settings.Default = new DefaultSettings(); // new reference
appsettings.Value=1;
Of course, I am not suggesting that you would ever write code that does this, but we are talking about global variables here... any code anywhere can change out this reference. Caching the reference in appsettings CAN be dangerous in cases like these... one of the many pitfalls of being coupled to static variables, IMO.
I often find myself using lambdas as some sort of "local functions" to make my life easier with repetetive operations like those:
Func<string, string> GetText = (resource) => this.resourceManager.GetString(resource);
Func<float, object, string> FormatF1 = (f, o) => String.Format("{0:F1} {1}", f, o);
Func<float, object, string> FormatF2 = (f, o) => String.Format("{0:F2} {1}", f, o);
Instead of writing the String.Format-thing over and over, I can happily blow away with FormatF2 e.g. and save myself time and when I need to change something about the formatting, only one place to make edits.
Especially when I need the functionality in the given function exclusively, I'm very reluctant to turn them into a real function. While the lambdas above were relatively small... sometimes I have larger ones like (the following is supposed to add data to a table for print output):
Action<string, string, string> AddSurfaceData = (resource, col, unit) => {
renderTable.Cells[tableRowIndex, 0].Text = "\t" + this.GetText(resource);
renderTable.Cells[tableRowIndex, 1].Text = FormatF2(paraHydReader.GetFloat(paraHydReader.GetOrdinal(col)), "");
renderTable.Cells[tableRowIndex, 1].Style.TextAlignHorz = C1.C1Preview.AlignHorzEnum.Right;
renderTable.Cells[tableRowIndex, 2].Text = " " + this.GetText(unit);
renderTable.Cells[tableRowIndex, 2].Style.TextAlignHorz = C1.C1Preview.AlignHorzEnum.Left;
++tableRowIndex;
};
Again, I need this often and all the benefits of above apply, too. However, as you can see, this one is quite long for a lambda expression.. the question is: When do you draw the line? Is my last lambda too much? What other ways (other than using real functions or trying to stuff the data in containers and loop over them) exist to avoid writing the same code over and over again?
Thanks in advance
Christian
It is something you use potentially many times within a method, and only that inside that method. I like this idea. Do it if it doesn't make your code hard to read. I would say that you should reconsider if you find it difficult to see what is the content of the lambda function vs. what is the real content of the method. In that case it might be cleaner to pull it out in a separate private method.
At the end, this is really a matter of taste...
I agree with awe: for small scale reuse inside a method (or even a class) this is perfect. Like the string.Format examples. I use this quite often. It's basically the same thing as using a local variable for an intermediate value that you use more than once, but then for code.
Your second example seems to be pushing it a bit. Somehow this gives me the feeling a private method AddSurfaceData (possibly static, depending on its use?) would be a better fit. That is of course outside of the context that you have, so use your own good judgement.
A Lambda method is an anonymous method.
This means that you should not give it a name.
If you are doing that, (in your case, you are assigning a name with your reference), it's just another way to declare a function.
C# has already got a way to declare functions, and it's not the lambda way, which was added
uniquely to pass functions via parameters and returns them as return values.
Think, as an example, in javascript:
function f(var1,var2,...,varX)
{
some code
}
or
var f = function() {
some code
}
Different syntax (almost) same thing.
For more information on why it's not the same thing: Javascript: var functionName = function() {} vs function functionName() {}
Another example: in Haskell You can define two functions:
function1 :: Int -> Int
function1 x = x + 2
or
function2 :: Int -> Int
function2 = \x -> x + 2
Same thing (this time I think it's the very same), different syntax. I prefer the first one, it's more clear.
C# 3.5, as Javascript, has got a lot of functional influences. Some of them should it be used wisely, IMHO.
Someone said local lambda functions with assignment in a reference is a good substitute for a method defined within another method, similar to a "let", or a "where" clause in Haskell.
I say "similar" because the twos have very different semantics, for instance, in Haskell I can use function name which is not declared yet and define it later with "where", while in C#, with function/reference assignment I can't do this.
By the way I think it's a good idea, I'm not banning this use of lambda function, I just want to make people think about it.
Every language has got his abstraction mechanism, use it wisely.
I like the idea. I don't see a better way to maintain code locality without violating the DRY principle. And I think it's only harder to read if you're not accustomed to lambdas.
+1 on nikie re DRY being good in general.
I wouldnt use PascalCase naming for them though.
Be careful though - in most cases the stuff you have in there is just an Extract Method in a dress or a potential helper or extension function. e.g., GetText is a Method and FormatF* is probably a helper method...
I have no problem with the long example. I see that you are repackaging compound data very elegantly.
It's the short ones that will drive your colleagues to investigate the advantages of voluntary institutionalization. Please declare some kind of constant to hold your format string and use "that String.Format-thing over and over". As a C# programmer, I know what that does without looking elsewhere for home-spun functions. That way, when I need to know what the formatted string will look like, I can just examine the constant.
I agree with volothamp in general, but in addition ...
Think of the other people that have to maintain your code. I think this is easier to understand than your first example and still offers the maintenance benefits you mention:
String.Format(this.resourceManager.GetString("BasicFormat"), f, o);
String.Format(this.resourceManager.GetString("AdvancedFormat"), f, o);
And your second example appears to be just a different way to declare a function. I don't see any useful benefit over declaring a helper method. And declaring a helper method will be more understandable to the majority of coders.
I personally think its not in good taste to use lambda functions when there is no need for it. I personally wont use a lambda function to replace a simple few lines of procedural code. Lambda functions offer many enhancements, but make the code slightly more complicated to read.
I wouldnt use it to replace string.format.
For quick tasks where I only use an instantiated object once, I am aware that I can do the following:
int FooBarResult = (new Foo()).Bar();
I say this is perfectly acceptable with non-disposable objects and more readable than the alternative:
Foo MyOnceUsedFoo = new Foo();
int FooBarResult = MyOnceUsedFoo.Bar();
Which do you use, and why?
Would you ever use this type of anonymous instantiation in a production app?
Preference: with parenthesis "(new Foo()).Bar();" or without "new Foo().Bar();"?
(Edited to abstract question away from Random class)
Side note regarding random numbers: In fact, no, your specific example (new Random().Next(0,100)) is completely unacceptable. The generated random numbers will be far from uniform.
Other than that, in general, there is not much difference between the two. The compiler will most probably generate the exact same code in either case. You should go with the most readable case (long statements might harm readability; more code will do it too, so you have to make the trade-off in your specific case).
By the way, if you chose to go with the single line case, omit the unnecessary parens (new MyObject().Method() will do).
You might want to consider the implications of using the code in the debugger. The second case will allow you to inspect the object you've created, whereas the first won't. Granted you can always back out to the second case when you're attempting to debug the code.
I've done it both ways and don't really have a preference. I prefer whatever looks more readable, which is highly dependent on the complexity of the class and method being called.
BTW -- you might want to pick a different example. I fear that your point might get lost in discussions over the best way to generate random numbers.
If you are only using the object once, the first way is better all the time.
It is shorter and clearer, because it makes it explicit that you will not use the object later.
It will probably compile to the same CIL anyway, so there's no advantage to the second form.
First statement. It's more readable, has less code and doesn't leave temps around.
The second one is debugging friendly, while the first one isn't. The second wins only because of this.
In fact the first way, creating a temporary, is more readable for two reasons:
1) it's more concise
There's less code to read, there's no unnecessary local variable introduced, and no potential name clash with another local, or shadowing of any variable with the same name in an enclosing scope
2) it communicates something that the second form doesn't, that the object is being used temporarily.
Reading it, I know that that instance is never going to be used again, so in my "mental compiler" that I use to understand the code I'm reading, I don't have to keep a reference to it any more than the code keeps a reference to it.
As Mehrdad notes, though, doing it with a Random class isn't a good idea.
As he also notes, the redundant parentheses make it less concise; unless you're in a dusty corner of a language, assume that competent programmers know the language's operator precedence. In this case, even if I don't know the operator precedence, the alternative parse (calling new on a function's return) is nonsensical, so the "obvious" reading is the correct reading.
int RandomIndex = (new Random()).Next(0,100);
int RandomIndex = new Random().Next(0,100);
Some time ago I had to address a certain C# design problem when I was implementing a JavaScript code-generation framework. One of the solutions I came with was using the “using” keyword in a totally different (hackish, if you please) way. I used it as a syntax sugar (well, originally it is one anyway) for building hierarchical code structure. Something that looked like this:
CodeBuilder cb = new CodeBuilder();
using(cb.Function("foo"))
{
// Generate some function code
cb.Add(someStatement);
cb.Add(someOtherStatement);
using(cb.While(someCondition))
{
cb.Add(someLoopStatement);
// Generate some more code
}
}
It is working because the Function and the While methods return IDisposable object, that, upon dispose, tells the builder to close the current scope. Such thing can be helpful for any tree-like structure that need to be hard-codded.
Do you think such “hacks” are justified? Because you can say that in C++, for example, many of the features such as templates and operator overloading get over-abused and this behavior is encouraged by many (look at boost for example). On the other side, you can say that many modern languages discourage such abuse and give you specific, much more restricted features.
My example is, of course, somewhat esoteric, but real. So what do you think about the specific hack and of the whole issue? Have you encountered similar dilemmas? How much abuse can you tolerate?
I think this is something that has blown over from languages like Ruby that have much more extensive mechanisms to let you create languages within your language (google for "dsl" or "domain specific languages" if you want to know more). C# is less flexible in this respect.
I think creating DSL's in this way is a good thing. It makes for more readable code. Using blocks can be a useful part of a DSL in C#. In this case I think there are better alternatives. The use of using is this case strays a bit too far from its original purpose. This can confuse the reader. I like Anton Gogolev's solution better for example.
Offtopic, but just take a look at how pretty this becomes with lambdas:
var codeBuilder = new CodeBuilder();
codeBuilder.DefineFunction("Foo", x =>
{
codeBuilder.While(condition, y =>
{
}
}
It would be better if the disposable object returned from cb.Function(name) was the object on which the statements should be added. That internally this function builder passed through the calls to private/internal functions on the CodeBuilder is fine, just that to public consumers the sequence is clear.
So long as the Dispose implementation would make the following code cause a runtime error.
CodeBuilder cb = new CodeBuilder();
var f = cb.Function("foo")
using(function)
{
// Generate some function code
f.Add(someStatement);
}
function.Add(something); // this should throw
Then the behaviour is intuitive and relatively reasonable and correct usage (below) encourages and prevents this happening
CodeBuilder cb = new CodeBuilder();
using(var function = cb.Function("foo"))
{
// Generate some function code
function.Add(someStatement);
}
I have to ask why you are using your own classes rather than the provided CodeDomProvider implementations though. (There are good reasons for this, notably that the current implementation lacks many of the c# 3.0 features) but since you don't mention it yourself...
Edit: I would second Anoton's suggest to use lamdas. The readability is much improved (and you have the option of allowing Expression Trees
If you go by the strictest definitions of IDisposable then this is an abuse. It's meant to be used as a method for releasing native resources in a deterministic fashion by a managed object.
The use of IDisposable has evolved to essentially be used by "any object which should have a deterministic lifetime". I'm not saying this is write or wrong but that's how many API's and users are choosing to use IDisposable. Given that definition it's not an abuse.
I wouldn't consider it terribly bad abuse, but I also wouldn't consider it good form because of the cognitive wall you're building for your maintenance developers. The using statement implies a certain class of lifetime management. This is fine in its usual uses and in slightly customized ones (like #heeen's reference to an RAII analogue), but those situations still keep the spirit of the using statement intact.
In your particular case, I might argue that a more functional approach like #Anton Gogolev's would be more in the spirit of the language as well as maintainable.
As to your primary question, I think each such hack must ultimately stand on its own merits as the "best" solution for a particular language in a particular situation. The definition of best is subjective, of course, but there are definitely times (especially when the external constraints of budgets and schedules are thrown into the mix) where a slightly more hackish approach is the only reasonable answer.
I often "abuse" using blocks. I think they provide a great way of defining scope. I have a whole series of objects that I use for capture and restoring state (e.g. of Combo boxes or the mouse pointer) during operations that may change the state. I also use them for creating and dropping database connections.
E.g.:
using(_cursorStack.ChangeCursor(System.Windows.Forms.Cursors.WaitCursor))
{
...
}
I wouldn't call it abuse. Looks more like a fancied up RAII technique to me. People have been using these for things like monitors.