Short question: is it OK to declare data-object properties as IEnumerable<T>, or should they be arrays instead?
Background:
I've just found a bug in our project which caused a performance issue. The reason was that an IEnumerable was being iterated multiple times. But it is simple only at first sight: I think there is a design flaw that allowed this to happen.
Deeper investigation showed that a method GetAllUsers returned a UsersResponse object, one of whose properties was IEnumerable<T> UsersList. When caching was implemented, the entire UsersResponse object was obviously cached, and it worked fine at the time because GetAllUsers assigned an array to IEnumerable<T> UsersList. Later the implementation of GetAllUsers changed, and for some reason the developer decided that the ToArray() call was redundant. So I think the problem was that the UsersResponse object was not well designed and allowed too much freedom to its factory method. On the other hand, caching an object which contains IEnumerable properties is also useless in principle.
So we return to my question of designing data-objects in general: when you declare one, not knowing whether it will be cached some time in the future or how it will be used beyond your current need, is it OK to declare its properties as IEnumerable, placing the responsibility for careful usage on other developers, or must they be arrays from the start?
What I've searched:
The only suggestion I've found is Jon Wagner's blog post where he recommends "sealing" LINQ chains as soon as they have been built. But this relates more to building the IEnumerable than to storing it in an entity property. Although, in conjunction with the principle of returning as specific a type as possible, it can imply declaring the property as an array.
When I'm thinking about API design I am always trying to be "nice" to the consumer. This means accepting as much as possible for parameters (when possible), and providing as much as possible for return values. If you too buy into this, it means that you should strive for IEnumerable parameters while providing Array (or similar) return values. The result is maximum value for the consumer of your API (even if it is yourself in the end).
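A minimal sketch of that principle, with a hypothetical helper: the parameter is typed as loosely as is useful (IEnumerable<T>), while the return value is a concrete, already-materialized array.

    using System.Collections.Generic;
    using System.Linq;

    public static class Numbers
    {
        // Callers may pass a List<int>, an int[], or a lazy LINQ query...
        public static int[] OnlyPositive(IEnumerable<int> source)
        {
            // ...but the result is materialized exactly once before returning.
            return source.Where(n => n > 0).ToArray();
        }
    }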
I generally favour the array approach unless I'm forced into collections, especially when there are a lot of numerical computations.
Advantages:
Compact storage (which also opens the door to hacky low-level techniques).
The [ ] accessor (not applicable for C#).
Readable code for multi-dimensional arrays.
Last but not least: stick to one and never use both. It is terrible to convert back and forth between arrays and collections; that does no good either.
The most abstract type that fulfills the requirements should be used. If you need to make it impossible to store an "open" query in a DTO, then use ICollection<T> or (if you need indexed access) IList<T> for the property.
Using an array will nail you down to a concrete implementation. This may be a non-issue or it may turn out to be a pain at some later point. The latter rules out array IMHO.
BTW: It is the caller's responsibility to only iterate once over an IEnumerable<T>.
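For the DTO from the question, a hedged sketch of what the ICollection<T>/IList<T> advice looks like (the User shape is assumed): typing the property as IList<T> means a lazy LINQ query simply cannot be assigned to it; the factory method has to materialize first.

    using System.Collections.Generic;
    using System.Linq;

    public class User
    {
        public string Name { get; set; }
    }

    public class UsersResponse
    {
        // An "open" query cannot be stored here; only a materialized list or array fits.
        public IList<User> UsersList { get; set; }
    }

    public static class UserService
    {
        public static UsersResponse GetAllUsers(IEnumerable<User> source)
        {
            return new UsersResponse
            {
                // Assigning source.Where(...) directly would not compile;
                // ToList() or ToArray() is required by the property type.
                UsersList = source.Where(u => u.Name != null).ToList()
            };
        }
    }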
Related
While browsing the .NET Core source, I noticed that even in source form the iterator classes are implemented manually instead of relying on the yield statement and the automatic IEnumerable implementation.
You can see, at this line for example, the declaration and implementation of the Where iterator: https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/Enumerable.cs#L168
I'm assuming that if they went through the trouble of doing this instead of using a simple yield statement, there has to be some benefit, but I can't immediately see what it is. It seems pretty similar to what I remember the compiler doing automatically, from reading Eric Lippert's blog a few years back; and I remember that when I naively reimplemented LINQ with yield statements in its early days to understand it better, the performance profile was similar to that of the .NET version.
It piqued my curiosity, but it's also a genuinely important question, as I'm in the middle of a fairly big in-memory data project, and if I'm missing something obvious that makes this approach better, I would love to know the trade-offs.
Edit: to clarify, I do understand why they can't just yield in the Where method (different enumeration for different container types). What I don't understand is why they implement the iterator itself manually, that is, instead of forking to different iterators, forking to different methods, iterating differently based on type, and yielding to get the auto-implemented state machine rather than the manual "case 1 goto 2, case 2 ..." approach.
One possible reason is that the specialized iterators perform a few optimizations, like combining selectors and predicates and taking advantage of indexed collections.
The usefulness of these is that some sources can be iterated in a more optimal way than what the compiler magic for yield would generate. And by creating these custom iterators, they can pass this extra information about the source to subsequent LINQ operations in a single chain (instead of making that information available only to the first operation in the chain). Thus, all Where and Select operations (that don't have anything else between them) can be executed as one.
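To make the "executed as one" point concrete, here is a greatly simplified sketch. It is not the corefx code, and it still uses yield rather than a hand-written state machine; it only illustrates the idea that the iterator returned by Where can expose its own Select, fusing predicate and selector into a single loop instead of stacking two iterators.

    using System;
    using System.Collections;
    using System.Collections.Generic;

    internal sealed class WhereIterator<TSource> : IEnumerable<TSource>
    {
        private readonly IEnumerable<TSource> _source;
        private readonly Func<TSource, bool> _predicate;

        public WhereIterator(IEnumerable<TSource> source, Func<TSource, bool> predicate)
        {
            _source = source;
            _predicate = predicate;
        }

        // A later Select does not wrap this iterator; it produces a new,
        // combined loop that applies predicate and selector together.
        public IEnumerable<TResult> Select<TResult>(Func<TSource, TResult> selector)
        {
            foreach (TSource item in _source)
                if (_predicate(item))
                    yield return selector(item);
        }

        public IEnumerator<TSource> GetEnumerator()
        {
            foreach (TSource item in _source)
                if (_predicate(item))
                    yield return item;
        }

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }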
Consider for the question this String.Split overload, which takes a StringSplitOptions enum as a parameter.
Isn't it bad that the enum itself is public and accessible to everything that includes the System namespace? I mean, the enum is completely specific to the options of the Split method, yet it's available outside of its scope.
Perhaps there is a better way to model this, like putting the enum inside the String class itself and accessing it as String.SplitOptions, for instance? I very rarely see this (I actually can't remember any such case right now), so I assume it is not preferred for some reason. In general, I think reducing the scope of things is a best practice, because you lower the chance of problems occurring from using a class/member in an incorrect scope, so to speak.
I'm using Split as an example here, but it is quite common in our code base too for an enum to be used only by one method or class. I generally create the enum as a public type in a separate .cs file, like any other class, but I would love to hear other approaches to this 'problem'.
Update:
I just found this article that tackles this exact problem, with a Folder class and a Filter enum, but again it seems to go against what I believe would be more correct in that case (placing the enum inside the class somehow). One of the comments in there from ToddM (which I happen to agree with) states:
...
But, even then, I feel your logic is wrong. Your main complaint against embedding the enum inside of the class is that it will take too long to type. Given how verbose C# tends to be, this is not really a sensible argument. In VS, CTRL+SPACE is your friend.

Logically, I feel placing the enum inside of the class is far more correct. Take your example: what is a MyNameSpace.Filter? Where does it apply? I guess it's a filter for your namespace? It's impossible to tell, especially if your namespace grows to contain dozens of classes. Now consider MyNameSpace.Folder.Filter -- it is, in my mind, far more intuitive that Filter applies in some way, shape, or form to the Folder class. Indeed, another class can be added to the namespace with its own concept of filter, one of whose members may be 'File'. Just because you've introduced a new class into the namespace doesn't give you the right to pollute that namespace with various 'helper' types. If you are developing as part of a large development team, your style is, well, rude.
...
It's an interesting idea to nest the enum in order to suggest that it has a reduced scope, or to give it better semantics. I have used this idea before in order to have both error codes and warning codes in a post-compiler I developed. This way, I could use the same enum name Code nested either in the Error class or the Warning class.
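A hedged sketch of that error/warning idea, with hypothetical members: both classes can nest an enum with the same short name Code, because each one is qualified by its owning class.

    public class Error
    {
        public enum Code
        {
            MissingSemicolon,
            UnknownIdentifier
        }

        public Code ErrorCode { get; set; }
        public string Message { get; set; }
    }

    public class Warning
    {
        public enum Code
        {
            UnusedVariable,
            ImplicitConversion
        }

        public Code WarningCode { get; set; }
        public string Message { get; set; }
    }

    // Usage: Error.Code.MissingSemicolon vs. Warning.Code.UnusedVariable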
On the other hand, public nested types are generally discouraged. They can be confusing to clients who have to qualify them with the outer class name. Look at the related guidelines on MSDN. Some that are relevant:
DO NOT use public nested types as a logical grouping construct; use namespaces for this.
AVOID publicly exposed nested types. The only exception to this is if variables of the nested type need to be declared only in rare scenarios such as subclassing or other advanced customization scenarios.
DO NOT use nested types if the type is likely to be referenced outside of the containing type.
For example, an enum passed to a method defined on a class should not be defined as a nested type in the class.
I believe those guidelines were followed when developing the StringSplitOptions enum, and most of the others in the BCL.
String.Split() is public, so StringSplitOptions has to be public too. Both String and StringSplitOptions exist in the System namespace. Both have public scope. Neither is "available outside of [the other's] scope".
I think one of the reasons is that it would make every call using a nested enum longer (the name of the class becomes a mandatory prefix).
I personally wouldn't appreciate having to type ResultSetTransformer.ResultSetTransformerOptions every time I have to use this enum; it would make my lines horribly long.
But as others pointed out, I don't think it's standard in the framework to embed enums in classes at all, possibly for this reason.
For the sake of making my code generic, I have started to use reflection (to obtain some properties in DTO objects and to set them).
Can the usage of reflection to obtain properties and set them affect performance that badly, compared to hard-coded setters?
Yes, there is a cost involved in using Reflection.
However, the actual impact on overall application performance varies. One rule of thumb is to never use Reflection in code that gets executed many times, such as loops: the large per-call overhead is then paid on every iteration and quickly comes to dominate the running time.
In many cases you can write generic code using delegates instead of Reflection, as described in this blog post.
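For illustration, here is a minimal sketch of the delegate idea (not necessarily the linked post's exact approach, and the UserDto name in the usage comment is hypothetical): the reflection cost is paid once when the delegate is created, not on every call.

    using System;
    using System.Reflection;

    public static class SetterFactory
    {
        public static Action<TTarget, TValue> CreateSetter<TTarget, TValue>(string propertyName)
        {
            PropertyInfo property = typeof(TTarget).GetProperty(propertyName);
            MethodInfo setMethod = property.GetSetMethod();

            // CreateDelegate binds the instance setter to a strongly typed,
            // open-instance delegate (first parameter = the target object).
            return (Action<TTarget, TValue>)Delegate.CreateDelegate(
                typeof(Action<TTarget, TValue>), setMethod);
        }
    }

    // Usage (hypothetical DTO):
    //   var setName = SetterFactory.CreateSetter<UserDto, string>("Name");
    //   foreach (var dto in dtos) setName(dto, "value");   // no reflection per call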
Yes, reflection is slow.
You can try to decrease its impact by caching the xxxInfo objects (MethodInfo, PropertyInfo, etc.) you retrieve via reflection per reflected Type, i.e. keeping them in a dictionary. Subsequent lookups in the dictionary are faster than retrieving the information every time.
You can also search here on SO for questions about Reflection performance. For certain edge cases there are pretty performant workarounds, like using CreateDelegate to call methods instead of MethodInfo.Invoke().
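A small sketch of the caching idea (the class is hypothetical, and a production version would likely use ConcurrentDictionary for thread safety): reflect over a type once and reuse the PropertyInfo objects afterwards.

    using System;
    using System.Collections.Generic;
    using System.Reflection;

    public static class PropertyCache
    {
        private static readonly Dictionary<Type, PropertyInfo[]> Cache =
            new Dictionary<Type, PropertyInfo[]>();

        public static PropertyInfo[] GetProperties(Type type)
        {
            PropertyInfo[] properties;
            if (!Cache.TryGetValue(type, out properties))
            {
                properties = type.GetProperties();   // the expensive reflection call
                Cache[type] = properties;
            }
            return properties;
        }
    }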
Aside from the fact that it's slower, having to set properties through reflection is a design issue: you apparently separated concerns or encapsulated properties through object-oriented design, and that design is now preventing you from setting them directly. I would say you should look at your design (there can be edge cases, though) instead of reaching for reflection.
One of the downsides, aside from the performance impact, is that you're using a statically typed language, so the compiler checks your code as it compiles it. Normally, at compile time, you have the certainty that all the properties you're using exist and are spelled correctly. When you start to use reflection, you push this check to runtime, which is a real shame, as you're (in my opinion) giving up one of the biggest benefits of a statically typed language. It will also limit your refactoring opportunities in the (near) future, because you can no longer be sure you've replaced all occurrences of an assignment when, for example, renaming a property.
In what circumstances (usage scenarios) would you choose to write an extension rather than sub-classing an object ?
< full disclosure: I am not an MS employee; I do not know Mitsu Furota personally; I do know the author of the open-source Componax library mentioned here, but I have no business dealings with him whatsoever; I am not creating, or planning to create, any commercial product using extensions. In sum: this post is from pure intellectual curiosity related to my trying to (continually) become aware of "best practices" >
I find the idea of extension methods "cool," and obviously you can do "far-out" things with them, as in the many examples in Mitsu Furota's (MS) blog posts.
A personal friend wrote the open-source Componax library, and there are some remarkable facilities in there; but he is in complete command of his small company, with total control over code guidelines, and every line of code "passes through his hands."
While this is speculation on my part: I think/guess other issues might come into play in a medium-to-large software team situation regarding the use of extensions.
Looking at MS's guidelines, you find:
In general, you will probably be calling extension methods far more often than implementing your own. ... In general, we recommend that you implement extension methods sparingly and only when you have to. Whenever possible, client code that must extend an existing type should do so by creating a new type derived from the existing type. For more information, see Inheritance (C# Programming Guide). ... When the compiler encounters a method invocation, it first looks for a match in the type's instance methods. If no match is found, it will search for any extension methods that are defined for the type, and bind to the first extension method that it finds.
And elsewhere in MS's documentation:
Extension methods present no specific security vulnerabilities. They can never be used to impersonate existing methods on a type, because all name collisions are resolved in favor of the instance or static method defined by the type itself. Extension methods cannot access any private data in the extended class.
Factors that seem obvious to me would include:
I assume you would not write an extension unless you expected it to be used very generally and very frequently. On the other hand: couldn't you say the same thing about sub-classing?
Knowing we can compile them into a separate dll, add the compiled dll, reference it, and then use the extensions is "cool," but does that "balance out" the cost inherent in the compiler first having to check whether instance methods are defined, as described above? Or the cost, in the case of a "name clash," of using static invocation syntax to make sure your extension is invoked rather than the instance definition?
How frequent use of Extensions would affect run-time performance or memory use : I have no idea.
So, I'd appreciate your thoughts, or knowing about how/when you do, or don't do, use Extensions, compared to sub-classing.
thanks, Bill
My greatest usage for them is to extend closed-off 3rd party APIs.
Most of the time, when a software developer is offering an API on Windows these days, they are leaning more and more toward .NET for that extensibility. I like to do this because I prefer to depend on my own methods that I can modify in the future and serve as a global entry point to their API, in the case that they change it.
Previously, when having to do this, and I couldn't inherit the API object because it was sealed or something, I would rely on the Adapter pattern to make my own classes that wrapped up their objects. This is a functional, but rather inelegant solution. Extension methods give you a beautiful way to add more functionality to something that you don't control.
Many other peoples' greatest usage for them is LINQ!
LINQ would not be possible without the extension methods provided to IEnumerable.
The reason why people love them is because they make code more readable.
I have noticed that another MAJOR usage of extension methods (mine included) is to make code more readable and make it appear as if the code to do something belongs where it seems it ought to. It also gets rid of the dreaded "Util" static god-class that I have seen many times over. What looks better: Util.DecimalToFraction(decimal value), or value.ToFraction()? If you're like me, the latter.
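For example, a hedged sketch of the Util-vs-extension comparison (the names and the placeholder fraction math are hypothetical):

    public static class FractionExtensions
    {
        // Hypothetical stand-in for Util.DecimalToFraction; the exact math
        // is not the point, the call-site readability is.
        public static string ToFraction(this decimal value)
        {
            // e.g. 0.75m -> "75/100" (not reduced; just an illustration)
            int denominator = 100;
            int numerator = (int)(value * denominator);
            return numerator + "/" + denominator;
        }
    }

    // Util-style call:      Util.DecimalToFraction(price);
    // Extension-style call: price.ToFraction();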
Finally, there are those who deem the "static method" as EVIL!
Many 'good programmers' will tell you that you should try to avoid static methods, especially those who use extensive unit testing. Static methods are difficult to test in some cases, but they are not evil if used properly. While extension methods ARE static... they don't look or act like it. This allows you to get those static methods out of your classes, and onto the objects that they really should be attached to.
Regarding performance..
Extension methods are no different than calling a static method, passing the object being extended as a parameter... because that is what the compiler turns it into. The great thing about that is that your code looks clean, it does what you want, and the compiler handles the dirty work for you.
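A small sketch of that equivalence, using another hypothetical extension: both calls below compile to the same static method invocation.

    using System;

    public static class DemoExtensions
    {
        // Hypothetical extension, used only to show the lowering.
        public static decimal Half(this decimal value) => value / 2m;
    }

    public static class Program
    {
        public static void Main()
        {
            decimal price = 10m;

            decimal a = price.Half();                   // extension syntax
            decimal b = DemoExtensions.Half(price);     // what the compiler actually emits

            Console.WriteLine(a == b);                  // True
        }
    }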
I use extension methods as a way to improve the functionality for classes without increasing the complexity of the class. You can keep your classes simple, and then add your repetitive work later on as an extension.
The Min() and Max() extension methods are great examples of this. You could just as easily declare a private method to calculate these, but an extension method provides better readability, makes the functionality available to your entire project, and doesn't require making an array any more complex an object.
Taking the sub-classing approach vs. extension methods requires a couple of things to be true:
The type must be extendable (not-sealed)
All places the type is created must support a factory pattern of sorts or the other code will just create the base type.
Adding an extension method requires really nothing other than using a C# 3.0+ compiler.
But most importantly, an inheritance hierarchy should represent an is-a relationship. I don't feel that adding one or two new methods / behaviors to a class truly expresses that type of relationship; it is instead augmenting existing behavior. A wrapper class or extension method fits the scenario much better.
In some cases you can't use a subclass: string for instance is sealed. You can however still add extension methods.
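A minimal sketch (the helper name is hypothetical): string is sealed, so you cannot subclass it, but the extension still reads like an instance member at the call site.

    public static class StringExtensions
    {
        // Hypothetical helper: shorten a string to a maximum length.
        public static string Truncate(this string value, int maxLength)
        {
            if (value == null || value.Length <= maxLength)
                return value;
            return value.Substring(0, maxLength);
        }
    }

    // Usage: "a long message".Truncate(6);   // "a long"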
According to FXCop, List should not be exposed in an API object model. Why is this considered bad practice?
I agree with moose-in-the-jungle here: List<T> is an unconstrained, bloated object that has a lot of "baggage" in it.
Fortunately the solution is simple: expose IList<T> instead.
It exposes a bare-bones interface that has most of List<T>'s methods (with the exception of things like AddRange()), and it doesn't constrain you to the specific List<T> type, which allows your API consumers to use their own custom implementations of IList<T>.
For even more flexibility, consider exposing some collections as IEnumerable<T>, when appropriate.
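A hedged sketch of that advice, with hypothetical names: keep the List<T> as the private backing field and expose only the interface.

    using System.Collections.Generic;

    public class Order
    {
        private readonly List<string> _items = new List<string>();

        // Consumers see IList<string>; the backing type can change later
        // without breaking them.
        public IList<string> Items
        {
            get { return _items; }
        }
    }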
There are two main reasons:
List<T> is a rather bloated type with many members not relevant in many scenarios (is too “busy” for public object models).
The class is unsealed, but not specifically designed to be extended (you cannot override any members)
It's only considered bad practice if you are writing an API that will be used by thousands or millions of developers.
The .NET framework design guidelines are meant for Microsoft's public APIs.
If you have an API that's not being used by a lot of people, you should ignore the warning.
I think you don't want your consumers adding new elements into your return value. An API should be clear and complete, and if it returns an array, it should return exactly that data structure. I don't think it has anything to do with T per se, but rather with returning a List<> instead of an array [] directly.
One reason is that List isn't something you can swap out or mock. Even in less-popular libraries, I've seen versions that exposed a List object as an IList because of this recommendation, and in later releases decided not to store the data in a list at all (perhaps in a database). Because it was an IList, changing the implementation underneath the clients wasn't a breaking change, and everyone kept working.
One of the reasons is that a consumer would be able to change the list without the owner of the list knowing about it, while in some cases the owner must do some work after items are added to or removed from the list. Even if that isn't required now, it can become a requirement in the future. So it is better to add AddXXX / RemoveXXX methods to the owner of the list and expose the list as an IEnumerable, or (which is better in my opinion) expose it as an IList and use ObservableCollection from WindowsBase.
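A minimal sketch of that suggestion (the names are hypothetical): the owner keeps the list private, exposes it read-only, and reacts to changes inside its own AddXxx/RemoveXxx methods.

    using System.Collections.Generic;

    public class UserGroup
    {
        private readonly List<string> _users = new List<string>();

        public IEnumerable<string> Users
        {
            get { return _users; }
        }

        public void AddUser(string name)
        {
            _users.Add(name);
            OnUsersChanged();   // the owner gets a chance to react to the change
        }

        public void RemoveUser(string name)
        {
            _users.Remove(name);
            OnUsersChanged();
        }

        private void OnUsersChanged()
        {
            // e.g. invalidate caches, raise an event, persist, etc.
        }
    }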