interface compatible with Linq without a chance of multiple execution - c#

I'm writing a lot of computation-heavy code that relies heavily on LINQ: one method does some work, then passes its result to another method that does more work using LINQ, and this happens around 10k times.
To reduce the effort, I created some extension methods that handle the repetitive tasks. You can imagine something along these lines, but more complicated:
public static IEnumerable<int> IsBiggerThan2(this IEnumerable<int> input){
return input.Where(x=> x>2);
}
public static IEnumerable<int> IsMod2(this IEnumerable<int> input){
return input.Where(x=> x%2==0);
}
public static IEnumerable<int> IsMod3(this IEnumerable<int> input){
return input.Where(x=> x%3==0);
}
My problem is that when I use LINQ, the output is an IEnumerable, which can lead to multiple execution. I also don't want to spam .ToList() at the end of every line.
var firstCalculation = someInput.Select(x=> x+1);
var biggerAndMod2 = firstCalculation.IsBiggerThan2().IsMod2();
var biggerAndMod2And3 = firstCalculation.IsBiggerThan2().IsMod2().IsMod3();
var biggerAndMod2Plus1= biggerAndMod2.Select(x=> x+1).IsBiggerThan2();
// and so on
After many lines it becomes quite daunting, and I wonder why there is no interface that shares some characteristics of both List and IEnumerable.
I can pass an IList<int> where an IEnumerable<int> is expected, but not vice versa without converting it to a list. I am looking for a workaround that lets me accept a LINQ result as my input without needing to convert it to a List:
public static IEnumerable<int> IsMod3(this ISomething<int> input){
return input.Where(x=> x%3==0);
}
I tried ICollection, IReadOnlyList and IList, but no luck; all of them require converting to a List.
Update 1:
In describing the problem I oversimplified it. In those extension methods there are cases similar to what I showed, where the input argument is used multiple times. So I either need to call .ToList() on every input variable or call it before every use of the extensions.

The experimental Memoize method from MoreLinq may be what you want.
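As a rough illustration of the idea behind memoizing (a sketch only, not MoreLINQ's actual implementation), the hypothetical `CachedSequence<T>` wrapper and `Cached()` extension below buffer items the first time they are enumerated, so later passes replay the buffer instead of re-running the query. All names here are assumptions made for the example; the wrapper is not thread-safe and keeps every item in memory.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: caches items as they are first enumerated,
// so the underlying source is walked at most once. Not thread-safe.
public sealed class CachedSequence<T> : IEnumerable<T>
{
    private readonly List<T> _cache = new List<T>();
    private IEnumerator<T> _source; // set to null once the source is exhausted

    public CachedSequence(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        _source = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; ; i++)
        {
            if (i < _cache.Count)
            {
                // Already computed by an earlier pass: replay it.
                yield return _cache[i];
            }
            else if (_source != null && _source.MoveNext())
            {
                // First time this position is reached: pull from the source and remember it.
                _cache.Add(_source.Current);
                yield return _cache[i];
            }
            else
            {
                if (_source != null) { _source.Dispose(); _source = null; }
                yield break;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CachedSequenceExtensions
{
    public static CachedSequence<T> Cached<T>(this IEnumerable<T> source)
        => new CachedSequence<T>(source);
}

public static class Demo
{
    // Counts how often the Select lambda runs when the result is consumed twice.
    public static int CountEvaluations()
    {
        int evaluations = 0;
        var firstCalculation = Enumerable.Range(1, 5)
            .Select(x => { evaluations++; return x + 1; })
            .Cached();

        int bigger = firstCalculation.Where(x => x > 2).Count();
        int even = firstCalculation.Where(x => x % 2 == 0).Count();
        return evaluations;
    }

    public static void Main()
    {
        Console.WriteLine(CountEvaluations()); // prints 5; without Cached() it would be 10
    }
}
```

Because the wrapper still implements IEnumerable<T>, the extension methods from the question can accept it unchanged.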

Related

LINQ Concatenation with a single extra element

If we want a single IEnumerable<T> representing the concatenation of two IEnumerable<T>s we can use the LINQ Concat() method.
For Example:
int[] a = new int[] { 1, 2, 3 };
int[] b = new int[] { 4, 5 };
foreach (int i in a.Concat(b))
{
Console.Write(i);
}
of course outputs 12345.
My question is, why is there no overload of Concat() just accepting a single element of type T such that:
int[] a = new int[] { 1, 2, 3 };
foreach (int i in a.Concat(4))
{
Console.Write(i);
}
would compile and produce the output: 1234?
Googling around the issue throws up a couple of SO questions where the accepted answer suggests that the best approach when looking to achieve this is to simply do a.Concat(new int[] {4}). Which is fine(ish) but a little 'unclean' in my opinion because:
Maybe there is a performance hit from declaring a new array (albeit this is presumably going to be negligible pretty much every time)
It just doesn't look as neat, easy to read and natural as a.Concat(4)
Anyone know why such an overload doesn't exist?
Also, assuming my Googling hasn't let me down - there is no such similar LINQ extension method taking a single element of type T.
(I understand it is trivially easy to roll one's own extension method to produce this effect - but doesn't that just make the omission even more odd? I suspect there will be a reason for its omission but can't imagine what it could be.)
UPDATE:
Acknowledging the couple of votes to close this as opinion based - I should clarify that I am NOT seeking people's opinions on whether this would be a good addition to LINQ.
More I am seeking to understand the FACTUAL reasons why it is not ALREADY part of LINQ.
In .NET Framework 4.7.1 they added Prepend and Append methods, which add one element to the beginning and to the end of an enumerable respectively.
usage:
var emptySequence = Enumerable.Empty<long>();
var singleElementSequence = emptySequence.Append(256L);
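Applied to the array from the question, the two methods can be sketched as follows. The class and method names (`AppendPrependDemo`, `Build`) are made up for this example; `Prepend` and `Append` are the real `System.Linq.Enumerable` methods, and both return a new lazy sequence without modifying the source array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AppendPrependDemo
{
    // Builds 0,1,2,3,4 from the question's array without allocating a helper array.
    public static IEnumerable<int> Build()
    {
        int[] a = { 1, 2, 3 };
        return a.Prepend(0).Append(4);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("", Build())); // prints 01234
    }
}
```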
A good reason for inclusion (in one form or another) would be for IEnumerables to be more like functional sequence monads.
But since LINQ did not arrive until .NET Framework 3.5 (with C# 3.0), and is implemented mostly using extension methods, I can imagine that they omitted extension methods working on a single element of T. Still, this is pure speculation on my part.
They did however include generator functions, that are not extension methods. Specifically the following:
Enumerable.Empty
Enumerable.Repeat
Enumerable.Range
You could use these instead of homebrew extension methods. The two use cases you mentioned, can be solved as:
int[] a = new int[] { 1, 2, 3 };
var myPrependedEnumerable = Enumerable.Repeat(0, 1).Concat(a);
var myAppendedEnumerable = a.Concat(Enumerable.Repeat(4, 1));
It might have been nice if an additional overload was included as syntactical sugar.
Enumerable.FromElement(x); // or a better name (see below).
The absence of an explicit Unit function is curious and interesting
In the interesting "More LINQ with System.Interactive" series of blog posts by Bart De Smet, which is illustrated using System.Linq.EnumerableEx, the post More LINQ with System.Interactive – Sequences under construction deals specifically with this question, using the following appropriately named method for constructing a single-element IEnumerable.
public static IEnumerable<TSource> Return<TSource>(TSource value);
This is nothing but the return function (sometimes referred to as unit) used on a monad.
Also interesting is the blog series by Eric Lippert on monads, which features the following quote in part eight:
IEnumerable<int> sequence = Enumerable.Repeat<int>(123, 1);
And frankly, that last one is a bit dodgy. I wish there was a static method on Enumerable specifically for making a one-element sequence.
Furthermore, the F# language provides the seq type:
Sequences are represented by the seq<'T> type, which is an alias for IEnumerable. Therefore, any .NET Framework type that implements System.IEnumerable can be used as a sequence.
It provides an explicit unit function as Seq.singleton.
Concluding
None of this gives us facts that explain why these sequence constructs are not explicitly present in C#. Until someone with knowledge of the design decision process shares that information, it does at least highlight that it would be worth knowing more about.
First - your Googling is fine - there is no such method. I like the idea though. It's a use case where, if you run into it, having it would be great.
I suspect it wasn't included with the LINQ API because the designers didn't see a common enough need for it. That's just my conjecture though.
You're right to say that creating an array with just one element isn't all that intuitive. You can get the feel and performance you're going for with this:
public static class EnumerableExtensions {
public static IEnumerable<T> Concat<T>(this IEnumerable<T> source, T element) {
foreach (var e in source) {
yield return e;
}
yield return element;
}
public static IEnumerable<T> Concat<T>(this T source, IEnumerable<T> element) {
yield return source;
foreach (var e in element) {
yield return e;
}
}
}
class Program
{
static void Main()
{
List<int> ints = new List<int> {1, 2, 3};
var startingInt = 0;
foreach (var i in startingInt.Concat(ints).Concat(4)) {
Console.WriteLine(i);
}
}
}
Output:
0
1
2
3
4
Lazy evaluation
Implemented similarly to the built-in LINQ methods (they actually return an internal iterator, instead of directly yielding)
Argument checking wouldn't hurt it
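One way to combine the points above, sketched under the assumption that you want the null check to fire at call time rather than on first enumeration: split the method into an eager validating wrapper and a private iterator. The names `SingleItemExtensions`, `ConcatItem` and `ConcatItemIterator` are hypothetical and chosen to avoid clashing with the built-in Concat.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleItemExtensions
{
    // Validates eagerly, then hands off to a lazy iterator.
    public static IEnumerable<T> ConcatItem<T>(this IEnumerable<T> source, T element)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return ConcatItemIterator(source, element);
    }

    // Iterator block: nothing here runs until the result is enumerated.
    private static IEnumerable<T> ConcatItemIterator<T>(IEnumerable<T> source, T element)
    {
        foreach (var e in source)
        {
            yield return e;
        }
        yield return element;
    }
}

public static class ConcatItemDemo
{
    public static void Main()
    {
        var ints = new List<int> { 1, 2, 3 };
        Console.WriteLine(string.Join(",", ints.ConcatItem(4))); // prints 1,2,3,4
    }
}
```

If the check lived inside the iterator block itself, the exception would only surface when the caller started enumerating, which can be far from the call that passed the bad argument.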
Philosophical questions, I like that.
First, you can easily create that behaviour with an extension method:
public static IEnumerable<TSource> Concat<TSource>(this IEnumerable<TSource> source, TSource element)
{
return source.Concat(new[]{element});
}
I think the central point is that IEnumerable is an immutable interface and is not meant to be modified on the fly.
This could (I say could because I do not work at Microsoft, so I may be completely wrong) be the reason why the modification side of IEnumerable is not so well developed (in the sense that some handy methods are missing).
If you have to modify the collection, consider using a List or another interface.

Is there any performance difference between compare method and compare class?

Is there any difference in performance between
List<T>.Sort Method (Comparison<T>)
and
List<T>.Sort Method (IComparer<T>)?
Are there any structural (software architectural) benefits?
When do you use the compare method instead of compare class and vice versa?
EDIT:
The List<T>.Sort Method (IComparer<T>) is faster. Thanks Jim Mischel!
The performance difference is around 1% on my PC.
It seems that the compare class is the faster one.
The difference is that the first accepts a method (anonymous or not) and the second accepts an instance of a comparer object. Sometimes it is easier to define complex and customizable comparer classes rather than write everything inside a single function.
I prefer the first for simple sorting in one dimension and the latter for multidimensional sorting in e.g. data grids.
Using a comparer you can have private members which can often help with caching. This is useful in certain scenarios (again, in complex sorting of a large data set displayed in a grid).
As I recall, List.Sort(Comparison<T>) instantiates an IComparer<T> and then calls List.Sort(IComparer<T>).
It looks something like this:
class SortComparer<T>: IComparer<T>
{
private readonly Comparison<T> _compare;
public SortComparer(Comparison<T> comp)
{
_compare = comp;
}
public int Compare(T x, T y)
{
return _compare(x, y);
}
}
public void Sort(Comparison<T> comp)
{
Sort(new SortComparer<T>(comp));
}
So they really end up doing the same thing. When I timed this stuff (back in .NET 3.5), Sort(IComparer<T>) was slightly faster because it didn't have to do the extra dereference on every call. But the difference really wasn't big enough to worry about. This is definitely a case of use whatever works best in your code rather than what performs the fastest.
A little more about it, including information about default IComparer implementations: Of Comparison and IComparer
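For comparison, here is a small sketch of calling both overloads on the same data. `LengthComparer` and `SortDemo` are hypothetical names invented for this example, and the comparer assumes non-null strings; the words have distinct lengths so the result does not depend on sort stability.

```csharp
using System;
using System.Collections.Generic;

// Comparer class: could also hold private state (caches, settings) if needed.
public sealed class LengthComparer : IComparer<string>
{
    public int Compare(string x, string y) => x.Length.CompareTo(y.Length);
}

public static class SortDemo
{
    // Uses List<T>.Sort(Comparison<T>) with a lambda.
    public static List<string> SortWithComparison(IEnumerable<string> words)
    {
        var list = new List<string>(words);
        list.Sort((x, y) => x.Length.CompareTo(y.Length));
        return list;
    }

    // Uses List<T>.Sort(IComparer<T>) with a comparer instance.
    public static List<string> SortWithComparer(IEnumerable<string> words)
    {
        var list = new List<string>(words);
        list.Sort(new LengthComparer());
        return list;
    }

    public static void Main()
    {
        var words = new[] { "pear", "fig", "banana" };
        Console.WriteLine(string.Join(" ", SortWithComparison(words))); // prints fig pear banana
        Console.WriteLine(string.Join(" ", SortWithComparer(words)));   // prints fig pear banana
    }
}
```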

return single instance object as IEnumerable

I have an instance of class foo and I want to return it as an IEnumerable.
Can I do it without creating a new list etc.?
Perhaps something like the following:
IEnumerable<foo>.fromInstance(foo)
Options:
Create an instance of a collection class, like an array or a list. This would be mutable by default, which would be slightly unhelpful if this is a sequence you want to be able to hand out in your API. You could create a ReadOnlyCollection<T> wrapper around such a collection though.
Write your own iterator block as per Botz3000's answer
Use Enumerable.Repeat(item, 1) from LINQ, if you're using .NET 3.5.
The best answer here depends on the usage. If you only need this to call another method which uses a sequence, and you know it won't be modified, I'd probably use an array. For example, in order to call Concat on some other sequence, you might want:
var wholeList = regularList.Concat(new[] { finalValue });
I have confidence that Concat isn't going to mutate the array, and nothing else will ever see the reference to the array itself.
If you need to return the sequence to some other code, and you don't know what it might do with it, I'd probably use Enumerable.Repeat.
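The three options can be put side by side in a short sketch. The `SingleItem` class and its method names are invented for illustration; the point it shows is that the array-backed sequence can be cast back to an array (and therefore mutated), while the `Enumerable.Repeat` and iterator versions cannot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleItem
{
    // Option 1: a one-element array (mutable if someone casts it back).
    public static IEnumerable<T> ViaArray<T>(T item) => new[] { item };

    // Option 3: Enumerable.Repeat with a count of 1.
    public static IEnumerable<T> ViaRepeat<T>(T item) => Enumerable.Repeat(item, 1);

    // Option 2: an iterator block.
    public static IEnumerable<T> ViaIterator<T>(T item)
    {
        yield return item;
    }

    public static void Main()
    {
        Console.WriteLine(ViaArray(42) is int[]);     // prints True
        Console.WriteLine(ViaRepeat(42) is int[]);    // prints False
        Console.WriteLine(ViaIterator(42).Single());  // prints 42
    }
}
```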
you could do this:
public IEnumerable<Foo> GetSingleFooAsEnumerable() {
yield return singleFoo;
}
The best idiomatic way to do this is something like new[] { foo } which just creates a 1-element array of whatever type foo is declared to be.
The one possible downside to this is that arrays aren't immutable, so somebody can cast your IEnumerable<T> to a T[] and change the value in there. This is fairly unlikely, though, so I don't worry about it.
IEnumerable is supposed to be used for something that you can enumerate through, so using it for a single instance seems quite strange. If you really need to, you can get it done like this. There might be a better way, but this should get the job done:
List<foo> myList = new List<foo>();
myList.Add( myInstanceOfFoo );
IEnumerable<foo> myEnumerable = myList.AsEnumerable();
Regardless of how you see this, you are actually trying to make a list of one element, and then it's really no reason to make it a list.

What is the reason string.Join needs to take an array instead of an IEnumerable?

As the title says: Why does string.Join need to take an array instead of an IEnumerable? This keeps annoying me, since I have to add a .ToArray() when I need to create a joined string from the result of a LINQ expression.
My experience tells me that I'm missing something obvious here.
Upgrade to .NET 4.0 and use the overload that accepts an IEnumerable<string>. Otherwise, just accept that it was a long outstanding problem that wasn't addressed until .NET 4.0. You can fix the problem by creating your own extension method too!
public static class StringEnumerableExtensions {
public static string Join(this IEnumerable<string> strings, string separator) {
return String.Join(separator, strings.ToArray());
}
}
Usage:
IEnumerable<string> strings = new[] { "a", "b", "c" }; // sample data
Console.WriteLine(strings.Join(", "));
Overloads of Join that take an IEnumerable<T> argument were introduced in .NET 4 - if you're not using .NET 4 then I'm afraid you're stuck with passing an array or writing your own implementation.
I'm guessing that the reason is simply that it wasn't deemed important enough when the framework was first being designed. IEnumerable<T> became a lot more prominent with the introduction of LINQ.
(Of course, there were no generic types in .NET when it was being designed, but there's no reason why they couldn't have done it with plain non-generic IEnumerable if they'd have thought it worthwhile.)
And there's no reason why you can't roll your own version of Join that takes an IEnumerable<T> if you feel that you need it and you're unable to upgrade to .NET 4.
It doesn't, any more. .NET 4 added some overloads to make this easier to use. In particular, not only do you not need to pass in an array - it doesn't need to be a string sequence either. String.Join(String, IEnumerable<T>) will call ToString on each item in the sequence.
If you're not using .NET 4 but are performing a lot of string-joining operations, you could always write your own methods, of course.
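On .NET 4 or later, the generic overload can be used directly on a LINQ result, with no ToArray() call. The sketch below uses made-up names (`JoinDemo`, `JoinSquares`) and relies on the real `String.Join<T>(String, IEnumerable<T>)` overload, which calls ToString on each item.

```csharp
using System;
using System.Linq;

public static class JoinDemo
{
    // Joins a lazily computed sequence of ints without converting it to an array first.
    public static string JoinSquares()
    {
        return string.Join(", ", Enumerable.Range(1, 3).Select(x => x * x));
    }

    public static void Main()
    {
        Console.WriteLine(JoinSquares()); // prints 1, 4, 9
    }
}
```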
I would guess that String.Join requires the ability to iterate through the array twice (once to measure length, and once to do the copy). Some classes which implement IEnumerable could be successfully joined into a string by doing one pass to count the length, calling Reset on the enumerator, and using a second pass to copy the data. But since IEnumerable supports neither a Capabilities property nor a family of derived interfaces like a hypothetical IMultiPassEnumerable, the only ways String.Join could safely accept an IEnumerable would be to either (1) enumerate to some type of list and run the join on that, (2) guess at the target string size, and reallocate as needed, or (3) combine the approaches, grouping short strings into clusters of up to e.g. 8K, and then combining all clusters into a final result (which would be a mixture of pre-concatenated clusters and long strings from the original array).
While I would certainly grant that it would be handy for String.Join to include an overload that converts an IEnumerable to a List, I don't see that it would provide any more efficiency than doing such conversion manually (unlike the array version of String.Join, which is more efficient than manually joining strings individually).

Are methods that modify reference type parameters bad?

I've seen methods like this:
public void Foo(List<string> list)
{
list.Add("Bar");
}
Is this good practice to modify parameters in a method?
Wouldn't this be better?
public List<string> Foo(List<string> list)
{
// Edit
List<string> newlist = new List<string>(list);
newlist.Add("Bar");
return newlist;
}
It just feels like the first example has unexpected side effects.
In the example you've given, the first seems a lot nicer to me than the second. If I saw a method that accepted a list and also returned a list, my first assumption would be that it was returning a new list and not touching the one it was given. The second method, therefore, is the one with unexpected side effects.
As long as your methods are named appropriately there's little danger in modifying the parameter. Consider this:
public void Fill<T>(IList<T> list)
{
// add a bunch of items to list
}
With a name like "Fill" you can be pretty certain that the method will modify the list.
Frankly, in this case, both methods do more or less the same thing. Both will modify the List that was passed in.
If the objective is to have lists immutable by such a method, the second example should make a copy of the List that was sent in, and then perform the Add operation on the new List and then return that.
I'm not familiar with C# nor .NET, so my guess would be something along the line of:
public List<string> Foo(List<string> list)
{
List<string> newList = new List<string>(list); // List<T> has no Clone(); the copy constructor makes a shallow copy
newList.Add("Bar");
return newList;
}
This way, the method which calls the Foo method will get the newly created List returned, and the original List that was passed in would not be touched.
This really is up to the "contract" of your specifications or API, so in cases where Lists can just be modified, I don't see a problem with going with the first approach.
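The two contracts can be contrasted in a short sketch. `ListContracts`, `AddBarInPlace` and `WithBar` are hypothetical names chosen so that the method name itself signals whether the argument is modified.

```csharp
using System;
using System.Collections.Generic;

public static class ListContracts
{
    // Contract 1: mutates the caller's list; returns nothing.
    public static void AddBarInPlace(List<string> list)
    {
        list.Add("Bar");
    }

    // Contract 2: leaves the input untouched and returns a new list.
    public static List<string> WithBar(IEnumerable<string> list)
    {
        var copy = new List<string>(list);
        copy.Add("Bar");
        return copy;
    }

    public static void Main()
    {
        var original = new List<string> { "Foo" };

        var extended = WithBar(original);
        Console.WriteLine(original.Count); // prints 1 (input untouched)
        Console.WriteLine(extended.Count); // prints 2

        AddBarInPlace(original);
        Console.WriteLine(original.Count); // prints 2 (input modified)
    }
}
```

Accepting IEnumerable<string> in the copying version also documents in the signature that the method only reads its input.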
You're doing the exact same thing in both methods, just one of them is returning the same list.
It really depends on what you're doing, in my opinion. Just make sure your documentation is clear on what is going on. Write pre-conditions and post-conditions if you're into that sort of thing.
It's actually not that unexpected that a method that takes a list as parameter modifies the list. If you want a method that only reads from the list, you would use an interface that only allows reading:
public int GetLongest(IEnumerable<string> list) {
int len = 0;
foreach (string s in list) {
len = Math.Max(len, s.Length);
}
return len;
}
By using an interface like this you don't only prohibit the method from changing the list, it also gets more flexible as it can use any collection that implements the interface, like a string array for example.
Some other languages have a const keyword that can be applied to parameters to prohibit a method from changing them. As .NET has interfaces that you can use for this and strings that are immutable, there isn't really a need for const parameters.
The advent of extension methods has made it a bit easier to deal with methods that introduce side effects. For example, in your example it becomes much more intuitive to say
public static class Extensions
{
public static void AddBar(this List<string> list)
{
list.Add("Bar");
}
}
and call it with
mylist.AddBar();
which makes it clearer that something is happening to the list.
As mentioned in the comments, this is most useful on lists since modifications to a list can tend to be more confusing. On a simple object, I would tend to just modify the object in place.
