How to avoid methods calling each other in a chain? - c#

I have a C# class where I use a set of methods to filter out a List. I essentially do this by one method from inside the other and so on. So a( values) does some filtering, based on the output calls b(List values) or exits, b(List values) does some filtering and based on the output calls c(List values).
I want to remove this method chaining code from and control everything from the method which calls a(List values). I can use if-else-if but that will lead to too many if-else-ifs which I dont think is so great.
Are there any design patterns to solve this? Or some algo? Any help is appreciated.
Thanks.
Gaurav

What you are discribinq sounds like typical scenario to use LINQ with Iterator pattern (it can be implemented in C# easily with iterator blocks).
var results =
someCollection
.Where(c => c.SomeProperty < someValue * 2)
.Where(c => c.OtherProperty == "hi")
.OrderBy(c => c.AnotherProperty)
.Select(c => new {c.SomeProperty, c.OtherProperty});
Or as query expression:
var results = from c in SomeCollection
where c.SomeProperty < someValue * 2
where c.OtherProperty == "Hi"
orderby c.AnotherProperty
select new {c.SomeProperty, c.OtherProperty};
You can chain as many operations as you wish. Of course much more advanced operations, shuch as joins and grouping, are also available.
I recommend Jon Skeet's C# in Depth book if you really want to learn these techniques (and many others).

I want to remove this method chaining code from and control everything from the method which calls a(List values)
Well since you're filtering values you could use something like the LINQKit PredicateBuilder which would allow you to create a list of filters and apply them to the linq exprssion
However if you're doing more than just filtering you could use create instances of the Action<T> Delegate to represent actions that could be done to a your list and then apply them.
There's also the possibility of using Continuation Passing Style CPS but thinking about how that works makes my ears bleed.

Related

Select Statement doing a "for each" thing, then returning itself

I have a data pipeline. I would like to do some conditional transformations on the data in a fast way. It seems like I could just build up an enumeration without triggering it till the end like so:
var data = read();
if (!adminUser) data = data.Select(d => {d.ClearAdminOnlyFields(); return d;});
if (summarize) data = data.Select(d => {d.ClearVerboseFields(); return d;});
if (translate) data = data.Select(d => {d.Translate(culture); return d;});
return data;
The data above is thousands of items. I have tried googling this style of using select, but can't find any good examples of it being used. It seems like folks always enumerate with a .ToList() then do the transforms in a .ForEach(), but multiple enumerations like that should be slower! It seems like it would also be slower to do one big foreach with the if check inside it.
My question is: Am I wrong about this being faster? If so, can you explain what alternative is faster/better and why.
You should not do this, not because it doesn't work, but because it violates the common expectation of what Select does (transforms data without side effects).
You should use a foreach for this kind of logic instead. You should be able to do this with a single foreach, enumerating only once. Using Select to do this is not faster.

Pre-processing every item in an IQueryable<T> in LINQ to objects

Please have a look at the following code:
var list = new List<MyClass>();
...
var selection = list
.Preprocess()
.Where(x => x.MyProperty = 42)
.ToList();
I want to implement Preprocess in such a way that it can process every item selected by the predicate that follows. So, if list contains thousands of object, I don't want Preprocess to process all of them, but only those that are selected by the Where clause.
I know this sounds like it would be better to move Preprocess to the end of the query, but I have my reasons to do it that way.
Is that possible with IQueryable<T>? Or does it behave like IEnumerable, where the whole LINQ-to-objects pipeline is strictly sequential?
For your purposes, IQueryable here is no different from IEnumerable, which is, as you said, purely sequential. Immediately after you invoke Where(), there's effectively no trace of the original "collection".
Now, you could theoretically do what you want, but that would require rolling "your own LINQ". Preprocess() could return some kind of PreprocessedEnumerable, all the custom operators would attach to it and the final ToList() will do the reordering of the calls.

Select and ForEach on List<> [duplicate]

This question already has answers here:
LINQ equivalent of foreach for IEnumerable<T>
(22 answers)
Closed 9 years ago.
I am quite new to C# and was trying to use lambda expressions.
I am having a list of object. I would like to select item from the list and perform foreach operation on the selected items. I know i could do it without using lambda expression but wanted to if this was possible using lambda expression.
So i was trying to achieve a similar result
List<UserProfile> users = new List<UserProfile>();
..load users with list of users
List<UserProfile> selecteditem = users.Where(i => i.UserName=="").ToList();
foreach(UserProfile item in selecteditem)
{
item.UserName = "NA";
}
it was possible to do
users.Where(i => i.UserName=="").ToList().ForEach(i=>i.UserName="NA");
but not something like this
users.select(i => i.UserName=="").ForEach(i=>i.UserName="NA");
Can someone explain this behaviour..
Let's start here:
I am having a list of object.
It's important to understand that, while accurate, that statement leaves a c# programmer wanting more. What kind of object? In the .Net world, it pays to always keep in mind what specific type of object you are working with. In this case, that type is UserProfile. This may seem like a side issue, but it will become more relevant to the specific question very quickly. What you want to say instead is this:
I have a list of UserProfile objects.
Now let's look at your two expressions:
users.Where(i => i.UserName=="").ToList().ForEach(i=>i.UserName="NA");
and
users.Where(i => i.UserName=="").ForEach(i=>i.UserName="NA");
The difference (aside from that only the first compiles or works) is that you need to call .ToList() to convert the results of Where() function to a List type. Now we begin to see why it is that you want to always think in terms of types when working with .Net code, because it should now occur to you to wonder, "What type am I working with, then?" I'm glad you asked.
The .Where() function results in an IEnumerable<T> type, which is actually not a full type all by itself. It's an interface that describes certain things a type that implements it's contract will be able to do. The IEnumerable interface can be confusing at first, but the important thing to remember is that it defines something that you can use with a foreach loop. That is it's sole purpose. Anything in .Net that you can use with a foreach loop: arrays, lists, collections — they pretty much all implement the IEnumerable interface. There are other things you can loop over, as well. Strings, for example. Many methods you have today that require a List or Array as an argument can be made more powerful and flexible simply by changing that argument type to IEnumerable.
.Net also makes it easy to create state machine-based iterators that will work with this interface. This is especially useful for creating objects that don't themselves hold any items, but do know how to loop over items in a different collection in a specific way. For example, I might loop over just items 3 through 12 in an array of size 20. Or might loop over the items in alphabetical order. The important thing here is that I can do this without needing to copy or duplicate the originals. This makes it very efficient in terms of memory, and it's structure in such a way that you can easily compose different iterators together to get very powerful results.
The IEnumerable<T> type is especially important, because it is one of two types (the other being IQueryable) that form the core of the linq system. Most of the .Where(), .Select(), .Any() etc linq operators you can use are defined as extensions to IEnumerable.
But now we come to an exception: ForEach(). This method is not part of IEnumerable. It is defined directly as part of the List<T> type. So, we see again that it's important to understand what type you are working with at all times, including the results of each of the different expressions that make up a complete statement.
It's also instructional to go into why this particular method is not part of IEnumerable directly. I believe the answer lies in the fact that the linq system takes a lot of inspiration from a the Functional Programming world. In functional programming, you want to have operations (functions) that do exactly one thing, with no side effects. Ideally, these functions will not alter the original data, but rather they will return new data. The ForEach() method is implicitly all about creating bad side effects that alter data. It's just bad functional style. Additionally, ForEach() breaks method chaining, in that it doesn't return a new IEnumerable.
There is one more lesson to learn here. Let's take a look at your original snippet:
List<UserProfile> users = new List<UserProfile>();
// ..load users with list of users
List<UserProfile> selecteditem = users.Where(i => i.UserName=="").ToList();
foreach(UserProfile item in selecteditem)
{
item.UserName = "NA";
}
I mentioned something earlier that should help you significantly improve this code. Remember that bit about how you can have IEnumerable items that loop over a collection, without duplicating it? Think about what happens if you wrote that code this way, instead:
List<UserProfile> users = new List<UserProfile>();
// ..load users with list of users
var selecteditem = users.Where(i => i.UserName=="");
foreach(UserProfile item in selecteditem)
{
item.UserName = "NA";
}
All I did was remove the call to .ToList(), but everything will still work. The only thing that changed is we avoided needing to copy the entire list. That should make this code faster. In some circumstances, it can make the code a lot faster. Something to keep in mind: when working the with the linq operator methods, it's generally good to avoid calling .ToArray() or .ToList() whenever possible, and it's possible a lot more than you might think.
As for the foreach() {...} vs .Foreach( ... ): the former is still perfectly appropriate style.
Sure, it's quite simple. List has a ForEach method. There is no such method, or extension method, for IEnumerable.
As to why one has a method and another doesn't, that's an opinion. Eric Lippert blogged on the topic if you're interested in his.

Is .Select<T>(...) to be prefered before .Where<T>(...)?

I got in a discussion with two colleagues regarding a setup for an iteration over an IEnumerable (the contents of which will not be altered in any way during the operation). There are three conflicting theories on which is the optimal approach. Both the others (and me as well) are very certain and that got me unsure, so for the sake of clarity, I want to check with an external source.
The scenario is as follows. We had the code below as a starting point and discovered that some of the hazaas need not to be acted upon. So, starting with the code below, we started to add a blocker for the action.
foreach(Hazaa hazaa in hazaas) ;
My suggestion is as follows.
foreach(Hazaa hazaa in hazaas.Where(element => condition)) ;
One of the guys wants to resolve it by a more explicit form, claiming that LINQ is not appropriate in this case (not sure why it'd be so but he seems to be very convinced). He's solution is this.
foreach(Hazaa hazaa in hazaas) ;
if(condition) ;
The other contra-suggestion is supported by the claim that Where risks to repeat the filtering process needlessly and that it's more certain to minimize the computational workload by picking the appropriate elements once for all by Select.
foreach(Hazaa hazaa in hazaas.Select(element => condition)) ;
I argue that the first is obsolete, since LINQ can handle data objects quite well.
I also believe that Select-ing is in this case equivalently fast to Where-ing and no needless steps will be taken (e.g. the evaluation of the condition on the elements will only be performed once). If anything, it should be faster using Where because we won't be creating an extra instance of anything.
Who's right?
Select is inappropriate. It doesn't filter anything.
if is a possible solution, but Where is just as explicit.
Where executes the condition exactly once per item, just as the if. Additionally, it is important to note that the call to Where doesn't iterate the list. So, using Where you iterate the list exactly once, just like when using if.
I think you are discussing with one person that didn't understand LINQ - the guy that wants to use Select - and one that doesn't like the functional aspect of LINQ.
I would go with Where.
The .Where() and the if(condition) approach will be the same.
But since LinQ is nicely readable i'd prefer that.
The approach with .Select() is nonsense, since it will not return the Hazaa-Object, but an IEnumerable<Boolean>
To be clear about the functions:
myEnumerable.Where(a => isTrueFor(a)) //This is filtering
myEnumerable.Select(a => a.b) //This is projection
Where() will run a function, which returns a Boolean foreach item of the enumerable and return this item depending on the result of the Boolean function
Select() will run a function for every item in the list and return the result of the function without doing any filtering.

Generic list FindAll() vs. foreach

I'm looking through a generic list to find items based on a certain parameter.
In General, what would be the best and fastest implementation?
1. Looping through each item in the list and saving each match to a new list and returning that
foreach(string s in list)
{
if(s == "match")
{
newList.Add(s);
}
}
return newList;
Or
2. Using the FindAll method and passing it a delegate.
newList = list.FindAll(delegate(string s){return s == "match";});
Don't they both run in ~ O(N)? What would be the best practice here?
Regards,
Jonathan
You should definitely use the FindAll method, or the equivalent LINQ method. Also, consider using the more concise lambda instead of your delegate if you can (requires C# 3.0):
var list = new List<string>();
var newList = list.FindAll(s => s.Equals("match"));
I would use the FindAll method in this case, as it is more concise, and IMO, has easier readability.
You are right that they are pretty much going to both perform in O(N) time, although the foreach statement should be slightly faster given it doesn't have to perform a delegate invocation (delegates incur a slight overhead as opposed to directly calling methods).
I have to stress how insignificant this difference is, it's more than likely never going to make a difference unless you are doing a massive number of operations on a massive list.
As always, test to see where the bottlenecks are and act appropriately.
Jonathan,
A good answer you can find to this is in chapter 5 (performance considerations) of Linq To Action.
They measure a for each search that executes about 50 times and that comes up with foreach = 68ms per cycle / List.FindAll = 62ms per cycle. Really, it would probably be in your interest to just create a test and see for yourself.
List.FindAll is O(n) and will search the entire list.
If you want to run your own iterator with foreach, I'd recommend using the yield statement, and returning an IEnumerable if possible. This way, if you end up only needing one element of your collection, it will be quicker (since you can stop your caller without exhausting the entire collection).
Otherwise, stick to the BCL interface.
Any perf difference is going to be extremely minor. I would suggest FindAll for clarity, or, if possible, Enumerable.Where. I prefer using the Enumerable methods because it allows for greater flexibility in refactoring the code (you don't take a dependency on List<T>).
Yes, they both implementations are O(n). They need to look at every element in the list to find all matches. In terms of readability I would also prefer FindAll. For performance considerations have a look at LINQ in Action (Ch 5.3). If you are using C# 3.0 you could also apply a lambda expression. But that's just the icing on the cake:
var newList = aList.FindAll(s => s == "match");
Im with the Lambdas
List<String> newList = list.FindAll(s => s.Equals("match"));
Unless the C# team has improved the performance for LINQ and FindAll, the following article seems to suggest that for and foreach would outperform LINQ and FindAll on object enumeration: LINQ on Objects Performance.
This artilce was dated back to March 2009, just before this post originally asked.

Categories

Resources