Methods: What is better List or object? - c#

While I was programming I came up with this question,
What is better, having a method accept a single entity or a List of those entity's?
For example I need a List of strings. I can either have:
a method accepting a List and return a List of strings with the results.
List<string> results = methodwithlist(List[objects]);
or
a method accepting a object and return a string. Then use this function in a loop and so filling a list.
for int i = 0; i < List<objects>.Count;i++;)
{
results = methodwithsingleobject(List<objects>[i]);
}
** This is just a example. I need to know which one is better, or more used and why.
Thanks!

Well, it's easy to build the first form when you've got the second - but using LINQ, you really don't need to write your own, once you've got the projection. For example, you could write:
List<string> results = objectList.Select(X => MethodWithSingleObject()).ToList();
Generally it's easier to write and test a method which only deals with a single value, unless it actually needs to know the rest of the values in the collection (e.g. to find aggregates).

I would choose the second because it's easier to use when you have a single string (i.e. it's more general purpose). Also, the responsibility of the method itself is more clear because the method should not have anything to do with lists if it's purpose is just to modify a string.
Also, you can simplify the call with Linq:
result = yourList.Select(p => methodwithsingleobject(p));

This question comes up a lot when learning any language, the answer is somewhat moot since the standard coding practice is to rely upon LINQ to optimize the code for you at runtime. But this presumes you're using a version of the language that supports it. But if you do want to do some research on this there are a few Stack Overflow articles that delve into this and also give external resources to review:
In .NET, which loop runs faster, 'for' or 'foreach'?
C#, For Loops, and speed test... Exact same loop faster second time around?
What I have learned, though, is not to rely too heavily on Count and to use Length on typed Collections as that can be a lot faster.
Hope this is helpful.

Related

Is it bad practice to purposely rely on Linq Side Effects?

A programming pattern like this comes up every so often:
int staleCount = 0;
fileUpdatesGridView.DataSource = MultiMerger.TargetIds
.Select(id =>
{
FileDatabaseMerger merger = MultiMerger.GetMerger(id);
if (merger.TargetIsStale)
staleCount++;
return new
{
Id = id,
IsStale = merger.TargetIsStale,
// ...
};
})
.ToList();
fileUpdatesGridView.DataBind();
fileUpdatesMergeButton.Enabled = staleCount > 0;
I'm not sure there is a more succinct way to code this?
Even if so, is it bad practice to do this?
No, it is not strictly "bad practice" (like constructing SQL queries with string concatenation of user input or using goto).
Sometimes such code is more readable than several queries/foreach or no-side-effect Aggregate call. Also it is good idea to at least try to write foreach and no-side-effect versions to see which one is more readable/easier to prove correctness.
Please note that:
it is frequently very hard to reason what/when will happen with such code. I.e. you sample hacks around the fact of LINQ queries executed lazily with .ToList() call, otherwise that value will not be computed.
pure functions can be run in parallel, once with side effects need a lot of care to do so
if you ever need to convert LINQ-to-Object to LINQ-to-SQL you have to rewrite such queries
generally LINQ queries favor functional programming style without side-effects (and hence by convention readers would not expect side-effects in the code).
Why not just code it like this:
var result=MultiMerger.TargetIds
.Select(id =>
{
FileDatabaseMerger merger = MultiMerger.GetMerger(id);
return new
{
Id = id,
IsStale = merger.TargetIsStale,
// ...
};
})
.ToList();
fileUpdatesGridView.DataSource = result;
fileUpdatesGridView.DataBind();
fileUpdatesMergeButton.Enabled = result.Any(r=>r.IsStale);
I would consider this a bad practice. You are making the assumption that the lambda expression is being forced to execute because you called ToList. That's an implementation detail of the current version of ToList. What if ToList in .NET 7.x is changed to return an object that semi-lazily converts the IQueryable? What if it's changed to run the lambda in parallel? All of a sudden you have concurrency issues on your staleCount. As far as I know, both of those are possibilities which would break your code because of bad assumptions your code is making.
Now as far as repeatedly calling MultiMerger.GetMerger with a single id, that really should be reworked to be a join as the logic for doing a join (w|c)ould be much more efficient than what you have coded there and would scale a lot better, especially if the implementation of MultiMerger is actually pulling data from a database (or might be changed to do so).
As far as calling ToList() before passing it to the Datasource, if the Datasource doesn't use all the fields in your new object, you would be (much) faster and take less memory to skip the ToList and let the datasource only pull the fields it needs. What you've done is highly couple the data to the exact requirements of the view, which should be avoided where possible. An example would be what if you all of a sudden need to display a field that exists in FileDatabaseMerger, but isn't in your current anonymous object? Now you have to make changes to both the controller and view to add it, where if you just passed in an IQueryable, you would only have to change the view. Again, faster, less memory, more flexible, and more maintainable.
Hope this helps.. And this question really should be posted of code review, not stackoverflow.
Update on further review, the following code would be much better:
var result=MultiMerger.GetMergersByIds(MultiMerger.TargetIds);
fileUpdatesGridView.DataSource = result;
fileUpdatesGridView.DataBind();
fileUpdatesMergeButton.Enabled = result.Any(r=>r.TargetIsStale);
or
var result=MultiMerger.GetMergers().Where(m=>MultiMerger.TargetIds.Contains(m.Id));
fileUpdatesGridView.DataSource = result;
fileUpdatesGridView.DataBind();
fileUpdatesMergeButton.Enabled = result.Any(r=>r.TargetIsStale);

Select and ForEach on List<> [duplicate]

This question already has answers here:
LINQ equivalent of foreach for IEnumerable<T>
(22 answers)
Closed 9 years ago.
I am quite new to C# and was trying to use lambda expressions.
I am having a list of object. I would like to select item from the list and perform foreach operation on the selected items. I know i could do it without using lambda expression but wanted to if this was possible using lambda expression.
So i was trying to achieve a similar result
List<UserProfile> users = new List<UserProfile>();
..load users with list of users
List<UserProfile> selecteditem = users.Where(i => i.UserName=="").ToList();
foreach(UserProfile item in selecteditem)
{
item.UserName = "NA";
}
it was possible to do
users.Where(i => i.UserName=="").ToList().ForEach(i=>i.UserName="NA");
but not something like this
users.select(i => i.UserName=="").ForEach(i=>i.UserName="NA");
Can someone explain this behaviour..
Let's start here:
I am having a list of object.
It's important to understand that, while accurate, that statement leaves a c# programmer wanting more. What kind of object? In the .Net world, it pays to always keep in mind what specific type of object you are working with. In this case, that type is UserProfile. This may seem like a side issue, but it will become more relevant to the specific question very quickly. What you want to say instead is this:
I have a list of UserProfile objects.
Now let's look at your two expressions:
users.Where(i => i.UserName=="").ToList().ForEach(i=>i.UserName="NA");
and
users.Where(i => i.UserName=="").ForEach(i=>i.UserName="NA");
The difference (aside from that only the first compiles or works) is that you need to call .ToList() to convert the results of Where() function to a List type. Now we begin to see why it is that you want to always think in terms of types when working with .Net code, because it should now occur to you to wonder, "What type am I working with, then?" I'm glad you asked.
The .Where() function results in an IEnumerable<T> type, which is actually not a full type all by itself. It's an interface that describes certain things a type that implements it's contract will be able to do. The IEnumerable interface can be confusing at first, but the important thing to remember is that it defines something that you can use with a foreach loop. That is it's sole purpose. Anything in .Net that you can use with a foreach loop: arrays, lists, collections — they pretty much all implement the IEnumerable interface. There are other things you can loop over, as well. Strings, for example. Many methods you have today that require a List or Array as an argument can be made more powerful and flexible simply by changing that argument type to IEnumerable.
.Net also makes it easy to create state machine-based iterators that will work with this interface. This is especially useful for creating objects that don't themselves hold any items, but do know how to loop over items in a different collection in a specific way. For example, I might loop over just items 3 through 12 in an array of size 20. Or might loop over the items in alphabetical order. The important thing here is that I can do this without needing to copy or duplicate the originals. This makes it very efficient in terms of memory, and it's structure in such a way that you can easily compose different iterators together to get very powerful results.
The IEnumerable<T> type is especially important, because it is one of two types (the other being IQueryable) that form the core of the linq system. Most of the .Where(), .Select(), .Any() etc linq operators you can use are defined as extensions to IEnumerable.
But now we come to an exception: ForEach(). This method is not part of IEnumerable. It is defined directly as part of the List<T> type. So, we see again that it's important to understand what type you are working with at all times, including the results of each of the different expressions that make up a complete statement.
It's also instructional to go into why this particular method is not part of IEnumerable directly. I believe the answer lies in the fact that the linq system takes a lot of inspiration from a the Functional Programming world. In functional programming, you want to have operations (functions) that do exactly one thing, with no side effects. Ideally, these functions will not alter the original data, but rather they will return new data. The ForEach() method is implicitly all about creating bad side effects that alter data. It's just bad functional style. Additionally, ForEach() breaks method chaining, in that it doesn't return a new IEnumerable.
There is one more lesson to learn here. Let's take a look at your original snippet:
List<UserProfile> users = new List<UserProfile>();
// ..load users with list of users
List<UserProfile> selecteditem = users.Where(i => i.UserName=="").ToList();
foreach(UserProfile item in selecteditem)
{
item.UserName = "NA";
}
I mentioned something earlier that should help you significantly improve this code. Remember that bit about how you can have IEnumerable items that loop over a collection, without duplicating it? Think about what happens if you wrote that code this way, instead:
List<UserProfile> users = new List<UserProfile>();
// ..load users with list of users
var selecteditem = users.Where(i => i.UserName=="");
foreach(UserProfile item in selecteditem)
{
item.UserName = "NA";
}
All I did was remove the call to .ToList(), but everything will still work. The only thing that changed is we avoided needing to copy the entire list. That should make this code faster. In some circumstances, it can make the code a lot faster. Something to keep in mind: when working the with the linq operator methods, it's generally good to avoid calling .ToArray() or .ToList() whenever possible, and it's possible a lot more than you might think.
As for the foreach() {...} vs .Foreach( ... ): the former is still perfectly appropriate style.
Sure, it's quite simple. List has a ForEach method. There is no such method, or extension method, for IEnumerable.
As to why one has a method and another doesn't, that's an opinion. Eric Lippert blogged on the topic if you're interested in his.

Converting C# to Java, how do I handle IEnumerable?

I've been tasked with converting some C# code to Java. Most of it is fine, but I am having some trouble working out how to translate IEnumerable.
The code I have in C# is this:
public IEnumerable<Cat> Reorder(IEnumerable<Cat> catList)
{
// some logic that reorders the list
}
My googling suggested that I should be using Iterable<Cat> as an alternative. However, I also stumbled upon something saying you should never have Iterable<T> as a return type.
I'm a bit unfamiliar with data structures in Java. What should I be returning, and how would you re-order a collection of objects?
In C#, assuming you don't use linq, you'd create an empty array or List or similar, and add the items in as you repeatedly iterate through them, checking the criteria. Which data structure would I use in Java to achieve this?
It depends a bit on what you want to do with the return value.
Java has no LINQ, so using an Iterable<T> other than inside a foreach loop is a bit of a PITA. This blog post describes it in more depth.
The alternative is to return a Collection<T>.
Having said that, returning an Iterable<T> is not wrong, it just makes it harder to consume the return value in certain scenarios.
In Java you would use an implementation of List<T> like ArrayList<T> for temporary instances inside methods. When you would want to return that instance from a method, the return type would be the interface List<T> or Collection<T>.
You could do something like Collections.sort(list); if you implement a Interface Comparable at your objects (similiar can be done with c# and the IComparer)
"add the items in as you repeatedly iterate through them," I hope you don't really mean what I'm thinking you mean... There are a hell lot of sorting algorithms.
There's nothing wrong with Iterable, when you need to use it.
However, considering what the code above tells me, I think you'd be better going with java.util.Collection or java.util.List (the latter if catList is definitely a list of cats).
The way of reordering actually depends on the ordering requirements. It's probably as simple as using Collections.sort(List list)
You should identify the data structure you need.
Basically the 3 great data structure families are:
List: An ordered index-based collection;
Set: A collection that contains no duplicate elements;
Map: A key-value based collection (A dictionnary in C#). Not adapted here in this precise case.
Depending on what you need, you should return List<Cat> or Set<Cat>.

Write a lambda expression to perform a calulcation on an list

I have a List/IEnumerable of objects and I'd like to perform a calculation on some of them.
e.g.
myList.Where(f=>f.Calculate==true).Calculate();
to update myList, based on the Where clause, so that the required calulcation is performed and the entire list updated as appropriate.
The list contains "lines" where an amount is either in Month1, Month2, Month3...Month12, Year1, Year2, Year3-5 or "Long Term"
Most lines are fixed and always fall into one of these months, but some "lines" are calulcated based upon their "Maturity Date".
Oh, and just to complicate things! the list (at the moment) is of an anonymous type from a couple of linq queries. I could make it a concrete class if required though, but I'd prefer not to if I can avoid it.
So, I'd like to call a method that works on only the calculated lines, and puts the correct amount into the correct "month".
I'm not worried about the calculation logic, but rather how to get this into an easily readable method that updates the list without, ideally, returning a new list.
[Is it possible to write a lambda extension method to do both the calculation AND the where - or is this overkill anyway as Where() already exists?]
Personally, if you want to update the list in place, I would just use a simple loop. It will be much simpler to follow and maintain:
for (int i=0;i<list.Count;++i)
{
if (list[i].ShouldCalculate)
list[i] = list[i].Calculate();
}
This, at least, is much more obvious that it's going to update. LINQ has the expectation of performing a query, not mutating the data.
If you really want to use LINQ for this, you can - but it will still require a copy if you want to have a List<T> as your results:
myList = myList.Select(f => f.ShouldCalculate ? f.Calculate() : f).ToList();
This would call your Calculate() method as needed, and copy the original when not needed. It does require a copy to create a new List<T>, though, as you mentioned that was a requirement (in comments).
However, my personal preference would still be to use a loop in this case. I find the intent much more clear - plus, you avoid the unnecessary copy operation.
Edit #2:
Given this comment:
Oh, and just to complicate things! the list (at the moment) is of an anonymous type from a couple of linq queries
If you really want to use LINQ style syntax, I would recommend just not calling ToList() on your original queries. If you leave them in their original, IEnumerable<T> form, you can easily do my second option above, but on the original query:
var myList = query.Select(f => f.ShouldCalculate ? f.Calculate() : f).ToList();
This has the advantage of only constructing the list one time, and preventing the copy, as the original sequence will not get evaluated until this operation.
LINQ is mostly geared around side-effect-free queries, and anonymous types themselves are immutable (although of course they can maintain references to mutable types).
Given that you want to mutate the list in place, LINQ isn't a great fit.
As per Reed's suggestion, I would use a straight for loop. However, if you want to perform different calculations at different points, you could encapsulate this:
public static void Recalculate<T>(IList<T> list,
Func<T, bool> shouldCalculate,
Func<T, T> calculation)
{
for (int i = 0; i < list.Count; i++)
{
if (shouldCalculate(items[i]))
{
items[i] = calculation(items[i]);
}
}
}
If you really want to use this in a fluid way, you could make it return the list - but I would personally be against that, as it would then look like it was side-effect-free like LINQ.
And like Reed, I'd also prefer to do this by creating a new sequence...
Select doesn't copy or clone the objects it passes to the passed delegate, any state changes to that object will be reflected through the reference in the container (unless it is a value type).
So updating reference types is not a problem.
To replace the objects (or when working with value types1) this are more complex and there is no inbuilt solution with LINQ. A for loop is clearest (as with the other answers).
1 Remembering, of course, that mutable value types are evil.

Which LINQ query is more effective?

I have a huge IEnumerable(suppose the name is myItems), which way is more effective?
Solution 1: Filter it first then ForEach.
Array.ForEach(myItems.Where(FILTER-IT-HERE).ToArray(),MY-ACTION);
Solution 2: Do RETURN in MY-ACTION if the item is not up to the mustard.
Array.ForEach(myItems.ToArray(),MY-ACTION-WITH-FILTER);
Is one of them always better than another? Or any other good suggestions? Thanks in advance.
Did you do any measurements? Since WE can't measure the run time of My-Action then only you can. Measure and decide.
Sometimes one has to create benchmark's because similar looking activities could produce radically different and unexpected results.
You do not say what your data source is so I'm going to assume it may be data on an SQL server in which case filtering at the server side will likely always be the best approach because you have minimized the amount of data transfer. Memory access is always faster than data transfer from disk to memory so whenever you can transfer fewer records, you are likely to have better performance.
Well, both times, you're converting to an array, which might not be so efficient if the IEnumerable is very large (like you said). You could create a generic extension method for IEnumerable, like:
public static void ForEach<T>(this IEnumerable<T> current, Action<T> action) {
foreach (var i in current) {
action(i);
}
}
and then you could do this:
IEnumerable<int> ints = new List<int>();
ints.Where(i => i == 5).ForEach(i => Console.WriteLine(i));
If performance is a concern, it's unclear to me why you'd be bothering to construct an entire array in the first place. Why not just this?
foreach (var item in myItems.Where(FILTER-IT-HERE))
MY-ACTION;
Or:
foreach (var item in myItems)
MY-ACTION-WITH-FILTER;
I ask because, while the others are right that you can't really know without testing, I wouldn't expect there to be much difference between the above two options. I would expect there to be a difference, on the other hand, between creating/populating an array (seemingly for no reason) and not creating an array.
Everything else being equal, calling ToArray() first will impart a greater performance hit than when calling it last. Although, as has been stated by others before me,
Why use ToArray() and Array.ForEach() at all?
We don't know that everything else actually is equal since you do not reveal the implementation details of your filter and action.
The idea of LINQ is to work on enumerable collections, so the best LINQ query is the one where you don't use Array.ForEach() and .ToArray() at all.
I would say that this falls into the category of premature optimization. If, after establishing benchmarks, you find that the code is too slow, you can always try each approach and pick the result that works better for you.
Since we don't know how the IEnumerable<> is produced it's hard to say which approach will perform better. We also don't know how many items will remain after you apply your predicate - nor do we know whether the action or iteration steps are going to be the dominant factor in the execution of your code. The only way to know for sure is to try it both ways, profile the results, and pick the best.
Performance aside, I would choose the version that is most clear - which (for me) is to first filter and then apply the projection to the result.

Categories

Resources