I just found a couple of C# code refactoring examples on the internet and stumbled upon this particular piece of code.
Can anyone explain to me why Method2() would be better than Method1()?
Method #1 - Doing multiple iterations on IEnumerable<string>
public void Method1()
{
IEnumerable<string> names = GetNames();
foreach (var name in names)
{
Console.WriteLine("Found " + name);
}
var allnames = new StringBuilder();
foreach (var name in names)
{
allnames.Append(name + " ");
}
}
Method #2 - Doing multiple iterations on List<string>
public void Method2()
{
IEnumerable<string> names = GetNames();
var enumerable = names as List<string> ?? names.ToList();
foreach (var name in enumerable)
{
Console.WriteLine("Found " + name);
}
var allnames = new StringBuilder();
foreach (var name in enumerable)
{
allnames.Append(name + " ");
}
}
Because an IEnumerable may be lazily evaluated, in which case the code that produces the items will run twice.
For example, if GetNames is actually communicating with a DB, then iterating over the returned IEnumerable may perform the actual SQL query. In that case, Method 1 performs that task twice.
In Method 2, the call to ToList causes the IEnumerable to be evaluated only once, so your SQL query would run only once.
Because you don't always know what is actually behind an IEnumerable, it's often seen as best practice to force enumeration only once.
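To make the difference visible without a database, here is a minimal sketch. It assumes GetNames() is a lazy iterator; the Executions counter and the class name LazyNamesDemo are made up for illustration and stand in for "the SQL query ran".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyNamesDemo
{
    // Counts how many times the iterator body starts running.
    public static int Executions;

    // Hypothetical stand-in for GetNames(): the body runs again
    // every time the returned sequence is enumerated.
    public static IEnumerable<string> GetNames()
    {
        Executions++;
        yield return "Alice";
        yield return "Bob";
    }

    // Enumerates the lazy sequence twice, as Method1 does.
    public static int EnumerateTwiceLazily()
    {
        Executions = 0;
        IEnumerable<string> names = GetNames();
        foreach (var name in names) { }
        foreach (var name in names) { }
        return Executions;
    }

    // Materializes once, then enumerates the list twice, as Method2 does.
    public static int EnumerateTwiceAfterToList()
    {
        Executions = 0;
        List<string> names = GetNames().ToList();
        foreach (var name in names) { }
        foreach (var name in names) { }
        return Executions;
    }
}
```

With this stand-in, EnumerateTwiceLazily() reports two executions and EnumerateTwiceAfterToList() reports one.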
Both methods do their job; the only differentiating factor is when the sequence gets evaluated. In the second method, the .ToList() call eagerly evaluates the expression and materializes the collection. In the first method, the expression is evaluated only when the CLR executes the foreach statement shown below. As I said, which one to use depends on how you want to proceed.
foreach (var name in names)
Method 2 is better because Method 1 may enumerate the IEnumerable multiple times.
Which is more efficient - to attempt to iterate an empty list, or to test if there are any items in the list before attempting to iterate?
For example, first get an empty list of things:
var listOfThings = GetZeroThings(); // this returns 0 things
Is it less efficient to attempt to execute this:
foreach (var thing in listOfThings)
{
}
Or should I first test whether there are any items? For example:
if (listOfThings.Any())
{
foreach (var thing in listOfThings)
{
}
}
If you look at Any's source:
public static bool Any<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw Error.ArgumentNull("source");
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) return true;
}
return false;
}
You can see that all it does is call GetEnumerator on the source collection and then call MoveNext. A foreach does exactly the same for an empty collection, so calling Any before foreach is redundant. I believe it won't have any significant effect on performance, though, and I think using Any before foreach makes the code more readable.
My simple test shows that the one with Any is slower:
[2016-04-18 03:01:55.900 UTC] Without Any: 254 ms
[2016-04-18 03:01:56.266 UTC] With Any: 363 ms
The code is below (it may not be reproducible since you do not have my logBox component, but it is included to show the fairness of the test):
IEnumerable<int> listOfThings = new List<int>();
logBox.GetTimeLapse();
for (int i = 0; i < 10000000; ++i)
foreach (var thing in listOfThings)
Console.WriteLine("Do something!");
logBox.WriteTimedLogLine("Without Any: " + logBox.GetTimeLapse());
logBox.GetTimeLapse();
for (int i = 0; i < 10000000; ++i)
if (listOfThings.Any())
foreach (var thing in listOfThings)
Console.WriteLine("Do something!");
logBox.WriteTimedLogLine("With Any: " + logBox.GetTimeLapse());
I think this makes sense because of the presence of the extra Any call, while foreach (var thing in listOfThings) alone would already have done the check for you anyway.
Edit:
Additional note by Jonathan Allen, which I think is worth including: the line if (listOfThings.Any()) (1) allocates memory and (2) makes a virtual dispatch call. The line foreach (var thing in listOfThings) does neither.
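Since logBox is not available to readers, the same kind of measurement can be approximated with System.Diagnostics.Stopwatch. This is only a rough sketch: the class name AnyBenchmark and the Measure method are invented here, and the numbers it produces will vary by machine, runtime, and JIT warm-up, so treat them as indicative at best.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class AnyBenchmark
{
    // Times `iterations` passes over an empty list,
    // with or without an Any() guard in front of the foreach.
    public static TimeSpan Measure(bool useAny, int iterations)
    {
        IEnumerable<int> listOfThings = new List<int>();
        int visited = 0;

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
        {
            if (useAny)
            {
                if (listOfThings.Any())
                    foreach (var thing in listOfThings)
                        visited++;
            }
            else
            {
                foreach (var thing in listOfThings)
                    visited++;
            }
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling Measure(false, 10000000) and Measure(true, 10000000) a few times each, and discarding the first run, gives a fairer comparison than a single pass.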
You need to profile it. It's really impossible to give you an answer. That being said, checking Any() looks like it can cause issues.
For example:
void Main()
{
var listOfThings = GetThings();
if (listOfThings.Any())
{
foreach (var item in listOfThings)
{
Console.WriteLine("A");
}
}
}
public IEnumerable<int> GetThings()
{
Thread.Sleep(10000);
yield return 1;
}
Assuming the first element is expensive to get, you're doing the work twice. It entirely depends on how your collection behaves. As above, you need to profile your own circumstances.
Just by looking at it, it seems like Any() will always be redundant, as it grabs the enumerator anyway.
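The "doing the work twice" point can be checked without waiting ten seconds. The sketch below replaces Thread.Sleep with a counter; ExpensiveCalls and the class name AnyThenForeachDemo are assumptions made for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyThenForeachDemo
{
    // Stands in for the expensive work done before the first element.
    public static int ExpensiveCalls;

    // Hypothetical lazy source, mirroring GetThings() above.
    public static IEnumerable<int> GetThings()
    {
        ExpensiveCalls++;
        yield return 1;
    }

    // Any() starts one enumeration, foreach starts a second one.
    public static int WithAny()
    {
        ExpensiveCalls = 0;
        var listOfThings = GetThings();
        if (listOfThings.Any())
        {
            foreach (var item in listOfThings) { }
        }
        return ExpensiveCalls;
    }

    // foreach alone enumerates exactly once.
    public static int WithoutAny()
    {
        ExpensiveCalls = 0;
        foreach (var item in GetThings()) { }
        return ExpensiveCalls;
    }
}
```

For this lazy source, WithAny() performs the expensive step twice and WithoutAny() performs it once. For an in-memory List<T> there is no such repeated work, which is why profiling your own case matters.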
Why can we iterate over the items of a list like this, for example:
mList.ForEach((item)
{
item.xyz ....
}
while for a simple array we have to use a foreach loop?
foreach(int i in arr)
i.xyz
or use a delegate type?
Action<int> action = new Action<int>(myfunc);
Array.ForEach(intArray, action);
What is the difference?
The first syntax is not correct. It should be like this:
mList.ForEach(item =>
{
// item.xyz
});
ForEach is a method of List<T> that lets you call an Action<T> for each item in the list.
The foreach statement, on the other hand, repeats a group of embedded statements for each element in an array or an object collection that implements the System.Collections.IEnumerable or System.Collections.Generic.IEnumerable<T> interface.
That being said, ForEach can be called only on lists, whereas foreach can be used on any object that implements either IEnumerable or IEnumerable<T>. That's the big difference here.
Regarding the delegate type, there isn't any difference. A lambda expression such as item => { item.xyz = ...; } is just a shorthand way of creating a delegate.
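A small sketch of the three forms discussed in this answer, side by side. The class and method names are made up for the example; List<T>.ForEach, Array.ForEach, and Action<T> are the real framework members.

```csharp
using System;
using System.Collections.Generic;

public static class ForEachDelegateDemo
{
    // List<T>.ForEach with a lambda.
    public static int SumWithLambda(List<int> list)
    {
        int sum = 0;
        list.ForEach(item => sum += item);
        return sum;
    }

    // Array.ForEach with an explicitly constructed Action<int>.
    public static int SumWithExplicitDelegate(int[] array)
    {
        int sum = 0;
        Action<int> action = new Action<int>(item => sum += item);
        Array.ForEach(array, action);
        return sum;
    }

    // The foreach statement works on anything that implements IEnumerable<T>.
    public static int SumWithForeach(IEnumerable<int> items)
    {
        int sum = 0;
        foreach (int item in items)
            sum += item;
        return sum;
    }
}
```

All three produce the same sum for the same input; only foreach accepts an arbitrary IEnumerable<int>.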
The language defines foreach as an operation on IEnumerable. Therefore, everything that implements IEnumerable is iterable. However, a ForEach method does not 'make sense' for every IEnumerable.
Take this for example:
public static IEnumerable<MyObject> GetObjects()
{
var i = 0;
while(i < 30)
yield return new MyObject { Name = "Object " + i++ };
}
And then you do something like this:
var objects = GetObjects();
objects.ForEach(o => o.Name = "Rob");
foreach (var obj in objects)
Console.WriteLine(obj.Name);
If that compiled, it would print Object 0 to Object 29, not Rob 30 times.
The reason is that the iterator method runs again each time you iterate the enumerable, so every enumeration creates brand-new objects. ForEach makes sense on a list because the enumerable has been materialized and the objects are not re-created every time you iterate it.
In order to make ForEach work on an enumerable, you'd need to materialize the collection as well (such as putting it into a list), but even that is not always possible, as you can have an enumerable with no defined end:
public static IEnumerable<MyObject> GetObjects()
{
while(true)
yield return new MyObject { Name = "Object " };
}
It also makes sense to have ForEach on Array, but for reasons I'm unaware of, it was defined as the static Array.ForEach(arr, action) rather than arr.ForEach(action).
The moral of the story is: if you think you need a ForEach block, you probably want to materialize the enumerable first, usually to a List<T> or an array (T[]).
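The effect of materializing first can be sketched as follows. MyObject here is a minimal assumed definition (the original does not show it), the sequence is shortened to three items, and the method names are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of MyObject; the original post does not define it.
public class MyObject
{
    public string Name { get; set; } = "";
}

public static class MaterializeDemo
{
    public static IEnumerable<MyObject> GetObjects()
    {
        for (int i = 0; i < 3; i++)
            yield return new MyObject { Name = "Object " + i };
    }

    // Renames objects from a lazy sequence, then enumerates it again:
    // the second enumeration creates fresh objects, so the rename is lost.
    public static List<string> RenameLazily()
    {
        IEnumerable<MyObject> objects = GetObjects();
        foreach (var o in objects)
            o.Name = "Rob";
        return objects.Select(o => o.Name).ToList();
    }

    // Materializes first, so both passes see the same object instances.
    public static List<string> RenameMaterialized()
    {
        List<MyObject> objects = GetObjects().ToList();
        objects.ForEach(o => o.Name = "Rob");
        return objects.Select(o => o.Name).ToList();
    }
}
```

RenameLazily() still returns the original "Object n" names, while RenameMaterialized() returns "Rob" for every item.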
I need to create an IEnumerable of DocumentSearch objects from an IQueryable.
The following code causes the database to load the entire result, which makes my app slow.
public static IEnumerable<DocumentSearch> BuildDocumentSearch(IQueryable<Document> documents)
{
var enumerator = documents.GetEnumerator();
while(enumerator.MoveNext())
{
yield return new DocumentSearch(enumerator.Current);
}
}
The natural way of writing this is:
public static IEnumerable<DocumentSearch> BuildDocumentSearch(IQueryable<Document> documents)
{
return documents.Select(doc => new DocumentSearch(doc));
}
When you call one of the IEnumerable extension methods such as Select, Where, or OrderBy, you are still only adding to the recipe for the results that will be returned. When you try to access an element of the IEnumerable (as in your example), the result set must be resolved at that time.
For what it's worth, your while loop would be more naturally written as a foreach loop, though it should have the same semantics about when the query is executed.
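The "recipe" idea can be illustrated with an in-memory stand-in. This sketch does not use a real database or IQueryable provider (whose behaviour depends on the provider); Source() and the Produced counter are assumptions that simply show when items are actually pulled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSelectDemo
{
    // Counts how many items were actually pulled from the source.
    public static int Produced;

    // Hypothetical stand-in for a query result with 1000 rows.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 1000; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Building the query with Select does not enumerate anything yet.
    public static int ProducedAfterBuildingQuery()
    {
        Produced = 0;
        IEnumerable<string> query = Source().Select(n => "Doc " + n);
        return Produced;
    }

    // Enumerating only the first two results pulls only a few items.
    public static int ProducedAfterTakingTwo()
    {
        Produced = 0;
        List<string> firstTwo = Source().Select(n => "Doc " + n).Take(2).ToList();
        return Produced;
    }
}
```

Nothing is produced until the sequence is enumerated, and taking a couple of results pulls far fewer than the full 1000 items.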
This is non-language-specific, but I'll use examples in C#. I often face a problem where I need to attach a new piece of data to each object during an iteration over a collection, and I always have to come up with a lame temporary list or array of some kind, along with the problem of keeping it properly correlated.
So, please bear with me on the examples below:
Is there an easier and better way to do this in C#?
List<String> storeStr;
void AssignStringListWithNewUniqueStr (List<String> aList) {
foreach (String str in aList) {
storeStr.add(str);
str = AProcedureToGenerateNewUniqueStr();
}
}
void PrintStringListWithNewUniqueStr (List<String> aList) {
int i = 0;
foreach (String str in aList) {
print(str + storeStr[i]);
i++;
}
}
Notice that the correlation above is guaranteed only because I'm iterating through an unchanged aList. By an "easier and better way" I mean it should also make sure that storeStr is always correlated with its equivalent in aList while keeping the code as short and simple as possible. The List could also have been any kind of array or object.
Is there any language in which something like this is possible? It must give the same results as above.
IterationBound<String> storeStr;
void AssignStringListWithNewUniqueStr (List<String> aList) {
foreach (String str in aList) {
storeStr = str;
str = AProcedureToGenerateNewUniqueStr();
}
}
void PrintStringListWithNewUniqueStr (List<String> aList) {
foreach (String str in aList) {
print(str + storeStr);
}
}
In this case, the fictitious "IterationBound" type would guarantee the correlation between the list and the new parameter (in a way, just like garbage collectors guarantee allocations). It would somehow notice that it was created inside an iteration and associate itself with that specific index (no matter if the syntax there would be uglier, of course). Then, when it is accessed again in another iteration and a value was already stored for that specific index, it would retrieve the value belonging to that iteration.
Why not simply project your enumerable into a new form?
var combination = aList
    .Select(x => new { Initial = x, Addition = AProcedureToGenerateNewUniqueStr() })
    .ToList();
// List<T>.ForEach returns void, so it is called as a separate statement.
combination.ForEach(x =>
{
    print(x.Initial + x.Addition);
});
This way you keep each element associated with the new data.
aList.ForEach(x => print(x + AProcedureToGenerateNewUniqueStr()));
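If the pairing has to survive between two separate methods, as in the question, one option is to store the pairs themselves instead of a parallel list. This is only a sketch: PairingDemo, Pair, and Format are invented names, and the generate parameter stands in for AProcedureToGenerateNewUniqueStr().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairingDemo
{
    // Builds one (Initial, Addition) pair per element, so the two values
    // cannot drift out of step the way two parallel lists can.
    public static List<(string Initial, string Addition)> Pair(
        IEnumerable<string> aList, Func<string> generate)
    {
        return aList
            .Select(s => (Initial: s, Addition: generate()))
            .ToList();
    }

    // Later consumer: formats each stored pair without needing an index.
    public static List<string> Format(List<(string Initial, string Addition)> pairs)
    {
        return pairs.Select(p => p.Initial + p.Addition).ToList();
    }
}
```

The list of pairs is materialized once with ToList(), so the generator runs exactly once per element and later passes see the same values.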
I am having trouble remembering how (but not why) to use IEnumerators in C#. I am used to Java with its wonderful documentation that explains everything to beginners quite nicely. So please, bear with me.
I have tried learning from other answers on these boards to no avail. Rather than ask a generic question that has already been asked before, I have a specific example that would clarify things for me.
Suppose I have a method that needs to be passed an IEnumerable<String> object. All the method needs to do is concatenate the letters roxxors to the end of every String in the iterator. It then will return this new iterator (of course the original IEnumerable object is left as it was).
How would I go about this? The answer here should help many with basic questions about these objects in addition to me, of course.
Here is the documentation on IEnumerator. They are used to get the values of lists, where the length is not necessarily known ahead of time (even though it could be). The word comes from enumerate, which means "to count off or name one by one".
IEnumerator and IEnumerator<T> are provided by the IEnumerable and IEnumerable<T> interfaces (the latter providing both) in .NET via GetEnumerator(). This is important because the foreach statement is designed to work directly with enumerators through those interface methods.
So for example:
IEnumerator enumerator = enumerable.GetEnumerator();
while (enumerator.MoveNext())
{
object item = enumerator.Current;
// Perform logic on the item
}
Becomes:
foreach(object item in enumerable)
{
// Perform logic on the item
}
As to your specific scenario, almost all collections in .NET implement IEnumerable. Because of that, you can do the following:
public IEnumerator Enumerate(IEnumerable enumerable)
{
// List implements IEnumerable, but could be any collection.
List<string> list = new List<string>();
foreach(string value in enumerable)
{
list.Add(value + "roxxors");
}
return list.GetEnumerator();
}
public IEnumerable<string> Appender(IEnumerable<string> strings)
{
List<string> myList = new List<string>();
foreach(string str in strings)
{
myList.Add(str + "roxxors");
}
return myList;
}
or
public IEnumerable<string> Appender(IEnumerable<string> strings)
{
foreach(string str in strings)
{
yield return str + "roxxors";
}
}
using the yield construct, or simply
var newCollection = strings.Select(str => str + "roxxors"); //(*)
or
var newCollection = from str in strings select str + "roxxors"; //(**)
where the two latter use LINQ and (**) is just syntactic sugar for (*).
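One practical difference between the list-building version and the yield version is when the source is read. The sketch below is an illustration under assumed names (AppenderDemo, AppendEagerly, AppendLazily, ResultAfterLateAdd): it adds an item to the source after calling the method but before reading the result.

```csharp
using System;
using System.Collections.Generic;

public static class AppenderDemo
{
    // Eager: reads the source immediately and returns a finished list.
    public static IEnumerable<string> AppendEagerly(IEnumerable<string> strings)
    {
        var result = new List<string>();
        foreach (string str in strings)
            result.Add(str + "roxxors");
        return result;
    }

    // Lazy: the body does not run until the result is enumerated.
    public static IEnumerable<string> AppendLazily(IEnumerable<string> strings)
    {
        foreach (string str in strings)
            yield return str + "roxxors";
    }

    // Calls the appender, then modifies the source, then reads the result.
    public static string ResultAfterLateAdd(
        Func<IEnumerable<string>, IEnumerable<string>> appender)
    {
        var source = new List<string> { "a" };
        IEnumerable<string> result = appender(source);
        source.Add("b");
        return string.Join(",", result);
    }
}
```

The eager version only sees "a", while the lazy version also sees the late "b". In both cases the strings in the original collection are left unchanged.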
If I understand you correctly, then in C# the yield return compiler magic is all you need, I think.
e.g.
IEnumerable<string> myMethod(IEnumerable<string> sequence)
{
foreach(string item in sequence)
{
yield return item + "roxxors";
}
}
I'd do something like:
private IEnumerable<string> DoWork(IEnumerable<string> data)
{
List<string> newData = new List<string>();
foreach(string item in data)
{
newData.Add(item + "roxxors");
}
return newData;
}
Simple stuff :)
Also you can use LINQ's Select Method:
var source = new[] { "Line 1", "Line 2" };
var result = source.Select(s => s + " roxxors");
Read more here: Enumerable.Select Method