First Time Calling Extension Methods is Slower Than Subsequent Calls

First Time Calling Extension Methods is Slower Than Subsequent Calls - c#

I have a class that modifies data via some extension methods. In order to debug performance, I have created some rough debug code to make multiple calls to the same methods using the same data a number of times. I am finding that it consistently takes a significantly longer time to do the calculations the first time through the loop than subsequent calls.
For example, for a small set of data, it appears to take about 5 seconds to do the calculations, while each subsequent call is a second or so.
Thanks,
wTs
The code looks something like this:
Test Code
void TestCode()
{
for (int i = 0; i < iterationsPerLoop; i++)
{
DateTime startTime = DateTime.Now;
// The test is actually being done in a BackgroundWorker
dispatcher.Invoke(DispatcherPriority.Normal,
(Action)(() => this.PropertyCausingCodeToRun = "Run";
while (this.WaitForSomeCondition)
Thread.Sleep(125);
DateTime endTime = DateTime.Now;
double result = endTime.Subtract(startTime).TotalSeconds;
}
}
Method where extension methods called
private static List<ObservableItem> GetAvailableItems(MyObject myObject)
{
var items = new List<ObservableItem>(myObject.Items.ToList());
var selectedItems = items.OrderByDescending(item => item.ASortableProperty)
.SetItemIsAvailable(false)
.SetItemPriority()
.OrderByDescending(item => item.Priority)
.Where(item => item.Priority > 0)
.SetItemIsAvailable(true)
.OrderBy(item => item.Date);
return selectedItems.ToList();
}
Extension Methods (ObservableItems all created on different thread)
static class MyExtensionMethods
{
public static IEnumerable<T> SetItemIsAvailable<T>(this IEnumerable<T> sourceList,
Boolean isAvailable) where T : ObservableItem
{
Action<T> setAvailable = i => i.IsAvailable = isAvailable;
List<DispatcherOperation> invokeResults = new List<DispatcherOperation>();
foreach (var item in sourceList)
{
invokeResults.Add(
item.ItemDispatcher.BeginInvoke(setAvailable , new object[] { item }));
}
invokeResults.ForEach(ir => ir.Wait());
return sourceList;
}
public static IEnumerable<T> SetItemPriority<T>(this IEnumerable<T> sourceList) where T : ObservableItem
{
Action<T, double> setPriority = new Action<T, double>((item, priority) =>
{
item.Priority = priority;
});
List<DispatcherOperation> invokeResults = new List<DispatcherOperation>();
foreach (var item in sourceList)
{
double priority = ......; // Some set of calculations
invokeResults.Add(
item.ItemDispatcher.BeginInvoke(setPriority,
new object[] { asset, priority }));
}
invokeResults.ForEach(ir => ir.Wait());
return sourceList;
}
}

Most often, the first time methods are called, there is some overhead associated with the JIT compilation time. This will have an effect (though most likely not that much).
However, looking at your code, you're spending a huge amount of time waiting on asynchronous calls marshalling to the UI via the dispatcher. This is going to put a large hit on your overall performance, and slow this way down.
I'd recommend doing all of your operations in one dispatch call, and using Invoke instead of BeginInvoke. Instead of marshalling one message per item, just marshal a single delegate that includes the foreach loop through your items.
This will be significantly faster.

The real issue, as I've figured out, was caused by the property initially being called to sort the items (before the extension methods are even called).
The property is of the form:
public Double ASortableProperty
{
get
{
if (mASortableProperty.HasValue)
{
return mASortableProperty.Value;
}
mASortableProperty = this.TryGetDoubleValue(...);
return (mASortableProperty.HasValue ? mASortableProperty.Value : 0);
}
}
Therefore, the first time through the loop, the values were not initialized from the database, and the cost was in retrieving these values, before the sort can take place.

Related

IEnumerable<T> and .Where Linq method behaviour?

I thought I know everything about IEnumerable<T> but I just met a case that I cannot explain. When we call .Where linq method on a IEnumerable, the execution is deferred until the object is enumerated, isn't it?
So how to explain the sample below :
public class CTest
{
public CTest(int amount)
{
Amount = amount;
}
public int Amount { get; set; }
public override string ToString()
{
return $"Amount:{Amount}";
}
public static IEnumerable<CTest> GenerateEnumerableTest()
{
var tab = new List<int> { 2, 5, 10, 12 };
return tab.Select(t => new CTest(t));
}
}
Nothing bad so far!
But the following test gives me an unexpected result although my knowledge regarding IEnumerable<T> and .Where linq method :
[TestMethod]
public void TestCSharp()
{
var tab = CTest.GenerateEnumerableTest();
foreach (var item in tab.Where(i => i.Amount > 6))
{
item.Amount = item.Amount * 2;
}
foreach (var t in tab)
{
var s = t.ToString();
Debug.Print(s);
}
}
No item from tab will be multiplied by 2. The output will be :
Amount:2
Amount:5
Amount:10
Amount:12
Does anyone can explain why after enumerating tab, I get the original value.
Of course, everything work fine after calling .ToList() just after calling GenerateEnumerableTest() method.

var tab = CTest.GenerateEnumerableTest();
This tab is a LINQ query that generates CTest instances that are initialized from int-values which come from an integer array which will never change. So whenever you ask for this query you will get the "same" instances(with the original Amount).
If you want to "materialize" this query you could use ToList and then change them.
Otherwise you are modifying CTest instances that exist only in the first foreach loop. The second loop enumerates other CTest instances with the unmodified Amount.
So the query contains the informations how to get the items, you could also call the method directly:
foreach (var item in CTest.GenerateEnumerableTest().Where(i => i.Amount > 6))
{
item.Amount = item.Amount * 2;
}
foreach (var t in CTest.GenerateEnumerableTest())
{
// now you don't expect them to be changed, do you?
}

Like many LINQ operations, Select is lazy and use deferred execution so your lambda expression is never being executed, because you're calling Select but never using the results. This is why, everything work fine after calling .ToList() just after calling GenerateEnumerableTest() method:
var tab = CTest.GenerateEnumerableTest().ToList();

Is it the same to iterate over Linq expression result than to assign it first to a variable?

So, this is more difficult to explain in words, so i will put code examples.
let's suppose i already have a list of clients that i want to filter.
Basically i want to know if this:
foreach(var client in list.Where(c=>c.Age > 20))
{
//Do something
}
is the same as this:
var filteredClients = list.Where(c=>c.Age > 20);
foreach(var client in filteredClients)
{
//Do something
}
I've been told that the first approach executes the .Where() in every iteration.
I'm sorry if this is a duplicate, i couldn't find any related question.
Thanks in advance.

Yes, both those examples are functionally identical. One just stores the result from Enumerable.Where in a variable before accessing it while the other just accesses it directly.
To really see why this will not make a difference, you have to understand what a foreach loop essentially does. The code in your examples (both of them) is basically equivalent to this (I’ve assumed a known type Client here):
IEnumerable<Client> x = list.Where(c=>c.Age > 20);
// foreach loop
IEnumerator<Client> enumerator = x.GetEnumerator();
while (enumerator.MoveNext())
{
Client client = enumerator.Current;
// Do something
}
So what actually happens here is the IEnumerable result from the LINQ method is not consumed directly, but an enumerator of it is requested first. And then the foreach loop does nothing else than repeatedly asking for a new object from the enumerator and processing the current element in each loop body.
Looking at this, it doesn’t make sense whether the x in the above code is really an x (i.e. a previously stored variable), or whether it’s the list.Where() call itself. Only the enumerator object—which is created just once—is used in the loop.
Now to cover that SharePoint example which Colin posted. It looks like this:
SPList activeList = SPContext.Current.List;
for (int i=0; i < activeList.Items.Count; i++)
{
SPListItem listItem = activeList.Items[i];
// do stuff
}
This is a fundamentally different thing though. Since this is not using a foreach loop, we do not get that one enumerator object which we use to iterate through the list. Instead, we repeatedly access activeList.Items: Once in the loop body to get an item by index, and once in the continuation condition of the for loop where we get the collection’s Count property value.
Unfortunately, Microsoft does not follow its own guidelines all the time, so even if Items is a property on the SPList object, it actually is creating a new SPListItemCollection object every time. And that object is empty by default and will only lazily load the actual items when you first access an item from it. So above code will eventually create a large amount of SPListItemCollections which will each fetch the items from the database. This behavior is also mentioned in the remarks section of the property documentation.
This generally violates Microsoft’s own guidelines on choosing a property vs a method:
Do use a method, rather than a property, in the following situations.
The operation returns a different result each time it is called, even if the parameters do not change.
Note that if we used a foreach loop for that SharePoint example again, then everything would have been fine, since we would have again only requested a single SPListItemCollection and created a single enumerator for it:
foreach (SPListItem listItem in activeList.Items.Cast<SPListItem>())
{ … }

They are not quite the same:
Here is the original C# code:
static void ForWithVariable(IEnumerable<Person> clients)
{
var adults = clients.Where(x => x.Age > 20);
foreach (var client in adults)
{
Console.WriteLine(client.Age.ToString());
}
}
static void ForWithoutVariable(IEnumerable<Person> clients)
{
foreach (var client in clients.Where(x => x.Age > 20))
{
Console.WriteLine(client.Age.ToString());
}
}
Here is the decompiled Intermediate Language (IL) code this results in (according to ILSpy):
private static void ForWithVariable(IEnumerable<Person> clients)
{
Func<Person, bool> arg_21_1;
if ((arg_21_1 = Program.<>c.<>9__1_0) == null)
{
arg_21_1 = (Program.<>c.<>9__1_0 = new Func<Person, bool>(Program.<>c.<>9.<ForWithVariable>b__1_0));
}
IEnumerable<Person> enumerable = clients.Where(arg_21_1);
foreach (Person current in enumerable)
{
Console.WriteLine(current.Age.ToString());
}
}
private static void ForWithoutVariable(IEnumerable<Person> clients)
{
Func<Person, bool> arg_22_1;
if ((arg_22_1 = Program.<>c.<>9__2_0) == null)
{
arg_22_1 = (Program.<>c.<>9__2_0 = new Func<Person, bool>(Program.<>c.<>9.<ForWithoutVariable>b__2_0));
}
foreach (Person current in clients.Where(arg_22_1))
{
Console.WriteLine(current.Age.ToString());
}
}
As you can see, there is a key difference:
IEnumerable<Person> enumerable = clients.Where(arg_21_1);
A more practical question, however, is whether the differences hurt performance. I concocted a test to measure that.
class Program
{
public static void Main()
{
Measure(ForEachWithVariable);
Measure(ForEachWithoutVariable);
Console.ReadKey();
}
static void Measure(Action<List<Person>, List<Person>> action)
{
var clients = new[]
{
new Person { Age = 10 },
new Person { Age = 20 },
new Person { Age = 30 },
}.ToList();
var adultClients = new List<Person>();
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < 1E6; i++)
action(clients, adultClients);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds.ToString());
Console.WriteLine($"{adultClients.Count} adult clients found");
}
static void ForEachWithVariable(List<Person> clients, List<Person> adultClients)
{
var adults = clients.Where(x => x.Age > 20);
foreach (var client in adults)
adultClients.Add(client);
}
static void ForEachWithoutVariable(List<Person> clients, List<Person> adultClients)
{
foreach (var client in clients.Where(x => x.Age > 20))
adultClients.Add(client);
}
}
class Person
{
public int Age { get; set; }
}
After several runs of the program, I was not able to find any significant difference between ForEachWithVariable and ForEachWithoutVariable. They were always close in time, and neither was consistently faster than the other. Interestingly, if I change 1E6 to just 1000, the ForEachWithVariable is actually consistently slower, by about 1 millisecond.
So, I conclude that for LINQ to Objects, there is no practical difference. The same type of test could be run if your particular use case involves LINQ to Entities (or SharePoint).

Create an asynchronous LinQ query

I check and tried this example: How to write Asynchronous LINQ query?
This works well and it asynchronously executes both procedures but my target is to create a query that when you get the first result of the query you can already start to use this value while the query is still executing and looking for more values.
I did this in while I was programming for android and the result was amazing and really fast but I have no clue how to make it in C# and using linQ

Asynchronous sequences are modeled in .NET as IObservables. Reactive Extensions is the asynchronous section of LINQ.
You can generate an observable sequence in any number of ways, such as through a call to Generate:
var sequence = Observable.Generate(0,
i => i < 10,
i => i + 1,
i => SomeExpensiveGeneratorFunction());
And then query it using all of the regular LINQ functions (which as a result allows for the use of query syntax) along with a number of additional operations that make sense specifically for asynchronous sequence (and also a lot of different ways of creating observable sequences) such as:
var query = from item in sequence
where ConditionIsTrue(item)
select item.ToString();
The short description of what's going on here is to just say that it does exactly what you want. The generator function simply notifies its subscribers whenever it successfully generates a value (or when it's done) and continues generating values without waiting for the subscribers to finish, the Where method will subscribe to sequence, and notify its subscribers whenever it observes a value that passes the condition, Select will subscribe to the sequence returned by Where and perform its transformation (asynchronously) whenever it gets a value and will then push it to all of its subscribers.

I have modified TheSoftwareJedi answer from your given link.
You can raise the first startup event from the Asynchronous class, and use it to start-up your work.
Here's the class,
public static class AsynchronousQueryExecutor
{
private static Action<object> m_OnFirstItemProcessed;
public static void Call<T>(IEnumerable<T> query, Action<IEnumerable<T>> callback, Action<Exception> errorCallback, Action<object> OnFirstItemProcessed)
{
m_OnFirstItemProcessed = OnFirstItemProcessed;
Func<IEnumerable<T>, IEnumerable<T>> func =
new Func<IEnumerable<T>, IEnumerable<T>>(InnerEnumerate<T>);
IEnumerable<T> result = null;
IAsyncResult ar = func.BeginInvoke(
query,
new AsyncCallback(delegate(IAsyncResult arr)
{
try
{
result = ((Func<IEnumerable<T>, IEnumerable<T>>)((AsyncResult)arr).AsyncDelegate).EndInvoke(arr);
}
catch (Exception ex)
{
if (errorCallback != null)
{
errorCallback(ex);
}
return;
}
//errors from inside here are the callbacks problem
//I think it would be confusing to report them
callback(result);
}),
null);
}
private static IEnumerable<T> InnerEnumerate<T>(IEnumerable<T> query)
{
int iCount = 0;
foreach (var item in query) //the method hangs here while the query executes
{
if (iCount == 0)
{
iCount++;
m_OnFirstItemProcessed(item);
}
yield return item;
}
}
}
here's the associations,
private void OnFirstItem(object value) // Your first items is proecessed here.
{
//You can start your work here.
}
public void HandleResults(IEnumerable<int> results)
{
foreach (var item in results)
{
}
}
public void HandleError(Exception ex)
{
}
and here's how you should call the function.
private void buttonclick(object sender, EventArgs e)
{
IEnumerable<int> range = Enumerable.Range(1,10000);
var qry = TestSlowLoadingEnumerable(range);
//We begin the call and give it our callback delegate
//and a delegate to an error handler
AsynchronousQueryExecutor.Call(qry, HandleResults, HandleError, OnFirstItem);
}
If this meets your expectation, you can use this to start your work with the first item processed.

Try again ...
If I understand you the logic you want is something like ...
var query = getData.Where( ... );
query.AsParallel().ForEach(r => {
//other stuff
});
What will happen here ...
Well in short, the compiler will evaluate this to something like: Whilst iterating through query results in parallel perform the logic in the area where the comment is.
This is async and makes use of an optimal thread pool managed by .net to ensure the results are acquired as fast as possible.
This is an automatically managed async parallel operation.
It's also worth noting that I if I do this ...
var query = getData.Where( ... );
... no actual code is run until I begin iterating the IQueryable and by declaring the operation a parallel one the framework is able to operate on more than one of the results at any point in time by threading the code for you.
The ForEach is essentially just a normal foreach loop where each iteration is asynchronously handled.
The logic you put in there could call some sort of callback if you wanted but that's down to how you wrap this code ...
Might I suggest something like this:
void DoAsync<T>(IQueryable<T> items, Func<T> operation, Func<T> callback)
{
items.AsParallel().ForEach(x => {
operation(x);
callback(x);
});
}

This is pretty simple with the TPL.
Here's a dummy "slow" enumerator that has to do a bit of work between getting items:
static IEnumerable<int> SlowEnumerator()
{
for (int i = 0; i < 10; i++)
{
Thread.Sleep(1000);
yield return i;
}
}
Here's a dummy bit of work to do with each item in the sequence:
private static void DoWork(int i)
{
Thread.Sleep(1000);
Console.WriteLine("{0} at {1}", i, DateTime.Now);
}
And here's how you can simultenously run the "bit of work" on one item that the enumerator has returned and ask the enumerator for the next item:
foreach (var i in SlowEnumerator())
{
Task.Run(() => DoWork(i));
}
You should get work done every second - not every 2 seconds as you would expect if you had to interleave the two types of work:
0 at 20/01/2015 10:56:52
1 at 20/01/2015 10:56:53
2 at 20/01/2015 10:56:54
3 at 20/01/2015 10:56:55
4 at 20/01/2015 10:56:56
5 at 20/01/2015 10:56:57
6 at 20/01/2015 10:56:58
7 at 20/01/2015 10:56:59
8 at 20/01/2015 10:57:00
9 at 20/01/2015 10:57:01

How to yield return inside anonymous methods?

Basically I have an anonymous method that I use for my BackgroundWorker:
worker.DoWork += ( sender, e ) =>
{
foreach ( var effect in GlobalGraph.Effects )
{
// Returns EffectResult
yield return image.Apply (effect);
}
};
When I do this the compiler tells me:
"The yield statement cannot be used
inside an anonymous method or lambda
expression"
So in this case, what's the most elegant way to do this? Btw this DoWork method is inside a static method, in case that matters for the solution.

Unfortunately you can't.
The compiler does not allow you to combine the two "magic" pieces of code. Both involve rewriting your code to support what you want to do:
An anonymous method is done by moving the code to a proper method, and lifting local variables to fields on the class with that method
An iterator method is rewritten as a state machine
You can, however, rewrite the code to return the collection, so in your particular case I would do this:
worker.DoWork += ( sender, e ) =>
{
return GlobalGraph.Effects
.Select(effect => image.Apply(effect));
};
though it looks odd for an event (sender, e) to return anything at all. Are you sure you're showing a real scenario for us?
Edit Ok, I think I see what you're trying to do here.
You have a static method call, and then you want to execute code in the background, and then return data from that static method once the background call completes.
This is, while possible, not a good solution since you're effectively pausing one thread to wait for another, that was started directly before you paused the thread. In other words, all you're doing is adding overhead of context switching.
Instead you need to just kick off the background work, and then when that work is completed, process the resulting data.

Perhaps just return the linq expression and defer execution like yield:
return GlobalGraph.Effects.Select(x => image.Apply(x));

Unless I'm missing something, you can't do what you're asking.
(I do have an answer for you, so please read past my explanation of why you can't do what you're doing first, and then read on.)
You full method would look something like this:
public static IEnumerable<EffectResult> GetSomeValues()
{
// code to set up worker etc
worker.DoWork += ( sender, e ) =>
{
foreach ( var effect in GlobalGraph.Effects )
{
// Returns EffectResult
yield return image.Apply (effect);
}
};
}
If we assume that your code was "legal" then when GetSomeValues is called, even though the DoWork handler is added to worker, the lambda expression isn't executed until the DoWork event is fired. So the call to GetSomeValues completes without returning any results and the lamdba may or may not get called at a later stage - which is then too late for the caller of the GetSomeValues method anyway.
Your best answer is to the use Rx.
Rx turns IEnumerable<T> on its head. Instead of requesting values from an enumerable, Rx has values pushed to you from an IObservable<T>.
Since you're using a background worker and responding to an event you are effectively having the values pushed to you already. With Rx it becomes easy to do what you're trying to do.
You have a couple of options. Probably the simplest is to do this:
public static IObservable<IEnumerable<EffectResult>> GetSomeValues()
{
// code to set up worker etc
return from e in Observable.FromEvent<DoWorkEventArgs>(worker, "DoWork")
select (
from effect in GlobalGraph.Effects
select image.Apply(effect)
);
}
Now callers of your GetSomeValues method would do this:
GetSomeValues().Subscribe(ers =>
{
foreach (var er in ers)
{
// process each er
}
});
If you know that DoWork is only going to fire once, then this approach might be a little better:
public static IObservable<EffectResult> GetSomeValues()
{
// code to set up worker etc
return Observable
.FromEvent<DoWorkEventArgs>(worker, "DoWork")
.Take(1)
.Select(effect => from effect in GlobalGraph.Effects.ToObservable()
select image.Apply(effect))
.Switch();
}
This code looks a little more complicated, but it just turns a single do work event into a stream of EffectResult objects.
Then the calling code looks like this:
GetSomeValues().Subscribe(er =>
{
// process each er
});
Rx can even be used to replace the background worker. This might be the best option for you:
public static IObservable<EffectResult> GetSomeValues()
{
// set up code etc
return Observable
.Start(() => from effect in GlobalGraph.Effects.ToObservable()
select image.Apply(effect), Scheduler.ThreadPool)
.Switch();
}
The calling code is the same as the previous example. The Scheduler.ThreadPool tells Rx how to "schedule" the processing of subscriptions to the observer.
I hope this helps.

For new readers: the most elegant way to implement 'anonymous iterators' (i. e. nested in other methods) in C#5 is probably something like this cool trick with async/await (don't be confused by these keywords, the code below is computed absolutely synchronously - see details in the linked page):
public IEnumerable<int> Numbers()
{
return EnumeratorMonad.Build<int>(async Yield =>
{
await Yield(11);
await Yield(22);
await Yield(33);
});
}
[Microsoft.VisualStudio.TestTools.UnitTesting.TestMethod]
public void TestEnum()
{
var v = Numbers();
var e = v.GetEnumerator();
int[] expected = { 11, 22, 33 };
Numbers().Should().ContainInOrder(expected);
}
C#7 (available now in Visual Studio 15 Preview) supports local functions, which allow yield return:
public IEnumerable<T> Filter<T>(IEnumerable<T> source, Func<T, bool> filter)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (filter == null) throw new ArgumentNullException(nameof(filter));
return Iterator();
IEnumerable<T> Iterator()
{
foreach (var element in source)
{
if (filter(element)) { yield return element; }
}
}
}

DoWork is of type DoWorkEventHandler which returns nothing (void),
so it's not possible at all in your case.

The worker should set the Result property of DoWorkEventArgs.
worker.DoWork += (s, e) => e.Result = GlobalGraph.Effects.Select(x => image.Apply(x));

Ok so I did something like this which does what I wanted (some variables omitted):
public static void Run ( Action<float, EffectResult> action )
{
worker.DoWork += ( sender, e ) =>
{
foreach ( var effect in GlobalGraph.Effects )
{
var result = image.Apply (effect);
action (100 * ( index / count ), result );
}
}
};
and then in the call site:
GlobalGraph.Run ( ( p, r ) =>
{
this.Progress = p;
this.EffectResults.Add ( r );
} );

I wanted to supplement user1414213562's answer with an implementation of the ForEachMonad.
static class ForEachMonad
{
public static IEnumerable<A> Lift<A>(A a) { yield return a; }
// Unfortunately, this doesn't compile
// public static Func<IEnumerable<A>, IEnumerable<B>> Lift<A, B>(Func<A, IEnumerable<B>> f) =>
// (IEnumerable<A> ea) => { foreach (var a in ea) { foreach (var b in f(a)) { yield return b; } } }
// Fortunately, this does :)
public static Func<IEnumerable<A>, IEnumerable<B>> Lift<A, B>(Func<A, IEnumerable<B>> f)
{
IEnumerable<B> lift(IEnumerable<A> ea)
{
foreach (var a in ea) { foreach (var b in f(a)) { yield return b; } }
}
return lift;
}
public static void Demo()
{
var f = (int x) => (IEnumerable<int>)new int[] { x + 1, x + 2, x + 3 };
var g = (int x) => (IEnumerable<double>)new double[] { Math.Sqrt(x), x*x };
var log = (double d) => { Console.WriteLine(d); return Lift(d); };
var e1 = Lift(0);
var e2 = Lift(f)(e1);
var e3 = Lift(g)(e2);
// we call ToArray in order to materialize the IEnumerable
Lift(log)(e3).ToArray();
}
}
Running ForEachMonad.Demo() produces the following output:
1
1
1,4142135623730951
4
1,7320508075688772
9

Elegantly refactoring code like this (to avoid a flag)

I have a function running over an enumerable, but the function should be a little bit different for the first item, for example:
void start() {
List<string> a = ...
a.ForEach(DoWork);
}
bool isFirst = true;
private void DoWork(string s) {
// do something
if(isFirst)
isFirst = false;
else
print("first stuff");
// do something
}
How would you refactor this to avoid that ugly flag?

Expounding on Jimmy Hoffa's answer if you actually want to do something with the first item you could do this.
DoFirstWork(a[0])
a.Skip(1).ForEach(DoWork)
If the point is that it is separate in logic from the rest of the list then you should use a separate function.

It might be a bit heavy handed, but I pulled this from another SO question a while back.
public static void IterateWithSpecialFirst<T>(this IEnumerable<T> source,
Action<T> firstAction,
Action<T> subsequentActions)
{
using (IEnumerator<T> iterator = source.GetEnumerator())
{
if (iterator.MoveNext())
{
firstAction(iterator.Current);
}
while (iterator.MoveNext())
{
subsequentActions(iterator.Current);
}
}
}

Check out Jon Skeet's smart enumerations.
They are part of his Miscellaneous Utility Library

EDIT: added usage example, added a ForFirst method, reordered my paragraphs.
Below is a complete solution.
Usage is either of the following:
list.ForFirst(DoWorkForFirst).ForRemainder(DoWork);
// or
list.ForNext(1, DoWorkForFirst).ForRemainder(DoWork);
The crux is the ForNext method, which performs an action for the specified next set of items from the collection and returns the remaining items. I've also implemented a ForFirst method that simply calls ForNext with count: 1.
class Program
{
static void Main(string[] args)
{
List<string> list = new List<string>();
// ...
list.ForNext(1, DoWorkForFirst).ForRemainder(DoWork);
}
static void DoWorkForFirst(string s)
{
// do work for first item
}
static void DoWork(string s)
{
// do work for remaining items
}
}
public static class EnumerableExtensions
{
public static IEnumerable<T> ForFirst<T>(this IEnumerable<T> enumerable, Action<T> action)
{
return enumerable.ForNext(1, action);
}
public static IEnumerable<T> ForNext<T>(this IEnumerable<T> enumerable, int count, Action<T> action)
{
if (enumerable == null)
throw new ArgumentNullException("enumerable");
using (var enumerator = enumerable.GetEnumerator())
{
// perform the action for the first <count> items of the collection
while (count > 0)
{
if (!enumerator.MoveNext())
throw new ArgumentOutOfRangeException(string.Format(System.Globalization.CultureInfo.InvariantCulture, "Unexpected end of collection reached. Expected {0} more items in the collection.", count));
action(enumerator.Current);
count--;
}
// return the remainder of the collection via an iterator
while (enumerator.MoveNext())
{
yield return enumerator.Current;
}
}
}
public static void ForRemainder<T>(this IEnumerable<T> enumerable, Action<T> action)
{
if (enumerable == null)
throw new ArgumentNullException("enumerable");
foreach (var item in enumerable)
{
action(item);
}
}
}
I felt a bit ridiculous making the ForRemainder method; I could swear that I was re-implementing a built-in function with that, but it wasn't coming to mind and I couldn't find an equivalent after glancing around a bit. UPDATE: After reading the other answers, I see there apparently isn't an equivalent built into Linq. I don't feel so bad now.

using System.Linq; // reference to System.Core.dll
List<string> list = ..
list.Skip(1).ForEach(DoWork) // if you use List<T>.ForEeach()
but I recommend you to write your one:
public static void ForEach(this IEnumerable<T> collection, Action<T> action)
{
foreach(T item in collection)
action(item);
}
So you could do just next:
list.Skip(1).ForEach(DoWork)

It's hard to say what the "best" way to handle the first element differently is without knowing why it needs to be handled differently.
If you're feeding the elements of the sequence into the framework's ForEach method, you can't elegantly provide the Action delegate the information necessary for it to determine the element parameter's position in the source sequence, so I think an extra step is necessary. If you don't need to do anything with the sequence after you loop through it, you could always use a Queue (or Stack), pass the first element to whatever handler you're using through a Dequeue() (or Pop()) method call, and then you have the leftover "homogeneous" sequence.

It might seem rudimentary with all the shiny Linq stuff available, but there's always the old fashion for loop.
var yourList = new List<int>{1,1,2,3,5,8,13,21};
for(int i = 0; i < yourList.Count; i++)
{
if(i == 0)
DoFirstElementStuff(yourList[i]);
else
DoNonFirstElementStuff(yourList[i]);
}
This would be fine if you don't want to alter yourList inside the loop. Else, you'll probably need to use the iterator explicitly. At that point, you have to wonder if that's really worth it just to get rid of an IsFirst flag.

Depends on how you're "handling it differently". If you need to do something completely different, then I'd recommend handling the first element outside the loop. If you need to do something in addition to the regular element processing, then consider having a check for the result of the additional processing. It's probably easier to understand in code, so here's some:
string randomState = null; // My alma mater!
foreach(var ele in someEnumerable) {
if(randomState == null) randomState = setState(ele);
// handle additional processing here.
}
This way, your "flag" is really an external variable you (presumably) need anyway, so you're not creating a dedicated variable. You can also wrap it in an if/else if you don't want to process the first element like the rest of the enumeration.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

First Time Calling Extension Methods is Slower Than Subsequent Calls - c#

Related

IEnumerable<T> and .Where Linq method behaviour?

Is it the same to iterate over Linq expression result than to assign it first to a variable?

Create an asynchronous LinQ query

How to yield return inside anonymous methods?

Elegantly refactoring code like this (to avoid a flag)

Categories

Resources