How to create an IEnumerable with optional Action? - c#

I have an IntegerRectangle class. I want it to have an internal_perimeter() method which returns all points of its perimeter, and an internal_perimeter(Action<IntegerPoint> processor) overload which applies processor to all points of its perimeter.
One of my classes has a variable IntegerRectangle canvas; and a HashSet<IntegerPoint> forbidden_points. It calls:
canvas.internal_perimeter((IntegerPoint p)=>{forbidden_points.Add(p); print("[f]" + forbidden_points.Contains(p).ToString());});
The result differs between different implementations of internal_perimeter()
This works:
public IEnumerable<IntegerPoint> internal_perimeter()
{
for(int i=0;i<width;++i)
{
yield return new IntegerPoint(x+i,y);
}
for(int i=1;i<height;++i)
{
yield return new IntegerPoint(x+width-1,y-i);
}
for(int i=width-2;i>=0;--i)
{
yield return new IntegerPoint(x+i,y-height+1);
}
for(int i=height-2;i>=0;--i)
{
yield return new IntegerPoint(x,y-i);
}
}
public void internal_perimeter(Action<IntegerPoint> processor)
{
foreach(IntegerPoint i in internal_perimeter())
processor(i);
}
This doesn't:
public IEnumerable<IntegerPoint> internal_perimeter(Action<IntegerPoint> processor=null)
{
if(processor==null)
{
for(int i=0;i<width;++i)
{
yield return new IntegerPoint(x+i,y);
}
for(int i=1;i<height;++i)
{
yield return new IntegerPoint(x+width-1,y-i);
}
for(int i=width-2;i>=0;--i)
{
yield return new IntegerPoint(x+i,y-height+1);
}
for(int i=height-2;i>=0;--i)
{
yield return new IntegerPoint(x,y-i);
}
}
else
foreach(IntegerPoint i in internal_perimeter())
processor(i);
}
I don't understand what is wrong with the second one

To add to @Lucas' answer, which explains why your code doesn't work, you should also consider refactoring your code:
internal_perimeter is a bad name for the method. If its purpose is to mutate internal points, then it should be named void Process(Action a) or something like that.
The second example is rather problematic because it yields nothing (an empty sequence) whenever you pass a non-null action. It would make more sense to take a Func<T, TResult> (like LINQ's Select) and yield return all the processed values. Also, the null branch is unusual: passing a null delegate to switch behaviour like this is rarely recommended.
Next, the method really does too little. Why do you need a new method which has an existing LINQ alternative? I.e.:
var rect = new IntegerRectangle();
// this gets a list of points
var forbiddenPoints = rect.internal_perimeter().ToList();
// this filters them and projects them
// (i.e. "get all x coordinates larger than 10")
var xLargerThan10 = rect
.internal_perimeter()
.Where(p => p.X > 10)
.Select(p => p.X)
.ToList();
Even the original internal_perimeter overload might have a better name, e.g. simply GetPoints would be pretty indicative of what its purpose is:
foreach (var point in rect.GetPoints())
DoStuff(point);

Your second example is an iterator (ie it uses yield return). This kind of function is not executed until you enumerate it.
If you do: var x = internal_perimeter(i => {});
The variable x will hold an IEnumerable<IntegerPoint> of a class constructed by the compiler from your function. Your code is not executed yet at this point.
Now, try to consume it: foreach(var point in x) {}. This will execute your function. Actually in your particular case, it will all be executed on the first iteration, so calling x.FirstOrDefault(); will be enough. Indeed, calling MoveNext on the enumerator will execute the code up to the first yield return, and there are none in the else branch of your code.
Now, I'd go with your first example because of this. It is less error prone.
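The deferral described above can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical Numbers method that has the same shape as the question's single-method version (it is not the asker's IntegerRectangle code): the processor does not run when the method is called, only when the result is enumerated, and even then the sequence itself is empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical stand-in for the question's single-method version.
    // Because the body contains yield return, the whole method is an iterator.
    public static IEnumerable<int> Numbers(Action<int> processor = null)
    {
        if (processor == null)
        {
            for (int i = 0; i < 3; ++i)
                yield return i;
        }
        else
        {
            foreach (int i in Numbers())
                processor(i);
        }
    }

    public static void Main()
    {
        var seen = new List<int>();

        // Nothing runs here: the call only creates the enumerable.
        IEnumerable<int> pending = Numbers(seen.Add);
        Console.WriteLine(seen.Count); // 0

        // Enumerating it runs the else branch, which yields no items.
        int yielded = pending.Count();
        Console.WriteLine(seen.Count); // 3
        Console.WriteLine(yielded);    // 0
    }
}
```

Splitting the method in two, as in the first example, avoids the surprise because the void overload is an ordinary method and runs immediately.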

Related

IEnumerable<T> and .Where Linq method behaviour?

I thought I knew everything about IEnumerable<T>, but I just ran into a case that I cannot explain. When we call the .Where LINQ method on an IEnumerable, the execution is deferred until the object is enumerated, isn't it?
So how can the sample below be explained?
public class CTest
{
public CTest(int amount)
{
Amount = amount;
}
public int Amount { get; set; }
public override string ToString()
{
return $"Amount:{Amount}";
}
public static IEnumerable<CTest> GenerateEnumerableTest()
{
var tab = new List<int> { 2, 5, 10, 12 };
return tab.Select(t => new CTest(t));
}
}
Nothing bad so far!
But the following test gives me an unexpected result, despite what I know about IEnumerable<T> and the .Where LINQ method:
[TestMethod]
public void TestCSharp()
{
var tab = CTest.GenerateEnumerableTest();
foreach (var item in tab.Where(i => i.Amount > 6))
{
item.Amount = item.Amount * 2;
}
foreach (var t in tab)
{
var s = t.ToString();
Debug.Print(s);
}
}
No item from tab will be multiplied by 2. The output will be :
Amount:2
Amount:5
Amount:10
Amount:12
Can anyone explain why, after enumerating tab, I get the original values?
Of course, everything works fine after calling .ToList() just after calling the GenerateEnumerableTest() method.
var tab = CTest.GenerateEnumerableTest();
This tab is a LINQ query that generates CTest instances initialized from int values, which come from a list of integers that never changes. So whenever you enumerate this query you get "the same" instances again, that is, new objects with the original Amount.
If you want to "materialize" this query, you can use ToList and then change the resulting objects.
Otherwise you are modifying CTest instances that exist only in the first foreach loop. The second loop enumerates other CTest instances with the unmodified Amount.
So the query only contains the information on how to get the items; you could just as well call the method directly:
foreach (var item in CTest.GenerateEnumerableTest().Where(i => i.Amount > 6))
{
item.Amount = item.Amount * 2;
}
foreach (var t in CTest.GenerateEnumerableTest())
{
// now you don't expect them to be changed, do you?
}
Like many LINQ operations, Select is lazy and uses deferred execution, so your lambda expression runs again each time the query is enumerated and creates new CTest instances every time. The objects you modified in the first loop are simply discarded. This is why everything works fine after calling .ToList() just after calling the GenerateEnumerableTest() method:
var tab = CTest.GenerateEnumerableTest().ToList();
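To make the "new instances on every enumeration" point concrete, here is a small sketch. The Item class and the created counter are hypothetical names introduced only for this illustration; the structure mirrors GenerateEnumerableTest.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Amount { get; set; }
}

public static class SelectDemo
{
    public static void Main()
    {
        var source = new List<int> { 2, 5, 10, 12 };
        int created = 0;

        // Lazy query: the lambda runs again for each element on every enumeration.
        IEnumerable<Item> query = source.Select(n => { created++; return new Item { Amount = n }; });

        Item firstPass = query.First();
        Item secondPass = query.First();
        Console.WriteLine(ReferenceEquals(firstPass, secondPass)); // False
        Console.WriteLine(created);                                // 2

        // Materialized list: the objects are created once and then reused.
        List<Item> list = query.ToList();
        foreach (Item item in list.Where(i => i.Amount > 6))
            item.Amount *= 2;
        Console.WriteLine(string.Join(",", list.Select(i => i.Amount))); // 2,5,20,24
    }
}
```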

List<T>.AddRange and the yield statement

I am aware that the yield keyword indicates that the method in which it appears is an iterator. I was just wondering how that works with something like List<T>.AddRange.
Let's use the below example:
static void Main()
{
foreach (int i in MyInts())
{
Console.Write(i);
}
}
public static IEnumerable<int> MyInts()
{
for (int i = 0; i < 255; i++)
{
yield return i;
}
}
So in the above example after each yield, a value is returned in the foreach loop in Main and is printed to the console.
If we change Main to this:
static void Main()
{
var myList = new List<int>();
myList.AddRange(MyInts());
}
how does that work? Does AddRange get called for each int returned by the yield statement or does it somehow wait for all 255 values before adding the entire range?
The implementation of AddRange will iterate over the IEnumerable input using the iterator's .MoveNext() method until all values have been produced by your yielding method. This can be seen here.
So myList.AddRange(MyInts()); is called once and its implementation forces MyInts to return all of its values before moving on.
AddRange exhausts all values of the iterator because of how it is implemented, but the following hypothetical method would only evaluate the first value of the iterator:
public void AddFirst<T>(IEnumerable<T> collection)
{
Add(collection.First());
}
An interesting experiment while you play around with this is to add a Console.WriteLine(i); line in your MyInts method to see when each number is generated.
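A variant of that experiment, as a rough sketch: instead of printing inside MyInts, a hypothetical Generated counter records how many values the iterator has produced, and the range is shortened to 5 to keep the numbers readable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AddRangeTrace
{
    // Hypothetical counter: how many values the iterator has produced so far.
    public static int Generated;

    public static IEnumerable<int> MyInts()
    {
        for (int i = 0; i < 5; i++)
        {
            Generated++;
            yield return i;
        }
    }

    public static void Main()
    {
        Generated = 0;
        var myList = new List<int>();
        myList.AddRange(MyInts());      // drains the whole iterator
        Console.WriteLine(Generated);   // 5

        Generated = 0;
        int first = MyInts().First();   // stops after a single value
        Console.WriteLine(first);       // 0
        Console.WriteLine(Generated);   // 1
    }
}
```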
Short answer: When you call AddRange, it will internally iterate every item in your IEnumerable and add to the list.
If you did something like this:
var myList = new List<int>();
myList.AddRange(MyInts());
foreach (int i in myList)
{
Console.Write(i);
}
Then your values would be iterated twice, from the start to the end:
Once when adding to your list
Then in your foreach loop
Playing a bit
Now, let's suppose you created your own extension method for AddRange like this:
public static IEnumerable<T> AddRangeLazily<T>(this ICollection<T> col, IEnumerable<T> values)
{
foreach (T i in values)
{
yield return i; // first we yield
col.Add(i); // then we add
}
}
Then you could use it like this:
foreach (int i in myList.AddRangeLazily(MyInts()))
{
Console.Write(i);
}
...and the values would pass through two consumers as well, but without going from the start to the end both times. Each value is handed to your loop body and then lazily added to the list/collection, so you can do something else (like printing it to output) as every item goes by.
If you had some sort of logic to stop adding to the list in the middle of the operation, this could be helpful.
The downside of AddRangeLazily is that values are only added to the collection once you iterate over its result, as in my code sample. If you just do this:
var someList = new List<int>();
someList.AddRangeLazily(MyInts());
if (someList.Any())
// it wouldn't enter here...
...it won't add any values at all. If you want that behaviour, use AddRange. Forcing iteration over the AddRangeLazily result would work, though:
var someList = new List<int>();
someList.AddRangeLazily(MyInts());
if (someList.AddRangeLazily(MyInts()).Count() > 0)
// it would enter here; Count() enumerates everything, thus adding all values to someList
...however, depending on how lazy the method you are calling is, it won't iterate everything. For example:
var someList = new List<int>();
someList.AddRangeLazily(MyInts());
if (someList.AddRangeLazily(MyInts()).Any())
// it would enter here, but nothing is added to someList: the iterator pauses at the first yield return, before col.Add runs
Since Any() is true as soon as any item exists, it needs only one iteration to return true, so only the first item is ever requested. Because AddRangeLazily yields before it adds, that first item is handed to Any() but never reaches the list; swapping the two statements in the loop would add exactly one item.
I actually don't remember having to do something like this, it was just to play around with yield.
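The three cases can be checked side by side. This is a sketch that copies the AddRangeLazily method from above (yield first, then add) and feeds it a small array instead of MyInts(); the list names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyAddDemo
{
    // Same shape as the extension method above: yield first, then add.
    public static IEnumerable<T> AddRangeLazily<T>(this ICollection<T> col, IEnumerable<T> values)
    {
        foreach (T i in values)
        {
            yield return i;
            col.Add(i);
        }
    }

    public static void Main()
    {
        // Case 1: the result is never enumerated, so nothing is added.
        var ignored = new List<int>();
        ignored.AddRangeLazily(new[] { 1, 2, 3 });
        Console.WriteLine(ignored.Count);            // 0

        // Case 2: Count() enumerates to the end, so every item is added.
        var counted = new List<int>();
        int n = counted.AddRangeLazily(new[] { 1, 2, 3 }).Count();
        Console.WriteLine(n + " " + counted.Count);  // 3 3

        // Case 3: Any() stops at the first yield, before the first Add runs.
        var probed = new List<int>();
        bool any = probed.AddRangeLazily(new[] { 1, 2, 3 }).Any();
        Console.WriteLine(any + " " + probed.Count); // True 0
    }
}
```

The third case shows that the order of yield return and col.Add matters: an item is only added when the consumer asks for the next one.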
Interesting question.
The behavior is different if the enumerable is a class that implements ICollection<T>, such as another list or an array, but let's say it doesn't since your example doesn't. In that case AddRange() simply uses the enumerator to insert items into the list one at a time.
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
Insert(index++, en.Current);
}
}
If the collection implements ICollection<T>, then AddRange first expands the list and then does a block copy.
If you want to see the code yourself:
https://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,51decd510e5bfe6e

Is it the same to iterate over Linq expression result than to assign it first to a variable?

This is difficult to explain in words, so I will use code examples.
Let's suppose I already have a list of clients that I want to filter.
Basically I want to know if this:
foreach(var client in list.Where(c=>c.Age > 20))
{
//Do something
}
is the same as this:
var filteredClients = list.Where(c=>c.Age > 20);
foreach(var client in filteredClients)
{
//Do something
}
I've been told that the first approach executes the .Where() in every iteration.
I'm sorry if this is a duplicate; I couldn't find any related question.
Thanks in advance.
Yes, both those examples are functionally identical. One just stores the result from Enumerable.Where in a variable before accessing it while the other just accesses it directly.
To really see why this will not make a difference, you have to understand what a foreach loop essentially does. The code in your examples (both of them) is basically equivalent to this (I’ve assumed a known type Client here):
IEnumerable<Client> x = list.Where(c=>c.Age > 20);
// foreach loop
IEnumerator<Client> enumerator = x.GetEnumerator();
while (enumerator.MoveNext())
{
Client client = enumerator.Current;
// Do something
}
So what actually happens here is the IEnumerable result from the LINQ method is not consumed directly, but an enumerator of it is requested first. And then the foreach loop does nothing else than repeatedly asking for a new object from the enumerator and processing the current element in each loop body.
Looking at this, it doesn’t matter whether the x in the above code is really an x (i.e. a previously stored variable) or the list.Where() call itself. Only the enumerator object, which is created just once, is used in the loop.
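One way to check this claim is to count how often the predicate runs in each form. The sketch below uses a hypothetical list of ages instead of Client objects; in both forms the predicate is invoked once per element, not once per element per iteration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForeachDemo
{
    // Returns how many times the Where predicate was invoked during one loop.
    public static int CountPredicateCalls(bool useVariable)
    {
        var ages = new List<int> { 10, 25, 30, 18 };
        int calls = 0;
        Func<int, bool> olderThan20 = age => { calls++; return age > 20; };

        if (useVariable)
        {
            IEnumerable<int> filtered = ages.Where(olderThan20);
            foreach (int age in filtered) { }
        }
        else
        {
            foreach (int age in ages.Where(olderThan20)) { }
        }
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountPredicateCalls(true));  // 4
        Console.WriteLine(CountPredicateCalls(false)); // 4
    }
}
```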
Now to cover that SharePoint example which Colin posted. It looks like this:
SPList activeList = SPContext.Current.List;
for (int i=0; i < activeList.Items.Count; i++)
{
SPListItem listItem = activeList.Items[i];
// do stuff
}
This is a fundamentally different thing though. Since this is not using a foreach loop, we do not get that one enumerator object which we use to iterate through the list. Instead, we repeatedly access activeList.Items: Once in the loop body to get an item by index, and once in the continuation condition of the for loop where we get the collection’s Count property value.
Unfortunately, Microsoft does not follow its own guidelines all the time, so even if Items is a property on the SPList object, it actually is creating a new SPListItemCollection object every time. And that object is empty by default and will only lazily load the actual items when you first access an item from it. So above code will eventually create a large amount of SPListItemCollections which will each fetch the items from the database. This behavior is also mentioned in the remarks section of the property documentation.
This generally violates Microsoft’s own guidelines on choosing a property vs a method:
Do use a method, rather than a property, in the following situations.
The operation returns a different result each time it is called, even if the parameters do not change.
Note that if we used a foreach loop for that SharePoint example again, then everything would have been fine, since we would have again only requested a single SPListItemCollection and created a single enumerator for it:
foreach (SPListItem listItem in activeList.Items.Cast<SPListItem>())
{ … }
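The difference between the two loop styles can be imitated without SharePoint. The following is only a sketch: FakeList is a hypothetical class whose Items property builds a new collection on every read, standing in for the behaviour described above, and Loads counts how often that happens.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class that mimics the behaviour described above:
// every read of Items builds a brand-new collection.
public class FakeList
{
    public int Loads { get; private set; }

    public List<string> Items
    {
        get
        {
            Loads++; // stands in for a database round trip
            return new List<string> { "a", "b", "c" };
        }
    }
}

public static class PropertyDemo
{
    public static void Main()
    {
        // for loop: Items is read in the condition and again in the body.
        var viaFor = new FakeList();
        for (int i = 0; i < viaFor.Items.Count; i++)
        {
            string item = viaFor.Items[i];
        }
        Console.WriteLine(viaFor.Loads); // 7 (4 condition checks + 3 body reads)

        // foreach loop: Items is read once, then only the enumerator is used.
        var viaForeach = new FakeList();
        foreach (string item in viaForeach.Items) { }
        Console.WriteLine(viaForeach.Loads); // 1
    }
}
```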
They are not quite the same:
Here is the original C# code:
static void ForWithVariable(IEnumerable<Person> clients)
{
var adults = clients.Where(x => x.Age > 20);
foreach (var client in adults)
{
Console.WriteLine(client.Age.ToString());
}
}
static void ForWithoutVariable(IEnumerable<Person> clients)
{
foreach (var client in clients.Where(x => x.Age > 20))
{
Console.WriteLine(client.Age.ToString());
}
}
Here is the code as decompiled by ILSpy from the compiled Intermediate Language (IL):
private static void ForWithVariable(IEnumerable<Person> clients)
{
Func<Person, bool> arg_21_1;
if ((arg_21_1 = Program.<>c.<>9__1_0) == null)
{
arg_21_1 = (Program.<>c.<>9__1_0 = new Func<Person, bool>(Program.<>c.<>9.<ForWithVariable>b__1_0));
}
IEnumerable<Person> enumerable = clients.Where(arg_21_1);
foreach (Person current in enumerable)
{
Console.WriteLine(current.Age.ToString());
}
}
private static void ForWithoutVariable(IEnumerable<Person> clients)
{
Func<Person, bool> arg_22_1;
if ((arg_22_1 = Program.<>c.<>9__2_0) == null)
{
arg_22_1 = (Program.<>c.<>9__2_0 = new Func<Person, bool>(Program.<>c.<>9.<ForWithoutVariable>b__2_0));
}
foreach (Person current in clients.Where(arg_22_1))
{
Console.WriteLine(current.Age.ToString());
}
}
As you can see, the only difference is that the first version stores the query in a local variable:
IEnumerable<Person> enumerable = clients.Where(arg_21_1);
A more practical question, however, is whether the differences hurt performance. I concocted a test to measure that.
class Program
{
public static void Main()
{
Measure(ForEachWithVariable);
Measure(ForEachWithoutVariable);
Console.ReadKey();
}
static void Measure(Action<List<Person>, List<Person>> action)
{
var clients = new[]
{
new Person { Age = 10 },
new Person { Age = 20 },
new Person { Age = 30 },
}.ToList();
var adultClients = new List<Person>();
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < 1E6; i++)
action(clients, adultClients);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds.ToString());
Console.WriteLine($"{adultClients.Count} adult clients found");
}
static void ForEachWithVariable(List<Person> clients, List<Person> adultClients)
{
var adults = clients.Where(x => x.Age > 20);
foreach (var client in adults)
adultClients.Add(client);
}
static void ForEachWithoutVariable(List<Person> clients, List<Person> adultClients)
{
foreach (var client in clients.Where(x => x.Age > 20))
adultClients.Add(client);
}
}
class Person
{
public int Age { get; set; }
}
After several runs of the program, I was not able to find any significant difference between ForEachWithVariable and ForEachWithoutVariable. They were always close in time, and neither was consistently faster than the other. Interestingly, if I change 1E6 to just 1000, the ForEachWithVariable is actually consistently slower, by about 1 millisecond.
So, I conclude that for LINQ to Objects, there is no practical difference. The same type of test could be run if your particular use case involves LINQ to Entities (or SharePoint).

Elegantly refactoring code like this (to avoid a flag)

I have a function running over an enumerable, but the function should be a little bit different for the first item, for example:
void start() {
List<string> a = ...
a.ForEach(DoWork);
}
bool isFirst = true;
private void DoWork(string s) {
// do something
if(isFirst) {
isFirst = false;
print("first stuff");
}
// do something
}
How would you refactor this to avoid that ugly flag?
Expanding on Jimmy Hoffa's answer: if you actually want to do something with the first item, you could do this.
DoFirstWork(a[0]);
a.Skip(1).ToList().ForEach(DoWork);
If the point is that it is separate in logic from the rest of the list then you should use a separate function.
It might be a bit heavy handed, but I pulled this from another SO question a while back.
public static void IterateWithSpecialFirst<T>(this IEnumerable<T> source,
Action<T> firstAction,
Action<T> subsequentActions)
{
using (IEnumerator<T> iterator = source.GetEnumerator())
{
if (iterator.MoveNext())
{
firstAction(iterator.Current);
}
while (iterator.MoveNext())
{
subsequentActions(iterator.Current);
}
}
}
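A possible usage of that extension method, as a sketch: the method is repeated so the example compiles on its own, and the Run helper and its log list are hypothetical names added for the demonstration.

```csharp
using System;
using System.Collections.Generic;

public static class FirstItemExtensions
{
    // Copy of the extension method shown above.
    public static void IterateWithSpecialFirst<T>(this IEnumerable<T> source,
        Action<T> firstAction,
        Action<T> subsequentActions)
    {
        using (IEnumerator<T> iterator = source.GetEnumerator())
        {
            if (iterator.MoveNext())
            {
                firstAction(iterator.Current);
            }
            while (iterator.MoveNext())
            {
                subsequentActions(iterator.Current);
            }
        }
    }
}

public static class FirstItemDemo
{
    // Hypothetical helper: records which action handled each element.
    public static List<string> Run(IEnumerable<string> input)
    {
        var log = new List<string>();
        input.IterateWithSpecialFirst(
            s => log.Add("first:" + s),
            s => log.Add("rest:" + s));
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Run(new[] { "a", "b", "c" }))); // first:a rest:b rest:c
        Console.WriteLine(Run(new string[0]).Count);                        // 0
    }
}
```

Because the method works on the enumerator directly, it needs no flag and also handles an empty sequence without special casing.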
Check out Jon Skeet's smart enumerations.
They are part of his Miscellaneous Utility Library
EDIT: added usage example, added a ForFirst method, reordered my paragraphs.
Below is a complete solution.
Usage is either of the following:
list.ForFirst(DoWorkForFirst).ForRemainder(DoWork);
// or
list.ForNext(1, DoWorkForFirst).ForRemainder(DoWork);
The crux is the ForNext method, which performs an action for the specified next set of items from the collection and returns the remaining items. I've also implemented a ForFirst method that simply calls ForNext with count: 1.
class Program
{
static void Main(string[] args)
{
List<string> list = new List<string>();
// ...
list.ForNext(1, DoWorkForFirst).ForRemainder(DoWork);
}
static void DoWorkForFirst(string s)
{
// do work for first item
}
static void DoWork(string s)
{
// do work for remaining items
}
}
public static class EnumerableExtensions
{
public static IEnumerable<T> ForFirst<T>(this IEnumerable<T> enumerable, Action<T> action)
{
return enumerable.ForNext(1, action);
}
public static IEnumerable<T> ForNext<T>(this IEnumerable<T> enumerable, int count, Action<T> action)
{
if (enumerable == null)
throw new ArgumentNullException("enumerable");
using (var enumerator = enumerable.GetEnumerator())
{
// perform the action for the first <count> items of the collection
while (count > 0)
{
if (!enumerator.MoveNext())
throw new ArgumentOutOfRangeException(string.Format(System.Globalization.CultureInfo.InvariantCulture, "Unexpected end of collection reached. Expected {0} more items in the collection.", count));
action(enumerator.Current);
count--;
}
// return the remainder of the collection via an iterator
while (enumerator.MoveNext())
{
yield return enumerator.Current;
}
}
}
public static void ForRemainder<T>(this IEnumerable<T> enumerable, Action<T> action)
{
if (enumerable == null)
throw new ArgumentNullException("enumerable");
foreach (var item in enumerable)
{
action(item);
}
}
}
I felt a bit ridiculous making the ForRemainder method; I could swear that I was re-implementing a built-in function with that, but it wasn't coming to mind and I couldn't find an equivalent after glancing around a bit. UPDATE: After reading the other answers, I see there apparently isn't an equivalent built into Linq. I don't feel so bad now.
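One property of ForNext is worth knowing: since it contains yield return, the whole method is an iterator, so the action for the first items (and the null check) only runs once the returned sequence is enumerated. The sketch below uses a simplified ForNext without the exceptions to show this; it is an illustration, not a replacement for the version above.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredForNext
{
    // Simplified version of ForNext from above (no argument checks, no throw).
    public static IEnumerable<T> ForNext<T>(this IEnumerable<T> enumerable, int count, Action<T> action)
    {
        using (var enumerator = enumerable.GetEnumerator())
        {
            while (count > 0 && enumerator.MoveNext())
            {
                action(enumerator.Current);
                count--;
            }
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
            }
        }
    }

    public static void Main()
    {
        var log = new List<string>();
        var items = new[] { "a", "b", "c" };

        // Calling ForNext alone does nothing yet.
        IEnumerable<string> rest = items.ForNext(1, s => log.Add("first:" + s));
        Console.WriteLine(log.Count); // 0

        // Enumerating the remainder triggers the first-item action as well.
        foreach (string s in rest)
            log.Add("rest:" + s);
        Console.WriteLine(string.Join(" ", log)); // first:a rest:b rest:c
    }
}
```

In the usage shown earlier this is harmless, because ForRemainder always enumerates the result; a call such as list.ForFirst(DoWorkForFirst) on its own would silently do nothing.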
using System.Linq; // reference to System.Core.dll
List<string> list = ..
list.Skip(1).ToList().ForEach(DoWork); // List<T>.ForEach() needs a List<T>, hence ToList()
but I recommend writing your own:
public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
{
foreach(T item in collection)
action(item);
}
Then you can simply write:
list.Skip(1).ForEach(DoWork);
It's hard to say what the "best" way to handle the first element differently is without knowing why it needs to be handled differently.
If you're feeding the elements of the sequence into the framework's ForEach method, you can't elegantly provide the Action delegate the information necessary for it to determine the element parameter's position in the source sequence, so I think an extra step is necessary. If you don't need to do anything with the sequence after you loop through it, you could always use a Queue (or Stack), pass the first element to whatever handler you're using through a Dequeue() (or Pop()) method call, and then you have the leftover "homogeneous" sequence.
It might seem rudimentary with all the shiny Linq stuff available, but there's always the old-fashioned for loop.
var yourList = new List<int>{1,1,2,3,5,8,13,21};
for(int i = 0; i < yourList.Count; i++)
{
if(i == 0)
DoFirstElementStuff(yourList[i]);
else
DoNonFirstElementStuff(yourList[i]);
}
This would be fine if you don't want to alter yourList inside the loop. Else, you'll probably need to use the iterator explicitly. At that point, you have to wonder if that's really worth it just to get rid of an IsFirst flag.
Depends on how you're "handling it differently". If you need to do something completely different, then I'd recommend handling the first element outside the loop. If you need to do something in addition to the regular element processing, then consider having a check for the result of the additional processing. It's probably easier to understand in code, so here's some:
string randomState = null; // My alma mater!
foreach(var ele in someEnumerable) {
if(randomState == null) randomState = setState(ele);
// handle additional processing here.
}
This way, your "flag" is really an external variable you (presumably) need anyway, so you're not creating a dedicated variable. You can also wrap it in an if/else if you don't want to process the first element like the rest of the enumeration.

IQueryable remove from the collection, best way?

IQueryable<SomeType> collection = GetCollection();
foreach (var c in collection)
{
//do some complex checking that can't be embedded in a query
//based on results from prev line we want to discard the 'c' object
}
//here I only want the results of collection - the discarded objects
So with that simple code, what is the best way to get the results? Should I create a List just before the foreach and insert the objects I want to keep, or is there some other, better way to do this type of thing?
I know there are other posts on similar topics but I just don't feel I'm getting what I need out of them.
Edit I tried this
var collection = GetCollection().Where(s =>
{
if (s.property == 1)
{
int num= Number(s);
double avg = Avg(s.x);
if (num > avg)
return true;
else
return false;
}
else return false;
});
I tried this but was given "A lambda expression with a statement body cannot be converted to an expression tree" on compile. Did I not do something right?
//do some complex checking that can't be embedded in a query
I don't get it. You can pass a delegate which can point to a very complex function (Turing-complete) that checks whether you should discard it or not:
var result = GetCollection().AsEnumerable().Where(c => {
// ...
// process "c"
// return true if you want it in the collection
});
If you want, you can refactor it in another function:
var result = GetCollection().Where(FunctionThatChecksToDiscardOrNot);
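The compile error from the question's edit arises because Where on an IQueryable<T> expects an expression tree, and a lambda with a statement body cannot be converted to one. After AsEnumerable() the call binds to the delegate-based Enumerable.Where, which accepts any lambda. A minimal sketch, assuming an in-memory list wrapped with AsQueryable() as a hypothetical stand-in for GetCollection() and a made-up filter condition:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterDemo
{
    // Hypothetical stand-in for GetCollection(); a real query provider
    // would translate the query for a database or similar source.
    public static IQueryable<int> GetCollection()
    {
        return new List<int> { 1, 2, 3, 4, 5, 6 }.AsQueryable();
    }

    public static List<int> KeepComplex()
    {
        return GetCollection()
            .AsEnumerable() // from here on, LINQ to Objects with plain delegates
            .Where(n =>
            {
                // A statement body is fine for a delegate.
                int doubled = n * 2;
                return doubled > 6 && n % 2 == 0;
            })
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", KeepComplex())); // 4,6
    }
}
```

The trade-off is that everything after AsEnumerable() runs in memory, so any filtering that the provider can translate is best applied before that call.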
If you wrap it into another method, you can use yield return and then iterate over the returned collection, like so:
public IEnumerable<SomeType> FindResults(IQueryable<SomeType> collection) {
foreach (var c in collection)
{
if (doComplicatedQuery(c)) {
yield return c;
}
}
}
// elsewhere
foreach (var goodItem in FindResults(GetCollection())) {
// do stuff.
}
