Weird behaviour in enumerators in LINQ - c#

According to the Microsoft documentation, an enumerator should throw InvalidOperationException if the underlying source is modified after the enumerator was created. This works when I get the enumerator directly from the IEnumerable.
THE PROBLEM: if I acquire the enumerator from a "query data structure", then modify the source and then call MoveNext(), nothing is thrown (see the code).
Consider the following code:
public static void Main(string[] args)
{
var src = new List<int>() { 1, 2, 3, 4 };
var q = src.Where(i => i % 2 == 1);
IEnumerable<int> nl = src;
var enmLinq = q.GetEnumerator();
var enmNonLinq = nl.GetEnumerator();
src.Add(5); //both enumerators should be invalid, as underlying data source changed
try
{
//throws as expected
enmNonLinq.MoveNext();
}
catch (InvalidOperationException)
{
Console.WriteLine("non LINQ enumerator threw...");
}
try
{
//DOES NOT throw as expected
enmLinq.MoveNext();
}
catch (InvalidOperationException)
{
Console.WriteLine("enumerator from LINQ threw...");
}
//It seems that if we want enmLinq to throw exception as expected:
//we must at least once call MoveNext on it (before modification)
enmLinq.MoveNext();
src.Add(6);
enmLinq.MoveNext(); // now it throws as it should
}
It seems you must first call MoveNext() to make it notice the change to the underlying source.
Why I think this is happening:
I think this is because the "query structure" gives you an overly lazy enumerator, which is initialized during the first call to MoveNext() instead of on GetEnumerator().
By initialization I mean connecting all of the enumerators (from the WhereEnumerable, SelectEnumerable etc. structures returned by LINQ methods) on the way down to the real underlying data structure.
QUESTION:
Am I right about this or am I missing something?
Do you consider it as weird/wrong behaviour?

You are correct.
The LINQ query will not call GetEnumerator on the underlying List<T> until you call MoveNext on the IEnumerable<T> returned by Where.
You can see in the reference source that MoveNext is implemented like so:
public override bool MoveNext()
{
    switch (state)
    {
        case 1:
            enumerator = source.GetEnumerator();
            state = 2;
            goto case 2;
        case 2:
            while (enumerator.MoveNext())
            {
                TSource item = enumerator.Current;
                if (predicate(item))
                {
                    current = item;
                    return true;
                }
            }
            Dispose();
            break;
    }
    return false;
}
In the 'initial' state (state 1), it will first call GetEnumerator on the source before moving to state 2.

The documentation only states execution is deferred until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
Since it lacks further detail, the queries performed by LINQ may call GetEnumerator on their source either on the first call to their own GetEnumerator or as late as possible, such as the first call to MoveNext.
I wouldn't assume any particular behavior.
In practice, the actual implementation (see Enumerable.WhereEnumerableIterator<TSource> in the reference source) defers execution to the first call to MoveNext.
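The same deferral can be observed with a hand-written iterator. The sketch below is not the LINQ implementation: MyWhere, the log list and the Run helper are hypothetical names used only for illustration. It relies on the fact that a C# iterator method (one that uses yield return) runs none of its body on GetEnumerator; the body, including the foreach that obtains the list's enumerator, starts on the first MoveNext.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredIteratorDemo
{
    // Hypothetical stand-in for Where, written as a compiler-generated iterator.
    public static IEnumerable<int> MyWhere(List<int> source, Func<int, bool> predicate, List<string> log)
    {
        log.Add("body started");      // runs on the first MoveNext, not on GetEnumerator
        foreach (var item in source)  // the list's own enumerator is created here
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    // Reports how many log entries exist right after GetEnumerator, and whether
    // MoveNext threw before and after the enumerator was first advanced.
    public static (int LogCountAfterGetEnumerator, bool ThrewBeforeFirstMove, bool ThrewAfterFirstMove) Run()
    {
        var src = new List<int> { 1, 2, 3, 4 };
        var log = new List<string>();

        using (IEnumerator<int> e = MyWhere(src, i => i % 2 == 1, log).GetEnumerator())
        {
            int logCount = log.Count;             // nothing has executed yet

            src.Add(5);                           // modified before the first MoveNext
            bool threwBefore = MoveNextThrows(e); // the list enumerator is only created now

            src.Add(6);                           // modified after the list enumerator exists
            bool threwAfter = MoveNextThrows(e);

            return (logCount, threwBefore, threwAfter);
        }
    }

    private static bool MoveNextThrows(IEnumerator<int> e)
    {
        try
        {
            e.MoveNext();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Calling DeferredIteratorDemo.Run() should report that the log is empty after GetEnumerator, that the first MoveNext does not throw even though the list was modified beforehand, and that a modification made after the first MoveNext does cause the next call to throw.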

enmLinq is not actualized until the first MoveNext call, so any modification made to src before that call does not affect the validity of enmLinq. Once you call MoveNext on enmLinq, the underlying enumerator is created, and any later change to src makes the subsequent MoveNext call throw.

You can test this yourself.
public static void Main(string[] args)
{
var src = new List<int>() { 1, 2, 3, 4 };
var q = src.Where(i =>
{
Output();
return i % 2 == 1;
}
);
IEnumerable<int> nl = src;
var enmLinq = q.GetEnumerator();
var enmNonLinq = nl.GetEnumerator();
src.Add(5); //both enumerators should be invalid, as underlying data source changed
try
{
//throws as expected
enmNonLinq.MoveNext();
}
catch (InvalidOperationException)
{
Console.WriteLine("non LINQ enumerator threw...");
}
try
{
//DOES NOT throw as expected
// Output() is called now.
enmLinq.MoveNext();
}
catch (InvalidOperationException)
{
Console.WriteLine("enumerator from LINQ threw...");
}
//It seems that if we want enmLinq to throw exception as expected:
//we must at least once call MoveNext on it (before modification)
enmLinq.MoveNext();
src.Add(6);
enmLinq.MoveNext(); // now it throws as it should
}
public static void Output()
{
Console.WriteLine("Test");
}
When you run the program, you will see that "Test" is not output to the console until after you call your first MoveNext, which occurs after the source was initially modified.

Related

try catch skipping exception

try
{
return strngarray.Select(strngarrayelem =>
{
string[] data = strngarrayelem .Split(',');
return new xyzClass(data[1], data[2], data[0], (Color)System.Windows.Media.ColorConverter.ConvertFromString(data[3]), data.Length > 4 ? data[4] : "N/A");
});
}
catch (Exception ex)
{
MessageBox.Show("abc");
return Enumerable.Empty<xyzClass>();
}
I am getting a FormatException in
(Color)System.Windows.Media.ColorConverter.ConvertFromString(data[3])
I try to catch it with try-catch, but the exception is still caught by the app-level try-catch and not by my local one.
Why is my try-catch not catching the error?
You are just returning a LINQ query; it has not been executed yet (as it would be by ToList, for example).
So if you want to catch the exception here you should consider materializing it to a collection in this method. You could still return IEnumerable<xyzClass> since List<xyzClass> implements that interface.
try
{
return strngarray.Select(strngarrayelem =>
{
string[] data = strngarrayelem .Split(',');
return new xyzClass(data[1], data[2], data[0], (Color)System.Windows.Media.ColorConverter.ConvertFromString(data[3]), data.Length > 4 ? data[4] : "N/A");
}).ToList(); // <------- HERE !!!
}
catch (Exception ex)
{
MessageBox.Show("abc");
return Enumerable.Empty<xyzClass>();
}
If you are not sure whether a method just returns a query, search its MSDN documentation for the keyword "deferred". For example, Enumerable.Select:
This method is implemented by using deferred execution. The
immediate return value is an object that stores all the information
that is required to perform the action. The query represented by this
method is not executed until the object is enumerated either by
calling its GetEnumerator method directly or by using
foreach
Methods such as Enumerable.ToList or ToArray call GetEnumerator and enumerate the result, so they execute the query. MSDN:
The ToList<TSource>(IEnumerable<TSource>) method forces immediate
query evaluation and returns a List<T> that contains the query
results. You can append this method to your query in order to obtain a
cached copy of the query results.
ToArray<TSource> has similar behavior but returns an array instead of
a List<T>.
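To make the difference concrete, here is a minimal sketch using int.Parse in place of the color conversion from the question. ParseLazy and ParseEager are hypothetical names; the point is only where the FormatException surfaces. In the lazy version the try/catch has already been left by the time anyone enumerates the query, so it cannot catch anything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExceptionDemo
{
    // Builds the query inside try/catch but never enumerates it there,
    // so this catch block cannot observe a FormatException from int.Parse.
    public static IEnumerable<int> ParseLazy(string[] items)
    {
        try
        {
            return items.Select(s => int.Parse(s));
        }
        catch (FormatException)
        {
            return Enumerable.Empty<int>();
        }
    }

    // ToList enumerates the query inside the try block,
    // so a FormatException is caught here.
    public static IEnumerable<int> ParseEager(string[] items)
    {
        try
        {
            return items.Select(s => int.Parse(s)).ToList();
        }
        catch (FormatException)
        {
            return Enumerable.Empty<int>();
        }
    }
}
```

With an input such as new[] { "1", "x" }, ParseLazy returns normally and the exception is thrown later, wherever the caller enumerates the result. ParseEager returns the empty fallback sequence instead.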

LINQ unexpected behavior when returning IEnumerable and calling ToArray

I noticed some weird behavior in LINQ-code, and reduced the problem to the following minimal example with two methods:
IA Find(string n)
{
    IA result;
    if (!_dictionary.TryGetValue(n, out result))
    {
        throw new Exception();
    }
    return result;
}
IEnumerable<IA> Find(IEnumerable<string> names)
{
    return names.Select(Find).ToArray();
}
This works as expected.
Now, I remove the .ToArray() so the method looks as follows:
IEnumerable<IA> Find(IEnumerable<string> names)
{
return names.Select(Find);
}
This change will cause the exception not to be thrown, even if some of the names are not found in _dictionary, but are present in the names parameter.
What causes this (to me) unexpected behavior of LINQ?
It's because of deferred execution. The LINQ query is not evaluated until you enumerate it.
The call to ToArray() causes a full enumeration of the IEnumerable and thus causes the exception to occur.
The second method does not enumerate the IEnumerable, and execution is deferred until the caller needs it.
If you were to enumerate the result of Find e.g.
var result = Find(new[] { "name" }).ToList();
or
foreach (var found in Find(new[] { "name" }))
{
...
}
then the exception would occur.

Anonymous type scoping issue

What is the proper way to create a variable that will house a list of anonymous objects that are generated through a LINQ query while keeping the variable declaration outside of a try/catch and the assignment being handled inside of a try/catch?
At the moment I'm declaring the variable as IEnumerable<object>, but this causes some issues down the road when I'm trying to use it later...
i.e.
var variableDeclaration;
try{
...
assignment
...
}catch...
EDIT:
If it's relevant (don't think it is) the list of objects is being returned as a Json result from an MVC3 action. I'm trying to reduce the time that some using statements are open with the DB as I'm having some performance issues that I'm trying to clear up a bit. In doing some of my testing I came across this issue and can't seem to find info on it.
EDIT 2:
If I could request the avoidance of focusing on LINQ. While LINQ is used the question is more specific to the scoping issues associated with Anonymous objects. Not the fact that LINQ is used (in this case) to generate them.
Also, a couple of answers have mentioned the use of dynamic while this will compile it doesn't allow for the usages that I'm needing later on the method. If what I'm wanting to do isn't possible then at the moment the answer appears to be to create a new class with the definition that I'm needing and to use that.
It's possible to get around this by creating a generic Cast method as outlined by Jon Skeet here. It will work and give you the intellisense you want. But, at this point, what's wrong with creating a custom type for your linq method?
public class MyClass
{
public int MyInt { get; set; }
}
IEnumerable<MyClass> myClass =
//Some Linq query that returns a collection of MyClass
Well, if you're using LINQ, the query is not evaluated unless materialized...
So, you might be able to:
var myQuery = //blah
try
{
myQuery = myQuery.ToList(); //or other materializing call
}
catch
{
}
Could you perhaps get away with using dynamic ??
dynamic variableDeclaration;
try
{
variableDeclaration = SomeList.Where(This => This == That);
}
catch { }
Not sure what this will affect further in your code block, but just a thought :)
If you declare the variable before assigning it, as with a try/catch, you can't use var as it is intended. Instead you have to give the variable an explicit type.
var x = 0;
try{
x = SomethingReturningAnInt();
}
or
int x;
try{
x = SomethingReturningAnInt();
}
However in your case you don't really "know" what the method returns
var x = ...;
try{
x = Something();
}
catch{}
won't work
One option when you don't know the type in advance is to use dynamic:
dynamic x;
try{
x = Something();
}
catch{}
(But that feels like going back to VB4)
Another cheat: you can define the variable locally (similarly to Jon's hack in Dave Zych's answer) and then use it inside try/catch. As long as you can create the same anonymous item type before the try-catch you are OK (anonymous types with the same field names and types, in the same order, are considered the same):
var myAnonymouslyType = Enumerable.Repeat(
new {Field1 = (int)1, Field2 = (string)"fake"}, 0);
try
{
myAnonymouslyType = ...(item =>
new {Field1 = item.Id, Field2=item.Text})...
}
...
This is a safer option than Jon's casting of anonymous types between functions, because the compiler will immediately find errors if the types don't match.
Note: I'd vote for non-anonymous type if you have to go this way...
Note 2: depending on your actual need consider simply returning data from inside try/catch and having second return of default information outside.
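A complete version of the "declare by example" trick might look like the following. This is a sketch under assumptions: the source array, the Run method and the simulated failure are made up for illustration. The variable is declared outside the try/catch from an empty sequence of the same anonymous shape, so the assignment inside the try block type-checks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnonymousScopeDemo
{
    // Returns "Id:Text" strings so the result is observable outside the method,
    // where the anonymous type itself cannot be named.
    public static List<string> Run(bool fail)
    {
        var source = new[] { "one", "two" };

        // Declared outside try/catch: an empty list whose element type is the
        // anonymous type { int Id, string Text }.
        var rows = Enumerable.Empty<string>()
            .Select(s => new { Id = 0, Text = s })
            .ToList();

        try
        {
            if (fail)
            {
                throw new InvalidOperationException("simulated failure");
            }

            // Same property names, types and order, so the compiler treats
            // this as the same anonymous type and the assignment is allowed.
            rows = source.Select((text, index) => new { Id = index, Text = text }).ToList();
        }
        catch (InvalidOperationException)
        {
            // rows keeps its empty default
        }

        // Members are still strongly typed after the try/catch.
        return rows.Select(r => r.Id + ":" + r.Text).ToList();
    }
}
```

If the property names, types or order differed between the two anonymous object initializers, the assignment inside the try block would fail to compile, which is the safety property mentioned above.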
This has vexed me for a while. In the end I've built some generic helper methods where I can pass in the code that generates the anonymous objects, and the catch code, as lambdas, as follows:
public static class TryCatch
{
public static T Expression<T>(Func<T> lamda, Action<Exception> onException)
{
try
{
return lamda();
}
catch(Exception e)
{
onException(e);
return default(T);
}
}
}
//and example
Exception throwexception = null;
var results = TryCatch.Expression(
//TRY
() =>
{
//simulate exception happening sometimes.
if (new Random().Next(3) == 2)
{
throw new Exception("test this");
}
//return an anonymous object
return new { a = 1, b = 2 };
} ,
//CATCH
(e) => { throwexception = e;
//rethrow if you wish
//throw e;
}
);
https://gist.github.com/klumsy/6287279

Does LINQ to objects stop processing Any() when condition is true?

Consider the following:
bool invalidChildren = this.Children.Any(c => !c.IsValid());
This class has a collection of child objects that have an IsValid() method. Suppose that the IsValid() method is a processor intensive task. After encountering the first child object where IsValid() is false, theoretically processing can stop because the result can never become true. Does LINQ to objects actually stop evaluating after the first IsValid() = false (like a logical AND) or does it continue evaluating all child objects?
Obviously I could just put this in a foreach loop and break on the first invalid result, but I was just wondering if LINQ to objects is smart enough to do this as well.
EDIT:
Thanks for the answers, for some reason I didn't think to look it up on MSDN myself.
Yes it does. As soon as it finds a match, the criterion is satisfied and it stops. All is similar: it checks items in order, but if one doesn't match it ends immediately as well.
Exists works in the same manner too.
Any
The enumeration of source is stopped as soon as the result can be determined.
Exists
The elements of the current List are individually passed to the Predicate delegate, and processing is stopped when a match is found.
All
The enumeration of source is stopped as soon as the result can be determined.
etc...
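The early exit for both Any and All can be checked by counting predicate calls. In this sketch the class and method names are hypothetical; each method just wraps the predicate in a counter and reports how many elements were actually tested.

```csharp
using System;
using System.Linq;

public static class ShortCircuitDemo
{
    // Number of predicate calls Any makes before it can decide the result.
    public static int CountCallsForAny(int[] values, Func<int, bool> predicate)
    {
        int calls = 0;
        values.Any(v => { calls++; return predicate(v); });
        return calls;
    }

    // Number of predicate calls All makes before it can decide the result.
    public static int CountCallsForAll(int[] values, Func<int, bool> predicate)
    {
        int calls = 0;
        values.All(v => { calls++; return predicate(v); });
        return calls;
    }
}
```

For the array { 2, 4, 5, 6, 8 }, Any with an "is odd" predicate and All with an "is even" predicate both stop at the third element, since 5 decides the answer in each case. When no element decides the result early, every element is tested.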
Yes, it stops as soon as the results can be evaluated. Here's a quick proof:
class Program
{
static void Main(string[] args)
{
bool allvalid = TestClasses().Any(t => !t.IsValid());
Console.ReadLine();
}
public static IEnumerable<TestClass> TestClasses()
{
yield return new TestClass() { IsValid = () => { Console.Write(string.Format("TRUE{0}",Environment.NewLine)); return true; } };
yield return new TestClass() { IsValid = () => { Console.Write(string.Format("FALSE{0}", Environment.NewLine)); return false; } };
yield return new TestClass() { IsValid = () => { Console.Write(string.Format("TRUE{0}", Environment.NewLine)); return true; } };
yield return new TestClass() { IsValid = () => { Console.Write(string.Format("TRUE{0}", Environment.NewLine)); return true; } };
}
}
public class TestClass
{
public Func<bool> IsValid {get;set;}
}
Yes it will stop after it encounters the first item for which the condition matches, in your case the first item for which c.IsValid() returns false.
From MSDN:
The enumeration of source is stopped
as soon as the result can be
determined.
Here's a quick and dirty empirical test to see for yourself:
class Kebab
{
public static int NumberOfCallsToIsValid = 0;
public bool IsValid()
{
NumberOfCallsToIsValid++;
return false;
}
}
...
var kebabs = new Kebab[] { new Kebab(), new Kebab() };
kebabs.Any(kebab => !kebab.IsValid());
Debug.Assert(Kebab.NumberOfCallsToIsValid == 1);
The result is that yes, the Any LINQ operator stops as soon as a collection item matches the predicate.
As per MSDN:
The enumeration of source is stopped as soon as the result can be determined.

C#: yield return within a foreach fails - body cannot be an iterator block

Consider this bit of obfuscated code. The intention is to create a new object on the fly via the anonymous constructor and yield return it. The goal is to avoid having to maintain a local collection just to simply return it.
public static List<DesktopComputer> BuildComputerAssets()
{
List<string> idTags = GetComputerIdTags();
foreach (var pcTag in idTags)
{
yield return new DesktopComputer() {AssetTag= pcTag
, Description = "PC " + pcTag
, AcquireDate = DateTime.Now
};
}
}
Unfortunately, this bit of code produces a compiler error:
Error 28 The body of 'Foo.BuildComputerAssets()' cannot be an iterator block because 'System.Collections.Generic.List' is not an iterator interface type
Questions
What does this error message mean?
How can I avoid this error and use yield return properly?
You can only use yield return in a function that returns an IEnumerable or an IEnumerator, not a List<T>.
You need to change your function to return an IEnumerable<DesktopComputer>.
Alternatively, you can rewrite the function to use List<T>.ConvertAll:
return GetComputerIdTags().ConvertAll(pcTag =>
new DesktopComputer() {
AssetTag = pcTag,
Description = "PC " + pcTag,
AcquireDate = DateTime.Now
});
Your method signature is wrong. It should be:
public static IEnumerable<DesktopComputer> BuildComputerAssets()
yield only works on Iterator types:
The yield statement can only appear inside an iterator block
Iterators are defined as
The return type of an iterator must be IEnumerable, IEnumerator, IEnumerable<T>, or IEnumerator<T>.
IList and IList<T> do implement IEnumerable/IEnumerable<T>, but an iterator block must be declared with one of the four return types above and no other.
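Put together, the corrected method could look like the sketch below. The DesktopComputer class and the GetComputerIdTags stub are assumptions filled in so the example compiles on its own; only the return type of BuildComputerAssets differs from the code in the question.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the class from the question.
public class DesktopComputer
{
    public string AssetTag { get; set; }
    public string Description { get; set; }
    public DateTime AcquireDate { get; set; }
}

public static class AssetBuilder
{
    // Hypothetical stub standing in for the real tag lookup.
    private static List<string> GetComputerIdTags()
    {
        return new List<string> { "A1", "B2" };
    }

    // Returning IEnumerable<DesktopComputer> makes this a valid iterator block.
    public static IEnumerable<DesktopComputer> BuildComputerAssets()
    {
        foreach (var pcTag in GetComputerIdTags())
        {
            yield return new DesktopComputer
            {
                AssetTag = pcTag,
                Description = "PC " + pcTag,
                AcquireDate = DateTime.Now
            };
        }
    }
}
```

A caller that really needs a List<DesktopComputer> can append ToList() (from System.Linq) to the call. Note that the iterator is lazy, so the objects are only created when the result is enumerated.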
You could also implement the same functionality using a LINQ query (in C# 3.0+). This is less efficient than using ConvertAll method, but it is more general. Later, you may also need to use other LINQ features such as filtering:
return (from pcTag in GetComputerIdTags()
select new DesktopComputer() {
AssetTag = pcTag,
Description = "PC " + pcTag,
AcquireDate = DateTime.Now
}).ToList();
The ToList method converts the result from IEnumerable<T> to List<T>. I personally don't like ConvertAll, because it does the same thing as LINQ. But because it was added earlier, it cannot be used with LINQ (it should have been called Select).
