C# Closure works? [duplicate] - c#

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Is there a reason for C#'s reuse of the variable in a foreach?
Looping through a list of Actions
Today I ran into a problem with the C# foreach statement: it didn't give me the result I expected. Here is the code:
using System;
using System.Collections.Generic;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
int[] data = new int[] { 1, 2, 3, 4, 5 };
List<Func<int>> actions = new List<Func<int>>();
foreach (int x in data)
{
actions.Add(delegate() { return x; });
}
foreach (var foo in actions)
{
Console.WriteLine(foo());
}
Console.ReadKey();
}
}
}
When I run it as a console application, it prints 5 five times. Why? I just can't understand it. I have asked some people and they said that there is a closure in this code, but I am not very clear about this. I remember often encountering closures in JavaScript, but why is there a closure in the code above? Thanks.

In C#4 all iterations of a foreach loop share the same variable, and thus the same closure.
The specification says:
foreach (V v in x) embedded-statement
is then expanded to:
{
E e = ((C)(x)).GetEnumerator();
try
{
V v;
while (e.MoveNext())
{
v = (V)(T)e.Current;
embedded-statement
}
}
finally
{
… // Dispose e
}
}
You can see that v is declared in a block outside the while-loop, which causes this sharing behavior.
This will probably be changed in C#5.
We are taking the breaking change. In C# 5, the loop variable of a foreach will be logically inside the loop, and therefore closures will close over a fresh copy of the variable each time. The "for" loop will not be changed.
http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx

The key is that when you are creating the delegates within your foreach loop you are creating a closure over the loop variable x, not its current value.
Only when you execute the delegates in actions is the value determined, and it is the value of x at that time. Since you have completed the foreach loop by then, the value is the last item in your data array, which is 5.
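A minimal sketch of the usual workaround, assuming you want each delegate to return the element it was created for: copy the loop variable into a variable declared inside the loop body and capture that instead. The names ClosureDemo, CaptureCopies and copy are made up for this illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;

static class ClosureDemo
{
    // Builds one delegate per element; each delegate captures its own variable.
    public static List<int> CaptureCopies(int[] data)
    {
        var actions = new List<Func<int>>();
        foreach (int x in data)
        {
            int copy = x;             // a new variable for every iteration
            actions.Add(() => copy);  // the delegate captures 'copy', not 'x'
        }

        var results = new List<int>();
        foreach (var action in actions)
        {
            results.Add(action());    // evaluated only now, after the first loop has finished
        }
        return results;
    }

    static void Main()
    {
        foreach (int value in CaptureCopies(new[] { 1, 2, 3, 4, 5 }))
        {
            Console.WriteLine(value); // prints 1, 2, 3, 4, 5 on separate lines
        }
    }
}
```

Because copy is declared inside the loop body, this gives the same output whether or not the compiler treats the foreach variable itself as shared.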

Related

Task.Factory.StartNew duplication issue [duplicate]

When using lambda expressions or anonymous methods in C#, we have to be wary of the access to modified closure pitfall. For example:
foreach (var s in strings)
{
query = query.Where(i => i.Prop == s); // access to modified closure
...
}
Due to the modified closure, the above code will cause all of the Where clauses on the query to be based on the final value of s.
As explained here, this happens because the s variable declared in foreach loop above is translated like this in the compiler:
string s;
while (enumerator.MoveNext())
{
s = enumerator.Current;
...
}
instead of like this:
while (enumerator.MoveNext())
{
string s;
s = enumerator.Current;
...
}
As pointed out here, there are no performance advantages to declaring a variable outside the loop, and under normal circumstances the only reason I can think of for doing this is if you plan to use the variable outside the scope of the loop:
string s;
while (enumerator.MoveNext())
{
s = enumerator.Current;
...
}
var finalString = s;
However variables defined in a foreach loop cannot be used outside the loop:
foreach(string s in strings)
{
}
var finalString = s; // won't work: you're outside the scope.
So the compiler declares the variable in a way that makes it highly prone to an error that is often difficult to find and debug, while producing no perceivable benefits.
Is there something you can do with foreach loops this way that you couldn't if they were compiled with an inner-scoped variable, or is this just an arbitrary choice that was made before anonymous methods and lambda expressions were available or common, and which hasn't been revised since then?
The compiler declares the variable in a way that makes it highly prone to an error that is often difficult to find and debug, while producing no perceivable benefits.
Your criticism is entirely justified.
I discuss this problem in detail here:
Closing over the loop variable considered harmful
Is there something you can do with foreach loops this way that you couldn't if they were compiled with an inner-scoped variable? or is this just an arbitrary choice that was made before anonymous methods and lambda expressions were available or common, and which hasn't been revised since then?
The latter. The C# 1.0 specification actually did not say whether the loop variable was inside or outside the loop body, as it made no observable difference. When closure semantics were introduced in C# 2.0, the choice was made to put the loop variable outside the loop, consistent with the "for" loop.
I think it is fair to say that all regret that decision. This is one of the worst "gotchas" in C#, and we are going to take the breaking change to fix it. In C# 5 the foreach loop variable will be logically inside the body of the loop, and therefore closures will get a fresh copy every time.
The for loop will not be changed, and the change will not be "back ported" to previous versions of C#. You should therefore continue to be careful when using this idiom.
What you are asking is thoroughly covered by Eric Lippert in his blog post Closing over the loop variable considered harmful and its sequel.
For me, the most convincing argument is that having a new variable in each iteration would be inconsistent with a for(;;) style loop. Would you expect to have a new int i in each iteration of for (int i = 0; i < 10; i++)?
The most common problem with this behavior is making a closure over the iteration variable, and it has an easy workaround:
foreach (var s in strings)
{
var s_for_closure = s;
query = query.Where(i => i.Prop == s_for_closure); // captures the per-iteration copy, not s
}
My blog post about this issue: Closure over foreach variable in C#.
Having been bitten by this, I have a habit of including locally defined variables in the innermost scope which I use to transfer to any closure. In your example:
foreach (var s in strings)
query = query.Where(i => i.Prop == s); // access to modified closure
I do:
foreach (var s in strings)
{
string search = s;
query = query.Where(i => i.Prop == search); // New definition ensures unique per iteration.
}
Once you have that habit, you can avoid it in the very rare case you actually intended to bind to the outer scopes. To be honest, I don't think I have ever done so.
In C# 5.0, this problem is fixed and you can close over loop variables and get the results you expect.
The language specification says:
8.8.4 The foreach statement
(...)
A foreach statement of the form
foreach (V v in x) embedded-statement
is then expanded to:
{
E e = ((C)(x)).GetEnumerator();
try {
while (e.MoveNext()) {
V v = (V)(T)e.Current;
embedded-statement
}
}
finally {
… // Dispose e
}
}
(...)
The placement of v inside the while loop is important for how it is
captured by any anonymous function occurring in the
embedded-statement. For example:
int[] values = { 7, 9, 13 };
Action f = null;
foreach (var value in values)
{
if (f == null) f = () => Console.WriteLine("First value: " + value);
}
f();
If v was declared outside of the while loop, it would be shared
among all iterations, and its value after the for loop would be the
final value, 13, which is what the invocation of f would print.
Instead, because each iteration has its own variable v, the one
captured by f in the first iteration will continue to hold the value
7, which is what will be printed. (Note: earlier versions of C#
declared v outside of the while loop.)

C# simple multithreading - why is it not producing the correct results? [duplicate]

This question already has answers here:
The foreach identifier and closures
(7 answers)
Closed 8 years ago.
Can someone tell me why the code below is not producing the correct results? It gives me 1233 when I expected 0123.
public static readonly object locker = new object();
public static List<int> queue = new List<int>();
public static void calculate(int input)
{
Thread.Sleep(1000);
lock (locker)
{
queue.Add(input);
}
}
[TestMethod]
public void TestT()
{
int[] _intList = new int[] { 0, 1, 2, 3 };
List<Thread> _threadList = new List<Thread>();
foreach (int num in _intList)
{
Thread t = new Thread(() => calculate(num));
t.Start();
_threadList.Add(t);
}
foreach (Thread t in _threadList) { t.Join(); }
foreach (var t in queue)
{
Console.WriteLine(t);
}
}
When I change it to use a copy of the loop variable num instead, I get the correct result of 0123. Can someone tell me why this is happening? Is it being cached somewhere?
foreach (int num in _intList)
{
int testNum = num;
Thread t = new Thread(() => calculate(testNum));
t.Start();
_threadList.Add(t);
}
When you pass a variable to a lambda expression, the expression captures it. So you are not getting a copy but that very variable. This is a common issue with foreach and delayed execution (multithreaded or not): as the foreach continues, num takes its next value, and if that happens before your thread gets to calculate, it is the new value that will be used.
If you didn't multithread this but instead invoked the lambdas after the foreach, you would see 3 3 3 3 instead. In this case the results are simply offset by one because, most likely, the time it takes to start a thread is about the same as one iteration.
When you make a copy of the variable, that copy is declared within the scope of the foreach body, so it is a new variable each time. It is never changed to the next member, and it is the one that gets captured, giving you the correct result. This is the correct way to do this. The result you're getting with the first method isn't unexpected, but it isn't guaranteed either: you could get anything from 0 1 2 3 to 3 3 3 3. The second method guarantees that each thread sees its own value.
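As a rough sketch of the second method, assuming the goal is only that every input is processed exactly once: each thread captures a per-iteration copy. The class and method names here (ThreadCaptureDemo, RunWithLocalCopies) are hypothetical, and the results are sorted before printing because the order in which threads finish is not guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class ThreadCaptureDemo
{
    public static List<int> RunWithLocalCopies(int[] inputs)
    {
        var results = new List<int>();
        var locker = new object();
        var threads = new List<Thread>();

        foreach (int num in inputs)
        {
            int testNum = num;                        // per-iteration copy
            var worker = new Thread(() =>
            {
                lock (locker) { results.Add(testNum); } // captures testNum, not num
            });
            worker.Start();
            threads.Add(worker);
        }

        foreach (Thread thread in threads) thread.Join();

        results.Sort();                               // completion order may vary between runs
        return results;
    }

    static void Main()
    {
        // prints 0 1 2 3
        Console.WriteLine(string.Join(" ", RunWithLocalCopies(new[] { 0, 1, 2, 3 })));
    }
}
```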

Lambda expression as ThreadStart strange behavior [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
C# - The foreach identifier and closures
From Eric Lippert’s blog: “don’t close over the loop variable”
I'm using a lambda expression as ThreadStart parameter, to run a method in a new thread using Thread class. This is my code:
delegate void del();
static void Do(int i)
{
Console.WriteLine(i);
}
static del CreateLoop(del Do)
{
return () =>
{
while (true)
{
Do();
Thread.Sleep(500);
}
};
}
static void Main(string[] args)
{
int n = 0;
var loop = CreateLoop(() => Do(n));
new Thread(() => loop()).Start();
Thread.Sleep(500);
n = 1;
}
And this is the output:
0
1
1
1
...
How is it possible?
Why does changing the value of my integer variable n also change the value of i (Do's parameter)?
You should make a different variable out of it, thus not changing the original value.
After all, all you're really doing is calling that same old 'function', the lambda expression, over and over again, and each call passes the variable n, which indeed changes. It's not as if you're storing the initial value of n somewhere.
var loop = CreateLoop(() => Do(n));
This line is simply creating a new function and assigning it to a variable. This function, among other things, passes the value n to the Do function. But it's not calling the Do function, it's just creating a function which will, when executed, call the Do function.
You then start a new thread which calls the function, etc, but your new thread is still executing Do(n), passing the n variable to Do. That part doesn't change - you've created a function which references a particular place in memory (represented by the variable n) and continues to reference that place in memory even as you change the value which is stored there.
I believe the following would "fix" your code:
Func<int, ThreadStart> loop = x => () => CreateLoop(() => Do(x))();
new Thread(loop(n)).Start();
This passes the value of n to the function represented by loop, but the loop function creates a new place in memory (represented by x) in which to store the value. This new place in memory is not affected by subsequent changes to n. That is to say, the function you've created does not directly reference the place in memory to which n is a pointer.
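The difference between capturing a variable and copying its value can be seen without any threads. The following is a small single-threaded sketch; the names CaptureVersusCopy, readsVariable, readsSnapshot and snapshot are invented for the illustration.

```csharp
using System;

static class CaptureVersusCopy
{
    public static int[] Run()
    {
        int n = 0;
        Func<int> readsVariable = () => n;        // captures the variable n itself

        int snapshot = n;                         // copies the current value (0) into a new variable
        Func<int> readsSnapshot = () => snapshot; // captures the copy

        n = 1;                                    // later change, as in the question

        return new[] { readsVariable(), readsSnapshot() };
    }

    static void Main()
    {
        int[] result = Run();
        Console.WriteLine(result[0]); // prints 1: the lambda sees the updated n
        Console.WriteLine(result[1]); // prints 0: the copy was not affected
    }
}
```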

How do closures differ between foreach and list.ForEach()?

Consider this code.
var values = new List<int> {123, 432, 768};
var funcs = new List<Func<int>>();
values.ForEach(v=>funcs.Add(()=>v));
funcs.ForEach(f=>Console.WriteLine(f()));//prints 123,432,768
funcs.Clear();
foreach (var v1 in values)
{
funcs.Add(()=>v1);
}
foreach (var func in funcs)
{
Console.WriteLine(func()); //prints 768,768,768
}
I know that the second foreach prints 768 three times because of the closure variable captured by the lambda. Why does it not happen in the first case? How does the foreach keyword differ from the ForEach method? Is it because the expression is evaluated when I call values.ForEach?
foreach only introduces one variable, while the lambda parameter variable is "fresh" each time the lambda is invoked.
Compare with:
foreach (var v1 in values) // v1 *same* variable each loop, value changed
{
var freshV1 = v1; // freshV1 is *new* variable each loop
funcs.Add(() => freshV1);
}
foreach (var func in funcs)
{
Console.WriteLine(func()); //prints 123,432,768
}
That is,
foreach (T v in ...) { }
can be thought of as:
T v;
foreach(v in ...) {}
Happy coding.
The difference is that in the foreach loop, you've got a single variable v1 which is captured. That variable takes on each value within values - but you're only using it at the end... which means we only see the final value each time.
In your List<T>.ForEach version, each iteration introduces a new variable (the parameter v) - so each lambda expression is capturing a separate variable, which never changes in value.
Eric Lippert has blogged about this - but note that this behaviour may change in future versions of C#.
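To see why the parameter is a separate variable each time, it can help to write a ForEach-like helper by hand. MyForEach below is a hypothetical stand-in written for this sketch, not the actual implementation of List<T>.ForEach; the point is only that every call to action gets its own parameter variable.

```csharp
using System;
using System.Collections.Generic;

static class ForEachParameterDemo
{
    // Hypothetical stand-in for List<T>.ForEach, for illustration only.
    static void MyForEach<T>(List<T> list, Action<T> action)
    {
        for (int i = 0; i < list.Count; i++)
        {
            action(list[i]); // each call creates a new parameter variable inside the lambda
        }
    }

    public static List<int> Collect(List<int> values)
    {
        var funcs = new List<Func<int>>();
        MyForEach(values, v => funcs.Add(() => v)); // the inner lambda captures this call's v

        var results = new List<int>();
        MyForEach(funcs, f => results.Add(f()));
        return results;
    }

    static void Main()
    {
        foreach (int r in Collect(new List<int> { 123, 432, 768 }))
        {
            Console.WriteLine(r); // prints 123, 432, 768 on separate lines
        }
    }
}
```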

For loop goes out of range [duplicate]

This question already has answers here:
Captured variable in a loop in C#
(10 answers)
Closed 11 days ago.
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
MyClass myClass = new MyClass();
myClass.StartTasks();
}
}
class MyClass
{
int[] arr;
public void StartTasks()
{
arr = new int[2];
arr[0] = 100;
arr[1] = 101;
for (int i = 0; i < 2; i++)
{
Task.Factory.StartNew(() => WorkerMethod(arr[i])); // IndexOutOfRangeException: i==2!!!
}
}
void WorkerMethod(int i)
{
}
}
}
It seems that i++ gets executed one more time before the loop iteration is finished. Why do I get the IndexOutOfRangeException?
You are closing over the loop variable. By the time WorkerMethod gets called, i can have the value 2, not 0 or 1.
When you use closures, it's important to understand that you are not using the value that the variable has at the moment; you are using the variable itself. So if you create lambdas in a loop like so:
for(int i = 0; i < 2; i++) {
actions[i] = () => { Console.WriteLine(i); };
}
and later execute the actions, they will all print "2", because that is the value of i at that moment.
Introducing a local variable inside the loop will solve your problem:
for (int i = 0; i < 2; i++)
{
int index = i;
Task.Factory.StartNew(() => WorkerMethod(arr[index]));
}
<Resharper plug> That's one more reason to try Resharper - it gives a lot of warnings that help you catch the bugs like this one early. "Closing over a loop variable" is amongst them </Resharper plug>
The reason is that you are using a loop variable inside a parallel task. Because tasks can execute concurrently the value of the loop variable may be different to the value it had when you started the task.
You started the task inside the loop. By the time the task gets around to reading the loop variable, the loop has ended, because the variable i is now beyond the stop point.
That is:
i = 2 and the loop exits.
The task uses variable i (which is now 2)
You should use Parallel.For to perform a loop body in parallel. Here is an example of how to use Parallel.For
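A minimal sketch of what that could look like for the array in the question, assuming the work per element is independent. ParallelForDemo and ProcessAll are names made up here, and collecting the values into a ConcurrentBag merely stands in for whatever WorkerMethod would do.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class ParallelForDemo
{
    public static int[] ProcessAll(int[] arr)
    {
        var seen = new ConcurrentBag<int>();

        // Parallel.For passes the index to the body as a parameter,
        // so every invocation works with its own i.
        Parallel.For(0, arr.Length, i =>
        {
            seen.Add(arr[i]); // placeholder for the real per-element work
        });

        // Parallel.For returns only after all iterations have completed.
        return seen.OrderBy(v => v).ToArray();
    }

    static void Main()
    {
        // prints 100, 101
        Console.WriteLine(string.Join(", ", ProcessAll(new[] { 100, 101 })));
    }
}
```

Since the index is a lambda parameter rather than a captured loop variable, it can never be observed with the value 2 here.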
Alternatively, if you want to keep your current structure, you can copy i into a loop-local variable; the loop-local copy will keep its value when used in the parallel task.
e.g.
for (int i = 0; i < 2; i++)
{
int localIndex = i;
Task.Factory.StartNew(() => WorkerMethod(arr[localIndex]));
}
Using foreach does not throw:
foreach (var i in arr)
{
Task.Factory.StartNew(() => WorkerMethod(i));
}
But it doesn't work either:
101
101
It executes WorkerMethod with the last entry in the array. Why is nicely explained in the other answers.
This does work:
Parallel.ForEach(arr,
item => Task.Factory.StartNew(() => WorkerMethod(item))
);
Note
This actually is my first hands-on experience with System.Threading.Tasks. I found this question, my naive answer and especially some of the other answers useful for my personal learning experience. I'll leave my answer up here because it might be useful for others.
