I have been learning about lambda expressions, and I was happy when I finally could read/understand the => operator; it kind of means "where" to me.
List<int> a = new List<int>(){0,1,2,1,3,4,5,6,7,8,9};
IEnumerable<int> b = a.FindAll(x => x>=5);
foreach (int x in b)
Console.WriteLine(x);
Reading the above code, it makes sense to me to read it as "Find all x's from this list WHERE x is greater than or equal to 5". Very good.
But then I come across a different use of the lambda expression with the Select method.
List<int> a = new List<int>(){0,1,2,1,3,4,5,6,7,8,9};
IEnumerable<int> b1 = a.Select(x => x*2);
foreach (int x in b1)
Console.WriteLine(x);
With this one, the previous way of reading the operator doesn't make sense; to me this code says "For each x, return x*2", which is a very different "function" from what the same operator does in the previous case.
I understand that the difference is between .FindAll and .Select, which deal with input and output parameters in different ways, but I am asking about the use of the => operator in the lambda expression itself.
There's no question in this question, so let's make one up.
Characterizing the lambda operator as "where" works when the lambda returns a bool and is used as a predicate to test a value. Is there a more general characterization of the lambda operator that makes sense in other contexts, such as projection?
Yes. Read the lambda operator as "goes to".
a.Select(x => x * 2);
"each x goes to x times two"
You can use that for predicates as well:
a.Where(x => x > 2);
"each x goes to 'is x greater than two?'"
But that's awkward. As you note, it is easier to think of this as "where" or "such that":
"each x such that x is greater than two"
Similarly
a.OrderBy(x => x.LastName)
"order each x by x goes to last name" is awkward. Just say "order each x by last name".
In short: the English language interpretation of the operator depends on the context. The formal interpretation of the operator is always the same: it simply describes a function.
The => operator has exactly the same meaning in both cases: it creates a function whose parameter is the thing on the left, and whose return value is the thing on the right.
You wrote in a comment that the x in the first case is not a parameter as you understand it. That's not correct; it is a parameter in both cases here.
Here's your first example, again:
List<int> a = new List<int>(){0,1,2,1,3,4,5,6,7,8,9};
IEnumerable<int> b = a.FindAll(x => x>=5);
foreach (int x in b)
Console.WriteLine(x);
If you wanted to write this without using lambda notation, you would define a function somewhere, like this...
static bool MyCondition(int x)
{
return x >= 5;
}
...and then use that function as the argument to FindAll:
List<int> a = new List<int>(){0,1,2,1,3,4,5,6,7,8,9};
IEnumerable<int> b = a.FindAll(MyCondition);
foreach (int x in b)
Console.WriteLine(x);
The lambda notation is a shorter notation which allows you to define the function right where you use it.
Likewise, if you wanted to write your second example without using lambda notation, you'd define a function elsewhere, like this...
static int MyOperation(int x)
{
return x * 2;
}
...and pass your function as the argument to Select, like this:
List<int> a = new List<int>(){0,1,2,1,3,4,5,6,7,8,9};
IEnumerable<int> b1 = a.Select(MyOperation);
foreach (int x in b1)
Console.WriteLine(x);
Think of it this way:
Mathematics: f(x) = x + x
This is a mathematical function f that takes a number x and spits out its double.
Lambda: f = x => x + x, which is C#'s way of defining the same function f.
Another example:
Mathematics: g(x, y) = x > y
g is a function that takes two numbers x and y and returns whether the former is greater than the latter.
Lambda: g = (x, y) => x > y, which is C#'s way of defining the same function g.
Clearer now?
P.S.: I've omitted talking about type inference and the types of the lambdas themselves; it's an unnecessary distraction considering the context of this question.
I am coming from an OOP, non-functional background, so I am having trouble fully visualizing several online examples of continuation passing. Also, functional languages like Scheme don't have to specify the types of arguments or return values, so I am unsure whether I got the idea right.
Since C# supports lambdas, I took the first example from the Wikipedia article and tried to port it to C# with strong typing, to see how the pattern would apply:
;; (Scheme)
;; direct function
(define (pyth x y)
  (sqrt (+ (* x x) (* y y))))
;; rewritten with CPS
(define (pyth& x y k)
  (*& x x (lambda (x2)
    (*& y y (lambda (y2)
      (+& x2 y2 (lambda (x2py2)
        (sqrt& x2py2 k))))))))
;; where *&, +& and sqrt& are defined to
;; calculate *, + and sqrt respectively and pass the result to k
(define (*& x y k)
  (k (* x y)))
So, rewriting the CPS pyth& version in C# resulted in:
// (C#6)
// continuation function signature
delegate double Cont(double a);
// *&, +& and sqrt& functions
static double MulCont(double a, double b, Cont k) => k(a * b);
static double AddCont(double a, double b, Cont k) => k(a + b);
static double SqrtCont(double a, Cont k) => k(Math.Sqrt(a));
// sqrt(x*x + y*y), cps style
static double PythCont(double x, double y, Cont k) =>
MulCont(x, x, x2 =>
MulCont(y, y, y2 =>
AddCont(x2, y2, x2py2 =>
SqrtCont(x2py2, k))));
I could have used generics instead of double, but the signatures would be longer. Anyway, what I am not sure about is:
Is the Cont signature above correct (i.e. Func<double, double>)? Should the continuation function accept the parameter, process it, and then return a value of the same type?
When I first started reading about continuations, I got the feeling that the continuation function would get invoked at each step in the call stack, but in the example above it's only passed to sqrt&, and all the other calls get lambdas which don't really "pass" intermediate values to the original continuation. The function above is basically analogous to k(Math.Sqrt(x * x + y * y)), so does this mean my assumption about intermediate "hooks" is wrong?
Yes, unless you want to do anything non-numerical with the outermost continuation, it is correct.
You would only need more "Cont"s when your original expression involves more types, e.g.
(define (foo x) (if (= x 0) 1 0))
in which case it might look like this (sorry, I'll write it in Scheme for brevity):
(define (foo& x k)
(=& x 0 (lambda (r1)
(if r1 (k 1) (k 0)))))
Now the outermost continuation has a number (let's say an int) as input, while the one provided to "=&" has type bool -> int.
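Translated back into the C# style of your PythCont (just a sketch; the delegate and method names here are made up), the extra type shows up as a second continuation delegate:

// a second continuation type is needed because the intermediate
// value (the comparison result) is a bool rather than a number
delegate int IntCont(int a);    // receives an int, produces the final answer
delegate int BoolCont(bool a);  // receives a bool, produces the final answer

// "=&": compare and pass the boolean result on
static int EqCont(int a, int b, BoolCont k) => k(a == b);

// (define (foo& x k) ...) written in the same style as PythCont
static int FooCont(int x, IntCont k) =>
    EqCont(x, 0, r1 => r1 ? k(1) : k(0));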
You are almost right (up to duality) -- each step on the call stack is now a call to some continuation.
In general, you might be confusing first-class continuations with CPS -- the former is a language feature (as in Scheme, where you can access the current continuation with the call/cc operator), while the latter is a technique you can use anywhere.
You can actually convert expressions to CPS without even having higher-order functions in your language (by just representing them somehow).
Another thing you asked is how CPS relates to control flow. Well, notice that in an applicative, functional language (like Scheme) the only thing specified is that, in the case of an application, you first evaluate the operands and the operator, and then apply the latter to the former. It does not matter in what order you evaluate the operands -- you might do it left-to-right, right-to-left, or perhaps in some crazy way. But what if you're not writing in a purely functional style, and the operands cause side effects? They might, for example, print something to stdout and later return some value. In that case you would like to have control over the order. If I remember correctly, programs compiled with Gambit-C evaluate arguments right-to-left, while those run in Gambit's interpreter evaluate them left-to-right -- so the problem really exists ;). And that is precisely where CPS can save you (actually there are other means as well, but we're talking about CPS right now!).
In the Scheme example you posted, the arguments of "+" are forced to be evaluated left-to-right.
You might alter that easily:
(define (pyth& x y k)
(*& y y (lambda (y2)
(*& x x (lambda (x2)
(+& x2 y2 (lambda (x2py2)
(sqrt& x2py2 k))))))))
And that's the thing.
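The same reordering can be written in the C# version from your question (a sketch; only the nesting changes, which forces y * y to be computed before x * x):

// right-to-left variant of PythCont: y*y is evaluated before x*x
static double PythContRightToLeft(double x, double y, Cont k) =>
    MulCont(y, y, y2 =>
        MulCont(x, x, x2 =>
            AddCont(x2, y2, x2py2 =>
                SqrtCont(x2py2, k))));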
As for further applications, as others have already said in the comments, the transformation to CPS moves every application to tail position, so the call stack is replaced with lambdas, and if you then defunctionalize them, what you get is a data structure representing the control flow -- a neat form to convert to, say, C or some other imperative language. Fully automagically!
Or, if you'd like to implement some monad mumbo-jumbo, say the Maybe monad, in CPS it's easy: just prepend to each continuation-lambda a test on whether the received value is "Just something" (in which case do the job and push the result to your continuation) or "Nothing" (in which case you just push Nothing to the continuation).
Of course, this would be done by another program or a macro rather than by hand, as it might be tedious -- most of the magic behind CPS is that it's so easy to automate the transformation to CPS.
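In the C# style of your question, that Maybe idea might look roughly like this (purely a sketch with made-up names, using double? with null standing in for Nothing):

// each continuation-lambda first tests the value it receives:
// null ("Nothing") is pushed straight through, otherwise the job is done
delegate double? MaybeCont(double? a);

// a primitive that can fail: division by zero yields Nothing
static double? DivCont(double a, double b, MaybeCont k) =>
    b == 0 ? k(null) : k(a / b);

// 1/x + 1/y in CPS; a Nothing short-circuits through every later step
static double? SumOfInversesCont(double x, double y, MaybeCont k) =>
    DivCont(1, x, ix =>
        ix == null ? k(null)
                   : DivCont(1, y, iy =>
                         iy == null ? k(null)
                                    : k(ix + iy)));

Calling SumOfInversesCont(2, 0, v => v) hands null to the final continuation without ever attempting the addition.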
Hope I didn't make it unnecessarily complicated.
I have created a comprehensive introduction to the Continuation monad that you can find here: Discovering the Continuation Monad in C#
You can also find a .NET Fiddle here.
I repeat it in summary here.
Starting from an initial Function
int Square(int x) { return x * x; }
Use Callback and remove return type
public static void Square(int x, Action<int> callback)
{
callback(x * x);
}
Curry the Callback
public static Action<Action<int>> Square(int x)
{
return (callback) => { callback(x * x); };
}
Generalize the returned Continuation
public static Func<Func<int,T>,T> Square<T>(int x)
{
return (callback) => { callback(x * x); };
}
Extract the continuation structure, also known as the Return method of the monad
delegate T Cont<U, T>(Func<U, T> f);
public static Cont<U, T> ToContinuation<U, T>(this U x)
{
return (callback) => callback(x);
}
square.ToContinuation<Func<int, int>, int>()
Add the Bind monadic method and thus complete the monad
public static Cont<V, Answer> Bind<T, U, V, Answer>(
this Cont<T, Answer> m,
Func<T, Cont<U, Answer>> k,
Func<T, U, V> selector)
{
return (Func<V, Answer> c) =>
m(t => k(t)(y => c(selector(t, y))));
}
When comparing against the minimum or maximum of two values (a number and a function call), does C# short-circuit if the comparison with the first one is true and already implies the truth of the whole expression? Specific examples of these cases are
if(x < Math.Max(y, z()))
and
if(x > Math.Min(y, z()))
Since Math.Max(y, z()) will return a value at least as large as y, if x < y then there is no need to evaluate z(), which could take a while. Similar situation with Math.Min.
I realize that these could both be rewritten along the lines of
if(x < y || x < z())
in order to short-circuit, but I think the intent of the comparison is clearer without the rewrite. Does this short-circuit?
As others have pointed out, the compiler knows nothing about the semantics of Min or Max that would allow it to break the rule that arguments are evaluated before the method is called.
If you wanted to write your own, you could do so easily enough:
static bool LazyLessThan(int x, int y, Func<int> z)
{
return x < y || x < z();
}
and then call it
if (LazyLessThan(x, y, z))
or
if (LazyLessThan(x, y, ()=>z()))
Or for that matter:
static bool LazyRelation<T>(T x, T y, Func<T> z, Func<T, T, bool> relation)
{
return relation(x, y) || relation(x, z());
}
...
if (LazyRelation(x, y, () => z(), (a, b) => a < b))
No, it doesn't short circuit and z() will always be evaluated. If you want the short circuiting behavior you should rewrite as you have done.
Math.Min() and Math.Max() are methods just like any other. They have to be evaluated in order to return the value which will be used as the second argument in the comparison. If you want short-circuiting then you will have to write the condition using the || operator as you have demonstrated.
(Nothing particularly new to add, but I figured I'd share the results of a test I ran on it.)
Math.Max() could easily be inlined by the CLR's just-in-time compiler and from there I was curious whether it might further optimize the code in such a way that it is short-circuited.
So I whipped up a microbenchmark that evaluates the two expressions 1,000,000 times each.
For z(), I used a function that calculates Fib(15) using the recursive method. Here are the results of running the two:
x < Math.Max(y, z()) : 8097 ms
x < y || x < z() : 29 ms
I'm guessing the CLR won't transform the code in any way that prevents method calls from executing, because it doesn't know (and doesn't check to see if) the routine has any side effects.
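For reference, a rough reconstruction of that kind of harness (the Fib helper, names, and iteration count are my assumptions, not the original benchmark code):

static int Fib(int n) => n < 2 ? n : Fib(n - 1) + Fib(n - 2);
static int Z() => Fib(15);   // stands in for the question's z()

static void RunBenchmark()
{
    int x = 1, y = 100, hits = 0;          // x < y, so the || form can short-circuit
    var sw = System.Diagnostics.Stopwatch.StartNew();

    for (int i = 0; i < 1000000; i++)
        if (x < Math.Max(y, Z())) hits++;  // Z() runs on every iteration
    Console.WriteLine(sw.ElapsedMilliseconds);

    sw.Restart();
    for (int i = 0; i < 1000000; i++)
        if (x < y || x < Z()) hits++;      // Z() is skipped because x < y
    Console.WriteLine(sw.ElapsedMilliseconds);
}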
No, it doesn't short-circuit, at least not at the C# compiler level. Math.Min and Math.Max are ordinary static method calls, and the compiler will not optimize in that sense.
The order of evaluation will be: y, then z(), then Math.Max, and only then the comparison x < ...
If you really want to make sure, check out the IL code.
Below a Compose function. If f and g are unary functions which return values, then Compose(f,g) returns a function which when called on x performs the equivalent to f(g(x)).
static Func<X, Z> Compose<Z, Y, X>(Func<Y, Z> f, Func<X, Y> g)
{ return x => f(g(x)); }
Here's a couple of simple Func values which can be composed:
Func<int, bool> is_zero = x => { return x == 0; };
Func<int, int> mod_by_2 = x => { return x % 2; };
E.g. this works:
Console.WriteLine(Compose(is_zero, mod_by_2)(4));
However, if I instead have these equivalent static methods:
static bool IsZero(int n) { return n == 0; }
static int ModBy2(int n) { return n % 2; }
the same example doesn't work with those. I.e. this produces a compile time error:
Console.WriteLine(Compose(IsZero, ModBy2)(4));
Explicitly passing types to Compose fixes the issue:
Console.WriteLine(Compose<bool, int, int>(IsZero, ModBy2)(4));
Is there anyway to write Compose such that it works on the static methods without the explicit types?
Is this a good approach to take to implementing Compose? Can anyone make improvements to this?
The problem here is not the use of static methods but the use of method groups. When you use a function name as an expression without invoking it then it's a method group and must go through method group conversion. You would have the exact same problem with instance methods.
The problem you're running into is that C# can't do return type inference on method groups. Using Compose(IsZero, ModBy2)) requires the return type to be inferred for both IsZero and ModBy2 and hence this operation fails.
This is a known limitation in the inference capabilities of the C# compiler. Eric Lippert wrote an extensive blog article on this particular subject which covers this problem in detail:
http://blogs.msdn.com/b/ericlippert/archive/2007/11/05/c-3-0-return-type-inference-does-not-work-on-member-groups.aspx
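One common workaround (a sketch) is to hand the compiler concrete delegate types, either by casting the method groups or by wrapping them in lambdas with typed parameters, so that inference has something to work from:

// casting the method groups to concrete delegate types...
Console.WriteLine(Compose((Func<int, bool>)IsZero, (Func<int, int>)ModBy2)(4));

// ...or wrapping them in lambdas with explicitly typed parameters
Console.WriteLine(Compose((int n) => IsZero(n), (int n) => ModBy2(n))(4));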
I read this article and I found it interesting.
To sum it up for those who don't want to read the entire post: the author implements a higher-order function named Curry like this (refactored by me without his internal class):
public static Func<T1, Func<T2, TResult>>
Curry<T1, T2, TResult>(this Func<T1, T2, TResult> fn)
{
Func<Func<T1, T2, TResult>, Func<T1, Func<T2, TResult>>> curry =
f => x => y => f(x, y);
return curry(fn);
}
That gives us the ability to take an expression like F(x, y)
e.g.
Func<int, int, int> add = (x, y) => x + y;
and call it in the F.Curry()(x)(y) manner.
This part I understood, and I find it cool in a geeky way. What I fail to wrap my head around is the practical use cases for this approach. When and where is this technique necessary, and what can be gained from it?
Thanks in advance.
Edited:
After the initial 3 responses, I understand that the gain would be that in some cases, when we create a new function from the curried one, some parameters are not re-evaluated.
I made this little test in C# (keep in mind that I'm only interested in the C# implementation and not curry theory in general):
public static void Main(string[] args)
{
Func<Int, Int, string> concat = (a, b) => a.ToString() + b.ToString();
Func<Int, Func<Int, string>> concatCurry = concat.Curry();
Func<Int, string> curryConcatWith100 = (a) => concatCurry(100)(a);
Console.WriteLine(curryConcatWith100(509));
Console.WriteLine(curryConcatWith100(609));
}
public struct Int
{
public int Value {get; set;}
public override string ToString()
{
return Value.ToString();
}
public static implicit operator Int(int value)
{
return new Int { Value = value };
}
}
On the 2 consecutive calls to curryConcatWith100, the ToString() evaluation for the value 100 is performed twice (once for each call), so I don't see any gain in evaluation here. Am I missing something?
Currying is used to transform a function with x parameters to a function with y parameters, so it can be passed to another function that needs a function with y parameters.
For example, Enumerable.Select(this IEnumerable<TSource> source, Func<TSource, TResult> selector) takes a function with 1 parameter. Math.Round(double, int) is a function that has 2 parameters.
You could use currying to "store" the Round function as data, and then pass that curried function to the Select like so
Func<int, double, double> roundFunc = (p, n) => Math.Round(n, p);
Func<double, double> roundToTwoPlaces = roundFunc.Curry()(2);
var roundedResults = numberList.Select(roundToTwoPlaces);
The problem here is that there are also anonymous delegates, which make currying redundant. In fact, anonymous delegates are a form of currying.
Func<double, double> roundToTwoPlaces = n => Math.Round(n, 2);
var roundedResults = numberList.Select(roundToTwoPlaces);
Or even just
var roundedResults = numberList.Select(n => Math.Round(n, 2));
Currying was a way of solving a particular problem given the syntax of certain functional languages. With anonymous delegates and the lambda operator, the syntax in .NET is a lot simpler.
It's easier to first consider fn(x,y,z). This could be curried using fn(x,y), giving you a function that only takes one parameter, the z. Whatever needs to be done with x and y alone can be done and stored by a closure that the returned function holds on to.
Now you can call the returned function several times with various values for z without having to recompute the part that required x and y.
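Here is a sketch of that in C# (the names and the expensive operation are invented for illustration): the work that needs only x and y happens once, and the returned closure reuses it for every z.

// hypothetical example: the costly part depends only on x and y
static Func<int, double> PartOf(int x, int y)
{
    double expensive = ExpensiveCombine(x, y); // computed once, captured by the closure
    return z => expensive + z;                 // each call only does the cheap z part
}

static double ExpensiveCombine(int x, int y)
{
    System.Threading.Thread.Sleep(1000);       // stand-in for real work
    return Math.Pow(x, y);
}

// one slow call to set things up, then many fast calls
var addToBase = PartOf(2, 10);
Console.WriteLine(addToBase(1)); // 1025
Console.WriteLine(addToBase(2)); // 1026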
Edit:
There are effectively two reasons to curry.
Parameter reduction
As Cameron says, it converts a function that takes, say, 2 parameters into a function that only takes 1. The result of calling this curried function with one parameter, and then the function it returns with the other, is the same as calling the original with the 2 parameters.
With lambdas present in C#, this has limited value since they can provide the same effect anyway. If you are using C# 2, though, then the Curry function in your question has much greater value.
Staging computation
The other reason to curry is, as I stated earlier, to allow complex/expensive operations to be staged and re-used several times as the final parameter(s) are supplied to the curried function.
This type of currying isn't truly possible in C#; it really takes a functional language that can natively curry any of its functions to achieve it.
Conclusion
Parameter reduction via the Curry you mention is useful in C# 2 but is considerably devalued in C# 3 due to lambdas.
In a sense, currying is a technique to enable automatic partial application. More formally, currying is a technique to turn a function into a function that accepts one and only one argument. In turn, when called, that function returns another function that accepts one and only one argument . . . and so on, until the 'original' function is able to be executed.
from a thread in codingforums
I particularly like the explanation given on this page, and the depth it goes into.
One example: you have a function compare(criteria1, criteria2, option1, option2, left, right). But when you want to supply the function compare to some method which sorts a list, then compare() must only take two arguments, compare(left, right). With currying you bind the criteria arguments as you need them for sorting this list, and finally this highly configurable function presents itself to the sort algorithm as any other plain compare(left, right).
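A sketch of that in C# (the criteria and names here are invented): bind the configuration arguments first, and hand the sorter a plain two-argument comparison.

// the "fully configurable" comparison: criteria first, then the two values
static int Compare(bool ignoreCase, bool descending, string left, string right)
{
    int result = string.Compare(left, right,
        ignoreCase ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal);
    return descending ? -result : result;
}

// bind the criteria up front; what the sort sees is just compare(left, right)
var names = new List<string> { "bob", "Alice", "carol" };
names.Sort((left, right) => Compare(true, false, left, right));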
Detail: .NET delegates employ implicit currying. Each non-static member function of a class has an implicit this reference; still, when you create a delegate, you do not need to manually use currying to bind this to the function. Instead, C# takes care of the syntactic sugar, automatically binds this, and gives you a function which only requires the remaining arguments.
In C++, boost::bind et al. are used for the same purpose. And as always, in C++ everything is a little more explicit (for instance, if you want to pass an instance-member function as a callback, you need to explicitly bind this).
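A minimal sketch of that implicit binding in C#: converting an instance method group to a delegate captures the receiver for you.

class Counter
{
    private int count;
    public int Next() { return ++count; }
}

var c = new Counter();
Func<int> next = c.Next;   // 'this' (the instance c) is baked into the delegate
Console.WriteLine(next()); // 1
Console.WriteLine(next()); // 2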
I have this silly example:
Uncurried version:
void print(string name, int age, DateTime dob)
{
Console.Out.WriteLine(name);
Console.Out.WriteLine(age);
Console.Out.WriteLine(dob.ToShortDateString());
Console.Out.WriteLine();
}
Curry Function:
public Func<string, Func<int, Action<DateTime>>> curry(Action<string, int, DateTime> f)
{
return (name) => (age) => (dob) => f(name, age, dob);
}
Usage:
var curriedPrint = curry(print);
curriedPrint("Jaider")(29)(new DateTime(1983, 05, 10)); // Console Displays the values
Have fun!
Here's another example of how you might use a curried function. Depending on some condition (e.g. the day of the week), you could decide which archive policy to apply before updating a file.
void ArchiveAndUpdate(string[] files)
{
Func<string, bool> archiveCurry1 = (file) =>
Archive1(file, "archiveDir", 30, 20000000, new[] { ".tmp", ".log" });
Func<string, bool> archiveCurry2 = (file) =>
Archive2("netoworkServer", "admin", "nimda", new FileInfo(file));
Func<string, bool> archvieCurry3 = (file) => true;
// backup locally before updating
UpdateFiles(files, archiveCurry1);
// OR backup to network before updating
UpdateFiles(files, archiveCurry2);
// OR do nothing before updating
UpdateFiles(files, archiveCurry3);
}
void UpdateFiles(string[] files, Func<string, bool> archiveCurry)
{
foreach (var file in files)
{
if (archiveCurry(file))
{
// update file //
}
}
}
bool Archive1(string fileName, string archiveDir,
int maxAgeInDays, long maxSize, string[] excludedTypes)
{
// backup to local disk
return true;
}
bool Archive2(string serverName, string username,
string password, FileInfo fileToArchive)
{
// backup to network
return true;
}