C# LINQ vs. Currying

I am playing a little bit with functional programming and its various concepts. All this stuff is very interesting. Several times I have read about currying and the advantages it offers.
But I do not get the point of it. The following source demonstrates the use of the curry concept next to the solution with LINQ. Actually, I do not see any advantage of using the currying concept.
So, what is the advantage of using currying?
static bool IsPrime(int value)
{
    int max = (value / 2) + 1;
    for (int i = 2; i < max; i++)
    {
        if ((value % i) == 0)
        {
            return false;
        }
    }
    return true;
}
static readonly Func<IEnumerable<int>, IEnumerable<int>> GetPrimes =
    HigherOrder.GetFilter<int>().Curry()(IsPrime);

static void Main(string[] args)
{
    int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
    Console.Write("Primes:");

    // Curry
    foreach (int n in GetPrimes(numbers))
    {
        Console.Write(" {0}", n);
    }
    Console.WriteLine();

    // Linq
    foreach (int n in numbers.Where(p => IsPrime(p)))
    {
        Console.Write(" {0}", n);
    }
    Console.ReadLine();
}
Here is the HigherOrder Filter Method:
public static Func<Func<TSource, bool>, IEnumerable<TSource>, IEnumerable<TSource>> GetFilter<TSource>()
{
    return Filter<TSource>;
}
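The Filter method returned by GetFilter and the Curry() extension are not shown in the question; presumably (my reconstruction from the signatures above, placed inside the same static class) they look roughly like this:

static IEnumerable<TSource> Filter<TSource>(Func<TSource, bool> predicate, IEnumerable<TSource> source)
{
    foreach (TSource item in source)
    {
        if (predicate(item))
        {
            yield return item;
        }
    }
}

// Turns a two-argument function into a chain of one-argument functions, so that
// GetFilter<int>().Curry()(IsPrime) produces a Func<IEnumerable<int>, IEnumerable<int>>.
public static Func<T1, Func<T2, TResult>> Curry<T1, T2, TResult>(this Func<T1, T2, TResult> f)
{
    return a => b => f(a, b);
}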

what is the advantage of using currying?
First off, let's clarify some terms. People use "currying" to mean both:
reformulating a method of two parameters into a method of one parameter that returns a method of one parameter, and
partial application of a method of two parameters to produce a method of one parameter.
Clearly these two tasks are closely related, and hence the confusion. When speaking formally, one ought to restrict "currying" to refer to the first definition, but when speaking informally either usage is common.
So, if you have a method:
static int Add(int x, int y) { return x + y; }
you can call it like this:
int result = Add(2, 3); // 5
You can curry the Add method:
static Func<int, int> MakeAdder(int x) { return y => Add(x, y); }
and now:
Func<int, int> addTwo = MakeAdder(2);
int result = addTwo(3); // 5
Partial application is sometimes also called "currying" when speaking informally because it is obviously related:
Func<int, int> addTwo = y => Add(2, y);
int result = addTwo(3);
You can make a machine that does this process for you:
static Func<B, R> PartiallyApply<A, B, R>(Func<A, B, R> f, A a)
{
    return (B b) => f(a, b);
}
...
Func<int, int> addTwo = PartiallyApply<int, int, int>(Add, 2);
int result = addTwo(3); // 5
So now we come to your question:
what is the advantage of using currying?
The advantage of either technique is that it gives you more flexibility in dealing with methods.
For example, suppose you are writing an implementation of a path finding algorithm. You might already have a helper method that gives you an approximate distance between two points:
static double ApproximateDistance(Point p1, Point p2) { ... }
But when you are actually building the algorithm, what you often want to know is what is the distance between the current location and a fixed end point. What the algorithm needs is Func<Point, double> -- what is the distance from the location to the fixed end point? What you have is Func<Point, Point, double>. How are you going to turn what you've got into what you need? With partial application; you partially apply the fixed end point as the first argument to the approximate distance method, and you get out a function that matches what your path finding algorithm needs to consume:
Func<Point, double> distanceFinder = PartiallyApply<Point, Point, double>(ApproximateDistance, givenEndPoint);
If the ApproximateDistance method had been curried in the first place:
static Func<Point, double> MakeApproximateDistanceFinder(Point p1) { ... }
Then you would not need to do the partial application yourself; you'd just call MakeApproximateDistanceFinder with the fixed end point and you'd be done.
Func<Point, double> distanceFinder = MakeApproximateDistanceFinder(givenEndPoint);

The comment by @Eric Lippert on What is the advantage of Currying in C#? (achieving partial function) points to this blog post:
Currying and Partial Function Application
where I found the explanation that works best for me:
From a theoretical standpoint, it is interesting because it (currying) simplifies
the lambda calculus to include only those functions which have at most
one argument. From a practical perspective, it allows a programmer to
generate families of functions from a base function by fixing the
first k arguments. It is akin to pinning up something on the wall
that requires two pins. Before being pinned, the object is free to
move anywhere on the surface; however, when the first pin is put in,
then the movement is constrained. Finally, when the second pin is put
in then there is no longer any freedom of movement. Similarly, when a
programmer curries a function of two arguments and applies it to the
first argument then the functionality is limited by one dimension.
Finally, when he applies the new function to the second argument then
a particular value is computed.
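Rendering that picture directly in C# (my illustration, not from the blog post):

// Each application of a curried function "pins" one argument.
Func<int, Func<int, int>> multiply = x => y => x * y;

Func<int, int> twice = multiply(2); // first pin: x is fixed at 2
int result = twice(21);             // second pin: a particular value, 42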
Taking this further, I see that functional programming essentially introduces 'data flow programming as opposed to control flow'; this is akin to using, say, SQL instead of C#. With this definition I see why LINQ exists and why it has many applications outside of pure Linq2Objects, such as events in Rx.

The advantage of using currying is largely to be found in functional languages, which are built to benefit from currying, and have a convenient syntax for the concept. C# is not such a language, and implementations of currying in C# are usually difficult to follow, as is the expression HigherOrder.GetFilter<int>().Curry()(IsPrime).

Related

Functionally pure dice rolls in C#

I am writing a dice-based game in C#. I want all of my game logic to be pure, so I have devised a dice-roll generator like this:
public static IEnumerable<int> CreateDiceStream(int seed)
{
    var random = new Random(seed);
    while (true)
    {
        yield return 1 + random.Next(6); // a standard die roll: 1..6
    }
}
Now I can use this in my game logic:
var playerRolls = players.Zip(diceRolls, (player, roll) => Tuple.Create(player, roll));
The problem is that the next time I take from diceRolls I want to skip the rolls that I have already taken:
var secondPlayerRolls = players.Zip(
    diceRolls.Skip(playerRolls.Count()),
    (player, roll) => Tuple.Create(player, roll));
This is already quite ugly and error prone. It doesn't scale well as the code becomes more complex.
It also means that I have to be careful when using a dice roll sequence between functions:
var x = DoSomeGameLogic(diceRolls);
var nextRoll = diceRolls.Skip(x.NumberOfDiceRollsUsed).First();
Is there a good design pattern that I should be using here?
Note that it is important that my functions remain pure due to synchronisation and playback requirements.
This question is not about correctly initializing System.Random. Please read what I have written, and leave a comment if it is unclear.
That's a very nice puzzle.
Since manipulating diceRolls's state is out of the question (otherwise, we'd have those sync and replaying issues you mentioned), we need an operation which returns both (a) the values to be consumed and (b) a new diceRolls enumerable which starts after the consumed items.
My suggestion would be to use the return value for (a) and an out parameter for (b):
static IEnumerable<int> Consume(this IEnumerable<int> rolls, int count, out IEnumerable<int> remainder)
{
    // Enumerating rolls twice (once via Skip, once via Take) is safe here
    // because the stream is pure: the same seed always yields the same values.
    remainder = rolls.Skip(count);
    return rolls.Take(count);
}
Usage:
var firstRolls = diceRolls.Consume(players.Count(), out diceRolls);
var secondRolls = diceRolls.Consume(players.Count(), out diceRolls);
DoSomeGameLogic would use Consume internally and return the remaining rolls. Thus, it would need to be called as follows:
var x = DoSomeGameLogic(diceRolls, out diceRolls);
// or
var x = DoSomeGameLogic(ref diceRolls);
// or
x = DoSomeGameLogic(diceRolls);
diceRolls = x.RemainingDiceRolls;
The "classic" way to implement pure random generators is to use a specialized form of a state monad (more explanation here), which wraps the carrying around of the current state of the generator. So, instead of implementing (note that my C# is quite rusty, so please consider this as pseudocode):
int Next() {
    var next = NextRandom(globalState); // (nextState, nextValue)
    globalState = next.Item1;           // impure: overwrites shared state
    return next.Item2;
}
you define something like this:
class Random<T> {
    private readonly Func<int, Tuple<int, T>> transition;

    public Random(Func<int, Tuple<int, T>> transition) {
        this.transition = transition;
    }

    private static Tuple<int, int> NextRandom(int state) { ... whatever, see below ... }

    public static Random<T> Unit(T a) {
        return new Random<T>(s => Tuple.Create(s, a));
    }

    public static Random<int> GetRandom() {
        return new Random<int>(s => NextRandom(s));
    }

    public Random<U> SelectMany<U>(Func<T, Random<U>> f) {
        return new Random<U>(s => {
            var next = this.transition(s); // (nextState, value)
            return f(next.Item2).transition(next.Item1);
        });
    }

    public T Run(int seed) {
        return this.transition(seed).Item2;
    }
}
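One wiring detail: C#'s query syntax binds multiple from clauses to a SelectMany overload with a result selector, and a lone from ... select to Select. A minimal sketch of the two extra members Random<T> needs for the query below (my addition, under the same pseudocode caveat):

public Random<U> Select<U>(Func<T, U> f) {
    return SelectMany(t => Random<U>.Unit(f(t)));
}

public Random<V> SelectMany<U, V>(Func<T, Random<U>> f, Func<T, U, V> g) {
    return SelectMany(t => f(t).Select(u => g(t, u)));
}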
With those in place, the class should be usable with LINQ, if I did everything right:
// player1 = bla, player2 = blub, ...
Random<Tuple<Player, int>> playerOneRoll = from roll in Random<int>.GetRandom()
                                           select Tuple.Create(player1, roll);
Random<Tuple<Player, int>> playerTwoRoll = from roll in Random<int>.GetRandom()
                                           select Tuple.Create(player2, roll);
Random<List<Tuple<Player, int>>> randomRolls = from t1 in playerOneRoll
                                               from t2 in playerTwoRoll
                                               select new List<Tuple<Player, int>> { t1, t2 };
var actualRolls = randomRolls.Run(234324);
etc., possibly using some combinators. The trick here is to represent the whole "random action" parametrized by the current input state; but this is also the problem, since you'd need a good implementation of NextRandom.
It would be nice if you could just reuse the internals of the .NET Random implementation, but as it seems, you cannot access its internal state. However, I'm sure there are enough sufficiently good PRNG state functions around on the internet (this one looks good; you might have to change the state type).
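For illustration, here is a minimal pure state-transition function that would do as a NextRandom (my sketch: the classic ANSI C linear congruential generator, fine for a game but not for anything serious):

private static Tuple<int, int> NextRandom(int state) {
    unchecked {
        int next = state * 1103515245 + 12345; // LCG step
        int value = (next >> 16) & 0x7FFF;     // extract 15 usable bits: 0..32767
        return Tuple.Create(next, value);
    }
}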
Another disadvantage of monads is that once you start working in them (i.e., construct things in Random), you need to carry that through the whole control flow, up to the top level, at which point you should call Run once and for all. This is something one needs to get used to, and it is more tedious in C# than in functional languages optimized for such things.

Converting a method to use any Enum

My Problem:
I want to convert my randomBloodType() method to a static method that can take any enum type. I want my method to take any type of enum whether it be BloodType, DaysOfTheWeek, etc. and perform the operations shown below.
Some Background on what the method does:
The method currently chooses a random element from the BloodType enum based on the values assigned to each element. An element with a higher value has a higher probability to be picked.
Code:
public enum BloodType
{
    // BloodType = Probability
    ONeg = 4,
    OPos = 36,
    ANeg = 3,
    APos = 28,
    BNeg = 1,
    BPos = 20,
    ABNeg = 1,
    ABPos = 5
};
public BloodType randomBloodType()
{
    // Get the values of the BloodType enum and store them in an array
    BloodType[] bloodTypeValues = (BloodType[])Enum.GetValues(typeof(BloodType));
    List<BloodType> bloodTypeList = new List<BloodType>();

    // Create a list where each element occurs the approximate number of
    // times defined as its value (probability)
    foreach (BloodType val in bloodTypeValues)
    {
        for (int i = 0; i < (int)val; i++)
        {
            bloodTypeList.Add(val);
        }
    }

    // Sum the values
    int sum = 0;
    foreach (BloodType val in bloodTypeValues)
    {
        sum += (int)val;
    }

    // Get a random value
    Random rand = new Random();
    int randomValue = rand.Next(sum);
    return bloodTypeList[randomValue];
}
What I have tried so far:
I have tried to use generics. They worked out for the most part, but I was unable to cast my enum elements to int values. I have included an example of the section of code that was giving me problems below.
foreach (T val in bloodTypeValues)
{
    sum += (int)val; // This line is the problem.
}
I have also tried using Enum e as a method parameter. I was unable to declare the type of my array of enum elements using this method.
(Note: My apologies in advance for the lengthy answer. My actual proposed solution is not all that long, but there are a number of problems with the proposed solutions so far and I want to try to address those thoroughly, to provide context for my own proposed solution).
In my opinion, while you have in fact accepted one answer and might be tempted to use either one, neither of the answers provided so far are correct or useful.
Commenter Ben Voigt has already pointed out two major flaws with your specifications as stated, both related to the fact that you are encoding the enum value's weight in the value itself:
You are tying the enum's underlying type to the code that then must interpret that type.
Two enum values that have the same weight are indistinguishable from each other.
Both of these issues can be addressed. Indeed, while the answer you accepted (why?) fails to address the first issue, the one provided by Dweeberly does address this through the use of Convert.ToInt32() (which can convert from long to int just fine, as long as the values are small enough).
But the second issue is much harder to address. The answer from Asad attempts to address it by starting with the enum names and parsing them to their values. And this does indeed result in the final array containing the corresponding entries for each name separately. But the code actually using the enum has no way to distinguish the two; it's really as if those two names are a single enum value, and that single enum value's probability weight is the sum of the values used for the two different names.
I.e. in your example, while the enum entries for e.g. BNeg and ABNeg will be selected separately, the code that receives these randomly selected values has no way to know whether it was BNeg or ABNeg that was selected. As far as it knows, those are just two different names for the same value.
Now, even this problem can be addressed (but not in the way that Asad attempts to…his answer is still broken). If you were, for example, to encode the probabilities in the value while still ensuring unique values for each name, you could decode those probabilities while doing the random selection and that would work. For example:
enum BloodType
{
    // BloodType = Probability * 100 + unique index
    ONeg = 4 * 100 + 0,
    OPos = 36 * 100 + 1,
    ANeg = 3 * 100 + 2,
    APos = 28 * 100 + 3,
    BNeg = 1 * 100 + 4,
    BPos = 20 * 100 + 5,
    ABNeg = 1 * 100 + 6,
    ABPos = 5 * 100 + 7,
};
Having declared your enum values that way, then you can in your selection code divide the enum value by 100 to obtain its probability weight, which then can be used as seen in the various examples. At the same time, each enum name has a unique value.
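For instance (my illustration of the decoding step):

int encoded = (int)BloodType.OPos; // 36 * 100 + 1 = 3601
int weight = encoded / 100;        // 36, the probability weight
int id = encoded % 100;            // 1, the unique per-name index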
But even solving that problem, you are still left with problems related to the choice of encoding and representation of the probabilities. For example, in the above you cannot have an enum that has more than 100 values, nor one with weights larger than (2^31 - 1) / 100; if you want an enum that has more than 100 values, you need a larger multiplier but that would limit your weight values even more.
In many scenarios (maybe all the ones you care about) this won't be an issue. The numbers are small enough that they all fit. But that seems like a serious limitation in what seems like a situation where you want a solution that is as general as possible.
And that's not all. Even if the encoding stays within reasonable limits, you have another significant limit to deal with: the random selection process requires an array large enough to contain, for each enum value, as many instances of that value as its weight. Again, if the values are small, maybe this is not a big problem. But it does severely limit the ability of your implementation to generalize.
So, what to do?
I understand the temptation to try to keep each enum type self-contained; there are some obvious advantages to doing so. But there are also some serious disadvantages that result from that, and if you truly ever try to use this in a generalized way, the changes to the solutions proposed so far will tie your code together in ways that IMHO negate most if not all of the advantage of keeping the enum types self-contained (primarily: if you find you need to modify the implementation to accommodate some new enum type, you will have to go back and edit all of the other enum types you're using…i.e. while each type looks self-contained, in reality they are all tightly coupled with each other).
In my opinion, a much better approach would be to abandon the idea that the enum type itself will encode the probability weights. Just accept that this will be declared separately somehow.
Also, IMHO it would be better to avoid the memory-intensive approach proposed in your original question and mirrored in the other two answers. Yes, this is fine for the small values you're dealing with here. But it's an unnecessary limitation, making only one small part of the logic simpler while complicating and restricting it in other ways.
I propose the following solution, in which the enum values can be whatever you want, the enum's underlying type can be whatever you want, and the algorithm uses memory proportionally only to the number of unique enum values, rather than in proportion to the sum of all of the probability weights.
In this solution, I also address possible performance concerns, by caching the invariant data structures used to select the random values. This may or may not be useful in your case, depending on how frequently you will be generating these random values. But IMHO it is a good idea regardless; the up-front cost of generating these data structures is so high that if the values are selected with any regularity at all, it will begin to dominate the run-time cost of your code. Even if it works fine today, why take the risk? (Again, especially given that you seem to want a generalized solution).
Here is the basic solution:
static T NextRandomEnumValue<T>()
{
    KeyValuePair<T, int>[] aggregatedWeights = GetWeightsForEnum<T>();
    int weightedValue =
            _random.Next(aggregatedWeights[aggregatedWeights.Length - 1].Value),
        index = Array.BinarySearch(aggregatedWeights,
            new KeyValuePair<T, int>(default(T), weightedValue),
            KvpValueComparer<T, int>.Instance);

    return aggregatedWeights[index < 0 ? ~index : index + 1].Key;
}
static KeyValuePair<T, int>[] GetWeightsForEnum<T>()
{
    object temp;

    if (_typeToAggregatedWeights.TryGetValue(typeof(T), out temp))
    {
        return (KeyValuePair<T, int>[])temp;
    }

    if (!_typeToWeightMap.TryGetValue(typeof(T), out temp))
    {
        throw new ArgumentException("Unsupported enum type");
    }

    KeyValuePair<T, int>[] weightMap = (KeyValuePair<T, int>[])temp;
    KeyValuePair<T, int>[] aggregatedWeights =
        new KeyValuePair<T, int>[weightMap.Length];
    int sum = 0;

    for (int i = 0; i < weightMap.Length; i++)
    {
        sum += weightMap[i].Value;
        aggregatedWeights[i] = new KeyValuePair<T, int>(weightMap[i].Key, sum);
    }

    _typeToAggregatedWeights[typeof(T)] = aggregatedWeights;
    return aggregatedWeights;
}
readonly static Random _random = new Random();

// Helper method to reduce verbosity in the enum-to-weight array declarations
static KeyValuePair<T1, T2> CreateKvp<T1, T2>(T1 t1, T2 t2)
{
    return new KeyValuePair<T1, T2>(t1, t2);
}

readonly static KeyValuePair<BloodType, int>[] _bloodTypeToWeight =
{
    CreateKvp(BloodType.ONeg, 4),
    CreateKvp(BloodType.OPos, 36),
    CreateKvp(BloodType.ANeg, 3),
    CreateKvp(BloodType.APos, 28),
    CreateKvp(BloodType.BNeg, 1),
    CreateKvp(BloodType.BPos, 20),
    CreateKvp(BloodType.ABNeg, 1),
    CreateKvp(BloodType.ABPos, 5),
};

readonly static Dictionary<Type, object> _typeToWeightMap =
    new Dictionary<Type, object>()
    {
        { typeof(BloodType), _bloodTypeToWeight },
    };

readonly static Dictionary<Type, object> _typeToAggregatedWeights =
    new Dictionary<Type, object>();
Note that the work of actually selecting a random value is simply a matter of choosing a non-negative random integer less than the sum of the weights, and then using a binary search to find the appropriate enum value.
Once per enum type, the code will build the table of values and weight-sums that will be used for the binary search. This result is stored in a cache dictionary, _typeToAggregatedWeights.
There are also the objects that have to be declared and which will be used at run-time to build this table. Note that the _typeToWeightMap is just in support of making this method 100% generic. If you wanted to write a differently named method for each specific type you wanted to support, each could still use a single generic method to implement the initialization and selection, but the named method would know the correct object (e.g. _bloodTypeToWeight) to use for initialization.
Alternatively, another way to avoid the _typeToWeightMap while still keeping the method 100% generic would be to have the _typeToAggregatedWeights be of type Dictionary<Type, Lazy<object>>, and have the values of the dictionary (the Lazy<object> objects) explicitly reference the appropriate weight array for the type.
In other words, there are lots of variations on this theme that would work fine. But they will all have essentially the same structure as above; semantics would be the same and performance differences would be negligible.
One thing you'll notice is that the binary search requires a custom IComparer<T> implementation. That is here:
class KvpValueComparer<TKey, TValue> :
    IComparer<KeyValuePair<TKey, TValue>> where TValue : IComparable<TValue>
{
    public readonly static KvpValueComparer<TKey, TValue> Instance =
        new KvpValueComparer<TKey, TValue>();

    private KvpValueComparer() { }

    public int Compare(KeyValuePair<TKey, TValue> x, KeyValuePair<TKey, TValue> y)
    {
        return x.Value.CompareTo(y.Value);
    }
}
This allows the Array.BinarySearch() method to correctly compare the array elements, allowing a single array to contain both the enum values and their aggregated weights, but limiting the binary search comparison to just the weights.
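Usage is then simply (my example):

BloodType bt = NextRandomEnumValue<BloodType>();
// or, to eyeball the distribution:
for (int i = 0; i < 100; i++)
{
    Console.WriteLine(NextRandomEnumValue<BloodType>());
}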
Assuming your enum values are all of type int (you can adjust this accordingly if they're long, short, or whatever):
static TEnum RandomEnumValue<TEnum>(Random rng)
{
    var vals = Enum
        .GetNames(typeof(TEnum))
        .Aggregate(Enumerable.Empty<TEnum>(), (agg, curr) =>
        {
            var value = Enum.Parse(typeof(TEnum), curr);
            return agg.Concat(Enumerable.Repeat((TEnum)value, (int)value)); // For int enums
        })
        .ToArray();
    return vals[rng.Next(vals.Length)];
}
Here's how you would use it:
var rng = new Random();
var randomBloodType = RandomEnumValue<BloodType>(rng);
People seem to have their knickers in a knot about multiple indistinguishable enum values in the input enum (for which I still think the above code provides the expected behavior). Note that there is no answer here, not even Peter Duniho's, that will allow you to distinguish enum entries when they have the same value, so I'm not sure why this is being considered a metric for any potential solutions.
Nevertheless, an alternative approach that doesn't use the enum values as probabilities is to use an attribute to specify the probability:
public enum BloodType
{
    [P(4)]
    ONeg,
    [P(36)]
    OPos,
    [P(3)]
    ANeg,
    [P(28)]
    APos,
    [P(1)]
    BNeg,
    [P(20)]
    BPos,
    [P(1)]
    ABNeg,
    [P(5)]
    ABPos
}
Here is what the attribute used above looks like:
[AttributeUsage(AttributeTargets.Field, AllowMultiple = false)]
public class PAttribute : Attribute
{
    public int Weight { get; private set; }

    public PAttribute(int weight)
    {
        Weight = weight;
    }
}
and finally, this is what the method to get a random enum value would look like:
static TEnum RandomEnumValue<TEnum>(Random rng) // requires: using System.Reflection;
{
    var vals = Enum
        .GetNames(typeof(TEnum))
        .Aggregate(Enumerable.Empty<TEnum>(), (agg, curr) =>
        {
            var value = Enum.Parse(typeof(TEnum), curr);
            FieldInfo fi = typeof(TEnum).GetField(curr);
            var weight = ((PAttribute)fi.GetCustomAttribute(typeof(PAttribute), false)).Weight;
            return agg.Concat(Enumerable.Repeat((TEnum)value, weight)); // For int enums
        })
        .ToArray();
    return vals[rng.Next(vals.Length)];
}
(Note: if this code is performance critical, you might need to tweak this and add caching for the reflection data).
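A minimal sketch of that caching (my addition; CachedRandomEnumValue and _weightedValues are hypothetical names):

// requires: using System.Collections.Concurrent; using System.Reflection;
static readonly ConcurrentDictionary<Type, Array> _weightedValues =
    new ConcurrentDictionary<Type, Array>();

static TEnum CachedRandomEnumValue<TEnum>(Random rng)
{
    // Build the weighted array once per enum type, then reuse it.
    var vals = (TEnum[])_weightedValues.GetOrAdd(typeof(TEnum), _ =>
        Enum.GetNames(typeof(TEnum))
            .Aggregate(Enumerable.Empty<TEnum>(), (agg, curr) =>
            {
                var value = (TEnum)Enum.Parse(typeof(TEnum), curr);
                var weight = ((PAttribute)typeof(TEnum).GetField(curr)
                    .GetCustomAttribute(typeof(PAttribute), false)).Weight;
                return agg.Concat(Enumerable.Repeat(value, weight));
            })
            .ToArray());

    return vals[rng.Next(vals.Length)];
}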
Some of this you can do, and some of it isn't so easy. I believe the following extension method will do what you describe.
static public class Util {
    static Random rnd = new Random();

    static public int PriorityPickEnum(this Enum e) {
        // The approved types for an enum are byte, sbyte, short, ushort, int, uint, long, or ulong.
        // However, Random only supports an int (or double) as a max value. Either way,
        // it doesn't have the range for uint, long and ulong.

        // Sum the enum values.
        int sum = 0;
        foreach (var x in Enum.GetValues(e.GetType())) {
            sum += Convert.ToInt32(x);
        }

        var i = rnd.Next(sum); // get a random value; it will form a ratio i / sum

        // Enums may not have a uniform (incremented) value range (think about flags),
        // so we have to step through to get to the range we want. This is due to the
        // requirement that the return value have a probability proportional to its
        // value. Note: enum values must be sorted for this to work.
        foreach (var x in Enum.GetValues(e.GetType()).OfType<Enum>().OrderBy(a => a)) {
            i -= Convert.ToInt32(x);
            if (i <= 0) return Convert.ToInt32(x);
        }
        throw new Exception("This doesn't seem right");
    }
}
Here is an example of using this extension:
BloodType bt = BloodType.ABNeg;
for (int i = 0; i < 100; i++) {
    var v = (BloodType)bt.PriorityPickEnum();
    Console.WriteLine("{0}: {1}({2})", i, v, (int)v);
}
This should work pretty well for enums of type byte, sbyte, ushort, short and int. Once you get beyond int (uint, long, ulong) the problem is the Random class. You can adjust the code to use doubles generated by Random, which would cover uint, but the Random class just doesn't have the range to cover long and ulong. Of course you could use, find, or write a different Random class if this is important.

Writing a C# version of Haskell infinite Fibonacci series function

Note: The point of this question is more from a curiosity perspective. I want to know out of curiosity whether it is even possible to transliterate the Haskell implementation into a functional C# equivalent.
So I've been learning myself Haskell for great good, and while solving Project Euler problems I ran into this beautiful Haskell Fibonacci implementation:
fibs :: [Integer]
fibs = 1:1:zipWith (+) fibs (tail fibs)
Of course I was tempted to write a C# version. If I do this:
IEnumerable<int> fibs =
    Enumerable.Zip(Enumerable.Concat(new int[] { 1, 1 }, fibs),
                   //^^ error
                   fibs.Skip(1), (f, s) => f + s);
The error says use of unassigned local variable fibs.
So I went slightly imperative, while this compiles...
public static IEnumerable<int> Get()
{
    return Enumerable.Zip(Enumerable.Concat(new int[] { 1, 1 }, Get()),
                          Get().Skip(1), (f, s) => f + s);
}
It breaks with a stack overflow exception! So I came here.
Questions:
Can anyone think of a functional C# equivalent that works?
I'd like some insight into why my solutions don't work.
The answer to your first question is: this is how to do it in C#:
using System;
using System.Collections.Generic;
using System.Linq;

class P
{
    static IEnumerable<int> F()
    {
        yield return 1;
        yield return 1;
        foreach (int i in F().Zip(F().Skip(1), (a, b) => a + b))
            yield return i;
    }

    static void Main()
    {
        foreach (int i in F().Take(10))
            Console.WriteLine(i);
    }
}
The answer to your second question is: C# is eager by default, so your method has an unbounded recursion. Iterators that use yield however return an enumerator immediately, but do not construct each element until required; they are lazy. In Haskell everything is lazy automatically.
UPDATE: Commenter Yitz points out correctly that this is inefficient because, unlike Haskell, C# does not automatically memoize the results. It's not immediately clear to me how to fix it while keeping this bizarre recursive algorithm intact.
Of course you would never actually write fib like this in C# when it is so much easier to simply:
static IEnumerable<BigInteger> Fib()
{
    BigInteger prev = 0;
    BigInteger curr = 1;
    while (true)
    {
        yield return curr;
        var next = curr + prev;
        prev = curr;
        curr = next;
    }
}
Unlike the C# version provided in Eric Lippert's answer, this F# version avoids repeated computation of elements and therefore has comparable efficiency with Haskell:
let rec fibs =
    seq {
        yield 1
        yield 1
        for (a, b) in Seq.zip fibs (Seq.skip 1 fibs) do
            yield a + b
    }
    |> Seq.cache // this is critical for O(N)!
I have to warn you that I'm trying to fix your attempts, not to produce production code.
Also, this solution is good for making our brains explode, and maybe the computer as well.
In your first snippet you tried to refer recursively to your field or local variable; that is not possible. Instead we can try a lambda, which could be more similar to that. We know from Church that this is also not possible, at least in the traditional way: lambda expressions are unnamed; you can't call them by name (inside of the implementation). But you can use a fixed point combinator to do recursion. If you are of sane mind there is a big chance you don't know what that is; anyway, you should give this link a try before continuing.
fix :: (a -> a) -> a
fix f = f (fix f)
This would be the C# implementation (which is wrong):
A fix<A>(Func<A, A> f) {
    return f(fix(f));
}
Why is it wrong? Because fix(f) produces a beautiful stack overflow. So we need to make it lazy:
A fix<A>(Func<Func<A>, A> f) {
    return f(() => fix(f));
}
Now it is lazy! Actually you will see a lot of this in the following code.
In your second snippet, and also in the first, you have the problem that the second argument to Enumerable.Concat is not lazy, so you will get a stack overflow exception, or an infinite loop in the idealistic case. So let's make it lazy.
static IEnumerable<T> Concat<T>(IEnumerable<T> xs, Func<IEnumerable<T>> f) {
    foreach (var x in xs)
        yield return x;
    foreach (var y in f())
        yield return y;
}
Now, we have the whole "framework" to implement what you have tried in the functional way.
void play() {
    Func<Func<Func<IEnumerable<int>>>, Func<IEnumerable<int>>> F = fibs => () =>
        Concat(new int[] { 1, 1 },
               () => Enumerable.Zip(fibs()(), fibs()().Skip(1), (a, b) => a + b));

    // let's see some action
    var n5 = fix(F)().Take(5).ToArray();   // instant
    var n20 = fix(F)().Take(20).ToArray(); // relatively fast
    var n30 = fix(F)().Take(30).ToArray(); // this will take a lot of time to compute
    //var n40 = fix(F)().Take(40).ToArray(); // !!! OutOfMemoryException
}
I know that the signature of F is ugly as hell, but this is why languages like Haskell exist, and even F#. C# is not made for this kind of work.
Now, the question is: why can Haskell achieve something like this? Because whenever you say something like
a :: Int
a = 4
the most similar translation in C# is:
Func<int> a = () => 4;
Actually it is much more involved on the Haskell side, but this is the idea of why a similar way of solving a problem looks so different when you write it in both languages.
Here it is for Java, dependent on Functional Java:
final Stream<Integer> fibs = new F2<Integer, Integer, Stream<Integer>>() {
public Stream<Integer> f(final Integer a, final Integer b) {
return cons(a, curry().f(b).lazy().f(a + b));
}
}.f(1, 2);
For C#, you could depend on XSharpX
A take on Eric's answer that has Haskell-equivalent performance, but still has other issues (thread safety, no way to free the memory).
private static List<int> fibs = new List<int>() { 1, 1 };

static IEnumerable<int> F()
{
    foreach (var fib in fibs)
    {
        yield return fib;
    }

    int a, b;
    while (true)
    {
        a = fibs.Last();
        b = fibs[fibs.Count - 2];
        fibs.Add(a + b);
        yield return a + b;
    }
}
Translating from a Haskell environment to a .NET environment is much easier if you use F#, Microsoft's functional declarative language similar to Haskell.

C++ Rvalue references and move semantics

C++03 had the problem of unnecessary copies that could happen implicitly. For this purpose, C++11 introduced rvalue references and move semantics. Now my question is: does this unnecessary copying problem also exist in languages such as C# and Java, or was it only a C++ problem? In other words, do rvalue references make C++11 even more efficient compared to C# or Java?
As far as C# is concerned (operator overloading is allowed in it), let's say we have a mathematical vector class, and we use it like this:
vector_a = vector_b + vector_c;
The compiler will surely transform vector_b + vector_c into some temporary object (let's call it vector_tmp).
Now I don't think C# can differentiate between a temporary rvalue such as vector_tmp and an lvalue such as vector_b, so we'll have to copy data to vector_a anyway, which can easily be avoided by using rvalue references and move semantics in C++11.
Class references in C# and Java have some properties of shared_ptrs in C++. However, rvalue references and move semantics relate more to temporary value types, but the value types in C# are quite non-flexible compared to C++ value types, and from my own C# experience, you'll end up with classes, not structs, most of the time.
So my assumption is that neither Java nor C# would profit much from those new C++ features, which let code make safe assumptions about whether something is a temporary, and instead of copying let it just steal the content.
Yes, unnecessary copy operations exist in C# and Java.
does rvalue references make C++11 even more efficient as compared to C# or Java?
The answer is yes. :)
Because classes in Java and C# use reference semantics, there are never any implicit copies of objects in those languages. The problem move semantics solve does not and has never existed in Java and C#.
I think it could occur in Java. See the add and add_to operations below. add creates a new result object to hold the result of the matrix addition, while add_to merely adds the rhs to this.
class Matrix {
    public static final int w = 2;
    public static final int h = 2;
    public float[] data;

    Matrix(float v) {
        data = new float[w * h];
        for (int i = 0; i < w * h; ++i) {
            data[i] = v;
        }
    }

    // Creates a new Matrix by adding this and rhs
    public Matrix add(Matrix rhs) {
        Matrix result = new Matrix(0.0f);
        for (int i = 0; i < w * h; ++i) {
            result.data[i] = this.data[i] + rhs.data[i];
        }
        return result;
    }

    // Just adds the values in rhs to this
    public Matrix add_to(Matrix rhs) {
        for (int i = 0; i < w * h; ++i) {
            this.data[i] += rhs.data[i];
        }
        return this;
    }

    // (subtract and subtract_from would be defined analogously)

    public static void main(String[] args) {
        Matrix m = new Matrix(0.0f);
        Matrix n = new Matrix(1.0f);
        Matrix o = new Matrix(1.0f);

        // Chaining these ops would modify m:
        // Matrix result = m.add_to(n).subtract_from(o);
        m.add_to(n);        // Adds n to m
        m.subtract_from(o); // Subtracts o from m

        // Can chain ops without modifying m,
        // but temps are created to hold the results of each step
        Matrix result = m.add(n).subtract(o);
    }
}
Thus, I think it depends on what sort of functionality you're providing to the user with your classes.
The problem comes up a lot. Sometimes I want to hold onto a unique copy of an object that no one else can modify. How do I do that?
Make a deep copy of whatever object someone gives me? That would work, but it's not efficient.
Ask people to give me a new object and not to keep a copy? That's faster if you're brave. Bugs can come from a completely unrelated piece of code modifying the object hours later.
C++ style: Move all the items from the input to my own new object. If the caller accidentally tries to use the object again, he will immediately see the problem.
Sometimes a C# read only collection can help. But in my experiences that's usually a pain at best.
Here's what I'm talking about:
class LongLivedObject
{
    private Dictionary<string, string> _settings;

    public LongLivedObject(Dictionary<string, string> settings)
    {
        // In C# this always duplicates the data structure and takes O(n) time.
        // C++ will automatically try to decide if it could do a swap instead.
        // C++ always lets you explicitly say you want to do the swap.
        _settings = new Dictionary<string, string>(settings);
    }
}
This question is at the heart of Clojure and other functional languages!
In summary, yes, I often wish I had C++11 style data structures and operations in C#.
You can try to emulate move semantics. For instance, in Trade-Ideas Philip's example you can pass a custom MovableDictionary instead of Dictionary:
public class MovableDictionary<K, V> // : IDictionary<K, V>, IReadOnlyDictionary<K, V>...
{
    private Dictionary<K, V> _map;

    // Implement all of Dictionary<K, V>'s methods by calling Map's ones.

    public Dictionary<K, V> Move()
    {
        var result = Map;
        _map = null;
        return result;
    }

    private Dictionary<K, V> Map
    {
        get
        {
            if (_map == null)
                _map = new Dictionary<K, V>();
            return _map;
        }
    }
}
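Usage would then look something like this (my sketch):

var source = new MovableDictionary<string, string>();
// ... populate source ...

var stolen = source.Move(); // O(1): takes ownership of the inner dictionary;
                            // source lazily starts over with a fresh, empty one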

Linq late binding confusion

Can someone please explain what I am missing here? Based on my basic understanding, a LINQ result will be calculated when the result is actually used, and I can see that in the following code.
static void Main(string[] args)
{
    Action<IEnumerable<int>> print = (x) =>
    {
        foreach (int i in x)
        {
            Console.WriteLine(i);
        }
    };

    int[] arr = { 1, 2, 3, 4, 5 };
    int cutoff = 1;
    IEnumerable<int> result = arr.Where(x => x < cutoff);

    Console.WriteLine("First Print");
    cutoff = 3;
    print(result);

    Console.WriteLine("Second Print");
    cutoff = 4;
    print(result);

    Console.Read();
}
Output:
First Print
1
2
Second Print
1
2
3
Now I changed the
arr.Where(x => x < cutoff);
to
IEnumerable<int> result = arr.Take(cutoff);
and the output is as follows:
First Print
1
Second Print
1
Why, with Take, does it not use the current value of the variable?
The behavior you're seeing comes from the different ways in which the arguments to the LINQ functions are evaluated. The Where method receives a lambda which captures the variable cutoff by reference. It is evaluated on demand and hence sees the value of cutoff at that time.
The Take method (and similar methods like Skip) takes an int parameter, and hence cutoff is passed by value. The value used is the value of cutoff at the moment the Take method is called, not when the query is evaluated.
Note: the term late binding here is a bit incorrect. Late binding generally refers to the process where the members an expression binds to are determined at runtime vs. compile time. In C# you'd accomplish this with dynamic or reflection. The behavior of LINQ evaluating its parts on demand is known as deferred (or delayed) execution.
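As an aside: if you want the Take query to track the current value of cutoff, you can reintroduce a capture yourself. A small sketch reusing the question's variables (my addition, not part of the original answer):

// Build the query lazily: the lambda captures the variable cutoff,
// so every call sees its current value.
Func<IEnumerable<int>> result = () => arr.Take(cutoff);

cutoff = 3;
print(result()); // 1, 2, 3
cutoff = 4;
print(result()); // 1, 2, 3, 4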
There's a few different things getting confused here.
Late-binding: This is where the meaning of code is determined after it was compiled. For example, x.DoStuff() is early-bound if the compiler checks that objects of x's type have a DoStuff() method (considering extension methods and default arguments too) and then produces the call to it in the code it outputs, or fails with a compiler error otherwise. It is late-bound if the search for the DoStuff() method is done at run-time and throws a run-time exception if there was no DoStuff() method. There are pros and cons to each, and C# is normally early-bound but has support for late-binding (most simply through dynamic but the more convoluted approaches involving reflection also count).
Delayed execution: Strictly speaking, all Linq methods immediately produce a result. However, that result is an object which stores a reference to an enumerable object (often the result of the previous Linq method) which it will process in an appropriate manner when it is itself enumerated. For example, we can write our own Take method as:
private static IEnumerable<T> TakeHelper<T>(IEnumerable<T> source, int number)
{
    foreach (T item in source)
    {
        yield return item;
        if (--number == 0)
            yield break;
    }
}

public static IEnumerable<T> Take<T>(this IEnumerable<T> source, int number)
{
    if (source == null)
        throw new ArgumentNullException();
    if (number < 0)
        throw new ArgumentOutOfRangeException();
    if (number == 0)
        return Enumerable.Empty<T>();
    return TakeHelper(source, number);
}
Now, when we use it:
var taken4 = someEnumerable.Take(4); // taken4 has a value, so we've already done
                                     // something. If it was going to throw
                                     // an argument exception it would have done so
                                     // by now.

var firstTaken = taken4.First();     // only now does the object in taken4
                                     // do the further processing that iterates
                                     // through someEnumerable.
Captured variables: Normally when we make use of a variable, we make use of its current state:
int i = 2;
string s = "abc";
Console.WriteLine(i);
Console.WriteLine(s);
i = 3;
s = "xyz";
It's pretty intuitive that this prints 2 and abc and not 3 and xyz. In anonymous functions and lambda expressions, though, when we make use of a variable we are "capturing" the variable itself, and so we will end up using the value it has when the delegate is invoked:
int i = 2;
string s = "abc";
Action λ = () =>
{
Console.WriteLine(i);
Console.WriteLine(s);
};
i = 3;
s = "xyz";
λ();
Creating the λ doesn't use the values of i and s, but creates a set of instructions as to what to do with i and s when λ is invoked. Only when that happens are the values of i and s used.
Putting it all together: In none of your cases do you have any late-binding. That is irrelevant to your question.
In both you have delayed execution. Both the call to Take and the call to Where return enumerable objects which will act upon arr when they are enumerated.
In only one do you have a captured variable. The call to Take passes an integer directly to Take and Take makes use of that value. The call to Where passes a Func<int, bool> created from a lambda expression, and that lambda expression captures an int variable. Where knows nothing of this capture, but the Func does.
That's the reason the two behave so differently in how they treat cutoff.
Take doesn't take a lambda, but an integer; as such, it can't change when you change the original variable.
