c# lambda expression complexity - c#

A simple question regarding lambda expression
I wanted to get an average of all trades in following code. The formula I am using is ((price 1*qty 1+(price 2*qty 2)....+(price n*qty n)/(qty 1+qty 2+...+qty n)
In following code, I am using sum function to calculate total of (price*qty) and the complexity will be O(n) and once more time to add up all qty the complexity will be O(n). So, is there any way I can find a summation of both using complexity O(n) means a single lambda expression which can calculate both results.
Using for loop, I can calculate both results in O(n) complexity.
class Program
{
static void Main(string[] args)
{
List<Trade> trades = new List<Trade>()
{
new Trade() {price=2,qty=2},
new Trade() {price=3,qty=3}
};
///using lambda
int price = trades.Sum(x => x.price * x.qty);
int qty = trades.Sum(x => x.qty);
///using for loop
int totalPriceQty=0, totalQty=0;
for (int i = 0; i < trades.Count; ++i)
{
totalPriceQty += trades[i].price * trades[i].qty;
totalQty += trades[i].qty;
}
Console.WriteLine("Average {0}", qty != 0 ? price / qty : 0);
Console.Read();
}
}
class Trade
{
public int price;
public int qty;
}
Edit: I know that the coefficient does not count. Let me rephrase the question by saying that with lambda we are going to go through each element in the list twice while with for loop we are going to go through each element only once. Is there any solution with lambda so it does not have to go through list elements twice?

As mentioned the Big-O is unchanged, whether you iterate once or twice. If you wanted to use Linq to iterate just once you could use a custom aggregator (since you are reducing to the same properties we can just use an instance of Trade for aggregation):
var ag = trades.Aggregate(new Trade(),
(agg, trade) =>
{
agg.price += trade.price * trade.qty;
agg.qty += trade.qty;
return agg;
});
int price = ag.price;
int qty = ag.qty;
At this point personally I would just use a foreach loop or the simple lambdas you already have - unless performance is crucial here (measure it!)

Big-O complexity does not take consideration of constant coefficients. O(n) + O(n) still gives O(n).
If you’re determined you want to it in lambda, here’s an example using the Aggregate operator. It looks contrived, and I wouldn’t recommend it over a traditional for loop.
var result = trades.Aggregate(
Tuple.Create(0, 0),
(acc, trade) =>
Tuple.Create(acc.Item1 + trade.price * trade.qty, acc.Item2 + trade.qty));
int totalPrice = result.Item1;
int totalQuantity = result.Item2;

To expand on BrokenGlass's answer, you could also use an anonymous type as your aggregator like so:
var result = trades.Aggregate(
new { TotalValue = 0L, TotalQuantity = 0L },
(acc, trade) => new
{
TotalValue = acc.TotalValue + trade.price,
TotalQuantity = acc.TotalQuantity + trade.qty
}
);
There are two small upsides to this approach:
If you're doing this kind of calculation on a large number of trades (or on large trades), it's possible that you would overflow the int that keep track of total value of trades and total shares. This approach lets you specify long as your data type (so it would take longer to overflow).
The object you get back from this aggregation will have more meaningful properties than you would if you simply returned a Trade object.
The big downside is that it's kinda weird to look at.

Related

Is there a way to find the closest number in an Array to another inputed number?

So I have a Visualstudio Forms where I have a NumericUpDown function that will allow users to input a 5 digit number such as 09456. And I need to be able to compare that number to an already existing array of similar 5 digit numbers, so essentially I need to get the inputted number and find the closest number to that.
var numbers = new List<float> {89456f, 23467f, 86453f, };
// the list is way longer but you get the idea
var target = numericUpDown.3 ;
var closest = numbers.Select(n => new { n, (n - target) })
.OrderBy(p => p.distance)
.First().n;
But the first problem I encounter is that I cannot use a "-" operation on a float. Is there any way I can avoid that error and be able to still find the closest input?
Anonymous type members need names, and you need to use the absolute value of the difference. eg
var numbers = new List<float> { 89456f, 23467f, 86453f, };
var target = 3;
var closest = numbers.Select(n => new { n, distance = Math.Abs(n - target) })
.OrderBy(p => p.distance)
.First().n;
Well, apart from some issues in your sample(like no distance property on float) it should work:
int target = 55555;
float closest = numbers.OrderBy(f => Math.Abs(f - target)).First();
Demo: https://dotnetfiddle.net/gqS50L
The answers that use OrderBy are correct, but have less than optimal performance. OrderBy is an O(N log N) operation, but why sort the whole collection when you only need the top element? By contrast, MinBy will give you the result in O(N) time:
var closest = numbers.MinBy(n => Math.Abs(n - target));
Apart from the compilation errors, using LINQ for this is very slow and time consuming. The entire list has to be scanned once to find the distance, then it needs to be sorted, which scans it all over again and caches the results before returning them in order.
Before .NET 6
A faster way would be to iterate only once, calculating the distance of the current item from the target, and keep track of which number is closest. That's how eg Min and Max work.
public static float? Closest(this IEnumerable<float> list, float target)
{
float? closest=null;
float bestDist=float.MaxValue;
foreach(var n in list)
{
var dist=Math.Abs(n-target);
if (dist<bestDist)
{
bestDist=dist;
closest=n;
}
}
return closest;
}
This will return the closest number in a single pass.
var numbers = new List<float> { 89456f, 23467f, 86453f, };
var closest=numbers.Closest(20000);
Console.WriteLine($"Closest is {closest}");
------------------
Closest is 23467
Using MoreLINQ and MinBy
The same can be done in a single line using the MinBy extension method from the MoreLINQ library:
var closest=numbers.MinBy(n=>Math.Abs(n-target));
Using MinBy
In .NET 6 and later, Enumerable.MinBy was added to the BCL:
var closest=numbers.MinBy(n=>Math.Abs(n-target));
The code is similar to the explicit loop once you look past the generic key selectors and comparers :
while (e.MoveNext())
{
TSource nextValue = e.Current;
TKey nextKey = keySelector(nextValue);
if (nextKey != null && comparer.Compare(nextKey, key) < 0)
{
key = nextKey;
value = nextValue;
}
}

C# - Best Way to Match 2 items from a List Without Nested Loops

Say i have a list that hold minitues of film durations called
filmDurations in type of int.
And i have a int parameter called flightDuration for a duration
of any given flight in minitues.
My objective is :
For any given flightDuration, i want to match 2 film from my filmDurations that their sums exactly finishes 30 minutes from flight.
For example :
filmDurations = {130,105,125,140,120}
flightDuration = 280
My output : (130 120)
I can do it with nested loops. But it is not effective and it is time consuming.
I want to do it more effectively.
I thinked using Linq but still it is O(n^2).
What is the best effective way?
Edit: I want to clear one thing.
I want to find filmDurations[i] + filmDurations[j] in;
filmDurations[i] + filmDurations[j] == fligtDuration - 30
And say i have very big amont of film durations.
You could sort all durations (remove duplicates) (O(n log n)) and than iterate through them (until the length flight-duration -30). Search for the corresponding length of the second film (O(log n)).
This way you get all duration-pairs in O(n log n).
You can also use a HashMap (duration -> Films) to find matching pairs.
This way you can avoid sorting and binary search. Iterate through all durations and look up in the map if there are entries with duration = (flight-duration -30).
Filling the map needs O(n) lookup O(1) and you need to iterate all durations.
-> Over all complexity O(n) but you loose the possibility to find 'nearly matching pairs which would be easy to implement using the sorted list approach described above)
As Leisen Chang said you can put all items into dictionary. After doing that rewrite your equation
filmDurations[i] + filmDurations[j] == fligtDuration - 30
as
filmDurations[i] == (fligtDuration - 30 - filmDurations[j])
Now for each item in filmDurations search for (fligtDuration - 30 - filmDurations[j]) in dictionary. And if such item found you have a solution.
Next code implement this concept
public class IndicesSearch
{
private readonly List<int> filmDurations;
private readonly Dictionary<int, int> valuesAndIndices;
public IndicesSearch(List<int> filmDurations)
{
this.filmDurations = filmDurations;
// preprocessing O(n)
valuesAndIndices = filmDurations
.Select((v, i) => new {value = v, index = i})
.ToDictionary(k => k.value, v => v.index);
}
public (int, int) FindIndices(
int flightDuration,
int diff = 30)
{
// search, also O(n)
for (var i = 0; i < filmDurations.Count; ++i)
{
var filmDuration = filmDurations[i];
var toFind = flightDuration - filmDuration - diff;
if (valuesAndIndices.TryGetValue(toFind, out var j))
return (i, j);
}
// no solution found
return (-1, -1); // or throw exception
}
}

ParallelEnumerable.Aggregate for several methods

Start to learn multithreading. Have 3 methods to calculate a sum, average, and product of square roots of an array.
At first, I make three separate blocking calls using PLINQ. Then I thought that it would be nice to be able to make it in a single call and return an object with sum, product, and average at the same time. I read that ParallelEnumerable.Aggregate can help me with this, but I totally don't know how to use it.
I would be really grateful for some explanation how to use this function in my case, good/bad aspects of this approach.
public static double Average(double[] array, string tool)
{
if (array == null) throw new ArgumentNullException(nameof(array));
double sum = Enumerable.Sum(array);
double result = sum / array.Length;
Print(tool, result);
return result;
}
public static double Sum(double[] array, string tool)
{
if (array == null) throw new ArgumentNullException(nameof(array));
double sum = Enumerable.Sum(array);
Print(tool, sum);
return sum;
}
public static void ProductOfSquareRoots(double[] array, string tool)
{
if (array == null) throw new ArgumentNullException(nameof(array));
double result = 1;
foreach (var number in array)
{
result = result * Math.Sqrt(number);
}
Print(tool, result);
}
The three aggregated values (average, sum and product of square roots) that you want to compute can each be computed by performing a single pass over the numbers. Instead of doing this three times (one for each aggregated value) you can do this once and aggregate the three values inside the loop (this should save time).
The average is the sum divided by the count and as you already are computing the sum you only need the count in addition to get the average. If you know the size of the input you don't even have to count the items but here I assume that the size of the input is unknown in advance.
If you want to use LINQ you can use Aggregate:
var aggregate = numbers.Aggregate(
// Starting value for the accumulator.
(Count: 0, Sum: 0D, ProductOfSquareRoots: 1D),
// Update the accumulator with a specific number.
(accumulator, number) =>
{
accumulator.Count += 1;
accumulator.Sum += number;
accumulator.ProductOfSquareRoots *= Math.Sqrt(number);
return accumulator;
});
The variable aggregate is a ValueTuple<int, double, double> with the items Count, Sum and ProductOfSquareRoots. Before C# 7 you would use an anonymous type. However, that would require an allocation for each value in the input sequence slowing down the aggregation. By using a mutable value tuple the aggregation should become faster.
Aggregate works with PLINQ so if numbers is of type ParallelQuery<T> and not IEnumerable<T> then the aggregation will be performed in parallel. Notice that this requires the aggregation to be both associative (e.g. (a + b) + c = a + (b + c) and commutative (e.g. a + b = b + a) which in your case is true.
PLINQ has an overhead so it might not perform better compared to single threaded LINQ depending on the number of elements in your sequence and how complex the calculations are. You will have to measure this yourself to determine if PLINQ speeds things up. However, you can use the same Aggregate expression in both LINQ and PLINQ making your code easy to switch from single threaded to parallel by inserting AsParallel() the right place.
Note: you must initialize the result variable with the value 1, because otherwise you will always get 0.
Note 2: instead of Enumerable.Sum(array), just write array.Sum().
No, the Aggregate method won't help you to calculate the three functions at the same time. See Martin Liversage answer.
KISS ;)
if (array == null) throw new ArgumentNullException(nameof(array));
var sum = array.Sum();
var average = array.Average();
var product = array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val));
Can be simplified:
var average = sum / array.Length;
This eliminates an extra pass through the array.
Want to parallelize?
var sum = array.AsParallel().Sum();
//var average = array.AsParallel().Average(); // Extra pass!
var average = sum / array.Length; // More fast! Really!
var product = array.AsParallel().Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val));
However, it will probably be slower than the previous method. Such paralleling is justified only for very large collections, in billions of elements.
Each pass through the collection takes time. The less passes, the better the performance. From one we have already disposed of, when calculating the average. Let's do just one.
double sum = 0;
double product = 1;
foreach (var number in array)
{
sum += number;
product = product * Math.Sqrt(number);
}
double average = sum / array.Length;
Three results in one pass! We are the best!
Let's get back to the subject.
The Parallel.Invoke method allows you to execute several functions in parallel, but it does not get the results from them. It is suitable for calculations of the type "fire and forget".
We can parallelize the computation by running multiple tasks. With help of Task.WhenAll waiting for them all completed and get the result.
var results = await Task.WhenAll(
Task.Run(() => array.Sum()),
Task.Run(() => array.Average()),
Task.Run(() => array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val)))
);
var sum = results[0];
var average = results[1];
var product = results[2];
It is also not effective for a small size collection. But it may be more efficient than the AsParallel in some cases.
Another way of writing this approach with tasks. Perhaps it will seem clearer.
var sumTask = Task.Run(() => array.Sum());
var avgTask = Task.Run(() => array.Average());
var prodTask = Task.Run(() => array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val)));
Task.WaitAll(sumTask, avgTask, prodTask);
sum = sumTask.Result;
average = avgTask.Result;
product = prodTask.Result;

Could C# Linq do combinatorics?

I have this data structure:
class Product
{
public string Name { get; set; }
public int Count { get; set; }
}
var list = new List<Product>(){ { Name = "Book", Count = 40}, { Name = "Car", Count = 70}, { Name = "Pen", Count = 60}........... } // 500 product object
var productsUpTo100SumCountPropert = list.Where(.....) ????
// productsUpTo100SumCountPropert output:
// { { Name = "Book", Count = 40}, { Name = "Pen", Count = 60} }
I want to sum the Count properties of the collection and return only products objects where that property Count sum is less than or equal to 100.
If is not possible with linq, what is a better approach that I can use?
Judging from the comments you've left on other peoples' answers and your gist (link), it looks like what you're trying to solve is in fact the Knapsack Problem - in particular, the 0/1 Knapsack Problem (link).
The Wikipedia page on this topic (that I linked to) has a short dynamic programming solution for you. It has pseudo-polynomial running time ("pseudo" because the complexity depends on the capacity you choose for your knapsack (W).
A good preprocessing step to take before running the algorithm is to find the greatest common denominator (GCD) of all of your item weights (w_i) and then divide it out of each value.
d <- GCD({w_1, w_2, ..., w_N})
w_i' <- w_i / d //for each i = 1, 2, ..., N
W' <- W / d //integer division here
Then solve the problem using the modified weights and capacity instead (w_i' and W').
The greedy algorithm you use in your gist won't work very well. This better algorithm is simple enough that it's worth implementing.
You need the Count extension method
list.Count(p => p.Count <= 100);
EDIT:
If you want the sum of the items, Where and Sum extension methods could be utilized:
list.Where(p => p.Count <= 100).Sum(p => p.Count);
list.Where(p=> p.Count <= 100).ToList();

iterating with Linq

I am trying to find a way to access previous values from a Linq method in the same line.
I want to be able to use this general form in Linq:
var values = Enumerable.Range( 1, 100 ).Select( i => i + [last result] );
But I can't find a way to do something like this without multi-line lambda's and storing the results somewhere else.
So the best fibonacci sum I've been able to do in Linq is:
List<int> calculated = new List<int>( new int[] { 1, 2 });
var fibonacci = Enumerable.Range(2, 10).Select(i =>
{
int result = calculated[i - 2] + calculated[i - 1];
calculated.Add(result);
return result; // and how could I just put the result in fibonacci?
}
);
Which seems ugly. I could do this in less code with a regular for loop.
for (int i = 2; i < 10; i++)
{
calculated.Add(calculated[i - 2] + calculated[i - 1]);
}
It seems like if I could find a way to do this, I could use Linq to do a lot of Linear programming and sum a lot of iterative formulas.
If you are looking for a way to create a Fibonacci sequence generator, you would be better off writing your own generator function instead of using Linq extension methods. Something like this:
public static IEnumerable<int> Fibonacci()
{
int a = 1;
int b = 0;
int last;
for (;;) {
yield return a;
last = a;
a += b;
b = last;
}
}
Then you can apply Linq methods to this enumerable to achieve the result you want (try iterating over Fibonacci().Take(20) for example).
Linq extension methods are not the solution for every programming problem, and I can only imagine how horrid a pure LINQ Fibonacci sequence generator would look.
Closest you can come to something like that with LINQ is the IEnumerable.Aggregate method (a.k.a. fold from functional programming). You can use it to, for example, sum up the squares of a collection, like:
int sumSquares = list.Aggregate(0, (sum, item) => sum + item * item);
Since in LINQ the values are retrieved from a collection using an enumerator, i.e. they are taken one by one, by definition, there is no concept of "previous item". The items could even be generated and discarded on the fly, using some yield return magic. That said, you could always use some hack like:
long a= 1;
long b= 1;
var fibonacci = Enumerable.Range(1,20).Select(i => {
long last= a + b;
b = a;
a = last;
return last;
});
but the moment you have to use and modify an outside variable to make the lambdas work, you are in code-smell territory.

Categories

Resources