Optimizing code for range selection - c#

Given an array and two more arrays, I need to count the elements of the first array that fall within each range.
E.g. MainArray={2,4,6,5,8,9}, range1={4,5,6}, range2={6,9,8}
For the first iteration I have to select elements in MainArray in range [4,6] -> [4,6,5], so the output is 3
For the second iteration I have to select elements in MainArray in range [5,9] -> [6,5,8,9], so the output is 4
For the third iteration I have to select elements in MainArray in range [6,8] -> [6,8], so the output is 2
Array returned: [3,4,2]
static void Main(string[] args)
{
var rng = new Random();
var result = processFunc(Enumerable.Range(0, 5000000).OrderBy(x => rng.Next()).ToArray(),
Enumerable.Range(0, 20000).OrderBy(x => rng.Next()).Take(200).ToArray(),
Enumerable.Range(0, 20000).OrderBy(x => rng.Next()).Take(200).ToArray());
}
public static int[] processFunc(int[] scores,int[] l,int[] r)
{
IList<int> output = new List<int>();
for (int i = 0; i < l.Length; i++)
{
var bestMatch = scores.Where(x => x >= l[i] && x <= r[i]);
output.Add(bestMatch.Count());
}
return output.ToArray();
}
The code runs fine when the numbers are small, but once they exceed 50,000 the program becomes slow. How can I optimize this solution?

Assuming l and r have the same length, consider this approach:
public static int[] processFunc(int[] scores, int[] l, int[] r)
{
var min = Math.Min(l.Min(z => z), r.Min(z => z));
var max = Math.Max(l.Max(z => z), r.Max(z => z));
var grouped = scores.Where(z => z >= min && z <= max).GroupBy(z => z).Select(val => Tuple.Create(val.Key, val.Count())).OrderBy(z => z.Item1).ToList();
return l.Zip(r, (left, right) =>
{
var matching = grouped.Where(z => z.Item1 >= left).TakeWhile(z => z.Item1 <= right);
return matching.Sum(z => z.Item2);
}).ToArray();
}
min and max are used to ignore irrelevant (too large or too small) numbers. grouped pre-calculates the count of each distinct value and puts the counts in order of value. Zip lines up the l and r values, and the counts that fall inside each range are summed.
This solution is roughly 2-3 times faster on my machine than the original code (and most of the remaining time is actually spent setting up the parameters, rather than in the function itself).
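When the same scores array is queried many times, another option is to sort a copy once and answer each range with two binary searches. This is only a sketch of that alternative technique, not the approach from the answer above; RangeCounter, CountInRanges, LowerBound and UpperBound are hypothetical names introduced here. It assumes, like the original loop, that a range whose lower bound exceeds its upper bound should count as 0.

```csharp
using System;

public static class RangeCounter
{
    // For each i, counts the scores x with l[i] <= x <= r[i].
    public static int[] CountInRanges(int[] scores, int[] l, int[] r)
    {
        // Sort a copy so the caller's array is left untouched.
        var sorted = (int[])scores.Clone();
        Array.Sort(sorted);

        var output = new int[l.Length];
        for (int i = 0; i < l.Length; i++)
        {
            int first = LowerBound(sorted, l[i]);     // first index with value >= l[i]
            int afterLast = UpperBound(sorted, r[i]); // first index with value > r[i]
            output[i] = Math.Max(0, afterLast - first);
        }
        return output;
    }

    // Index of the first element >= value (sorted.Length if there is none).
    private static int LowerBound(int[] sorted, int value)
    {
        int lo = 0, hi = sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] < value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Index of the first element > value (sorted.Length if there is none).
    private static int UpperBound(int[] sorted, int value)
    {
        int lo = 0, hi = sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] <= value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

Sorting costs O(n log n) once, and each range then costs O(log n) instead of a full scan of scores, which should matter most when there are many ranges.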

Related

Sorting array by frequency of elements in C#

Here is what i have so far
int[] numbers = { 3,5,4,3,8,8,5,3,2,1,9,5 };
int[] n = new int[12];
int[] k;
foreach (int number in numbers)
{
n[number]++;
}
Array.Sort(n);
Array.Reverse(n);
foreach (int value in n)
{
Console.WriteLine(value);
}
I know I am missing the part where I sort the elements by frequency after I have counted them, and I just can't get my head around it. I'd appreciate some help, thanks!
What's the problem with your solution ?
You correctly keep the frequencies of the numbers in the array called n in your code (I will call it frequencies), but then you sort this array. That breaks your solution, because each frequency is tied to its number only through its index in the array.
E.g. suppose the array is [8,2,1,7,6]. Calling Sort on it reorders the elements to [1,2,6,7,8]. Before sorting, the first element indicated that the number 0 (the index of the first element is 0) was found 8 times in our numbers. After sorting, the first element is 1, which would mean that the frequency of the number 0 is 1, and that is clearly wrong.
If you want to keep it your way, then you could try something like this:
int[] numbers = { 1,2,2,9,1,2,5,5,5,5,2 };
int[] frequencies = new int[12];
int k = 3;
foreach (int number in numbers)
{
frequencies[number]++;
}
var mostFrequentNumbers = frequencies.Select((frequency, index) => new
{
Number = index,
Frequency = frequency
})
.OrderByDescending(item => item.Frequency)
.Select(item => item.Number)
.Take(k);
foreach (int mostFrequentNumber in mostFrequentNumbers)
{
Console.WriteLine(mostFrequentNumber);
}
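Staying with the counting array, another way to keep each frequency attached to its number is the Array.Sort overload that takes a keys array and an items array and reorders both together. The following is a minimal sketch under that idea; FrequencySortSketch, NumbersByFrequency and the maxValue parameter are hypothetical names, and the input is assumed to contain only values between 0 and maxValue.

```csharp
using System;

public static class FrequencySortSketch
{
    // Returns the values 0..maxValue ordered from most to least frequent.
    public static int[] NumbersByFrequency(int[] numbers, int maxValue)
    {
        var frequencies = new int[maxValue + 1];
        foreach (int number in numbers)
        {
            frequencies[number]++;
        }

        // values[i] starts out as i, so it remembers which number each count belongs to.
        var values = new int[maxValue + 1];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = i;
        }

        // Sorts frequencies ascending and applies the same reordering to values.
        Array.Sort(frequencies, values);
        Array.Reverse(values);
        return values;
    }
}
```

Array.Sort is not a stable sort, so numbers with equal frequencies can come out in any relative order.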
Are there any other approaches ?
An easy way to do this is to use a data structure like a Dictionary, in which the keys are the numbers and the values are their frequencies.
Then you can order that data structure by descending value and keep the k most frequent numbers.
int[] numbers = { 1,2,2,9,1,2,5,5,5,5,2 };
int k = 3;
Dictionary<int, int> numberFrequencies = new Dictionary<int, int>();
foreach (int number in numbers)
{
if(numberFrequencies.ContainsKey(number))
{
numberFrequencies[number] += 1;
}
else
{
numberFrequencies.Add(number, 1);
}
}
var mostFrequentNumbers = numberFrequencies.OrderByDescending(numberFrequency => numberFrequency.Value)
.Take(k)
.Select(numberFrequency => numberFrequency.Key);
foreach (int mostFrequentNumber in mostFrequentNumbers)
{
Console.WriteLine(mostFrequentNumber);
}
You can also achieve the same thing by only using LINQ:
int[] numbers = { 1,2,2,9,1,2,5,5,5,5,2 };
int k = 3;
var mostFrequentNumbers = numbers.GroupBy(number => number)
.ToDictionary(gr => gr.Key, gr => gr.Count())
.OrderByDescending(keyValue => keyValue.Value)
.Take(k)
.Select(numberFrequency => numberFrequency.Key);
foreach (int mostFrequentNumber in mostFrequentNumbers)
{
Console.WriteLine(mostFrequentNumber);
}
You can just use Linq extensions:
using System.Linq;
using System.Collections.Generic;
...
private static IEnumerable<int> Solve(int[] numbers, int k) {
return numbers
.GroupBy(x => x)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.Take(k);
}
Then you can call:
var numbers = new []{1,2,2,9,1,2,5,5,5,5,2};
var k = 3;
var result = Solve(numbers, k);
foreach (int n in result)
Console.WriteLine(n);
To be very terse:
// Order the groups by how often each number occurs, then keep the k most frequent keys.
var frequents = numbers.GroupBy(t => t)
.OrderByDescending(grp => grp.Count())
.Select(grp => grp.Key)
.Take(k)
.ToList();

Group list into groups by the index range of the list

I have a source integer list with numbers from 0 to 50.
Then I want to have a grouped target list that means:
group1: 1,2,3,4,5,6,7,8,9,10
group2: 11,12,13,14,15,16,17,18,19,20
group3: etc... ,30
group4: etc... ,40
group5: etc... ,50
The groupFactor here is 5.
How can I group my integer list based on that group factor, which could be any number?
UPDATE
If the group factor is 6
there would be an additional:
group6: etc... ,60
Let k be your group factor. Group your list by multiplying the list member by k then dividing by 50, and grouping the sequence on the resulting quotient.
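The quotient-based grouping described above could be sketched as follows. GroupFactorSketch, GroupByFactor and the max parameter are hypothetical names. The (n - 1) shift is an assumption added here so that 10 lands in the first group and 50 in the last one, as in the question's sample; the answer's wording multiplies the list member itself, which would instead put 10 into the second group.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupFactorSketch
{
    // Splits values from 1..max into groupFactor buckets using integer division.
    public static List<List<int>> GroupByFactor(IEnumerable<int> source, int groupFactor, int max)
    {
        return source
            .GroupBy(n => (n - 1) * groupFactor / max) // bucket index 0..groupFactor-1
            .OrderBy(g => g.Key)
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Because C# integer division truncates toward zero, a value of 0 in the source would also fall into the first bucket.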
Your question is a little vague but for the sample you provided, I found this fancy group by :)
var list = new List<int>();
for (int i=0; i <= 50; i++)
{
list.Add(i);
}
var result = list.GroupBy( n => (n-1)/10 );
Try this
static void Main(string[] args)
{
List<int> input = new List<int>();
for (int i = 0; i <= 50; i++)
{
input.Add(i);
}
List<List<int>> output = input.Select((x, i) => new { x = x, i = (int)(x / 10) }).GroupBy(y => y.i).Select(z => z.Select(a => a.x).ToList()).ToList();
}

Split a List into several Lists based on criteria using LINQ

I have a list of integers, which I would like to split into 2 or more lists based upon meeting a certain criteria. For example:
List<int> myList = new List<int>();
myList.Add(100);
myList.Add(200);
myList.Add(300);
myList.Add(400);
myList.Add(200);
myList.Add(500);
I would like to split the list into several lists, each of which contains all items which total <= 600. In the above, it would then result in 3 separate List objects.
List 1 would contain 100, 200 300
List 2 would contain 400, 200
List 3 would contain 500
Ideally, I'd like it to be a single LINQ statement.
Although doable, this is an excellent example of what LINQ is not for. See for yourself.
Having
var myList = new List<int> { 100, 200, 300, 400, 200, 500, };
int maxSum = 600;
"Pure" LINQ (the power of Aggregate)
var result = myList.Aggregate(
new { Sum = 0, List = new List<List<int>>() },
(data, value) =>
{
int sum = data.Sum + value;
if (data.List.Count > 0 && sum <= maxSum)
data.List[data.List.Count - 1].Add(value);
else
data.List.Add(new List<int> { (sum = value) });
return new { Sum = sum, List = data.List };
},
data => data.List)
.ToList();
A normal (non LINQ) implementation of the above
var result = new List<List<int>>();
int sum = 0;
foreach (var value in myList)
{
if (result.Count > 0 && (sum += value) <= maxSum)
result[result.Count - 1].Add(value);
else
result.Add(new List<int> { (sum = value) });
}
For completeness (and some fun), a "Hackish" LINQ (the power of closures and C# operators)
int sum = 0, key = -1;
var result = myList.GroupBy(x => key >= 0 && (sum += x) <= maxSum ? key : ++key + (sum = x) * 0, (k, e) => e.ToList()).ToList();
Here is the solution to your problem. I am not sure if that is the best case solver but it will surely do the job:
List<int> First = myList.Where(x => x <= 300).ToList();
List<int> Second = myList.Where(x => x == 400 || x == 200).ToList();
List<int> Third = myList.Where(x => x == 500).ToList();
Each query goes through the list, keeps the values that meet the condition, and then converts the resulting IEnumerable into a List.
This will do what you want for a list of any size, but maybe it is not as short as you are looking for. You would have to write a LINQ extension method to shorten it, but then it becomes a bit more complicated.
List<int> myList = new List<int>();
myList.Add(100);
myList.Add(200);
myList.Add(300);
myList.Add(400);
myList.Add(200);
myList.Add(500);
var result = new List<List<int>>();
var skip = 0;
while (skip < myList.Count)
{
var sum = 0;
result.Add(myList.Skip(skip).TakeWhile(x =>
{
sum += x;
return sum <= 600;
}).ToList());
skip += result.Last().Count();
}
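An extension method along the lines mentioned above might look like the sketch below. SplitExtensions and SplitBySum are hypothetical names. It assumes a value larger than maxSum should simply get a chunk of its own, which the question does not specify.

```csharp
using System.Collections.Generic;

public static class SplitExtensions
{
    // Yields consecutive chunks whose totals stay at or below maxSum.
    public static IEnumerable<List<int>> SplitBySum(this IEnumerable<int> source, int maxSum)
    {
        var current = new List<int>();
        int sum = 0;
        foreach (int value in source)
        {
            // Close the current chunk when adding the value would exceed the limit.
            if (current.Count > 0 && sum + value > maxSum)
            {
                yield return current;
                current = new List<int>();
                sum = 0;
            }
            current.Add(value);
            sum += value;
        }
        if (current.Count > 0)
        {
            yield return current;
        }
    }
}
```

With that in place the call site shrinks to something like myList.SplitBySum(600).ToList(), and the source is enumerated only once.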

LINQ to Get Closest Value?

I have a List of MyStuff objects; MyStuff has a property of type float.
There are objects with property values of 10, 20, 22, 30.
I need to write a query that finds the objects closest to 21; in this case it would find the 20 and 22 objects. Then I need to write one that finds the object closest to 21 without going over, and it would return the object with a value of 20.
I have no idea where/how to begin with this one. Help?
Thanks.
Update - wow there are so many awesome responses here. Thanks! I don't know which one to follow so I will try them all. One thing that might make this more (or less) interesting is that the same query will have to apply to LINQ-to-SQL entities, so possibly the answer harvested from the MS Linq forums will work the best? Don't know.
Try sorting them by the absolute value of the difference between the number and 21 and then take the first item:
float closest = MyStuff
.Select (n => new { n, distance = Math.Abs (n - 21) })
.OrderBy (p => p.distance)
.First().n;
Or shorten it according to @Yuriy Faktorovich's comment:
float closest = MyStuff
.OrderBy(n => Math.Abs(n - 21))
.First();
Here's a solution that satisfies the second query in linear time:
var pivot = 21f;
var closestBelow = pivot - numbers.Where(n => n <= pivot)
.Min(n => pivot - n);
(Edited from 'above' to 'below' after clarification)
As for the first query, it would be easiest to use MoreLinq's MinBy extension:
var closest = numbers.MinBy(n => Math.Abs(pivot - n));
It's also possible to do it in standard LINQ in linear time, but with 2 passes of the source:
var minDistance = numbers.Min(n => Math.Abs(pivot - n));
var closest = numbers.First(n => Math.Abs(pivot - n) == minDistance);
If efficiency is not an issue, you could sort the sequence and pick the first value in O(n * log n) as others have posted.
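For in-memory sequences, both queries can also be answered in a single pass with a plain loop. The sketch below is an illustration rather than part of the answer above; ClosestSketch, ClosestTo and ClosestNotAbove are hypothetical names, and it works on bare float values instead of objects with a float property. It is not meant for LINQ-to-SQL.

```csharp
using System;
using System.Collections.Generic;

public static class ClosestSketch
{
    // Returns every value tied for the smallest distance to pivot, in one pass.
    public static List<float> ClosestTo(IEnumerable<float> numbers, float pivot)
    {
        var best = new List<float>();
        float bestDistance = float.PositiveInfinity;
        foreach (float n in numbers)
        {
            float distance = Math.Abs(n - pivot);
            if (distance < bestDistance)
            {
                // A strictly closer value discards all earlier candidates.
                best.Clear();
                bestDistance = distance;
            }
            if (distance == bestDistance)
            {
                best.Add(n);
            }
        }
        return best;
    }

    // Returns the largest value that does not exceed pivot, or null if there is none.
    public static float? ClosestNotAbove(IEnumerable<float> numbers, float pivot)
    {
        float? best = null;
        foreach (float n in numbers)
        {
            if (n <= pivot && (best == null || n > best.Value))
            {
                best = n;
            }
        }
        return best;
    }
}
```

Returning a list from ClosestTo covers the tie in the question, where 20 and 22 are equally close to 21.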
Based on this post at the Microsoft Linq forums:
var numbers = new List<float> { 10f, 20f, 22f, 30f };
var target = 21f;
//gets single number which is closest
var closest = numbers.Select( n => new { n, distance = Math.Abs( n - target ) } )
.OrderBy( p => p.distance )
.First().n;
//get two closest
var take = 2;
var closests = numbers.Select( n => new { n, distance = Math.Abs( n - target ) } )
.OrderBy( p => p.distance )
.Select( p => p.n )
.Take( take );
//gets any that are within x of target
var within = 1;
var withins = numbers.Select( n => new { n, distance = Math.Abs( n - target ) } )
.Where( p => p.distance <= within )
.Select( p => p.n );
List<float> numbers = new List<float>() { 10f, 20f, 22f, 30f };
float pivot = 21f;
// Closest value that does not go over the pivot; FirstOrDefault yields 0 when nothing qualifies.
var result = numbers.Where(x => x <= pivot).OrderByDescending(x => x).FirstOrDefault();
OR
var result = (from n in numbers
where n <= pivot
orderby n descending
select n).FirstOrDefault();
and here comes an extension method:
public static T Closest<T,TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector, TKey pivot) where TKey : IComparable<TKey>
{
// Keep the items whose key is at most the pivot, then take the largest of them.
return source.Where(x => pivot.CompareTo(keySelector(x)) >= 0).OrderByDescending(keySelector).FirstOrDefault();
}
Usage:
var result = numbers.Closest(n => n, pivot);

Finding if a target number is the sum of two numbers in an array via LINQ

A basic solution would look like this:
bool sortTest(int[] numbers, int target)
{
Array.Sort(numbers);
for(int i = 0; i < numbers.Length; i++)
{
for(int j = numbers.Length-1; j > i; j--)
{
if(numbers[i] + numbers[j] == target)
return true;
}
}
return false;
}
Now I'm very new to LINQ but this is what I have written so far:
var result = from num in numbers
where numbers.Contains(target -num)
select num;
if (result.Count() > 0)
return true;
return false;
Now I'm running into an issue with the following example:
Array: 1, 2, 4, 5, 8
Target: 16
It should return false, but it's catching 16-8=8. So how do I keep it from matching a number with itself in the Contains check? Or can I build a second array inside the query each time that doesn't contain the number I'm working with (thus solving the problem)?
Thanks in advance.
Is this what you're looking for?
var result = from n1 in numbers
from n2 in numbers
where n1 != n2 && n1 + n2 == target
select new { n1, n2 };
[Edit]
This returns matches twice and ignores the situation where a number is duplicated in the array. You can't handle these situations using Expression Syntax because you can't access the index of a matched item, but you can do it like this:
var result = numbers.Select((n1, idx) =>
new {n1, n2 = numbers.Take(idx).FirstOrDefault(
n2 => n1 + n2 == target)}).Where(pair => pair.n2 != 0);
As long as you don't have any zeros in your array.
[Further thought Edit]
The perfect mix solution:
var result = from item in numbers.Select((n1, idx) =>
new {n1, shortList = numbers.Take(idx)})
from n2 in item.shortList
where item.n1 + n2 == target
select new {n1 = item.n1, n2};
What I'd do to solve this problem in general is first write a "chooser".
public static IEnumerable<IEnumerable<T>> Chooser<T>(this IList<T> sequence, int num)
{ ... left as an exercise ... }
The output of the chooser is a sequence of sequences. Each sub-sequence is of length num, and consists of elements chosen from the original sequence. So if you passed { 10, 30, 20, 50 } as the sequence and 3 for num, you'd get the sequence of sequences:
{10, 30, 20}, {10, 30, 50}, {10, 20, 50}, {30, 20, 50}
as a result.
Once you've written Chooser, the problem becomes easy:
var results =
from subsequence in numbers.Chooser(2)
where subsequence.Sum() == target
select subsequence;
And now you can solve the problem for subsequences of other sizes, not just pairs.
Writing Chooser is a bit tricky but it's not too hard.
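One possible way to fill in Chooser is the recursive sketch below. It is only one of several ways to write it and is not the answer author's implementation; ChooserExtensions and the private Choose helper are hypothetical names. It picks each element in turn as the first member of a combination and recurses on the elements after it, which keeps the original order within every sub-sequence.

```csharp
using System;
using System.Collections.Generic;

public static class ChooserExtensions
{
    // Enumerates every way to choose num elements from sequence, keeping their original order.
    public static IEnumerable<IEnumerable<T>> Chooser<T>(this IList<T> sequence, int num)
    {
        if (num < 0) throw new ArgumentOutOfRangeException(nameof(num));
        return Choose(sequence, 0, num);
    }

    private static IEnumerable<List<T>> Choose<T>(IList<T> sequence, int start, int num)
    {
        if (num == 0)
        {
            // Exactly one way to choose nothing: the empty combination.
            yield return new List<T>();
            yield break;
        }
        // Stop early enough that num - 1 elements remain after index i.
        for (int i = start; i <= sequence.Count - num; i++)
        {
            foreach (List<T> rest in Choose(sequence, i + 1, num - 1))
            {
                var combination = new List<T>(num) { sequence[i] };
                combination.AddRange(rest);
                yield return combination;
            }
        }
    }
}
```

Since int[] implements IList<int>, a call such as numbers.Chooser(2) works directly on an array. The number of combinations grows quickly with the input size, so this suits small inputs.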
To improve on pdr's reply and address the concerns mentioned in the comments you could use the overloaded Select method to compare the indices of the items and ensure uniqueness.
public bool sortTest(int[] numbers, int target)
{
var indexedInput = numbers.Select((n, i) => new { Number = n, Index = i });
var result = from x in indexedInput
from y in indexedInput
where x.Index != y.Index
select x.Number + y.Number == target;
return result.Any(item => item);
}
Or in dot notation:
var indexedInput = numbers.Select((n, i) => new { Number = n, Index = i });
var result = indexedInput
.SelectMany(
x => indexedInput,
(x, y) => new { x = x, y = y })
.Where(item => item.x.Index != item.y.Index)
.Select(item => item.x.Number + item.y.Number == target);
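If only a yes/no answer is needed, a non-LINQ alternative is to remember the numbers seen so far in a HashSet and look up the complement of each new number. This is a sketch of that different technique, not a rewrite of the answers above; PairSumSketch and HasPairWithSum are hypothetical names.

```csharp
using System.Collections.Generic;

public static class PairSumSketch
{
    // Returns true when two elements at different positions add up to target.
    public static bool HasPairWithSum(int[] numbers, int target)
    {
        var seen = new HashSet<int>();
        foreach (int number in numbers)
        {
            // Only earlier elements are in the set, so a number is never paired with itself.
            if (seen.Contains(target - number))
            {
                return true;
            }
            seen.Add(number);
        }
        return false;
    }
}
```

This visits each element once and needs no sorting, and a duplicated value such as two 8s for a target of 16 still counts as a valid pair.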
