LINQ group by sequence and count with sorting - c#

I am looking for the best-performing way to group and count sequences, with sorting, using LINQ. I will be processing files larger than 500 MB, so performance is the most important factor in this task.
List<int[]> num2 = new List<int[]>();
num2.Add(new int[] { 35, 44 });
num2.Add(new int[] { 200, 22 });
num2.Add(new int[] { 35, 33 });
num2.Add(new int[] { 35, 44 });
num2.Add(new int[] { 3967, 11 });
num2.Add(new int[] { 200, 22 });
num2.Add(new int[] { 200, 2 });
The result has to be like this:
[35, 44] => 2
[200, 22] => 2
[35, 33] => 1
[3967, 11] => 1
[200, 2] => 1
I have done something like this:
Dictionary<int[], int> result2 = (from i in num2
group i by i into g
orderby g.Count() descending
select new { Key = g.Key, Freq = g.Count() })
.ToDictionary(x => x.Key, x => x.Freq);
SetRichTextBox("\n\n Second grouping\n");
foreach (var i in result2)
{
SetRichTextBox("\nKey: ");
foreach (var r in i.Key)
{
SetRichTextBox(r.ToString() + " ");
}
SetRichTextBox("\n Value: " + i.Value.ToString());
}
But it is not working properly. Any help?

For arrays of length 2, this will work.
num2.GroupBy(a => a[0])
.Select(g => new { A0 = g.Key, A1 = g.GroupBy(a => a[1]) })
.SelectMany(a => a.A1.Select(a1 => new { Pair = new int[] { a.A0, a1.Key }, Count = a1.Count() }));
I think that should perform well, but measure it on your own data; you could also try adding .AsParallel() after the first Select call.
This strategy (grouping successively by the n-th element of the arrays) generalises to arrays of arbitrary length:
var dim = 2;
var tuples = num2.GroupBy(a => a[0])
.Select(g => new Tuple<int[], List<int[]>>(new [] { g.Count(), g.Key }, g.Select(a => a.Skip(1).ToArray()).ToList()));
for (int n = 1; n < dim; n++)
{
tuples = tuples.SelectMany(t => t.Item2.GroupBy(list => list[0])
.Select(g => new Tuple<int[], List<int[]>>(new[] { g.Count() }.Concat(t.Item1.Skip(1)).Concat(new [] { g.Key }).ToArray(), g.Select(a => a.Skip(1).ToArray()).ToList())));
}
var output = tuples.Select(t => new { Arr = string.Join(",", t.Item1.Skip(1)), Count = t.Item1[0] })
.OrderByDescending(o => o.Count)
.ToList();
which generates an output of
Arr = "35,44", Count = 2
Arr = "200,22", Count = 2
Arr = "35,33", Count = 1
Arr = "200,2", Count = 1
Arr = "3967,11", Count = 1
in your example. I'll let you test it for higher dimensions. :)
You should be able to parallelise these queries without too much difficulty, as the successive groupings are independent.
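As a rough illustration of the parallel idea for the two-element case, here is a minimal PLINQ sketch. It assumes C# 7.1 or later (value tuples with inferred element names) and that every array has at least two elements; ParallelPairCounter and CountPairs are hypothetical names, not part of the answer above. Whether it is actually faster depends on the data, so treat it as something to benchmark rather than a guaranteed win.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ParallelPairCounter
{
    // Hypothetical helper: counts identical two-element arrays with PLINQ.
    // Assumes every array in the input has at least two elements.
    public static List<(int First, int Second, int Count)> CountPairs(IEnumerable<int[]> pairs)
    {
        return pairs
            .AsParallel()
            // Value tuples compare by value, unlike int[] which compares by reference.
            .GroupBy(a => (First: a[0], Second: a[1]))
            .Select(g => (g.Key.First, g.Key.Second, Count: g.Count()))
            // Parallel grouping does not preserve input order, so ties are
            // broken explicitly to make the result deterministic.
            .OrderByDescending(x => x.Count)
            .ThenBy(x => x.First)
            .ThenBy(x => x.Second)
            .ToList();
    }
}
```

Calling ParallelPairCounter.CountPairs(num2) on the question's list returns five entries, with the two pairs that occur twice first.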

You can do something like this:
var results = from x in num2
group x by new { a = x[0], b = x[1] } into g
orderby g.Count() descending
select new
{
Key = g.Key,
Count = g.Count()
};
foreach (var result in results)
Console.WriteLine(String.Format("[{0},{1}]=>{2}", result.Key.a, result.Key.b, result.Count));
The trick is to come up with a way to compare the values in the array, instead of the arrays themselves.
The alternative (and possibly better option) would be to transform your data from int[] to some custom type, override Equals and GetHashCode on that custom type, then just group x by x into g. But if you're really stuck with int[], then this works.
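Another way to compare the values rather than the array references, without changing the element type, is to pass an IEqualityComparer<int[]> to GroupBy. The sketch below is one possible shape, not a tuned implementation: IntArrayComparer and SequenceCounter are hypothetical names, and the hash function is a simple illustrative choice. It returns a list instead of a Dictionary because Dictionary does not guarantee enumeration order, which would undo the sort.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two arrays are equal when they hold
// the same values in the same order.
public sealed class IntArrayComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(int[] obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            // Simple multiplicative hash over the elements (illustrative choice).
            int hash = 17;
            foreach (int value in obj) hash = hash * 31 + value;
            return hash;
        }
    }
}

public static class SequenceCounter
{
    // Groups arrays by content and sorts by frequency, most frequent first.
    // OrderByDescending is a stable sort, so ties keep first-seen order.
    public static List<KeyValuePair<int[], int>> CountSequences(IEnumerable<int[]> source)
    {
        return source
            .GroupBy(a => a, new IntArrayComparer())
            .Select(g => new KeyValuePair<int[], int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)
            .ToList();
    }
}
```

This works for arrays of any length, at the cost of hashing every element of every array once.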

Related

LINQ - Join both lists by last digit

Two sequences of positive integers, integerList1 and integerList2, are given. All values in each sequence are different.
Get a set (a list of NumberPair values) of all value pairs that satisfy the following conditions:
the first element of the pair belongs to integerList1,
the second belongs to integerList2,
both elements end with the same digit.
The NumberPair type includes Value 1 and Value 2 fields (Item1 and Item2 in the example below).
The resulting NumberPair list must be sorted in ascending order by the first field and, if those are equal, by the second.
Here is an example:
integerList1: new[] { 1, 12, 4, 5, 78 }
integerList2: new[] { 1, 42, 75, 65, 8, 97 }
Expected result:
expected: new[]
{
new NumberPair{Item1 = 1, Item2 = 1},
new NumberPair{Item1 = 5, Item2 = 65},
new NumberPair{Item1 = 5, Item2 = 75},
new NumberPair{Item1 = 12, Item2 = 42},
new NumberPair{Item1 = 78, Item2 = 8}
}
I tried to solve it like this:
var lastDigitsGroups1 = integerList1.GroupBy(num => num % 10).ToDictionary(kvp => kvp.Key, kvp => kvp.ToList());
var lastDigitsGroups2 = integerList2.GroupBy(num => num % 10).ToDictionary(kvp => kvp.Key, kvp => kvp.ToList());
var intersection = lastDigitsGroups1.Keys.Intersect(lastDigitsGroups2.Keys);
foreach (var item in intersection)
{
var np = new NumberPair { Item1 = lastDigitsGroups1[item].FirstOrDefault(), Item2 = lastDigitsGroups2[item].FirstOrDefault() };
yield return np;
}
However, it should be done using only LINQ, ideally in a single LINQ query.
Join both lists by keys as below:
var result = (from a in integerList1
join b in integerList2 on (a % 10) equals (b % 10)
select new NumberPair { Item1 = a, Item2 = b }
)
.OrderBy(x => x.Item1)
.ThenBy(x => x.Item2)
.ToList();
Or
var result = integerList1.Join(integerList2,
x => x % 10,
y => y % 10,
(x, y) => new { x, y })
.Select(x => new NumberPair { Item1 = x.x, Item2 = x.y })
.OrderBy(x => x.Item1)
.ThenBy(x => x.Item2)
.ToList();
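For reference, here is a self-contained version of the join that can be run against the sample data from the question. The NumberPair class is an assumed shape (the question only shows that it has Item1 and Item2), and LastDigitJoin.Pair is a hypothetical wrapper name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of NumberPair; the question only shows Item1 and Item2.
public class NumberPair
{
    public int Item1 { get; set; }
    public int Item2 { get; set; }
}

public static class LastDigitJoin
{
    // Joins the two sequences on their last decimal digit, then sorts
    // by the first field and, for equal first fields, by the second.
    public static List<NumberPair> Pair(IEnumerable<int> integerList1, IEnumerable<int> integerList2)
    {
        return integerList1
            .Join(integerList2,
                  a => a % 10,
                  b => b % 10,
                  (a, b) => new NumberPair { Item1 = a, Item2 = b })
            .OrderBy(p => p.Item1)
            .ThenBy(p => p.Item2)
            .ToList();
    }
}
```

With integerList1 = { 1, 12, 4, 5, 78 } and integerList2 = { 1, 42, 75, 65, 8, 97 }, this yields the five pairs listed as expected in the question: (1, 1), (5, 65), (5, 75), (12, 42), (78, 8).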
I honestly don't understand your approach; it does not seem to do what your conditions describe. If I just take the conditions as the requirement and pair the elements by position, use Enumerable.Zip:
var result = integerList1.Zip(integerList2, (i1, i2) => new NumberPair{Item1 = i1, Item2 = i2} )
.OrderBy(np => np.Item1)
.ThenBy(np => np.Item2);

Group the indexes of the same elements in a array in C#

There is an int[] array that stores different numbers.
What I want is to group the indexes of equal numbers in the array into the same groups.
For example, the array is int[5]{1,2,5,1,5}
I would like the output to be List<List<int>> { {0,3}, {1}, {2,4} } // don't mind the syntax
It would be better if LINQ (or a more efficient way) can be used. Thanks for the help.
You can simply use GroupBy and the position obtained from the Select overload:
int[] array = { 1, 2, 5, 1, 5 };
var result = array.Select((v, idx) => new { Value = v, Index = idx })
.GroupBy(g => g.Value)
.Select(g => g.Select(x => x.Index).ToArray()) // inner array of indexes
.ToArray(); // outer array
One way:
var result = myArray.Select((elem, idx) => new { Value = elem, Idx = idx})
.GroupBy(proxy => proxy.Value);
foreach (var grouped in result)
{
Console.WriteLine("Element {0} has indexes: {1}",
grouped.Key,
string.Join(", ", grouped.Select(proxy => proxy.Idx).ToArray()));
}
var myFinalList = result.Select(proxy => proxy.Select(p => p.Idx).ToList()).ToList();
You can use Enumerable.Range combined with GroupBy:
int[] arr = { 1, 2, 5, 1, 5 };
var result = Enumerable.Range(0, arr.Length)
.GroupBy(i => arr[i])
.Select(x => x.ToList()).ToList();
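Since the question also asks about a more efficient way, here is a minimal non-LINQ sketch that builds the groups in a single pass. It assumes C# 7 or later for the inline out variable; IndexGrouper and GroupIndexes are hypothetical names. Groups appear in the order their value is first seen, matching the GroupBy versions above.

```csharp
using System.Collections.Generic;

public static class IndexGrouper
{
    // Hypothetical helper: returns, for each distinct value, the list of
    // indexes at which it occurs, in order of first appearance.
    public static List<List<int>> GroupIndexes(int[] values)
    {
        var groups = new List<List<int>>();
        // Maps each value to the index list already stored in 'groups'.
        var groupByValue = new Dictionary<int, List<int>>();
        for (int i = 0; i < values.Length; i++)
        {
            if (!groupByValue.TryGetValue(values[i], out List<int> indexes))
            {
                indexes = new List<int>();
                groupByValue.Add(values[i], indexes);
                groups.Add(indexes);
            }
            indexes.Add(i);
        }
        return groups;
    }
}
```

For { 1, 2, 5, 1, 5 } this produces { {0,3}, {1}, {2,4} }.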

Crafting a LINQ based solution to determine if a set of predicates are satisfied for a pair of collections constrained by a set of invariants

This isn't a question I feel I have the vocabulary to express properly, but I have two collections of the same anonymous type (let's call it 'a).
'a is defined as new {string Name, int Count}
One of these collections of 'a we shall call requirements.
One of these collections of 'a we shall call candidates.
Given these collections, I want to determine if the following assertions hold.
If there exists some element in requirements r such that r.Count == 0, each element in candidates c such that r.Name == c.Name must satisfy c.Count == 0. There must exist one such element in candidates for each such element in requirements.
For each element of requirements r where r.Count > 0, there must be some subset of elements in candidates c such that c₁.Name, c₂.Name, ..., cₓ.Name == r.Name and that c₁ + ... + cₓ >= r.Count. Each element of candidates used to satisfy this rule for some element in requirements may not be used for another element in requirements.
An example of this would be that given
requirements = {{"A",0}, {"B", 0}, {"C", 9}}
candidates = {{"B", 0}, {"C", 1}, {"A",0}, {"D", 2}, {"C", 4}, {"C", 4}}
That this query would be satisfied.
r={"A", 0} and r={"B", 0} would be satisfied according to rule #1 against c={"A", 0} and c={"B", 0}
-and-
r={"C", 9} is satisfied according to rule #2 by the group gc on collections c.Name derived from {{"C", 1}, {"C", 4}, {"C", 4}} as gc = {"C", 9}
However it is worth noting that if requirements contained {"C", 6} and {"C", 3} instead of {"C", 9}, this particular set of collections would fail to satisfy the predicates.
Now to the question finally.
What is the best way to form this into a linq expression prioritizing speed (least iterations)?
The unsolved subset has been re-asked here
Here's my sketch for a linqy solution, but it doesn't address #3 at all. It works by grouping and joining on names. The hard part would then be to determine if there is some matching of requirements to candidates that satisfies the group.
void Main() {
var requirements = new [] {
new NameCount{ Name = "A", Count = 0 },
new NameCount{ Name = "B", Count = 0 },
new NameCount{ Name = "C", Count = 9 },
new NameCount{ Name = "D", Count = 3 },
new NameCount{ Name = "D", Count = 5 },
};
var candidates = new[] {
new NameCount {Name = "B", Count = 0},
new NameCount {Name = "C", Count = 1},
new NameCount {Name = "A", Count = 0},
new NameCount {Name = "D", Count = 2},
new NameCount {Name = "C", Count = 4},
new NameCount {Name = "C", Count = 4}
};
var matched = requirements
.GroupBy(r => r.Name)
.GroupJoin(candidates, rg => rg.Key, c => c.Name,
(rg, cg) => new { requirements = rg, candidates = cg });
bool satisfied = matched.All( /* ??? */ );
}
struct NameCount {
public string Name;
public int Count;
}
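For the given input, matched contains one entry per requirement name, pairing that name's requirements with the candidates that have the same name.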
.GroupJoin has much better performance characteristics than candidates.Where in the projection.
After reconsidering the revised requirements, I've come up with invariant assertions that must hold for a solution to exist.
For each paired cg and rg...
|cg.Name| >= |rg.Name|
cg.SummedCount >= rg.SummedCount
Assuming we have satisfied those conditions, a solution MAY exist.
My intuition suggests something similar to the following algorithm:
For each Name...
Let us call each r in rg a basket, and each c in cg an apple.
Sort apples in descending order.
We will keep track of which elements we've assigned to each basket in rg (e.g. r₁ is paired with cg₁). Keep the baskets sorted in ascending order of rₓ.Count - cgₓ.Count. (This value may be negative.)
Now, iterate through our list of apples, starting with the largest, and assign each one to the least filled basket by iterating through rg. If the apple would overfill the first basket, we continue descending through the list until we encounter a basket that would remain unfilled if we put that apple in it. We then choose the previous basket.
That is, we want to minimize the number of apples necessary to fill each basket, so we prefer a perfect fit to overfilling, and overfilling to underfilling.
This algorithm does not work on the following case:
rg = (6, 5), cg = (3, 2, 2, 2, 2)
The above algorithm produces
r6 = (3, 2, 2), r5 = (2, 2)
whereas the solution ought to be
r6 = (2, 2, 2), r5 = (3, 2)
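The counterexample can be checked with an exhaustive search over assignments of apples to baskets for a single name. This is only a sketch to confirm that a valid assignment exists where the greedy approach fails; it assumes every requirement count is positive, it runs in exponential time, and BasketCheck and CanFill are hypothetical names.

```csharp
using System.Linq;

public static class BasketCheck
{
    // Returns true if the candidates (apples) can be split into disjoint
    // groups so that each requirement (basket) receives a sum >= its count.
    public static bool CanFill(int[] requirements, int[] candidates)
    {
        var remaining = (int[])requirements.Clone();
        return Assign(remaining, candidates, 0);
    }

    private static bool Assign(int[] remaining, int[] candidates, int index)
    {
        // Every basket is filled once its remaining need is zero or negative.
        if (remaining.All(r => r <= 0)) return true;
        if (index == candidates.Length) return false;

        // Try putting the current apple into each basket that still needs more.
        for (int b = 0; b < remaining.Length; b++)
        {
            if (remaining[b] <= 0) continue;
            remaining[b] -= candidates[index];
            if (Assign(remaining, candidates, index + 1)) return true;
            remaining[b] += candidates[index]; // backtrack
        }

        // Also try leaving the current apple unused.
        return Assign(remaining, candidates, index + 1);
    }
}
```

BasketCheck.CanFill(new[] { 6, 5 }, new[] { 3, 2, 2, 2, 2 }) returns true, while BasketCheck.CanFill(new[] { 6, 3 }, new[] { 1, 4, 4 }) returns false, which agrees with the {"C", 6} and {"C", 3} case described in the question.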
Going to post the obvious answer here, but I'm looking for something more elegant.
Given candidates as IEnumerable<'a>, project IEnumerable<'a> groupedCandidates from candidates by calling candidates.Where(c=>c.Count != 0).GroupBy(...) by performing a Sum on all elements with the same name.
Then project simpleCandidates from candidates.Except(groupedCandidates, (c,gc)=>c.Name == gc.Name)
Past here it gets fuzzy because candidates may only satisfy a requirement once.
EDIT: This solution does not meet the revised requirements.
I'm not familiar with LINQ, but it looks like you can do this problem in O(n) unless I misunderstand something. There are three steps to completing this problem.
First, construct a list or hashtable counter and populate it by iterating through c. If we use a hashtable, the size of the hashtable will be the length of c so we don't have to resize our hashtable.
for candidate in c:
counter[candidate.name] += candidate.count
We do this in one pass. O(m) where m is the length of c.
With counter constructed, we construct a hashtable by iterating through r.
for requirement in r:
if not h[requirement.name] or not requirement.count >= h[requirement.name]:
h[requirement.name] = requirement.count
Then, we simply iterate through counter and compare counts.
for sum in counter:
assert h[sum.name] and h[sum.name] >= sum.count
We do this in one pass: O(p) where p is the length of counter.
If this algorithm terminates successfully, our constraints are satisfied, and we've completed it in O(m) + O(o) + O(p), where o is the length of r.
I finally came up with a workable solution
IEnumerable<Glyph> requirements = t.Objectives.Cast<Glyph>().OrderBy(k => k.Name);
IEnumerable<Glyph> candidates = Resources.Cast<Glyph>().OrderBy(k => k.Name);
IEnumerable<Glyph> zeroCountCandidates = candidates.Where(c => c.Count == 0);
IEnumerable<Glyph> zeroCountRequirements = requirements.Where(r => r.Count == 0);
List<Glyph> remainingCandidates = zeroCountCandidates.ToList();
if (zeroCountCandidates.Count() < zeroCountRequirements.Count())
{
return false;
}
foreach (var r in zeroCountRequirements)
{
if (!remainingCandidates.Contains(r))
{
return false;
}
else
{
remainingCandidates.Remove(r);
}
}
IEnumerable<Glyph> nonZeroCountCandidates = candidates.Where(c => c.Count > 0);
IEnumerable<Glyph> nonZeroCountRequirements = requirements.Where(r => r.Count > 0);
var perms = nonZeroCountCandidates.Permutations();
foreach (var perm in perms)
{
bool isViableSolution = true;
remainingCandidates = perm.ToList();
foreach (var requirement in nonZeroCountRequirements)
{
int countThreshold = requirement.Count;
while (countThreshold > 0)
{
if (remainingCandidates.Count() == 0)
{
isViableSolution = false;
break;
}
var c = remainingCandidates[0];
countThreshold -= c.Count;
remainingCandidates.Remove(c);
}
}
if (isViableSolution)
{
return true;
}
}
return false;
Disgusting isn't it?
algorithm:
if any requirement Name doesn't exist in the candidates, return false
for any requirement having Count = 0
if there aren't at least as many candidates
with the same Name and Count, return false
eliminate all exact matches between candidates and requirements
eliminate requirements (and candidates) where the requirement
and all higher requirements have a higher candidate available
for remaining non-zero requirements
find the subset of candidates
that matches the most requirements
and eliminate the requirements (and candidates)
if there are any remaining non-zero requirements
return false
return true because no unmatched requirements remain
sample implementation:
public static bool IsValid(IEnumerable<string> requirementNames,
IList<int> requirementCounts,
IEnumerable<string> candidateNames,
IList<int> candidateCounts)
{
var requirements = requirementNames
.Select((x, i) => new
{
Name = x,
Count = requirementCounts[i]
})
.ToList();
var candidates = candidateNames
.Select((x, i) => new
{
Name = x,
Count = candidateCounts[i]
})
.ToList();
var zeroRequirements = requirements
.Where(x => x.Count == 0)
.Select(x => x.Name)
.GroupBy(x => x)
.ToDictionary(x => x.Key, x => x.Count());
var zeroCandidates = candidates
.Where(x => x.Count == 0)
.Select(x => x.Name)
.GroupBy(x => x)
.ToDictionary(x => x.Key, x => x.Count());
if (zeroRequirements.Keys.Any(x => !zeroCandidates.ContainsKey(x) ||
zeroCandidates[x] < zeroRequirements[x]))
{
return false;
}
var nonZeroRequirements = requirements
.Where(x => x.Count != 0)
.GroupBy(x => x.Name)
.ToDictionary(x => x.Key,
x => x.Select(y => y.Count)
.GroupBy(y => y)
.ToDictionary(y => y.Key, y => y.Count()));
var nonZeroCandidates = candidates
.Where(x => x.Count != 0)
.GroupBy(x => x.Name)
.ToDictionary(x => x.Key,
x => x.Select(y => y.Count)
.GroupBy(y => y)
.ToDictionary(y => y.Key, y => y.Count()));
foreach (var name in nonZeroRequirements.Keys.ToList())
{
var requirementsForName = nonZeroRequirements[name];
Dictionary<int, int> candidatesForName;
if (!nonZeroCandidates.TryGetValue(name, out candidatesForName))
{
return false;
}
if (candidatesForName.Sum(x => x.Value) <
requirementsForName.Sum(x => x.Value))
{
return false;
}
if (candidatesForName.Sum(x => x.Value*x.Key) <
requirementsForName.Sum(x => x.Value*x.Key))
{
return false;
}
EliminateExactMatches(candidatesForName, requirementsForName);
EliminateHighRequirementsWithAvailableHigherCandidate(candidatesForName, requirementsForName);
EliminateRequirementsThatHaveAMatchingCandidateSum(candidatesForName, requirementsForName);
if (requirementsForName
.Any(x => x.Value > 0))
{
return false;
}
}
return true;
}
private static void EliminateRequirementsThatHaveAMatchingCandidateSum(
IDictionary<int, int> candidatesForName,
IDictionary<int, int> requirementsForName)
{
var requirements = requirementsForName
.Where(x => x.Value > 0)
.OrderByDescending(x => x.Key)
.SelectMany(x => Enumerable.Repeat(x.Key, x.Value))
.ToList();
if (!requirements.Any())
{
return;
}
// requirements -> candidates used
var items = GenerateCandidateSetsThatSumToOrOverflow(
requirements.First(),
candidatesForName,
new List<int>())
.Concat(new[] {new KeyValuePair<int, IList<int>>(0, new List<int>())})
.Select(x => new KeyValuePair<IList<int>, IList<int>>(
new[] {x.Key}, x.Value));
foreach (var count in requirements.Skip(1))
{
var count1 = count;
items = (from i in items
from o in GenerateCandidateSetsThatSumToOrOverflow(
count1,
candidatesForName,
i.Value)
select
new KeyValuePair<IList<int>, IList<int>>(
i.Key.Concat(new[] {o.Key}).OrderBy(x => x).ToList(),
i.Value.Concat(o.Value).OrderBy(x => x).ToList()))
.GroupBy(
x => String.Join(",", x.Key.Select(y => y.ToString()).ToArray()) + ">"
+ String.Join(",", x.Value.Select(y => y.ToString()).ToArray()))
.Select(x => x.First());
}
var bestSet = items
.OrderByDescending(x => x.Key.Count(y => y > 0)) // match the most requirements
.ThenByDescending(x => x.Value.Count) // use the most candidates
.ToList();
var best = bestSet.First();
foreach (var requirementCount in best.Key.Where(x => x > 0))
{
requirementsForName[requirementCount] -= 1;
}
foreach (var candidateCount in best.Value.Where(x => x > 0))
{
candidatesForName[candidateCount] -= 1;
}
}
private static void EliminateHighRequirementsWithAvailableHigherCandidate(
IDictionary<int, int> candidatesForName,
IDictionary<int, int> requirementsForName)
{
foreach (var count in requirementsForName
.Where(x => x.Value > 0)
.OrderByDescending(x => x.Key)
.Select(x => x.Key)
.ToList())
{
while (requirementsForName[count] > 0)
{
var count1 = count;
var largerCandidates = candidatesForName
.Where(x => x.Key > count1)
.OrderByDescending(x => x.Key)
.ToList();
if (!largerCandidates.Any())
{
return;
}
var largerCount = largerCandidates.First().Key;
requirementsForName[count] -= 1;
candidatesForName[largerCount] -= 1;
}
}
}
private static void EliminateExactMatches(
IDictionary<int, int> candidatesForName,
IDictionary<int, int> requirementsForName)
{
foreach (var count in requirementsForName.Keys.ToList())
{
int numberOfCount;
if (candidatesForName.TryGetValue(count, out numberOfCount) &&
numberOfCount > 0)
{
var toRemove = Math.Min(numberOfCount, requirementsForName[count]);
requirementsForName[count] -= toRemove;
candidatesForName[count] -= toRemove;
}
}
}
private static IEnumerable<KeyValuePair<int, IList<int>>> GenerateCandidateSetsThatSumToOrOverflow(
int sumNeeded,
IEnumerable<KeyValuePair<int, int>> candidates,
IEnumerable<int> usedCandidates)
{
var usedCandidateLookup = usedCandidates
.GroupBy(x => x)
.ToDictionary(x => x.Key, x => x.Count());
var countToIndex = candidates
.Select(x => Enumerable.Range(
0,
usedCandidateLookup.ContainsKey(x.Key)
? x.Value - usedCandidateLookup[x.Key]
: x.Value)
.Select(i => new KeyValuePair<int, int>(x.Key, i)))
.SelectMany(x => x)
.ToList();
// sum to List of <count,index>
var sumToElements = countToIndex
.Select(a => new KeyValuePair<int, IList<KeyValuePair<int, int>>>(
a.Key, new[] {a}))
.ToList();
countToIndex = countToIndex.Where(x => x.Key < sumNeeded).ToList();
while (sumToElements.Any())
{
foreach (var set in sumToElements
.Where(x => x.Key >= sumNeeded))
{
yield return new KeyValuePair<int, IList<int>>(
sumNeeded,
set.Value.Select(x => x.Key).ToList());
}
sumToElements = (from a in sumToElements.Where(x => x.Key < sumNeeded)
from b in countToIndex
where !a.Value.Any(x => x.Key == b.Key && x.Value == b.Value)
select new KeyValuePair<int, IList<KeyValuePair<int, int>>>(
a.Key + b.Key,
a.Value.Concat(new[] {b})
.OrderBy(x => x.Key)
.ThenBy(x => x.Value)
.ToList()))
.GroupBy(x => String.Join(",", x.Value.Select(y => y.Key.ToString()).ToArray()))
.Select(x => x.First())
.ToList();
}
}
private static IEnumerable<int> GetAddendsFor(int sum, Random random)
{
var values = new List<int>();
while (sum > 0)
{
var addend = random.Next(1, sum);
sum -= addend;
values.Add(addend);
}
return values;
}
Tests:
[Test]
public void ABCC_0063__with_candidates__BCADCC_010244__should_return_false()
{
var requirementNames = "ABCC".Select(x => x.ToString()).ToArray();
var requirementCounts = new[] {0, 0, 6, 3};
var candidateNames = "BCADCC".Select(x => x.ToString()).ToArray();
var candidateCounts = new[] {0, 1, 0, 2, 4, 4};
var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
actual.ShouldBeFalse();
}
[Test]
public void ABC_003__with_candidates__BCADCC_010244__should_return_true()
{
var requirementNames = "ABC".Select(x => x.ToString()).ToArray();
var requirementCounts = new[] {0, 0, 3};
var candidateNames = "BCADCC".Select(x => x.ToString()).ToArray();
var candidateCounts = new[] {0, 1, 0, 2, 4, 4};
var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
actual.ShouldBeTrue();
}
[Test]
public void ABC_003__with_candidates__BCAD_0102__should_return_false()
{
var requirementNames = "ABC".Select(x => x.ToString()).ToArray();
var requirementCounts = new[] {0, 0, 3};
var candidateNames = "BCAD".Select(x => x.ToString()).ToArray();
var candidateCounts = new[] {0, 1, 0, 2};
var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
actual.ShouldBeFalse();
}
[Test]
public void ABC_009__with_candidates__BCADCC_010244__should_return_true()
{
var requirementNames = "ABC".Select(x => x.ToString()).ToArray();
var requirementCounts = new[] {0, 0, 9};
var candidateNames = "BCADCC".Select(x => x.ToString()).ToArray();
var candidateCounts = new[] {0, 1, 0, 2, 4, 4};
var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
actual.ShouldBeTrue();
}
[Test]
public void FuzzTestIt()
{
var random = new Random();
const string names = "ABCDE";
for (var tries = 0; tries < 10000000; tries++)
{
var numberOfRequirements = random.Next(5);
var shouldPass = true;
var requirementNames = new List<string>();
var requirementCounts = new List<int>();
var candidateNames = new List<string>();
var candidateCounts = new List<int>();
for (var i = 0; i < numberOfRequirements; i++)
{
var name = names.Substring(random.Next(names.Length), 1);
switch (random.Next(6))
{
case 0: // zero-requirement with corresponding candidate
requirementNames.Add(name);
requirementCounts.Add(0);
candidateNames.Add(name);
candidateCounts.Add(0);
break;
case 1: // zero-requirement without corresponding candidate
requirementNames.Add(name);
requirementCounts.Add(0);
shouldPass = false;
break;
case 2: // non-zero-requirement with corresponding candidate
{
var count = random.Next(1, 10);
requirementNames.Add(name);
requirementCounts.Add(count);
candidateNames.Add(name);
candidateCounts.Add(count);
}
break;
case 3: // non-zero-requirement with matching sum of candidates
{
var count = random.Next(1, 10);
requirementNames.Add(name);
requirementCounts.Add(count);
foreach (var value in GetAddendsFor(count, random))
{
candidateNames.Add(name);
candidateCounts.Add(value);
}
}
break;
case 4: // non-zero-requirement with matching overflow candidate
{
var count = random.Next(1, 10);
requirementNames.Add(name);
requirementCounts.Add(count);
candidateNames.Add(name);
candidateCounts.Add(count + 2);
}
break;
case 5: // non-zero-requirement without matching candidate or sum of candidates
{
var count = random.Next(10, 20);
requirementNames.Add(name);
requirementCounts.Add(count);
shouldPass = false;
}
break;
}
}
try
{
var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
actual.ShouldBeEqualTo(shouldPass);
}
catch (Exception e)
{
Console.WriteLine("Requirements: " + String.Join(", ", requirementNames.ToArray()));
Console.WriteLine(" " +
String.Join(", ", requirementCounts.Select(x => x.ToString()).ToArray()));
Console.WriteLine("Candidates: " + String.Join(", ", candidateNames.ToArray()));
Console.WriteLine(" " +
String.Join(", ", candidateCounts.Select(x => x.ToString()).ToArray()));
Console.WriteLine(e);
Assert.Fail();
}
}
}

How to build a histogram for a list of int in C# [duplicate]

I have a list of int, List<int> demoList, which is something like {1, 2, 1, 1, 1, 3, 2, 1}, and I want to write a LINQ statement that returns the number with the highest number of appearances in that list, which in my case is 1.
int highestAppearanceNum = demoList.GroupBy(i => i)
.OrderByDescending(grp => grp.Count())
.Select(grp => grp.First())
.First();
Edit: If you also want to know which number appears how often:
var appearances = demoList.GroupBy(i => i)
.OrderByDescending(grp => grp.Count())
.Select(grp => new { Num = grp.Key, Count = grp.Count() });
if (appearances.Any())
{
int highestAppearanceNum = appearances.First().Num; // 1
int highestAppearanceCount = appearances.First().Count; // 5
}
var list = new[] { 1, 2, 1, 1, 1, 3, 2, 1 };
var result = list
.GroupBy(x => x)
.Select(x => new { Number = x.Key, Count = x.Count() })
.OrderByDescending(x => x.Count)
.FirstOrDefault();
Console.WriteLine("highest number = {0}, count = {1}", result.Number, result.Count);
Use a group by clause.
var groups =
from i in demoList
group i by i into g
select new { Value = g.Key, Count = g.Count() };
From here you can say
var max = groups.Max(g => g.Count);
groups.Where(g => g.Count == max).Select (g => g.Value); // { 1 }
var query =
from i in demoList
group i by i into g
orderby g.Count() descending
select new { Value = g.Key, Count = g.Count() };
var result = query.First();
Console.WriteLine(
"The number with the most occurrences is {0}, which appears {1} times",
result.Value,
result.Count);
I apologise in advance:
List<int> demoList = new List<int>() { 1, 2, 1, 1, 1, 3, 2, 1 };
Dictionary<int,int> keyOrdered = demoList.GroupBy(i => i)
.Select(i => new { i.Key, Count = i.Count() })
.OrderBy(i=>i.Key)
.ToDictionary(i=>i.Key, i=>i.Count);
var max = keyOrdered.OrderByDescending(i=>i.Value).FirstOrDefault();
List<string> histogram = new List<string>();
for (int i = max.Value; i >-1 ; i--)
{
histogram.Add(string.Concat(keyOrdered.Select(t => t.Value>i?"| ":" ")));
}
histogram.Add(string.Concat(keyOrdered.Keys.OrderBy(i => i).Select(i => i.ToString() + " ")));
histogram.ForEach(i => Console.WriteLine(i));
Console.WriteLine(Environment.NewLine);
Console.WriteLine("Max: {0}, Count:{1}", max.Key, max.Value);
When I read the title I thought of this and it made me smile (probably full of bugs too!).

How would I use LINQ to find the most frequently occurring value in a data set?

List<int> a = new List<int> { 11, 2, 3, 11, 3, 22, 9, 2 };
//output
11
This may not be the most efficient way, but it will get the job done.
public static int MostFrequent(IEnumerable<int> enumerable)
{
var query = from it in enumerable
group it by it into g
select new {Key = g.Key, Count = g.Count()} ;
return query.OrderByDescending(x => x.Count).First().Key;
}
And the fun single line version ...
public static int MostFrequent(IEnumerable<int> enumerable)
{
return (from it in enumerable
group it by it into g
select new {Key = g.Key, Count = g.Count()}).OrderByDescending(x => x.Count).First().Key;
}
a.GroupBy(item => item).
Select(group => new { Key = group.Key, Count = group.Count() }).
OrderByDescending(pair => pair.Count).
First().
Key;
Another example:
IEnumerable<int> numbers = new[] { 11, 2, 3, 11, 3, 22, 9, 2 };
int most = numbers
.Select(x => new { Number = x, Count = numbers.Count(y => y == x) })
.OrderByDescending(z => z.Count)
.First().Number;
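Note that the last query rescans the whole sequence for every element, so its work grows quadratically with the input size. If that matters, a single pass with a Dictionary is one alternative. The sketch below is not LINQ; FrequencyCounter is a hypothetical name, and it assumes C# 7 or later for the inline out variable. Ties are resolved in favour of whichever value reaches the highest count first, which may differ from the GroupBy versions for some inputs.

```csharp
using System;
using System.Collections.Generic;

public static class FrequencyCounter
{
    // Hypothetical helper: returns the value that occurs most often.
    // On a tie, the value that reaches the top count first wins.
    public static int MostFrequent(IEnumerable<int> source)
    {
        var counts = new Dictionary<int, int>();
        bool any = false;
        int best = 0;
        int bestCount = 0;

        foreach (int value in source)
        {
            any = true;
            // TryGetValue leaves count at 0 when the value is not present yet.
            counts.TryGetValue(value, out int count);
            count++;
            counts[value] = count;

            if (count > bestCount)
            {
                best = value;
                bestCount = count;
            }
        }

        if (!any) throw new InvalidOperationException("Sequence contains no elements");
        return best;
    }
}
```

For { 11, 2, 3, 11, 3, 22, 9, 2 } this returns 11, the same answer as the queries above.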
