Finding mode in List of integers [duplicate] - c#

How can I find the mode of a list of numbers? I know the logic of it (I think) but I don't know how to implement that logic or convert what my brain thinks into workable code.
This is what I know:
I need to have a loop that goes through the list one time to see how many times a number is repeated and an array to save the times a number is repeated. I also need to tell my program to discard the lesser amount once a larger one is found.

A LINQ approach, more concise but almost certainly less efficient than Yeldar Kurmangaliyev's:
int FindMode(IEnumerable<int> data)
{
return data
.GroupBy(n => n)
.Select(x => new { x.Key, Count = x.Count() })
.OrderByDescending(a => a.Count)
.First()
.Key;
}
This does not handle the case where data is empty, nor where there are two or more data points with the same frequency in the data set.
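The two gaps mentioned above (empty input and ties) could be covered along these lines. This is only a sketch: the class name ModeSketch and the method name FindModes are made up for illustration, and returning every tied value in a list is one possible design choice, not something the original answer prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class ModeSketch
{
    // Returns every value that occurs with the highest frequency.
    // An empty input yields an empty list instead of throwing.
    public static List<int> FindModes(IEnumerable<int> data)
    {
        var groups = data
            .GroupBy(n => n)
            .Select(g => new { g.Key, Count = g.Count() })
            .ToList();

        if (groups.Count == 0)
            return new List<int>();

        int highest = groups.Max(g => g.Count);
        return groups
            .Where(g => g.Count == highest)
            .Select(g => g.Key)
            .ToList();
    }
}
```

Calling ModeSketch.FindModes(new[] { 1, 3, 3, 3, 7, 7 }) returns a list containing only 3, while an input such as { 1, 1, 2, 2, 3 } returns both 1 and 2.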

Yes, you are right:
Suppose we have a list of numbers:
List<int> myValues = new List<int>(new int[] { 1, 3, 3, 3, 7, 7 } );
You need to have a loop that goes through the list one time:
foreach (var val in myValues)
{
}
to see how many times a number is repeated, with a structure to save the number of times each number is repeated:
Dictionary<int, int> repetitions = new Dictionary<int, int>();
foreach (var val in myValues)
{
if (repetitions.ContainsKey(val))
repetitions[val]++; // Met it one more time
else
repetitions.Add(val, 1); // Met it once, because it is not in dict.
}
Now the dictionary repetitions stores, for each key, exactly how many times that value occurs in the list.
Then you need to find the mode record (i.e. the entry with the highest repetition count, which is the highest Value) and take it. LINQ will help us: sort the entries by value and take the last one, or sort them descending and take the first one. Apart from which entry wins a tie, both give the same result with the same performance.
var modeRecord = repetitions.OrderByDescending(x => x.Value).First();
// or
var modeRecord = repetitions.OrderBy(x => x.Value).Last();
Here it is! Here we have a mode:
List<int> myValues = new List<int>(new int[] { 1, 3, 3, 3, 7, 7 } );
Dictionary<int, int> repetitions = new Dictionary<int, int>();
foreach (var val in myValues)
{
if (repetitions.ContainsKey(val))
repetitions[val]++; // Met it one more time
else
repetitions.Add(val, 1); // Met it once, because it is not in dict.
}
var modeRecord = repetitions.OrderByDescending(x => x.Value).First();
Console.WriteLine("Mode is {0}. It occurs {1} times in the list", modeRecord.Key, modeRecord.Value);
Your mode calculation logic is good. All you need to do is follow your own instructions in code :)
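The question also mentions discarding the lesser count once a larger one is found. That can be done inside the counting loop itself, without sorting afterwards. The following is a minimal sketch under that reading; the names SinglePassMode and FindMode are invented for this example, and returning null for an empty list is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class SinglePassMode
{
    // Returns the most frequent value, or null for an empty sequence.
    // On a tie, the value that reaches the top count first wins.
    public static int? FindMode(IEnumerable<int> values)
    {
        var counts = new Dictionary<int, int>();
        int? mode = null;
        int bestCount = 0;
        foreach (int value in values)
        {
            int count;
            counts.TryGetValue(value, out count); // count stays 0 for a new key
            counts[value] = ++count;
            if (count > bestCount)
            {
                // A larger count was found: discard the previous best.
                bestCount = count;
                mode = value;
            }
        }
        return mode;
    }
}
```

For the list { 1, 3, 3, 3, 7, 7 } this returns 3. The sort is avoided, so the whole calculation is a single O(N) pass plus dictionary lookups.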

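Here's an alternative LINQ approach. It measures runs of equal adjacent values, so it relies on equal values being next to each other (for example, in a sorted list):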
var values = new int[] { 1, 3, 3, 3, 7, 7 };
var mode =
values
.Aggregate(
new { best = 0, best_length = 0, current = 0, current_length = 0 },
(a, n) =>
{
var current_length = 1 + (a.current == n ? a.current_length : 0);
var is_longer = current_length > a.best_length;
return new
{
best = is_longer ? n : a.best,
best_length = is_longer ? current_length : a.best_length,
current = n,
current_length,
};
}).best;

Related

How can I choose the second highest value from a list in c#?

I have a list List<int> myList = new List<int>() { 10, 20, 8, 20, 9, 5, 20, 10 };, I want to choose the second highest value, which is in this case 10. I wrote this code and it works, but I wonder if there is something shorter and better.
List<int> myList = new List<int>() { 10, 20, 8, 20, 9, 5, 20, 10 };
myList = myList.Distinct().ToList();
var descendingOrder = myList.OrderByDescending(i => i);
var sec = descendingOrder.Skip(1).First();
You could just stop using intermediate variables and ToList()
var secondHighest =
myList
.Distinct()
.OrderByDescending(i => i)
.Skip(1)
.First();
This will work the same as your version, but only requires one statement instead of three.
I find it a lot easier to read code like this.
Each LINQ method call is on its own line, and there are no intermediate variables, especially ones that change (myList is reassigned, which makes it harder to comprehend).
Dave's suggestion to perform all the operations in one pipeline is very good indeed, as it:
avoids unnecessary intermediate variables
avoids eagerly creating new collection objects at intermediate steps
reduces clutter
is more readable, i.e. it's easier to see what's going on
On the other hand, in terms of efficiency, it might be better to perform two passes over the source list instead of "sorting" the entire list only to take the second item.
var maximum = myList.Max();
var secondMaximum = myList.Where(x => x < maximum).Max();
I think I'd avoid LINQ for this one and just go for a standard "loop over every element; if current is higher than max, push current max to second place and make the current value the max":
int sec = int.MinValue;
for (int i = 0, m = int.MinValue; i < myList.Count; i++)
    if (myList[i] > m) {
        sec = m;
        m = myList[i];
    }
    else if (myList[i] < m && myList[i] > sec)
        sec = myList[i]; // falls between the two: new runner-up
Your given logic removes duplicates first, so 20 is not treated as the second highest in your list even though three of the values are 20. Here that is achieved by the strict > comparison. If I'd used >= then each 20 would roll the variables and it would behave as if the list had not been de-duplicated.
If you're interested in performance, test it over a list with a few million entries and pick the one that matches your appetite for readability vs speed.
It's not LINQ-y, but it's O(N) and easy to read:
public static int TheSecondMax()
{
    List<int> myList = new List<int>() { 10, 20, 8, 20, 9, 5, 20, 10 };
    int max = int.MinValue;
    int secondMax = int.MinValue;
    foreach (var item in myList)
    {
        if (item > max)
        {
            secondMax = max; // the old maximum becomes the runner-up
            max = item;
        }
        else if (item > secondMax && item < max)
        {
            secondMax = item;
        }
    }
    return secondMax;
}
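A variant of the single-pass loop can report explicitly when there is no second-highest value, instead of returning int.MinValue. The sketch below assumes that duplicates of the maximum should be ignored, as in the question; the names SecondMaxSketch and SecondHighest are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not taken from any of the answers.
public static class SecondMaxSketch
{
    // Returns the second-highest distinct value, or null when the input
    // has fewer than two distinct values.
    public static int? SecondHighest(IEnumerable<int> values)
    {
        int? max = null;
        int? second = null;
        foreach (int item in values)
        {
            if (max == null || item > max)
            {
                second = max;   // the old maximum becomes the runner-up
                max = item;
            }
            else if (item < max && (second == null || item > second))
            {
                second = item;  // strictly between runner-up and maximum
            }
        }
        return second;
    }
}
```

For { 10, 20, 8, 20, 9, 5, 20, 10 } this returns 10; for { 5, 5 } or an empty list it returns null.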

Quick sorting a data array while tracking index C#

I'm a bit stuck using quick sort algorithm on an integer array, while saving the original indexes of the elements as they're moved around during the sorting process. Using C#/Visual studio
For example
ToSort Array {52,05,08,66,02,10}
Indexes : 0 1 2 3 4 5
AfterSort Array {02,05,08,10,52,66}
Indexes : 4 1 2 5 0 3
I need to save the indexes of the sorted values in another array.
I feel like this is very complex as quick sorting is recursive and any help or pointers would be much appreciated! Thanks!
As @Will said, you can do something like this:
var myArray = new int[] { 52, 05, 08, 66, 02, 10 };
// In each tuple, Item1 holds the number and Item2 holds its original index
var myIndexedArray = myArray.Select( ( n, index ) => Tuple.Create( n, index ) );
// Or, if you are using C# 7, you can use a tuple literal instead:
// var myIndexedArray = myArray.Select( ( n, index ) => ( n, index ) );
// Call your quick sort method, sorting by Item1 (the number) inside the method,
// or use Enumerable.OrderBy:
myIndexedArray = myIndexedArray.OrderBy(x => x.Item1);
// Then get your results back
int[] numbers = myIndexedArray.Select(x => x.Item1).ToArray();
int[] indexes = myIndexedArray.Select(x => x.Item2).ToArray();
LINQ OrderBy uses QuickSort internally. So instead of implementing QuickSort yourself, use OrderBy, if needed with a custom IComparer<T>.
Put the data to be sorted into an anonymous type which remembers the original index, then sort by value. You can retrieve the original index from the index property of the sorted elements.
using System.Linq;
var data = new int[] { 52,05,08,66,02,10 };
var sortingDictionary = data
.Select((value, index) => new { value, index });
var sorted = sortingDictionary
.OrderBy(kvp => kvp.value)
.ToList(); // enumerate before looping over result!
for (var newIndex = 0; newIndex < sorted.Count(); newIndex ++) {
var item = sorted.ElementAt(newIndex);
Console.WriteLine(
$"New index: {newIndex}, old index: {item.index}, value: {item.value}"
);
}
Edit: incorporated improvements suggested by mjwills
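If LINQ is not wanted, the Array.Sort overload that takes a keys array and an items array can also keep the indexes in step with the values. This is a sketch, not something proposed in the answers above; the class and method names are hypothetical. Array.Sort is not a stable sort, so the relative order of equal values is not guaranteed.

```csharp
using System;
using System.Linq;

// Hypothetical helper illustrating Array.Sort(keys, items).
public static class IndexTrackingSort
{
    // Returns the original positions of the values in ascending value order.
    // The input array is left untouched because a copy is sorted.
    public static int[] SortedIndexes(int[] values)
    {
        int[] keys = (int[])values.Clone();
        int[] indexes = Enumerable.Range(0, values.Length).ToArray();
        Array.Sort(keys, indexes); // reorders indexes alongside keys
        return indexes;
    }
}
```

For the array { 52, 5, 8, 66, 2, 10 } from the question, SortedIndexes returns { 4, 1, 2, 5, 0, 3 }.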

How to filter a List of int list with condition?

I have this List<List<int>>:
{{1,2},{1,3},{1,4},{2,3},{2,4},{3,4}}
This list contains 6 lists made of the numbers 1 to 4, and each number occurs 3 times.
I want to filter it in order to get:
{{1,2},{1,3},{2,4},{3,4}}
Here each number occurs 2 times.
The lists are generated dynamically, and I want to be able to filter dynamically as well, based on the occurrence count.
Edit - more details
I need to count how many times a number is contained in the List<List<int>>; for the example above that is 3. Then I want to exclude lists from the List<List<int>> in order to reduce that count from 3 to 2.
The main issue for me was to find a way to do this without freezing my computer :), and also to have each number appear exactly 2 times (mandatory).
If each list is always a pair of numbers and every number has to appear N times, then the number of pairs you need depends on N:
4 (different digits) x 2 (times they have to appear) = 8 digits = 4 pairs
4 x 3 (times) = 12 digits = 6 pairs
4 x 4 (times) = 16 digits = 8 pairs
So from the 6 pairs we must select the 4 pairs that match the criteria.
Based on basic combinatorics (https://www.khanacademy.org/math/probability/probability-and-combinatorics-topic/permutations/v/permutation-formula)
there are 6!/2! = (6*5*4*3*2*1)/(2*1) = 360 possible permutations,
i.e. 360 different ordered ways to put the second list together.
Because the order of the items in the list doesn't matter, the number of possible combinations is 6!/(2!*4!) = 15
(https://www.khanacademy.org/math/probability/probability-and-combinatorics-topic/combinations-combinatorics/v/combination-formula).
So there are only 15 ways to choose 4 items out of the list of 6, which means you need to loop at most 15 times.
That addresses your "killing the machine" concern.
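The two counts used above can be checked with a few lines of code. This is only an illustrative sketch; the class and method names are invented here and do not come from the answer.

```csharp
using System;

// Hypothetical helper for checking the counts quoted in the text.
public static class CountingSketch
{
    // Number of ordered selections of k items out of n: n! / (n - k)!
    public static long Permutations(int n, int k)
    {
        long result = 1;
        for (int i = 0; i < k; i++)
            result *= n - i;
        return result;
    }

    // Number of unordered selections of k items out of n: n! / (k! * (n - k)!)
    public static long Combinations(int n, int k)
    {
        long result = 1;
        for (int i = 1; i <= k; i++)
            result = result * (n - k + i) / i; // division is exact at every step
        return result;
    }
}
```

CountingSketch.Permutations(6, 4) gives 360 and CountingSketch.Combinations(6, 4) gives 15, matching the figures above.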
The next question is how to find all the possible combinations.
Number the items we can pick from the input list by position,
for example 1st, 2nd, 3rd and 4th.
The candidate picks are then 1,2,3,4 ... 1,2,3,5 ... 1,2,3,6 ...
All the combinations can be generated with this method (from https://stackoverflow.com/a/10629938/444149):
static IEnumerable<IEnumerable<T>> GetKCombs<T>(IEnumerable<T> list, int length) where T : IComparable
{
if (length == 1) return list.Select(t => new T[] { t });
return GetKCombs(list, length - 1)
.SelectMany(t => list.Where(o => o.CompareTo(t.Last()) > 0),
(t1, t2) => t1.Concat(new T[] { t2 }));
}
and invoke it with the following (because there are 6 items to pick from, whose indexes are 0, 1, 2, 3, 4 and 5):
var possiblePicks = GetKCombs(new List<int> { 0, 1, 2, 3, 4, 5 }, 4);
This gives 15 possible combinations.
Now we take 4 elements out of the first list and check whether they match the criteria; if not, we take another combination:
var data = new List<List<int>>
{
new List<int> { 1,2 },
new List<int> { 1,3 },
new List<int> { 1,4 },
new List<int> { 2,3 },
new List<int> { 2,4 },
new List<int> { 3,4 }
};
foreach (var picks in possiblePicks)
{
var listToTest = new List<List<int>>(4);
foreach (var i in picks)
listToTest.Add(data[i]);
var ok = Check(listToTest, 2);
if (ok)
break;
}
private bool Check(List<List<int>> listToTest, int limit)
{
Dictionary<int, int> ret = new Dictionary<int, int>();
foreach (var inputElem in listToTest)
{
foreach (var z in inputElem)
{
var returnCount = ret.ContainsKey(z) ? ret[z] : 0;
if (!ret.ContainsKey(z))
ret.Add(z, returnCount + 1);
else
ret[z]++;
}
}
return ret.All(p => p.Value == limit);
}
I'm sure this can be further optimized to minimize the number of iterations over listToTest.
Also, this is a lazy implementation (IEnumerable), so if the very first (or second) combination is successful, it stops iterating.
I accepted Marty's answer because it fixed my issue. However, when I tried his method on larger lists, my computer froze again, so I started looking for another method and ended up with this one:
var main = new List<HashSet<int>> {
new HashSet<int> {1,2},
new HashSet<int> {1,3},
new HashSet<int> {1,4},
new HashSet<int> {2,3},
new HashSet<int> {2,4},
new HashSet<int> {3,4} };
var items = new HashSet<int>(from l in main from p in l select p); //=>{1,2,3,4}
for (int i =main.Count-1;i-->0; )
{
var occurence=items.Select(a=> main.Where(x => x.Contains(a)).Count()).ToList();
var occurenceSum = 0;
foreach(var j in main[i])
{
occurenceSum += occurence[j - 1];
if (occurenceSum==6) //if both items have occurence=3, then the sum=6, then I can remove that list!
{
main.RemoveAt(i);
}
}
}

Getting last x consecutive items with LINQ

My question is similar to this one: Finding Consecutive Items in List using Linq. Except, I'd like to get the last consecutive items that have no gaps. For example:
2, 4, 7, 8
output
7,8
Another example:
4,5,8,10,11,12
output
10,11,12
How can that be done?
I'm assuming that you want the last consecutive sequence with more than one member, so from the sequence
{4, 5, 8, 10, 11, 12, 15}
you're expecting the sequence:
{10, 11, 12}
I've indicated the line to remove if the last sequence is permitted to have only a single member, giving a sequence of
{15}
Here's the linq:
new[] {4, 5, 8, 10, 11, 12, 15}
.Select((n,i) => new {n, i})
.GroupBy(x => x.n - x.i) //this line will group consecutive nums in the seq
.Where(g => g.Count() > 1) //remove this line if the seq {15} is expected
.Select(x => x.Select(xx => xx.n))
.LastOrDefault()
There's a hidden assumption here that the numbers in the sequence are in ascending order. If this isn't the case, it will be necessary to enlist Microsoft's extension method for finding contiguous items in a sequence. Let me know if this is the case.
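Why grouping by value minus position works: in a strictly ascending sequence, that difference stays constant within a run of consecutive numbers and grows at every gap. The sketch below wraps the same query in a method so the behavior can be tried out; the names RunGrouping, LastRun and minLength are hypothetical, and the input is assumed to be strictly ascending.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around the query shown above.
public static class RunGrouping
{
    // For ascending input, value minus position is constant within a run of
    // consecutive numbers: {4, 5, 8, 10, 11, 12, 15} gives keys 4, 4, 6, 7, 7, 7, 9.
    public static List<int> LastRun(IEnumerable<int> ascending, int minLength)
    {
        var run = ascending
            .Select((n, i) => new { n, i })
            .GroupBy(x => x.n - x.i)
            .Where(g => g.Count() >= minLength)
            .Select(g => g.Select(x => x.n).ToList())
            .LastOrDefault();
        return run ?? new List<int>(); // no qualifying run: empty list
    }
}
```

With { 4, 5, 8, 10, 11, 12, 15 }, LastRun(..., 2) returns { 10, 11, 12 } and LastRun(..., 1) returns { 15 }.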
This works and is probably easier and more efficient than LINQ in this case:
var list = new[] { 2, 4, 7, 8 };
List<int> lastConsecutive = new List<int>();
for (int i = list.Length - 1; i > 0; i--)
{
lastConsecutive.Add(list[i]);
if (list[i] - 1 != list[i - 1])
break;
if(i==1 && list[i] - 1 == list[i - 1]) // needed since we're iterating just until 1
lastConsecutive.Add(list[0]);
}
lastConsecutive.Reverse();
I realise this is both late and wordy, but this is probably the fastest method here that still uses LINQ.
Test lists:
var list1 = new List<int> {2,4,7,8};
var list2 = new List<int> {4,5,8,10,11,12,15};
The method:
public List<int> LastConsecutive(List<int> list)
{
var rev = list.AsEnumerable().Reverse();
var res = rev.Zip(rev.Skip(1), (l, r) => new { left = l, right = r, diff = (l - r) })
.SkipWhile(x => x.diff != 1)
.TakeWhile(x => x.diff == 1);
return res.Take(1).Select(x => x.left)
.Concat(res.Select(x => x.right))
.Reverse().ToList();
}
This one goes from back to front and checks elements pairwise, only taking elements from when they start being consecutive (the SkipWhile) until they end being consecutive (the TakeWhile).
Then it does some work to pull the relevant pairwise numbers out (left number from the 'original' list and then all the right numbers), and reverses it back. Similar efficiency to the imperative version, but in my opinion simpler to read because of LINQ.

Check two List<int>'s for the same numbers

I have two lists which I want to check for corresponding numbers.
For example:
List<int> a = new List<int>(){1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};
Should give the result 4.
Is there an easy way to do this without too much looping through the lists?
I'm on 3.0 for the project where I need this so no Linq.
You can use the .NET 3.5 .Intersect() extension method:
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> common = a.Intersect(b).ToList();
Jeff Richter's excellent PowerCollections has Set with Intersections. Works all the way back to .NET 2.0.
http://www.codeplex.com/PowerCollections
Set<int> set1 = new Set<int>(new[]{1,2,3,4,5});
Set<int> set2 = new Set<int>(new[]{0,4,8,12});
Set<int> set3 = set1.Intersection(set2);
You could do it the way that LINQ does it, effectively - with a set. Now before 3.5 we haven't got a proper set type, so you'd need to use a Dictionary<int,int> or something like that:
Create a Dictionary<int, int> and populate it from list a using the element as both the key and the value for the entry. (The value in the entry really doesn't matter at all.)
Create a new list for the intersections (or write this as an iterator block, whatever).
Iterate through list b, and check with dictionary.ContainsKey: if it does, add an entry to the list or yield it.
That should be O(N+M) (i.e. linear in both list sizes)
Note that that will give you repeated entries if list b contains duplicates. If you wanted to avoid that, you could always change the value of the dictionary entry when you first see it in list b.
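The steps above can be sketched as follows, without any LINQ. The names DictionaryIntersection and Intersect are made up for this example, and using a bool value to mark entries already seen in list b is one way of applying the duplicate-avoidance idea from the last sentence, not necessarily what the answer's author had in mind.

```csharp
using System.Collections.Generic;

// Hypothetical helper; uses only types available before .NET 3.5.
public static class DictionaryIntersection
{
    public static List<int> Intersect(List<int> a, List<int> b)
    {
        // Value false = seen in a only; true = already reported from b.
        Dictionary<int, bool> seen = new Dictionary<int, bool>();
        foreach (int x in a)
            seen[x] = false;

        List<int> result = new List<int>();
        foreach (int y in b)
        {
            bool reported;
            if (seen.TryGetValue(y, out reported) && !reported)
            {
                result.Add(y);
                seen[y] = true; // suppress duplicates coming from b
            }
        }
        return result;
    }
}
```

For a = { 1, 2, 3, 4, 5 } and b = { 0, 4, 8, 12 } the result contains only 4.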
You can sort the second list and loop through the first one and for each value do a binary search on the second one.
If both lists are sorted, you can easily do this in O(n) time with a modified merge from merge sort: repeatedly "remove" (step a counter past) the lower of the two leading numbers; if they are ever equal, save that number to the result list and "remove" both of them. It takes fewer than n(1) + n(2) steps. This of course assumes the lists are sorted, but sorting integer arrays isn't exactly expensive: O(n log(n)), I think. If you'd like, I can throw together some code showing how to do this, but the idea is pretty simple.
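A minimal sketch of that merge-style intersection, assuming both lists are already sorted in ascending order. The class and method names are hypothetical, and skipping repeated values in the result is an extra choice made here rather than part of the description above.

```csharp
using System.Collections.Generic;

// Hypothetical helper; both inputs must already be sorted ascending.
public static class SortedIntersection
{
    public static List<int> Intersect(List<int> a, List<int> b)
    {
        List<int> result = new List<int>();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            if (a[i] < b[j]) i++;        // step past the lower head
            else if (a[i] > b[j]) j++;
            else
            {
                // Equal heads: record once unless it repeats the last result.
                if (result.Count == 0 || result[result.Count - 1] != a[i])
                    result.Add(a[i]);
                i++;
                j++;
            }
        }
        return result;
    }
}
```

For a = { 1, 2, 3, 4, 5, 12, 13 } and b = { 0, 4, 8, 12 } the result is { 4, 12 }.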
Tested on 3.0
List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> intersection = new List<int>();
Dictionary<int, int> dictionary = new Dictionary<int, int>();
a.ForEach(x => { if(!dictionary.ContainsKey(x))dictionary.Add(x, 0); });
b.ForEach(x => { if(dictionary.ContainsKey(x)) dictionary[x]++; });
foreach(var item in dictionary)
{
if(item.Value > 0)
intersection.Add(item.Key);
}
In a comment on the question, the author said that there will be
"Max 15 in the first list and 20 in the second list"
In this case I wouldn't bother with optimizations and would just use List.Contains.
For larger lists a hash can be used to take advantage of O(1) lookups, which leads to the O(N+M) algorithm Jon noted.
A hash requires additional space. To reduce memory usage we should hash the shortest list.
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> shortestList;
List<int> longestList;
if (a.Count > b.Count)
{
shortestList = b;
longestList = a;
}
else
{
shortestList = a;
longestList = b;
}
Dictionary<int, bool> dict = new Dictionary<int, bool>();
shortestList.ForEach(x => dict[x] = true); // the indexer tolerates repeated values, unlike Add
foreach (int i in longestList)
{
if (dict.ContainsKey(i))
{
Console.WriteLine(i);
}
}
var c = a.Intersect(b);
This only works in 3.5. I just saw your requirement, my apologies.
The method recommended by ocdecio is a good one if you're going to implement it from scratch. Looking at the time complexity compared to the naive method we see:
Sort/binary search method:
T ~= O(n log n) + O(n) * O(log n) ~= O(n log n)
Looping through both lists (naive method):
T ~= O(n) * O(n) ~= O(n ^ 2)
There may be a quicker method, but I am not aware of it. Hopefully that should justify choosing his method.
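The sort/binary search method compared above might look like the following. This is a sketch with made-up class and method names; it sorts a copy of the second list so the caller's list keeps its order, which is an assumption about what is wanted.

```csharp
using System.Collections.Generic;

// Hypothetical helper illustrating the sort + binary search approach.
public static class BinarySearchIntersection
{
    public static List<int> Intersect(List<int> a, List<int> b)
    {
        List<int> sorted = new List<int>(b); // copy so the caller's list keeps its order
        sorted.Sort();                        // O(m log m)

        List<int> result = new List<int>();
        foreach (int x in a)
        {
            // List<T>.BinarySearch returns a non-negative index when x is found.
            if (sorted.BinarySearch(x) >= 0)
                result.Add(x);
        }
        return result;
    }
}
```

For a = { 1, 2, 3, 4, 5 } and b = { 0, 4, 8, 12 } the result contains only 4. Values repeated in a are reported once per occurrence.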
(Previous answer - changed IndexOf to Contains, as IndexOf casts to an array first)
Seeing as it's two small lists, the code below should be fine. I'm not sure if there's a library with an intersection method like Java has (although List isn't a set, so it wouldn't work); as someone pointed out, the PowerCollections library has one.
List<int> a = new List<int>() {1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};
List<int> result = new List<int>();
for (int i=0;i < a.Count;i++)
{
if (b.Contains(a[i]))
result.Add(a[i]);
}
foreach (int i in result)
Console.WriteLine(i);
Update 2: HashSet was a dumb answer as it's 3.5 not 3.0
Update: HashSet seems like the obvious answer:
// Method 2 - HashSet from System.Core
HashSet<int> aSet = new HashSet<int>(a);
HashSet<int> bSet = new HashSet<int>(b);
aSet.IntersectWith(bSet);
foreach (int i in aSet)
Console.WriteLine(i);
Here is a method that removes duplicate strings. Change this to accommodate int and it will work fine.
public List<string> removeDuplicates(List<string> inputList)
{
Dictionary<string, int> uniqueStore = new Dictionary<string, int>();
List<string> finalList = new List<string>();
foreach (string currValue in inputList)
{
if (!uniqueStore.ContainsKey(currValue))
{
uniqueStore.Add(currValue, 0);
finalList.Add(currValue);
}
}
return finalList;
}
Update: Sorry, I am actually combining the lists and then removing duplicates. I am passing the combined list to this method. Not exactly what you are looking for.
Wow. The answers thus far look very complicated. Why not just use:
List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
...
public List<int> Dups(List<int> a, List<int> b)
{
List<int> ret = new List<int>();
foreach (int x in b)
{
if (a.Contains(x))
{
ret.Add(x);
}
}
return ret;
}
This seems much more straight-forward to me... unless I've missed part of the question. Which is entirely possible.
