IComparable<T> gives stackoverflow when used for negative numbers? - c#

This is a weird problem. I have implemented a simple quicksort as follows:
static void Main(string[] args)
{
List<int> unsorted = new List<int> { 1, 3, 5, 7, 9, 8, 6, 4, 2 };
List<int> sorted = quicksort(unsorted);
Console.WriteLine(string.Join(",", sorted));
Console.ReadKey();
}
private static List<T> quicksort<T>(List<T> arr) where T : IComparable<T>
{
List<T> loe = new List<T>(), gt = new List<T>();
if (arr.Count < 2)
return arr;
int pivot = arr.Count / 2;
T pivot_val = arr[pivot];
arr.RemoveAt(pivot);
foreach (T i in arr)
{
if (i.CompareTo(pivot_val) <= 0)
loe.Add(i);
else
gt.Add(i);
}
List<T> resultSet = new List<T>();
resultSet.AddRange(quicksort(loe));
gt.Add(pivot_val);
resultSet.AddRange(quicksort(gt));
return resultSet;
}
The output is: 1,2,3,4,5,6,7,8,9
But when I put any negative number in the unsorted list, I get a stack overflow error. For example:
List<int> unsorted = new List<int> { 1, 3, 5, 7, 9, 8, 6, 4, 2, -1 };
Now there is a stack overflow.
What's going on? Why is this not working?

Your algorithm has a bug. Consider the simplest input list { 1, -1 }. Let's step through your logic.
You first choose a pivot index, Count / 2, which is 1.
You then remove the pivot element at index 1 (-1) from the arr list.
Next you compare each remaining element in the arr list (there's just the 1 at index 0) with the pivot.
The 1 is greater than the pivot (-1) so you add it to the gt list.
Next you quicksort the loe list, which is empty. That sort returns an empty list, which you add to the result set.
You then add the pivot value to the end of the gt list, so the gt list now looks like this: { 1, -1 }. Notice that this is the exact same list as you started with.
You then attempt to quicksort the gt list. Since you are calling the quicksort routine with the same input, the same sequence of steps happens again, until the stack overflows.
It seems the error in your logic is that you blindly add the pivot to the gt list without comparing it to anything. The negative number is incidental: a two-element list such as { 3, 2 } loops the same way. I'll leave it to you to figure out how to make it work.
Edited to add: I'm assuming this is a homework assignment, but if it's not, I would highly recommend using .NET's built-in Sort() method on List<T>. It has been highly optimized and heavily tested, and will most likely perform better than anything home-brewed. Why reinvent the wheel?
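For reference, here is one possible way to repair the pivot handling described above. It is a minimal sketch, not the only fix: the class name QuickSortSketch is made up for illustration, and the idea is simply to keep the pivot out of both sublists and place it between the two sorted halves, so every recursive call gets a strictly smaller list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the name is an assumption made for this sketch.
public static class QuickSortSketch
{
    public static List<T> Quicksort<T>(List<T> arr) where T : IComparable<T>
    {
        if (arr.Count < 2)
            return arr;

        int pivotIndex = arr.Count / 2;
        T pivotVal = arr[pivotIndex];

        List<T> loe = new List<T>(), gt = new List<T>();
        for (int i = 0; i < arr.Count; i++)
        {
            if (i == pivotIndex)
                continue; // the pivot goes into neither sublist
            if (arr[i].CompareTo(pivotVal) <= 0)
                loe.Add(arr[i]);
            else
                gt.Add(arr[i]);
        }

        // Both sublists are strictly shorter than arr, so the recursion terminates.
        List<T> result = new List<T>(arr.Count);
        result.AddRange(Quicksort(loe));
        result.Add(pivotVal);
        result.AddRange(Quicksort(gt));
        return result;
    }
}
```

Unlike the code in the question, this version also leaves the input list unmodified, because it skips the pivot by index instead of calling RemoveAt.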

If you don't have a debugger, try this:
foreach (T i in arr)
{
if (i.CompareTo(pivot_val) <= 0)
{
loe.Add(i);
Console.WriteLine("loe.add " + i.ToString());
}
else
{
gt.Add(i);
Console.WriteLine("gt.add " + i.ToString());
}
}

Related

Is there a Linq equivalent to the unix command uniq

Every search I make assumes "Distinct()", but this is NOT my requirement. I just wish to remove all the repeats. Are there any options using linq (i.e. the Enumerable extensions) ?
For example (in C#)
int[] input = new [] {1,2,3,3,4,5,5,5,6,6,5,4,4,3,2,1,6};
int[] expected = new [] {1,2,3,4,5,6,5,4,3,2,1,6};
You are asking for non-repeating elements, not unique elements. LINQ-to-Objects operations are essentially iterators. You could write your own iterator method that yields an item only when it differs from the previous one, e.g.:
public static IEnumerable<int> DistinctUntilChanged(this IEnumerable<int> source)
{
int? previous=null;
foreach(var item in source)
{
if (item!=previous)
{
previous=item;
yield return item;
}
}
}
var input = new [] {1,2,3,3,4,5,5,5,6,6,5,4,4,3,2,1,6};
var result=input.DistinctUntilChanged().ToArray();
The result will be :
{1,2,3,4,5,6,5,4,3,2,1,6};
UPDATE
Another option is to use Observable.DistinctUntilChanged from the System.Reactive Library, eg:
var input = new[] { 1, 2, 3, 3, 4, 5, 5, 5, 6, 6, 5, 4, 4, 3, 2, 1, 6 };
var result = input.ToObservable()
.DistinctUntilChanged()
.ToEnumerable()
.ToArray();
System.Reactive (Reactive Extensions) is meant to handle sequences of events using the basic LINQ operators and more. It's easy to convert between Observable and Enumerable, though, with ToObservable() and ToEnumerable(), so it can be used to handle any collection. After all, an event sequence is similar to an "infinite" sequence.
UPDATE 2
In case there's any confusion about the use of int? to store the previous number: it allows an easy comparison even with the first element of the source, without actually calling First() on it. If it were, e.g., int previous=0; and the first element was 0, the comparison would filter out the first element.
By using an int? in C# or an int option in F# or a Maybe<int> if we have a Maybe monad we can differentiate between no initial value and an initial value of 0.
Observable.DistinctUntilChanged uses a flag to check whether we are checking the first element. The equivalent code would be:
public static IEnumerable<int> NonRepeating(this IEnumerable<int> source)
{
int previous =0;
bool isAssigned=false;
foreach (var item in source)
{
if (!isAssigned || item != previous)
{
isAssigned = true;
previous = item;
yield return item;
}
}
}
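The flag-based version above is tied to int. A generic variant might look like the following sketch; the names UniqExtensions and Uniq are invented here, and the optional comparer parameter is an assumption about what a caller might want, not something taken from System.Reactive.

```csharp
using System.Collections.Generic;

// Hypothetical extension class; names are assumptions made for this sketch.
public static class UniqExtensions
{
    // Yields an item only when it differs from the item immediately before it,
    // like the unix "uniq" command.
    public static IEnumerable<T> Uniq<T>(this IEnumerable<T> source,
                                         IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        bool hasPrevious = false; // false until the first element has been seen
        T previous = default(T);
        foreach (T item in source)
        {
            if (!hasPrevious || !comparer.Equals(previous, item))
            {
                hasPrevious = true;
                previous = item;
                yield return item;
            }
        }
    }
}
```

Because equality goes through an IEqualityComparer<T>, the same method should also work for strings, e.g. with StringComparer.OrdinalIgnoreCase.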
MoreLINQ
Finally, one can use the GroupAdjacent method from the MoreLinq library to group repeating items together. Each group contains the repeating source elements. In this particular case though we only need the key values:
var result = input.GroupAdjacent(i => i).Select(i => i.Key).ToArray();
The nice thing about GroupAdjacent is that the elements can be transformed while grouping, eg :
input.GroupAdjacent(i => i,i=>$"Number {i}")
would return groupings of strings.
It is possible with linq, although for performance and readability a simple for loop would probably be the better option.
int[] input = new[] { 1, 2, 3, 3, 4, 5, 5, 5, 6, 6, 5, 4, 4, 3, 2, 1, 6 };
var result = input.Where((x, i) => i == 0 || x != input[i - 1]).ToArray();

Will the result of a LINQ query always be guaranteed to be in the correct order?

Question: Will the result of a LINQ query always be guaranteed to be in the correct order?
Example:
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
var lowNums =
from n in numbers
where n < 5
select n;
Now, when we walk through the entries of the query result, will it be in the same order as the input data numbers is ordered?
foreach (var x in lowNums)
{
Console.WriteLine(x);
}
If someone can provide a note on the ordering in the documentation, this would be perfect.
For LINQ to Objects: Yep.
For Parallel LINQ: Nope.
For LINQ to Expression Trees (EF, L2S, etc): Nope.
I think the order of elements retrieved by a LINQ query is preserved, at least for LINQ to Objects. For LINQ to SQL or Entity Framework, it may depend on the order of the records in the table. For LINQ to Objects, I'll try to explain why it preserves the order.
When the LINQ query is executed, GetEnumerator() is called on the IEnumerable source, and a while loop then fetches each next element using MoveNext(). This is how a foreach works on an IEnumerable source, and we all know that a foreach preserves the order of the elements in a list/collection. Digging more deeply into MoveNext(), I think it just keeps a position holding the current index; MoveNext() increments the position and yields the corresponding element (at the new position). That's why the order should be preserved: it only changes if you explicitly call something like OrderBy or OrderByDescending.
If you think this
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
foreach(var i in numbers)
if(i < 5) Console.Write(i + " ");
prints out 4 1 3 2 0 you should think this
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
IEnumerator ie = numbers.GetEnumerator();
while(ie.MoveNext()){
if((int)ie.Current < 5) Console.Write(ie.Current + " ");
}
also prints out 4 1 3 2 0. Hence this LINQ query
var lowNums = from n in numbers
where n < 5
select n;
foreach (var i in lowNums) {
Console.Write(i + " ");
}
should also print out 4 1 3 2 0.
Conclusion: The order of elements in LINQ depends on how MoveNext() of an IEnumerator obtained from an IEnumerable is implemented. However, it's for sure that the order of elements in LINQ result will be the same order a foreach loop works on the elements.
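To make the LINQ to Objects versus Parallel LINQ distinction from the answers above concrete, here is a small sketch. The class and method names are made up for illustration. The first method relies on Where over an array, which enumerates the source in order; the second assumes you want PLINQ and shows AsOrdered(), which asks PLINQ to preserve the source order.

```csharp
using System;
using System.Linq;

// Hypothetical demo class; names are assumptions made for this sketch.
public static class OrderDemo
{
    // LINQ to Objects: Where enumerates the source in order, so the filtered
    // result keeps the relative order of the input.
    public static int[] LowNumsSequential(int[] numbers)
    {
        return numbers.Where(n => n < 5).ToArray();
    }

    // Parallel LINQ: without AsOrdered() the result order is not guaranteed;
    // with it, PLINQ preserves the order of the source sequence.
    public static int[] LowNumsParallelOrdered(int[] numbers)
    {
        return numbers.AsParallel().AsOrdered().Where(n => n < 5).ToArray();
    }
}
```

For queries translated to SQL (EF, LINQ to SQL), the usual approach is an explicit OrderBy, since the database decides the row order otherwise.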

Is there a C# equivalent to C++ std::partial_sort?

I'm trying to implement a paging algorithm for a dataset sortable via many criteria. Unfortunately, while some of those criteria can be implemented at the database level, some must be done at the app level (we have to integrate with another data source). We have a paging (actually infinite scroll) requirement and are looking for a way to minimize the pain of sorting the entire dataset at the app level with every paging call.
What is the best way to do a partial sort, only sorting the part of the list that absolutely needs to be sorted? Is there an equivalent to C++'s std::partial_sort function available in the .NET libraries? How should I go about solving this problem?
EDIT: Here's an example of what I'm going for:
Let's say I need to get elements 21-40 of a 1000 element set, according to some sorting criteria. In order to speed up the sort, and since I have to go through the whole dataset every time anyway (this is a web service over HTTP, which is stateless), I don't need the whole dataset ordered. I only need elements 21-40 to be correctly ordered. It is sufficient to create 3 partitions: Elements 1-20, unsorted (but all less than element 21); elements 21-40, sorted; and elements 41-1000, unsorted (but all greater than element 40).
OK. Here's what I would try based on what you said in reply to my comment.
I want to be able to say "4th through 6th" and get something like: 3, 2, 1 (unsorted, but all less than proper 4th element); 4, 5, 6 (sorted and in the same place they would be for a sorted list); 8, 7, 9 (unsorted, but all greater than proper 6th element).
Let's add 10 to our list to make it easier: 10, 9, 8, 7, 6, 5, 4, 3, 2, 1.
So, what you could do is use the quickselect algorithm to find the ith and kth elements. In your case above i is 4 and k is 6. That will of course return the values 4 and 6. That's going to take two passes through your list, so the expected runtime so far is O(2n) = O(n). The next part is easy, of course. We have lower and upper bounds on the data we care about. All we need to do is make another pass through our list looking for any element that is between our upper and lower bounds. If we find such an element we throw it into a new List. Finally, we sort that List, which contains only the ith through kth elements that we care about.
So, I believe the total runtime ends up being O(N) + O((k-i)lg(k-i)).
static void Main(string[] args) {
    // Create a list of 10 million items in random order
    var list = Enumerable.Range(1, 10000000).OrderBy(x => Guid.NewGuid()).ToList();
    var sw = Stopwatch.StartNew();
    var slowOrder = list.OrderBy(x => x).Skip(10).Take(10).ToList();
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
    // Took ~8 seconds on my machine
    sw.Restart();
    var smallVal = Quickselect(list, 11);
    var largeVal = Quickselect(list, 20);
    // ToList() forces the deferred query to run, so filtering and sorting are timed too
    var elements = list.Where(el => el >= smallVal && el <= largeVal).OrderBy(el => el).ToList();
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
    // Took ~1 second on my machine
}
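// Returns the k-th smallest element (1-based) of the list.
public static T Quickselect<T>(IList<T> list, int k) where T : IComparable {
    Random rand = new Random();
    int r = rand.Next(0, list.Count);
    T pivot = list[r];
    List<T> smaller = new List<T>();
    List<T> larger = new List<T>();
    foreach (T element in list) {
        // CompareTo only promises a negative, zero, or positive result,
        // so test the sign instead of comparing with -1 and 1.
        var comparison = element.CompareTo(pivot);
        if (comparison < 0) {
            smaller.Add(element);
        }
        else if (comparison > 0) {
            larger.Add(element);
        }
    }
    if (k <= smaller.Count) {
        // The answer is among the elements smaller than the pivot
        return Quickselect(smaller, k);
    }
    else if (k > list.Count - larger.Count) {
        // Skip the smaller elements and everything equal to the pivot
        return Quickselect(larger, k - (list.Count - larger.Count));
    }
    else {
        return pivot;
    }
}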
You can use List<T>.Sort(int, int, IComparer<T>):
inputList.Sort(startIndex, count, Comparer<T>.Default);
Array.Sort() has an overload that accepts index and length arguments that lets you sort a subset of an array. The same exists for List.
You cannot sort an IEnumerable directly, of course.
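As a quick illustration of the List<T>.Sort(int, int, IComparer<T>) overload mentioned above, here is a small sketch. RangeSortDemo and SortRange are names invented for this example. Note that this overload only orders the elements that already sit in the given index range; it does not move the globally smallest elements into that range the way std::partial_sort does, so it would have to be combined with a partitioning step such as quickselect.

```csharp
using System.Collections.Generic;

// Hypothetical demo class; names are assumptions made for this sketch.
public static class RangeSortDemo
{
    // Returns a copy of the input in which only the elements at positions
    // startIndex .. startIndex + count - 1 have been sorted.
    public static List<int> SortRange(List<int> input, int startIndex, int count)
    {
        List<int> copy = new List<int>(input);
        copy.Sort(startIndex, count, Comparer<int>.Default);
        return copy;
    }
}
```

For example, SortRange on { 5, 3, 9, 1, 7, 2 } with startIndex 1 and count 3 reorders only 3, 9, 1 and leaves the rest alone, giving { 5, 1, 3, 9, 7, 2 }.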

Top 5 values from three given arrays

Recently I faced a question in C#. The question is:
There are three int arrays:
Array1={88,65,09,888,87}
Array2={1,49,921,13,33}
Array3={22,44,66,88,110}
Now I have to get an array of the highest 5 values from all three arrays. What is the most optimized way of doing this in C#?
The way I can think of is to take an array of size 15, add the elements of all three arrays, sort it, and get the last 5.
An easy way with LINQ:
int[] top5 = array1.Concat(array2).Concat(array3).OrderByDescending(i => i).Take(5).ToArray();
An optimal way:
List<int> highests = new List<int>(); // Keep the current top 5 sorted
// Traverse each array. No need to put them together in an int[][]..it's just for simplicity
foreach (int[] array in new int[][] { array1, array2, array3 }) {
foreach (int i in array) {
int index = highests.BinarySearch(i); // where should i be?
if (highests.Count < 5) { // if not 5 yet, add anyway
if (index < 0) {
highests.Insert(~index, i);
} else { //add (duplicate)
highests.Insert(index, i);
}
}
else if (index < 0) { // not in top-5 yet, add
highests.Insert(~index, i);
highests.RemoveAt(0);
} else if (index > 0) { // already in top-5, add (duplicate)
highests.Insert(index, i);
highests.RemoveAt(0);
}
}
}
Keep a sorted list of the top-5 and traverse each array just once.
You may even check the lowest of the top-5 first, avoiding the BinarySearch for values that cannot enter the list:
List<int> highests = new List<int>();
foreach (int[] array in new int[][] { array1, array2, array3 }) {
    foreach (int i in array) {
        // Once we hold 5 values, skip anything not larger than the lowest of them
        if (highests.Count == 5 && i <= highests[0]) continue;
        int index = highests.BinarySearch(i); // where should i be?
        highests.Insert(index < 0 ? ~index : index, i); // new value or duplicate
        if (highests.Count > 5) highests.RemoveAt(0); // drop the lowest
    }
}
The most optimized way for a fixed K=5 is going through all arrays five times, picking the highest element not taken so far on each pass. You need to mark the element that you take in order to skip it on subsequent passes. This has the complexity of O(N1+N2+N3) (you go through all N1+N2+N3 elements five times), which is as fast as it can get.
You can combine the arrays using LINQ, sort them, then reverse.
int[] a1 = new int[] { 1, 10, 2, 9 };
int[] a2 = new int[] { 3, 8, 4, 7 };
int[] a3 = new int[] { 2, 9, 8, 4 };
int[] a4 = a1.Concat(a2).Concat(a3).ToArray();
Array.Sort(a4);
Array.Reverse(a4);
for (int i = 0; i < 5; i++)
{
Console.WriteLine(a4[i].ToString());
}
Console.ReadLine();
Prints: 10, 9, 9, 8, 8 from the sample I provided as input for the arrays.
Maybe you could have an array of 5 elements which would be the "max values" array.
Initially fill it with the first 5 values, which in your case would just be the first array. Then loop through the rest of the values. For each value, check it against the 5 max values from least to greatest. If you find the current value from the main list is greater than the value in the max values array, insert it above that element in the array, which would push the last element out. At the end you should have an array of the 5 max values.
For three arrays of length N1,N2,N3, the fastest way should be combining the 3 arrays, and then finding the (N1+N2+N3-4)th order statistic using modified quick sort.
In the resultant array, the elements with indices (N1+N2+N3-5) to the maximum (N1+N2+N3-1) should be your 5 largest. You can also sort them later.
The time complexity of this approach is O(N1+N2+N3) on average.
Here are the two ways for doing this task. The first one is using only basic types. This is the most efficient way, with no extra loop, no extra comparison, and no extra memory consumption. You just pass the index of elements that need to be matched with another one and calculate which is the next index to be matched for each given array.
First Way -
http://www.dotnetbull.com/2013/09/find-max-top-5-number-from-3-sorted-array.html
Second Way -
int[] Array1 = { 09, 65, 87, 89, 888 };
int[] Array2 = { 1, 13, 33, 49, 921 };
int[] Array3 = { 22, 44, 66, 88, 110 };
int [] MergeArr = Array1.Concat(Array2).Concat(Array3).ToArray();
Array.Sort(MergeArr);
int [] Top5Number = MergeArr.Reverse().Take(5).ToArray();
Taken From -
Find max top 5 number from three given sorted array
Short answer: Use a SortedList from Sorted Collection Types in .NET as a min-heap.
Explanation:
From the first array, add 5 elements to this SortedList/min-heap;
Now iterate through all the rest of the elements of arrays:
If an array element is bigger than the smallest element in min-heap then remove the min element and push this array element in the heap;
Else, continue to next array element;
In the end, your min-heap has the 5 biggest elements of all arrays.
Complexity: Takes Log k time to find the minimum when you have a SortedList of k elements. Multiply that by total elements in all arrays because you are going to perform this 'find minimum operation' that many times.
Brings us to overall complexity of O(n * Log k) where n is the total number of elements in all your arrays and k is the number of highest numbers you want.
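Here is a sketch of the min-heap idea described above. It deliberately swaps in PriorityQueue<TElement, TPriority> instead of SortedList, which assumes .NET 6 or later; one reason is that SortedList<TKey, TValue>.Add throws on duplicate keys, and the sample data contains 88 twice. The TopK class and the Largest method are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; names are assumptions made for this sketch.
// Requires .NET 6+ for PriorityQueue<TElement, TPriority>.
public static class TopK
{
    // Returns the k largest values across all arrays, largest first.
    public static int[] Largest(int k, params int[][] arrays)
    {
        if (k <= 0)
            return new int[0];

        // Min-heap: the smallest of the current top-k is always at the front.
        var heap = new PriorityQueue<int, int>();
        foreach (int[] array in arrays)
        {
            foreach (int value in array)
            {
                if (heap.Count < k)
                {
                    heap.Enqueue(value, value);
                }
                else if (value > heap.Peek())
                {
                    heap.Dequeue();             // remove the current minimum
                    heap.Enqueue(value, value); // and take the larger value instead
                }
            }
        }

        // Dequeue returns values in ascending order, so fill the result from the back.
        int[] result = new int[heap.Count];
        for (int i = result.Length - 1; i >= 0; i--)
            result[i] = heap.Dequeue();
        return result;
    }
}
```

Each element costs at most O(log k) heap work, which matches the O(n * log k) bound given above.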

Check two List<int>'s for the same numbers

I have two List's which I want to check for corresponding numbers.
for example
List<int> a = new List<int>(){1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};
Should give the result 4.
Is there an easy way to do this without too much looping through the lists?
I'm on 3.0 for the project where I need this so no Linq.
You can use the .net 3.5 .Intersect() extension method:-
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> common = a.Intersect(b).ToList();
Jeff Richter's excellent PowerCollections has Set with Intersections. Works all the way back to .NET 2.0.
http://www.codeplex.com/PowerCollections
Set<int> set1 = new Set<int>(new[]{1,2,3,4,5});
Set<int> set2 = new Set<int>(new[]{0,4,8,12});
Set<int> set3 = set1.Intersection(set2);
You could do it the way that LINQ does it, effectively - with a set. Now before 3.5 we haven't got a proper set type, so you'd need to use a Dictionary<int,int> or something like that:
Create a Dictionary<int, int> and populate it from list a using the element as both the key and the value for the entry. (The value in the entry really doesn't matter at all.)
Create a new list for the intersections (or write this as an iterator block, whatever).
Iterate through list b, and check with dictionary.ContainsKey: if it does, add an entry to the list or yield it.
That should be O(N+M) (i.e. linear in both list sizes)
Note that that will give you repeated entries if list b contains duplicates. If you wanted to avoid that, you could always change the value of the dictionary entry when you first see it in list b.
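The dictionary-based steps above could be sketched roughly like this, using only pre-LINQ features. ListIntersection and Intersect are hypothetical names, and storing a bool "already reported" flag as the dictionary value is just one way to implement the duplicate handling suggested in the last paragraph.

```csharp
using System.Collections.Generic;

// Hypothetical helper class; names are assumptions made for this sketch.
public static class ListIntersection
{
    public static List<int> Intersect(List<int> a, List<int> b)
    {
        // Key: a value from list a. Value: whether it was already added to the result.
        Dictionary<int, bool> seen = new Dictionary<int, bool>();
        foreach (int x in a)
            seen[x] = false; // the indexer tolerates duplicates in a

        List<int> result = new List<int>();
        foreach (int x in b)
        {
            bool reported;
            if (seen.TryGetValue(x, out reported) && !reported)
            {
                seen[x] = true; // flip the flag so duplicates in b are skipped
                result.Add(x);
            }
        }
        return result;
    }
}
```

Both loops are single passes with O(1) expected dictionary operations, which gives the O(N+M) behaviour mentioned above.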
You can sort the second list and loop through the first one and for each value do a binary search on the second one.
If both lists are sorted, you can easily do this in O(n) time by doing a modified merge from merge sort: simply "remove" (step a counter past) the lower of the two leading numbers, and if they are ever equal, save that number to the result list and "remove" both of them. It takes less than n(1) + n(2) steps. This is of course assuming they are sorted, but sorting integer arrays isn't exactly expensive at O(n log(n))... I think. If you'd like I can throw together some code on how to do this, but the idea is pretty simple.
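A minimal sketch of that merge-style intersection, assuming both lists are already sorted in ascending order (the class and method names are invented for this example):

```csharp
using System.Collections.Generic;

// Hypothetical helper class; names are assumptions made for this sketch.
public static class SortedIntersection
{
    // Both inputs must already be sorted in ascending order.
    public static List<int> IntersectSorted(List<int> a, List<int> b)
    {
        List<int> result = new List<int>();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            if (a[i] < b[j])
            {
                i++; // a's leading number is lower: step past it
            }
            else if (a[i] > b[j])
            {
                j++; // b's leading number is lower: step past it
            }
            else
            {
                result.Add(a[i]); // equal: keep it and step past both
                i++;
                j++;
            }
        }
        return result;
    }
}
```

If the lists are not sorted yet, calling Sort() on each first keeps the whole thing at O(n log n).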
Tested on 3.0
List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> intersection = new List<int>();
Dictionary<int, int> dictionary = new Dictionary<int, int>();
a.ForEach(x => { if(!dictionary.ContainsKey(x))dictionary.Add(x, 0); });
b.ForEach(x => { if(dictionary.ContainsKey(x)) dictionary[x]++; });
foreach(var item in dictionary)
{
if(item.Value > 0)
intersection.Add(item.Key);
}
In a comment on the question, the author said that there will be "Max 15 in the first list and 20 in the second list".
In this case I wouldn't bother with optimizations and would just use List.Contains.
For larger lists, a hash can be used to take advantage of O(1) lookup, which leads to an O(N+M) algorithm, as Jon noted.
A hash requires additional space. To reduce memory usage we should hash the shortest list.
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> shortestList;
List<int> longestList;
if (a.Count > b.Count)
{
shortestList = b;
longestList = a;
}
else
{
shortestList = a;
longestList = b;
}
Dictionary<int, bool> dict = new Dictionary<int, bool>();
shortestList.ForEach(x => dict[x] = true); // the indexer tolerates duplicate values, unlike Add
foreach (int i in longestList)
{
if (dict.ContainsKey(i))
{
Console.WriteLine(i);
}
}
var c = a.Intersect(b);
This only works in 3.5. I just saw your requirement, my apologies.
The method recommended by ocdecio is a good one if you're going to implement it from scratch. Looking at the time complexity compared to the naive method we see:
Sort/binary search method:
T ~= O(n log n) + O(n) * O(log n) ~= O(n log n)
Looping through both lists (naive method):
T ~= O(n) * O(n) ~= O(n ^ 2)
There may be a quicker method, but I am not aware of it. Hopefully that should justify choosing his method.
(Previous answer - changed IndexOf to Contains, as IndexOf casts to an array first)
Seeing as it's two small lists the code below should be fine. Not sure if there's a library with an intersection method like Java has (although List isn't a set so it wouldn't work), I know as someone pointed out the PowerCollection library has one.
List<int> a = new List<int>() {1, 2, 3, 4, 5};
List<int> b = new List<int>() {0, 4, 8, 12};
List<int> result = new List<int>();
for (int i=0;i < a.Count;i++)
{
if (b.Contains(a[i]))
result.Add(a[i]);
}
foreach (int i in result)
Console.WriteLine(i);
Update 2: HashSet was a dumb answer as it's 3.5 not 3.0
Update: HashSet seems like the obvious answer:
// Method 2 - HashSet from System.Core
HashSet<int> aSet = new HashSet<int>(a);
HashSet<int> bSet = new HashSet<int>(b);
aSet.IntersectWith(bSet);
foreach (int i in aSet)
Console.WriteLine(i);
Here is a method that removes duplicate strings. Change this to accommodate int and it will work fine.
public List<string> removeDuplicates(List<string> inputList)
{
Dictionary<string, int> uniqueStore = new Dictionary<string, int>();
List<string> finalList = new List<string>();
foreach (string currValue in inputList)
{
if (!uniqueStore.ContainsKey(currValue))
{
uniqueStore.Add(currValue, 0);
finalList.Add(currValue);
}
}
return finalList;
}
Update: Sorry, I am actually combining the lists and then removing duplicates. I am passing the combined list to this method. Not exactly what you are looking for.
Wow. The answers thus far look very complicated. Why not just use :
List<int> a = new List<int>() { 1, 2, 3, 4, 5, 12, 13 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
...
public List<int> Dups(List<int> a, List<int> b)
{
List<int> ret = new List<int>();
foreach (int x in b)
{
if (a.Contains(x))
{
ret.Add(x);
}
}
return ret;
}
This seems much more straight-forward to me... unless I've missed part of the question. Which is entirely possible.
