C# Sort List Based on Another List - c#

I have a class that contains multiple List<> objects. It's basically a table stored with each column as a List<>. The columns do not all hold the same type, but every list has the same length (the same number of elements).
For example:
I have 3 List<> objects; one List, two List, and three List.
List<DateTime> one = new List<DateTime> { new DateTime(2010, 4, 12), new DateTime(2006, 4, 9), new DateTime(2008, 4, 13) };
List<double> two = new List<double> { 24.5, 56.2, 47.4 };
List<string> three = new List<string> { "B", "K", "Z" };
I want to be able to sort list one from oldest to newest:
one = {4/9/2006, 4/13/2008, 4/12/2010};
So to do this I moved element 0 to the end.
I then want to sort list two and three the same way; moving the first to the last.
So when I sort one list, I want the data in the corresponding index in the other lists to also change in accordance with how the one list is sorted.
I'm guessing I have to overload IComparer somehow, but I feel like there's a shortcut I haven't realized.

I've handled this design in the past by keeping or creating a separate index list. You first sort the index list, and then use it to sort (or just access) the other lists. You can do this by creating a custom IComparer for the index list. What you do inside that IComparer is to compare based on indexes into the key list. In other words, you are sorting the index list indirectly. Something like:
// This is the compare function for the separate *index* list.
int Compare (object x, object y)
{
    return KeyList[(int) x].CompareTo(KeyList[(int) y]);
}
So you are sorting the index list based on the values in the key list. Then you can use that sorted index list to re-order the other lists. If this is unclear, I'll try to add a more complete example when I get in a situation to post one.
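A minimal sketch of the index-list idea, assuming a generic Comparison<int> lambda in place of the non-generic IComparer described above. The names IndexSort, SortedIndexes and Reorder are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexSort
{
    // Returns the positions of keyList's elements in ascending key order.
    public static List<int> SortedIndexes<TKey>(IList<TKey> keyList)
        where TKey : IComparable<TKey>
    {
        var indexes = Enumerable.Range(0, keyList.Count).ToList();
        // Compare two indexes by the keys they point at, not by their own values.
        indexes.Sort((x, y) => keyList[x].CompareTo(keyList[y]));
        return indexes;
    }

    // Builds a new list whose i-th element is source[indexes[i]].
    public static List<T> Reorder<T>(IList<T> source, IList<int> indexes)
    {
        return indexes.Select(i => source[i]).ToList();
    }
}
```

With the lists from the question, you would call SortedIndexes(one) once and then pass the result to Reorder for one, two and three. List<T>.Sort is not a stable sort, so rows with equal keys may come out in any relative order, but all three lists still stay aligned because they share the same index list.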

Here's a way to do it using LINQ and projections. The first query generates an array with the original indexes reordered by the datetime values; in your example, the newOrdering array would have members:
{ 4/9/2006, 1 }, { 4/13/2008, 2 }, { 4/12/2010, 0 }
The second set of statements generates new lists by picking items using the reordered indexes (in other words, items 1, 2, and 0, in that order).
var newOrdering = one
.Select((dateTime, index) => new { dateTime, index })
.OrderBy(item => item.dateTime)
.ToArray();
// now, order each list
one = newOrdering.Select(item => one[item.index]).ToList();
two = newOrdering.Select(item => two[item.index]).ToList();
three = newOrdering.Select(item => three[item.index]).ToList();

I am sorry to say, but this feels like a bad design. Especially because List<T> does not guarantee element order before you have called one of the sorting operations (so you have a problem when inserting):
From MSDN:
The List is not guaranteed to be sorted. You must sort the List before performing operations (such as BinarySearch) that require the List to be sorted.
In many cases you won't run into trouble based on this, but you might, and if you do, it could be a very hard bug to track down. For example, I think the current framework implementation of List<T> maintains insert order until sort is called, but it could change in the future.
I would seriously consider refactoring to use another data structure. If you still want to implement sorting based on this data structure, I would create a temporary object (maybe using an anonymous type), sort this, and re-create the lists (see this excellent answer for an explanation of how).

First you should create a Data object to hold everything.
private class Data
{
public DateTime DateTime { get; set; }
public int Int32 { get; set; }
public string String { get; set; }
}
Then you can sort like this.
var l = new List<Data>();
l.Sort(
    (a, b) =>
    {
        // Compare by date first, then break ties with the other fields.
        var r = a.DateTime.CompareTo(b.DateTime);
        if (r == 0)
        {
            r = a.Int32.CompareTo(b.Int32);
            if (r == 0)
            {
                r = a.String.CompareTo(b.String);
            }
        }
        return r;
    }
);

I wrote a sort algorithm that does this for Nito.LINQ (not yet released). It uses a simple-minded QuickSort to sort the lists, and keeps any number of related lists in sync. Source code starts here, in the IList<T>.Sort extension method.
Alternatively, if copying the data isn't a huge concern, you could project it into a LINQ query using the Zip operator (requires .NET 4.0 or Rx), order it, and then pull each result out:
List<DateTime> one = ...;
List<double> two = ...;
List<string> three = ...;
var combined = one.Zip(two, (first, second) => new { first, second })
.Zip(three, (pair, third) => new { pair.first, pair.second, third });
var ordered = combined.OrderBy(x => x.first);
var orderedOne = ordered.Select(x => x.first);
var orderedTwo = ordered.Select(x => x.second);
var orderedThree = ordered.Select(x => x.third);
Naturally, the best solution is to not separate related data in the first place.

Using generic arrays, this can get a bit cumbersome.
One alternative is using the Array.Sort() method that takes an array of keys and an array of values to sort. It first sorts the key array into ascending order and makes sure the array of values is reorganized to match this sort order.
If you're willing to incur the cost of converting your List<T>s to arrays (and then back), you could take advantage of this method.
Alternatively, you could use LINQ to combine the values from multiple arrays into a single anonymous type using Zip(), sort the list of anonymous types using the key field, and then split that apart into separate arrays.
If you want to do this in-place, you would have to write a custom comparer and create a separate index array to maintain the new ordering of items.
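As a rough sketch of the Array.Sort(keys, items) route: rather than sorting each value array against its own copy of the keys, the example below sorts an array of positions alongside the keys and then reuses that permutation for every other list. The names KeyedArraySort and SortKeysWithPermutation are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyedArraySort
{
    // Sorts 'keys' ascending in place and returns the permutation applied:
    // result[i] is the original position of the element now at position i.
    public static int[] SortKeysWithPermutation<TKey>(TKey[] keys)
    {
        int[] positions = Enumerable.Range(0, keys.Length).ToArray();
        // Array.Sort(keys, items) reorders 'items' in step with 'keys'.
        Array.Sort(keys, positions);
        return positions;
    }
}
```

For the question's data, you would copy one into an array with ToArray(), call SortKeysWithPermutation on it, and then rebuild two and three with positions.Select(i => two[i]).ToList() and the equivalent for three. This uses the default comparer for the key type, so the keys must be comparable.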

I hope this helps:
one.Sort(delegate(DateTime d1, DateTime d2)
{
    // Oldest to newest; swap d1 and d2 for newest to oldest.
    return d1.CompareTo(d2);
});

Related

nested hashset of lists?

I'm working on one of the Project Euler problems, and I wanted to take the approach of creating a list of values and adding the list to a HashSet. This way I could check in constant time whether the list already exists in the HashSet, with the end goal of counting the number of lists in the HashSet for my final result.
The problem I'm having is when I create a list in this manner.
HashSet<List<int>> finalList = new HashSet<List<int>>();
List<int> candidate = new List<int>();
candidate.Add(5);
finalList.Add(candidate);
if (finalList.Contains(candidate) == false) finalList.Add(candidate);
candidate.Clear();
//try next value
Obviously the finalList[0] item is cleared when I clear the candidate and is not giving me the desired result. Is it possible to have a hashset of lists(of integers) like this? How would I ensure a new list is instantiated each time and added as a new item to the hashset, perhaps say in a for loop testing many values and possible list combinations?
Why don't you use a value which is unique for each list as a key or identifier? You could create a HashSet for your keys which will unlock your lists.
You can use a Dictionary instead. The only thing is you have to test to see if the Dictionary already has the list. This is easy to do, by creating a simple class that supports this need.
class TheSimpleListManager
{
private Dictionary<String, List<Int32>> Lists = new Dictionary<String, List<Int32>>();
public void AddList(String key, List<Int32> list)
{
if(!Lists.ContainsKey(key))
{
Lists.Add(key, list);
}
else
{
// list already exists....
}
}
}
This is just a quick sample of an approach.
To fix your clear issue: since it's an object reference, you would have to create a new List and add it to the HashSet.
You can create the new List by passing the old one into its constructor.
HashSet<List<int>> finalList = new HashSet<List<int>>();
List<int> candidate = new List<int>();
candidate.Add(5);
var newList = new List<int>(candidate);
finalList.Add(newList);
if (finalList.Contains(newList) == false) //Not required for HashSet
finalList.Add(newList);
candidate.Clear();
NOTE: HashSet internally does a contains check before adding items. In other words, even if you execute finalList.Add(newList); n times, newList is added only once. Therefore the Contains check is not necessary.
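One thing worth noting here: by default a HashSet<List<int>> compares list references, so two separate lists holding the same numbers count as two different entries. If the goal is to treat lists with equal contents as the same item, one option is to pass an IEqualityComparer to the HashSet constructor. The sketch below assumes that order matters; IntListComparer is a hypothetical name and the hash combination is deliberately simple.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two lists are equal when they hold the same
// integers in the same order.
public sealed class IntListComparer : IEqualityComparer<List<int>>
{
    public bool Equals(List<int> x, List<int> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<int> list)
    {
        if (list == null) return 0;
        unchecked
        {
            // Simple order-sensitive hash; overflow wraps around silently.
            int hash = 17;
            foreach (int value in list)
                hash = hash * 31 + value;
            return hash;
        }
    }
}
```

The set would then be created as new HashSet<List<int>>(new IntListComparer()). You still need to add a copy of the candidate list each time, and a list should not be modified after it has been added, because its hash code would no longer match the bucket it was stored in.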

Sorting by name after each entry

I am using this LINQ statement to sort a list by product name (ascending order) which contains product names (string) and Sizes available for each product (List<byte>);
LinkedList<FullItemDetails> itemDetails = new LinkedList<FullItemDetails>();
public class FullItemDetails
{
public string ProductName { get; set; }
public List<byte> Sizes { get; set; }
}
Now every time I input a new entry ex; Jacket,6,12,18,10, I think my program is sorting my list all over again;
itemDetails.AddLast(fullItemDetails);
//SortedProducts
itemDetails = Products.OrderBy(x => x.ProductName).ToList();
If the list is already sorted I only need to put the last entry in its correct place.
What is the best way to do this, and to reduce the complexity of the algorithm? Thanks.
This seems like an ideal problem for a SortedList, as you have a key (name) and value (List<int> for the size).
Documentation is available here: http://msdn.microsoft.com/en-us/library/system.collections.sortedlist.aspx
The list declaration would look like this: SortedList<string, List<int> >. All inserts would be sorted on string, and the values can be enumerated based on each key.
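A small sketch of what this could look like, assuming List<byte> as the value type to match the Sizes property in the question (the answer above writes List<int>). ProductCatalog and AddProduct are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ProductCatalog
{
    // Adds a product, or merges the sizes into an existing entry, because
    // SortedList.Add throws an ArgumentException on a duplicate key.
    public static void AddProduct(
        SortedList<string, List<byte>> products, string name, IEnumerable<byte> sizes)
    {
        List<byte> existing;
        if (products.TryGetValue(name, out existing))
            existing.AddRange(sizes);
        else
            products.Add(name, new List<byte>(sizes));
    }
}
```

Enumerating products.Keys or products.Values then yields the entries in key order without any explicit re-sort. Keep in mind that SortedList is backed by arrays, so inserting a key that does not belong at the end has to shift elements.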
Instead of List<T>, use SortedList<TKey, TValue> or SortedSet<T>. You can pass in an IComparer<T> to use a specific sorting algorithm through the respective constructor. Should you want to use a Lambda expression, you can use a small wrapper class to wrap a Comparison<T>.
Which will result in something like:
ICollection<FullItemDetails> _itemList = new SortedSet<FullItemDetails>(new ComparisonComparer<FullItemDetails>((x, y) => x.ProductName.CompareTo(y.ProductName)));
Your collection will now always be ordered.
If you're using .NET 4.5 or later, you can use Comparer<T>.Create to create an IComparer<T> implementation from a lambda expression.
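For illustration, here is roughly how Comparer<T>.Create could replace the wrapper class. The FullItemDetails class is repeated from the question so the snippet is self-contained; SortedItems is a hypothetical helper name, and ordinal string comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

public class FullItemDetails
{
    public string ProductName { get; set; }
    public List<byte> Sizes { get; set; }
}

public static class SortedItems
{
    // Creates a set that keeps its items ordered by ProductName.
    public static SortedSet<FullItemDetails> Create()
    {
        return new SortedSet<FullItemDetails>(
            Comparer<FullItemDetails>.Create(
                (x, y) => string.CompareOrdinal(x.ProductName, y.ProductName)));
    }
}
```

Note that a SortedSet treats two items that compare as equal as duplicates, so a second item with the same ProductName is not added and Add returns false.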
You can use a SortedList<string, FullItemDetails>.
You add your items like this: list.Add(fullItemDetails.ProductName, fullItemDetails);
[Edit]: The order is preserved after adding or removing an element.
[Edit2] Using LINQ:
You use a list to store your items (adding/removing), List<FullItemDetails> originalList, and another property to read your sorted data:
IEnumerable<FullItemDetails> sortedList = originalList.OrderBy(e => e.ProductName).ThenBy(e => /* logic to order by another property */);
and now you can iterate through your sortedList and as this sorted list is an IEnumerable<T> each time you iterate through it you will have exactly the same elements as in your originalList (after adding or removing items).
In other words : sortedList only contains the logic to read your originalList.
Hope this helps.
Regards.

Using C#, what's an efficient way to compare/merge two generic lists of the same type?

If I have two generic lists, List<Place>, and I want to merge all the unique Place objects into one List<Place>, based on the Place.Id property, what's a good method of doing this efficiently?
One list will always contain 50, the other list could contain significantly more.
result = list1.Union(list2, new ElementComparer());
You need to create ElementComparer to implement IEqualityComparer. E.g. see this
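A minimal sketch of such a comparer, assuming Place has an int Id property as in the question. The Place class shown here is only a stand-in, and its Name property is an invented placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's Place type; only Id matters for the comparison.
public class Place
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Treats two Place objects as equal when their Id values match.
public sealed class ElementComparer : IEqualityComparer<Place>
{
    public bool Equals(Place x, Place y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Place place)
    {
        // Must agree with Equals: equal Ids give equal hash codes.
        return place == null ? 0 : place.Id.GetHashCode();
    }
}
```

Union yields the distinct elements of the first list followed by the elements of the second list that have not been seen yet, so when both lists contain the same Id, the object from list1 is the one that is kept.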
If you want to avoid having to define your own ElementComparer and just use lambda expressions, you can try the following:
List<Place> listOne = /* whatever */;
List<Place> listTwo = /* whatever */;
List<Place> listMerge = listOne.Concat(
listTwo.Where(p1 =>
!listOne.Any(p2 => p1.Id == p2.Id)
)
).ToList();
Essentially this will just concatenate the Enumerable listOne with the set of all elements in listTwo such that the elements are not in the intersection between listOne and listTwo.
Enumerable.Distinct Method
Note: .NET 3.5 & above.
If you want to emphasize efficiency, I suggest you write a small method to do the merge yourself:
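List<Place> constantList; // always contains 50 elements, no duplicate elements (populated elsewhere)
List<Place> targetList;   // populated elsewhere
var result = new List<Place>();
var dict = new Dictionary<int, Place>();
foreach (var p in constantList)
    dict[p.Id] = p;
result.AddRange(constantList);
foreach (var p in targetList)
{
    // Keep only the places whose Id is not already in the constant list.
    if (!dict.ContainsKey(p.Id))
        result.Add(p);
}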
If speed is what you need, you need to compare using a Hashing mechanism. What I would do is maintain a Hashset of the ids that you have already read and then add the elements to the result if the id hasn't been read yet. You can do this for as many lists as you want and can return an IEnumerable instead of a list if you want to start consuming before the merge is over.
public IEnumerable<Place> Merge(params List<Place>[] lists)
{
HashSet<int> _ids = new HashSet<int>();
foreach(List<Place> list in lists)
{
foreach(Place place in list)
{
if (!_ids.Contains(place.Id))
{
_ids.Add(place.Id);
yield return place;
}
}
}
}
The fact that one list has 50 elements and the other one many more has no implication. Unless you know that the lists are ordered...

How to compare two sorted large lists efficiently in C#?

I have got two generic lists with 20,000 and 30,000 objects in each list.
class Employee
{
string name;
double salary;
}
List<Employee> newEmployeeList = new List<Employee>() {....} // contains 20,000 objects
List<Employee> oldEmployeeList = new List<Employee>() {....} // contains 30,000 objects
Lists can also be sorted by name if it improves the speed.
I want to compare these two lists to find:
employees whose name and salary both match
employees whose name matches but whose salary does not
What is the fastest way to compare such large data lists with above conditions?
I would sort both the newEmployeeList and oldEmployeeList lists by name, which is O(n*log(n)). Then you can use a linear algorithm to search for matches. The total would be O(n + n*log(n)) if both lists are about the same size, which should be faster than the O(n^2) "brute force" algorithm.
I'd probably recommend the two lists be stored in a Dictionary<string, Employee> based on the name to begin with, then you can iterate over the keys in one and lookup to see if they exist and the salaries match in the other. This would also save the cost of sorting them later or putting them in a more efficient structure.
This is pretty much O(n): it is linear to build both dictionaries and linear to go through the keys of one and look them up in the other, and O(n + m + n) reduces to O(n).
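A sketch of the dictionary approach under a few assumptions: Employee exposes public Name and Salary properties (the question's class uses private fields), names are unique within the old list, and only one dictionary is built, for the old list, while the new list is simply iterated. EmployeeDiff and Compare are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

public static class EmployeeDiff
{
    // Splits the employees present in both lists (matched by name) into those
    // whose salary is unchanged and those whose salary differs.
    // ToDictionary throws an ArgumentException if oldList repeats a name.
    public static void Compare(
        IEnumerable<Employee> newList,
        IEnumerable<Employee> oldList,
        List<Employee> sameSalary,
        List<Employee> differentSalary)
    {
        Dictionary<string, Employee> oldByName = oldList.ToDictionary(e => e.Name);

        foreach (Employee current in newList)
        {
            Employee previous;
            if (!oldByName.TryGetValue(current.Name, out previous))
                continue; // no employee with this name in the old list

            if (current.Salary == previous.Salary)
                sameSalary.Add(current);
            else
                differentSalary.Add(current);
        }
    }
}
```

Neither list needs to be sorted for this to work, and each lookup is close to constant time on average.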
But, if you must use List<T> to hold the lists for other reasons, you could also use the Join() LINQ method, and build a new list with a Match field that tells you whether they were a match or mismatch...
var results = newEmpList.Join(
oldEmpList,
n => n.Name,
o => o.Name,
(n, o) => new
{
Name = n.Name,
Salary = n.Salary,
Match = o.Salary == n.Salary
});
You can then filter this with a Where() clause for Match or !Match.
Update: I assume (by the title of your question) that the 2 lists are already sorted. Perhaps they're stored in a database with a clustered index or something. This answer, therefore, relies on that assumption.
Here is an implementation that has O(n) complexity, and is also very fast, AND is pretty simple too.
I believe this is a variant of the Merge Algorithm.
Here's the idea:
Start enumerating both lists
Compare the 2 current items.
If they match, add to your results.
If the 1st item is "smaller", advance the 1st list.
If the 2nd item is "smaller", advance the 2nd list.
Since both lists are known to be sorted, this will work very well. This implementation assumes that name is unique in each list.
var comparer = StringComparer.OrdinalIgnoreCase;
var namesAndSalaries = new List<Tuple<Employee, Employee>>();
var namesOnly = new List<Tuple<Employee, Employee>>();
// Create 2 iterators; one for old, one for new:
using (IEnumerator<Employee> A = oldEmployeeList.GetEnumerator()) {
using (IEnumerator<Employee> B = newEmployeeList.GetEnumerator()) {
// Start enumerating both:
if (A.MoveNext() && B.MoveNext()) {
while (true) {
int compared = comparer.Compare(A.Current.name, B.Current.name);
if (compared == 0) {
// Names match
if (A.Current.salary == B.Current.salary) {
namesAndSalaries.Add(Tuple.Create(A.Current, B.Current));
} else {
namesOnly.Add(Tuple.Create(A.Current, B.Current));
}
if (!A.MoveNext() || !B.MoveNext()) break;
} else if (compared < 0) {
// Keep searching A
if (!A.MoveNext()) break;
} else {
// Keep searching B
if (!B.MoveNext()) break;
}
}
}
}
}
One of the fastest possible solutions on sorted lists is to use BinarySearch to find an item in the other list.
But as others have mentioned, you should measure it against your project requirements, as performance often tends to be a subjective thing.
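To make the BinarySearch suggestion concrete, here is a small sketch. It assumes Employee exposes public Name and Salary properties, that the list being searched was sorted with the same comparer that is passed to BinarySearch, and that Comparer<T>.Create is available (.NET 4.5 or later). EmployeeLookup, ByName and FindByName are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

public static class EmployeeLookup
{
    // Orders employees by name only, using ordinal string comparison.
    public static readonly IComparer<Employee> ByName =
        Comparer<Employee>.Create((a, b) => string.CompareOrdinal(a.Name, b.Name));

    // Returns the employee in sortedByName with the same name as probe,
    // or null when there is none. sortedByName must be sorted with ByName.
    public static Employee FindByName(List<Employee> sortedByName, Employee probe)
    {
        int index = sortedByName.BinarySearch(probe, ByName);
        return index >= 0 ? sortedByName[index] : null;
    }
}
```

Looping over one list and calling FindByName on the other costs O(m log n) overall, which is worth measuring against the merge-style and dictionary approaches for your data.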
You could create a Dictionary using
var lookupDictionary = list1.ToDictionary(x=>x.name);
That would give you close to O(1) lookup and a close to O(n) behavior if you're looking up values from a loop over the other list.
(I'm assuming here that ToDictionary is O(n) which would make sense with a straight forward implementation, but I have not tested this to be the case)
This would make for a very straight forward algorithm, and I'm thinking going below O(n) with two unsorted lists is pretty hard.

C# Extract list of fields from list of class

I've got a list of elements of a certain class. This class contains a field.
class Foo {public int i;}
List<Foo> list;
I'd like to extract the field from all items in the list into a new list.
List<int> result = list.ExtractField (e => e.i); // imaginary
There are surely multiple ways to do that, but I did not find a nice-looking solution yet. I figured linq might help, but I was not sure how exactly.
Just:
List<int> result = list.Select(e => e.i).ToList();
or
List<int> result = list.ConvertAll(e => e.i);
The latter is more efficient (because it knows the final size to start with), but will only work for lists and arrays rather than any arbitrary sequence.
