I have an int array of IDs that are in the proper order. Then I have an array of unordered objects that have ID properties.
I would like to order the objects by ID so that they match the order of the int array.
Something along the lines of
newObjectArray = oldObjectArray.MatchOrderBy(IdArray)
would be most desirable.
I feel like I should be able to accomplish this using LINQ but I have yet to find a way.
My current method doesn't seem very efficient, since it has to run a query on every iteration of the collection. I suspect that performance will suffer for sufficiently large collections, which will eventually happen.
Here is my current implementation:
// this is just dummy data to show what's going on
int[] orderedIDs = new int[5] {5534, 5632, 2334, 6622, 2344};
MemberObject[] searchResults = MyMethodToGetSearchResults();
MemberObject[] orderedSearchResults = new MemberObject[orderedIDs.Length];
for (int i = 0; i < orderedIDs.Length; i++)
{
    orderedSearchResults[i] = searchResults
        .Where(memberObject => memberObject.id == orderedIDs[i])
        .FirstOrDefault();
}
A brute force implementation:
MemberObject[] sortedResults = IdArray
    .Select(id => searchResults.FirstOrDefault(item => item.id == id))
    .ToArray();
However, this requires reiterating searchResults for every item in IdArray and doesn't deal too neatly with items that have duplicate ids.
Things improve if you make an ILookup of your search results, so that grabbing the correct search result for each item in IdArray is now O(1) time.
ILookup<int, MemberObject> resultLookup = searchResults.ToLookup(x => x.id);
Now:
MemberObject[] sortedResults = IdArray
    .SelectMany(id => resultLookup[id])
    .ToArray();
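If you want the MatchOrderBy shape the question wished for, the lookup approach packages up neatly as an extension method. This is only a sketch: the method name comes from the question's wish, and the id selector parameter is my assumption.

// using System; using System.Collections.Generic; using System.Linq;
public static class OrderingExtensions
{
    // Hypothetical helper (not part of LINQ): returns the elements of source
    // arranged to match the order of their ids in idOrder. Elements sharing
    // an id come back grouped together; ids with no match are skipped.
    public static T[] MatchOrderBy<T>(
        this IEnumerable<T> source, IEnumerable<int> idOrder, Func<T, int> idSelector)
    {
        ILookup<int, T> lookup = source.ToLookup(idSelector);
        return idOrder.SelectMany(id => lookup[id]).ToArray();
    }
}

Usage then reads just like the wished-for call: searchResults.MatchOrderBy(orderedIDs, m => m.id).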
Related
I have a list of 50 sorted items (say), in which a few items are priority ones (assume they have a flag set to 1).
By default, I have to show the latest items (based on date) first, but the priority items should appear after some 'x' number of records, like below:
index 0: Item
index 1: Item
index 2: Priority Item (insert priority items from this position)
index 3: Priority Item
index 4: Priority Item
index 5: Item
index 6: Item
The index 'x' at which the priority items should be inserted is pre-defined.
To achieve this, I am using the following code:
// these are my 50 sorted items
var list = getMyTop50SortedItems();

// fetch all priority items and store them in another list
var priorityItems = list.Where(x => x.flag == 1).ToList();

// filter the priority items out of the main list
list.RemoveAll(x => x.flag == 1);

// insert the priority items into the main list at the given position
list.InsertRange(2, priorityItems); // 2 == the pre-defined index 'x'
This process does the job correctly and gives me the expected result, but I am not sure whether it is the correct way to do it, or whether there is a better way (considering performance).
Please provide your suggestions.
Also, how is performance affected, given that I am doing many operations (filter, remove, insert), as the number of records grows from 50 to 100,000 (or any number)?
Update: How can I use IQueryable to decrease the number of operations on the list?
As per the documentation on InsertRange:
This method is an O(n * m) operation, where n is the number of
elements to be added and m is Count.
n * m isn't very good, so I would use LINQ's Concat method to create a whole new list from three smaller ones, instead of modifying an existing one.
var allItems = getMyTop50SortedItems();
var topPriorityItems = allItems.Where(x => x.flag == 1).ToList();
var topNonPriorityItems = allItems.Where(x => x.flag != 1).ToList();

var result = topNonPriorityItems
    .Take(constant)
    .Concat(topPriorityItems)
    .Concat(topNonPriorityItems.Skip(constant));
I am not sure exactly how fast the Concat, Skip and Take methods are for List<T>, but I'd bet they are not slower than O(n).
It seems like the problem you're actually trying to solve is just sorting the list of items. If that's the case, you don't need to be concerned with removing the priority items and reinserting them at the correct index; you just need to figure out your sort ordering function. Something like this ought to work:
// Set "x" to be whatever you want based on your requirements --
// this is the number of items that will precede the "priority" items in the
// sorted list
var x = 3;
var sortedList = list
    .Select((item, index) => Tuple.Create(item, index))
    .OrderBy(item => {
        // If the original position of the item is below whatever you've
        // defined "x" to be, then keep the original position
        if (item.Item2 < x) {
            return item.Item2;
        }
        // Otherwise, ensure that "priority" items appear first
        return item.Item1.flag == 1 ? x + item.Item2 : list.Count + x + item.Item2;
    }).Select(item => item.Item1);
You may need to tweak this slightly based on what you're trying to do, but it seems much simpler than removing/inserting from multiple lists.
Edit: Forgot that .OrderBy doesn't have an overload that exposes the original index of the item; I've updated the answer to wrap the items in a Tuple that carries the original index. Not as clean as the original answer, but it should still work.
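To see the keys in action, here's a small runnable check (hypothetical data; x = 2 so that priority items land at index 2, matching the layout in the question):

// using System; using System.Collections.Generic; using System.Linq;

// Items already sorted by date; the P* items have flag == 1. With x = 2,
// the expected result is: I0, I1 (keep their slots), P2, P4, then I3.
var list = new List<(string Name, int flag)>
{
    ("I0", 0), ("I1", 0), ("P2", 1), ("I3", 0), ("P4", 1),
};
var x = 2;

var sortedList = list
    .Select((item, index) => Tuple.Create(item, index))
    .OrderBy(t =>
    {
        if (t.Item2 < x) return t.Item2;           // keys: I0 -> 0, I1 -> 1
        return t.Item1.flag == 1
            ? x + t.Item2                          // P2 -> 4, P4 -> 6
            : list.Count + x + t.Item2;            // I3 -> 5 + 2 + 3 = 10
    })
    .Select(t => t.Item1);

Console.WriteLine(string.Join(", ", sortedList.Select(i => i.Name)));
// prints: I0, I1, P2, P4, I3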
This can be done with a single enumeration of the original collection using LINQ to Objects. IMO it also reads pretty clearly against the original requirements you defined.
First, define the "buckets" that we'll be sorting into. I like using an enum here for clarity, but you could also just use an int.
enum SortBucket
{
    RecentItems = 0,
    PriorityItems = 1,
    Rest = 2,
}
Then we'll define the logic for which "bucket" a particular item will be sorted into:
private static SortBucket GetBucket(Item item, int position, int recentItemCount)
{
    // Positions 0 .. recentItemCount-1 stay in the "recent" bucket
    if (position < recentItemCount)
    {
        return SortBucket.RecentItems;
    }
    return item.IsPriority ? SortBucket.PriorityItems : SortBucket.Rest;
}
And then a fairly straightforward LINQ to Objects statement to sort first into the buckets we defined, and then by the original position. Written as an extension method:
static IEnumerable<Item> PrioritySort(this IEnumerable<Item> items, int recentItemCount)
{
    return items
        .Select((item, originalPosition) => new { item, originalPosition })
        .OrderBy(o => GetBucket(o.item, o.originalPosition, recentItemCount))
        .ThenBy(o => o.originalPosition)
        .Select(o => o.item);
}
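Usage is then a one-liner. For example, assuming items is the top-50 list from the question, already sorted newest-first:

// Keep the 2 most recent items in place, then all priority items,
// then the rest, each group preserving its original (date) order:
var sorted = items.PrioritySort(recentItemCount: 2).ToList();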
I have two lists. The first is a list of objects that have an int ID property; the other is a list of ints.
I need to compare these two lists and copy the objects to a new list, keeping only the objects that match between the two lists based on ID. Right now I am using two foreach loops, as follows:
var matched = new List<Cars>();
foreach (var car in cars)
{
    foreach (var i in intList)
    {
        if (car.id == i)
            matched.Add(car);
    }
}
This seems like it is going to be very slow, since it iterates over each list many times. Is there a way to do this without using two foreach loops like this?
One slow but clear way would be:
var matched = cars.Where(car => intList.Contains(car.id)).ToList();
You can make this quicker by turning the intList into a dictionary and using ContainsKey instead.
var intLookup = intList.ToDictionary(k => k);
var matched = cars.Where(car => intLookup.ContainsKey(car.id)).ToList();
Even better still, a HashSet:
var intHash = new HashSet<int>(intList);
var matched = cars.Where(car => intHash.Contains(car.id)).ToList();
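For a sense of why this wins, here's a minimal end-to-end sketch (the Cars type is a stand-in inferred from the question's snippet):

// using System.Collections.Generic; using System.Linq;
var cars = new List<Cars> { new Cars { id = 1 }, new Cars { id = 2 }, new Cars { id = 3 } };
var intList = new List<int> { 2, 3, 5 };

// Building the set costs O(m); each Contains is O(1) on average, so the
// whole match is O(n + m) instead of the nested loops' O(n * m).
var intHash = new HashSet<int>(intList);
var matched = cars.Where(car => intHash.Contains(car.id)).ToList(); // ids 2 and 3

class Cars { public int id; }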
You could try some simple LINQ; something like this should work:
var matched = cars.Where(w => intList.Contains(w.id)).ToList();
This will take your list of cars and then find only those items whose id is contained in your intList.
I need to optimize the foreach loop below. The foreach loop is taking too much time to get the unique items.
Could FilterItems instead be converted into a list collection? If so, how? Then I could easily take the unique items from it.
The problem arises when I have 500,000 items in FilterItems.
Please suggest some ways to optimize the code below:
List<string> order = new List<string>();
List<string> unique = new List<string>();

// FilterItems is a collection of records. Can this be converted to a list
// collection directly, so that I can take the unique items from it?
foreach (Record rec in FilterItems)
{
    string text = rec.GetValue("Column Name");
    int position = order.BinarySearch(text);
    if (position < 0)
    {
        order.Insert(-position - 1, text);
        unique.Add(text);
    }
}
It's unclear what you mean by "converting FilterItems into a list" when we don't know anything about it, but you could definitely consider sorting after you've got all the items, rather than as you go:
var strings = FilterItems.Select(record => record.GetValue("Column Name"))
                         .Distinct()
                         .OrderBy(x => x)
                         .ToList();
The use of Distinct() here will avoid sorting lots of equal items - it looks like you only want distinct items anyway.
If you want unique to be in the original order but order to be the same items, just sorted, you could use:
var unique = FilterItems.Select(record => record.GetValue("Column Name"))
                        .Distinct()
                        .ToList();
var order = unique.OrderBy(x => x).ToList();
Now Distinct() isn't guaranteed to preserve order - but it does so in the current implementation, and that's the most natural implementation, too.
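As a quick illustration of that distinction (mock strings standing in for the column values, since FilterItems isn't shown):

var values = new[] { "banana", "apple", "banana", "cherry", "apple" };

var unique = values.Distinct().ToList();      // banana, apple, cherry  (first-seen order)
var order  = unique.OrderBy(x => x).ToList(); // apple, banana, cherry  (sorted)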
Is it possible to sort an in-memory list by another list (the second list serving as a reference data source of sorts)?
public class DataItem
{
    public string Name { get; set; }
    public string Path { get; set; }
}
// a list of Data Items, randomly sorted
List<DataItem> dataItems = GetDataItems();
// the sort order data source with the paths in the correct order
IEnumerable<string> sortOrder = new List<string> {
    "A",
    "A.A1",
    "A.A2",
    "A.B1"
};
// is there a way to tell linq to sort the in-memory list of objects
// by the sortOrder "data source"
dataItems = dataItems.OrderBy(p => p.Path == sortOrder).ToList();
First, let's assign an index to each item in sortOrder:
var sortOrderWithIndices = sortOrder.Select((x, i) => new { path = x, index = i });
Next, we join the two lists and sort:
var dataItemsOrdered =
    from d in dataItems
    join x in sortOrderWithIndices on d.Path equals x.path // pull index by path
    orderby x.index // order by index
    select d;
This is how you'd do it in SQL as well.
Here is an alternative (and, I would argue, more efficient) approach to the one accepted as the answer.
List<DataItem> dataItems = GetDataItems();
IDictionary<string, int> sortOrder = new Dictionary<string, int>()
{
    {"A", 0},
    {"A.A1", 1},
    {"A.A2", 2},
    {"A.B1", 3},
};
dataItems.Sort((di1, di2) => sortOrder[di1.Path].CompareTo(sortOrder[di2.Path]));
Let's say Sort() and OrderBy() both take O(n*logn) time, where n is the number of items in dataItems. The solution given here then takes O(n*logn) to perform the sort. We assume the step required to create the sortOrder dictionary has a cost not significantly different from creating the IEnumerable in the original post.
Doing a join before sorting, however, adds cost: LINQ's Join is hash-based, so it is roughly an extra O(n + m), where m is the number of elements in sortOrder, bringing the total for that solution to about O(n + m + n*logn).
Asymptotically that still boils down to O(n*logn) anyway, but in practice the join costs extra cycles. This is in addition to the possible extra space incurred by the auxiliary collections LINQ may create while processing the query.
If your list of paths is large, you would be better off performing your lookups against a dictionary:
var sortValues = sortOrder.Select((p, i) => new { Path = p, Value = i })
                          .ToDictionary(x => x.Path, x => x.Value);
dataItems = dataItems.OrderBy(di => sortValues[di.Path]).ToList();
Custom ordering is done by using a custom comparer (an implementation of the IComparer<T> interface) that is passed as the second argument to the OrderBy method.
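As a sketch of that approach (built on the dataItems/sortOrder setup from the question; the comparer class name is my own invention):

// using System.Collections.Generic; using System.Linq;
class PathOrderComparer : IComparer<string>
{
    private readonly Dictionary<string, int> _rank;

    public PathOrderComparer(IEnumerable<string> order)
    {
        // Remember each path's position in the reference ordering.
        _rank = order.Select((path, index) => new { path, index })
                     .ToDictionary(x => x.path, x => x.index);
    }

    // Paths missing from the reference list would throw here; handle as needed.
    public int Compare(string x, string y) => _rank[x].CompareTo(_rank[y]);
}

// Usage:
// dataItems = dataItems.OrderBy(d => d.Path, new PathOrderComparer(sortOrder)).ToList();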
I wanted to ask for suggestions on how I can simplify the foreach block below. I tried to make it all one LINQ statement, but I couldn't figure out how to manipulate the "count" values inside the query.
More details about what I'm trying to achieve:
- I have a huge list with potential duplicates, where IDs are repeated but the "Count" property holds different numbers
- I want to get rid of the duplicates, but without losing those "Count" values
- so for the items with the same Id, I sum up the "Count" properties
Still, the current code doesn't look pretty:
var grouped = bigList.GroupBy(c => c.Id).ToList();
foreach (var items in grouped)
{
    var count = 0;
    foreach (var c in items)
        count += c.Count;
    items.First().Count = count;
}
var filtered = grouped.Select(y => y.First());
I don't expect the whole solution; pieces of ideas will also be highly appreciated :)
Given that you're mutating the collection, I would personally just make a new "item" with the count:
var results = bigList.GroupBy(c => c.Id)
                     .Select(g => new Item(g.Key, g.Sum(i => i.Count)))
                     .ToList();
This performs a simple mapping from the original to a new collection of Item instances, with the proper Id and Count values.
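This assumes an Item type whose constructor takes the id and the count, along the lines of the following (a hypothetical shape, since the question doesn't show the class):

class Item
{
    public int Id { get; set; }
    public int Count { get; set; }

    public Item(int id, int count)
    {
        Id = id;
        Count = count;
    }
}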
var filtered = bigList.GroupBy(c => c.Id)
    .Select(g =>
    {
        var f = g.First();
        f.Count = g.Sum(c => c.Count);
        return f;
    });