Make C# ParallelEnumerable.OrderBy stable sort - c#

I'm sorting a list of objects by their integer ids in parallel using OrderBy. I have a few objects with the same id and need the sort to be stable.
According to Microsoft's documentation, the parallelized OrderBy is not stable, but there is an implementation approach to make it stable. However, I cannot find an example of this.
var list = new List<pair>() { new pair("a", 1), new pair("b", 1), new pair("c", 2), new pair("d", 3), new pair("e", 4) };
var newList = list.AsParallel().WithDegreeOfParallelism(4).OrderBy<pair, int>(p => p.order);
private class pair {
private String name;
public int order;
public pair (String name, int order) {
this.name = name;
this.order = order;
}
}

The remarks for the other OrderBy method suggest this approach:
var newList = list
.Select((pair, index) => new { pair, index })
.AsParallel().WithDegreeOfParallelism(4)
.OrderBy(p => p.pair.order)
.ThenBy(p => p.index)
.Select(p => p.pair);
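To check that the index tie-breaker really keeps equal ids in input order, here is a minimal sketch. The Item class and the StableParallelSort.SortByOrder helper are hypothetical names introduced for illustration, not part of the question's code; the sketch assumes the input fits in memory and that ToList() is an acceptable way to materialise the result.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's pair class.
public sealed class Item
{
    public string Name { get; }
    public int Order { get; }
    public Item(string name, int order) { Name = name; Order = order; }
}

public static class StableParallelSort
{
    // Sorts by Order in parallel. Items with equal Order keep their original
    // relative position because the input index is used as a secondary key.
    public static List<Item> SortByOrder(IEnumerable<Item> items)
    {
        return items
            .Select((item, index) => new { item, index }) // remember input position
            .AsParallel()
            .OrderBy(x => x.item.Order)
            .ThenBy(x => x.index)                          // tie-breaker
            .Select(x => x.item)
            .ToList();
    }
}
```

Calling StableParallelSort.SortByOrder on ("b", 1), ("a", 1), ("c", 0) should return c, b, a: the two items with order 1 stay in the order they were supplied.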

Related

C# Linq Group elements from list and count sum of values

I have a List of Lists of objects, each containing a string and a float value.
I need to group those elements by the string value (name) and order the groups by the sum of the float values.
public class Element
{
public string Name;
public float Value;
public Element(string name,float value) {
Name = name;
Value = value;
}
}
List<List<Element>> elementslist = new List<List<Element>>();
elementslist.Add(new List<Element>() { new Element("Apple", 1.2f), new Element("Banana", 0) });
elementslist.Add(new List<Element>() { new Element("Apple", 2.1f), new Element("Banana", 1.4f) });
elementslist.Add(new List<Element>() { new Element("Apple", 0), new Element("Banana", 0) });
P.S.: Is there a smarter aggregation algorithm to obtain this result? Maybe it could also take into account how "close" these values are in list order...
Thank you very much
First flatten the list (SelectMany), then GroupBy name and OrderBy the sum of the values:
var groups = elementslist.SelectMany(l => l)
.GroupBy(e => e.Name)
.OrderBy(g => g.Sum(x => x.Value));
The following gives you an IEnumerable<Element> with one element per name, whose Value member holds the sum of the values. Hope it helps.
elementslist.SelectMany(n => n).GroupBy(n => n.Name).Select(n => new Element(n.Key, n.Sum(p => p.Value))).OrderBy(n => n.Value)
Or use OrderByDescending to get the largest sum first.

Merge Complex Object List using Union / Intersect

Consider two lists of complex objects say :
var first = new List<Record>
{
new Record(1, new List<int> { 2, 3 }),
new Record(4, new List<int> { 5, 6 })
};
var second = new List<Record>
{
new Record(1, new List<int> { 4 })
};
where a Record is defined as below. Nothing fancy, just a class with Id and list of
SecondaryIdentifiers.
public class Record
{
private readonly IList<int> _secondaryIdentifiers;
private readonly int _id;
public Record(int id, IList<int> secondaryIdentifiers)
{
_id = id;
_secondaryIdentifiers = secondaryIdentifiers;
}
public IList<int> SecondaryIdentifiers
{
get { return _secondaryIdentifiers; }
}
public int Id
{
get { return _id; }
}
}
How can I union / intersect these lists such that the Union and Intersect operations merge the SecondaryIdentifiers?
var union = first.Union(second);
var intersect = first.Intersect(second);
Union will be
{
new Record(1, new List<int> { 2, 3 , 4 }),
new Record(4, new List<int> { 5, 6 })
};
Intersect will be
{
new Record(1, new List<int> { 2, 3 , 4 }),
};
What I have tried
I tried using a first.Union(second, new EqualityComparer()) where the EqualityComparer extends IEqualityComparer<Record> and merges the two SecondaryIdentifiers if the two items compared are equal, but it seemed a little hacky to me.
Is there a more elegant way of doing this ?
Is there a more elegant way of doing this
It is opinion based but I would do it as:
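var union = first.Concat(second)
.GroupBy(x => x.Id)
// rebuild a Record per Id so the Id is kept alongside the merged identifiers
.Select(g => new Record(g.Key, g.SelectMany(y => y.SecondaryIdentifiers).ToList()))
.ToList();
var intersect = first.Concat(second)
.GroupBy(x => x.Id)
.Where(x => x.Count() > 1) // Id occurs more than once, i.e. in both lists if Ids are unique within each list
.Select(g => new Record(g.Key, g.SelectMany(y => y.SecondaryIdentifiers).ToList()))
.ToList();
PS: Feel free to remove the outer .ToList()s for lazy evaluation.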
This should work for the union part (note that the group join only yields records whose Id appears in first):
from a in first
join b in second on a.Id equals b.Id into rGroup
let ids = a.SecondaryIdentifiers.Union(rGroup.SelectMany(r => r.SecondaryIdentifiers))
select new Record(a.Id, ids.ToList())
And the intersect:
from a in first
join b in second on a.Id equals b.Id
select new Record(a.Id, a.SecondaryIdentifiers.Union(b.SecondaryIdentifiers).ToList())

Fastest way to select distinct values from list based on two properties

I have this list:
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
I'm looking for the fastest way to select distinct objects based on recordNumber, but if there is more than one object with the same recordNumber (here recordNumber=3), I want to select the object based on its name (the name is provided by a parameter).
Thanks
It looks like you are really after something like:
Dictionary<int, List<myobject>> myDataStructure;
That allows you to quickly retrieve by record number. If the List<myobject> with that dictionary key contains more than one entry, you can then use the name to select the correct one.
Note that if your list is not terribly long, an O(n) check that just scans the list checking for the recordNumber and name may be fast enough, in the sense that other things happening in your program could obscure the list lookup cost. Consider that possibility before over-optimizing lookup times.
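As a rough illustration of the dictionary idea, the sketch below builds the index once and then resolves a record number (and, when needed, a name) without scanning the whole list. MyObject, RecordIndex, Build and Find are hypothetical names chosen for this example; returning null when nothing matches is an assumption, and you may prefer to throw instead.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical equivalent of the question's myobject class.
public sealed class MyObject
{
    public int RecordNumber { get; set; }
    public string Name { get; set; }
}

public static class RecordIndex
{
    // Groups the objects by record number so later lookups avoid a full scan.
    public static Dictionary<int, List<MyObject>> Build(IEnumerable<MyObject> source)
    {
        return source
            .GroupBy(o => o.RecordNumber)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    // Returns the only object for the record number, or the one matching
    // `name` when several share it. Returns null when nothing matches.
    public static MyObject Find(
        Dictionary<int, List<MyObject>> index, int recordNumber, string name)
    {
        if (!index.TryGetValue(recordNumber, out var candidates))
            return null;
        if (candidates.Count == 1)
            return candidates[0];
        return candidates.FirstOrDefault(o => o.Name == name);
    }
}
```

Building the index is O(n); each lookup afterwards costs one dictionary probe plus a scan of the (usually tiny) list of objects that share the record number.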
Here's the LINQ way of doing this:
Func<IEnumerable<myobject>, string, IEnumerable<myobject>> getDistinct =
(ms, n) =>
ms
.ToLookup(x => x.recordNumber)
.Select(xs => xs.Skip(1).Any()
? xs.Where(x => x.name == n).Take(1)
: xs)
.SelectMany(x => x)
.ToArray();
I just tested this with a list of 1,000,000 randomly created myobject instances and it produced the result in 106ms. That should be fast enough for most situations.
Are you looking for
class Program
{
static void Main(string[] args)
{
List<myobject> list = new List<myobject>();
list.Add(new myobject { name = "n1", recordNumber = 1 });
list.Add(new myobject { name = "n2", recordNumber = 2 });
list.Add(new myobject { name = "n3", recordNumber = 3 });
list.Add(new myobject { name = "n4", recordNumber = 3 });
//Generates Row Number on the fly
var withRowNumbers = list
.Select((x, index) => new
{
Name = x.name,
RecordNumber = x.recordNumber,
RowNumber = index + 1
}).ToList();
//Generates Row Number with Partition by clause
var withRowNumbersPartitionBy = withRowNumbers
.OrderBy(x => x.RowNumber)
.GroupBy(x => x.RecordNumber)
.Select(g => new { g, count = g.Count() })
.SelectMany(t => t.g.Select(b => b)
.Zip(Enumerable.Range(1, t.count), (j, i) => new { Rn = i, j.RecordNumber, j.Name}))
.Where(i=>i.Rn == 1)
.ToList();
//print the result
withRowNumbersPartitionBy.ToList().ForEach(i => Console.WriteLine("Name = {0} RecordNumber = {1}", i.Name, i.RecordNumber));
Console.ReadKey();
}
}
class myobject
{
public int recordNumber { get; set; }
public string name { get; set; }
}
Result:
Name = n1 RecordNumber = 1
Name = n2 RecordNumber = 2
Name = n3 RecordNumber = 3
Are you looking for a method to do this?
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
public myobject Find(int recordNumber, string name)
{
var matches = list.Where(l => l.recordNumber == recordNumber);
if (matches.Count() == 1)
return matches.Single();
else return matches.Single(m => m.name == name);
}
This will - of course - break if there are multiple matches, or zero matches. You need to write your own edge cases and error handling!
If the name and recordNumber combination is guaranteed to be unique, then you can always use a HashSet<myobject>.
You can then use recordNumber and name to generate the hash code by using a method described here.
class myobject
{
//override GetHashCode
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + recordNumber.GetHashCode();
hash = hash * 23 + name.GetHashCode();
return hash;
}
}
//override Equals
}
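The snippet above leaves Equals as a stub. A minimal sketch of a complete key type might look like the following; MyRecord is a hypothetical name, and making the properties read-only is an assumption made here so that the hash code cannot change while the object sits in a HashSet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable variant of myobject, usable as a HashSet element.
public sealed class MyRecord : IEquatable<MyRecord>
{
    public int RecordNumber { get; }
    public string Name { get; }

    public MyRecord(int recordNumber, string name)
    {
        RecordNumber = recordNumber;
        Name = name;
    }

    // Two records are equal when both the record number and the name match.
    public bool Equals(MyRecord other)
    {
        return other != null
            && RecordNumber == other.RecordNumber
            && string.Equals(Name, other.Name, StringComparison.Ordinal);
    }

    public override bool Equals(object obj) => Equals(obj as MyRecord);

    // Must agree with Equals: equal records produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked // overflow simply wraps around
        {
            int hash = 17;
            hash = hash * 23 + RecordNumber.GetHashCode();
            hash = hash * 23 + (Name == null ? 0 : Name.GetHashCode());
            return hash;
        }
    }
}
```

With this in place, a HashSet<MyRecord> rejects a second (3, "n3") but accepts (3, "n4"), since Add returns false only for an element that is already present.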

IEnumerable<Object> Data Specific Ordering

I have objects with an ID property whose values are between 101 and 199. How do I order them like 199, 101, 102 ... 198?
In the result I want to move the last item to the front.
The desired ordering makes no sense (some reasoning would be helpful), but this should do the trick:
int maxID = items.Max(x => x.ID); // If you want the Last item instead of the one
// with the greatest ID, you can use
// items.Last().ID instead.
var strangelyOrderedItems = items
.OrderBy(x => x.ID == maxID ? 0 : 1)
.ThenBy(x => x.ID);
It depends on whether you are interested in the largest item in the list or the last item in the list:
internal sealed class Object : IComparable<Object>
{
private readonly int mID;
public int ID { get { return mID; } }
public Object(int pID) { mID = pID; }
public static implicit operator int(Object pObject) { return pObject.mID; }
public static implicit operator Object(int pInt) { return new Object(pInt); }
public int CompareTo(Object pOther) { return mID - pOther.mID; }
public override string ToString() { return string.Format("{0}", mID); }
}
List<Object> myList = new List<Object> { 1, 2, 6, 5, 4, 3 };
// the last item first
List<Object> last = new List<Object> { myList.Last() };
List<Object> lastFirst =
last.Concat(myList.Except(last).OrderBy(x => x)).ToList();
lastFirst.ForEach(Console.Write);
Console.WriteLine();
// outputs: 312456
// or
// the largest item first
List<Object> max = new List<Object> { myList.Max() };
List<Object> maxFirst =
max.Concat(myList.Except(max).OrderBy(x => x)).ToList();
maxFirst.ForEach(Console.Write);
Console.WriteLine();
// outputs: 612345
Edit: I missed the part about you wanting the last item first. You could do it like this:
var objectList = new List<DataObject>();
var lastob = objectList.Last();
objectList.Remove(lastob);
var newList = new List<DataObject>();
newList.Add(lastob);
newList.AddRange(objectList.OrderBy(o => o.ID).ToList());
If you are talking about a normal sort, you could use LINQ's OrderBy method like this:
objectList = objectList.OrderBy(ob => ob.ID).ToList();
In result I want to put last item to first
First sort the list:
List<int> values = new List<int>{100, 56, 89..};
var result = values.OrderBy(x => x).ToList(); // ToList so that the List<T> extension method below applies
Add an extension method for swapping two elements in a List<T>:
static void Swap<T>(this List<T> list, int index1, int index2)
{
T temp = list[index1];
list[index1] = list[index2];
list[index2] = temp;
}
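Then use it:
result.Swap(0, result.Count - 1);
Note that this also moves the smallest item to the end; if the remaining items must stay in ascending order, remove the last item and insert it at index 0 instead.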
You can achieve this using a single LINQ statement.
var ordering = testData
.OrderByDescending(t => t.Id)
.Take(1)
.Union(testData.OrderBy(t => t.Id).Take(testData.Count() - 1));
Order it in reverse direction and take the top 1, then order it the "right way round" and take all but the last and union these together. There are quite a few variants of this approach, but the above should work.
This approach should work for arbitrary lists too, without the need to know the max number.
How about
var orderedItems = items.OrderBy(x => x.Id);
var orderedItemsLastFirst =
orderedItems.Reverse().Take(1) // the last (largest) item
.Concat(orderedItems.Take(orderedItems.Count() - 1)); // everything except the last item
This will iterate the list several times, so it could be more efficient, but it doesn't use much code.
If more speed is important you could write a specialised IEnumerable extension that would allow you to sort and return without converting to an intermediate IEnumerable.
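One possible reading of that suggestion is an extension method that sorts only once and then yields the last element followed by the rest. The sketch below is an assumption about what such a helper could look like; OrderByLastFirst is a hypothetical name, and it buffers the sorted result in an array rather than avoiding intermediate storage entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastFirstExtensions
{
    // Sorts once by the key, then yields the largest element followed by the
    // remaining elements in ascending order.
    public static IEnumerable<T> OrderByLastFirst<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        T[] sorted = source.OrderBy(keySelector).ToArray(); // single sort
        if (sorted.Length == 0)
            yield break;

        yield return sorted[sorted.Length - 1];      // largest key first
        for (int i = 0; i < sorted.Length - 1; i++)  // then the rest, ascending
            yield return sorted[i];
    }
}
```

For example, new[] { 103, 101, 199, 102 }.OrderByLastFirst(x => x) yields 199, 101, 102, 103.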
var myList = new List<MyObject>();
//initialize the list
var ordered = myList.OrderBy(c => c.Id); //or use OrderByDescending if you want reverse order

How to sort collection quite specifically by linq

var ids = new int[] { 3, 2, 20, 1 };
var entities = categories.Where(entity => ids.Contains(entity.Id));
I have to sort entities in exactly the same order as in the ids array. How can I do that?
This should do the trick (written off the top of my head, so may have mistakes)
var ids = new int[] { 3, 2, 20, 1 };
var ordering = ids.Select((id,index) => new {id,index});
var entities =
categories
.Where(entity => ids.Contains(entity.Id))
.AsEnumerable() //line not necessary if 'categories' is a local sequence
.Join(ordering, ent => ent.Id, ord => ord.id, (ent,ord) => new {ent,ord})
.OrderBy(x => x.ord.index)
.Select(x => x.ent);
You could use OrderBy with the index of the Ids in ids.
To get the index of an Id from ids, you could create a map of Id to index. That way you can look up the index in almost constant time, instead of having to call IndexOf and traverse the whole list each time.
Something like this:
var idToIndexMap = ids
.Select((id, index) => new { Id = id, Index = index }) // Select passes (element, position)
.ToDictionary(
pair => pair.Id,
pair => pair.Index
);
var sortedEntities = categories
.Where(e => ids.Contains(e.Id))
.ToList() // Isn't necessary if this is Linq-to-Objects instead of entities...
.OrderBy(e => idToIndexMap[e.Id])
;
You may have a go with this:
public class Foo
{
public void Bar()
{
int[] idOrder = new int[] { 3, 2, 20, 1 };
var lookup = idOrder.ToDictionary(i => i,
i => Array.IndexOf(idOrder, i));
foreach(var a in idOrder.OrderBy(i => new ByArrayComparable<int>(lookup, i)))
Console.WriteLine(a);
}
}
public class ByArrayComparable<T> : IComparable<ByArrayComparable<T>> where T : IComparable<T>
{
public readonly IDictionary<T, int> order;
public readonly T element;
public ByArrayComparable(IDictionary<T, int> order, T element)
{
this.order = order;
this.element = element;
}
public int CompareTo(ByArrayComparable<T> other)
{
return this.order[this.element].CompareTo(this.order[other.element]);
}
}
This works for unique elements only, but the lookup effort is constant.
