C# alternative for the C++ STL set<T> - c#

I'm looking for a sorted data structure similar to the STL set<T>.
I found SortedList, but it requires (key, val) pairs. I'm looking for something like List<string>, only sorted.
I found on the web Spring.Collections, but my framework does not recognize it.
Is there a simple SortedSet I could use in the regular basic framework?
Thanks,
Gal

You can do this with a System.Collections.Generic.Dictionary. Here's a good article: Dictionarys and sorting
Edit: The SortedDictionary seems even better.

SortedSet<T>, introduced in .NET 4.0, is what you are looking for; see MSDN here
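To make that concrete, here is a minimal sketch of SortedSet<T> holding plain strings, which is the "List<string>, only sorted" shape the question asks for. The class name, method name and sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SortedSetDemo
{
    // Builds a sorted, duplicate-free set of names (hypothetical sample data).
    public static SortedSet<string> BuildNames()
    {
        var names = new SortedSet<string>(StringComparer.Ordinal);
        names.Add("pear");
        names.Add("apple");
        names.Add("fig");
        names.Add("apple"); // duplicate: Add returns false, the set is unchanged
        return names;
    }
}
```

Enumerating the result with foreach visits the elements in sorted order. Like the STL set, it silently drops duplicates, so it is not a drop-in replacement for a sorted List<T> that must keep repeated values.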

Also List<T> can be sorted. It is not sorted by default, but you can sort it, even with custom sorting algorithms if you so wish.

There's nothing built into the framework other than SortedDictionary<K,V> and SortedList<K,V>.
The C5 Collections library has several sorted collections. One of the following should do the trick, depending on your exact requirements: SortedArray<T>, TreeBag<T> or TreeSet<T>.
There's also Power Collections, which provides OrderedBag<T> and OrderedSet<T> collections.

There is a System.Collections.SortedList or System.Collections.Generic.SortedList which is always sorted.
Or you can use the Array.Sort method to sort at defined moments in time.

How about using a List<> and calling the Sort method?
Not an extension but try this
public class SortedList<T>: List<T>
{
public SortedList(): base()
{
}
public SortedList(IEnumerable<T> collection): base(collection)
{
}
public SortedList(int capacity)
: base(capacity)
{
}
public void AddSort(T item)
{
base.Add(item);
this.Sort();
}
}
It's only a starting point, but adds a new method AddSort.
An extension method cannot change the List<>.Add method itself, but it can add a new method to List<> that adds the item and then calls Sort.
Using extension method
Place the following in a namespace accessible by your code:
public static class ListExtension
{
public static void AddSort<T>(this List<T> list, T item)
{
list.Add(item);
list.Sort();
}
}
You can then use code such as:
List<int> newList = new List<int>();
newList.AddSort(6);
newList.AddSort(4);
newList.AddSort(3);
And the values will be:
newList[0] == 3
newList[1] == 4
newList[2] == 6
You can also just use newList.Add and the list is then sorted when you call newList.AddSort
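Re-sorting the whole list after every Add gets expensive as the list grows. A possible alternative, not part of the answer above, is to find the insert position with List<T>.BinarySearch and insert there. This is only a sketch: InsertSorted is a hypothetical name, and it assumes the list is already sorted with the default comparer.

```csharp
using System;
using System.Collections.Generic;

public static class SortedInsertExtensions
{
    // Inserts item at the position that keeps an already-sorted list sorted.
    // InsertSorted is a hypothetical name, not a framework method.
    public static void InsertSorted<T>(this List<T> list, T item)
    {
        int index = list.BinarySearch(item);
        // A negative result is the bitwise complement of the index of the
        // first element larger than item (or of Count if there is none).
        if (index < 0) index = ~index;
        list.Insert(index, item);
    }
}
```

The search is O(log n), but Insert still shifts elements, so each insertion remains O(n). Unlike a set, this keeps duplicates.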

Related

Simplest way to make SortableBindingList use a stable sort

There is an example of how to modify SortableBindingList to use a stable sort. However, there is an updated version of SortableBindingList. What is the best way to modify this new version to use a stable sort? I think I would want a flag on the SortableBindingList to let the user of the SortableBindingList decide if they want to use (slower) stable sort or (faster) default sort.
Thanks
You can solve this problem by writing a stable sort extension method for List<T>:
public static class ListExtensions
{
public static void StableSort<T>(this List<T> list, IComparer<T> comparer)
{
var pairs = list.Select((value, index) => Tuple.Create(value, index)).ToList();
pairs.Sort((x, y) =>
{
int result = comparer.Compare(x.Item1, y.Item1);
return result != 0 ? result : x.Item2 - y.Item2;
});
list.Clear();
list.AddRange(pairs.Select(key => key.Item1));
}
}
and then in the new version of SortableBindingList change this line:
itemsList.Sort(comparer);
to:
itemsList.StableSort(comparer);
This works by using the unstable sort supplemented with a secondary key on the item index within the list. Since this version doesn't use the pathologically slow insertion sort to achieve a stable sort, it should be fast enough for general use.
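If producing a new list is acceptable, another option is LINQ: Enumerable.OrderBy is documented as a stable sort for LINQ to Objects, so no index tie-breaker is needed. A small sketch, with a made-up class name and a length-based key chosen only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StableSortViaLinq
{
    // Returns a new list sorted by string length. Enumerable.OrderBy is a
    // stable sort, so words of equal length keep their original relative order.
    public static List<string> SortByLength(IEnumerable<string> words)
    {
        return words.OrderBy(w => w.Length).ToList();
    }
}
```

This does not sort in place, so inside SortableBindingList the sorted result would still have to be copied back into the underlying list.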

How to get an empty list of a collection?

I have a collection of anonymous class and I want to return an empty list of it.
What is the best readable expression to use?
I thought of the following, but I don't think they are readable enough:
var result = MyCollection.Take(0).ToList();
var result = MyCollection.Where(p => false).ToList();
Note: I don't want to empty the collection itself.
Any suggestions?
What about:
Enumerable.Empty<T>();
This returns an empty enumerable of type T. If you really want a List, you are free to do this:
Enumerable.Empty<T>().ToList<T>();
Actually, if you use a generic extension you don't even have to use any Linq to achieve this, you already have the anonymous type exposed through T
public static IList<T> GetEmptyList<T>(this IEnumerable<T> source)
{
return new List<T>();
}
var emp = MyCollection.GetEmptyList();
Given that your first suggestion works and should perform well - if readability is the only issue, why not create an extension method:
public static IList<T> CreateEmptyCopy<T>(this IEnumerable<T> source)
{
return source.Take(0).ToList();
}
Now you can refactor your example to
var result = MyCollection.CreateEmptyCopy();
For performance reasons, you should stick with the first option you came up with.
The other one would iterate over the entire collection before returning an empty list.
Because the type is anonymous, there is no way to name it in source code in order to create a list directly. There is, however, a way to create such a list through reflection.

Does Distinct() method keep original ordering of sequence intact?

I want to remove duplicates from list, without changing order of unique elements in the list.
Jon Skeet & others have suggested to use the following:
list = list.Distinct().ToList();
Reference:
How to remove duplicates from a List<T>?
Remove duplicates from a List<T> in C#
Is it guaranteed that the order of unique elements would be same as before? If yes, please give a reference that confirms this as I couldn't find anything on it in documentation.
It's not guaranteed, but it's the most obvious implementation. It would be hard to implement in a streaming manner (i.e. such that it returned results as soon as it could, having read as little as it could) without returning them in order.
You might want to read my blog post on the Edulinq implementation of Distinct().
Note that even if this were guaranteed for LINQ to Objects (which personally I think it should be) that wouldn't mean anything for other LINQ providers such as LINQ to SQL.
The level of guarantees provided within LINQ to Objects is a little inconsistent sometimes, IMO. Some optimizations are documented, others not. Heck, some of the documentation is flat out wrong.
In the .NET Framework 3.5, disassembling the CIL of the Linq-to-Objects implementation of Distinct() shows that the order of elements is preserved - however this is not documented behavior.
I did a little investigation with Reflector. After disassembling System.Core.dll, Version=3.5.0.0 you can see that Distinct() is an extension method, which looks like this:
public static class Enumerable
{
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
throw new ArgumentNullException("source");
return DistinctIterator<TSource>(source, null);
}
}
So, the interesting part here is DistinctIterator, which implements IEnumerable and IEnumerator. Here is a simplified (goto and labels removed) implementation of this IEnumerator:
private sealed class DistinctIterator<TSource> : IEnumerable<TSource>, IEnumerable, IEnumerator<TSource>, IEnumerator, IDisposable
{
private bool _enumeratingStarted;
private IEnumerator<TSource> _sourceListEnumerator;
public IEnumerable<TSource> _source;
private HashSet<TSource> _hashSet;
private TSource _current;
private bool MoveNext()
{
if (!_enumeratingStarted)
{
_sourceListEnumerator = _source.GetEnumerator();
_hashSet = new HashSet<TSource>();
_enumeratingStarted = true;
}
while(_sourceListEnumerator.MoveNext())
{
TSource element = _sourceListEnumerator.Current;
if (!_hashSet.Add(element))
continue;
_current = element;
return true;
}
return false;
}
void IEnumerator.Reset()
{
throw new NotSupportedException();
}
TSource IEnumerator<TSource>.Current
{
get { return _current; }
}
object IEnumerator.Current
{
get { return _current; }
}
}
As you can see, enumeration proceeds in the order provided by the source enumerable (the list on which we are calling Distinct). The HashSet is used only to determine whether we have already returned such an element. If not, we return it; otherwise we continue enumerating the source.
So this implementation guarantees that Distinct() returns elements in exactly the order in which they are provided by the collection to which Distinct was applied.
According to the documentation the sequence is unordered.
Yes, Enumerable.Distinct preserves order. Assuming the method is lazy ("yields distinct values as soon as they are seen"), it follows automatically. Think about it.
The .NET Reference source confirms. It returns a subsequence, the first element in each equivalence class.
foreach (TSource element in source)
if (set.Add(element)) yield return element;
The .NET Core implementation is similar.
Frustratingly, the documentation for Enumerable.Distinct is confused on this point:
The result sequence is unordered.
I can only imagine they mean "the result sequence is not sorted." You could implement Distinct by presorting then comparing each element to the previous, but this would not be lazy as defined above.
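The laziness argument can be checked with a small experiment: run Distinct over an endless sequence. This is a sketch with made-up names. A sort-based implementation could never return here, while a streaming one finishes as soon as enough distinct values have been seen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDistinctDemo
{
    // An endless sequence 0, 1, 2, 0, 1, 2, ... (hypothetical sample source).
    public static IEnumerable<int> Cycle()
    {
        for (int i = 0; ; i = (i + 1) % 3)
            yield return i;
    }

    // Terminates only because Distinct streams: Take(3) stops pulling from the
    // endless source as soon as three distinct values have been yielded.
    public static List<int> FirstThreeDistinct()
    {
        return Cycle().Distinct().Take(3).ToList();
    }
}
```

Asking for a fourth distinct value with Take(4) would loop forever, since the source only ever produces three.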
A bit late to the party, but no one really posted the best complete code to accomplish this IMO, so let me offer this (which is essentially identical to what .NET Framework does with Distinct())*:
public static IEnumerable<T> DistinctOrdered<T>(this IEnumerable<T> items)
{
HashSet<T> returnedItems = new HashSet<T>();
foreach (var item in items)
{
if (returnedItems.Add(item))
yield return item;
}
}
This guarantees the original order without reliance on undocumented or assumed behavior. I also believe this is more efficient than using multiple LINQ methods though I'm open to being corrected here.
(*) The .NET Framework source uses an internal Set class, which appears to be substantively identical to HashSet.
By default, the Distinct LINQ operator uses the Equals method, but you can pass your own IEqualityComparer<T> object to specify when two objects are equal, with custom logic in its GetHashCode and Equals methods.
Remember that:
GetHashCode should not do CPU-heavy comparison (e.g. use only some obvious basic checks). It is used first, to decide whether two objects are certainly different (different hash codes are returned) or potentially the same (same hash code). In the latter case, when two objects have the same hash code, the framework goes on to call the Equals method as the final decision about the equality of the given objects.
Once you have the MyType and MyTypeEqualityComparer classes, the following code does not ensure that the sequence maintains its order:
var cmp = new MyTypeEqualityComparer();
var lst = new List<MyType>();
// add some to lst
var q = lst.Distinct(cmp);
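For reference, a comparer of the kind described above might look like the sketch below. The Name property and the case-insensitive rule are assumptions made purely for illustration; they are not taken from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type and comparer bodies, for illustration only.
public sealed class MyType
{
    public string Name { get; set; }
}

public sealed class MyTypeEqualityComparer : IEqualityComparer<MyType>
{
    // Final decision: two items are equal when their names match, ignoring case.
    public bool Equals(MyType x, MyType y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    // Must agree with Equals: objects that compare equal return the same hash.
    public int GetHashCode(MyType obj)
    {
        return obj == null || obj.Name == null
            ? 0
            : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}
```

The important invariant is that GetHashCode and Equals use the same notion of equality. If they disagree, Distinct can let duplicates through.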
In the following sci library I implemented an extension method, DistinctKeepOrder, that ensures a Vector3D set maintains its order.
The relevant code follows:
/// <summary>
/// support class for DistinctKeepOrder extension
/// </summary>
public class Vector3DWithOrder
{
public int Order { get; private set; }
public Vector3D Vector { get; private set; }
public Vector3DWithOrder(Vector3D v, int order)
{
Vector = v;
Order = order;
}
}
public class Vector3DWithOrderEqualityComparer : IEqualityComparer<Vector3DWithOrder>
{
Vector3DEqualityComparer cmp;
public Vector3DWithOrderEqualityComparer(Vector3DEqualityComparer _cmp)
{
cmp = _cmp;
}
public bool Equals(Vector3DWithOrder x, Vector3DWithOrder y)
{
return cmp.Equals(x.Vector, y.Vector);
}
public int GetHashCode(Vector3DWithOrder obj)
{
return cmp.GetHashCode(obj.Vector);
}
}
In short, Vector3DWithOrder encapsulates the type and an order integer, while Vector3DWithOrderEqualityComparer encapsulates the original type's comparer.
This is the helper method that ensures the order is maintained:
/// <summary>
/// retrieve distinct of given vector set ensuring to maintain given order
/// </summary>
public static IEnumerable<Vector3D> DistinctKeepOrder(this IEnumerable<Vector3D> vectors, Vector3DEqualityComparer cmp)
{
var ocmp = new Vector3DWithOrderEqualityComparer(cmp);
return vectors
.Select((w, i) => new Vector3DWithOrder(w, i))
.Distinct(ocmp)
.OrderBy(w => w.Order)
.Select(w => w.Vector);
}
Note: further research could find a more general way (using interfaces) and a more optimized one (without encapsulating the object).
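One possible generalization, offered as a sketch and not as the library's own code, is to track seen elements in a HashSet<T> built with the caller's comparer. That works for any element type and avoids the wrapper object and the final OrderBy. The names below are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctKeepOrderExtensions
{
    // Generic variant: yields the first occurrence of each element, in source
    // order, using the supplied comparer (null means the default comparer).
    public static IEnumerable<T> DistinctKeepOrder<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer = null)
    {
        if (source == null) throw new ArgumentNullException("source");
        return Iterate(source, comparer);
    }

    // Separate iterator so the null check above runs eagerly, not on first MoveNext.
    private static IEnumerable<T> Iterate<T>(
        IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);
        foreach (var item in source)
        {
            if (seen.Add(item)) yield return item;
        }
    }
}
```

Because the loop is written out explicitly, the ordering does not depend on any undocumented behavior of Enumerable.Distinct.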
This depends heavily on your LINQ provider. With LINQ to Objects you can rely on the internal source code of Distinct, which suggests that the original order is preserved.
However, for other providers that translate to some kind of SQL, for example, that isn't necessarily the case, as an ORDER BY clause usually comes after any aggregation (such as Distinct). So if your code is this:
myArray.OrderBy(x => x.anothercol).GroupBy(x => x.mycol);
this is translated to something similar to the following in SQL:
SELECT * FROM mytable GROUP BY mycol ORDER BY anothercol;
This obviously first groups your data and sorts it afterwards. Now you're stuck with the DBMS's own logic for how to execute that. On some DBMSs this isn't even allowed. Imagine the following data:
mycol anothercol
1 2
1 1
1 3
2 1
2 3
When executing myArray.OrderBy(x => x.anothercol).GroupBy(x => x.mycol) we expect the following result:
mycol anothercol
1 1
2 1
But the DBMS may aggregate the anothercol column so that the value of the first row is always used, resulting in the following data:
mycol anothercol
1 2
2 1
which after ordering will result in this:
mycol anothercol
2 1
1 2
This is similar to the following:
SELECT mycol, First(anothercol) from mytable group by mycol order by anothercol;
which is the complete reverse of the order you expected.
As you can see, the execution plan may vary depending on the underlying provider. This is why there's no guarantee about it in the docs.

Retrieving items from an F# List passed to C#

I have a function in C# that is being called in F#, passing its parameters in a Microsoft.FSharp.Collections.List<object>.
How am I able to get the items from the F# List in the C# function?
EDIT
I have found a 'functional'-style way to loop through them, and can pass them to a function as below to return a C# System.Collections.Generic.List:
private static List<object> GetParams(Microsoft.FSharp.Collections.List<object> inparams)
{
List<object> parameters = new List<object>();
while (inparams != null)
{
parameters.Add(inparams.Head);
inparams = inparams.Tail;
}
return parameters;
}
EDIT AGAIN
The F# List, as was pointed out below, is Enumerable, so the above function can be replaced with the line;
new List<LiteralType>(parameters);
Is there any way, however, to reference an item in the F# list by index?
In general, avoid exposing F#-specific types (like the F# 'list' type) to other languages, because the experience is not all that great (as you can see).
An F# list is an IEnumerable, so you can create e.g. a System.Collections.Generic.List from it that way pretty easily.
There is no efficient indexing, as it's a singly-linked-list and so accessing an arbitrary element is O(n). If you do want that indexing, changing to another data structure is best.
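Since the F# list reaches C# as an IEnumerable<T>, the two usual ways to get at "item i" can be sketched as below. The helper names are hypothetical, and the code takes a plain IEnumerable<T> so that it compiles without a reference to FSharp.Core.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceIndexing
{
    // Walks the sequence from the start on every call: O(n) for a linked list.
    public static T ItemAt<T>(IEnumerable<T> items, int index)
    {
        return items.ElementAt(index);
    }

    // Copies once (O(n)); afterwards each indexed read on the result is O(1).
    public static List<T> ToIndexable<T>(IEnumerable<T> items)
    {
        return new List<T>(items);
    }
}
```

For a one-off lookup ElementAt is fine. If the C# side needs many indexed reads, copying into a List<T> once is likely the better trade.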
In my C#-project I made extension methods to convert lists between C# and F# easily:
using System;
using System.Collections.Generic;
using Microsoft.FSharp.Collections;
public static class FSharpInteropExtensions {
public static FSharpList<TItemType> ToFSharplist<TItemType>(this IEnumerable<TItemType> myList)
{
return Microsoft.FSharp.Collections.ListModule.of_seq<TItemType>(myList);
}
public static IEnumerable<TItemType> ToEnumerable<TItemType>(this FSharpList<TItemType> fList)
{
return Microsoft.FSharp.Collections.SeqModule.of_list<TItemType>(fList);
}
}
Then use just like:
var lst = new List<int> { 1, 2, 3 }.ToFSharplist();
Answer to edited question:
Is there any way, however, to reference an item in the F# list by index?
I prefer f# over c# so here is the answer:
let public GetListElementAt i = mylist.[i]
returns an element (also for your C# code).

Apply function to all elements of collection through LINQ [duplicate]

I have recently started with LINQ and it's amazing. I was wondering whether LINQ would allow me to apply a function, any function, to all the elements of a collection without using foreach. Something like Python lambda functions.
For example, if I have an int list, can I add a constant to every element using LINQ?
If I have a DB table, can I set a field for all records using LINQ?
I am using C#.
A common way to approach this is to add your own ForEach generic method on IEnumerable<T>. Here's the one we've got in MoreLINQ:
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
source.ThrowIfNull("source");
action.ThrowIfNull("action");
foreach (T element in source)
{
action(element);
}
}
(Where ThrowIfNull is an extension method on any reference type, which does the obvious thing.)
It'll be interesting to see if this is part of .NET 4.0. It goes against the functional style of LINQ, but there's no doubt that a lot of people find it useful.
Once you've got that, you can write things like:
people.Where(person => person.Age < 21)
.ForEach(person => person.EjectFromBar());
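For anyone who wants to try this without pulling in a library, here is a self-contained sketch of the same idea. The ThrowIfNull helper below is a guess at "the obvious thing" and is not MoreLINQ's actual code; the class name is also made up.

```csharp
using System;
using System.Collections.Generic;

public static class ForEachSketch
{
    // Hypothetical helper: throws if the value is null, otherwise does nothing.
    public static void ThrowIfNull<T>(this T value, string name) where T : class
    {
        if (value == null) throw new ArgumentNullException(name);
    }

    // Runs action once per element, in sequence order.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        source.ThrowIfNull("source");
        action.ThrowIfNull("action");
        foreach (T element in source) action(element);
    }
}
```

Calling an extension method on a null reference is legal, which is why source.ThrowIfNull("source") can perform the check itself. On a List<T>, the built-in instance method ForEach takes precedence over this extension.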
The idiomatic way to do this with LINQ is to process the collection and return a new collection mapped in the fashion you want. For example, to add a constant to every element, you'd want something like
var newNumbers = oldNumbers.Select(i => i + 8);
Doing this in a functional way instead of mutating the state of your existing collection frequently helps you separate distinct operations in a way that's both easier to read and easier for the compiler to reason about.
If you're in a situation where you actually want to apply an action to every element of a collection (an action with side effects that are unrelated to the actual contents of the collection) that's not really what LINQ is best suited for, although you could fake it with Select (or write your own IEnumerable extension method, as many people have.) It's probably best to stick with a foreach loop in that case.
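The two styles discussed above, side by side, for the "add a constant to every element" case from the question. This is a sketch with made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AddConstantDemo
{
    // Functional style: leaves the input untouched and returns a new list.
    public static List<int> AddToCopy(IEnumerable<int> numbers, int delta)
    {
        return numbers.Select(n => n + delta).ToList();
    }

    // In-place style: overwrites each slot by index. A foreach loop cannot do
    // this, because its iteration variable is read-only.
    public static void AddInPlace(IList<int> numbers, int delta)
    {
        for (int i = 0; i < numbers.Count; i++)
            numbers[i] += delta;
    }
}
```

Select is deferred, so nothing is computed until the result is enumerated; the ToList call forces that here.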
You could also consider going parallel, especially if you don't care about the sequence and more about getting something done for each item:
SomeIEnumerable<T>.AsParallel().ForAll( Action<T> / Delegate / Lambda )
For example:
var numbers = new[] { 1, 2, 3, 4, 5 };
numbers.AsParallel().ForAll( Console.WriteLine );
HTH.
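haha, man, I just asked this question a few hours ago (kind of)...try this:
example:
someIntList.ForEach(i => Console.WriteLine(i + 5));
ForEach() is a built-in method of List<T>.
It runs the given action on each element rather than returning a new list. Note that the lambda body must be a statement (i => i + 5 on its own does not compile), and for a list of value types such as int it cannot replace the stored elements.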
Or you can hack it up.
Items.All(p => { p.IsAwesome = true; return true; });
For collections that do not support ForEach you can use static ForEach method in Parallel class:
var options = new ParallelOptions() { MaxDegreeOfParallelism = 1 };
Parallel.ForEach(_your_collection_, options, x => x._Your_Method_());
You can try something like
var foo = (from fooItems in context.footable select fooItems.fooID + 1);
Returns a list of id's +1, you can do the same with using a function to whatever you have in the select clause.
Update: As suggested from Jon Skeet this is a better version of the snippet of code I just posted:
var foo = context.footable.Select(f => f.fooID + 1);
I found a way to do this on a dictionary whose values are instances of my custom class, calling one of its methods:
foreach (var item in this.Values.Where(p => p.IsActive == false))
item.Refresh();
Where 'this' derived from : Dictionary<string, MyCustomClass>
class MyCustomClass
{
public void Refresh(){}
}
