Rx – Distinct with timeout? - c#

I'm wondering whether there is any way to implement Distinct in Reactive Extensions for .NET so that it works for a given time, then resets and allows values that come back again. I need this for a hot source in an application that will run for a whole year with no stops, so I'm worried about performance, and I need those values to be allowed again after some time. There is also DistinctUntilChanged, but in my case values can be interleaved. For example, for A A X A, DistinctUntilChanged gives me A X A, whereas I need A X, with the distinct state reset after the given time.

The accepted answer is flawed; the flaw is demonstrated below. Here's a demonstration of a solution, with a test batch:
TestScheduler ts = new TestScheduler();
var source = ts.CreateHotObservable<char>(
new Recorded<Notification<char>>(200.MsTicks(), Notification.CreateOnNext('A')),
new Recorded<Notification<char>>(300.MsTicks(), Notification.CreateOnNext('B')),
new Recorded<Notification<char>>(400.MsTicks(), Notification.CreateOnNext('A')),
new Recorded<Notification<char>>(500.MsTicks(), Notification.CreateOnNext('A')),
new Recorded<Notification<char>>(510.MsTicks(), Notification.CreateOnNext('C')),
new Recorded<Notification<char>>(550.MsTicks(), Notification.CreateOnNext('B')),
new Recorded<Notification<char>>(610.MsTicks(), Notification.CreateOnNext('B'))
);
var target = source.TimedDistinct(TimeSpan.FromMilliseconds(300), ts);
var expectedResults = ts.CreateHotObservable<char>(
new Recorded<Notification<char>>(200.MsTicks(), Notification.CreateOnNext('A')),
new Recorded<Notification<char>>(300.MsTicks(), Notification.CreateOnNext('B')),
new Recorded<Notification<char>>(500.MsTicks(), Notification.CreateOnNext('A')),
new Recorded<Notification<char>>(510.MsTicks(), Notification.CreateOnNext('C')),
new Recorded<Notification<char>>(610.MsTicks(), Notification.CreateOnNext('B'))
);
var observer = ts.CreateObserver<char>();
target.Subscribe(observer);
ts.Start();
ReactiveAssert.AreElementsEqual(expectedResults.Messages, observer.Messages);
The solution includes a number of overloads for TimedDistinct, allowing for IScheduler injection as well as IEqualityComparer<T> injection, similar to Distinct. Ignoring all those overloads, the solution rests on a helper method, StateWhere, which is a combination of Scan and Where: it filters like Where, but lets you carry state along like Scan.
public static class RxState
{
public static IObservable<TSource> TimedDistinct<TSource>(this IObservable<TSource> source, TimeSpan expirationTime)
{
return TimedDistinct(source, expirationTime, Scheduler.Default);
}
public static IObservable<TSource> TimedDistinct<TSource>(this IObservable<TSource> source, TimeSpan expirationTime, IScheduler scheduler)
{
return TimedDistinct<TSource>(source, expirationTime, EqualityComparer<TSource>.Default, scheduler);
}
public static IObservable<TSource> TimedDistinct<TSource>(this IObservable<TSource> source, TimeSpan expirationTime, IEqualityComparer<TSource> comparer)
{
return TimedDistinct(source, expirationTime, comparer, Scheduler.Default);
}
public static IObservable<TSource> TimedDistinct<TSource>(this IObservable<TSource> source, TimeSpan expirationTime, IEqualityComparer<TSource> comparer, IScheduler scheduler)
{
var toReturn = source
.Timestamp(scheduler)
.StateWhere(
new Dictionary<TSource, DateTimeOffset>(comparer),
(state, item) => item.Value,
(state, item) => state
.Where(kvp => item.Timestamp - kvp.Value < expirationTime)
.Concat(
!state.ContainsKey(item.Value) || item.Timestamp - state[item.Value] >= expirationTime
? Enumerable.Repeat(new KeyValuePair<TSource, DateTimeOffset>(item.Value, item.Timestamp), 1)
: Enumerable.Empty<KeyValuePair<TSource, DateTimeOffset>>()
)
.ToDictionary(kvp => kvp.Key, kvp => kvp.Value, comparer),
(state, item) => !state.ContainsKey(item.Value) || item.Timestamp - state[item.Value] >= expirationTime
);
return toReturn;
}
public static IObservable<TResult> StateSelectMany<TSource, TState, TResult>(
this IObservable<TSource> source,
TState initialState,
Func<TState, TSource, IObservable<TResult>> resultSelector,
Func<TState, TSource, TState> stateSelector
)
{
return source
.Scan(Tuple.Create(initialState, Observable.Empty<TResult>()), (state, item) => Tuple.Create(stateSelector(state.Item1, item), resultSelector(state.Item1, item)))
.SelectMany(t => t.Item2);
}
public static IObservable<TResult> StateWhere<TSource, TState, TResult>(
this IObservable<TSource> source,
TState initialState,
Func<TState, TSource, TResult> resultSelector,
Func<TState, TSource, TState> stateSelector,
Func<TState, TSource, bool> filter
)
{
return source
.StateSelectMany(initialState, (state, item) =>
filter(state, item) ? Observable.Return(resultSelector(state, item)) : Observable.Empty<TResult>(),
stateSelector);
}
}
The accepted answer has two flaws:
1. It doesn't accept IScheduler injection, meaning that it is hard to test within the Rx testing framework. This is easy to fix.
2. It relies on mutable state, which doesn't work well in a multi-threaded framework like Rx.
Issue #2 is noticeable with multiple subscribers:
var observable = Observable.Range(0, 5)
.DistinctFor(TimeSpan.MaxValue)
;
observable.Subscribe(i => Console.WriteLine(i));
observable.Subscribe(i => Console.WriteLine(i));
The output, following regular Rx behavior, should be 0-4 twice. Instead, 0-4 is output just once.
Here's another sample flaw:
var subject = new Subject<int>();
var observable = subject
.DistinctFor(TimeSpan.MaxValue);
observable.Subscribe(i => Console.WriteLine(i));
observable.Subscribe(i => Console.WriteLine(i));
subject.OnNext(1);
subject.OnNext(2);
subject.OnNext(3);
This outputs 1 2 3 once, not twice.
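One way to keep the simple dictionary-based filter but avoid sharing its state between subscribers is to create the state inside Observable.Defer, so that every Subscribe call gets a fresh dictionary. The following is only a minimal sketch under that assumption: the name DistinctForEachSubscriber is hypothetical, it needs the System.Reactive package, and it does not evict expired keys, so it is not a drop-in replacement for TimedDistinct.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Concurrency;
using System.Reactive.Linq;

public static class PerSubscriberDistinct
{
    // Hypothetical helper name; not part of Rx or of either answer.
    public static IObservable<T> DistinctForEachSubscriber<T>(
        this IObservable<T> source, TimeSpan validityPeriod, IScheduler scheduler)
        where T : notnull
    {
        // Defer runs this factory once per Subscribe call, so each
        // subscriber gets its own dictionary instead of sharing one.
        return Observable.Defer(() =>
        {
            var lastEmitted = new Dictionary<T, DateTimeOffset>();
            return source
                .Timestamp(scheduler) // the scheduler supplies the clock, which keeps this testable
                .Where(x =>
                {
                    if (lastEmitted.TryGetValue(x.Value, out var seenAt)
                        && x.Timestamp - seenAt < validityPeriod)
                    {
                        return false; // still inside the suppression window
                    }
                    lastEmitted[x.Value] = x.Timestamp;
                    return true;
                })
                .Select(x => x.Value);
        });
    }
}

public static class PerSubscriberDistinctDemo
{
    public static void Main()
    {
        var observable = Observable.Range(0, 5)
            .DistinctForEachSubscriber(TimeSpan.MaxValue, Scheduler.Default);

        // Each subscription filters independently, so 0-4 is printed twice.
        observable.Subscribe(i => Console.WriteLine(i));
        observable.Subscribe(i => Console.WriteLine(i));
    }
}
```

Because the dictionary only ever grows, a long-running hot source would still need some form of eviction, which is what the state selector in TimedDistinct does on every item.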
Here's the code for MsTicks:
public static class RxTestingHelpers
{
public static long MsTicks(this int ms)
{
return TimeSpan.FromMilliseconds(ms).Ticks;
}
}

With a wrapper class that timestamps items, but does not consider the timestamp (created field) for hashing or equality:
public class DistinctForItem<T> : IEquatable<DistinctForItem<T>>
{
private readonly T item;
private DateTime created;
public DistinctForItem(T item)
{
this.item = item;
this.created = DateTime.UtcNow;
}
public T Item
{
get { return item; }
}
public DateTime Created
{
get { return created; }
}
public bool Equals(DistinctForItem<T> other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
return EqualityComparer<T>.Default.Equals(Item, other.Item);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != this.GetType()) return false;
return Equals((DistinctForItem<T>)obj);
}
public override int GetHashCode()
{
return EqualityComparer<T>.Default.GetHashCode(Item);
}
public static bool operator ==(DistinctForItem<T> left, DistinctForItem<T> right)
{
return Equals(left, right);
}
public static bool operator !=(DistinctForItem<T> left, DistinctForItem<T> right)
{
return !Equals(left, right);
}
}
It is now possible to write a DistinctFor extension method:
public static IObservable<T> DistinctFor<T>(this IObservable<T> src,
TimeSpan validityPeriod)
{
//if HashSet<DistinctForItem<T>> actually allowed us the get at the
//items it contains it would be a better choice.
//However it doesn't, so we resort to
//Dictionary<DistinctForItem<T>, DistinctForItem<T>> instead.
var hs = new Dictionary<DistinctForItem<T>, DistinctForItem<T>>();
return src.Select(item => new DistinctForItem<T>(item)).Where(df =>
{
DistinctForItem<T> hsVal;
if (hs.TryGetValue(df, out hsVal))
{
var age = DateTime.UtcNow - hsVal.Created;
if (age < validityPeriod)
{
return false;
}
}
hs[df] = df;
return true;
}).Select(df => df.Item);
}
Which can be used:
Enumerable.Range(0, 1000)
.Select(i => i % 3)
.ToObservable()
.Pace(TimeSpan.FromMilliseconds(500)) //drip feeds the observable
.DistinctFor(TimeSpan.FromSeconds(5))
.Subscribe(x => Console.WriteLine(x));
For reference, here is my implementation of Pace<T>:
public static IObservable<T> Pace<T>(this IObservable<T> src, TimeSpan delay)
{
var timer = Observable
.Timer(
TimeSpan.FromSeconds(0),
delay
);
return src.Zip(timer, (s, t) => s);
}

Related

How to implement my own operator in rx.net

I need the functionality of a hysteresis filter in RX. It should emit a value from the source stream only when the previously emitted value and the current input value differ by a certain amount. As a generic extension method, it could have the following signature:
public static IObservable<T> HysteresisFilter<T>(this IObservable<T> source, Func<T/*previously emitted*/, T/*current*/, bool> filter)
I was not able to figure out how to implement this with existing operators. I was looking for something like lift from RxJava, or any other way to create my own operator. I have seen this checklist, but I haven't found any example on the web.
The following approaches (both are effectively the same) work, but they feel like workarounds to me. Is there a more idiomatic Rx way to do this, without wrapping a subject or actually implementing an operator?
async Task Main()
{
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
var rnd = new Random();
var s = Observable.Interval(TimeSpan.FromMilliseconds(10))
.Scan(0d, (a,_) => a + rnd.NextDouble() - 0.5)
.Publish()
.AutoConnect()
;
s.Subscribe(Console.WriteLine, cts.Token);
s.HysteresisFilter((p, c) => Math.Abs(p - c) > 1d).Subscribe(x => Console.WriteLine($"1> {x}"), cts.Token);
s.HysteresisFilter2((p, c) => Math.Abs(p - c) > 1d).Subscribe(x => Console.WriteLine($"2> {x}"), cts.Token);
await Task.Delay(Timeout.InfiniteTimeSpan, cts.Token).ContinueWith(_=>_, TaskContinuationOptions.OnlyOnCanceled);
}
public static class ReactiveOperators
{
public static IObservable<T> HysteresisFilter<T>(this IObservable<T> source, Func<T, T, bool> filter)
{
return new InternalHysteresisFilter<T>(source, filter).AsObservable;
}
public static IObservable<T> HysteresisFilter2<T>(this IObservable<T> source, Func<T, T, bool> filter)
{
var subject = new Subject<T>();
T lastEmitted = default;
bool emitted = false;
source.Subscribe(
value =>
{
if (!emitted || filter(lastEmitted, value))
{
subject.OnNext(value);
lastEmitted = value;
emitted = true;
}
}
, ex => subject.OnError(ex)
, () => subject.OnCompleted()
);
return subject;
}
private class InternalHysteresisFilter<T>: IObserver<T>
{
Func<T, T, bool> filter;
T lastEmitted;
bool emitted;
private readonly Subject<T> subject = new Subject<T>();
public IObservable<T> AsObservable => subject;
public InternalHysteresisFilter(IObservable<T> source, Func<T, T, bool> filter)
{
this.filter = filter;
source.Subscribe(this);
}
public IDisposable Subscribe(IObserver<T> observer)
{
return subject.Subscribe(observer);
}
public void OnNext(T value)
{
if (!emitted || filter(lastEmitted, value))
{
subject.OnNext(value);
lastEmitted = value;
emitted = true;
}
}
public void OnError(Exception error)
{
subject.OnError(error);
}
public void OnCompleted()
{
subject.OnCompleted();
}
}
}
Sidenote: There will be several thousand of such filters applied to as many streams. I need throughput over latency, thus I am looking for the solution with the minimum of overhead both in CPU and in memory even if others look fancier.
Most examples I've seen in the book Introduction to Rx are using the method Observable.Create for creating new operators.
The Create factory method is the preferred way to implement custom observable sequences. The usage of subjects should largely remain in the realms of samples and testing. (citation)
public static IObservable<T> HysteresisFilter<T>(this IObservable<T> source,
Func<T, T, bool> predicate)
{
return Observable.Create<T>(observer =>
{
T lastEmitted = default;
bool emitted = false;
return source.Subscribe(value =>
{
if (!emitted || predicate(lastEmitted, value))
{
observer.OnNext(value);
lastEmitted = value;
emitted = true;
}
}, observer.OnError, observer.OnCompleted);
});
}
This answer is equivalent to @Theodor's, but it avoids using Observable.Create, which I generally would avoid.
public static IObservable<T> HysteresisFilter2<T>(this IObservable<T> source,
Func<T, T, bool> predicate)
{
return source
.Scan((emitted: default(T), isFirstItem: true, emit: false), (state, newItem) => state.isFirstItem || predicate(state.emitted, newItem)
? (newItem, false, true)
: (state.emitted, false, false)
)
.Where(t => t.emit)
.Select(t => t.emitted);
}
.Scan is what you want to use when you're tracking state across items within an observable.

Creating a collection with a function to obtain the next member

I need to accumulate values into a collection, based on an arbitrary function. Each value is derived from calling a function on the previous value.
My current attempt:
public static T[] Aggregate<T>(this T source, Func<T, T> func)
{
var arr = new List<T> { };
var current = source;
while(current != null)
{
arr.Add(current);
current = func(current);
};
return arr.ToArray();
}
Is there a built-in .Net Framework function to do this?
This operation is usually called Unfold. There's no built-in version but it is implemented in FSharp.Core, so you could wrap that:
public static IEnumerable<T> Unfold<T, TState>(TState init, Func<TState, T> gen)
{
var liftF = new Converter<TState, Microsoft.FSharp.Core.FSharpOption<Tuple<T, TState>>>(x =>
{
var r = gen(x);
if (r == null)
{
return Microsoft.FSharp.Core.FSharpOption<Tuple<T, TState>>.None;
}
else
{
return Microsoft.FSharp.Core.FSharpOption<Tuple<T, TState>>.Some(Tuple.Create(r, x));
}
});
var ff = Microsoft.FSharp.Core.FSharpFunc<TState, Microsoft.FSharp.Core.FSharpOption<Tuple<T, TState>>>.FromConverter(liftF);
return Microsoft.FSharp.Collections.SeqModule.Unfold<TState, T>(ff, init);
}
public static IEnumerable<T> Unfold<T>(T source, Func<T, T> func)
{
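// Name both type arguments so this calls the two-parameter overload above
// instead of recursing into itself.
return Unfold<T, T>(source, func);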
}
However, writing your own version would be simpler:
public static IEnumerable<T> Unfold<T>(T source, Func<T, T> func)
{
T current = source;
while(current != null)
{
yield return current;
current = func(current);
}
}
You are referring to an anamorphism as mentioned here linq-unfold-operator, which is the dual of a catamorphism.
Unfold is the dual of Aggregate. Aggregate exists in the .Net Framework; Unfold does not (for some unknown reason). Hence your confusion.
/// seeds: the initial data to unfold
/// stop: if stop(seed) is True, don't go any further
/// map: transform the seed into the final data
/// next: generate the next seed value from the current seed
public static IEnumerable<R> UnFold<T,R>(this IEnumerable<T> seeds, Predicate<T> stop,
Func<T,R> map, Func<T,IEnumerable<T>> next) {
foreach (var seed in seeds) {
if (!stop(seed)) {
yield return map(seed);
foreach (var val in next(seed).UnFold(stop, map, next))
yield return val;
}
}
}
Usage Example:
var parents = new[]{someType}.UnFold(t => t == null, t => t,
t => t.GetInterfaces().Concat(new[]{t.BaseType}))
.Distinct();
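To make the Unfold/Aggregate duality mentioned above concrete, here is a small sketch that uses only the base class library. Unfold grows a sequence from a seed, and Aggregate collapses a sequence back into a single value. The class and method names are illustrative, and the `where T : class` constraint is an assumption added so that the null check is meaningful.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnfoldDemo
{
    // Yields seed, next(seed), next(next(seed)), ... until next returns null.
    public static IEnumerable<T> Unfold<T>(T seed, Func<T, T> next) where T : class
    {
        for (var current = seed; current != null; current = next(current))
            yield return current;
    }

    public static void Main()
    {
        // Unfold: one Type expands into its whole inheritance chain.
        var chain = Unfold(typeof(ArgumentNullException), t => t.BaseType)
            .Select(t => t.Name)
            .ToList();
        Console.WriteLine(string.Join(" -> ", chain));
        // ArgumentNullException -> ArgumentException -> SystemException -> Exception -> Object

        // Aggregate: the chain collapses back into a single value (its length).
        var depth = chain.Aggregate(0, (count, _) => count + 1);
        Console.WriteLine(depth); // 5
    }
}
```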

How to select increasing subsequence of values from IObservable<T>

How to write this method?
public static IObservable<T> IncreasingSubsequence<T>(this IObservable<T> observable, IComparer<T> comparer)
{
// ???
}
Resulting observable should push only those values that exceed maximum of all previous values.
Another approach would be to use Scan and DistinctUntilChanged. Here's an example using ints for simplicity:
IObservable<int> xs;
xs.Scan((last,cur) => cur > last ? cur : last).DistinctUntilChanged()
and the more general form
public static IObservable<T> IncreasingSubsequence<T>(this IObservable<T> xs, IComparer<T> comp)
{
return xs.Scan((last,cur) => comp.Compare(cur, last) > 0 ? cur : last)
.DistinctUntilChanged();
}
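A note on the comparison inside Scan: IComparer<T>.Compare is only specified to return a negative number, zero, or a positive number, so testing the sign is safer than testing for exactly 1. The sketch below uses a deliberately simple, hypothetical comparer that returns the difference of its arguments to show where the two tests diverge.

```csharp
using System;
using System.Collections.Generic;

public static class CompareSignDemo
{
    public static void Main()
    {
        // A legal comparer that returns the difference rather than -1/0/1.
        // (Fine for small values; a real comparer should guard against overflow.)
        IComparer<int> byDifference = Comparer<int>.Create((a, b) => a - b);

        int result = byDifference.Compare(5, 2); // 3
        Console.WriteLine(result == 1); // False: 5 would wrongly be treated as "not greater"
        Console.WriteLine(result > 0);  // True: the sign test works for any comparer
    }
}
```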
I think the easiest way is to use Where() and the fact that closures are mutable:
public static IObservable<T> IncreasingSubsequence<T>(
this IObservable<T> observable, IComparer<T> comparer = null)
{
if (observable == null)
throw new ArgumentNullException("observable");
if (comparer == null)
comparer = Comparer<T>.Default;
T max = default(T);
bool first = true;
return observable.Where(x =>
{
if (first)
{
first = false;
max = x;
return true;
}
if (comparer.Compare(x, max) > 0)
{
max = x;
return true;
}
return false;
});
}

Using IEqualityComparer for Union

I simply want to remove duplicates from two lists and combine them into one list. I also need to be able to define what a duplicate is. I define a duplicate by the ColumnIndex property, if they are the same, they are duplicates. Here is the approach I took:
I found a nifty example of how to write inline comparers for the rare occasions where you need them only once in a code segment.
public class InlineComparer<T> : IEqualityComparer<T>
{
private readonly Func<T, T, bool> getEquals;
private readonly Func<T, int> getHashCode;
public InlineComparer(Func<T, T, bool> equals, Func<T, int> hashCode)
{
getEquals = equals;
getHashCode = hashCode;
}
public bool Equals(T x, T y)
{
return getEquals(x, y);
}
public int GetHashCode(T obj)
{
return getHashCode(obj);
}
}
Then I just have my two lists, and attempt a union on them with the comparer.
var formatIssues = issues.Where(i => i.IsFormatError == true);
var groupIssues = issues.Where(i => i.IsGroupError == true);
var dupComparer = new InlineComparer<Issue>((i1, i2) => i1.ColumnInfo.ColumnIndex == i2.ColumnInfo.ColumnIndex,
i => i.ColumnInfo.ColumnIndex);
var filteredIssues = groupIssues.Union(formatIssues, dupComparer);
The result set however is null.
Where am I going astray?
I have already confirmed that the two lists have columns with equal ColumnIndex properties.
I've just run your code on a test set.... and it works!
public class InlineComparer<T> : IEqualityComparer<T>
{
private readonly Func<T, T, bool> getEquals;
private readonly Func<T, int> getHashCode;
public InlineComparer(Func<T, T, bool> equals, Func<T, int> hashCode)
{
getEquals = equals;
getHashCode = hashCode;
}
public bool Equals(T x, T y)
{
return getEquals(x, y);
}
public int GetHashCode(T obj)
{
return getHashCode(obj);
}
}
class TestClass
{
public string S { get; set; }
}
[TestMethod]
public void testThis()
{
var l1 = new List<TestClass>()
{
new TestClass() {S = "one"},
new TestClass() {S = "two"},
};
var l2 = new List<TestClass>()
{
new TestClass() {S = "three"},
new TestClass() {S = "two"},
};
var dupComparer = new InlineComparer<TestClass>((i1, i2) => i1.S == i2.S, i => i.S.GetHashCode());
var unionList = l1.Union(l2, dupComparer);
Assert.AreEqual(3, unionList.Count());
}
So... maybe go back and check your test data - or run it with some other test data?
After all - for a Union to be empty - that suggests that both your input lists are also empty?
A slightly simpler way:
- it preserves the original order
- it ignores dupes as it finds them
It uses a LINQ extension method:
formatIssues.Union(groupIssues).DistinctBy(x => x.ColumnIndex)
This is the DistinctBy lambda method from MoreLinq
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
HashSet<TKey> knownKeys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (knownKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
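As a self-contained illustration of the key-based approach, the sketch below runs it on two small lists. The Issue class here is a hypothetical stand-in with only the properties needed for the demo, and the helper is named DistinctByKey to avoid clashing with Enumerable.DistinctBy, which is built in from .NET 6 onwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Issue type in the question.
public sealed class Issue
{
    public int ColumnIndex { get; set; }
    public string Kind { get; set; } = "";
}

public static class DistinctByKeyDemo
{
    // Keeps the first element seen for each key and preserves source order.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var knownKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            if (knownKeys.Add(keySelector(element)))
                yield return element;
        }
    }

    public static void Main()
    {
        var formatIssues = new[]
        {
            new Issue { ColumnIndex = 1, Kind = "format" },
            new Issue { ColumnIndex = 2, Kind = "format" },
        };
        var groupIssues = new[]
        {
            new Issue { ColumnIndex = 2, Kind = "group" },
            new Issue { ColumnIndex = 3, Kind = "group" },
        };

        var filtered = formatIssues.Concat(groupIssues).DistinctByKey(i => i.ColumnIndex);
        foreach (var issue in filtered)
            Console.WriteLine($"{issue.ColumnIndex} {issue.Kind}");
        // 1 format
        // 2 format
        // 3 group
    }
}
```

Concat is used instead of Union here because the key-based filter already removes the duplicates; Union without a comparer would only drop identical references.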
Would the Linq Except method not do it for you?
var formatIssues = issues.Where(i => i.IsFormatError == true);
var groupIssues = issues.Where(i => i.IsGroupError == true);
var dupeIssues = issues.Where(i => issues.Except(new List<Issue> {i})
.Any(x => x.ColumnIndex == i.ColumnIndex));
var filteredIssues = formatIssues.Union(groupIssues).Except(dupeIssues);

Can you create a simple 'EqualityComparer<T>' using a lambda expression

Short question:
Is there a simple way in LINQ to objects to get a distinct list of objects from a list based on a key property on the objects.
Long question:
I am trying to do a Distinct() operation on a list of objects that have a key as one of their properties.
class GalleryImage {
public int Key { get;set; }
public string Caption { get;set; }
public string Filename { get; set; }
public string[] Tags { get; set; }
}
I have a list of Gallery objects that contain GalleryImage[].
Because of the way the webservice works [sic] I have duplicates of the GalleryImage object. I thought it would be a simple matter to use Distinct() to get a distinct list.
This is the LINQ query I want to use :
var allImages = Galleries.SelectMany(x => x.Images);
var distinctImages = allImages.Distinct<GalleryImage>(new
EqualityComparer<GalleryImage>((a, b) => a.id == b.id));
The problem is that EqualityComparer is an abstract class.
I don't want to:
- implement IEquatable on GalleryImage because it is generated
- have to write a separate class to implement IEqualityComparer as shown here
Is there a concrete implementation of EqualityComparer somewhere that I'm missing?
I would have thought there would be an easy way to get 'distinct' objects from a set based on a key.
(There are two solutions here - see the end for the second one):
My MiscUtil library has a ProjectionEqualityComparer class (and two supporting classes to make use of type inference).
Here's an example of using it:
EqualityComparer<GalleryImage> comparer =
ProjectionEqualityComparer<GalleryImage>.Create(x => x.id);
Here's the code (comments removed)
// Helper class for construction
public static class ProjectionEqualityComparer
{
public static ProjectionEqualityComparer<TSource, TKey>
Create<TSource, TKey>(Func<TSource, TKey> projection)
{
return new ProjectionEqualityComparer<TSource, TKey>(projection);
}
public static ProjectionEqualityComparer<TSource, TKey>
Create<TSource, TKey> (TSource ignored,
Func<TSource, TKey> projection)
{
return new ProjectionEqualityComparer<TSource, TKey>(projection);
}
}
public static class ProjectionEqualityComparer<TSource>
{
public static ProjectionEqualityComparer<TSource, TKey>
Create<TKey>(Func<TSource, TKey> projection)
{
return new ProjectionEqualityComparer<TSource, TKey>(projection);
}
}
public class ProjectionEqualityComparer<TSource, TKey>
: IEqualityComparer<TSource>
{
readonly Func<TSource, TKey> projection;
readonly IEqualityComparer<TKey> comparer;
public ProjectionEqualityComparer(Func<TSource, TKey> projection)
: this(projection, null)
{
}
public ProjectionEqualityComparer(
Func<TSource, TKey> projection,
IEqualityComparer<TKey> comparer)
{
projection.ThrowIfNull("projection");
this.comparer = comparer ?? EqualityComparer<TKey>.Default;
this.projection = projection;
}
public bool Equals(TSource x, TSource y)
{
if (x == null && y == null)
{
return true;
}
if (x == null || y == null)
{
return false;
}
return comparer.Equals(projection(x), projection(y));
}
public int GetHashCode(TSource obj)
{
if (obj == null)
{
throw new ArgumentNullException("obj");
}
return comparer.GetHashCode(projection(obj));
}
}
Second solution
To do this just for Distinct, you can use the DistinctBy extension in MoreLINQ:
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
return source.DistinctBy(keySelector, null);
}
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
IEqualityComparer<TKey> comparer)
{
source.ThrowIfNull("source");
keySelector.ThrowIfNull("keySelector");
return DistinctByImpl(source, keySelector, comparer);
}
private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>
(IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
IEqualityComparer<TKey> comparer)
{
HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
foreach (TSource element in source)
{
if (knownKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
In both cases, ThrowIfNull looks like this:
public static void ThrowIfNull<T>(this T data, string name) where T : class
{
if (data == null)
{
throw new ArgumentNullException(name);
}
}
Building on Charlie Flowers' answer, you can create your own extension method to do what you want which internally uses grouping:
public static IEnumerable<T> Distinct<T, U>(
this IEnumerable<T> seq, Func<T, U> getKey)
{
return
from item in seq
group item by getKey(item) into gp
select gp.First();
}
You could also create a generic class deriving from EqualityComparer, but it sounds like you'd like to avoid this:
public class KeyEqualityComparer<T,U> : IEqualityComparer<T>
{
private Func<T,U> GetKey { get; set; }
public KeyEqualityComparer(Func<T,U> getKey) {
GetKey = getKey;
}
public bool Equals(T x, T y)
{
return GetKey(x).Equals(GetKey(y));
}
public int GetHashCode(T obj)
{
return GetKey(obj).GetHashCode();
}
}
This is the best I can come up with for the problem in hand.
Still curious whether there's a nice way to create an EqualityComparer on the fly though.
Galleries.SelectMany(x => x.Images).ToLookup(x => x.id).Select(x => x.First());
Create a lookup table and take the 'top' item from each group.
Note: this is the same as @charlie suggested but using ILookup, which I think is what a group must be anyway.
What about a throw away IEqualityComparer generic class?
public class ThrowAwayEqualityComparer<T> : IEqualityComparer<T>
{
Func<T, T, bool> comparer;
public ThrowAwayEqualityComparer(Func<T, T, bool> comparer)
{
this.comparer = comparer;
}
public bool Equals(T a, T b)
{
return comparer(a, b);
}
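public int GetHashCode(T a)
{
// Only an equality lambda is available, so no matching hash can be derived from it.
// A constant hash makes Distinct fall back to Equals for every pair (slow, but correct);
// a.GetHashCode() would put items the lambda considers equal into different buckets.
return 0;
}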
}
So now you can use Distinct with a custom comparer.
var distinctImages = allImages.Distinct(
new ThrowAwayEqualityComparer<GalleryImage>((a, b) => a.Key == b.Key));
You might be able to get away without the <GalleryImage>, but I'm not sure if the compiler could infer the type (don't have access to it right now).
And in an additional extension method:
public static class IEnumerableExtensions
{
public static IEnumerable<TValue> Distinct<TValue>(this IEnumerable<TValue> @this, Func<TValue, TValue, bool> comparer)
{
return @this.Distinct(new ThrowAwayEqualityComparer<TValue>(comparer));
}
private class ThrowAwayEqualityComparer...
}
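One detail worth spelling out for any lambda-based comparer: Distinct consults GetHashCode before it ever calls Equals, so items the lambda treats as equal must also produce equal hash codes. The sketch below contrasts three hash strategies. LambdaComparer and Image are hypothetical names introduced for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer that takes both an equality lambda and a hash lambda.
public sealed class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> equals;
    private readonly Func<T, int> hash;

    public LambdaComparer(Func<T, T, bool> equals, Func<T, int> hash)
    {
        this.equals = equals;
        this.hash = hash;
    }

    public bool Equals(T x, T y) => equals(x, y);
    public int GetHashCode(T obj) => hash(obj);
}

// Minimal stand-in type; it does not override Equals or GetHashCode.
public sealed class Image
{
    public int Key { get; set; }
}

public static class HashDemo
{
    public static int CountDistinct(IEnumerable<Image> images, Func<Image, int> hash) =>
        images.Distinct(new LambdaComparer<Image>((a, b) => a.Key == b.Key, hash)).Count();

    public static void Main()
    {
        var images = new[] { new Image { Key = 1 }, new Image { Key = 1 }, new Image { Key = 2 } };

        // Object's default hash is identity based, so the two Key == 1 objects
        // almost certainly hash differently and Equals is never asked about them.
        Console.WriteLine(CountDistinct(images, i => i.GetHashCode()));

        // A constant hash is always consistent with the lambda, at the cost of
        // comparing every pair.
        Console.WriteLine(CountDistinct(images, i => 0)); // 2

        // Hashing the same key that the equality lambda compares is both
        // consistent and fast.
        Console.WriteLine(CountDistinct(images, i => i.Key.GetHashCode())); // 2
    }
}
```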
You could group by the key value and then select the top item from each group. Would that work for you?
This idea is being debated here. While I'm hoping the .NET Core team adopts a method to generate IEqualityComparer<T>s from a lambda, I'd suggest you vote and comment on that idea, and use the following in the meantime:
Usage:
IEqualityComparer<Contact> comp1 = EqualityComparerImpl<Contact>.Create(c => c.Name);
var comp2 = EqualityComparerImpl<Contact>.Create(c => c.Name, c => c.Age);
class Contact { public string Name { get; set; } public int Age { get; set; } }
Code:
public class EqualityComparerImpl<T> : IEqualityComparer<T>
{
public static EqualityComparerImpl<T> Create(
params Expression<Func<T, object>>[] properties) =>
new EqualityComparerImpl<T>(properties);
PropertyInfo[] _properties;
EqualityComparerImpl(Expression<Func<T, object>>[] properties)
{
if (properties == null)
throw new ArgumentNullException(nameof(properties));
if (properties.Length == 0)
throw new ArgumentOutOfRangeException(nameof(properties));
var length = properties.Length;
var extractions = new PropertyInfo[length];
for (int i = 0; i < length; i++)
{
var property = properties[i];
extractions[i] = ExtractProperty(property);
}
_properties = extractions;
}
public bool Equals(T x, T y)
{
if (ReferenceEquals(x, y))
//covers both are null
return true;
if (x == null || y == null)
return false;
var len = _properties.Length;
for (int i = 0; i < _properties.Length; i++)
{
var property = _properties[i];
if (!Equals(property.GetValue(x), property.GetValue(y)))
return false;
}
return true;
}
public int GetHashCode(T obj)
{
if (obj == null)
return 0;
var hashes = _properties
.Select(pi => pi.GetValue(obj)?.GetHashCode() ?? 0).ToArray();
return Combine(hashes);
}
static int Combine(int[] hashes)
{
int result = 0;
foreach (var hash in hashes)
{
uint rol5 = ((uint)result << 5) | ((uint)result >> 27);
result = ((int)rol5 + result) ^ hash;
}
return result;
}
static PropertyInfo ExtractProperty(Expression<Func<T, object>> property)
{
var body = property.Body;
// Value-typed properties are boxed to object, which wraps the member access in a Convert node.
if (body.NodeType == ExpressionType.Convert && body is UnaryExpression unary)
body = unary.Operand;
if (body is MemberExpression member && member.Member is PropertyInfo pi)
return pi;
// Throwing directly (rather than through a void helper) lets the compiler see
// that 'member' and 'pi' are always assigned before use.
throw new NotSupportedException($"The expression '{property}' isn't supported.");
}
}
Here's an interesting article that extends LINQ for this purpose...
http://www.singingeels.com/Articles/Extending_LINQ__Specifying_a_Property_in_the_Distinct_Function.aspx
The default Distinct compares objects using their GetHashCode and Equals methods. To make your objects work with Distinct, you could override both, but you mentioned that you are retrieving your objects from a web service, so you may not be able to do that in this case.
implement IEquatable on GalleryImage because it is generated
A different approach would be to generate GalleryImage as a partial class, and then have another file with the inheritance and IEquatable, Equals, GetHash implementation.
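A minimal sketch of the partial-class idea follows. It assumes the generator emits GalleryImage as a partial class; the first declaration stands in for the generated file and is trimmed to two properties for brevity.

```csharp
using System;
using System.Linq;

// Stand-in for the generated file; the generator must emit 'partial' for this to work.
public partial class GalleryImage
{
    public int Key { get; set; }
    public string Caption { get; set; } = "";
}

// Hand-written file: adds key-based equality without touching the generated code.
public partial class GalleryImage : IEquatable<GalleryImage>
{
    public bool Equals(GalleryImage other) => other != null && Key == other.Key;

    public override bool Equals(object obj) => Equals(obj as GalleryImage);

    public override int GetHashCode() => Key.GetHashCode();
}

public static class PartialClassDemo
{
    public static void Main()
    {
        var images = new[]
        {
            new GalleryImage { Key = 1, Caption = "a" },
            new GalleryImage { Key = 1, Caption = "duplicate of a" },
            new GalleryImage { Key = 2, Caption = "b" },
        };

        // Distinct() now uses the IEquatable implementation; no comparer is needed.
        Console.WriteLine(images.Distinct().Count()); // 2
    }
}
```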
