Does CombineLatest conserve the order of the observables? - c#

I'm interested in the following overloads:
public static IObservable<IList<TSource>> CombineLatest<TSource>(params IObservable<TSource>[] sources);
public static IObservable<IList<TSource>> CombineLatest<TSource>(this IEnumerable<IObservable<TSource>> sources);
Is the order of the elements in the resulting list guaranteed to be the same as the order in the input?
For example, in the following code will list[0] always contain an element from a and list[1] an element from b?
IObservable<int> a = ...;
IObservable<int> b = ...;
var list = await Observable.CombineLatest(a, b).FirstAsync();
The documentation states:
a list with the latest source elements
and:
observable sequence containing lists of the latest elements of the sources
but does not really mention anything about order.

The order is preserved.
If you look at the Rx source code, both overloads boil down to the System.Reactive.Linq.CombineLatest<TSource, TResult> class.
There you can see that an indexed observer is created for each input observable (where the index is the position in the input):
for (int i = 0; i < N; i++)
{
var j = i;
var d = new SingleAssignmentDisposable();
_subscriptions[j] = d;
var o = new O(this, j);
d.Disposable = srcs[j].SubscribeSafe(o);
}
And the resulting element is produced as follows:
private void OnNext(int index, TSource value)
{
lock (_gate)
{
_values[index] = value;
_hasValue[index] = true;
if (_hasValueAll || (_hasValueAll = _hasValue.All(Stubs<bool>.I)))
{
/* snip */
res = _parent._resultSelector(new ReadOnlyCollection<TSource>(_values));
/* snip */
_observer.OnNext(res);
}
/* snip */
}
}
The _resultSelector for the overloads in question is just Enumerable.ToList(), so the order in the output list is the same as the order of the inputs.

CombineLatest fires for the first time once an element has been pushed to every stream.
After that it fires whenever a new element is pushed to any of the streams.
So if you "combine" two streams, the resulting list always contains exactly two elements, and as far as I know the order is guaranteed to match the order in which you passed the streams to CombineLatest.
You can visualize Rx operators using marble diagrams; see the one for CombineLatest.
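As a quick check of that guarantee, here is a minimal sketch (assuming the System.Reactive NuGet package is referenced; the Subject<int> sources are just for illustration):
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

class CombineLatestOrderDemo
{
    static void Main()
    {
        var a = new Subject<int>();
        var b = new Subject<int>();

        // list[0] should always come from a, list[1] from b.
        Observable.CombineLatest(a, b)
            .Subscribe(list => Console.WriteLine("a={0}, b={1}", list[0], list[1]));

        a.OnNext(1);  // nothing fires yet: b has no value
        b.OnNext(10); // prints "a=1, b=10"
        b.OnNext(20); // prints "a=1, b=20" -- index 1 still tracks b
        a.OnNext(2);  // prints "a=2, b=20" -- index 0 still tracks a
    }
}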

How to remove a scriptable object from a list of scriptable objects? [duplicate]

I am looking for a better pattern for working with a list of elements that each need to be processed and then, depending on the outcome, removed from the list.
You can't use .Remove(element) inside a foreach (var element in X) loop (it throws a "Collection was modified; enumeration operation may not execute." exception), and you can't use for (int i = 0; i < elements.Count(); i++) with .RemoveAt(i) either, because removing an element shifts your current position in the collection relative to i.
Is there an elegant way to do this?
Iterate your list in reverse with a for loop:
for (int i = safePendingList.Count - 1; i >= 0; i--)
{
// some code
// safePendingList.RemoveAt(i);
}
Example:
var list = new List<int>(Enumerable.Range(1, 10));
for (int i = list.Count - 1; i >= 0; i--)
{
if (list[i] > 5)
list.RemoveAt(i);
}
list.ForEach(i => Console.WriteLine(i));
Alternatively, you can use the RemoveAll method with a predicate to test against:
safePendingList.RemoveAll(item => item.Value == someValue);
Here's a simplified example to demonstrate:
var list = new List<int>(Enumerable.Range(1, 10));
Console.WriteLine("Before:");
list.ForEach(i => Console.WriteLine(i));
list.RemoveAll(i => i > 5);
Console.WriteLine("After:");
list.ForEach(i => Console.WriteLine(i));
foreach (var item in list.ToList()) {
list.Remove(item);
}
If you add ".ToList()" to your list (or the results of a LINQ query), you can remove "item" directly from "list" without the dreaded "Collection was modified; enumeration operation may not execute." error. The compiler makes a copy of "list", so that you can safely do the remove on the array.
While this pattern is not super efficient, it has a natural feel and is flexible enough for almost any situation. Such as when you want to save each "item" to a DB and remove it from the list only when the DB save succeeds.
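For instance, the save-then-remove flow mentioned above might look like this (SaveToDb and pending are hypothetical stand-ins for your persistence call and your list):
foreach (var item in pending.ToList())
{
    // Only remove items that were actually persisted; failed saves stay in the list.
    if (SaveToDb(item))
        pending.Remove(item);
}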
A simple and straightforward solution: run a standard for loop backwards over your collection and use RemoveAt(i) to remove elements.
Reverse iteration should be the first thing that comes to mind when you want to remove elements from a collection while iterating over it.
Luckily, there is a more elegant solution than writing a for loop, which involves needless typing and can be error prone:
ICollection<int> test = new List<int>(new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
foreach (int myInt in test.Reverse<int>())
{
if (myInt % 2 == 0)
{
test.Remove(myInt);
}
}
Calling ToArray() on a generic list allows you to Remove(item) from the list while iterating over the array copy:
List<String> strings = new List<string>() { "a", "b", "c", "d" };
foreach (string s in strings.ToArray())
{
if (s == "b")
strings.Remove(s);
}
Select the elements you do want rather than trying to remove the elements you don't want. This is much easier (and generally more efficient) than removing elements.
var newSequence = (from el in list
where el.Something || el.AnotherThing < 0
select el);
I wanted to post this as a comment in response to the comment left by Michael Dillon below, but it's too long and probably useful to have in my answer anyway:
Personally, I'd never remove items one by one. If you do need removal, call RemoveAll, which takes a predicate and rearranges the internal array only once, whereas Remove does an Array.Copy operation for every element you remove. RemoveAll is vastly more efficient.
And when you're iterating backwards over a list, you already have the index of the element you want to remove, so it is far more efficient to call RemoveAt: Remove first traverses the list to find the index of the element you're trying to remove, but you already know that index.
So, all in all, I don't see any reason ever to call Remove in a for loop. And ideally, if at all possible, use the code above to stream elements from the list as needed so no second data structure has to be created at all.
Using .ToList() will make a copy of your list, as explained in this question:
ToList()-- Does it Create a New List?
By using ToList(), you can remove from your original list, because you're actually iterating over a copy.
foreach (var item in listTracked.ToList())
{
    if (DetermineIfRequiresRemoval(item))
    {
        listTracked.Remove(item);
    }
}
If the function that determines which items to delete has no side effects and doesn't mutate the item (it's a pure function), a simple and efficient (linear time) solution is:
list.RemoveAll(condition);
If there are side effects, I'd use something like:
var toRemove = new HashSet<T>();
foreach(var item in items)
{
...
if(condition)
toRemove.Add(item);
}
items.RemoveAll(toRemove.Contains);
This is still linear time, assuming the hash is good. But it has an increased memory use due to the hashset.
Finally, if your list is only an IList<T> instead of a List<T>, I suggest my answer to "How can I do this special foreach iterator?". That approach has linear runtime for typical implementations of IList<T>, compared with the quadratic runtime of many other answers.
Since any removal is taken on a condition, you can use
list.RemoveAll(item => item.Value == someValue);
List<T> TheList = new List<T>();
TheList.FindAll(element => element.Satisfies(Condition)).ForEach(element => TheList.Remove(element));
You can't use foreach, but you could iterate forwards and manage your loop index variable when you remove an item, like so:
for (int i = 0; i < elements.Count; i++)
{
if (<condition>)
{
// Decrement the loop counter to iterate this index again, since later elements will get moved down during the remove operation.
elements.RemoveAt(i--);
}
}
Note that in general all of these techniques rely on the behaviour of the collection being iterated. The technique shown here will work with the standard List<T>. (It is quite possible to write your own collection class and iterator that does allow item removal during a foreach loop.)
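As an illustration of that last point, here is a minimal sketch of such a collection (the SnapshotList<T> name is hypothetical): its enumerator walks a snapshot of the list, so the underlying list can be modified freely inside a foreach.
using System.Collections.Generic;
using System.Linq;

public class SnapshotList<T> : List<T>
{
    // foreach over a SnapshotList<T> binds to this method and iterates a copy,
    // so Remove/Add on the list itself cannot invalidate the enumeration.
    public new IEnumerator<T> GetEnumerator()
    {
        foreach (var item in this.ToArray())
            yield return item;
    }
}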
For loops are a bad construct for this.
Using while
var numbers = new List<int>(Enumerable.Range(1, 3));
while (numbers.Count > 0)
{
numbers.RemoveAt(0);
}
But, if you absolutely must use for
var numbers = new List<int>(Enumerable.Range(1, 3));
for (; numbers.Count > 0;)
{
numbers.RemoveAt(0);
}
Or, this:
public static class Extensions
{
public static IList<T> Remove<T>(
this IList<T> numbers,
Func<T, bool> predicate)
{
numbers.ForEachBackwards(predicate, (n, index) => numbers.RemoveAt(index));
return numbers;
}
public static void ForEachBackwards<T>(
this IList<T> numbers,
Func<T, bool> predicate,
Action<T, int> action)
{
for (var i = numbers.Count - 1; i >= 0; i--)
{
if (predicate(numbers[i]))
{
action(numbers[i], i);
}
}
}
}
Usage:
var numbers = new List<int>(Enumerable.Range(1, 10)).Remove((n) => n > 5);
However, List<T> already has RemoveAll() to do this:
var numbers = new List<int>(Enumerable.Range(1, 10));
numbers.RemoveAll((n) => n > 5);
Lastly, you are probably better off using LINQ's Where() to filter and create a new list instead of mutating the existing list. Immutability is usually good.
var numbers = new List<int>(Enumerable.Range(1, 10))
.Where((n) => n <= 5)
.ToList();
Using Remove or RemoveAt on a list while iterating over that list has intentionally been made difficult, because it is almost always the wrong thing to do. You might be able to get it working with some clever trick, but it would be extremely slow. Every time you call Remove it has to scan through the entire list to find the element you want to remove. Every time you call RemoveAt it has to move subsequent elements 1 position to the left. As such, any solution using Remove or RemoveAt, would require quadratic time, O(n²).
Use RemoveAll if you can. Otherwise, the following pattern will filter the list in-place in linear time, O(n).
// Create a list to be filtered
IList<int> elements = new List<int>(new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
// Filter the list
int kept = 0;
for (int i = 0; i < elements.Count; i++) {
// Test whether this is an element that we want to keep.
if (elements[i] % 3 > 0) {
// Add it to the list of kept elements.
elements[kept] = elements[i];
kept++;
}
}
// Unfortunately IList has no Resize method. So instead we
// remove the last element of the list until: elements.Count == kept.
while (kept < elements.Count) elements.RemoveAt(elements.Count-1);
I would reassign the list from a LINQ query that filtered out the elements you didn't want to keep.
list = list.Where(item => ...).ToList();
Unless the list is very large there should be no significant performance problems in doing this.
The best way to remove items from a list while iterating over it is to use RemoveAll(). The main concern people raise is that they have to do some complex things inside the loop, or have complex comparison cases.
The solution is to still use RemoveAll() but use this notation:
var list = new List<int>(Enumerable.Range(1, 10));
list.RemoveAll(item =>
{
// Do some complex operations here
// Or even some operations on the items
SomeFunction(item);
// In the end return true if the item is to be removed. False otherwise
return item > 5;
});
Assuming that predicate is a Boolean property of an element which, when true, means the element should be removed:
int i = 0;
while (i < list.Count)
{
    if (list[i].predicate)
    {
        list.RemoveAt(i);
        continue;
    }
    i++;
}
In C# one easy way is to mark the items you wish to delete, then iterate over a copy and remove them from the original:
foreach (var item in list.ToList())
{
    if (item.Delete) list.Remove(item);
}
or even simpler:
list.RemoveAll(p => p.Delete);
But it is worth considering whether other tasks or threads will have access to the same list while you are busy removing; since .NET has no built-in ConcurrentList, consider one of the System.Collections.Concurrent collections (or explicit locking) in that case.
I wish the "pattern" was something like this:
foreach( thing in thingpile )
{
if( /* condition#1 */ )
{
foreach.markfordeleting( thing );
}
elseif( /* condition#2 */ )
{
foreach.markforkeeping( thing );
}
}
foreachcompleted
{
// then the programmer's choices would be:
// delete everything that was marked for deleting
foreach.deletenow(thingpile);
// ...or... keep only things that were marked for keeping
foreach.keepnow(thingpile);
// ...or even... make a new list of the unmarked items
others = foreach.unmarked(thingpile);
}
This would align the code with the process that goes on in the programmer's brain.
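For what it's worth, that wished-for pattern can be approximated in today's C# by marking into side lists and acting afterwards; here is a small self-contained sketch, with ints and an even/odd test standing in for the things and conditions:
using System;
using System.Collections.Generic;

class MarkAndSweepDemo
{
    static void Main()
    {
        var thingpile = new List<int> { 1, 2, 3, 4, 5, 6 };
        var markedForDeleting = new List<int>();
        var markedForKeeping = new List<int>();

        foreach (var thing in thingpile)
        {
            if (thing % 2 == 0)      // "condition #1"
                markedForDeleting.Add(thing);
            else                     // "condition #2"
                markedForKeeping.Add(thing);
        }

        // Delete everything that was marked for deleting...
        thingpile.RemoveAll(markedForDeleting.Contains);
        // ...or, alternatively, keep only things marked for keeping:
        // thingpile = markedForKeeping;

        Console.WriteLine(string.Join(", ", thingpile)); // 1, 3, 5
    }
}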
foreach(var item in list.ToList())
{
if(item.Delete) list.Remove(item);
}
Simply create an entirely new list from the first one. I say "easy" rather than "right", as creating an entirely new list probably comes at a performance premium over the previous method (I haven't bothered with any benchmarking). I generally prefer this pattern; it can also be useful in overcoming LINQ-to-Entities limitations.
for (int i = list.Count - 1; i >= 0; i--)
{
    var item = list[i];
    if (item.Delete) list.Remove(item);
}
This cycles through the list backwards with a plain old for loop. Doing this forwards can be problematic if the size of the collection changes, but backwards is always safe.
Just wanted to add my 2 cents in case this helps anyone. I had a similar problem but needed to remove multiple elements from an ArrayList while iterating over it. The highest-upvoted answer did it for me for the most part, until I ran into errors and realized that in some instances the index was greater than the size of the ArrayList, because multiple elements were removed in one iteration but the loop index didn't account for that. I fixed this with a simple check:
ArrayList place_holder = new ArrayList();
place_holder.Add("1");
place_holder.Add("2");
place_holder.Add("3");
place_holder.Add("4");
for(int i = place_holder.Count-1; i>= 0; i--){
if(i>= place_holder.Count){
i = place_holder.Count-1;
}
// some method that removes multiple elements here
}
There is an option that hasn't been mentioned here.
If you don't mind adding a bit of code somewhere in your project, you can add an extension to List<T> that returns an instance of a class which iterates the list in reverse.
You would use it like this:
foreach (var elem in list.AsReverse())
{
//Do stuff with elem
//list.Remove(elem); //Delete it if you want
}
And here is what the extension looks like:
using System.Collections;
using System.Collections.Generic;

public static class ReverseListExtension
{
    public static ReverseList<T> AsReverse<T>(this List<T> list) => new ReverseList<T>(list);

    public class ReverseList<T> : IEnumerable<T>
    {
        readonly List<T> list;
        public ReverseList(List<T> list) { this.list = list; }

        public IEnumerator<T> GetEnumerator()
        {
            for (int i = list.Count - 1; i >= 0; i--)
                yield return list[i];
        }

        IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
    }
}
This is basically list.Reverse() without the allocation. As some have mentioned, you still have the drawback of deleting elements one by one, and if your list is massively long some of the options here are better. But I think there is a world where someone wants the simplicity of list.Reverse() without the memory overhead.
Copy the list you are iterating, then iterate the copy and remove from the original. Going backwards is confusing and doesn't work well when looping in parallel.
var ids = new List<int> { 1, 2, 3, 4 };
var iterableIds = ids.ToList();
Parallel.ForEach(iterableIds, id =>
{
    // List<T> is not thread-safe, so synchronize mutations of the original list.
    lock (ids)
    {
        ids.Remove(id);
    }
});
I would do it like this:
using System.IO;
using System;
using System.Collections.Generic;
class Author
{
public string Firstname;
public string Lastname;
public int no;
}
class Program
{
private static bool isEven(int i)
{
return ((i % 2) == 0);
}
static void Main()
{
var authorsList = new List<Author>()
{
new Author{ Firstname = "Bob", Lastname = "Smith", no = 2 },
new Author{ Firstname = "Fred", Lastname = "Jones", no = 3 },
new Author{ Firstname = "Brian", Lastname = "Brains", no = 4 },
new Author{ Firstname = "Billy", Lastname = "TheKid", no = 1 }
};
authorsList.RemoveAll(item => isEven(item.no));
foreach(var auth in authorsList)
{
Console.WriteLine(auth.Firstname + " " + auth.Lastname);
}
}
}
OUTPUT
Fred Jones
Billy TheKid
I found myself in a similar situation where I had to remove every nth element in a given List<T>.
for (int i = 0, j = 0, n = 3; i < list.Count; i++)
{
if ((j + 1) % n == 0) //Check current iteration is at the nth interval
{
list.RemoveAt(i);
j++; //This extra addition is necessary. Without it j will wrap
//down to zero, which will throw off our index.
}
j++; //This will always advance the j counter
}
The cost of removing an item from a list is proportional to the number of items following the one removed. If the first half of the items qualify for removal, any approach that removes items individually will end up performing about N*N/4 item-copy operations, which gets very expensive for large lists.
A faster approach is to scan the list to find the first item to be removed (if any), and from that point forward copy each item that should be retained into the spot where it belongs. Once this is done, if R items should be retained, the first R items in the list will be those R items, and all of the items requiring deletion will be at the end. If those items are deleted in reverse order, the system won't end up having to copy any of them; so if the list had N items, of which R (including all of the first F) were retained, it is necessary to copy R-F items and to shrink the list by one item N-R times. All linear time.
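A minimal sketch of that scan-and-compact approach (RemoveWhere and shouldRemove are hypothetical names; any List<T> works):
static void RemoveWhere<T>(List<T> list, Predicate<T> shouldRemove)
{
    int write = 0;
    for (int read = 0; read < list.Count; read++)
    {
        if (!shouldRemove(list[read]))
            list[write++] = list[read]; // copy each retained item into place
    }
    // Delete the tail in reverse order so no remaining items get copied.
    for (int i = list.Count - 1; i >= write; i--)
        list.RemoveAt(i);
}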
My approach is to first create a list of the indices which should be deleted, then loop over those indices and remove the items from the initial list. It looks like this:
var messageList = ...;
// Restrict your list to certain criteria
var customMessageList = messageList.FindAll(m => m.UserId == someId);
if (customMessageList != null && customMessageList.Count > 0)
{
// Create list with positions in origin list
List<int> positionList = new List<int>();
foreach (var message in customMessageList)
{
var position = messageList.FindIndex(m => m.MessageId == message.MessageId);
if (position != -1)
positionList.Add(position);
}
// To be able to remove the items in the origin list, we do it backwards
// so that the order of indices stays the same
positionList = positionList.OrderByDescending(p => p).ToList();
foreach (var position in positionList)
{
messageList.RemoveAt(position);
}
}
Flag the elements to be removed with a property, and remove them all after processing:
using System.Linq;
List<MyProperty> _Group = new List<MyProperty>();
// ... add elements
bool cond = false;
foreach (MyProperty currObj in _Group)
{
// here it is supposed that you decide the "remove conditions"...
cond = true; // set true or false...
if (cond)
{
// SET - element can be deleted
currObj.REMOVE_ME = true;
}
}
// RESET
_Group.RemoveAll(r => r.REMOVE_ME);
myList.RemoveAt(i--);
simples;

Regarding Sorting MultiDimensional Arrays in C#

I am trying to figure out a way to correctly sort a bunch of different ArrayLists.
I am publishing content articles, and every value [0] in an ArrayList relates to every other value [0], and so on: each element makes up one part of a complete content item.
Now, the last element, popularity, is the number of clicks an item has received. How do I sort the content items by popularity without mixing up the HTML for each article?
*EDIT: I am limited to the .NET 2.0 Framework at work*
Below is the code... thanks.
public class MultiDimDictList : Dictionary<string, ArrayList> { }
myDicList.Add("fly", a_fly);
myDicList.Add("img", a_img);
myDicList.Add("bar", a_bar);
myDicList.Add("meter", a_meter);
myDicList.Add("block", a_block);
myDicList.Add("popularity", a_pop);
With the following code you can convert your existing dictionary of ArrayLists into a collection of dictionaries, which allows a simple sort using LINQ's OrderBy:
// Get the shortest ArrayList length (they should all be equal; this is just a paranoia check!)
var count = myDicList.Values.Min(x => x.Count);
// Get the collection of keys
var keys = myDicList.Keys;
// Perform the conversion
var result = Enumerable.Range(0, count)
    .Select(i => keys.Select(k => new { Key = k, Value = myDicList[k][i] })
                     .ToDictionary(x => x.Key, x => x.Value));
var sorted = result.OrderByDescending(x => x["popularity"]).ToList();
-- EDIT VERSION FOR .NET 2.0
First you need a comparer class
class PopularityComparison : IComparer<Dictionary<string,object>> {
private bool _sortAscending;
public PopularityComparison(bool sortAscending) {
_sortAscending = sortAscending;
}
public int Compare(Dictionary<string, object> x, Dictionary<string, object> y) {
object xValue = x["popularity"];
object yValue = y["popularity"];
// Sort Ascending
if (_sortAscending) {
return Comparer.Default.Compare(xValue, yValue);
} else {
return Comparer.Default.Compare(yValue, xValue);
}
}
}
Then you can use the following code
// Get the shortest arraylist length (they should be equal this is just a paranoia check!)
// Replacement for min
int count = int.MaxValue;
foreach (ArrayList a in myDicList.Values) if (a.Count < count) count = a.Count;
// Get the collection of Keys
Dictionary<string, ArrayList>.KeyCollection keys = myDicList.Keys;
// Perform the conversion
List<Dictionary<string, object>> result = new List<Dictionary<string, object>>(count);
for (int i = 0; i < count; i++) {
Dictionary<string, object> row = new Dictionary<string, object>(keys.Count);
foreach (string key in keys) row.Add(key, myDicList[key][i]);
result.Add(row);
}
And then finally to sort in ascending popularity order
result.Sort(new PopularityComparison(true));
or descending order:
result.Sort(new PopularityComparison(false));
I'd think it would be better to have an object containing your keys as properties, then a single collection with each item you'd have in your array lists.
This way you'd have a single collection sort, which becomes trivial if using Linq.OrderBy().
something like...
public class Article
{
public string Fly{get;set;}
public string Img{get;set;}
// etc.
public float Popularity{get;set;}
}
Then...
List<Article> articles = ... get from somewhere, or convert from your array lists.
List<Article> sorted = articles.OrderBy(a=>a.Popularity).ToList();
Please excuse the napkin code here... I'll update it if you need more detail.
An example using non-linq.
Create an implementation of IComparer.
public class ArticleComparer : IComparer<Article>
{
    public bool Ascending { get; set; }

    public int Compare(Article x, Article y)
    {
        float result = x.Popularity - y.Popularity;
        if (!Ascending) { result *= -1; }
        if (result == 0) { return 0; }
        if (result > 0) return 1;
        return -1;
    }
}
Then when you go to sort the List, you can do something like the following.
ArticleComparer comparer = new ArticleComparer();
comparer.Ascending = false;
articles.Sort(comparer);
This would be much easier if you had a list of article objects, each of which contained properties for fly, img, bar, popularity, etc. But if you really have to store things using this inside-out approach, then the only way you can sort the content items based on popularity is to create another array (or list) to hold the order.
Create a new list and populate it with sequential indexes:
List<int> OrderedByPopularity = new List<int>();
ArrayList popList = myDicList["popularity"];
for (int i = 0; i < popList.Count; ++i)
{
OrderedByPopularity.Add(i);
}
Now you have a list that contains the indexes of the items in the popularity list. Now you can sort:
OrderedByPopularity.Sort((i1, i2) => ((int)popList[i1]).CompareTo((int)popList[i2]));
But that gives you the least popular article first. If you want to reverse the sort so that OrderedByPopularity[0] is the most popular item:
OrderedByPopularity.Sort((i1, i2) => ((int)popList[i2]).CompareTo((int)popList[i1]));
Really, though, you should look into restructuring your application. It's much easier to work with objects that have properties rather than trying to maintain parallel arrays of properties.
If you have to do this in .NET 2.0, declare the poplist array at class scope (rather than method scope), and create a comparison method.
ArrayList popList;
void MyMethod()
{
List<int> OrderedByPopularity = new List<int>();
popList = myDicList["popularity"];
for (int i = 0; i < popList.Count; ++i)
{
OrderedByPopularity.Add(i);
}
OrderedByPopularity.Sort(PopularityComparison);
// ...
}
int PopularityComparison(int i1, int i2)
{
return ((int)popList[i2]).CompareTo((int)popList[i1]);
}

How to pass a subset of a collection to a C# method?

In C++, anytime I want to process some range of items I do something like this:
template<class Iterator>
void method(Iterator range_begin, Iterator range_end);
Is there such a convenient way to do this in C#, not only for System.Array but for other collection classes as well?
// ...
using System.Linq;
IEnumerable<T> GetSubset<T>( IEnumerable<T> collection, int start, int len )
{
// error checking if desired
return collection.Skip( start ).Take( len );
}
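For example, a hypothetical usage of the helper above, taking three items starting at zero-based index 2 (numbers can be any IEnumerable<int>):
var middle = GetSubset(numbers, 2, 3); // skips 2 items, takes the next 3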
I can't think of any at the moment, but using LINQ you can easily create an IEnumerable<T> representing a part of your collection.
As I understand it, the C# way is to see a part of a collection as a collection itself, and make your method to work on this sequence. This idea allows the method to be ignorant to whether it's working on the whole collection or its part.
Examples:
void method(IEnumerable<T> collection)
{ ... }
// from the 2nd to the 5th item:
method(original.Skip(1).Take(4));
// all even items:
method(original.Where(x => x % 2 == 0));
// everything till the first 0
method(original.TakeWhile(x => x != 0));
// everything
method(original);
etc.
Compare this to C++:
// from the 2nd to the 5th item:
method(original.begin() + 1, original.begin() + 5);
// everything
method(original.begin(), original.end());
// other two cannot be coded in this style in C++
In C++, your code calculates iterators to pass around, marking the beginning and the end of your sequence. In C#, you pass lightweight IEnumerables around. Thanks to lazy evaluation, this should add no real overhead.
If you use LINQ, take a look at Skip and Take.
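For instance, a trivial sketch taking five items starting at zero-based index 2:
var slice = source.Skip(2).Take(5); // yields the 3rd through 7th items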
Could you just use IEnumerable<> and pass in your values as a LINQ query?
void MyMethod<T>(IEnumerable<T> workSet) {
foreach (var workItem in workSet) {
doWorkWithItem(workItem);
}
}
var dataset = yourArray.SkipWhile(i=>i!=startItem).TakeWhile(i=>i!=endItem);
MyMethod(dataset);
var pagedSet = yourArray.Skip(pageSize * pageNumber).Take(pageSize);
MyMethod(pagedSet);
Wouldn't that be the same as using a lambda expression to subset your list?
public class MyClass {
    public int Year { get; set; }
    ...
}
then...
List<MyClass> allClasses = db.GetClasses();
IEnumerable<MyClass> subsetClasses = allClasses.Where(x => x.Year >= 1990 && x.Year <= 2000);
processSubset(subsetClasses);
Or you could use Skip() and Take(), just as you would when paging database results, if you want to process n items in the collection.
It depends on what kind of collection you're using. Some have more capabilities than others. Using a list, you can do something like this:
List<string> lst = new List<string>();
lst = lst.Where(str => str == "Harry" || str == "John" || str == "Joey").ToList();

C# merge distinct items of 2 collections

I'm looking for a performant way to add distinct items of a second ICollection to an existing one. I'm using .NET 4.
This should do it (Union already returns distinct elements, so a separate Distinct() call is unnecessary):
list1.Union(list2, aCustomComparer).ToList()
As long as they're IEnumerable, you can use the go-to Linq answer:
var union = firstCollection.Union(secondCollection);
This will use the default equality comparison, which for most objects is referential equality. To change this, you can define an IEqualityComparer generic to the item type in your collection that will perform a more semantic comparison, and specify it as the second argument of the Union.
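For illustration, here is a minimal sketch of such a comparer (the Person type and its Name property are hypothetical stand-ins for your item type), passed as the second argument of Union:
class NameComparer : IEqualityComparer<Person>
{
    // Two Persons are considered equal when their names match.
    public bool Equals(Person x, Person y) { return x.Name == y.Name; }
    public int GetHashCode(Person obj) { return obj.Name.GetHashCode(); }
}

var union = firstCollection.Union(secondCollection, new NameComparer());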
Another way to add to your existing list would be:
list1.AddRange(list2.Distinct().Except(list1));
The most direct answer to your question, since you didn't give much detail on the actual types of ICollection you have as input or need as output, is the one given by KeithS:
var union = firstCollection.Union(secondCollection);
This returns a distinct IEnumerable, and if that is what you need, it is VERY fast. I made a small test app (below) that ran the Union method (MethodA) against a simple HashSet-based method of deduplicating that returns a HashSet<> (MethodB). The Union method DESTROYS the HashSet:
MethodA: 1ms
MethodB: 2827ms
However, converting that IEnumerable to some other type of collection, such as List<> (like the version ADas posted), changes everything: materializing the result finally forces the deferred Union to do its work. Simply adding .ToList() to MethodA:
var union = firstCollection.Union(secondCollection).ToList();
Changes the results:
MethodA: 3656ms
MethodB: 2803ms
So it seems more would need to be known about the specific case you are working with, and any solution you come up with should be tested, since a small code change can have a HUGE impact.
Below is the test I used to compare these methods. I'm sure it is a stupid way to test, but it seems to work :)
private static void Main(string[] args)
{
ICollection<string> collectionA = new List<string>();
ICollection<string> collectionB = new List<string>();
for (int i = 0; i < 1000; i++)
{
string randomString = Path.GetRandomFileName();
collectionA.Add(randomString);
collectionA.Add(randomString);
collectionB.Add(randomString);
collectionB.Add(randomString);
}
Stopwatch testA = new Stopwatch();
testA.Start();
MethodA(collectionA, collectionB);
testA.Stop();
Stopwatch testB = new Stopwatch();
testB.Start();
MethodB(collectionA, collectionB);
testB.Stop();
Console.WriteLine("MethodA: {0}ms", testA.ElapsedMilliseconds);
Console.WriteLine("MethodB: {0}ms", testB.ElapsedMilliseconds);
Console.ReadLine();
}
private static void MethodA(ICollection<string> collectionA, ICollection<string> collectionB)
{
for (int i = 0; i < 10000; i++)
{
var result = collectionA.Union(collectionB);
}
}
private static void MethodB(ICollection<string> collectionA, ICollection<string> collectionB)
{
for (int i = 0; i < 10000; i++)
{
var result = new HashSet<string>(collectionA);
foreach (string s in collectionB)
{
result.Add(s);
}
}
}

Is there a way to know I am getting the last element in the foreach loop

I need to do special treatment for the last element in the collection. I am wondering if I can tell that I've hit the last element when using a foreach loop.
The only way I know of is to increment a counter and compare it with the length on exit, or, when breaking out of the loop early, to set a boolean flag such as loopExitedEarly.
There isn't a direct way. You'll have to keep buffering the next element.
IEnumerable<Foo> foos = ...
Foo prevFoo = default(Foo);
bool elementSeen = false;
foreach (Foo foo in foos)
{
if (elementSeen) // If prevFoo is not the last item...
ProcessNormalItem(prevFoo);
elementSeen = true;
prevFoo = foo;
}
if (elementSeen) // Required because foos might be empty.
ProcessLastItem(prevFoo);
Alternatively, you could use the underlying enumerator to do the same thing:
using (var erator = foos.GetEnumerator())
{
if (!erator.MoveNext())
return;
Foo current = erator.Current;
while (erator.MoveNext())
{
ProcessNormalItem(current);
current = erator.Current;
}
ProcessLastItem(current);
}
It's a lot easier when working with collections that reveal how many elements they have (typically the Count property from ICollection or ICollection<T>) - you can maintain a counter (alternatively, if the collection exposes an indexer, you could use a for-loop instead):
int numItemsSeen = 0;
foreach(Foo foo in foos)
{
if(++numItemsSeen == foos.Count)
ProcessLastItem(foo)
else ProcessNormalItem(foo);
}
If you can use MoreLinq, it's easy:
foreach (var entry in foos.AsSmartEnumerable())
{
if(entry.IsLast)
ProcessLastItem(entry.Value)
else ProcessNormalItem(entry.Value);
}
If efficiency isn't a concern, you could do:
Foo[] fooArray = foos.ToArray();
foreach(Foo foo in fooArray.Take(fooArray.Length - 1))
ProcessNormalItem(foo);
ProcessLastItem(fooArray.Last());
Unfortunately not, I would write it with a for loop like:
string[] names = { "John", "Mary", "Stephanie", "David" };
int iLast = names.Length - 1;
for (int i = 0; i <= iLast; i++) {
Debug.Write(names[i]);
Debug.Write(i < iLast ? ", " : Environment.NewLine);
}
And yes, I know about String.Join :).
I see others already posted similar ideas while I was typing mine, but I'll post it anyway:
void Enumerate<T>(IEnumerable<T> items, Action<T, bool> action) {
IEnumerator<T> enumerator = items.GetEnumerator();
if (!enumerator.MoveNext()) return;
bool foundNext;
do {
T item = enumerator.Current;
foundNext = enumerator.MoveNext();
action(item, !foundNext);
}
while (foundNext);
}
...
string[] names = { "John", "Mary", "Stephanie", "David" };
Enumerate(names, (name, isLast) => {
Debug.Write(name);
Debug.Write(!isLast ? ", " : Environment.NewLine);
});
Not without jumping through flaming hoops (see above). But you can just use the enumerator directly (slightly awkward because of C#'s enumerator design):
IEnumerator<string> it = foo.GetEnumerator();
for (bool hasNext = it.MoveNext(); hasNext; ) {
string element = it.Current;
hasNext = it.MoveNext();
if (hasNext) { // normal processing
Console.Out.WriteLine(element);
} else { // special case processing for last element
Console.Out.WriteLine("Last but not least, " + element);
}
}
Notes on the other approaches I see here: Mitch's approach requires having access to a container that exposes its size. J.D.'s approach requires writing a method in advance, then doing your processing via a closure. Ani's approach spreads loop management all over the place. John K's approach involves creating numerous additional objects, or (in its second method) only allows additional post-processing of the last element, rather than special-case processing.
I don't understand why people don't use the Enumerator directly in a normal loop, as I've shown here. K.I.S.S.
This is cleaner with Java iterators, because their interface uses hasNext rather than MoveNext. You could easily write an extension method for IEnumerable that gave you Java-style iterators, but that's overkill unless you write this kind of loop a lot.
Does the special treatment have to happen while processing in the foreach loop, or can it be done while adding to the collection? If the latter, use your own custom collection:
public class ListCollection : List<string>
{
    public new void Add(string item)
    {
        // TODO: Do special treatment on the new item; the new item becomes the last one.
        // Not applicable for filter/sort.
        base.Add(item);
    }
}
List<int> numbers = new ....;
int last = numbers.Last();
Stack<int> stack = new ...;
stack.Peek();
update
var numbers = new int[] { 1, 2,3,4,5 };
var enumerator = numbers.GetEnumerator();
object last = null;
bool hasElement = true;
do
{
hasElement = enumerator.MoveNext();
if (hasElement)
{
last = enumerator.Current;
Console.WriteLine(enumerator.Current);
}
else
Console.WriteLine("Last = {0}", last);
} while (hasElement);
Console.ReadKey();
Deferred Execution trick
Build a class that encapsulates the values to be processed and the processing function for deferred execution purpose. We will end up using one instance of it for each element processed in the loop.
// functor class
// functor class
class Runner {
    public string ArgString { get; private set; }
    public object ArgContext { get; private set; }

    // CTOR: encapsulate the argument and a context to run it in
    public Runner(string str, object context) {
        ArgString = str;
        ArgContext = context;
    }

    // This is the item-processor logic.
    public void Process() {
        // process ArgString normally in ArgContext
    }
}
Use your functor in the foreach loop to effect deferred execution by one element:
// intended to track previous item in the loop
var recent = default(Runner); // see Runner class above
// normal foreach iteration
foreach(var str in listStrings) {
// is deferred because this executes recent item instead of current item
if (recent != null)
recent.Process(); // run recent processing (from previous iteration)
// store the current item for next iteration
recent = new Runner(str, context);
}
// now the final item remains unprocessed - you have a choice
if (want_to_process_normally)
recent.Process(); // just like the others
else
do_something_else_with(recent.ArgString, recent.ArgContext);
This functor approach uses more memory, but saves you from having to count the elements in advance. In some scenarios you might achieve a kind of efficiency.
OR
Shorter Workaround
If you want to apply special processing to the last element after processing them all in a regular way ....
// example using strings
var recentStr = default(string);
foreach(var str in listStrings) {
recentStr = str;
// process str normally
}
// now apply additional special processing to recentStr (last)
It's a potential workaround.
