Remove sublist from a list - c#

I have two lists: list1 and list2 (both lists of int).
Now I want to remove the contents of list2 from list1. How can I do this in C#?
PS: Don't use a loop.

IMPORTANT CHANGE
As was pointed out in the comments, .Except() uses a set internally, so any duplicate members of list1 will be absent from the final result. MSDN describes it as follows:
"Produces the set difference of two sequences."
http://msdn.microsoft.com/en-us/library/system.linq.enumerable.except(v=vs.110).aspx
However, there is a solution that is both O(N) and preserves duplicates in the original list: Modify the RemoveAll(i => list2.Contains(i)) approach to use a HashSet<int> to hold the exclusion set.
List<int> list1 = Enumerable.Range(1, 10000000).ToList();
HashSet<int> exclusionSet = Enumerable.Range(500000, 10).ToHashSet();
list1.RemoveAll(i => exclusionSet.Contains(i));
The extension method ToHashSet() is available in MoreLinq, and it is built into System.Linq from .NET Framework 4.7.2 and .NET Core 2.0 onward.
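To make the difference concrete, here is a minimal sketch comparing the two approaches on a list with duplicates. The class and method names (DuplicateDemo, WithExcept, WithRemoveAll) are made up for illustration; WithRemoveAll works on a copy only so that both calls can share the same input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateDemo
{
    // Set difference: each remaining distinct value is yielded once.
    public static List<int> WithExcept(List<int> source, List<int> toRemove) =>
        source.Except(toRemove).ToList();

    // Predicate-based removal with hash lookups: duplicates in source survive.
    public static List<int> WithRemoveAll(List<int> source, List<int> toRemove)
    {
        var exclusionSet = new HashSet<int>(toRemove);
        var copy = new List<int>(source); // copy so the caller's list is left untouched
        copy.RemoveAll(i => exclusionSet.Contains(i));
        return copy;
    }

    public static void Main()
    {
        var list1 = new List<int> { 1, 1, 2, 3, 3, 4 };
        var list2 = new List<int> { 2, 4 };
        Console.WriteLine(string.Join(",", WithExcept(list1, list2)));    // 1,3
        Console.WriteLine(string.Join(",", WithRemoveAll(list1, list2))); // 1,1,3,3
    }
}
```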
ORIGINAL ANSWER
You can use Linq
list1 = list1.Except(list2).ToList();
UPDATE
Out of curiosity I did a simple benchmark of my solution vs. @HighCore's.
For list2 having just one element, his code is faster. As list2 gets larger and larger, his code gets extremely slow. It looks like his is O(N-squared) (or more specifically O(list1.length*list2.length), since each item in list1 is compared to each item in list2). I don't have enough data points to check the Big-O of my solution, but it is much faster when list2 has more than a handful of elements.
Code used to test:
List<int> list1 = Enumerable.Range(1, 10000000).ToList();
List<int> list2 = Enumerable.Range(500000, 10).ToList(); // Gets MUCH slower as 10 increases to 100 or 1000
Stopwatch sw = Stopwatch.StartNew();
//list1 = list1.Except(list2).ToList();
list1.RemoveAll(i => list2.Contains(i));
sw.Stop();
var ms1 = sw.ElapsedMilliseconds;
UPDATE 2
The Except solution assigns a new list to the variable list1. As @Толя points out, other references (if any) to the original list1 will not be updated. It drastically outperforms RemoveAll with list2.Contains for all but the smallest sizes of list2, so if no other references must see the update, it is preferable for that reason.
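The point about other references can be checked with a small sketch like the one below. The names (AliasDemo, alias) are illustrative only; the code shows what a second variable pointing at the original list observes after each approach.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AliasDemo
{
    // What a second reference sees after list1 is REASSIGNED via Except.
    public static int AliasCountAfterExcept()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        var list2 = new List<int> { 2, 4 };
        List<int> alias = list1;              // second reference to the same list object
        list1 = list1.Except(list2).ToList(); // list1 now refers to a brand-new list
        return alias.Count;                   // alias still refers to the old, unchanged list
    }

    // What a second reference sees after the list is MUTATED via RemoveAll.
    public static int AliasCountAfterRemoveAll()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        var list2 = new List<int> { 2, 4 };
        List<int> alias = list1;
        list1.RemoveAll(x => list2.Contains(x)); // changes the shared object in place
        return alias.Count;
    }

    public static void Main()
    {
        Console.WriteLine(AliasCountAfterExcept());    // 4
        Console.WriteLine(AliasCountAfterRemoveAll()); // 2
    }
}
```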

list1.RemoveAll(x => list2.Contains(x));

You can use this:
List<int> result = list1.Except(list2).ToList();

This will remove every item in secondList from firstList:
firstList.RemoveAll(item => secondList.Contains(item));

Related

Remove Range of items with HashSet

I am using a HashSet.
I am looking for a way to remove a range of items from the beginning of the HashSet.
With a List this can be done with RemoveRange.
Example
Removes 10 items from the beginning:
dinosaurs.RemoveRange(0, 10);
Can this be done with HashSet?
[Edit] The order of the HashSet does not matter, since it contains only random strings.

HashSet stores unique pieces of data (under the hood, the storage is identical to that of the keys in a Dictionary), so it doesn't make much sense to remove "the first" 10 items: the enumeration order is effectively arbitrary, though not random enough for most needs.
If you need a properly random selection, OrderBy can do that:
var r = new Random();
foreach (var dinoToRemove in dinosaurs
    .OrderBy(x => r.Next())
    .Take(10)
    .ToList()) // materialize the picks before mutating the set
{
    dinosaurs.Remove(dinoToRemove);
}
If you really are determined to remove the first ten in enumeration order, you could also use a counter with RemoveWhere:
var count = 0;
dinosaurs.RemoveWhere(x => count++ < 10);

You can use LINQ and pass the results to a HashSet constructor:
var filteredList = new HashSet<Dinosaur>(originalList.Skip(10));
As Servy notes, a HashSet may not be ordered, so you might want to get an ordered enumerable before you skip 10.
var filteredList = new HashSet<Dinosaur>(
    originalList
        .OrderBy(x => x.Foo)
        .Skip(10)
);
If you don't want to instantiate a new HashSet (i.e. you want to remove items from the existing instance) you will need to do that one by one.
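Removing items one by one from the existing instance could look roughly like this. RemoveAny is a hypothetical helper name, and "first" here only means "first in whatever order the set happens to enumerate". The picks are copied into a list before removal because a HashSet must not be modified while it is being enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashSetTrim
{
    // Removes up to `count` items (in enumeration order) from the existing set.
    // Returns how many items were actually removed.
    public static int RemoveAny<T>(HashSet<T> set, int count)
    {
        // Snapshot the items to remove before touching the set.
        List<T> picked = set.Take(count).ToList();
        foreach (T item in picked)
        {
            set.Remove(item);
        }
        return picked.Count;
    }

    public static void Main()
    {
        var dinosaurs = new HashSet<string> { "Rex", "Trike", "Stego", "Raptor", "Bronto" };
        int removed = RemoveAny(dinosaurs, 2);
        Console.WriteLine(removed);         // 2
        Console.WriteLine(dinosaurs.Count); // 3
    }
}
```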

ConcurrentDictionary.Where very slow for filtering based int array (Key field)

I have the following
var links = new ConcurrentDictionary<int, Link>();
which is populated with around 20k records. I have another collection of strings (a List) that I turn into a sequence of ints using the following:
var intPossible = NonExistingListingIDs.Select(int.Parse); // this is very fast, but it needs to be done
That part is pretty fast, but I would like to create a new list, or filter "links" down to only the entries whose Key appears in intPossible.
I have the following Where clause, but the actual filtering takes about 50 seconds, which is far too slow for what I want to do.
var filtered = links.Where(x => intPossible.Any(y => y == x.Key)).ToList();
I know Intersect is pretty fast, but I have an array of ints, and Intersect does not work with that against a ConcurrentDictionary.
How can I make this filtering take much less than 50 seconds?

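You need to replace your O(n) inner lookup with something faster, such as a HashSet, which offers O(1) lookups on average. (Because intPossible is a deferred Select, the original code also re-runs int.Parse over the strings for every item in links.)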
So
var intPossible = new HashSet<int>(NonExistingListingIDs.Select(int.Parse));
and
var filtered = links.Where(x => intPossible.Contains(x.Key)).ToList();
This will avoid iterating most of intPossible for every item in links.
Alternatively, Linq is your friend:
var intPossible = NonExistingListingIDs.Select(int.Parse);
var filtered =
links.Join(intPossible, link => link.Key, intP => intP, (link, intP) => link);
The implementation of Join does much the same thing as I do above.

An alternative method would be to enumerate your list and use the indexer of the dictionary...might be a little cleaner...
var intPossible = NonExistingListingIDs.Select(int.Parse);
var filtered = from id in intPossible
               where links.ContainsKey(id)
               select links[id];
You might want to chuck a .ToList() in there for good measure too...
This should actually be slightly faster than @spender's solution, since .Join has to build a new hash table, whilst this method uses the hash table already inside the ConcurrentDictionary.
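A variation on the same idea is to use TryGetValue, which does the existence check and the read in one call. That avoids a second lookup and also avoids the case where another thread removes the key between ContainsKey and the indexer. This is only a sketch: LinkFilter and Lookup are hypothetical names, and string values stand in for the Link type from the question. The Distinct call is there so that a repeated ID yields its entry once, as the Where-based filter does.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class LinkFilter
{
    // Returns the values whose keys appear in `ids` (given as strings).
    public static List<TValue> Lookup<TValue>(
        ConcurrentDictionary<int, TValue> links,
        IEnumerable<string> ids)
    {
        var result = new List<TValue>();
        foreach (int id in ids.Select(s => int.Parse(s)).Distinct())
        {
            // One hash lookup per id; no separate ContainsKey + indexer.
            if (links.TryGetValue(id, out TValue value))
            {
                result.Add(value);
            }
        }
        return result;
    }

    public static void Main()
    {
        var links = new ConcurrentDictionary<int, string>();
        links[1] = "a";
        links[2] = "b";
        links[3] = "c";
        var found = Lookup(links, new[] { "2", "3", "9", "2" });
        Console.WriteLine(found.Count); // 2
    }
}
```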

If at least one element of list1 is in list2?

Having the following:
public List<int> List1 { get; set; }
...
var x = GiveMeObject(); // x.List2 --> each element on list2 has an Id (int).
...
bool containsAtLeastOne = ???
What is the easiest/fastest/shortest way (in LINQ) to verify whether at least one element of List1 is in List2?
Thanks

bool containsAtLeastOne = x.List2.Any(li => List1.Contains(li.Id));

Alternative: Intersect
bool containsAtLeastOne = List1.Intersect(x.List2.Select(e => e.Id)).Any();
If your collections are getting large, you should use Intersect instead of Contains. Intersect builds a hash set from one of the sequences, so its cost grows roughly with List1.Count + List2.Count, whereas Any combined with List.Contains can need up to List1.Count * List2.Count comparisons and gets slow quickly.
If your collections are quite small (< 1000 elements), this difference would probably not matter.
If you don't mind a non-LINQ way and some more lines of code, you could use
var tmp = new HashSet<int>(x.List2.Select(e => e.Id));
tmp.IntersectWith(List1);
bool containsAtLeastOne = tmp.Count > 0;
which will probably be faster than the LINQ approach.
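If only a yes/no answer is needed, HashSet<T> also has an Overlaps method that stops at the first common element instead of computing the whole intersection. A minimal sketch, with OverlapDemo and SharesAnyId as made-up names and plain int sequences standing in for List1 and the Ids of List2:

```csharp
using System;
using System.Collections.Generic;

public static class OverlapDemo
{
    // True if at least one value in list1 also occurs in list2Ids.
    public static bool SharesAnyId(IEnumerable<int> list1, IEnumerable<int> list2Ids)
    {
        var ids = new HashSet<int>(list2Ids); // one pass to build the lookup set
        return ids.Overlaps(list1);           // returns as soon as a match is found
    }

    public static void Main()
    {
        Console.WriteLine(SharesAnyId(new[] { 1, 2, 3 }, new[] { 3, 4 })); // True
        Console.WriteLine(SharesAnyId(new[] { 1, 2, 3 }, new[] { 5, 6 })); // False
    }
}
```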

Some misunderstanding about sort extensions

I have the following code example:
List<int> list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);
list.Add(6);
list.Add(7);
list.OrderByDescending(n=>n).Reverse();
But when I use this:
list.OrderByDescending(n=>n).Reverse();
I don't get the wanted result.
If I use this statement instead:
list.Reverse();
I get the wanted result.
Any idea why I don't get the wanted result with the first statement?
I believe I am missing something in my understanding of extension methods.
Thank you in advance.

The list.Reverse() method reverses the list in-place, so your original list is changed.
The .OrderByDescending() extension method produces a new list (or rather an IEnumerable<T>) and leaves your original list intact.
EDIT
To get two lists, for both sort orders:
List<int> upList = list.OrderBy(n => n).ToList();
List<int> downList = list.OrderByDescending(n => n).ToList();

Edit: So the problem seems to be that you think the Enumerable extension methods change the original collection. They do not. They return something new that you need to assign to a variable:
IEnumerable<int> ordered = list.OrderByDescending(n => n);
foreach (int i in ordered)
    Console.WriteLine(i);
OrderByDescending orders descending (highest first), which is apparently what you want, so I don't understand why you reverse it afterwards.
So this should give you the expected result:
var ordered = list.OrderByDescending(n => n);
This returns an "arbitrary" order:
list.Reverse();
since it just reverses the order in which you added the ints. If you added them in sorted order, you don't need to sort at all.
In general: use OrderBy or OrderByDescending if you want to sort a sequence, and Reverse if you want to invert a sequence that is not necessarily sorted (relying on it for ordering is at least confusing).
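Part of the confusion is that two different methods share the name Reverse: on a List<T>, list.Reverse() is the instance method that reorders the list in place, while on the result of OrderByDescending it is the LINQ extension method that returns a new lazy sequence. The sketch below (ReverseDemo and its method names are illustrative) shows the three cases side by side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReverseDemo
{
    private static string Show(IEnumerable<int> xs) => string.Join(",", xs);

    // The LINQ query is built and thrown away; the list itself never changes.
    public static string AfterDiscardedQuery()
    {
        var list = new List<int> { 3, 1, 2 };
        list.OrderByDescending(n => n).Reverse(); // Enumerable.Reverse, result unused
        return Show(list);
    }

    // Assigning the query result gives the sorted data in a new list.
    public static string AfterAssignedQuery()
    {
        var list = new List<int> { 3, 1, 2 };
        List<int> sorted = list.OrderByDescending(n => n).ToList();
        return Show(sorted);
    }

    // List<T>.Reverse mutates the list, flipping its current (insertion) order.
    public static string AfterInPlaceReverse()
    {
        var list = new List<int> { 3, 1, 2 };
        list.Reverse();
        return Show(list);
    }

    public static void Main()
    {
        Console.WriteLine(AfterDiscardedQuery()); // 3,1,2
        Console.WriteLine(AfterAssignedQuery());  // 3,2,1
        Console.WriteLine(AfterInPlaceReverse()); // 2,1,3
    }
}
```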

Remove elements from one List<T> that are found in another

I have two lists
List<T> list1 = new List<T>();
List<T> list2 = new List<T>();
I want to remove all elements from list1 that also exist in list2. Of course I can loop through the first list looking for each element in list2, but I am looking for an elegant solution.
Thanks!

To change the actual list1 in place, you could use
list1.RemoveAll(item => list2.Contains(item));
You might instead prefer to simply have a query over the lists without modifying either
var result = list1.Except(list2);
LukeH makes a good recommendation in the comments. In the first version, and if list2 is particularly large, it might be worth it to load the list into a HashSet<T> prior to the RemoveAll invocation. If the list is small, don't worry about it. If you are unsure, test both ways and then you will know.
var theSet = new HashSet<YourType>(list2);
list1.RemoveAll(item => theSet.Contains(item));

With LINQ:
var result = list1.Except(list2);

list1.RemoveAll( item => list2.Contains(item));

Description
I think you mean the generic type List<T>. You can do this with List<T>.RemoveAll and a lambda predicate.
Sample
List<string> l = new List<string>();
List<string> l2 = new List<string>();
l.Add("one");
l.Add("two");
l.Add("three");
l2.Add("one");
l2.Add("two");
l2.Add("three");
l2.Add("four");
l2.RemoveAll(x => l.Contains(x)); // l2 now contains only "four"
More Information
MSDN - List.RemoveAll Method

var result = list1.Except(list2);

Using a lambda predicate you can do this:
List1.RemoveAll(i => List2.Contains(i));

If you want to remove a list of objects (list2) from another list (list1) use:
list1 = list1.Except(list2).ToList();
Remember to use ToList() to convert IEnumerable<T> to List<T>.

var NewList = FirstList.Where(a => !SecondList.Exists(b => b.ID == a.ID));
Using LINQ
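When the comparison is by an ID property rather than by the objects themselves, the HashSet idea from the other answers still applies: collect the IDs of the second list once, then filter the first list against that set. A sketch under stated assumptions: Item is a hypothetical class with an int ID, and KeyedRemoval.ExceptById is an illustrative helper, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, used only for this illustration.
public sealed class Item
{
    public int ID { get; }
    public string Name { get; }

    public Item(int id, string name)
    {
        ID = id;
        Name = name;
    }
}

public static class KeyedRemoval
{
    // Items of `first` whose ID does not occur in `second`; duplicates in `first` are kept.
    public static List<Item> ExceptById(IEnumerable<Item> first, IEnumerable<Item> second)
    {
        var excludedIds = new HashSet<int>(second.Select(b => b.ID));
        return first.Where(a => !excludedIds.Contains(a.ID)).ToList();
    }

    public static void Main()
    {
        var firstList = new List<Item> { new Item(1, "one"), new Item(2, "two"), new Item(3, "three") };
        var secondList = new List<Item> { new Item(2, "zwei") };
        foreach (Item item in ExceptById(firstList, secondList))
        {
            Console.WriteLine(item.Name); // prints "one", then "three"
        }
    }
}
```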
