Manually removing items from a collection vs using Enumerable.Except C#

Removing objects from a List of custom objects with the Except method requires you to implement an IEqualityComparer for the object type. But is it bad to just remove the objects in a normal foreach loop?
I understand the concept of IEqualityComparer and of using Except, but I couldn't get it to work for some reason, so I just removed the items manually. Is this considered bad programming?
EDIT: With the manual approach I'd have to override Equals and GetHashCode, polluting my view model. I guess that's bad?

In general, one should avoid making changes to a collection while enumerating.
I'm not entirely sure what your original problem is, or why you need to remove elements in such a way, but you're over-complicating the problem. If I understand it right, you are using a List<T> where T is a custom type. If so, simply use a LINQ query to get the values you want from the list.
var newList = oldList.Where(x => x.PropertyName != unwantedValue);
You could use Enumerable.Except, but it should be noted that Enumerable.Except returns a set, which is to say, no duplicate values are allowed.
Also, it should be noted that overriding .Equals and .GetHashCode does not pollute the viewmodel, as far as I know.
Sources:
http://msdn.microsoft.com/en-us/library/vstudio/bb336390(v=vs.100).aspx
Enumerable.Except Problem
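To make the difference concrete, here is a minimal sketch that filters the same list both ways. The Item type, the ItemIdComparer class and the sample data are made up for illustration; the only things taken from the answer are the Where filter and the fact that Except has set semantics. The comparer is kept in its own class, so the view model needs no Equals or GetHashCode override.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var oldList = new List<Item>
{
    new Item(1, "keep"),
    new Item(1, "keep"),   // duplicate by Id
    new Item(2, "drop"),
    new Item(3, "keep"),
};
var unwanted = new List<Item> { new Item(2, "drop") };

// Where keeps duplicates and needs no equality override on Item.
var unwantedIds = new HashSet<int>(unwanted.Select(u => u.Id));
var filtered = oldList.Where(x => !unwantedIds.Contains(x.Id)).ToList();

// Except has set semantics, so the duplicated Id 1 is returned only once.
var excepted = oldList.Except(unwanted, new ItemIdComparer()).ToList();

Console.WriteLine(filtered.Count);  // 3
Console.WriteLine(excepted.Count);  // 2

// Hypothetical view-model type; it does not override Equals or GetHashCode.
class Item
{
    public Item(int id, string name) { Id = id; Name = name; }
    public int Id { get; }
    public string Name { get; }
}

// Equality by Id lives in a separate comparer, leaving Item untouched.
class ItemIdComparer : IEqualityComparer<Item>
{
    public bool Equals(Item a, Item b) => a?.Id == b?.Id;
    public int GetHashCode(Item item) => item.Id.GetHashCode();
}
```

Both calls build a new sequence and leave oldList unmodified, which also sidesteps the problem of changing a collection while enumerating it.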

Related

Add, Update, Remove between collections using Linq

Here is my scenario. I am using WPF with two-way binding to show a collection of objects received from a service call every 60 seconds. On the first call I create a collection of objects that will be displayed from the collection of service objects. On subsequent calls I need to compare the service collection to the existing collection and then do one of three things:
If the Item exists in both collections then update ALL of the values for the object in the Display collection with the values from the object in the service collection.
If the item Exists in the Service Collection and not the Display Collection then add it to the Display Collection.
If the Item exists in the Display collection and not the Service Collection then remove it from the Display collection.
I am looking for the best way to do this.
Adding & Removing
Is it smarter to do a Left Join here and return everything essentially unique to one side of the other and then add or remove that as appropriate?
Should I attempt to do a Union since Linq is supposed to merge the two and ignore the duplicates?
If so how does it decide uniqueness? Is it evaluating all the properties? Can I specify which collection to keep from and which to discard in merging?
Should I use Except to create a list of differences and somehow use that?
Should I create a new list to add and remove using Where / Not In logic?
Updating
Since the collections aren't dictionaries what is the best way to do the comparison:
list1.ForEach(x => list2[x.Id].SomeProperty = x.SomeProperty);
Is there some way of copying ALL the property values other than specifying each one of them similar to above? Can I perform some kind of shallow copy within Linq Without replacing the actual object that is there?
I don't want to just clear my list and re-add everything each time because the object is bound in the display and I have logic in the properties that is tracking deviations as values change.

You can use the Except and Intersect methods to accomplish most of what you are looking to do.
However, depending on the size of your objects this can be very resource intensive.
I would recommend the following.
// Assuming collectionA is the service collection and collectionB is the display collection.
var listIDsA = collectionA.Select(s => s.Id).Distinct().ToList();
var listIDsB = collectionB.Select(s => s.Id).Distinct().ToList();
// Where (not Select) filters the IDs; the lambda parameter is already the ID itself.
var idsToRemove = listIDsB.Where(id => !listIDsA.Contains(id)).ToList();
var idsToUpdate = listIDsB.Where(id => listIDsA.Contains(id)).ToList();
var idsToAdd = listIDsA.Where(id => !listIDsB.Contains(id)).ToList();
Then, using the three new collections, you can add/remove/update the appropriate records.
You can also use a HashSet instead of IEnumerables for better performance. This will require you to create an extension class to add that functionality. Here is a good explanation of how to do that (it's not complicated).
How to convert linq results to HashSet or HashedSet
If you do this, you will need to replace the .ToList() in the first two lines with .ToHashSet()
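Putting the three steps together, a minimal sketch of the whole sync might look like the following. The Quote type, its Price property and the sample values are invented for illustration, and IDs are assumed to be unique within each collection (ToDictionary throws otherwise). The update step copies values onto the existing display object rather than replacing it, since the question says the displayed objects are bound and track their own changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var display = new List<Quote> { new Quote(1, 10m), new Quote(2, 20m) };
var service = new List<Quote> { new Quote(2, 25m), new Quote(3, 30m) };

// Index both sides by Id once, so every lookup below is O(1).
var serviceById = service.ToDictionary(s => s.Id);
var displayIds = new HashSet<int>(display.Select(d => d.Id));

// Remove: items in the display collection that the service no longer returns.
display.RemoveAll(d => !serviceById.ContainsKey(d.Id));

// Update: copy values onto the existing object so bindings keep the same instance.
foreach (var d in display)
    d.Price = serviceById[d.Id].Price;

// Add: items the service returned that were not displayed before.
display.AddRange(service.Where(s => !displayIds.Contains(s.Id)));

Console.WriteLine(string.Join(", ", display.Select(d => $"{d.Id}:{d.Price}")));

// Hypothetical display/service item.
class Quote
{
    public Quote(int id, decimal price) { Id = id; Price = price; }
    public int Id { get; }
    public decimal Price { get; set; }
}
```

This uses List<T> for brevity. ObservableCollection<T> has no RemoveAll or AddRange, so with that type the remove and add steps would have to be written as loops over a snapshot.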

For your comparison you need to override Equals and GetHashCode
Object.GetHashCode Method
Then you can use List.Contains
List.Contains Method
If you can use HashSet then you will get better performance
Code not tested
// RemoveAll (not Remove) is the List<T> method that takes a predicate.
ListDisplay.RemoveAll(x => !ListService.Contains(x));
foreach (ListItem li in ListDisplay)
{
    ListItem lis = ListService.FirstOrDefault(x => x.Equals(li));
    if (lis == null) continue;
    // perform update
}
foreach (ListItem li in ListService.Where(x => !ListDisplay.Contains(x))) ListDisplay.Add(li);

Difficulty in Removing Items From List?

I have two lists. The first is of all students and the second is of selected students. Once a student has been selected, I want them removed from the all-students list. Here is my code, but it doesn't work: the students don't get removed.
foreach (var li in ListSelectedStudents.ToList())
{
    if (ListAllStudents.Contains(li))
    {
        ListAllStudents.Remove(li);
    }
}

Contains uses equality to determine what is "equal". I am assuming that your custom class doesn't provide its own equality implementation, which means the default equality comparer is used for that type, and for a class that is just reference equality. So even though you think two things are "equal", the Contains method doesn't, and so the code never steps into the Remove call.
To get that particular code to behave what you need to do is provide an implementation of IEquatable<Student> on the Student class, as described in the remarks here.
In this instance, Contains isn't actually required as Remove will do the same checks. If there is nothing to remove, the Remove call will be transparent, effectively doing nothing.
As has been caught in the comments before I had chance to provide the information, Remove will also rely on IEquatable<Student> (docs) so you still need to provide an implementation, but it will make your code look a little cleaner:
foreach (var li in ListSelectedStudents.ToList())
{
    ListAllStudents.Remove(li);
}
There may be various ways to do this without the need to implement the interface, but you won't be able to use your current code for it. I'll leave other answers to field those alternatives as it's Friday and my brain is not yet functioning properly.
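For reference, a minimal sketch of what implementing IEquatable<Student> could look like. The Student type here is hypothetical, and treating two students as equal when their Id matches is an assumption; substitute whatever actually identifies a student in your data.

```csharp
using System;
using System.Collections.Generic;

var all = new List<Student> { new Student(1, "Ann"), new Student(2, "Bob"), new Student(3, "Cy") };
// Separate instances with the same Ids, as if they came from a second query.
var selected = new List<Student> { new Student(2, "Bob"), new Student(3, "Cy") };

foreach (var s in selected)
    all.Remove(s);   // matches are found through IEquatable<Student>.Equals

Console.WriteLine(all.Count);  // 1

// Hypothetical Student type; equality is defined by Id (an assumption).
sealed class Student : IEquatable<Student>
{
    public Student(int id, string name) { Id = id; Name = name; }
    public int Id { get; }
    public string Name { get; }

    public bool Equals(Student other) => other is not null && other.Id == Id;
    public override bool Equals(object obj) => Equals(obj as Student);
    public override int GetHashCode() => Id.GetHashCode();
}
```

Overriding the object-level Equals and GetHashCode alongside the interface keeps the type consistent when it is later used in a HashSet or as a dictionary key.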

Have you tried List<T>.RemoveAll with a predicate:
ListAllStudents.RemoveAll(m => ListSelectedStudents.Contains(m));
If it does not work, there could be something wrong with the default comparison implemented in the object. You could either fix the comparer or do something like:
ListAllStudents.RemoveAll(m => ListSelectedStudents.Any(n => n.Id == m.Id)); // Assume the Id is the primary key of the object...
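A small variation on the Id-based RemoveAll, as a sketch: collecting the selected Ids into a HashSet first avoids rescanning the selected list for every student. The Student type and sample data are hypothetical, and the type deliberately has no equality override, to show that comparing by Id does not need one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listAllStudents = new List<Student>
{
    new Student { Id = 1, Name = "Ann" },
    new Student { Id = 2, Name = "Bob" },
    new Student { Id = 3, Name = "Cy" },
};
// A different instance than the one above, so reference equality would fail.
var listSelectedStudents = new List<Student> { new Student { Id = 2, Name = "Bob" } };

// Build the lookup once so each membership check is O(1).
var selectedIds = new HashSet<int>(listSelectedStudents.Select(s => s.Id));

// RemoveAll returns the number of elements it removed.
int removed = listAllStudents.RemoveAll(s => selectedIds.Contains(s.Id));

Console.WriteLine($"{removed} removed, {listAllStudents.Count} left");

// Hypothetical type with no Equals/GetHashCode override.
class Student
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}
```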

Try this:
ListAllStudents = ListAllStudents.Where(a => !ListSelectedStudents.Contains(a)).ToList();

Real meaning of these linq querying styles

These LINQ queries can be written in either of two ways. But choosing which way seems really to be confusing task. Please explain the difference in performance (if any) of these commands.
from table1Details in objDataContext.Table1s where table1Details.SomeId == 15
select new {....};
from table1Details in objDataContext.GetTable<Table1>() where table1Details.SomeId == 15
select new {...};

That's not a difference related to LINQ at all. The data context is providing a property Table1s, which, internally, is just going to call GetTable<Table1>(). It's a convenience method with virtually no performance cost and absolutely no functional difference.

DataContext.GetTable Method
This method is the main entry point for querying. When a strongly
typed DataContext is created, new generated properties encapsulate
calls to this method. For example, a Customers property is
generated that returns GetTable<Customer>.
So in your case there is no difference. Your DataContext has a property Table1s; when you access it directly using objDataContext.Table1s, it calls objDataContext.GetTable<Table1>().

But choosing which way seems really to be confusing task.
Why? What is confusing about it? Don't both ways work? Did you read the manual?
This method is the main entry point for querying. When a strongly typed DataContext is created, new generated properties encapsulate calls to this method. For example, a Customers property is generated that returns GetTable<Customer>.
GetTable<T>() just provides a generic way to access datasets, with runtime lookup (GetTable<UnknownEntityType>() will throw), as opposed to the generated, compile-time checked properties.

How can I access a property dynamically by name without using reflection?

At first I was using:
sortedList = unsorted.AsParallel().OrderBy(myItem => TypeDescriptor.GetProperties(myItem)[firstSort.Item2].GetValue(myItem));
Where firstSort.Item2 was the string name of the property. However, the performance degraded significantly as the number of items in the unsorted list increased. (As I expected)
Is there a way to do this without using reflection?
The brute force approach would be to do something like:
if(firstSort.Item2 == "Size")
sortedList = unsorted.AsParallel().OrderBy(myItem => myItem.Size);
else if(firstSort.Item2 == "Price")
sortedList = unsorted.AsParallel().OrderBy(myItem => myItem.Price);
...
I'm looking for something that would accomplish the above behavior, but without having to hardcode in all the different properties in the interface.

Everything you use that doesn't involve a hard-coded list of actual properties, will be using Reflection "behind the scenes".

You can use Expression<T> to pre-compile the expressions that you're passing to OrderBy. Then you can look them up at runtime.
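A minimal sketch of that idea, assuming a hypothetical Product type and a made-up PropertyGetter helper: the getter for a property name is built as an expression tree, compiled to a delegate once, and cached, so sorting does no per-item reflection. Building the expression still looks the property up by name once, which fits the point made above that reflection is involved somewhere.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var unsorted = new List<Product>
{
    new Product("b", 3, 9.5m),
    new Product("a", 1, 20m),
    new Product("c", 2, 5m),
};

// The property name would come from firstSort.Item2 in the question.
var bySize = unsorted.OrderBy(PropertyGetter<Product>.For("Size")).ToList();
var byPrice = unsorted.OrderBy(PropertyGetter<Product>.For("Price")).ToList();

Console.WriteLine(string.Join(",", bySize.Select(p => p.Name)));   // a,c,b
Console.WriteLine(string.Join(",", byPrice.Select(p => p.Name)));  // c,b,a

// Hypothetical helper: compiles "item => (object)item.<name>" once per property name.
static class PropertyGetter<T>
{
    private static readonly ConcurrentDictionary<string, Func<T, object>> Cache = new();

    public static Func<T, object> For(string propertyName) =>
        Cache.GetOrAdd(propertyName, name =>
        {
            var item = Expression.Parameter(typeof(T), "item");
            var property = Expression.Property(item, name);           // throws if the name is unknown
            var boxed = Expression.Convert(property, typeof(object)); // box value types
            return Expression.Lambda<Func<T, object>>(boxed, item).Compile();
        });
}

// Hypothetical item type.
class Product
{
    public Product(string name, int size, decimal price) { Name = name; Size = size; Price = price; }
    public string Name { get; }
    public int Size { get; }
    public decimal Price { get; }
}
```

Because the key is typed as object, OrderBy falls back to the values' IComparable implementation, so this only works for properties whose type is comparable.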

You can create the PropertyInfo once and use it to call GetValue over multiple target objects. This will be much less expensive than calling TypeDescriptor.GetProperties for every item in the list.
Also, try removing AsParallel - the overhead may actually be reducing performance rather than helping it in this case.
Try this:
var prop = unsorted.GetType().GetGenericArguments()[0].GetProperty(firstSort.Item2);
sortedList = unsorted.OrderBy(myItem => prop.GetValue(myItem, null));

If you implement ICustomTypeDescriptor in your class, then you can avoid the reflection when using TypeDescriptor.
Of course, I'm assuming you own the type of myItem.

Your best bet is to use the Dynamic LINQ library provided by Microsoft.
Here is a link: http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx

I like Roger's answer the best for a true solution, but if you want something simple, you can build a small code generator to take the class and break out its properties into a dictionary of string to lambda, representing each property. At runtime, you could call from this dictionary to retrieve the appropriate lambda.
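As a rough illustration of what such a generator might emit, here is a hand-written version of the dictionary for a hypothetical Product type. The property names and sample data are invented; in practice the dictionary initializer would be the generated part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// What a generator might emit for the hypothetical Product type below.
var getters = new Dictionary<string, Func<Product, object>>
{
    ["Name"] = p => p.Name,
    ["Size"] = p => p.Size,
    ["Price"] = p => p.Price,
};

var unsorted = new List<Product>
{
    new Product { Name = "b", Size = 3, Price = 9.5m },
    new Product { Name = "a", Size = 1, Price = 20m },
    new Product { Name = "c", Size = 2, Price = 5m },
};

string sortProperty = "Price";   // stands in for firstSort.Item2
var sortedList = unsorted.OrderBy(getters[sortProperty]).ToList();

Console.WriteLine(string.Join(",", sortedList.Select(p => p.Name)));  // c,b,a

// Hypothetical item type.
class Product
{
    public string Name { get; set; } = "";
    public int Size { get; set; }
    public decimal Price { get; set; }
}
```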

You can use the DLR. The open source framework Impromptu-Interface does all the DLR plumbing behind the scenes and gets the value of a property 2.5x faster than reflection.
sortedList = unsorted.AsParallel().OrderBy(myItem => Impromptu.InvokeGet(myItem,firstSort.Item2));

Common problem for me in C#, is my solution good, stupid, reasonable? (Advanced Beginner)

Ok, understand that I come from Cold Fusion so I tend to think of things in a CF sort of way, and C# and CF are as different as can be in general approach.
So the problem is: I want to pull a "table" (that's how I think of it) of data from a SQL database via LINQ and then I want to do some computations on it in memory. This "table" contains 6 or 7 values of a couple of different types.
Right now, my solution is that I do the LINQ query using a generic List of a custom type. So my example is the RelevanceTable. I pull some data out and want to do some evaluation of it, which starts with .Contains. It appears that .Contains wants to act on the whole item or nothing. So I can use it if I have List<string>, but if I have List<ReferenceTableEntry>, where ReferenceTableEntry is my custom type, I would need to implement IEquatable and tell the compiler what exactly "Equals" means.
While this doesn't seem unreasonable, it does seem like a long way to go for a simple problem, so I have this sneaking suspicion that my approach is flawed from the get-go.
If I want to use LINQ and .Contains, is implementing the interface the only way? It seems like there should just be a way to say which field to operate on. Is there another collection type besides List that maybe has this ability? I have started using List a lot for this, and while I have looked and looked, I see some other but not necessarily superior approaches.
I'm not looking for some fine point of performance or compactness or readability, just wondering if I am using a Phillips head screwdriver on a hex screw. If my approach is a "decent" one but not the best, of course I'd like to know a better one, but just knowing that it's in the ballpark would give me a little "Yeah! I'm not stupid!" and I would at least finish what I am doing completely before switching to another method.
Hope I explained that well enough. Thanks for your help.

What exactly is it you want to do with the table? It isn't clear. However, the standard LINQ (-to-Objects) methods will be available on any typed collection (including List<T>), allowing any range of Where, First, Any, All, etc.
So: what is it you are trying to do? If you had the table, what value(s) do you want?
As a guess (based on the Contains stuff) - do you just want:
bool found = table.Any(x => x.Foo == foo); // or someObj.Foo
?

Some methods in the List class have overloads that take a delegate (optionally in the form of a lambda expression), which you can use to specify what field to look for.
For example, to look for the item where the Id property is 42:
ReferenceTableEntry found = theList.Find(r => r.Id == 42);
The found variable will have a reference to the first item that matches, or null if no item matched.
There are also some LINQ extensions that take a delegate or an expression. This will do the same as the Find method:
ReferenceTableEntry found = theList.FirstOrDefault(r => r.Id == 42);

Ok, so if I'm reading this correctly you want to use the contains method. When using this with collections of objects (such as ReferenceTableEntry) you need to be careful because what you're saying is you're checking to see if the collection contains an object that IS the same as the object you're comparing against.
If you use the .Find() or .FindAll() method you can specify the criteria that you want to match on using an anonymous method.
So for example if you want to find all ReferenceTableEntry records in your list that have an Id greater than 1 you could do something like this
List<ReferenceTableEntry> listToSearch = //populate list here
var matches = listToSearch.FindAll(x => x.Id > 1);
matches will be a list of ReferenceTableEntry records that have an ID greater than 1.
Having said all that, it's not completely clear that this is what you're trying to do.

Here is the LINQ query involved that creates the object I am talking about, and the problem line is:
.Where (searchWord => queryTerms.Contains(searchWord.Word))
List<queryTerm> queryTerms = MakeQueryTermList();
public static List<RelevanceTableEntry> CreateRelevanceTable(List<queryTerm> queryTerms)
{
SearchDataContext myContext = new SearchDataContext();
var productRelevance = (from pwords in myContext.SearchWordOccuranceProducts
where (myContext.SearchUniqueWords
.Where (searchWord => queryTerms.Contains(searchWord.Word))
.Select (searchWord => searchWord.Id)).Contains(pwords.WordId)
orderby pwords.WordId
select new {pwords.WordId, pwords.Weight, pwords.Position, pwords.ProductId});
}
This query returns a list of WordIds that match the submitted search string (when it was List<string> and it was just the word, that worked fine because, as an answerer mentioned before, they were the same type of objects). My custom type here is queryTerms, a List that contains WordId, ProductId, Position, and Weight. From there I go about calculating the relevance by doing various operations on the created object: sum "Weight" by product, use position matches to bump up Weights, etc. My point in keeping this separate was that the rules for doing those operations will change, but the basic factors involved will not. I would even rather it be MORE separate (I'm still learning, I don't want to get fancy), but the rules for local and interpreted LINQ queries seem to trip me up when I do.
Since CF has supported queries of queries forever, that's how I tend to lean. Pull the data you need from the db, then do your operations (which includes queries with Aggregate functions) on the in-memory table.
I hope that makes it more clear.
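One possible way around the problem line, sketched here against in-memory data: project the custom list down to plain strings first, so that Contains compares string to string again, as it did in the List<string> version. The QueryTerm type with a Word property and the sample words are assumptions, not taken from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var queryTerms = new List<QueryTerm> { new QueryTerm("red"), new QueryTerm("shoe") };

// In-memory stand-in for myContext.SearchUniqueWords.
var searchUniqueWords = new[]
{
    (Id: 1, Word: "red"),
    (Id: 2, Word: "blue"),
    (Id: 3, Word: "shoe"),
};

// Project to strings once; Contains then needs no custom equality.
var words = queryTerms.Select(q => q.Word).ToList();

var matchingIds = searchUniqueWords
    .Where(searchWord => words.Contains(searchWord.Word))
    .Select(searchWord => searchWord.Id)
    .ToList();

Console.WriteLine(string.Join(",", matchingIds));  // 1,3

// Hypothetical shape of the custom query term type.
class QueryTerm
{
    public QueryTerm(string word) { Word = word; }
    public string Word { get; }
}
```

The sketch runs against LINQ to Objects only. Whether the same projection behaves as intended inside the database query would need to be checked against the actual data context.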
