CollectionChanged and IList of Items - why the difficulties - c#

I am looking into why an ObservableCollection/ListCollectionView/CollectionView raises a NotSupportedException when CollectionChanged is raised with an IList of items as the parameter.
//Throws an exception
private void collectionChanged_Removed(IList items)
{
if (CollectionChanged != null)
CollectionChanged(this, new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Remove, items));
}
I have found several web pages discussing this topic. They suggest either using a Reset notification to force a complete redraw of the UI, simply raising CollectionChanged once for each item, or some more creative approach: http://geekswithblogs.net/NewThingsILearned/archive/2008/01/16/listcollectionviewcollectionview-doesnt-support-notifycollectionchanged-with-multiple-items.aspx
I just can't find the WHY.
To me it makes no sense that this would be the case.
Is there any chance that this missing feature will be implemented any time soon (.NET 5, C# 6...)? We all run into it at some point in development, since the Add method has too much overhead when you want to add many items quickly.
Edit:
In my specific case, I have written my own class :
public class ObservableList<T> : IList<T>, IList, IEnumerable<T>,
INotifyCollectionChanged
{
public event NotifyCollectionChangedEventHandler CollectionChanged;
//other stuff...
}
And it still throws the said NotSupportedException.

Inspired by VirtualBlackFox's answer, I took a look under the hood of the CollectionView classes in ILSpy. It appears that the primary reason for the lack of support for range operations is that, internally, the CollectionView uses a change log to centrally manage pending changes of all kinds and dispatches messages on a per-item basis.
By its very purpose, the CollectionView could store 1000s of records used simultaneously with multiple UI controls representing its underlying data. So, adding or deleting records must be done on an atomic basis to maintain the integrity of the UI controls that access the view information. You can't synchronize incremental changes with multiple UI subscribers using bulk change events without passing the grouping, sorting, and filtering functionality of the CollectionView onto the UI controls that use it.
The CollectionView also derives from System.Windows.Threading.DispatcherObject, so the issue may be related to how it manages work items on its thread. The call path includes a protected ProcessCollectionChanged method that specifically processes individual changes on the UI thread. So, updating ranges may interfere with the whole threading model it uses to interact with the UI elements that use it.
I totally agree that having consumers of the CollectionView pass in an IList to NotifyCollectionChangedEventArgs is silly. It specifically rejects anything with a length != 1 and hard-codes for args.NewItems[0] internally.

As @nmclean said in the comments, the problem isn't in the collection emitting CollectionChanged but on the receiving end.
If you look at the code of ListCollectionView (using dotPeek, for example, or the reference source that newer versions of Visual Studio can access), you will notice that each time the attached collection changes it calls a method, ValidateCollectionChangedEventArgs, that throws when more than one element has changed:
private void ValidateCollectionChangedEventArgs(NotifyCollectionChangedEventArgs e)
{
switch (e.Action)
{
case NotifyCollectionChangedAction.Add:
if (e.NewItems.Count == 1)
break;
else
throw new NotSupportedException(System.Windows.SR.Get("RangeActionsNotSupported"));
...
The rest of the class and its CollectionView base class are already big beasts (2710 and 2027 lines in the published reference source), so Microsoft might have wanted to avoid supporting a complex case that the collections they recommend never create anyway.
(The method handling collection change is already 141 lines in the reference source code and adding multi element support will make it grow even more or need a careful split and potentially break other things...)
I didn't find any suggestions linked to adding support for range events in Microsoft Connect but you should submit your own if it is important for you.
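One of the workarounds mentioned in the question, raising one event per item, could look roughly like the sketch below. It assumes a simplified stand-in for the question's ObservableList<T> (backed by a List<T>, most members omitted), and RemoveRange is a hypothetical helper name, not a framework method. Each event carries exactly one item and its index, which is the shape ValidateCollectionChangedEventArgs accepts.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;

// Simplified, hypothetical stand-in for the ObservableList<T> in the question.
public class ObservableList<T> : INotifyCollectionChanged
{
    private readonly List<T> _items = new List<T>();

    public event NotifyCollectionChangedEventHandler CollectionChanged;

    public int Count { get { return _items.Count; } }

    public void Add(T item)
    {
        _items.Add(item);
        Raise(new NotifyCollectionChangedEventArgs(
            NotifyCollectionChangedAction.Add, (object)item, _items.Count - 1));
    }

    // Hypothetical helper: removes several items but raises one
    // single-item Remove event per removed element.
    public void RemoveRange(IEnumerable<T> items)
    {
        if (items == null) throw new ArgumentNullException("items");

        // Copy first so the caller may pass a sequence that depends on this list.
        foreach (T item in new List<T>(items))
        {
            int index = _items.IndexOf(item);
            if (index < 0) continue; // not present, nothing to report

            _items.RemoveAt(index);
            Raise(new NotifyCollectionChangedEventArgs(
                NotifyCollectionChangedAction.Remove, (object)item, index));
        }
    }

    private void Raise(NotifyCollectionChangedEventArgs e)
    {
        NotifyCollectionChangedEventHandler handler = CollectionChanged;
        if (handler != null) handler(this, e);
    }
}
```

This keeps a bound view working, but it does not remove the per-item overhead the question complains about; for large batches a single Reset notification may be cheaper.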

I guess this is mainly for performance reasons. I was also shocked when I saw that CollectionView also does not accept the -1 value for NewStartingIndex or OldStartingIndex. So basically, CollectionView always wants a mapping from items to their indices. However, it does not require this mapping to be exact (which is strange from my point of view), it is allowed that NewStartingIndex is smaller than the correct index (unless it is -1).
I think the root of the problem is that large parts within Microsoft still think that a list is the one and only way to implement a collection, which of course just is not true. Here, the creators of NotifyCollectionChangedEventArgs thought about alternatives (such as linked lists or hashing collections), but the UI guys did not. Or at least they did not want to support these collections as they appear rather rarely.

The temporary solution is useless. It only hides the problems.
The solution could lie in making events where you provide an entire new list to the observers. That way Microsoft won't have to implement efficient range handlers for each type of observer.

Related

Best practice for interface to allow adding, deleting etc. child objects w/ broadcasting events (similar to ObservableCollection)

I'm trying to specify an interface for a Folder. That interface should allow to
- Add or delete files of type IFile
- Get a list of IFile
- Broadcast events whenever a file was added/deleted/changed (e.g. for the GUI to subscribe to)
and I'm trying to find the best way to do it. So far, I came up with three ideas:
1
public interface IFolder_v1
{
ObservableCollection<IFile> files { get; } // interfaces cannot declare fields, so this is a property
}
2
public interface IFolder_v2
{
void add(IFile file);
void remove(IFile file);
IEnumerable<IFile> files { get; }
EventHandler OnFileAdded { get; }
EventHandler OnFileRemoved { get; }
EventHandler OnFileDeleted { get; }
}
3
public interface IFolder_v3
{
void add(IFile file);
void remove(IFile file);
IEnumerable<IFile> files { get; }
EventHandler<CRUD_EventArgs> OnFilesChanged { get; }
}
public class CRUD_EventArgs : EventArgs
{
public enum Operations
{
added,
removed,
updated
}
private Operations _op;
public CRUD_EventArgs(Operations operation)
{
this._op = operation;
}
public Operations operation
{
get
{
return this._op;
}
}
}
Idea #1 seems really nice to implement as it doesn't require much code, but it has some problems: what, for example, if an implementation of IFolder only allows files of specific types to be added (say, text files), and throws an exception whenever another kind of file is added? I don't think that would be feasible with a simple ObservableCollection.
Idea #2 seems ok, but requires more code. Also, defining three separate events seems a bit tedious - what if an object needs to subscribe to all events? We'd need to subscribe to 3 different eventhandlers for that. Seems annoying.
Also a little less easy to use than solution #1 as now, one needs to call .Add to add files, but a list of files is stored in .files etc. - so the naming conventions are a bit less clear than having everything bundled up in one simple sub-object (.files from idea #1).
Idea #3 circumvents all of those problems, but has the longest code. Also, I have to use a custom EventArgs class, which I can't imagine is particularly clean in an interface definition? (Also seems overkill to define a class like that for simple CRUD event notifications, shouldn't there be an existing class of some sort?)
Would appreciate some feedback on what you think is the best solution (possibly even something I haven't thought of at all). Is there any best practice?
Take a look at the Framework's FileSystemWatcher class. It does pretty much what you need, but if anyway you still need to implement your own class, you can take ideas by looking at how it is implemented (which is by the way similar to your #2 approach).
Having said that, I personally think that #3 is also a very valid approach. Don't be afraid of writing long code (within reasonable limits of course) if the result is more readable and maintainable than it would be with shorter code.
Personally I would go with #2.
In #1 you just expose a entire collection of objects, allowing everyone to do anything with them.
#3 seems less self explanatory to me. Though - I like to keep thing simple when coding so I may be biased.
If watchers are going to be shorter-lived than the thing being watched, I would avoid events. The pattern exemplified by IObservable<T>, where the observed object gives a subscribing observer an IDisposable object which can be used to unsubscribe, is a much better approach. If you use such a pattern, you can have your class hold a weak reference (probably a "long" weak reference) to the subscription object, which would in turn hold a strong reference (probably a delegate) to the subscriber and to the weak reference which identifies it. Abandoned subscriptions will thus get cleaned up by the garbage collector; it will be the duty of a subscriber to ensure that a strongly-rooted reference exists to the subscription object.
Beyond the fact that abandoned subscriptions can get cleaned up, another advantage of using the "disposable subscription-object" approach is that unsubscription can easily be made lock-free and thread-safe, and run in constant time. To dispose a subscription, simply null out the delegate contained therein. If each attempt to add a subscription causes the subscription manager to inspect a couple of subscriptions to ensure that they are still valid, the total number of subscriptions in existence will never grow to more than twice the number that were valid as of the last garbage collection.

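The "disposable subscription object" idea from the last answer above can be sketched as follows. All names here (Subscription<T>, Publisher<T>, Subscribe, Publish) are hypothetical, and the sketch simplifies two things: it uses ordinary (short) weak references, and it prunes every dead subscription on each Subscribe call rather than inspecting only a couple of them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical subscription object: the subscriber keeps it alive and disposes it to unsubscribe.
public sealed class Subscription<T> : IDisposable
{
    private volatile Action<T> _handler;

    public Subscription(Action<T> handler) { _handler = handler; }

    public bool IsActive { get { return _handler != null; } }

    // Unsubscribing only clears the delegate: constant time, no lock required.
    public void Dispose() { _handler = null; }

    public void Notify(T value)
    {
        Action<T> handler = _handler; // read once so a concurrent Dispose cannot null it mid-call
        if (handler != null) handler(value);
    }
}

// Hypothetical publisher that only holds weak references to its subscriptions.
public sealed class Publisher<T>
{
    private readonly object _gate = new object();
    private readonly List<WeakReference> _subscriptions = new List<WeakReference>();

    public IDisposable Subscribe(Action<T> handler)
    {
        if (handler == null) throw new ArgumentNullException("handler");

        var subscription = new Subscription<T>(handler);
        lock (_gate)
        {
            // Drop subscriptions that were disposed or garbage collected.
            _subscriptions.RemoveAll(weak =>
            {
                var existing = weak.Target as Subscription<T>;
                return existing == null || !existing.IsActive;
            });
            _subscriptions.Add(new WeakReference(subscription));
        }
        return subscription; // the caller must keep this reference alive
    }

    public void Publish(T value)
    {
        WeakReference[] snapshot;
        lock (_gate) { snapshot = _subscriptions.ToArray(); }

        foreach (WeakReference weak in snapshot)
        {
            var subscription = weak.Target as Subscription<T>;
            if (subscription != null) subscription.Notify(value);
        }
    }
}
```

A subscriber that drops the returned IDisposable without disposing it will simply stop receiving notifications at some point after the next garbage collection, which is the cleanup behaviour the answer describes.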
Collection properties should be read only - Loophole?

In the process of fixing code analysis warnings, I'm changing my properties to have private setters. Then I started trying to understand the reasoning a bit more. From some research, MS says this:
A writable collection property allows a user to replace the collection with a completely different collection.
And the answer, here, states:
Adding a public setter on a List<T> object is dangerous.
But the reason why it's dangerous is not listed. And that's the part where I'm curious.
If we have this collection:
public List<Foo> Foos { get; set; }
Why make the setter private? Apparently we don't want client code to replace the collection, but if a client can remove every element, and then add whatever they want, what's the point? Is that not the same as replacing the collection entirely? How is value provided by following this code analysis rule?
Not exposing the setter prevents a situation where the collection is assigned a value of null. There's a difference between null and a collection without any values. Consider:
foreach (var value in this.myCollection) { /* do something */ }
When there are no values (i.e. someone has called Remove on every value), nothing bad happens. When this.myCollection is null, however, a NullReferenceException will be thrown.
Code Analysis is making the assumption that your code doesn't check that myCollection is null before operating on it.
It's probably also an additional safeguard for the thread-safe collection types defined in System.Collections.Concurrent. Imagine some thread trying to replace the entire collection by overwriting it. Without a public setter, the only option the thread has is to call the thread-safe Add and Remove methods.
If you're exposing an IList (which would be better practice) the consumer could replace the collection with an entirely different class that implements IList, which could have unpredictable effects. You could have subscribed to events on that collection, or on items in that collection that you're now incorrectly responding to.
In addition to SimpleCoder's null checking (which is, of course, important), there are other things you need to consider:
1. Someone could replace the List, causing big problems in thread safety.
2. Events raised by a replaced List won't be sent to subscribers of the old one.
3. You're exposing much, much more behavior than you need to. For example, I wouldn't even make the getter public.
To clarify point 3, don't do cust.Orders.clear(), but make a function called clearOrders() instead.
What if a customer isn't allowed to go over a credit limit? You have no control over that if you expose the list. You'd have to check that (and every other piece of business logic) every place where you might add an order. Yikes! That's a lot of potential for bugs. Instead, you can place it all in an addOrder(Order o) function and be right as rain.
For almost every (I'd say every, but sometimes cheating feels good...) business class, every property should be private for get and set, and if feasible make them readonly too. In this way, users of your class get only behaviors. Protect as much of your data as you can!
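The advice above (expose behaviors such as addOrder and clearOrders instead of the list itself) might look like the following sketch. The Order and Customer shapes, the Amount property, and the credit-limit rule are assumptions made up for illustration; only the idea of funnelling all changes through one method comes from the answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical order type with just enough data for the example.
public class Order
{
    public decimal Amount { get; private set; }

    public Order(decimal amount) { Amount = amount; }
}

public class Customer
{
    private readonly List<Order> _orders = new List<Order>();
    private readonly decimal _creditLimit;
    private decimal _balance;

    public Customer(decimal creditLimit) { _creditLimit = creditLimit; }

    // Callers can look at the orders but get no Add, Remove or Clear.
    public IEnumerable<Order> Orders
    {
        get { return _orders.AsReadOnly(); }
    }

    // The single place where the business rule is enforced.
    public void AddOrder(Order order)
    {
        if (order == null) throw new ArgumentNullException("order");
        if (_balance + order.Amount > _creditLimit)
            throw new InvalidOperationException("Order would exceed the credit limit.");

        _orders.Add(order);
        _balance += order.Amount;
    }

    public void ClearOrders()
    {
        _orders.Clear();
        _balance = 0m;
    }
}
```

Because the list field is readonly and never handed out directly, Orders can never be null and can never be swapped for a different collection.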
ReadOnlyCollection and ReadOnlyObservableCollection exist precisely for read-only collection scenarios.
ReadOnlyObservableCollection is very useful for one way binding in WPF/Silverlight/Metro apps.
If you have a Customer class with a List property, then this property should always have a private setter; otherwise it can be changed from outside the customer object via:
customer.Orders = new List<Order>();
// this could overwrite data.
Always use the add and remove methods of the collection.
The Orders List should be created inside the Customer constructor via:
Orders = new List<Order>();
Do you really want to check everywhere in your code whether customer.Orders != null before operating on the Orders?
Or you can create the Orders property in your customer object as suggested and never check for customer.Orders == null; instead, just enumerate the Orders, and if its count is zero nothing happens.

Transaction support in an observable collection

I'm interested in the most efficient way to change an observable collection so that only one change notification is fired. Let's say that I want to populate the list with 3 items. There is no addCollection method or anything like that, so I have to do a clear plus 3 adds. Do I need to create a different observable collection and assign it? Or what techniques do others use?
The .NET Framework's ObservableCollection class sends an individual notification as each item is added to the collection and provides no mechanism for AddRange-type functionality. However, you can very easily create your own collection that implements INotifyCollectionChanged and sends whatever notifications you like.
One issue you may encounter is that the INotifyCollectionChanged interface includes the ability to specify that multiple items were added to the collection in a single message, but no standard .NET Framework classes actually create these notifications. Because of this, some third-party and open source controls assume only one item has been added when they receive an Add notification. Even the built-in .NET Framework classes may have undiscovered bugs related to this.
For these reasons I would recommend that your custom collection have a mode in which it can be set to always send a Reset notification at the end of an AddRange instead of a single multi-item Add notification. You could optimize this further by sending either multiple single-item Add notifications or a Reset notification, depending on the actual number of items added.
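A minimal sketch of the Reset-based variant, assuming a subclass of ObservableCollection<T>. The class name RangeObservableCollection and the AddRange method are hypothetical; Items, CheckReentrancy, OnPropertyChanged and OnCollectionChanged are the protected members the base classes provide.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

// Hypothetical collection that adds many items and notifies once.
public class RangeObservableCollection<T> : ObservableCollection<T>
{
    public void AddRange(IEnumerable<T> items)
    {
        if (items == null) throw new ArgumentNullException("items");

        CheckReentrancy();

        // Copy first so passing this collection itself does not break enumeration.
        var snapshot = new List<T>(items);
        if (snapshot.Count == 0) return;

        // Items is the underlying list; adding to it raises no events.
        foreach (T item in snapshot)
        {
            Items.Add(item);
        }

        // One set of notifications for the whole batch.
        OnPropertyChanged(new PropertyChangedEventArgs("Count"));
        OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
        OnCollectionChanged(
            new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset));
    }
}
```

A Reset tells listeners to re-read the whole collection, so it avoids the multi-item Add notifications that some consumers reject, at the cost of a full refresh in bound views.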
Of course there are situations in which it is just as easy to replace the ObservableCollection with a new one. At times this will be much less efficient than looping Add() because event handlers and CollectionViews are rebuilt. Other times it will be more efficient if the collection is large and your loop only adds a few items at a time.
And sometimes it won't work at all.

Return collection as read-only

I have an object in a multi-threaded environment that maintains a collection of information, e.g.:
public IList<string> Data
{
get
{
return data;
}
}
I currently have return data; wrapped by a ReaderWriterLockSlim to protect the collection from sharing violations. However, to be doubly sure, I'd like to return the collection as read-only, so that the calling code is unable to make changes to the collection, only view what's already there. Is this at all possible?
If your underlying data is stored as a List<T>, you can use the List<T>.AsReadOnly method.
If your data can only be enumerated, you can use the Enumerable.ToList method to copy your collection into a List<T> and call AsReadOnly on it.
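A minimal sketch of the AsReadOnly suggestion, assuming a hypothetical DataHolder class around the question's data field. The locking from the question is left out to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical owner of the collection from the question.
public class DataHolder
{
    private readonly List<string> data = new List<string>();

    // Only the owner can modify the underlying list.
    public void Add(string value) { data.Add(value); }

    // AsReadOnly returns a ReadOnlyCollection<string> wrapper around the same list.
    public IList<string> Data
    {
        get { return data.AsReadOnly(); }
    }
}

public static class DataHolderDemo
{
    public static bool CallerCannotAdd()
    {
        var holder = new DataHolder();
        holder.Add("first");

        try
        {
            holder.Data.Add("second"); // the wrapper rejects modification
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

The wrapper is a live view, not a copy: items the owner adds later show up in a wrapper that was handed out earlier, so this protects against modification by callers but does nothing for thread safety.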
I voted for your accepted answer and agree with it--however might I give you something to consider?
Don't return a collection directly. Make an accurately named business logic class that reflects the purpose of the collection.
The main advantage of this comes in the fact that you can't add code to collections so whenever you have a native "collection" in your object model, you ALWAYS have non-OO support code spread throughout your project to access it.
For instance, if your collection was invoices, you'd probably have 3 or 4 places in your code where you iterated over unpaid invoices. You could have a getUnpaidInvoices method. However, the real power comes in when you start to think of methods like "payUnpaidInvoices(payer, account);".
When you pass around collections instead of writing an object model, entire classes of refactorings will never occur to you.
Note also that this makes your problem particularly nice. If you don't want people changing the collections, your container need contain no mutators. If you decide later that in just one case you actually HAVE to modify it, you can create a safe mechanism to do so.
How do you solve that problem when you are passing around a native collection?
Also, native collections can't be enhanced with extra data. You'll recognize this next time you find that you pass in (Collection, Extra) to more than one or two methods. It indicates that "Extra" belongs with the object containing your collection.
If your only intent is to keep calling code from making a mistake and modifying the collection when it should only be reading, all that is necessary is to return an interface which doesn't support Add, Remove, etc. Why not return IEnumerable<string>? Calling code would have to cast, which it is unlikely to do without knowing the internals of the property it is accessing.
If however your intent is to prevent the calling code from observing updates from other threads you'll have to fall back to solutions already mentioned, to perform a deep or shallow copy depending on your need.
I think you're confusing concepts here.
The ReadOnlyCollection provides a read-only wrapper for an existing collection, allowing you (Class A) to pass out a reference to the collection safe in the knowledge that the caller (Class B) cannot modify the collection (i.e. cannot add or remove any elements from the collection.)
There are absolutely no thread-safety guarantees.
If you (Class A) continue to modify the underlying collection after you hand it out as a ReadOnlyCollection then class B will see these changes, have any iterators invalidated, etc. and generally be open to any of the usual concurrency issues with collections.
Additionally, if the elements within the collection are mutable, both you (Class A) and the caller (Class B) will be able to change any mutable state of the objects within the collection.
Your implementation depends on your needs:
- If you don't need the caller (Class B) to see any further changes to the collection, then you can just clone the collection, hand it out, and stop caring.
- If you definitely need the caller (Class B) to see changes that are made to the collection, and you want this to be thread-safe, then you have more of a problem on your hands. One possibility is to implement your own thread-safe variant of the ReadOnlyCollection to allow locked access, though this will be non-trivial and non-performant if you want to support IEnumerable, and it still won't protect you against mutable elements in the collection.
One should note that aku's answer will only protect the list as being read only. Elements in the list are still very writable. I don't know if there is any way of protecting non-atomic elements without cloning them before placing them in the read only list.
You can use a copy of the collection instead.
public IList<string> Data {
get {
return new List<string>(data); // copy, so the caller never touches the original list
}}
That way it doesn't matter if it gets updated.
You want to use the yield keyword. You loop through the list and return the results with yield return. This allows the consumer to use foreach without being able to modify the collection.
It would look something like this:
List<string> _Data;
public IEnumerable<string> Data
{
get
{
foreach(string item in _Data)
{
yield return item;
}
}
}

Threadsafe foreach enumeration of lists

I need to enumerate through a generic IList<> of objects. The contents of the list may change, as items are added or removed by other threads, and this will kill my enumeration with a "Collection was modified; enumeration operation may not execute."
What is a good way of doing a threadsafe foreach on an IList<>? Preferably without cloning the entire list. It is not possible to clone the actual objects referenced by the list.
Cloning the list is the easiest and best way, because it ensures your list won't change out from under you. If the list is simply too large to clone, consider putting a lock around it that must be taken before reading/writing to it.
There is no such operation. The best you can do is
lock(collection){
foreach (object o in collection){
...
}
}
Your problem is that an enumeration does not allow the IList to change. This means you have to avoid this while going through the list.
A few possibilities come to mind:
Clone the list. Now each enumerator has its own copy to work on.
Serialize the access to the list. Use a lock to make sure no other thread can modify it while it is being enumerated.
Alternatively, you could write your own implementation of IList and IEnumerator that allows the kind of parallel access you need. However, I'm afraid this won't be simple.
ICollection MyCollection;
// Instantiate and populate the collection
lock(MyCollection.SyncRoot) {
// Some operation on the collection, which is now thread safe.
}
From MSDN
You'll find that's a very interesting topic.
The best approach relies on the ReadWriteResourceLock, which used to have big performance issues due to the so-called convoy problem.
The best article I've found treating the subject is this one by Jeffrey Richter which exposes its own method for a high performance solution.
So the requirements are: you need to enumerate through an IList<> without making a copy while other threads are simultaneously adding and removing elements.
Could you clarify a few things? Are insertions and deletions happening only at the beginning or end of the list?
If modifications can occur at any point in the list, how should the enumeration behave when elements are removed or added near or on the location of the enumeration's current element?
This is certainly doable by creating a custom IEnumerable object with perhaps an integer index, but only if you can control all access to your IList<> object (for locking and maintaining the state of your enumeration). But multithreaded programming is a tricky business under the best of circumstances, and this is a complex problem.
Foreach depends on the fact that the collection will not change. If you want to iterate over a collection that can change, use the normal for construct and be prepared for nondeterministic behavior. Locking might be a better idea, depending on what you're doing.
The default behavior for a simple data structure like a linked list, B-tree, or hash table is to enumerate in order from the first element to the last. It would not cause a problem to insert an element at a point the iterator had already passed, or to insert one that the iterator would enumerate once it arrived there, and such an event could be detected by the application and handled if the application required it. Detecting a change in the collection and throwing an error during enumeration was, as far as I can imagine, someone's (bad) idea of what the programmer would want. Indeed, Microsoft has fixed their collections to work correctly. They have called their shiny new unbroken collections the concurrent collections (System.Collections.Concurrent) in .NET 4.0.
I recently spent some time multi-threading a large application and had a lot of issues with foreach operating on lists of objects shared across threads.
In many cases you can use the good old for-loop and immediately assign the object to a copy to use inside the loop. Just keep in mind that all threads writing to the objects of your list should write to different data of the objects. Otherwise, use a lock or a copy as the other contributors suggest.
Example:
foreach(var p in Points)
{
// work with p...
}
Can be replaced by:
for(int i = 0; i < Points.Count; i ++)
{
Point p = Points[i];
// work with p...
}
Wrap the list in a locking object for reading and writing. You can even iterate with multiple readers at once if you have a suitable lock, that allows multiple concurrent readers but also a single writer (when there are no readers).
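The reader/writer wrapper described above could be sketched like this, using ReaderWriterLockSlim, which allows several concurrent readers or a single writer. The class name LockedList<T> and its members are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: all access to the inner list goes through the lock.
public sealed class LockedList<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public void Add(T item)
    {
        _lock.EnterWriteLock();
        try { _items.Add(item); }
        finally { _lock.ExitWriteLock(); }
    }

    public bool Remove(T item)
    {
        _lock.EnterWriteLock();
        try { return _items.Remove(item); }
        finally { _lock.ExitWriteLock(); }
    }

    // Enumerates under a read lock: other readers may run at the same time,
    // writers wait until every reader has finished.
    public void ForEach(Action<T> action)
    {
        if (action == null) throw new ArgumentNullException("action");

        _lock.EnterReadLock();
        try
        {
            foreach (T item in _items)
            {
                action(item);
            }
        }
        finally { _lock.ExitReadLock(); }
    }
}
```

The callback passed to ForEach must not call Add or Remove on the same list, because the read lock is still held at that point. Long-running callbacks also block writers for their whole duration, which is the usual trade-off against cloning the list.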
This is something that I've recently had to deal with and to me it really depends on what you're doing with the list.
If you need to use the list at a point in time (given the number of elements currently in it) AND another thread can only ADD to the end of the list, then maybe you just switch out to a FOR loop with a counter. At the point you grab the counter, you're only seeing X numbers of elements in the list. You can walk through the list (while others are adding to the end of it) . . . should not cause a problem.
Now, if the list needs to have items taken OUT of it by other threads, or CLEARED by other threads, then you'll need to implement one of the locking mechanisms mentioned above. Also, you may want to look at some of the newer "concurrent" collection classes (though I don't believe they implement IList, so you may need to refactor to use a dictionary).
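As a rough illustration of the concurrent collections mentioned above, the sketch below enumerates a ConcurrentQueue<T> while another task keeps adding to it. The demo class and method names are hypothetical. ConcurrentQueue<T> is used because its enumerator is documented to work on a moment-in-time snapshot; it is not an IList<>, so this only helps if index-based access is not required.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentEnumerationDemo
{
    // Returns how many items the foreach loop visited.
    public static int Run()
    {
        var queue = new ConcurrentQueue<int>();
        for (int i = 0; i < 1000; i++)
        {
            queue.Enqueue(i);
        }

        // A second thread keeps modifying the collection during the enumeration.
        Task writer = Task.Factory.StartNew(() =>
        {
            for (int i = 1000; i < 2000; i++)
            {
                queue.Enqueue(i);
            }
        });

        int visited = 0;
        foreach (int value in queue)
        {
            visited++; // no "Collection was modified" exception here
        }

        writer.Wait();
        return visited;
    }
}
```

How many items the loop sees depends on when the enumeration starts relative to the writer, so code built on this should not assume it observes the very latest state.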
