ReSharper removes yield from foreach. Why? - C#

I recently learned about yield and then created the following test console program:
public static string Customers = "Paul,Fred,Doug,Mark,Josh";
public static string Admins = "Paul,Doug,Mark";
public static void Main()
{
var test = CreateEfficientObject();
Console.WriteLine(test.Admins.FirstOrDefault());
//Note that 'GetAllCustomers' never runs.
}
public static IEnumerable<string> GetAllCustomers()
{
var databaseFetch = Customers.Split(',');
foreach (var s in databaseFetch)
{
yield return s;
}
}
public static IEnumerable<string> GetAllAdmins()
{
var databaseFetch = Admins.Split(',');
foreach (var s in databaseFetch)
{
yield return s;
}
}
static LoginEntitys CreateEfficientObject()
{
var returnObject = new LoginEntitys {};
returnObject.Admins = GetAllAdmins();
returnObject.Customers = GetAllCustomers();
return returnObject;
}
}
public class LoginEntitys
{
public IEnumerable<String> Admins { get; set; }
public IEnumerable<String> Customers { get; set; }
}
Yet I noticed ReSharper wants to convert my foreach loops to:
public static IEnumerable<string> GetAllCustomers()
{
var databaseFetch = Customers.Split(',');
return databaseFetch;
}
Why does ReSharper want to remove yield in this case? It changes the functionality completely, since the method no longer lazy loads without yield. I can only guess that either
A) I am using yield incorrectly / in bad practice, or
B) it's a ReSharper bug/suggestion that can just be ignored.
Any insight would be great.

You are correct that this proposed transformation changes the functionality of the code in subtle ways: it prevents the evaluation from being deferred, so the Split runs as soon as the method is called rather than when the sequence is first enumerated.
Perhaps those who implemented the suggestion were well aware that it changes the functionality and felt it was still a useful suggestion, one that could be ignored when the existing semantics matter; or perhaps they failed to realize that the semantics were being altered. There's no good way for us to know; we can only guess. If those semantics are important for your program, then you are correct not to make the suggested transformation.
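To make the difference concrete, here is a small sketch (mine, not from the question; the method names are illustrative) showing when the simulated database fetch actually runs in each version:
public static IEnumerable<string> GetAllCustomersLazy()
{
    Console.WriteLine("Split runs now"); // printed only when the sequence is first enumerated
    foreach (var s in Customers.Split(','))
    {
        yield return s;
    }
}

public static IEnumerable<string> GetAllCustomersEager()
{
    Console.WriteLine("Split runs now"); // printed as soon as the method is called
    return Customers.Split(',');
}

public static void Main()
{
    var lazy = GetAllCustomersLazy();    // nothing printed yet
    var eager = GetAllCustomersEager();  // prints immediately
    Console.WriteLine(lazy.First());     // only now does the iterator body run
}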

I think ReSharper is being a bit dumb here, in the sense that it's applying a standard "convert foreach to LINQ" transform without being aware of the context.
It doesn't suggest the same edits for a while loop:
public static IEnumerable<string> ReadLineFromFile(TextReader fileReader)
{
using (fileReader)
{
string currentLine;
while ((currentLine = fileReader.ReadLine()) != null)
{
yield return currentLine;
}
}
}
I guess the next iteration of ReSharper, which uses Roslyn, will be much more context-aware.
Thanks @Servy for an engaging and refreshing discussion!

The code in your example never actually iterates the IEnumerable you are returning. If you were using the result of GetAllAdmins() in a LINQ query, for example, the yield would be useful because execution of the iterator could resume on each iteration.
I would imagine ReSharper is just suggesting you remove unused code.
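For instance (an illustrative query of my own, not from the original post), deferred execution only matters once something consumes the sequence:
var firstAdmin = GetAllAdmins()
    .Where(name => name.StartsWith("P"))
    .FirstOrDefault();
// The Split inside GetAllAdmins only runs when FirstOrDefault starts pulling items,
// and the pipeline stops as soon as the first match is found.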


Determining if a private field is read using Roslyn

I've been searching all day and have read many posts, but I just can't quite come to a conclusion on this. I'm trying to create a Roslyn analyzer that reports a diagnostic when a private field is never read. Registering the syntax action and finding out if it's private was really easy. But now I'm stuck on trying to find out whether the field is read anywhere in the class.
Assume we have the following example code:
public class C {
private int foo; //private field is declared but never read. Should report diagnostic here
public void DoNothing() {
//irrelevant
}
}
There are several examples of where I'd want this flagged (initialized or not, injected or not, multiple declarations on single line, etc.), but I think maybe they're not necessary for illustrating the question.
What I have so far is this:
public override void Initialize(AnalysisContext context) {
context.EnableConcurrentExecution();
context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
context.RegisterSyntaxNodeAction(AnalyzeField, SyntaxKind.FieldDeclaration);
}
private void AnalyzeField(SyntaxNodeAnalysisContext context) {
if (!(context.Node is FieldDeclarationSyntax fieldDeclarationSyntax)) {
return;
}
foreach (var variableDeclaration in fieldDeclarationSyntax.Declaration.Variables) {
if (context.SemanticModel.GetDeclaredSymbol(variableDeclaration) is IFieldSymbol variableDeclarationSymbol &&
IsFieldPrivate(variableDeclarationSymbol) &&
!IsFieldRead(context, variableDeclarationSymbol)) {
//report diagnostic here
}
}
}
private bool IsFieldPrivate(IFieldSymbol fieldSymbol) {
return fieldSymbol.DeclaredAccessibility == Accessibility.Private || // the field itself is explicitly private
fieldSymbol.ContainingType?.DeclaredAccessibility == Accessibility.Private; //the field is not private, but is contained within a private class
}
private bool IsFieldRead(SyntaxNodeAnalysisContext context, IFieldSymbol fieldSymbol) {
//context.Node.Parent will be the class declaration here since we're analyzing a field declaration
//but let's be safe about that just in case and make sure we traverse up until we find the class declaration
var classDeclarationSyntax = context.Node.Parent;
while (!(classDeclarationSyntax is ClassDeclarationSyntax)) {
classDeclarationSyntax = classDeclarationSyntax.Parent;
}
var methodsInClassContainingPrivateField = classDeclarationSyntax.DescendantNodes().OfType<MethodDeclarationSyntax>().ToImmutableArray();
foreach (var method in methodsInClassContainingPrivateField) {
var dataFlowAnalysis = context.SemanticModel.AnalyzeDataFlow(method); //does not work because this is not a StatementSyntax or ExpressionSyntax
if (dataFlowAnalysis.ReadInside.Contains(fieldSymbol) || dataFlowAnalysis.ReadOutside.Contains(fieldSymbol)) {
return true;
}
}
return false;
}
I just can't quite figure out how to get the IsFieldRead() method to work. This really feels like something that should be easy to do, but I just can't quite wrap my head around it. I figured getting the methods and analyzing those to see if my field is read would be a decent idea, but that doesn't cover the case where the field is read by another private field's initializer, and I can't get it working anyway. :)
I managed to get this figured out thanks to this other SO answer by someone who actually works on Roslyn at Microsoft. Here is my IsFieldRead() method now. The key apparently lay in the Microsoft.CodeAnalysis.Operations namespace.
private bool IsFieldRead(SyntaxNodeAnalysisContext context, IFieldSymbol fieldSymbol) {
var classDeclarationSyntax = context.Node.Parent;
while (!(classDeclarationSyntax is ClassDeclarationSyntax)) {
classDeclarationSyntax = classDeclarationSyntax.Parent;
if (classDeclarationSyntax == null) {
throw new InvalidOperationException("You have somehow traversed up and out of the syntax tree when determining if a private member field is being read.");
}
}
//get all methods in the class
var methodsInClass = classDeclarationSyntax.DescendantNodes().OfType<MethodDeclarationSyntax>().ToImmutableArray();
foreach (var method in methodsInClass) {
//get all member references in those methods
if (context.SemanticModel.GetOperation(method).Descendants().OfType<IMemberReferenceOperation>().ToImmutableArray().Any(x => x.Member.Equals(fieldSymbol))) {
return true;
}
}
return false;
}
Note that this only covers usages within methods. There are several other places like other fields, properties, and constructors that would also need to be checked.
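For anyone who needs to widen that search, here is a rough sketch of one way to do it (my own illustration, not code from the linked answer): instead of walking only method declarations, bind every identifier in the class and check whether it refers to the field.
private bool IsFieldReferencedAnywhere(SyntaxNodeAnalysisContext context, IFieldSymbol fieldSymbol, SyntaxNode classDeclarationSyntax) {
    foreach (var identifier in classDeclarationSyntax.DescendantNodes().OfType<IdentifierNameSyntax>()) {
        var symbol = context.SemanticModel.GetSymbolInfo(identifier).Symbol;
        if (fieldSymbol.Equals(symbol)) {
            //this counts writes as well as reads; a complete analyzer would still need to
            //check whether the reference is merely the target of an assignment
            return true;
        }
    }
    return false;
}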

Avoid multiple similar foreach - C#

I apologize if I'm posting into the wrong community, I'm quite new here.
I have multiple methods using the same foreach loop, changing only the inner method I call:
public void CalculationMethod1()
{
foreach (Order order in ordersList)
{
foreach (Detail obj_detail in order.Details)
{
CalculateDiscount(obj_detail);
}
}
}
public void CalculationMethod2()
{
foreach (Order order in ordersList)
{
foreach (Detail obj_detail in order.Details)
{
CalculateTax(obj_detail);
}
}
}
Each inner method has different logic: database searches, math calculations (not important here).
I'd like to call the methods above without repeating the foreach loop every time, so I thought about the solution below:
public void CalculateMethod_3()
{
foreach (Order obj_order in ordersList)
{
foreach (Detail obj_detail in obj_order.Details)
{
CalculateDiscount(obj_detail);
CalculateTax(obj_detail);
}
}
}
But I run into a rule problem:
class Program
{
static void Main(string[] args)
{
Calculation c = new Calculation();
c.CalculateMethod_3();
c.AnotherMethod_4(); //It doesn't use objDetail
c.AnotherMethod_5(); //It doesn't use objDetail
c.CalculateMethod_6(); //Method 6 needs objDetail, but the order of the calls must be respected, so it must run after AnotherMethod_4 and AnotherMethod_5
}
}
How can I create a method to achieve my objective (I don't want to repeat code) respecting the rule above?
You can always pass a delegate to the method and then you can do basically whatever you want.
public void ApplyToDetails(Action<Detail> callback)
{
foreach (Order order in ordersList)
{
foreach (Detail obj_detail in order.Details)
{
callback(obj_detail);
}
}
}
Then to use it you'd do something like this:
ApplyToDetails(detail => CalculateTax(detail));
ApplyToDetails(detail =>
{
CalculateDiscount(detail);
CalculateTax(detail);
});
Delegates come in very handy in many cases, and definitely in a case like this. I know this has already been answered, and rightly so, but here is an alternative for comparison. I have provided a link below to give you some insight.
public class CalculationMethods
{
    public delegate void CalculationDelegate(Detail detail);
    private Dictionary<string, CalculationDelegate> _methods;
    public CalculationMethods()
    {
        this._methods = new Dictionary<string, CalculationDelegate>()
        {
            { "Discount", CalculateDiscount },
            { "Tax", CalculateTax }
        };
    }
    public void Calculate(string method)
    {
        //look the delegate up once, outside the loops
        var m = this._methods.FirstOrDefault(item => item.Key == method).Value;
        foreach (Order order in ordersList)
        {
            foreach (Detail obj_detail in order.Details)
            {
                m(obj_detail);
            }
        }
    }
}
Usage:
//Initialize
var methods = new CalculationMethods();
//Calculate Discount for every detail
methods.Calculate("Discount");
//Calculate Tax for every detail
methods.Calculate("Tax");
Side Note:
I would recommend some exception handling in case the method of calculation isn't found among the list of delegates. Example below: (Replace the calculate method with the following.)
public void Calculate(string method)
{
    var m = this._methods.FirstOrDefault(item => item.Key == method).Value;
    //Check if the calculation was found
    if (m == null)
        throw new KeyNotFoundException("No calculation is registered under the name: " + method);
    foreach (Order order in ordersList)
    {
        foreach (Detail obj_detail in order.Details)
        {
            m(obj_detail);
        }
    }
}
Decent tutorial:
Delegates and Events
You can use delegates. (Google it - I don't have a development environment in front of me to run up a sample for you). Basically one method that takes a delegate to call:
Here is roughly what it looks like:
public void CalculationMethod(Action<Detail> myFunction)
{
    foreach (Order order in ordersList)
    {
        foreach (Detail obj_detail in order.Details)
        {
            myFunction(obj_detail);
        }
    }
}
I googled "c# delegate as parameter" and came up with http://msdn.microsoft.com/en-us/library/ms173172.aspx which seems to be a reasonable explanation.
As Darren Kopp says, you can use delegates. However, when the method you're calling takes the Detail as its parameter, you can pass it directly as a method group (you don't need the lambda expression).
With public void ApplyToDetails(Action<Detail> callback) { ... }:
ApplyToDetails(Method_1); // Uses objDetail
ApplyToDetails(d => Method_2()); // Doesn't use objDetail
ApplyToDetails(d => Method_3()); // Doesn't use objDetail
ApplyToDetails(Method_4); // Uses objDetail
Note that you must not place parentheses after the methods you pass as a delegate!
You could use delegates as the other answers suggest, but I believe in your case doing so will lead to overly confusing code. Your code is cleaner and more readable if you redeclare the foreach loops in each method. Only if you were copy-pasting portions of the internals would I say you run the risk of code duplication.
Think about it this way: if you created a method that takes a delegate, what would that method be called? It is a method that does something for each Detail in each Order you pass in, and it would have to be named something like DoSomethingForEachDetailInOrders(). What kind of class would this method live on? You don't know what it is you're actually doing in the delegate, so the purpose of this class would have to be more framework-style code, which your app does not appear to be complex enough to warrant.
Additionally, if you were debugging this code or reading through it, instead of being able to see 2 foreach loops inside the method you are reading, you have to go scroll to the definition of the delegate, read that, and then go back to your method and resume reading.
Edit: I originally answered this question by downplaying the duplication of the foreach loops in the hopes that OP would not add additional complexity to his app attempting to make it follow "best practices." I didn't go deeper because the code requires a more intrusive refactor for maintainability. The foreach loop code smell stems from other problems as detailed in the comments below this answer. I still stand by my opinion that adding the delegate method is less desirable than the duplicated loops because the delegate method option is pretty much textbook boilerplate.
Added a code example to explain how the code should be refactored if maintainability is a concern:
public decimal CalculateDiscount(IEnumerable<Order> ordersList)
{
return ordersList.SelectMany(order => order.Details).Sum(detail => detail.Discount);
}
public decimal CalculateTax(IEnumerable<Order> ordersList)
{
return ordersList.SelectMany(order => order.Details).Sum(detail => detail.Total) * taxRate;
}
If you ABSOLUTELY MUST HAVE a custom function for getting all details for orders (it could be refactored to an extension method, as sketched after the code below):
public IEnumerable<Detail> GetDetailsForOrders(IEnumerable<Order> orderList)
{
foreach(var order in orderList)
{
foreach (var detail in order.Details)
{
yield return detail;
}
}
}
public decimal CalculateDiscount(IEnumerable<Order> ordersList)
{
return GetDetailsForOrders(ordersList).Sum(detail => detail.Discount);
}
public decimal CalculateTax(IEnumerable<Order> ordersList)
{
return GetDetailsForOrders(ordersList).Sum(detail => detail.Total) * taxRate;
}
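A possible extension-method version of that helper, for illustration (the class and method names here are my own, not from the answer):
public static class OrderEnumerableExtensions
{
    public static IEnumerable<Detail> AllDetails(this IEnumerable<Order> orders)
    {
        return orders.SelectMany(order => order.Details);
    }
}
//usage: ordersList.AllDetails().Sum(detail => detail.Discount);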

No ConcurrentList<T> in .Net 4.0?

I was thrilled to see the new System.Collections.Concurrent namespace in .Net 4.0, quite nice! I've seen ConcurrentDictionary, ConcurrentQueue, ConcurrentStack, ConcurrentBag and BlockingCollection.
One thing that seems to be mysteriously missing is a ConcurrentList<T>. Do I have to write that myself (or get it off the web :) )?
Am I missing something obvious here?
I gave it a try a while back (also: on GitHub). My implementation had some problems, which I won't get into here. Let me tell you, more importantly, what I learned.
Firstly, there's no way you're going to get a full implementation of IList<T> that is lockless and thread-safe. In particular, random insertions and removals are not going to work, unless you also forget about O(1) random access (i.e., unless you "cheat" and just use some sort of linked list and let the indexing suck).
What I thought might be worthwhile was a thread-safe, limited subset of IList<T>: in particular, one that would allow an Add and provide random read-only access by index (but no Insert, RemoveAt, etc., and also no random write access).
This was the goal of my ConcurrentList<T> implementation. But when I tested its performance in multithreaded scenarios, I found that simply synchronizing adds to a List<T> was faster. Basically, adding to a List<T> is lightning fast already; the complexity of the computational steps involved is miniscule (increment an index and assign to an element in an array; that's really it). You would need a ton of concurrent writes to see any sort of lock contention on this; and even then, the average performance of each write would still beat out the more expensive albeit lockless implementation in ConcurrentList<T>.
In the relatively rare event that the list's internal array needs to resize itself, you do pay a small cost. So ultimately I concluded that this was the one niche scenario where an add-only ConcurrentList<T> collection type would make sense: when you want guaranteed low overhead of adding an element on every single call (so, as opposed to an amortized performance goal).
It's simply not nearly as useful a class as you would think.
What would you use a ConcurrentList for?
The concept of a Random Access container in a threaded world isn't as useful as it may appear. The statement
if (i < MyConcurrentList.Count)
x = MyConcurrentList[i];
as a whole would still not be thread-safe.
Instead of creating a ConcurrentList, try to build solutions with what's there. The most common classes are the ConcurrentBag and especially the BlockingCollection.
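For example, a minimal producer/consumer sketch with BlockingCollection (illustrative only, not from the original answer):
var queue = new BlockingCollection<string>();

//producer: hands items over and signals when it is done
Task.Factory.StartNew(() =>
{
    queue.Add("work item");
    queue.CompleteAdding();
});

//consumer: blocks until an item is available and ends once adding is complete
foreach (var item in queue.GetConsumingEnumerable())
{
    Console.WriteLine(item);
}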
With all due respect to the great answers provided already, there are times when I simply want a thread-safe IList. Nothing advanced or fancy. Performance is important in many cases, but at times it just isn't a concern. Yes, there are always going to be challenges without methods like "TryGetValue", etc., but in most cases I just want something I can enumerate without needing to worry about putting locks around everything. And yes, somebody can probably find some "bug" in my implementation that might lead to a deadlock or something (I suppose), but let's be honest: when it comes to multi-threading, if you don't write your code correctly, it is going to deadlock anyway. With that in mind I decided to make a simple ConcurrentList implementation that provides these basic needs.
And for what it's worth: I did a basic test of adding 10,000,000 items to a regular List and to ConcurrentList, and the results were:
List finished in: 7793 milliseconds.
Concurrent finished in: 8064 milliseconds.
public class ConcurrentList<T> : IList<T>, IDisposable
{
#region Fields
private readonly List<T> _list;
private readonly ReaderWriterLockSlim _lock;
#endregion
#region Constructors
public ConcurrentList()
{
this._lock = new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
this._list = new List<T>();
}
public ConcurrentList(int capacity)
{
this._lock = new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
this._list = new List<T>(capacity);
}
public ConcurrentList(IEnumerable<T> items)
{
this._lock = new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
this._list = new List<T>(items);
}
#endregion
#region Methods
public void Add(T item)
{
try
{
this._lock.EnterWriteLock();
this._list.Add(item);
}
finally
{
this._lock.ExitWriteLock();
}
}
public void Insert(int index, T item)
{
try
{
this._lock.EnterWriteLock();
this._list.Insert(index, item);
}
finally
{
this._lock.ExitWriteLock();
}
}
public bool Remove(T item)
{
try
{
this._lock.EnterWriteLock();
return this._list.Remove(item);
}
finally
{
this._lock.ExitWriteLock();
}
}
public void RemoveAt(int index)
{
try
{
this._lock.EnterWriteLock();
this._list.RemoveAt(index);
}
finally
{
this._lock.ExitWriteLock();
}
}
public int IndexOf(T item)
{
try
{
this._lock.EnterReadLock();
return this._list.IndexOf(item);
}
finally
{
this._lock.ExitReadLock();
}
}
public void Clear()
{
try
{
this._lock.EnterWriteLock();
this._list.Clear();
}
finally
{
this._lock.ExitWriteLock();
}
}
public bool Contains(T item)
{
try
{
this._lock.EnterReadLock();
return this._list.Contains(item);
}
finally
{
this._lock.ExitReadLock();
}
}
public void CopyTo(T[] array, int arrayIndex)
{
try
{
this._lock.EnterReadLock();
this._list.CopyTo(array, arrayIndex);
}
finally
{
this._lock.ExitReadLock();
}
}
public IEnumerator<T> GetEnumerator()
{
return new ConcurrentEnumerator<T>(this._list, this._lock);
}
IEnumerator IEnumerable.GetEnumerator()
{
return new ConcurrentEnumerator<T>(this._list, this._lock);
}
~ConcurrentList()
{
this.Dispose(false);
}
public void Dispose()
{
this.Dispose(true);
}
private void Dispose(bool disposing)
{
if (disposing)
GC.SuppressFinalize(this);
this._lock.Dispose();
}
#endregion
#region Properties
public T this[int index]
{
get
{
try
{
this._lock.EnterReadLock();
return this._list[index];
}
finally
{
this._lock.ExitReadLock();
}
}
set
{
try
{
this._lock.EnterWriteLock();
this._list[index] = value;
}
finally
{
this._lock.ExitWriteLock();
}
}
}
public int Count
{
get
{
try
{
this._lock.EnterReadLock();
return this._list.Count;
}
finally
{
this._lock.ExitReadLock();
}
}
}
public bool IsReadOnly
{
get { return false; }
}
#endregion
}
public class ConcurrentEnumerator<T> : IEnumerator<T>
{
#region Fields
private readonly IEnumerator<T> _inner;
private readonly ReaderWriterLockSlim _lock;
#endregion
#region Constructor
public ConcurrentEnumerator(IEnumerable<T> inner, ReaderWriterLockSlim @lock)
{
this._lock = @lock;
this._lock.EnterReadLock();
this._inner = inner.GetEnumerator();
}
#endregion
#region Methods
public bool MoveNext()
{
return _inner.MoveNext();
}
public void Reset()
{
_inner.Reset();
}
public void Dispose()
{
this._lock.ExitReadLock();
}
#endregion
#region Properties
public T Current
{
get { return _inner.Current; }
}
object IEnumerator.Current
{
get { return _inner.Current; }
}
#endregion
}
The reason there is no ConcurrentList is that it fundamentally cannot be written: several important operations in IList rely on indices, and that just plain won't work in a concurrent setting. For example:
int catIndex = list.IndexOf("cat");
list.Insert(catIndex, "dog");
The effect that the author is going after is to insert "dog" before "cat", but in a multithreaded environment, anything can happen to the list between those two lines of code. For example, another thread might do list.RemoveAt(0), shifting the entire list to the left, but crucially, catIndex will not change. The impact here is that the Insert operation will actually put the "dog" after the cat, not before it.
The several implementations that you see offered as "answers" to this question are well-meaning, but as the above shows, they don't offer reliable results. If you really want list-like semantics in a multithreaded environment, you can't get there by putting locks inside the list implementation methods. You have to ensure that any index you use lives entirely inside the context of the lock. The upshot is that you can use a List in a multithreaded environment with the right locking, but the list itself cannot be made to exist in that world.
If you think you need a concurrent list, there are really just two possibilities:
What you really need is a ConcurrentBag
You need to create your own collection, perhaps implemented with a List and your own concurrency control.
If you have a ConcurrentBag and are in a position where you need to pass it as an IList, then you have a problem, because by asking for an IList the method you're calling has declared that it might try to do something like I did above with the cat & dog. In most cases, that means the method you're calling is simply not built to work in a multi-threaded environment. Either refactor it so that it is, or, if you can't, handle it very carefully: you'll almost certainly have to create your own collection with its own locks and call the offending method within a lock.
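As an illustration of keeping the index inside the lock (my own sketch, not from the answer), the cat/dog insert only behaves predictably when both statements share one lock:
private readonly object _gate = new object();
private readonly List<string> _list = new List<string>();

public void InsertDogBeforeCat()
{
    lock (_gate)
    {
        //the index is computed and used inside the same lock,
        //so no other thread can shift the list in between
        int catIndex = _list.IndexOf("cat");
        if (catIndex >= 0)
            _list.Insert(catIndex, "dog");
    }
}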
ConcurrentList (as a resizeable array, not a linked list) is not easy to write with nonblocking operations. Its API doesn't translate well to a "concurrent" version.
In cases where reads greatly outnumber writes, or (however frequent) writes are non-concurrent, a copy-on-write approach may be appropriate.
The implementation shown below is
lockless
blazingly fast for concurrent reads, even while concurrent modifications are ongoing - no matter how long they take
because "snapshots" are immutable, lockless atomicity is possible, i.e. var snap = _list; snap[snap.Count - 1]; will never (well, except for an empty list of course) throw, and you also get thread-safe enumeration with snapshot semantics for free.. how I LOVE immutability!
implemented generically, applicable to any data structure and any type of modification
dead simple, i.e. easy to test, debug, verify by reading the code
usable in .Net 3.5
For copy-on-write to work, you have to keep your data structures effectively immutable, i.e. no one is allowed to change them after you made them available to other threads. When you want to modify, you
clone the structure
make modifications on the clone
atomically swap in the reference to the modified clone
Code
static class CopyOnWriteSwapper
{
public static void Swap<T>(ref T obj, Func<T, T> cloner, Action<T> op)
where T : class
{
while (true)
{
var objBefore = Volatile.Read(ref obj);
var newObj = cloner(objBefore);
op(newObj);
if (Interlocked.CompareExchange(ref obj, newObj, objBefore) == objBefore)
return;
}
}
}
Usage
CopyOnWriteSwapper.Swap(ref _myList,
orig => new List<string>(orig),
clone => clone.Add("asdf"));
If you need more performance, it will help to ungenerify the method, e.g. create one method for every type of modification (Add, Remove, ...) you want, and hard code the function pointers cloner and op.
N.B. #1 It is your responsibility to make sure that no one modifies the (supposedly) immutable data structure. There's nothing we can do in a generic implementation to prevent that, but when specializing to List<T>, you could guard against modification using List.AsReadOnly().
N.B. #2 Be careful about the values in the list. The copy-on-write approach above guards their list membership only; if you put not strings but some other mutable objects in the list, you have to take care of their thread safety yourself (e.g. locking). That is orthogonal to this solution, and locking of the mutable values can easily be used alongside it. You just need to be aware of it.
N.B. #3 If your data structure is huge and you modify it frequently, the copy-all-on-write approach might be prohibitive both in terms of memory consumption and the CPU cost of copying involved. In that case, you might want to use MS's Immutable Collections instead.
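If you do reach for those, they already package the same clone-and-swap loop; a minimal sketch using the System.Collections.Immutable package (which shipped after .NET 4.0):
private static ImmutableList<string> _myList = ImmutableList<string>.Empty;

public static void AddItem(string item)
{
    //retries the compare-and-swap internally until the update wins
    ImmutableInterlocked.Update(ref _myList, list => list.Add(item));
}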
System.Collections.Generic.List<T> is already thread-safe for multiple readers. Trying to make it thread-safe for multiple writers wouldn't make sense (for the reasons Henk and Stephen already mentioned).
Some people have highlighted some good points (and some of my thoughts):
It might look insane to disable the random accessor (indexer), but to me it seems fine. You just have to accept that many methods on multi-threaded collections, like the indexer and Delete, can fail. You could also define a failure (fallback) action for the write accessor, like "fail" or simply "add at the end".
Just because it is a multithreaded collection does not mean it will always be used in a multithreaded context; it could also be used by only one writer and one reader.
Another way to use the indexer safely could be to wrap actions in a lock on the collection using its root (if made public).
For many people, making a root lock visible goes against "good practice". I'm not 100% sure about that, because if it is hidden you take a lot of flexibility away from the user. We always have to remember that multithreaded programming is not for everybody; we can't prevent every kind of wrong usage.
Microsoft will have to do some work and define some new standards to introduce proper usage of multithreaded collections. First, IEnumerator should not have a MoveNext but rather a GetNext that returns true or false and yields an out parameter of type T (that way iteration would no longer be blocking). Also, Microsoft already uses "using" internally in foreach, but sometimes uses the IEnumerator directly without wrapping it in "using" (a bug in collection view and probably in more places), even though wrapping the usage of IEnumerator is a practice recommended by Microsoft. This bug removes good potential for a safe iterator: an iterator that locks the collection in its constructor and unlocks it in its Dispose method, for a blocking foreach.
This is not an answer; these are only comments that don't really fit anywhere specific.
... My conclusion: Microsoft has to make some deep changes to "foreach" to make multithreaded collections easier to use, and it has to follow its own rules of IEnumerator usage. Until then, we can easily write a MultiThreadList that uses a blocking iterator, but it will not follow "IList". Instead, you would have to define your own "IListPersonnal" interface that can fail on "insert", "remove" and the random accessor (indexer) without throwing an exception. But who will want to use it if it is not standard?
I implemented one similar to Brian's. Mine is different:
I manage the array directly.
I don't enter the locks within the try block.
I use yield return for producing an enumerator.
I support lock recursion. This allows reads from list during iteration.
I use upgradable read locks where possible.
DoSync and GetSync methods allowing sequential interactions that require exclusive access to the list.
The code:
public class ConcurrentList<T> : IList<T>, IDisposable
{
private ReaderWriterLockSlim _lock = new ReaderWriterLockSlim(LockRecursionPolicy.SupportsRecursion);
private int _count = 0;
public int Count
{
get
{
_lock.EnterReadLock();
try
{
return _count;
}
finally
{
_lock.ExitReadLock();
}
}
}
public int InternalArrayLength
{
get
{
_lock.EnterReadLock();
try
{
return _arr.Length;
}
finally
{
_lock.ExitReadLock();
}
}
}
private T[] _arr;
public ConcurrentList(int initialCapacity)
{
_arr = new T[initialCapacity];
}
public ConcurrentList():this(4)
{ }
public ConcurrentList(IEnumerable<T> items)
{
_arr = items.ToArray();
_count = _arr.Length;
}
public void Add(T item)
{
_lock.EnterWriteLock();
try
{
var newCount = _count + 1;
EnsureCapacity(newCount);
_arr[_count] = item;
_count = newCount;
}
finally
{
_lock.ExitWriteLock();
}
}
public void AddRange(IEnumerable<T> items)
{
if (items == null)
throw new ArgumentNullException("items");
_lock.EnterWriteLock();
try
{
var arr = items as T[] ?? items.ToArray();
var newCount = _count + arr.Length;
EnsureCapacity(newCount);
Array.Copy(arr, 0, _arr, _count, arr.Length);
_count = newCount;
}
finally
{
_lock.ExitWriteLock();
}
}
private void EnsureCapacity(int capacity)
{
if (_arr.Length >= capacity)
return;
int doubled;
checked
{
try
{
doubled = _arr.Length * 2;
}
catch (OverflowException)
{
doubled = int.MaxValue;
}
}
var newLength = Math.Max(doubled, capacity);
Array.Resize(ref _arr, newLength);
}
public bool Remove(T item)
{
_lock.EnterUpgradeableReadLock();
try
{
var i = IndexOfInternal(item);
if (i == -1)
return false;
_lock.EnterWriteLock();
try
{
RemoveAtInternal(i);
return true;
}
finally
{
_lock.ExitWriteLock();
}
}
finally
{
_lock.ExitUpgradeableReadLock();
}
}
public IEnumerator<T> GetEnumerator()
{
_lock.EnterReadLock();
try
{
for (int i = 0; i < _count; i++)
// deadlocking potential mitigated by lock recursion enforcement
yield return _arr[i];
}
finally
{
_lock.ExitReadLock();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
public int IndexOf(T item)
{
_lock.EnterReadLock();
try
{
return IndexOfInternal(item);
}
finally
{
_lock.ExitReadLock();
}
}
private int IndexOfInternal(T item)
{
return Array.FindIndex(_arr, 0, _count, x => x.Equals(item));
}
public void Insert(int index, T item)
{
_lock.EnterUpgradeableReadLock();
try
{
if (index > _count)
throw new ArgumentOutOfRangeException("index");
_lock.EnterWriteLock();
try
{
var newCount = _count + 1;
EnsureCapacity(newCount);
// shift everything right by one, starting at index
Array.Copy(_arr, index, _arr, index + 1, _count - index);
// insert
_arr[index] = item;
_count = newCount;
}
finally
{
_lock.ExitWriteLock();
}
}
finally
{
_lock.ExitUpgradeableReadLock();
}
}
public void RemoveAt(int index)
{
_lock.EnterUpgradeableReadLock();
try
{
if (index >= _count)
throw new ArgumentOutOfRangeException("index");
_lock.EnterWriteLock();
try
{
RemoveAtInternal(index);
}
finally
{
_lock.ExitWriteLock();
}
}
finally
{
_lock.ExitUpgradeableReadLock();
}
}
private void RemoveAtInternal(int index)
{
Array.Copy(_arr, index + 1, _arr, index, _count - index-1);
_count--;
// release last element
Array.Clear(_arr, _count, 1);
}
public void Clear()
{
_lock.EnterWriteLock();
try
{
Array.Clear(_arr, 0, _count);
_count = 0;
}
finally
{
_lock.ExitWriteLock();
}
}
public bool Contains(T item)
{
_lock.EnterReadLock();
try
{
return IndexOfInternal(item) != -1;
}
finally
{
_lock.ExitReadLock();
}
}
public void CopyTo(T[] array, int arrayIndex)
{
_lock.EnterReadLock();
try
{
if(_count > array.Length - arrayIndex)
throw new ArgumentException("Destination array was not long enough.");
Array.Copy(_arr, 0, array, arrayIndex, _count);
}
finally
{
_lock.ExitReadLock();
}
}
public bool IsReadOnly
{
get { return false; }
}
public T this[int index]
{
get
{
_lock.EnterReadLock();
try
{
if (index >= _count)
throw new ArgumentOutOfRangeException("index");
return _arr[index];
}
finally
{
_lock.ExitReadLock();
}
}
set
{
_lock.EnterUpgradeableReadLock();
try
{
if (index >= _count)
throw new ArgumentOutOfRangeException("index");
_lock.EnterWriteLock();
try
{
_arr[index] = value;
}
finally
{
_lock.ExitWriteLock();
}
}
finally
{
_lock.ExitUpgradeableReadLock();
}
}
}
public void DoSync(Action<ConcurrentList<T>> action)
{
GetSync(l =>
{
action(l);
return 0;
});
}
public TResult GetSync<TResult>(Func<ConcurrentList<T>,TResult> func)
{
_lock.EnterWriteLock();
try
{
return func(this);
}
finally
{
_lock.ExitWriteLock();
}
}
public void Dispose()
{
_lock.Dispose();
}
}
The data structures used in sequentially executing code are different from those used in (well written) concurrently executing code. The reason is that sequential code implies an implicit order. Concurrent code, however, does not imply any order; better yet, it implies the lack of any defined order!
Because of this, data structures with implied order (like List) are not very useful for solving concurrent problems. A list implies order, but it does not clearly define what that order is. As a result, the execution order of the code manipulating the list will determine (to some degree) the implicit order of the list, which is in direct conflict with an efficient concurrent solution.
Remember: concurrency is a data problem, not a code problem! You cannot implement the code first (or rewrite existing sequential code) and get a well-designed concurrent solution. You need to design the data structures first, while keeping in mind that implicit ordering doesn't exist in a concurrent system.
A lockless copy-and-write approach works great if you're not dealing with too many items.
Here's a class I wrote:
public class CopyAndWriteList<T>
{
public static List<T> Clear(List<T> list)
{
var a = new List<T>(list);
a.Clear();
return a;
}
public static List<T> Add(List<T> list, T item)
{
var a = new List<T>(list);
a.Add(item);
return a;
}
public static List<T> RemoveAt(List<T> list, int index)
{
var a = new List<T>(list);
a.RemoveAt(index);
return a;
}
public static List<T> Remove(List<T> list, T item)
{
var a = new List<T>(list);
a.Remove(item);
return a;
}
}
example usage:
orders_BUY = CopyAndWriteList<Order>.Clear(orders_BUY);
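Note that the read-copy-assign above is only safe when a single thread performs the writes; with several writers an update can be lost between the read and the assignment. A hedged sketch of making the swap atomic (same idea as the CopyOnWriteSwapper earlier on this page):
private static List<Order> orders_BUY = new List<Order>();

private static void ClearOrdersAtomically()
{
    List<Order> before, after;
    do
    {
        before = orders_BUY;                             //snapshot the current list
        after = CopyAndWriteList<Order>.Clear(before);   //build the replacement copy
    }
    while (Interlocked.CompareExchange(ref orders_BUY, after, before) != before);
}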
I'm surprised no-one has mentioned using LinkedList as a base for writing a specialised class.
Often we don't need the full APIs of the various collection classes, and if you write mostly functional, side-effect-free code, using immutable classes as far as possible, then you'll actually NOT want to mutate the collection, favouring various snapshot implementations instead.
LinkedList solves some difficult problems of creating snapshot copies/clones of large collections. I also use it to create "threadsafe" enumerators to enumerate over the collection. I can cheat, because I know that I'm not changing the collection in any way other than appending; I can keep track of the list size and only lock on changes to the list size. Then my enumerator code simply enumerates from 0 to n for any thread that wants a "snapshot" of the append-only collection, which is guaranteed to represent a snapshot of the collection at that moment in time, regardless of what other threads are appending to the head of the collection.
I'm pretty certain that most requirements are extremely simple, and you only need 2 or 3 methods. Writing a truly generic library is awfully difficult, but solving your own code's needs can sometimes be easy with a trick or two.
Long live LinkedList and good functional programming.
Cheers, ... love ya all!
Al
p.s. sample hack AppendOnly class here : https://github.com/goblinfactory/AppendOnly
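A rough sketch of the idea described above (my own illustration, not the actual AppendOnly code): appends take a lock, while readers enumerate a fixed-length prefix captured at the start of the enumeration.
public class AppendOnlySnapshotList<T>
{
    private readonly LinkedList<T> _items = new LinkedList<T>();
    private readonly object _gate = new object();
    private volatile int _count;

    public void Append(T item)
    {
        lock (_gate)
        {
            _items.AddLast(item);
            _count = _items.Count;
        }
    }

    public IEnumerable<T> Snapshot()
    {
        int length = _count;        //items appended after this point are ignored
        var node = _items.First;
        for (int i = 0; i < length && node != null; i++)
        {
            yield return node.Value;
            node = node.Next;
        }
    }
}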

Problem using C# iterator methods with code access security

I have a simple method that uses an iterator block to return an IEnumerable<T>:
IEnumerable<MyItem> GetItems()
{
foreach (var item in Items)
{
yield return item;
}
}
Ordinarily, this method works fine, but if I apply a [SecurityCritical] attribute to the assembly (or to the class that contains the above method), it throws a TypeLoadException when attempting to invoke the method. The type that is failing to load is the compiler-generated class that corresponds to the iterator method, and it is its GetEnumerator method that is causing the problem, since it is security transparent.
For comparison, if I modify the above method so that it populates and returns a List<MyItem>, everything works fine.
Any suggestions?
Thanks,
Tim.
It isn't the neatest thing to do, so hopefully you can find a better way, but you could always forgo the compiler-generated code and create your own class that implements IEnumerator<MyItem> (and perhaps your own class implementing IEnumerable<MyItem>; depending on complexity, doing so may make things easier or more difficult), and then build the enumerator more or less as you would have in the days before .NET 2.0.
If the logic of your real iterator block is very complicated, you might find a decompilation of the class the compiler created for you to be a good starting point, though sometimes the generated code is more complicated (or at least less readable) than the approach one would take oneself.
It's always a bit disappointing to have to build an IEnumerator class when yield has made it so nice for us 99% of the time, but there are still times when it's necessary, and it might solve your problem here.
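For illustration, a minimal hand-written pair (my own sketch, not from the answer) that could replace the iterator block in the GetItems example above:
class MyItemEnumerable : IEnumerable<MyItem>
{
    private readonly IList<MyItem> _items;
    public MyItemEnumerable(IList<MyItem> items) { _items = items; }

    public IEnumerator<MyItem> GetEnumerator() { return new MyItemEnumerator(_items); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

class MyItemEnumerator : IEnumerator<MyItem>
{
    private readonly IList<MyItem> _items;
    private int _index = -1;
    public MyItemEnumerator(IList<MyItem> items) { _items = items; }

    public MyItem Current { get { return _items[_index]; } }
    object IEnumerator.Current { get { return Current; } }

    public bool MoveNext() { return ++_index < _items.Count; }
    public void Reset() { _index = -1; }
    public void Dispose() { }
}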
I had the very same problem in a complicated application. Spring came in between and said that the 'blahblah' type is not Serializable, and it was correct: here is the disassembled compiler-generated code, and sure enough it's not Serializable. Maybe this was your problem too, and the solution is the one you mentioned yourself, because List is actually a Serializable type.
The code generated for yield return new KeyValuePair<???, ???>(???, ???):
[CompilerGenerated, DebuggerDisplay(#"\{ x = {x}, y = {y} }", Type="<Anonymous Type>")]
internal sealed class <>f__AnonymousType0<<x>j__TPar, <y>j__TPar>
{
// Fields
[DebuggerBrowsable(DebuggerBrowsableState.Never)]
private readonly <x>j__TPar <x>i__Field;
[DebuggerBrowsable(DebuggerBrowsableState.Never)]
private readonly <y>j__TPar <y>i__Field;
// Methods
[DebuggerHidden]
public <>f__AnonymousType0(<x>j__TPar x, <y>j__TPar y)
{
this.<x>i__Field = x;
this.<y>i__Field = y;
}
[DebuggerHidden]
public override bool Equals(object value)
{
var type = value as <>f__AnonymousType0<<x>j__TPar, <y>j__TPar>;
return (((type != null) && EqualityComparer<<x>j__TPar>.Default.Equals(this.<x>i__Field, type.<x>i__Field)) && EqualityComparer<<y>j__TPar>.Default.Equals(this.<y>i__Field, type.<y>i__Field));
}
[DebuggerHidden]
public override int GetHashCode()
{
int num = -576933007;
num = (-1521134295 * num) + EqualityComparer<<x>j__TPar>.Default.GetHashCode(this.<x>i__Field);
return ((-1521134295 * num) + EqualityComparer<<y>j__TPar>.Default.GetHashCode(this.<y>i__Field));
}
[DebuggerHidden]
public override string ToString()
{
StringBuilder builder = new StringBuilder();
builder.Append("{ x = ");
builder.Append(this.<x>i__Field);
builder.Append(", y = ");
builder.Append(this.<y>i__Field);
builder.Append(" }");
return builder.ToString();
}
// Properties
public <x>j__TPar x
{
get
{
return this.<x>i__Field;
}
}
public <y>j__TPar y
{
get
{
return this.<y>i__Field;
}
}
}
You can vote for this issue: https://connect.microsoft.com/VisualStudio/feedback/details/667328/yield-and-securitycriticalattribute-problem
[EDIT] Response from Microsoft:
We've looked at SecurityCritical iterators and decided not to try to
make that work at least for this release. It is a significant and
complicated effort, and it does not seem too useful, as the call
through IEnumerator.MoveNext would be calling through a non-critical
interface.
We'll probably revisit this again in a later release; especially if we
see common scenarios for it.

How to know if an enumerator has reached the end of the collection in C#?

I am porting a library from C++ to C#. The old library uses C++ vectors, and in C# I am using generic Dictionaries because they're actually a good data structure for what I'm doing (each element has an ID, so I just use using TypeDictionary = Dictionary<String, Type>;). Now, in the C# code I use a loop like this one:
TypeDictionary.Enumerator tdEnum = MyTypeDictionary.GetEnumerator();
while( tdEnum.MoveNext() )
{
Type element = tdEnum.Current.Value;
// More code here
}
to iterate through the elements of the collection. The problem is that in particular cases I need to check whether a certain enumerator has reached the end of the collection; in C++ I would have done a check like this:
if ( tdEnum == MyTypeDictionary.end() ) // More code here
But I just don't know how to handle this situation in C#, any ideas?
Thank you
Tommaso
Here's a pretty simple way of accomplishing this.
bool hasNext = tdEnum.MoveNext();
while (hasNext) {
int i = tdEnum.Current;
hasNext = tdEnum.MoveNext();
}
I found an online tutorial that also may help you understand how this works.
http://www.c-sharpcorner.com/UploadFile/prasadh/Enumerators11132005232321PM/Enumerators.aspx
You know that you're at the end of an iterator when MoveNext() returns false. Otherwise you need to upgrade to a more descriptive data structure like IList<T>.
I have a "smart iterator" class in MiscUtil which you may find useful. It lets you test whether you're currently looking at the start or end of the sequence, and the index within the sequence. See the usage page for more information.
Of course in most cases you can just get away with doing this manually using the result of MoveNext(), but occasionally the extra encapsulation comes in handy.
Note that by necessity, this iterator will always have actually consumed one more value than it's yielded, in order to know whether or not it's reached the end. In most cases that isn't an issue, but it could occasionally give some odd experiences when debugging.
Using the decorator pattern to keep track of whether the enumerator has ended is a valid approach.
Since the decorator implements IEnumerator, you won't have any difficulty substituting it in your code.
Here's a test class:
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using MyDictionary = System.Collections.Generic.Dictionary<int, string>;
using MyKeyValue = System.Collections.Generic.KeyValuePair<int, string>;
namespace TestEnumerator
{
[TestClass]
public class UnitTest1
{
[TestMethod]
public void TestingMyEnumeradorPlus()
{
var itens = new MyDictionary()
{
{ 1, "aaa" },
{ 2, "bbb" }
};
var enumerator = new EnumeradorPlus<MyKeyValue>(itens.GetEnumerator());
enumerator.MoveNext();
Assert.IsFalse(enumerator.Ended);
enumerator.MoveNext();
Assert.IsFalse(enumerator.Ended);
enumerator.MoveNext();
Assert.IsTrue(enumerator.Ended);
}
}
public class EnumeradorPlus<T> : IEnumerator<T>
{
private IEnumerator<T> _internal;
private bool _hasEnded = false;
public EnumeradorPlus(IEnumerator<T> enumerator)
{
_internal = enumerator;
}
public T Current
{
get { return _internal.Current; }
}
public void Dispose()
{
_internal.Dispose();
}
object System.Collections.IEnumerator.Current
{
get { return _internal.Current; }
}
public bool MoveNext()
{
bool moved = _internal.MoveNext();
if (!moved)
_hasEnded = true;
return moved;
}
public void Reset()
{
_internal.Reset();
_hasEnded = false;
}
public bool Ended
{
get { return _hasEnded; }
}
}
}
Coming from C++ you might not be up to date on C# syntax. Perhaps you could simply use the foreach construct to avoid the test altogether. The following code will be executed once for each element in your dictionary:
foreach (var element in MyTypeDictionary)
{
// More code here
}
