Faster enumeration: Leveraging Array Enumeration


So, I have a class with an array inside. Currently, my strategy for enumerating over the class's items is to use foreach (Item x in classInstance.InsideArray). I would much rather use foreach (Item x in classInstance) and make the array private. My main concern is speed: the array gets hit a lot (and has a couple hundred items), so it is vital that enumerating over it is cheap. One thought was to have the class implement IEnumerable<Item>, but InsideArray.GetEnumerator() only gives me a non-generic enumerator. I also tried implementing the non-generic IEnumerable interface. This worked but was very slow, possibly due to boxing.
Is there a way to make the class itself enumerable without a performance hit?
Normal Code:
//Class
public class Foo {
//Stuff
public Item[,] InsideArray {get; private set;}
}
//Iteration. Shows up all over the place
foreach (Item x in classInstance.InsideArray)
{
//doStuff
}
Adjusted, much slower code:
//Class
public class Foo : IEnumerable {
//Stuff
private Item[,] InsideArray;
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return InsideArray.GetEnumerator();
}
}
//Iteration. Shows up all over the place
foreach (Item x in classInstance)
{
//doStuff
}
Note: Adding an implementation for the non-generic iterator is possible and faster than my slow solution, but it is still a bit worse than just using the array directly. I was hoping there was a way to somehow tell C#, "hey, when I ask you to iterate over this object, iterate over its array, just as fast," but apparently that is not quite possible, at least from the answers suggested thus far.

A bespoke iterator might make it quicker (edited to return as known type):
Basic: 2468ms - -2049509440
Bespoke: 1087ms - -2049509440
(you would use the ArrayIterator directly as Foo's GetEnumerator - essentially copying the code from ArrayEnumerator.GetEnumerator; my point is to show that a typed iterator is faster than the interface)
With code:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
class Foo
{
public struct ArrayIterator<T> : IEnumerator<T>
{
private int x, y;
private readonly int width, height;
private T[,] data;
public ArrayIterator(T[,] data)
{
this.data = data;
this.width = data.GetLength(0);
this.height = data.GetLength(1);
x = -1; // start before the first element so the first MoveNext() lands on [0, 0]
y = 0;
}
public void Dispose() { data = null; }
public bool MoveNext()
{
if (++x >= width)
{
x = 0;
y++;
}
return y < height;
}
public void Reset() { x = -1; y = 0; }
public T Current { get { return data[x, y]; } }
object IEnumerator.Current { get { return data[x, y]; } }
}
public sealed class ArrayEnumerator<T> : IEnumerable<T>
{
private readonly T[,] arr;
public ArrayEnumerator(T[,] arr) { this.arr = arr; }
public ArrayIterator<T> GetEnumerator()
{
return new ArrayIterator<T>(arr);
}
System.Collections.Generic.IEnumerator<T> System.Collections.Generic.IEnumerable<T>.GetEnumerator()
{
return GetEnumerator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
public int[,] data;
public IEnumerable<int> Basic()
{
foreach (int i in data) yield return i;
}
public ArrayEnumerator<int> Bespoke()
{
return new ArrayEnumerator<int>(data);
}
public Foo()
{
data = new int[500, 500];
for (int x = 0; x < 500; x++)
for (int y = 0; y < 500; y++)
{
data[x, y] = x + y;
}
}
static void Main()
{
Test(1); // for JIT
Test(500); // for real
Console.ReadKey(); // pause
}
static void Test(int count)
{
Foo foo = new Foo();
int chk;
Stopwatch watch = Stopwatch.StartNew();
chk = 0;
for (int i = 0; i < count; i++)
{
foreach (int j in foo.Basic())
{
chk += j;
}
}
watch.Stop();
Console.WriteLine("Basic: " + watch.ElapsedMilliseconds + "ms - " + chk);
watch = Stopwatch.StartNew();
chk = 0;
for (int i = 0; i < count; i++)
{
foreach (int j in foo.Bespoke())
{
chk += j;
}
}
watch.Stop();
Console.WriteLine("Bespoke: " + watch.ElapsedMilliseconds + "ms - " + chk);
}
}
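As a rough sketch of the parenthetical above (using the typed iterator directly as the class's own GetEnumerator), the class below keeps its array private and lets callers write foreach (var x in instance). The names Grid and Enumerator are hypothetical and not part of the original answer. It relies on foreach binding to a public GetEnumerator() by pattern, so the struct enumerator is used without going through the interface. Unlike ArrayIterator above, it walks the array in the same row-major order as foreach over a T[,].

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical names (Grid, Enumerator); a sketch, not the original answer's code.
public sealed class Grid<T> : IEnumerable<T>
{
    private readonly T[,] items; // the array stays private

    public Grid(T[,] items)
    {
        if (items == null) throw new ArgumentNullException("items");
        this.items = items;
    }

    // foreach binds to this public method by pattern, so the struct is not boxed.
    public Enumerator GetEnumerator() { return new Enumerator(items); }

    // Interface versions box the struct; they exist for LINQ and other interface callers.
    IEnumerator<T> IEnumerable<T>.GetEnumerator() { return GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public struct Enumerator : IEnumerator<T>
    {
        private readonly T[,] data;
        private readonly int rows, cols;
        private int row, col;

        internal Enumerator(T[,] data)
        {
            this.data = data;
            rows = data.GetLength(0);
            cols = data.GetLength(1);
            row = 0;
            col = -1; // positioned before the first element
        }

        public bool MoveNext()
        {
            if (cols == 0) return false;           // no elements at all
            if (++col >= cols) { col = 0; row++; } // wrap to the next row
            return row < rows;
        }

        public T Current { get { return data[row, col]; } }
        object IEnumerator.Current { get { return Current; } }
        public void Reset() { row = 0; col = -1; }
        public void Dispose() { }
    }
}
```

Whether this gets close enough to iterating the raw array is something to measure in your own build; the sketch only shows the shape.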

Cast your array to IEnumerable<item> before calling GetEnumerator() and you'll get the generic IEnumerator. For example:
string[] names = { "Jon", "Marc" };
IEnumerator<string> enumerator = ((IEnumerable<string>)names).GetEnumerator();
It may well still be a bit slower than enumerating the array directly with foreach (which the C# compiler does in a different way) but at least you won't have anything else in the way.
EDIT:
Okay, you said your other attempt used an indexer. You could try this approach, although I don't think it'll be any faster:
public IEnumerable<Item> Items
{
get
{
foreach (Item x in items)
{
yield return x;
}
}
}
An alternative would be to try to avoid using a two-dimensional array to start with. Is that an absolute requirement? How often are you iterating over a single array after creating it? It may be worth taking a slight hit at creation time to make iteration cheaper.
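To illustrate the "avoid the two-dimensional array" idea, here is a minimal sketch that stores the grid in a single one-dimensional array and maps (x, y) to an index. FlatGrid and Cells are hypothetical names. The assumption is that iteration speed matters more than hiding the storage: the C# compiler turns foreach over a one-dimensional array into a plain index loop, but exposing the array means callers can also write to it.

```csharp
using System;

// Hypothetical FlatGrid type: a 2D view stored in a single 1D array.
public sealed class FlatGrid<T>
{
    private readonly T[] cells;
    private readonly int width;

    public FlatGrid(int width, int height)
    {
        if (width < 0 || height < 0) throw new ArgumentOutOfRangeException();
        this.width = width;
        cells = new T[width * height];
    }

    // Row-major mapping. Note: x is not range-checked against width here,
    // so an out-of-range x can silently address a cell in another row.
    public T this[int x, int y]
    {
        get { return cells[y * width + x]; }
        set { cells[y * width + x] = value; }
    }

    // Exposed so callers can write: foreach (T cell in grid.Cells) { ... }
    public T[] Cells { get { return cells; } }
}
```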
EDIT: Another suggestion, which is slightly off the wall... instead of passing the iterator back to the caller, why not get the caller to say what to do with each item, using a delegate?
public void ForEachItem(Action<Item> action)
{
foreach (Item item in items)
{
action(item);
}
}
Downsides:
You incur the penalty of a delegate call on each access.
It's hard to break out of the loop (other than by throwing an exception). There are different ways of approaching this, but let's cross that bridge when we come to it.
Developers who aren't familiar with delegates may get a bit confused.
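One possible way around the "hard to break out of the loop" downside, sketched under the assumption that callers are happy to return a flag: take a Func<T, bool> instead of an Action<T> and stop when it returns false. ItemGrid is a hypothetical stand-in for the asker's class.

```csharp
using System;

// Hypothetical ItemGrid type standing in for the asker's class.
public sealed class ItemGrid<T>
{
    private readonly T[,] items;

    public ItemGrid(T[,] items)
    {
        if (items == null) throw new ArgumentNullException("items");
        this.items = items;
    }

    // Calls visit for each item and stops early when visit returns false.
    // Returns true only if every item was visited.
    public bool ForEachItem(Func<T, bool> visit)
    {
        if (visit == null) throw new ArgumentNullException("visit");
        foreach (T item in items)
        {
            if (!visit(item)) return false;
        }
        return true;
    }
}
```

The per-item delegate call is still there, so this addresses only the early-exit problem, not the cost.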

How about adding an indexer to the class:
public MyInsideArrayType this[int index]
{
get { return this.insideArray[index]; }
}
And if you REALLY need foreach capabilities:
public IEnumerator<MyInsideArrayType> GetEnumerator()
{
for(int i = 0; i < this.insideArray.Length; i++)
{
yield return this[i];
}
}

All forms of iteration are cheap. If anyone in this day-and-age managed to somehow write and publish an expensive iterator they would be (rightly) burned at the stake.
Premature optimization is evil.
Cheers. Keith.

Related

Eliminating visitor pattern in C#

I have code that looks like this:
var visitor = new ImplementsVisitor();
for(int i = 0; i < array.Length; ++i)
array[i].Accept(visitor);
Each element in the array implements IItem interface, which has an Accept(IVisitor) method. Nothing but the standard visitor pattern.
While measuring performance, I came to the conclusion that the call by interface is too slow, and in this code performance is critical. From your experience, what would be the best option for eliminating any virtual or interface calls? An if statement that checks for the concrete type? An enum on each element with a switch/case (in this case, the code's structure is such that no cast will be required)? Something else?
P.S. I cannot sort the items in the array. The order is important. Thus, I cannot sort them by concrete type to help branch prediction.

I created the following program. On my laptop the loop runs a million times in 8ms (that's a Release build, Debug is 11ms or so). That is approximately 0.000008ms to do the virtual dispatch and increment an int. Exactly how fast do you need it to be? I'd suspect that something has gone wrong with either your performance test or mine. If mine I'd be interested to hear suggestions for improvement.
Generally if performance at this level isn't good enough then using C# is probably a problem in itself. Its garbage collector has a habit of freezing threads in the middle of loops for example. If 0.000008ms on a loop iteration really is an issue, I'd suspect Assembly language or C would be a better choice.
using System;
using System.Collections.Generic;
using System.Diagnostics;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
const int count = 1000000;
IList<IItem> items = new List<IItem>(count);
var rnd = new Random(); // one shared instance; a new Random per iteration can repeat the same seed
for (int i = 0; i < count; i++)
{
if (rnd.NextDouble() > 0.5)
{
items.Add(new ClassA());
}
else
{
items.Add(new ClassB());
}
}
var visitor = new MyVisitor();
Stopwatch s = Stopwatch.StartNew();
for (int i = 0; i < items.Count; i++)
{
items[i].Accept(visitor);
}
s.Stop();
Console.WriteLine("ExecTime = {0}, Per Cycle = {1}", s.ElapsedMilliseconds, (double)s.ElapsedMilliseconds / count);
visitor.Output();
}
interface IVisitor
{
void Process(ClassA item);
void Process(ClassB item);
}
interface IItem
{
void Accept(IVisitor visitor);
}
abstract class BaseVisitor : IVisitor
{
public virtual void Process(ClassA item)
{
}
public virtual void Process(ClassB item)
{
}
}
class ClassA : IItem
{
public void Accept(IVisitor visitor)
{
visitor.Process(this);
}
}
class ClassB : IItem
{
public void Accept(IVisitor visitor)
{
visitor.Process(this);
}
}
class MyVisitor : BaseVisitor
{
int a = 0;
int b = 0;
public override void Process(ClassA item)
{
a++;
}
public override void Process(ClassB item)
{
b++;
}
public void Output()
{
Console.WriteLine("a = {0}, b = {1}", a, b);
}
}
}
}

You don't have one virtual call here, you have two, but you only need one. First your array presumably has a virtual call through IItem - but if these are all the same type, and you know the type (and it is sealed) a virtual call is unnecessary.
Then within the visited object, you need to do whatever operation the visitor wants to do. This will probably also involve a virtual call.
You might do better with a typed IVisitor:
interface IItem<TVisitor> : IItem
where TVisitor : IVisitor
{
void Accept(TVisitor visitor);
}
// Then
SpecialVisitor visitor = new SpecialVisitor();
foreach(var item in arrayOfSpecialItems){
item.Accept(visitor); // the interface is generic, not the method, so no type argument here
}
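For comparison, the enum-plus-switch option raised in the question could look roughly like the sketch below. All names here (ItemKind, Item, ItemA, ItemB, TagDispatch) are hypothetical, and whether it actually beats interface dispatch in your code is something only a measurement can tell.

```csharp
using System;

public enum ItemKind { A, B }

// Each item carries a tag set once at construction; no virtual call is needed to read it.
public abstract class Item
{
    public readonly ItemKind Kind;
    protected Item(ItemKind kind) { Kind = kind; }
}

public sealed class ItemA : Item { public ItemA() : base(ItemKind.A) { } }
public sealed class ItemB : Item { public ItemB() : base(ItemKind.B) { } }

public static class TagDispatch
{
    // Counts items of each kind by switching on the tag, preserving array order.
    public static void Count(Item[] items, out int a, out int b)
    {
        a = 0;
        b = 0;
        for (int i = 0; i < items.Length; i++)
        {
            switch (items[i].Kind)
            {
                case ItemKind.A: a++; break;
                case ItemKind.B: b++; break;
                default: throw new InvalidOperationException("Unknown item kind.");
            }
        }
    }
}
```

The cost of this design is that every new item type means touching the enum and every switch, which is exactly what the visitor pattern was meant to avoid.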

With Statement in C# like that of AS3/GML

I'm making a game using Monogame, and I've been trying to figure out how to implement a function that acts similarly to AS3's and GML's with statement.
So far I have a system that works, but not entirely the way I want it to. I store my GameObjects in a Dictionary of Lists. This is so I can get to the specific type of object I want to access without having to loop through a list of ALL objects. The key used is the name of the type.
public static Dictionary<string, List<GameObject>> All =
new Dictionary<string, List<GameObject>>();
I access all of a specific type of object using AllOf. If a List containing that type exists in the Dictionary, it returns that List, else it returns an empty list.
public static List<GameObject> AllOf(Type type)
{
string key = type.Name;
if(All.ContainsKey(key))
{
return All[key];
}
return new List<GameObject>();
}
An example of how these are implemented
public override void Update(GameTime gameTime)
{
List<GameObject> list = Instance.AllOf(typeof(Dummy));
for(int i = 0; i < list.Count; i++)
{
list[i].Update(gameTime);
list[i].foo += bar;
}
}
But I'd rather use something similar to the AS3/GML with statement, which would also allow other, non-member code to be executed.
with(typeof(Dummy))
{
Update(gameTime);
foo += bar;
int fooBar = 2;
someObject.someMemberFunction(fooBar);
}
Is there a way to accomplish this? My end goal is just to make my code look a little cleaner, and make it easier to make a lot of changes without having to type out a for loop each time.

No such syntax exists in C#, but you can access methods within the for that have nothing to do with the collection:
public override void Update(GameTime gameTime)
{
List<GameObject> list = Instance.AllOf(typeof(Dummy));
for(int i = 0; i < list.Count; i++)
{
list[i].Update(gameTime);
list[i].foo += bar;
int fooBar = 2;
someObject.someMemberFunction(fooBar);
}
}
Note that you can also use foreach, which is a little cleaner if you don't need the indexer:
foreach(var item in list)
{
item.Update(gameTime);
item.foo += bar;
int fooBar = 2;
someObject.someMemberFunction(fooBar);
}

try
using(Object myObject = new Object()){
}
I think this might be what you're looking for?

I have a small solution for this use case. This may be a bit of a necropost, but it is a pretty neat solution. Additionally, I think all of the C# features that are required existed back when this question was asked.
You can do something very similar to the GML with(x){} by using some form of delegate as a parameter to a static method, and passing a lambda as that parameter. The function can even be made generic, and you can call it without the class name thanks to the using static directive. You will need to explicitly provide the typed, named lambda parameter, but it is possible. You would need to hook it up to your own types, but the general idea is:
using System;
using System.Collections.Generic;
namespace NiftyStuff {
public static class With {
public static void with<T>(Action<T> proc) where T : GameObj {
var typeName = typeof(T).Name;
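var items = GameObj.AllOf(typeName);
if (items == null) return; // AllOf returns null when nothing of this type is registered
foreach (var item in items) { proc((T)item); }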
}
}
public class GameObj {
private static Dictionary<string, List<GameObj>> All = new Dictionary<string, List<GameObj>>();
public static List<GameObj> AllOf(string name) {
return All.ContainsKey(name) ? All[name] : null;
}
public static void Add(GameObj foo) {
string typeName = foo.GetType().Name;
List<GameObj> foos = All.ContainsKey(typeName) ? All[typeName] : (All[typeName] = new List<GameObj>());
foos.Add(foo);
}
public float x, y, angle;
public GameObj() { x = y = angle = 0; }
public void Destroy() { AllOf(GetType().Name)?.Remove(this); }
}
public class Enemy : GameObj {
public float maxHealth, curHealth;
public Enemy() : base() { maxHealth = curHealth = 300; }
public Enemy(float health) : base() { maxHealth = curHealth = health; }
public bool Damage(float amt) {
if (curHealth > 0) {
curHealth -= amt;
return curHealth <= 0;
}
return false;
}
}
public class Pumpkin : GameObj {
public bool exists = false;
public Pumpkin() : base() { exists = true; }
public bool LookAt() { return (exists = !exists); }
}
}
Actually using the above code would work as follows:
using NiftyStuff;
using static NiftyStuff.With;
//...
with ((Enemy e) => {
if (e.Damage(50)) {
Log("Made a kill!"); // Whatever log function you have...
}
});
with ((Pumpkin p) => {
if (p.LookAt()) {
Log("You see the pumpkin");
} else {
Log("You no longer see the pumpkin");
}
});
While not exactly like GML's with statement, it would at least let you run code against all of the registered objects of some type.
One important note is that you can't destroy objects inside of a with this way (due to concurrent modification of a collection while iterating it). You would need to collect all objects to be destroyed, and then remove them from the list in All, typically in a game loop this is done at the end of a frame.
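A minimal sketch of that deferred-removal idea, assuming a single-threaded game loop: Destroy only queues the object, and a separate Flush call removes everything queued once iteration is over. Registry, With, Destroy and Flush are hypothetical names, standing in for the per-type lists held in All.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Registry<T>; stands in for one of the per-type lists in All.
public sealed class Registry<T>
{
    private readonly List<T> live = new List<T>();
    private readonly List<T> pendingRemoval = new List<T>();

    public int Count { get { return live.Count; } }

    public void Add(T item) { live.Add(item); }

    // Safe to call from inside With: it only queues the removal.
    public void Destroy(T item) { pendingRemoval.Add(item); }

    public void With(Action<T> proc)
    {
        foreach (T item in live) proc(item);
    }

    // Call once per frame, after all With calls have finished.
    public void Flush()
    {
        foreach (T item in pendingRemoval) live.Remove(item);
        pendingRemoval.Clear();
    }
}
```

Note that an object queued for destruction is still visited by later With calls in the same frame; if that matters, the object would need its own "destroyed" flag.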
Hope this helps, despite being 2 years out of date.

Access array inside a constructor

This is for homework.
I have googled this and searched within stackoverflow, but I can not seem to find the answer. Perhaps my terminology is incorrect.
I am learning TDD for a class and my C# skills are rusty and limited.
I am trying to write a stack class. When I try to initiate an array inside the constructor, the methods cannot access it.
I'm sure it is something simple that I am missing.
Here is the code I have tried so far:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace tdd_programmingTest
{
class Stack
{
int index = 0;
public Stack()
{
int[] items;
}
public void Push(int p)
{
items[index] = p;
index++;
}
public int Pop()
{
index--;
return items[index];
}
internal int IndexState()
{
return index;
}
}
}
I'm not looking for someone to write the code for me, just point me in the right direction. Thank you.

What you have here is a local variable:
public Stack()
{
int[] items;
}
It exists only inside the Stack() constructor, and only for the lifetime of its execution.
You need to declare items as a field (member variable):
class Stack
{
private int index = 0;
private int[] items; // <-- move it here, and mark it private
public Stack()
{
}
// ...
}
But you have bigger problems. This is just a reference to an array which you haven't created yet.
So, you need to instantiate an array:
items = new int[SIZE];
...but what size will you use? Once you create the array, it cannot grow. When you run out of space, you'll have to allocate a larger array and copy the contents over. This automatic expansion is how many ADTs work under the hood.
Speaking of running out of space, you'd better pay attention to your array's bounds in Push() and Pop()!
EDIT: So you need to specify a size. Just add a parameter to the constructor.
class Stack
{
private int index = 0;
private int[] items;
public Stack(int initialSize)
{
items = new int[initialSize];
}
public Stack() : this(100) // constructor chaining in C# uses "this", not the class name
{
}
}
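The two points raised earlier in this answer, growing the array when it fills up and guarding the bounds in Push() and Pop(), could be sketched as follows. This is one possible shape rather than the expected homework solution; IntStack, the default capacity of 4 and the doubling policy are all assumptions.

```csharp
using System;

// Sketch only: IntStack is a hypothetical name, and doubling is just one growth policy.
public class IntStack
{
    private int[] items;
    private int count;

    public IntStack(int initialCapacity)
    {
        if (initialCapacity < 0) throw new ArgumentOutOfRangeException("initialCapacity");
        items = new int[initialCapacity];
    }

    public IntStack() : this(4) { }

    public int Count { get { return count; } }

    public void Push(int value)
    {
        if (count == items.Length)
        {
            // Full: allocate a larger array and copy the old contents across.
            int newCapacity = items.Length == 0 ? 4 : items.Length * 2;
            Array.Resize(ref items, newCapacity);
        }
        items[count++] = value;
    }

    public int Pop()
    {
        // Guard the lower bound instead of letting the index go negative.
        if (count == 0) throw new InvalidOperationException("Stack is empty.");
        return items[--count];
    }
}
```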

Put int[] items; outside of the constructor and add a size parameter to the constructor to specify the size of items:
class Stack
{
int index = 0;
int[] items = new int[0];
public Stack(int size)
{
items = new int[size]; // initiate items with size
}
public void Push(int p)
{
items[index] = p;
index++;
}
public int Pop()
{
index--;
return items[index];
}
internal int IndexState()
{
return index;
}
}

C#: How can I make an IEnumerable<T> thread safe?

Say I have this simple method:
public IEnumerable<uint> GetNumbers()
{
uint n = 0;
while(n < 100)
yield return n++;
}
How would you make this thread safe? And by that I mean that you would get that enumerator once, and have multiple threads handle all the numbers without anyone getting duplicates.
I suppose a lock needs to be used somewhere, but where must that lock be for an iterator block to be thread safe? What, in general, do you need to remember if you want a thread safe IEnumerable<T>? Or rather I guess it would be a thread safe IEnumerator<T>...?

There's an inherent problem in doing so, because IEnumerator<T> has both MoveNext() and Current. You really want a single call such as:
bool TryMoveNext(out T value)
at that point you can atomically move to the next element and get a value. Implementing that and still being able to use yield could be tricky... I'll have a think about it though. I think you'd need to wrap the "non-threadsafe" iterator in a thread-safe one which atomically performed MoveNext() and Current to implement the interface shown above. I don't know how you'd then wrap this interface back into IEnumerator<T> so that you could use it in foreach though...
If you're using .NET 4.0, Parallel Extensions may be able to help you - you'd need to explain more about what you're trying to do though.
This is an interesting topic - I may have to blog about it...
EDIT: I've now blogged about it with two approaches.
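The wrapper idea described above, performing MoveNext() and Current atomically behind a single TryMoveNext call, might look like this minimal sketch. SynchronizedSource is a hypothetical name, and this is not the code from the blog post mentioned; it simply takes one lock around both steps.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper; the name and shape are assumptions, not a framework type.
public sealed class SynchronizedSource<T> : IDisposable
{
    private readonly object gate = new object();
    private readonly IEnumerator<T> inner;

    public SynchronizedSource(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        inner = source.GetEnumerator();
    }

    // Advances and reads under one lock, so no two callers receive the same element.
    public bool TryMoveNext(out T value)
    {
        lock (gate)
        {
            if (inner.MoveNext())
            {
                value = inner.Current;
                return true;
            }
        }
        value = default(T);
        return false;
    }

    public void Dispose()
    {
        lock (gate) { inner.Dispose(); }
    }
}
```

Each worker thread would loop on TryMoveNext until it returns false. The underlying iterator body still runs while the lock is held, so a slow iterator serialises all consumers.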

I just tested this bit of code:
static IEnumerable<int> getNums()
{
Console.WriteLine("IENUM - ENTER");
for (int i = 0; i < 10; i++)
{
Console.WriteLine(i);
yield return i;
}
Console.WriteLine("IENUM - EXIT");
}
static IEnumerable<int> getNums2()
{
try
{
Console.WriteLine("IENUM - ENTER");
for (int i = 0; i < 10; i++)
{
Console.WriteLine(i);
yield return i;
}
}
finally
{
Console.WriteLine("IENUM - EXIT");
}
}
getNums2() always runs the finally block, provided the enumerator runs to completion or is disposed (foreach disposes it for you). If you want your IEnumerable to be thread safe, add whatever thread locks you want instead of the WriteLine calls, whether using ReaderWriterLockSlim, Semaphore, Monitor, etc.

Well, I'm not sure, but maybe with some locks in the caller?
Draft:
Monitor.Enter(syncRoot);
foreach (var item in enumerable)
{
Monitor.Exit(syncRoot);
//Do something with item
Monitor.Enter(syncRoot);
}
Monitor.Exit(syncRoot);

I was thinking that you can't make the yield keyword thread-safe, unless you make it depend on an already thread-safe source of values:
public interface IThreadSafeEnumerator<T>
{
void Reset();
bool TryMoveNext(out T value);
}
public class ThreadSafeUIntEnumerator : IThreadSafeEnumerator<uint>, IEnumerable<uint>
{
readonly object sync = new object();
uint n;
#region IThreadSafeEnumerator<uint> Members
public void Reset()
{
lock (sync)
{
n = 0;
}
}
public bool TryMoveNext(out uint value)
{
bool success = false;
lock (sync)
{
if (n < 100)
{
value = n++;
success = true;
}
else
{
value = uint.MaxValue;
}
}
return success;
}
#endregion
#region IEnumerable<uint> Members
public IEnumerator<uint> GetEnumerator()
{
//Reset(); // depends on what behaviour you want
uint value;
while (TryMoveNext(out value))
{
yield return value;
}
}
#endregion
#region IEnumerable Members
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
//Reset(); // depends on what behaviour you want
uint value;
while (TryMoveNext(out value))
{
yield return value;
}
}
#endregion
}
You will have to decide whether each typical initiation of an enumerator should reset the sequence, or if the client code must do that.

You could just return a complete sequence each time rather than use yield:
return Enumerable.Range(0, 100).Select(i => (uint)i).ToArray(); // Cast<uint>() would throw InvalidCastException on boxed ints

Any chances to imitate times() Ruby method in C#?

Every time I need to do something N times inside an algorithm using C# I write this code
for (int i = 0; i < N; i++)
{
...
}
Studying Ruby I have learned about method times() which can be used with the same semantics like this
N.times do
...
end
The C# fragment looks more complex, and we have to declare a useless variable i.
I tried to write an extension method which returns IEnumerable, but I am not satisfied with the result because I still have to declare a loop variable i.
public static class IntExtender
{
public static IEnumerable Times(this int times)
{
for (int i = 0; i < times; i++)
yield return true;
}
}
...
foreach (var i in 5.Times())
{
...
}
Is it possible using some new C# 3.0 language features to make N times cycle more elegant?

A slightly briefer version of cvk's answer:
public static class Extensions
{
public static void Times(this int count, Action action)
{
for (int i=0; i < count; i++)
{
action();
}
}
public static void Times(this int count, Action<int> action)
{
for (int i=0; i < count; i++)
{
action(i);
}
}
}
Use:
5.Times(() => Console.WriteLine("Hi"));
5.Times(i => Console.WriteLine("Index: {0}", i));

It is indeed possible with C# 3.0:
public interface ILoopIterator
{
void Do(Action action);
void Do(Action<int> action);
}
private class LoopIterator : ILoopIterator
{
private readonly int _start, _end;
public LoopIterator(int count)
{
_start = 0;
_end = count - 1;
}
public LoopIterator(int start, int end)
{
_start = start;
_end = end;
}
public void Do(Action action)
{
for (int i = _start; i <= _end; i++)
{
action();
}
}
public void Do(Action<int> action)
{
for (int i = _start; i <= _end; i++)
{
action(i);
}
}
}
public static ILoopIterator Times(this int count)
{
return new LoopIterator(count);
}
Usage:
int sum = 0;
5.Times().Do( i =>
sum += i
);
Shamelessly stolen from http://grabbagoft.blogspot.com/2007/10/ruby-style-loops-in-c-30.html

If you are using .NET 3.5 then you can use the extension method Each proposed in this article, and use it to avoid a classic loop.
public static class IEnumerableExtensions
{
public static void Each<T>(
this IEnumerable<T> source,
Action<T> action)
{
foreach(T item in source)
{
action(item);
}
}
}
"This particular extension method spot welds an Each method on anything that implements IEnumerable<T>. You know this because the first parameter to this method defines what this will be inside the method body. Action<T> is a pre-defined class that basically stands in for a function (delegate) returning no value. Inside the method, is where the elements are extracted from the list. What this method enables is for me to cleanly apply a function in one line of code."
(http://www.codeproject.com/KB/linq/linq-to-life.aspx)
Hope this helps.

I wrote my own extension that adds Times to Integer (plus some other stuff). You can get the code here: https://github.com/Razorclaw/Ext.NET
The code is very similar to Jon Skeet's answer:
public static class IntegerExtension
{
public static void Times(this int n, Action<int> action)
{
if (action == null) throw new ArgumentNullException("action");
for (int i = 0; i < n; ++i)
{
action(i);
}
}
}
