Cached arrays for subcomputation results in Parallel.For/Foreach - c#

I have a method that calculates some arrays in each iteration, then sums them (adding each array index-wise) and returns the array of sums.
I'm looking for a performant parallel method of doing so. In essence I want to avoid creating arrays constantly and reuse them instead.
Example:
private void Computation(float[] result)
{
var _lock = new object();
Parallel.For(0, 50, x =>
{
var subResult = new float[200*1000];
CalculateSubResult(subResult);
lock(_lock)
for (int i = 0; i < subResult.Length; i++)
result[i] += subResult[i];
});
}
The problem is that these Parallel.For's are nested (because CalculateSubResult also does similar loop). This leads to constant creating and discarding arrays which are used to hold subcomputation results.
To Get around this, im using ThreadLocal to store cached arrays:
ThreadLocal<float[]> threadLocalArray = new ThreadLocal<float[]>(()=>new float[200*1000]);
private void ComputationWithThreadLocal(float[] result)
{
var _lock = new object();
Parallel.For(0, 50, x =>
{
var subResult = threadLocalArray.Value;
CalculateSubResult(subResult);
lock (_lock)
for (int i = 0; i < subResult.Length; i++)
result[i] += subResult[i];
Array.Clear(subResult,0,subResult.Length);
});
}
There are two problems with this approach:
Over time (and this is a part of a long running computation) threadLocalArray stores more and more arrays, even if there are only few logical cores, which leads to a memory leak. I cannot call threadLocalArray.Dispose() until the computation is finished, so there is no way to prevent it from growing.
Summing subcomputation results requires lock and is wasteful. I would like to have a local per-thread sum, and only after thread finishes its partitioned work, the summation should occur.
There is an overload method which deals with 2.
public static ParallelLoopResult For<TLocal>(
int fromInclusive,
int toExclusive,
Func<TLocal> localInit,
Func<int, ParallelLoopState, TLocal, TLocal> body,
Action<TLocal> localFinally
)
The Func<TLocal> localInit would just init a subResult array like so: ()=>new float[200*1000]. In that case the Action<TLocal> localFinally executes only one per partition, which is more efficient. But the problem here is that localInit cannot use ()=>threadLocalArray.Value to get thread local array, because if Parallel.For loops are recursively nested then the same array is used for subcomputation at diffrent nest levels, leading to incorrect results.
I think I'm looking for something like this (CachedArrayProvider is a class that needs to manage reusing arrays):
Parallel.For(0,50,
()=>CachedArrayProvider.GetArray(),
(i, localArray) => {/*body*/},
(localArray) =>
{
//do final sum here
CachedArrayProvider.ReturnArray(localArray);
}
);
Any ideas?

Related

Consuming an IEnumerable multiple times in one pass

Is it possible to write a higher-order function that causes an IEnumerable to be consumed multiple times but in only one pass and without reading all the data into memory? [See Edit below for a clarification of what I'm looking for.]
For example, in the code below the enumerable is mynums (onto which I've tagged a .Trace() in order to see how many times we enumerate it). The goal is figure out if it has any numbers greater than 5, as well as the sum of all of the numbers. A function which processes an enumerable twice is Both_TwoPass, but it enumerates it twice. In contrast Both_NonStream only enumerates it once, but at the expense of reading it into memory. In principle it is possible carry out both of these tasks in a single pass and in a streaming fashion as shown by Any5Sum, but that is specific solution. Is it possible to write a function with the same signature as Both_* but that is the best of both worlds?
(It seems to me that this should be possible using threads. Is there a better solution using, say, async?)
Edit
Below is a clarification regarding what I'm looking for. What I've done is included a very down-to-earth description of each property in square brackets.
I'm looking for a function Both with the following characteristics:
It has signature (S1, S2) Both<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1>, Func<IEnumerable<T>, S2>) (and produces the "right" output!)
It only iterates the first argument, tt, once. [What I mean by this is that when passed mynums (as defined below) it only outputs mynums: 0 1 2 ... once. This precludes function Both_TwoPass.]
It processes the data from the first argument, tt, in a streaming fashion. [What I mean by this is that, for example, there is insufficient memory to store all the items from tt in memory simultaneously, thus precluding function Both_NonStream.]
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp
{
static class Extensions
{
public static IEnumerable<T> Trace<T>(this IEnumerable<T> tt, string msg = "")
{
Console.Write(msg);
try
{
foreach (T t in tt)
{
Console.Write(" {0}", t);
yield return t;
}
}
finally
{
Console.WriteLine('.');
}
}
public static (S1, S2) Both_TwoPass<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1> f1, Func<IEnumerable<T>, S2> f2)
{
return (f1(tt), f2(tt));
}
public static (S1, S2) Both_NonStream<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1> f1, Func<IEnumerable<T>, S2> f2)
{
var tt2 = tt.ToList();
return (f1(tt2), f2(tt2));
}
public static (bool, int) Any5Sum(this IEnumerable<int> ii)
{
int sum = 0;
bool any5 = false;
foreach (int i in ii)
{
sum += i;
any5 |= i > 5; // or: if (!any5) any5 = i > 5;
}
return (any5, sum);
}
}
class Program
{
static void Main()
{
var mynums = Enumerable.Range(0, 10).Trace("mynums:");
Console.WriteLine("TwoPass: (any > 5, sum) = {0}", mynums.Both_TwoPass(tt => tt.Any(k => k > 5), tt => tt.Sum()));
Console.WriteLine("NonStream: (any > 5, sum) = {0}", mynums.Both_NonStream(tt => tt.Any(k => k > 5), tt => tt.Sum()));
Console.WriteLine("Manual: (any > 5, sum) = {0}", mynums.Any5Sum());
}
}
}
The way you've written your computation model (i.e. return (f1(tt), f2(tt))) there is no way to avoid multiple iterations of your enumerable. You're basically saying compute Item1 then compute Item2.
You have to either change the model from (Func<IEnumerable<T>, S1>, Func<IEnumerable<T>, S2>) to (Func<T, S1>, Func<T, S2>) or to Func<IEnumerable<T>, (S1, S2)> to be able to run the computations in parallel.
You implementation of Any5Sum is basically the second approach (Func<IEnumerable<T>, (S1, S2)>). But there's already a built-in method for that.
Try this:
Console.WriteLine("Aggregate: (any > 5, sum) = {0}",
mynums
.Aggregate<int, (bool any5, int sum)>(
(false, 0),
(a, x) => (a.any5 | x > 5, a.sum + x)));
I think you and I are describing the same thing in the comments. There is no need to create such a "special-purpose IEnumerable", though, because the BlockingCollection<> class already exists for such producer-consumer scenarios. You'd use it as follows...
Create a BlockingCollection<> for each consuming function (i.e. tt1 and tt2).
By default, a BlockingCollection<> wraps a ConcurrentQueue<>, so the elements will arrive in FIFO order.
To satisfy your requirement that only one element be held in memory at a time, 1 will be specified for the bounded capacity. Note that this capacity is per collection, so with two collections there will be up to two queued elements at any given moment.
Each collection will hold the input elements for that consumer.
Create a thread/task for each consuming function.
The thread/task will simply call GetConsumingEnumerator() for its input collection, pass the resulting IEnumerable<> to its consuming function, and return that result.
GetConsumingEnumerable() does just as its name implies: it creates an IEnumerable<> that consumes (removes) elements from the collection. If the collection is empty, enumeration will block until an element is added. CompleteAdding() is called once the producer is finished, which allows the consuming enumerator to exit when the collection empties.
The producer enumerates the IEnumerable<>, tt, and adds each element to both collections. This is the only time that tt is enumerated.
BlockingCollection<>.Add() will block if the collection has reached its capacity, preventing the entirety of tt from being buffered in-memory.
Once tt has been fully enumerated, CompleteAdding() is called on each collection.
Once each consumer thread/task has completed, their results are returned.
Here's what that looks like in code...
public static (S1, S2) Both<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1> tt1, Func<IEnumerable<T>, S2> tt2)
{
const int MaxQueuedElementsPerCollection = 1;
using (BlockingCollection<T> collection1 = new BlockingCollection<T>(MaxQueuedElementsPerCollection))
using (Task<S1> task1 = StartConsumerTask(collection1, tt1))
using (BlockingCollection<T> collection2 = new BlockingCollection<T>(MaxQueuedElementsPerCollection))
using (Task<S2> task2 = StartConsumerTask(collection2, tt2))
{
foreach (T element in tt)
{
collection1.Add(element);
collection2.Add(element);
}
// Inform any enumerators created by .GetConsumingEnumerable()
// that there will be no more elements added.
collection1.CompleteAdding();
collection2.CompleteAdding();
// Accessing the Result property blocks until the Task<> is complete.
return (task1.Result, task2.Result);
}
Task<S> StartConsumerTask<S>(BlockingCollection<T> collection, Func<IEnumerable<T>, S> func)
{
return Task.Run(() => func(collection.GetConsumingEnumerable()));
}
}
Note that, for efficiency's sake, you could increase MaxQueuedElementsPerCollection to, say, 10 or 100 so that the consumers don't have to run in lock-step with each other.
There is one problem with this code, though. When a collection is empty the consumer has to wait for the producer to produce an element, and when a collection is full the producer has to wait for the consumer to consume an element. Consider what happens mid-way through the execution of your tt => tt.Any(k => k > 5) lambda...
The producer waits for the collection to be non-full and adds 5.
The consumer waits for the collection to be non-empty and removes 5.
5 > 5 returns false and enumeration continues.
The producer waits for the collection to be non-full and adds 6.
The consumer waits for the collection to be non-empty and removes 6.
6 > 5 returns true and enumeration stops. Any(), the lambda, and the consumer task all return.
The producer waits for the collection to be non-full and adds 7.
The producer waits for the collection to be non-full and...that never happens!
The consumer has already abandoned the enumeration, so it won't consume any elements to make room for the new one. Add() will never return.
The cleanest way I could come up with to prevent this deadlock is to ensure the entire collection gets enumerated even if func doesn't do so. This just requires a simple change to the StartConsumerTask<>() local method...
Task<S> StartConsumerTask<S>(BlockingCollection<T> collection, Func<IEnumerable<T>, S> func)
{
return Task.Run(
() => {
try
{
return func(collection.GetConsumingEnumerable());
}
finally
{
// Prevent BlockingCollection<>.Add() calls from
// deadlocking by ensuring the entire collection gets
// consumed even if func abandoned its enumeration early.
foreach (T element in collection.GetConsumingEnumerable())
{
// Do nothing...
}
}
}
);
}
The downside of this is that tt will always be enumerated to completion, even if both tt1 and tt2 abandon their enumerators early.
With that addressed, this...
static void Main()
{
IEnumerable<int> mynums = Enumerable.Range(0, 10).Trace("mynums:");
Console.WriteLine("Both: (any > 5, sum) = {0}", mynums.Both(tt => tt.Any(k => k > 5), tt => tt.Sum()));
}
...outputs this...
mynums: 0 1 2 3 4 5 6 7 8 9.
Both: (any > 5, sum) = (True, 45)
The core problem here is who is responsible for calling Enumeration.MoveNext() (eg by using a foreach loop). Synchronizing multiple foreach loops across threads would be slow and fiddly to get right.
Implementing IAsyncEnumerable<T>, so that multiple await foreach loops can take turns processing items would be easier. But still silly.
So the simpler solution would be to change the question. Instead of trying to call multiple methods that both try to enumerate the items, change the interface to simply visit each item.
I believe it is possible to satisfy all the requirements of the question, and one more (very natural) requirement, namely that the original enumerable be only enumerated partially if each of the two Func<IEnumerable<T>, S> consume it partially.
(This was discussed by #BACON). The approach is discussed in more detail in my GitHub repo 'CoEnumerable'. The idea is that the Barrier class provides a fairly straightforward approach to implement a proxy IEnumerable which can be consumed by each of the Func<IEnumerable<T>, S>s while the proxy consumes the real IEnumerable just once. In particular, the implementation consumes only as much of the original enumerable is as absolutely necessary (i.e., it satisfies the extra requirement mentioned above).
The proxy is:
class BarrierEnumerable<T> : IEnumerable<T>
{
private Barrier barrier;
private bool moveNext;
private readonly Func<T> src;
public BarrierEnumerable(IEnumerator<T> enumerator)
{
src = () => enumerator.Current;
}
public Barrier Barrier
{
set => barrier = value;
}
public bool MoveNext
{
set => moveNext = value;
}
public IEnumerator<T> GetEnumerator()
{
try
{
while (true)
{
barrier.SignalAndWait();
if (moveNext)
{
yield return src();
}
else
{
yield break;
}
}
}
finally
{
barrier.RemoveParticipant();
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
in terms of which we can combine the two consumers
public static T Combine<S, T1, T2, T>(this IEnumerable<S> source,
Func<IEnumerable<S>, T1> coenumerable1,
Func<IEnumerable<S>, T2> coenumerable2,
Func<T1, T2, T> resultSelector)
{
using var ss = source.GetEnumerator();
var enumerable1 = new BarrierEnumerable<S>(ss);
var enumerable2 = new BarrierEnumerable<S>(ss);
using var barrier = new Barrier(2, _ => enumerable1.MoveNext = enumerable2.MoveNext = ss.MoveNext());
enumerable2.Barrier = enumerable1.Barrier = barrier;
using var t1 = Task.Run(() => coenumerable1(enumerable1));
using var t2 = Task.Run(() => coenumerable2(enumerable2));
return resultSelector(t1.Result, t2.Result);
}
The GitHub repo has a couple of examples of using the above code, and some brief design discussion (including limitations).

When immutable collections are preferable than concurrent

Recently read about immutable collections.
They are recommended to be used as a thread safe for reading, when the read operations are performed more often than write.
Then I want to test read performance ImmutableDictionary vs ConcurrentDictionary. Here is this very simple test (in .NET Core 2.1):
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;
namespace ImmutableSpeedTests
{
class Program
{
public class ConcurrentVsImmutable
{
public int ValuesCount;
public int ThreadsCount;
private ImmutableDictionary<int, int> immutable = ImmutableDictionary<int, int>.Empty;
private ConcurrentDictionary<int, int> concurrent = new ConcurrentDictionary<int, int>();
public ConcurrentVsImmutable(int valuesCount, int threadsCount)
{
ValuesCount = valuesCount;
ThreadsCount = threadsCount;
}
public void Setup()
{
// fill both collections. I don't measure time cause immutable is filling much slower obviously.
for (var i = 0; i < ValuesCount; i++)
{
concurrent[i] = i;
immutable = immutable.Add(i, i);
}
}
public async Task<long> ImmutableSum() => await Sum(immutable);
public async Task<long> ConcurrentSum() => await Sum(concurrent);
private async Task<long> Sum(IReadOnlyDictionary<int, int> dic)
{
var tasks = new List<Task<long>>();
// main job. Run multiple tasks to sum all values.
for (var i = 0; i < ThreadsCount; i++)
tasks.Add(Task.Run(() =>
{
long x = 0;
foreach (var key in dic.Keys)
{
x += dic[key];
}
return x;
}));
var result = await Task.WhenAll(tasks.ToArray());
return result.Sum();
}
}
static void Main(string[] args)
{
var test = new ConcurrentVsImmutable(1000000, 4);
test.Setup();
var sw = new Stopwatch();
sw.Start();
var result = test.ConcurrentSum().Result;
sw.Stop();
// Convince that the result of the work is the same
Console.WriteLine($"Concurrent. Result: {result}. Elapsed: {sw.ElapsedTicks}.");
sw.Reset();
sw.Start();
result = test.ImmutableSum().Result;
sw.Stop();
Console.WriteLine($" Immutable. Result: {result}. Elapsed: {sw.ElapsedTicks}.");
Console.ReadLine();
}
}
}
You can run this code. Elapsed time in ticks will differ from time to time but the time spent by ConcurrentDictionary is several times less than by ImmutableDictionary.
This experiment makes me embarrassed. Did I do it wrong? What the reason to use immutable collections if we have concurrent? When they are preferable?
Immutable collections are not alternative to concurrent collections. And the way they are designed to reduce memory consumption, they are bound to be slower, the trade off here is to use less memory and thus by using less n operations to do anything.
We usually copy collections to other collections to achieve immutability to persist state. Lets see what it means,
var s1 = ImmutableStack<int>.Empty;
var s2 = s1.Push(1);
// s2 = [1]
var s3 = s2.Push(2);
// s2 = [1]
// s3 = [1,2]
// notice that s2 has only one item, it is not modified..
var s4 = s3.Pop(ref var i);
// s2 = [1];
// still s2 has one item...
Notice that, s2 always has only one item. Even if all items are removed.
The way all data is stored internally is a huge tree and your collection is pointing to a branch which has descendants that represents initial state of the tree.
I don't think the performance can be matched with concurrent collection where goals are totally different.
In concurrent collection, you have a single copy of collection accessed by all threads.
In immutable collection you have virtually isolated copy of a tree, navigating that tree is always costly.
It is useful in transactional system, where if a transaction has to be rolled back, state of collection can be retained in commit points.
This is a criticism that's been made before.
As Akash already said, ImmutableDictionary works with an internal tree, instead of a hashset.
One aspect of this is that you can improve the performance slightly if you build the dictionary in one step instead of iteratively adding all the keys:
immutable = concurrent.ToImmutableDictionary();
Enumerating a hashset and a balanced tree are both O(n) operations. I took the average of a few runs on a single thread for varying container size and get results consistent with that:
I don't know why the immutable slope is 6x steeper. For now I'll just assume its doing tricky nonblocking tree stuff. I assume this class would be optimized for random stores and reads rather than enumeration.
To identify what exact scenarios ImmutableDictionary wins at, we'd need to wrap a concurrent dictionary to provide some level of immutability, and test both classes in the face of levels of read/write contention.
Not a serious suggestion, but a counterpoint to your test is to use immutability to "cheat" over multiple iterations by comparing:
private ConcurrentDictionary<object, long> cache = new ConcurrentDictionary<object, long>();
public long ImmutableSum()
{
return cache.GetOrAdd(immutable, (obj) => (obj as ImmutableDictionary<int, int>).Sum(kvp => (long)kvp.Value));
}
public long ConcurrentSum() => concurrent.Sum(kvp => (long)kvp.Value);
This makes a quite a difference on subsequent calls to sum an unchanged collection!
The two are not mutually exclusive. I use both.
If your dictionary is small the read performance of ImmutableDictionary will be superior to ConcurrentDictionary as K1*Log(N) < K2 where Log(N) < K2/K1 (when the hash table overhead is worse than tree traversal).
I personally find the write semantics of the Immutable collections easier to understand than those of the concurrent collections as they tend to be more consistent, especially when dealing with AddOrUpdate() and GetOrAdd().
In practice, I find that there have many cases in which I have a good number of small (or empty) dictionaries that are more appropriate as ImmutableDictionary and some larger ones that warrant the use of ConcurrentDictionary.
Having said that, if they are small then it doesn't make much of a difference what you use.
Regarding the answer of Peter Wishart, the enumeration performance of ImmutableDictionary is higher than ConcurrentDictionary (for reasonable N) because tree traversal is brutal in terms of memory latency on modern cache architectures.

How AsParallel extension actually works

So topic is the questions.
I get that method AsParallel returns wrapper ParallelQuery<TSource> that uses the same LINQ keywords, but from System.Linq.ParallelEnumerable instead of System.Linq.Enumerable
It's clear enough, but when i'm looking into decompiled sources, i don't understand how does it works.
Let's begin from an easiest extension : Sum() method. Code:
[__DynamicallyInvokable]
public static int Sum(this ParallelQuery<int> source)
{
if (source == null)
throw new ArgumentNullException("source");
else
return new IntSumAggregationOperator((IEnumerable<int>) source).Aggregate();
}
it's clear, let's go to Aggregate() method. It's a wrapper on InternalAggregate method that traps some exceptions. Now let's take a look on it.
protected override int InternalAggregate(ref Exception singularExceptionToThrow)
{
using (IEnumerator<int> enumerator = this.GetEnumerator(new ParallelMergeOptions?(ParallelMergeOptions.FullyBuffered), true))
{
int num = 0;
while (enumerator.MoveNext())
checked { num += enumerator.Current; }
return num;
}
}
and here is the question: how it works? I see no concurrence safety for a variable, modified by many threads, we see only iterator and summing. Is it magic enumerator? Or how does it works? GetEnumerator() returns QueryOpeningEnumerator<TOutput>, but it's code is too complicated.
Finally in my second PLINQ assault I found an answer. And it's pretty clear.
Problem is that enumerator is not simple. It's a special multithreading one. So how it works? Answer is that enumerator doesn't return a next value of source, it returns a whole sum of next partition. So this code is only executed 2,4,6,8... times (based on Environment.ProcessorCount), when actual summation work is performed inside enumerator.MoveNext in enumerator.OpenQuery method.
So TPL obviosly partition the source enumerable, then sum independently each partition and then pefrorm this summation, see IntSumAggregationOperatorEnumerator<TKey>. No magic here, just could plunge deeper.
The Sum operator aggregates all values in a single thread. There is no multi-threading here. The trick is that multi-threading is happening somewhere else.
The PLINQ Sum method can handle PLINQ enumerables. Those enumerables could be built up using other constructs (such as where) that allows a collection to be processed over multiple threads.
The Sum operator is always the last operator in a chain. Although it is possible to process this sum over multiple threads, the TPL team probably found out that this had a negative impact on performance, which is reasonable, since the only thing this method has to do is a simple integer addition.
So this method processes all results that come available from other threads and processes them on a single thread and returns that value. The real trick is in other PLINQ extension methods.
protected override int InternalAggregate(ref Exception singularExceptionToThrow)
{
using (IEnumerator<int> enumerator = this.GetEnumerator(new ParallelMergeOptions? (ParallelMergeOptions.FullyBuffered), true))
{
int num = 0;
while (enumerator.MoveNext())
checked { num += enumerator.Current; }
return num;
}
}
This code won't be executed parallel, the while will be sequentially execute it's innerscope.
Try this instead
List<int> list = new List<int>();
int num = 0;
Parallel.ForEach(list, (item) =>
{
checked { num += item; }
});
The inner action will be spread on the ThreadPool and the ForEach statement will be complete when all items are handled.
Here you need threadsafety:
List<int> list = new List<int>();
int num = 0;
Parallel.ForEach(list, (item) =>
{
Interlocked.Add(ref num, item);
});

Fastest and still safe way to exchange reference variables values

Basically what i need is to be able to add items to List (or another collection) constantly, around 3000 times per second in one thread. And to get and remove all items from that list once per 2 seconds.
I don't like classic ways to do this like using concurrent collections or lock on something every time i need to access collection because it would be slower than i need.
What i'm trying to do is to have 2 collections, one for each thread, and to find a way to make a thread safe switch from one collection to another.
Simplified and not thread-safe example:
var listA = new List<int>();
var listB = new List<int>();
// method is called externally 3000 times per second
void ProducerThread(int a)
{
listA.Add(a)
}
void ConsumerThread()
{
while(true)
{
Thread.Sleep(2000);
listB = Interlocked.Exchange(ref listA,listB);
//... processing listB data
// at this point when i'm done reading data
// producer stil may add an item because ListA.Add is not atomic
// correct me if i'm wrong
listB.Clear();
}
}
Is there any way to make above code work as intended (to be thread safe) while having producer thread blocked as little as possible? Or maybe another solution?
I would start out by using a BlockingCollection or another IProducerConsomerCollection in System.Collections.Concurrent. That is exactly what you have, a producer/consumer queue that is accessed from multiple threads. Those collections are also heavily optimized for performance. They don't use use a naive "lock the whole structure anytime anyone does any operation". They are smart enough to avoid locking wherever possible using lock-free synchronization techniques, and when they do need to use critical sections they can minimize what needs to be locked on such that the structure can often be accessed concurrently despite a certain amount of locking.
Before I move from there to anything else I would use one of those collections and ensure that it is too slow. If, after using that as your solution you have demonstrated that you are spending an unacceptable amount of time adding/removing items from the collection then you could consider investigating other solutions.
If, as I suspect will be the case, they perform quick enough, I'm sure you'll find that it makes writing the code much easier and clearer to read.
I am assuming that you just want to process new additions to listA, and that while you process these additions more additions are made.
var listA = new List<int>();
var dictA = new Dictionary<int,int>();
int rangeStart = 0;
int rangeEnd = 0;
bool protectRange = false;
// method is called externally 3000 times per second
void ProducerThread(int a)
{
listA.Add(a);
dictA.Add(rangeEnd++,a);
}
void ConsumerThread()
{
while(true)
{
Thread.Sleep(2000);
int rangeInstance = rangeEnd;
var listB = new List<int>();
for( int start = rangeStart; start < rangeInstance; start++ ){
listB.add(dictA[start]);
rangeStart++;
}
//... processing listB data
}
}
If the table has a fixed maximum size, why use a List? You could also pre-set the list size.
List<int> listA = new List<int>(6000);
Now, I haven't really test the following, but I think it would do what you want:
int[] listA = new int[6000]; // 3000 time * 2 seconds
int i = 0;
// method is called externally 3000 times per second
void ProducerThread(int a)
{
if (Monitor.TryEnter(listA)) // If true, consumer is in cooldown.
{
listA[i] = a;
i++;
Monitor.Exit(listA);
}
}
void ConsumerThread()
{
Monitor.Enter(listA); // Acquire thread lock.
while (true)
{
Monitor.Wait(listA, 2000); // Release thread lock for 2000ms, automaticly retake it after Producer released it.
foreach (int a in listA) { } //Processing...
listA = new int[6000];
i = 0;
}
}
You just need to be sure to have ConsumerThread run first, so it would queue itself and wait.

Iterating a dictionary in C#

var dict = new Dictionary<int, string>();
for (int i = 0; i < 200000; i++)
dict[i] = "test " + i;
I iterated this dictionary using the code below:
foreach (var pair in dict)
Console.WriteLine(pair.Value);
Then, I iterated it using this:
foreach (var key in dict.Keys)
Console.WriteLine(dict[key]);
And the second iteration took ~3 seconds less.
I can get both keys and values via both methods. What I wonder is whether the second approach has a drawback. Since the most rated question that I can find about this doesn't include this way of iterating a dictionary, I wanted to know why no one uses it and how does it work faster.
Your time tests have some fundamental flaws:
Console.Writeline is an I/O operation, which takes orders of magnitude more time than memory accesses and CPU calculations. Any difference in iteration times is probably being dwarfed by the cost of this operation. It's like measuring the weights of pennies in a cast-iron stove.
You don't mention how long the overall operation took, so saying that one took 3 seconds less than another is meaningless. If it took 300 seconds to run the first, and 303 seconds to run the second, then you are micro-optimizing.
You don't mention how you measured running time. Did running time include the time getting the program assembly loaded and bootstrapped?
You don't mention repeatability: Did you run these operations several times? Several hundred times? In different orders?
Here are my tests. Note how I try my best to ensure that the method of iteration is the only thing that changes, and I include a control to see how much of the time is taken up purely because of a for loop and assignment:
void Main()
{
// Insert code here to set up your test: anything that you don't want to include as
// part of the timed tests.
var dict = new Dictionary<int, string>();
for (int i = 0; i < 2000; i++)
dict[i] = "test " + i;
string s = null;
var actions = new[]
{
new TimedAction("control", () =>
{
for (int i = 0; i < 2000; i++)
s = "hi";
}),
new TimedAction("first", () =>
{
foreach (var pair in dict)
s = pair.Value;
}),
new TimedAction("second", () =>
{
foreach (var key in dict.Keys)
s = dict[key];
})
};
TimeActions(100, // change this number as desired.
actions);
}
#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
Stopwatch s = new Stopwatch();
foreach(var action in actions)
{
var milliseconds = s.Time(action.Action, iterations);
Console.WriteLine("{0}: {1}ms ", action.Message, milliseconds);
}
}
public class TimedAction
{
public TimedAction(string message, Action action)
{
Message = message;
Action = action;
}
public string Message {get;private set;}
public Action Action {get;private set;}
}
public static class StopwatchExtensions
{
public static double Time(this Stopwatch sw, Action action, int iterations)
{
sw.Restart();
for (int i = 0; i < iterations; i++)
{
action();
}
sw.Stop();
return sw.Elapsed.TotalMilliseconds;
}
}
#endregion
Result
control: 1.2173ms
first: 9.0233ms
second: 18.1301ms
So in these tests, using the indexer takes roughly twice as long as iterating key-value pairs, which is what I would expect*. This stays roughly proportionate if I increase the number of entries and the number of repetitions by an order of magnitude, and I get the same results if I run the two tests in reverse order.
* Why would I expect this result? The Dictionary class probably represents its entries as KeyValuePairs internally, so all it really has to do when you iterate it directly is walk through its data structure once, handing the caller each entry as it comes to it. If you iterate just the Keys, it still has to find each KeyValuePair, and give you the value of the Key property from it, so that step alone is going to cost roughly the same amount as iterating across it in the first place. Then you have to call the indexer, which has to calculate a hash for provided key, jump to the correct hashtable bucket, and do an equality check on the keys of any KeyValuePairs it finds there. These operations aren't terribly expensive, but once you do them N times, it's roughly as expensive as if you'd iterated over the internal hashtable structure again.

Categories

Resources