I have a List class, and I would like to override GetEnumerator() to return my own Enumerator class. This Enumerator class would have two additional properties that would be updated as the Enumerator is used.
For simplicity (this isn't the exact business case), let's say those properties were CurrentIndex and RunningTotal.
I could manage these properties within the foreach loop manually, but I would rather encapsulate this functionality for reuse, and the Enumerator seems to be the right spot.
The problem: foreach hides all the Enumerator business, so is there a way to, within a foreach statement, access the current Enumerator so I can retrieve my properties? Or would I have to forgo foreach, use a nasty old while loop, and manipulate the Enumerator myself?
Strictly speaking, I would say that if you want to do exactly what you're saying, then yes, you would need to call GetEnumerator and control the enumerator yourself with a while loop.
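To make the while-loop route concrete, here is a minimal sketch. It assumes a hypothetical TotalingEnumerator wrapper (the class name, EnumeratorDemo, and Describe are made up for illustration); only the CurrentIndex and RunningTotal properties come from the question.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that tracks extra state while it advances.
public sealed class TotalingEnumerator : IEnumerator<decimal>
{
    private readonly IEnumerator<decimal> _inner;

    public TotalingEnumerator(IEnumerable<decimal> source)
    {
        _inner = source.GetEnumerator();
        CurrentIndex = -1; // no element has been visited yet
    }

    public int CurrentIndex { get; private set; }
    public decimal RunningTotal { get; private set; }

    public decimal Current => _inner.Current;
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (!_inner.MoveNext()) return false;
        CurrentIndex++;
        RunningTotal += _inner.Current;
        return true;
    }

    public void Reset()
    {
        _inner.Reset();
        CurrentIndex = -1;
        RunningTotal = 0M;
    }

    public void Dispose() => _inner.Dispose();
}

public static class EnumeratorDemo
{
    // Drives the enumerator by hand, which is the part foreach would otherwise hide.
    public static List<string> Describe(IEnumerable<decimal> values)
    {
        var lines = new List<string>();
        using (var e = new TotalingEnumerator(values))
        {
            while (e.MoveNext())
            {
                lines.Add($"{e.CurrentIndex}: {e.Current} (total {e.RunningTotal})");
            }
        }
        return lines;
    }
}
```

Because the loop holds the enumerator in a variable, the extra properties stay reachable on every iteration; the price is writing MoveNext and the using block yourself.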
Without knowing too much about your business requirement, you might be able to take advantage of an iterator function, such as something like this:
public static IEnumerable<decimal> IgnoreSmallValues(List<decimal> list)
{
decimal runningTotal = 0M;
foreach (decimal value in list)
{
// if the value is less than 1% of the running total, then ignore it
if (runningTotal == 0M || value >= 0.01M * runningTotal)
{
runningTotal += value;
yield return value;
}
}
}
Then you can do this:
List<decimal> payments = new List<decimal>() {
123.45M,
234.56M,
.01M,
345.67M,
1.23M,
456.78M
};
foreach (decimal largePayment in IgnoreSmallValues(payments))
{
// handle the large payments so that I can divert all the small payments to my own bank account. Mwahaha!
}
Updated:
Ok, so here's a follow-up with what I've termed my "fishing hook" solution. Now, let me add a disclaimer that I can't really think of a good reason to do something this way, but your situation may differ.
The idea is that you simply create a "fishing hook" object (reference type) that you pass to your iterator function. The iterator function manipulates your fishing hook object, and since you still have a reference to it in your code outside, you have visibility into what's going on:
public class FishingHook
{
public int Index { get; set; }
public decimal RunningTotal { get; set; }
public Func<decimal, bool> Criteria { get; set; }
}
public static IEnumerable<decimal> FishingHookIteration(IEnumerable<decimal> list, FishingHook hook)
{
hook.Index = 0;
hook.RunningTotal = 0;
foreach(decimal value in list)
{
// the hook object may define a Criteria delegate that
// determines whether to skip the current value
if (hook.Criteria == null || hook.Criteria(value))
{
hook.RunningTotal += value;
yield return value;
hook.Index++;
}
}
}
You would utilize it like this:
List<decimal> payments = new List<decimal>() {
123.45M,
.01M,
345.67M,
234.56M,
1.23M,
456.78M
};
FishingHook hook = new FishingHook();
decimal min = 0;
hook.Criteria = x => x > min; // exclude any values that are less than/equal to the defined minimum
foreach (decimal value in FishingHookIteration(payments, hook))
{
// update the minimum
if (value > min) min = value;
Console.WriteLine("Index: {0}, Value: {1}, Running Total: {2}", hook.Index, value, hook.RunningTotal);
}
// Resulting output is:
//Index: 0, Value: 123.45, Running Total: 123.45
//Index: 1, Value: 345.67, Running Total: 469.12
//Index: 2, Value: 456.78, Running Total: 925.90
// we've skipped the values .01, 234.56, and 1.23
Essentially, the FishingHook object gives you some control over how the iterator executes. The impression I got from the question was that you needed some way to access the inner workings of the iterator so that you could manipulate how it iterates while you are in the middle of iterating, but if this is not the case, then this solution might be overkill for what you need.
With foreach you indeed can't get the enumerator - you could, however, have the enumerator return (yield) a tuple that includes that data; in fact, you could probably use LINQ to do it for you...
(I couldn't cleanly get the index using LINQ - can get the total and current value via Aggregate, though; so here's the tuple approach)
using System.Collections;
using System.Collections.Generic;
using System;
class MyTuple
{
public int Value {get;private set;}
public int Index { get; private set; }
public int RunningTotal { get; private set; }
public MyTuple(int value, int index, int runningTotal)
{
Value = value; Index = index; RunningTotal = runningTotal;
}
static IEnumerable<MyTuple> SomeMethod(IEnumerable<int> data)
{
int index = 0, total = 0;
foreach (int value in data)
{
yield return new MyTuple(value, index++,
total = total + value);
}
}
static void Main()
{
int[] data = { 1, 2, 3 };
foreach (var tuple in SomeMethod(data))
{
Console.WriteLine("{0}: {1} ; {2}", tuple.Index,
tuple.Value, tuple.RunningTotal);
}
}
}
You can also do something like this in a more functional way, depending on your requirements. What you are asking can be thought of as "zipping" together multiple sequences, and then iterating through them all at once. The three sequences for the example you gave would be:
The "value" sequence
The "index" sequence
The "Running Total" Sequence
The next step would be to specify each of these sequences separately:
List<decimal> ValueList
var Indexes = Enumerable.Range(0, ValueList.Count)
The last one is more fun... The two methods I can think of are to either use a temporary variable to sum up the sequence, or to recalculate the sum for each item. The second is obviously much less performant, so I would rather use the temporary:
decimal Sum = 0;
var RunningTotals = ValueList.Select(v => Sum = Sum + v);
The last step would be to zip these all together. .Net 4 will have the Zip operator built in, in which case it will look like this:
var ZippedSequence = ValueList.Zip(Indexes, (value, index) => new {value, index}).Zip(RunningTotals, (temp, total) => new {temp.value, temp.index, total});
This obviously gets noisier the more things you try to zip together.
In the last link, there is source for implementing the Zip function yourself. It really is a simple little bit of code.
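For readers without that source at hand, a hand-rolled zip might look roughly like the sketch below. The names ZipSketch and ZipPairs are hypothetical, and this is not the framework's own implementation; it simply pairs items until either sequence runs out.

```csharp
using System;
using System.Collections.Generic;

public static class ZipSketch
{
    // Hypothetical stand-in for a Zip operator: pairs items until either sequence ends.
    public static IEnumerable<TResult> ZipPairs<TFirst, TSecond, TResult>(
        IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TSecond, TResult> combine)
    {
        using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
        {
            // Advance both enumerators in lockstep; stop at the shorter sequence.
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return combine(e1.Current, e2.Current);
            }
        }
    }
}
```

Calling ZipSketch.ZipPairs(values, indexes, (value, index) => new { value, index }) would then play the same role as the first Zip call above.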
I have 2 lists of different types. I think for now it doesn't matter what types that are.
Both types carry information about when they occurred, expressed in ticks (but it could also be a DateTime).
What I want to do is synchronize these 2 lists by time so that, for example, I can iterate through all elements in the order in which they occurred.
Example: // in this example each list has elements called A_NUM or B_NUM depending on the type of list, and the number after '_' represents the order in which these elements/events occurred
ListA = {A_2, A_3, A_5}
ListB = {B_1, B_4, B_6}
And the result after synchronization will be something like this:
ResultList = {B_1, A_2, A_3, B_4, A_5, B_6}
Is it somehow possible to make such a mixed list? Or do I have to create some auxiliary List or Dictionary which will tell me the synchronized order of these 2 lists?
EDIT:
One list is a list of eye fixations. A fixation has a position, duration, ... and also an occurrence attribute.
The second list is a list of text changes, for example: on line 12, column 3, the char 'x' was added at some time t.
And I want to iterate through these 2 lists simultaneously. I mean, at time t1 a fixation occurred at position x,y. At time t2 there was a text change at position u,v, so I want to iterate through these events in the order in which they occurred.
Note: YES both lists are sorted by time. It is a sequence of fixations and sequence of text changes.
Your question strongly suggests a merge sort as the basic implementation detail. You have two inputs, both sorted, and just want them merged together in sequence.
The main difficulty implied by your question is that you are trying to merge sequences of two completely unrelated types. Ordinarily, you'd merge sequences of the same type, and so could easily manipulate them together. Barring that, they'd at least share a base class or interface type, so that you could treat them as a single generalized type. But it seems, from your question, that this is not the case.
Given that, I think the most straight-forward approach is still to use a merge sort, but to provide a mechanism for the sort to access the relevant property (ticks, DateTime, whatever). The sort would return the merged sequences, in correct order, as the object type (i.e. the only base type common to both inputs) and the caller would then have to cast back to the individual types for whatever purpose.
Here's an example of what I mean:
private static IEnumerable<TBase> Merge<TBase, T1, T2, TValue>(
IEnumerable<T1> sequence1, IEnumerable<T2> sequence2,
Func<T1, TValue> valueSelector1, Func<T2, TValue> valueSelector2)
where T1 : TBase
where T2 : TBase
where TValue : IComparable<TValue>
{
IEnumerator<T1> enumerator1 = sequence1.GetEnumerator();
IEnumerator<T2> enumerator2 = sequence2.GetEnumerator();
bool notDone1 = enumerator1.MoveNext(),
notDone2 = enumerator2.MoveNext();
while (notDone1 && notDone2)
{
TValue value1 = valueSelector1(enumerator1.Current),
value2 = valueSelector2(enumerator2.Current);
if (value1.CompareTo(value2) <= 0)
{
yield return enumerator1.Current;
notDone1 = enumerator1.MoveNext();
}
else
{
yield return enumerator2.Current;
notDone2 = enumerator2.MoveNext();
}
}
while (notDone1)
{
yield return enumerator1.Current;
notDone1 = enumerator1.MoveNext();
}
while (notDone2)
{
yield return enumerator2.Current;
notDone2 = enumerator2.MoveNext();
}
}
Used like this:
class A
{
public int Value { get; }
public A(int value)
{
Value = value;
}
}
class B
{
public int Value { get; }
public B(int value)
{
Value = value;
}
}
static void Main(string[] args)
{
const int minCount = 5, maxCount = 15, maxValue = 50;
Random random = new Random();
int listACount = random.Next(minCount, maxCount),
listBCount = random.Next(minCount, maxCount);
A[] listA = RandomOrderedSequence(random, maxValue, listACount).Select(i => new A(i)).ToArray();
B[] listB = RandomOrderedSequence(random, maxValue, listBCount).Select(i => new B(i)).ToArray();
Console.WriteLine("listA: ");
Console.WriteLine(string.Join(", ", listA.Select(a => a.Value)));
Console.WriteLine("listB: ");
Console.WriteLine(string.Join(", ", listB.Select(b => b.Value)));
foreach (object o in Merge<object, A, B, int>(listA, listB, a => a.Value, b => b.Value))
{
A a = o as A;
if (a != null)
{
// Do something with object of type A
Console.WriteLine($"a.Value: {a.Value}");
}
else
{
// Must be a B. Do something with object of type B
B b = (B)o;
Console.WriteLine($"b.Value: {b.Value}");
}
}
}
static IEnumerable<int> RandomOrderedSequence(Random random, int max, int count)
{
return RandomSequence(random, max, count).OrderBy(i => i);
}
static IEnumerable<int> RandomSequence(Random random, int max, int count)
{
while (count-- > 0)
{
yield return random.Next(max);
}
}
In your case, you would of course replace types A and B with the types you're actually using, provide appropriate selectors, and then do whatever you like with each instance returned as the merged, in-order sequence.
Note that even if the types do turn out to share some common basis by which they can be compared and merged, I would still recommend a merge sort over simply concatenating the inputs and sorting the result. Merging takes O(n) time for inputs that are already ordered, whereas a general sort of the concatenated data is O(n log n).
You need the lists to implement a common interface so they can be compared. For example:
public interface ISynchronizable
{
long GetTicks();
}
So you need to have the two classes implement this, like so:
public class Fixation : ISynchronizable
{
public long GetTicks()
{
// get the ticks
}
// some other code
}
public class TextChange: ISynchronizable
{
public long GetTicks()
{
// get the ticks
}
// some other code
}
Then the result list would be created like this:
public List<ISynchronizable> list = new List<ISynchronizable>();
list.AddRange(fixationList);
list.AddRange(textChangeList);
resultList = list.OrderBy(e => e.GetTicks()).ToList();
I'm implementing a memory system for an AI agent. It needs to have an internal list of state transitions which is capped at some number, say 10000.
If at capacity, adding a new memory should automatically remove the oldest memory.
Importantly, I should also need to be able to quickly access any item in this list.
A wrapper for Queue at first seemed obvious, but Queue does not allow fast access of any element. (O(n))
Similarly, removing an item from the beginning of a List structure takes O(n).
LinkedLists allow fast additions and removals, but again do not allow quick access to every index.
An array would allow random access but obviously it's not dynamically resizeable and deletion is problematic.
I've seen a HashMap being suggested but I'm unsure how that might be implemented.
Suggestions?
If you want the queue to be a fixed length, you could use a circular buffer which enables O(1) enqueue, dequeue and indexing operations and automatically overwrites old entries when the queue is full.
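As a rough illustration of the idea, here is a minimal circular buffer sketch. The CircularBuffer<T> class and its members are hypothetical names chosen for this example, not a framework type; index 0 is assumed to mean the oldest surviving item.

```csharp
using System;

// Hypothetical fixed-capacity buffer; index 0 is always the oldest surviving item.
public sealed class CircularBuffer<T>
{
    private readonly T[] _items;
    private int _start;   // position of the oldest item in _items
    private int _count;   // number of items currently stored

    public CircularBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    public int Count => _count;

    // O(1): writes into the next free slot, or over the oldest item when full.
    public void Add(T item)
    {
        int end = (_start + _count) % _items.Length;
        _items[end] = item;
        if (_count == _items.Length)
            _start = (_start + 1) % _items.Length; // the oldest item was overwritten
        else
            _count++;
    }

    // O(1) random access, counted from the oldest item.
    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= _count) throw new ArgumentOutOfRangeException(nameof(index));
            return _items[(_start + index) % _items.Length];
        }
    }
}
```

For the memory system in the question, something like new CircularBuffer<Transition>(10000) (with Transition standing in for whatever the state-transition type is) would drop the oldest memory automatically on every Add once the buffer is full.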
Try using a Dictionary with a LinkedList. The keys of the Dictionary are the indexes of the LinkedList nodes and the values of the Dictionary are of type LinkedListNode; that is, the LinkedList nodes.
The Dictionary would give you almost an O(1) on its operations and removing/adding LinkedListNode(s) to the beginning or end of a LinkedList is of O(1) as well.
Another alternative is to use a HashTable. However, in this case you have to know the capacity of the table beforehand (See Hashtable.Add Method) in order to get the O(1) performance:
If Count is less than the capacity of the Hashtable, this method is an O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
In the first solution, no matter what the capacity of the LinkedList or the Dictionary is, you would still get almost O(1) from both the Dictionary and the LinkedList. Of course, an add or remove inside your memory class costs a constant 3 or 4 operations across the Dictionary and the LinkedList, which is still O(1). The search access is always going to be O(1) because you will be using the Dictionary only.
HashMap is for Java, so the closest equivalent is Dictionary (see the question "C# Java HashMap equivalent"). But I wouldn't say that this is the ultimate answer.
If you implement it as a Dictionary in which key == the content, then you can search for the content in O(1). However, you cannot have duplicate keys. Also, because it is not ordered, you may not know which item is the first one.
If you implement it as a Dictionary in which key == index and value == the content, searching for the content still takes O(n) because you don't know the location of the content.
A List or an Array costs O(1) if you look up the content by index. So please double-check your statement that it takes O(n).
If searching by index is sufficient, then the circular array/buffer which @Lee mentioned is good enough.
Otherwise, similar to a DB, you might want to maintain 2 separate structures: one for storing the data (circular array) and the other for searching (hash).
EDIT: @Lee has it right. A circular buffer seems to give you what you want. Answer left in place though.
I think the data structure you want might be a priority queue -- it depends on what you mean by 'quickly access any item'. If you mean 'able to enumerate all items in O(N)', then a priority queue fits the bill. If you mean 'enumerate the list in historical order', then it won't.
Assuming you need these operations;
add an item and associate with a time
remove the oldest item
enumerate all existing items in arbitrary order
Then you could easily extend this priority queue implementation I wrote a little while ago.
You'll want to implement IEnumerable as a loop through the T[] data array from 0 to cursor. This will give you your enumeration.
Implement a GetItem(i) function which returns this.data[i] so long as i <= cursor.
Implement an automatic size limit by putting this into the Push() method:
if (queue.Size >= 10000) {
queue.Pop();
}
I think this is O(log n) for push and pop, and O(N) to enumerate ALL items, or O(i) to find ANY item, so long as you don't need them in order.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Mindfire.DataStructures
{
public class PiorityQueue<T>
{
private int[] priorities;
private T[] data;
private int cursor;
private int capacity;
public int Size
{
get
{
return cursor+1;
}
}
public PiorityQueue(int capacity)
{
this.cursor = -1;
this.capacity = capacity;
this.priorities = new int[this.capacity];
this.data = new T[this.capacity];
}
public T Pop()
{
if (this.Size == 0)
{
throw new InvalidOperationException($"The {this.GetType().Name} is Empty");
}
var result = this.data[0];
this.data[0] = this.data[cursor];
this.priorities[0] = this.priorities[cursor];
this.cursor--;
var loc = 0;
while (true)
{
// in a 0-based binary heap the children of node i are at 2i+1 and 2i+2
var l = loc * 2 + 1;
var r = loc * 2 + 2;
var largest = loc;
if (l <= cursor && this.priorities[l] > this.priorities[largest]) { largest = l; }
if (r <= cursor && this.priorities[r] > this.priorities[largest]) { largest = r; }
if (largest == loc)
{
break;
}
// move the higher-priority child up and keep sifting down
Swap(loc, largest);
loc = largest;
}
return result;
}
public void Push(int priority, T v)
{
this.cursor++;
if (this.cursor == this.capacity)
{
Resize(this.capacity * 2);
};
this.data[this.cursor] = v;
this.priorities[this.cursor] = priority;
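// sift the new item up until its parent has an equal or higher priority
var child = this.cursor;
while (child > 0)
{
var parent = (child - 1) / 2;
if (this.priorities[parent] >= this.priorities[child]) { break; }
this.Swap(parent, child);
child = parent;
}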
}
private void Swap(int a, int b)
{
if (a == b) { return; }
var data = this.data[b];
var priority = this.priorities[b];
this.data[b] = this.data[a];
this.priorities[b] = this.priorities[a];
this.priorities[a] = priority;
this.data[a] = data;
}
private void Resize(int newCapacity)
{
var newPriorities = new int[newCapacity];
var newData = new T[newCapacity];
this.priorities.CopyTo(newPriorities, 0);
this.data.CopyTo(newData, 0);
this.data = newData;
this.priorities = newPriorities;
this.capacity = newCapacity;
}
public PiorityQueue() : this(1)
{
}
public T Peek()
{
if (this.cursor >= 0)
{
return this.data[0];
}
else
{
return default(T);
}
}
public void Push(T item, int priority)
{
// convenience overload: same as Push(priority, item)
this.Push(priority, item);
}
}
}
I've a c# Dictionary<DateTime,SomeObject> instance.
I've the following code:
private Dictionary<DateTime, SomeObject> _containedObjects = ...;//Let's imagine you have ~4000 items in it
public IEnumerable<SomeObject> GetItemsList(HashSet<DateTime> requiredTimestamps){
//How to return the list of SomeObject contained in _containedObjects
//Knowing that rarely(~<5% of the call), one or several DateTime of "requiredTimestamps" may not be in _containedObjects
}
I'm looking how to return an IEnumerable<SomeObject> containing all element that were referenced by one of the provided keys. The only issue is that this method will be called very often, and we might not always have every given key in parameter.
So is there something more efficient than this:
private Dictionary<DateTime, SomeObject> _containedObjects = ...;//Let's imagine you have ~4000 items in it
public IEnumerable<SomeObject> GetItemsList(HashSet<DateTime> requiredTimestamps){
List<SomeObject> toReturn = new List<SomeObject>();
foreach(DateTime dateTime in requiredTimestamps){
SomeObject found;
if(_containedObjects.TryGetValue(dateTime, out found)){
toReturn.Add(found);
}
}
return toReturn;
}
In general, there are two ways you can do this:
Go through requiredTimestamps sequentially and look up each date/time stamp in the dictionary. Dictionary lookup is O(1), so if there are k items to look up, it will take O(k) time.
Go through the dictionary sequentially and extract those with matching keys in the requiredTimestamps hash set. This will take O(n) time, where n is the number of items in the dictionary.
In theory, the first option--which is what you currently have--will be the fastest way to do it.
In practice, it's likely that the first one will be more efficient when the number of items you're looking up is less than some percentage of the total number of items in the dictionary. That is, if you're looking up 100 keys in a dictionary of a million, the first option will almost certainly be faster. If you're looking up 500,000 keys in a dictionary of a million, the second method might be faster because it's a whole lot faster to move to the next key than it is to do a lookup.
You'll probably want to optimize for the most common case, which I suspect is looking up a relatively small percentage of keys. In that case, the method you describe is almost certainly the best approach. But the only way to know for sure is to measure.
One optimization you might consider is pre-sizing the output list. That will avoid re-allocations. So when you create your toReturn list:
List<SomeObject> toReturn = new List<SomeObject>(requiredTimestamps.Count);
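Since measuring is the only way to settle it, here is a minimal timing sketch for the two options described above. LookupTiming, ByLookup, ByScan, and Time are hypothetical names, and the Stopwatch loop is only a rough harness; a dedicated benchmarking library would give more trustworthy numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupTiming
{
    // Option 1: iterate the requested keys and probe the dictionary.
    public static List<TValue> ByLookup<TValue>(Dictionary<DateTime, TValue> items, HashSet<DateTime> keys)
    {
        var found = new List<TValue>(keys.Count); // pre-sized to avoid re-allocations
        foreach (DateTime key in keys)
        {
            if (items.TryGetValue(key, out TValue value)) found.Add(value);
        }
        return found;
    }

    // Option 2: iterate the dictionary and probe the hash set.
    public static List<TValue> ByScan<TValue>(Dictionary<DateTime, TValue> items, HashSet<DateTime> keys)
    {
        var found = new List<TValue>();
        foreach (KeyValuePair<DateTime, TValue> pair in items)
        {
            if (keys.Contains(pair.Key)) found.Add(pair.Value);
        }
        return found;
    }

    // Rough timing helper: runs the action once as a warm-up, then times the repeats.
    public static TimeSpan Time(Action action, int iterations)
    {
        action(); // warm-up so JIT compilation is not counted
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Comparing LookupTiming.Time(() => LookupTiming.ByLookup(items, keys), 1000) against the same call with ByScan, using data sized like your real workload, should show which side of the crossover you are on.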
Method 1:
This can be made significantly faster, not by changing the algorithm, but by copying the _containedObjects reference into a local variable in your method and using that local for the lookups.
Example:
public static IEnumerable<int> GetItemsList3(HashSet<DateTime> requiredTimestamps)
{
var tmp = _containedObjects;
List<int> toReturn = new List<int>();
foreach (DateTime dateTime in requiredTimestamps)
{
int found;
if (tmp.TryGetValue(dateTime, out found))
{
toReturn.Add(found);
}
}
return toReturn;
}
Test data and times (on set of 5000 items with 125 keys found):
Your original method (milliseconds): 2,06032186895335
Method 1 (milliseconds): 0,53549626223609
Method 2:
One way to make this marginally quicker is to iterate through the smaller set and do the lookup on the bigger set. Depending on the size difference you will gain some speed.
You are using a Dictionary and HashSet, so your lookup on either of these will be O(1).
Example: If _containedObjects has fewer items than requiredTimestamps, we loop through _containedObjects (otherwise use your method for the converse)
public static IEnumerable<int> GetItemsList2(HashSet<DateTime> requiredTimestamps)
{
List<int> toReturn = new List<int>();
foreach (var dateTime in _containedObjects)
{
int found;
if (requiredTimestamps.Contains(dateTime.Key))
{
toReturn.Add(dateTime.Value);
}
}
return toReturn;
}
Test data and times (on set of 5000 for _containedObjects and set of 10000 items for requiredTimestamps with 125 keys found):
Your original method (milliseconds): 3,88056291367086
Method 2 (milliseconds): 3,31025939438943
You can use LINQ but I doubt if it is going to increase any performance, even if there is any difference it would be negligible.
Your method could be:
public IEnumerable<SomeObject> GetItemsList(HashSet<DateTime> requiredTimestamps)
{
return _containedObjects.Where(r => requiredTimestamps.Contains(r.Key))
.Select(d => d.Value);
}
One positive with this is lazy evaluation since you are not populating a list and returning it.
Here are some different ways to do it - performance is all pretty much the same so you can choose based on readability.
Paste this into LinqPad if you want to test it out - otherwise just harvest whatever code you need.
I think my personal favourite from a readability point of view is method 3. Method 4 is certainly readable but has the unpleasant feature that it does two lookups into the dictionary for every required timestamp.
void Main()
{
var obj = new TestClass<string>(i => string.Format("Element {0}", i));
var sampleDateTimes = new HashSet<DateTime>();
for(int i = 0; i < 4000 / 20; i++)
{
sampleDateTimes.Add(DateTime.Today.AddDays(i * -5));
}
var result = obj.GetItemsList_3(sampleDateTimes);
foreach (var item in result)
{
Console.WriteLine(item);
}
}
class TestClass<SomeObject>
{
private Dictionary<DateTime, SomeObject> _containedObjects;
public TestClass(Func<int, SomeObject> converter)
{
_containedObjects = new Dictionary<DateTime, SomeObject>();
for(int i = 0; i < 4000; i++)
{
_containedObjects.Add(DateTime.Today.AddDays(-i), converter(i));
}
}
public IEnumerable<SomeObject> GetItemsList_1(HashSet<DateTime> requiredTimestamps)
{
List<SomeObject> toReturn = new List<SomeObject>();
foreach(DateTime dateTime in requiredTimestamps)
{
SomeObject found;
if(_containedObjects.TryGetValue(dateTime, out found))
{
toReturn.Add(found);
}
}
return toReturn;
}
public IEnumerable<SomeObject> GetItemsList_2(HashSet<DateTime> requiredTimestamps)
{
foreach(DateTime dateTime in requiredTimestamps)
{
SomeObject found;
if(_containedObjects.TryGetValue(dateTime, out found))
{
yield return found;
}
}
}
public IEnumerable<SomeObject> GetItemsList_3(HashSet<DateTime> requiredTimestamps)
{
return requiredTimestamps
.Intersect(_containedObjects.Keys)
.Select (k => _containedObjects[k]);
}
public IEnumerable<SomeObject> GetItemsList_4(HashSet<DateTime> requiredTimestamps)
{
return requiredTimestamps
.Where(dt => _containedObjects.ContainsKey(dt))
.Select (dt => _containedObjects[dt]);
}
}
Using C#, I have a list of objects that all have a float mass, which is randomized when the object is created.
What's the most efficient way to loop through the list and find the object with the highest mass?
The most efficient way to do this with a simple list will be a simple linear time search, as in
SomeObject winner = null; // stays null if the list is empty
float maxMass = 0.0f; // Assuming all masses are at least zero!
foreach(SomeObject o in objects) {
if(o.mass > maxMass) {
maxMass = o.mass;
winner = o;
}
}
If this is something you intend to do regularly, it may be beneficial to store your objects in an order sorted by mass and/or to use a more appropriate storage container.
Sounds like a perfect candidate for the MaxBy/MinBy operators in morelinq. You could use it as follows:
objects.MaxBy(obj=>obj.Mass)
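If pulling in a library is not an option, the operator is small enough to sketch by hand. MaxBySketch and MaxByKey below are hypothetical names, not the morelinq implementation; newer .NET versions (6 and later) also ship a built-in Enumerable.MaxBy.

```csharp
using System;
using System.Collections.Generic;

public static class MaxBySketch
{
    // Hypothetical helper: returns the element with the largest key in a single pass.
    public static T MaxByKey<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
        where TKey : IComparable<TKey>
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (!e.MoveNext()) throw new InvalidOperationException("Sequence contains no elements");
            T best = e.Current;
            TKey bestKey = keySelector(best);
            while (e.MoveNext())
            {
                TKey key = keySelector(e.Current);
                if (key.CompareTo(bestKey) > 0) // strictly greater, so the first maximum wins ties
                {
                    best = e.Current;
                    bestKey = key;
                }
            }
            return best;
        }
    }
}
```

Usage would be along the lines of MaxBySketch.MaxByKey(objects, obj => obj.Mass), which is still a single linear scan.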
Implementing IComparable would make things simple and easy to maintain. I have provided an example. Hope this helps.
I am not sure whether this is more efficient than looping. I understand that LINQ can degrade performance slightly the first time it is invoked.
But maintainable code often matters more than a slight performance gain. Can someone provide more details on the performance of a plain loop versus parallel execution with AsParallel()?
class Program
{
delegate int Del();
static void Main(string[] args)
{
List<MyClass> connections = new List<MyClass>();
connections.Add(new MyClass() { name = "a", mass = 5.001f });
connections.Add(new MyClass() { name = "c", mass = 4.999f });
connections.Add(new MyClass() { name = "b", mass = 4.2f });
connections.Add(new MyClass() { name = "a", mass = 4.99f });
MyClass maxConnection = connections.AsParallel().Max();
Console.WriteLine("{0} {1} ", maxConnection.name, maxConnection.mass);
Console.ReadLine();
}
class MyClass : IComparable
{
public string name { get; set; }
public float mass { get; set; }
public int CompareTo(object obj)
{
// CompareTo avoids truncating small differences (e.g. 5.001 vs 4.999) to 0
return mass.CompareTo(((MyClass)obj).mass);
}
}
}
The simplest and most efficient solution (assuming repeated queries) is to sort the list by mass.
i.e.
private int SortByMass(ObjectWithMass left,ObjectWithMass right)
{
return left.Mass.CompareTo(right.Mass);
}
List<ObjectWithMass> myList = MyFunctionToPopulateTheList();
myList.Sort(SortByMass);
Once the list is sorted, the first element will be the smallest, and the last will be the largest.
You can use myList.Reverse() if you want it by largest to smallest.
Sorting runs in O(n log n), and then finding the largest object is myList[myList.Count - 1], which is O(1) for .NET lists (they are actually arrays underneath).
If you're willing to trade a little space for time then, in one of the instance constructors, have something like the following
public Foo(ref Foo min, ref Foo max)
{
// mass is assumed to have been assigned before this point
if (max == null || max.mass < this.mass)
max = this;
if (min == null || min.mass > this.mass)
min = this;
}
And upon object creation have the calling method pass those parameters by reference, so that the updates are visible to the caller.
Foo min = null, max = null;
//Create a bunch of Foo objects
var Foos = Enumerable.Range(0, 10000).Select(n => new Foo(ref min, ref max)).ToList();
for (var keyValue = 0; keyValue < dwhSessionDto.KeyValues.Count; keyValue++)
{...}
var count = dwhSessionDto.KeyValues.Count;
for (var keyValue = 0; keyValue < count; keyValue++)
{...}
I know there's a difference between the two, but is one of them faster than the other? I would think the second is faster.
Yes, the first version is much slower. After all, I'm assuming you're dealing with types like this:
public class SlowCountProvider
{
public int Count
{
get
{
Thread.Sleep(1000);
return 10;
}
}
}
public class KeyValuesWithSlowCountProvider
{
public SlowCountProvider KeyValues
{
get { return new SlowCountProvider(); }
}
}
Here, your first loop will take ~10 seconds, whereas your second loop will take ~1 second.
Of course, you might argue that the assumption that you're using this code is unjustified - but my point is that the right answer will depend on the types involved, and the question doesn't state what those types are.
Now if you're actually dealing with a type where accessing KeyValues and Count is cheap (which is quite likely) I wouldn't expect there to be much difference. Mind you, I'd almost always prefer to use foreach where possible:
foreach (var pair in dwhSessionDto.KeyValues)
{
// Use pair here
}
That way you never need the count. But then, you haven't said what you're trying to do inside the loop either. (Hint: to get more useful answers, provide more information.)
It depends how difficult it is to compute dwhSessionDto.KeyValues.Count. If it's just a read of an int field, then the speed of each version will be the same. However, if the Count value needs to be calculated, then it will be calculated on every iteration, and therefore impede performance.
EDIT -- here's some code to demonstrate that the condition is always re-evaluated
public class Temp
{
public int Count { get; set; }
}
static void Main(string[] args)
{
var t = new Temp() {Count = 5};
for (int i = 0; i < t.Count; i++)
{
Console.WriteLine(i);
t.Count--;
}
Console.ReadLine();
}
The output is 0, 1, 2 - only !
See comments for reasons why this answer is wrong.
If there is a difference, it’s the other way round: Indeed, the first one might be faster. That’s because the compiler recognizes that you are iterating from 0 to the end of the array, and it can therefore elide bounds checks within the loop (i.e. when you access dwhSessionDto.KeyValues[i]).
However, I believe the compiler only applies this optimization to arrays so there probably will be no difference here.
It is impossible to say without knowing the implementation of dwhSessionDto.KeyValues.Count and the loop body.
Assume a global variable bool foo = false; and then following implementations:
/* Loop body... */
{
if(foo) Thread.Sleep(1000);
}
/* ... */
public int Count
{
get
{
foo = !foo;
return 10;
}
}
/* ... */
Now, the first loop will perform approximately twice as fast as the second ;D
However, assuming non-moronic implementation, the second one is indeed more likely to be faster.
No. There is no performance difference between these two loops. With JIT and Code Optimization, it does not make any difference.
There is no difference, but why do you think that there is a difference? Can you please post your findings?
If you look at the implementation of inserting an item into Dictionary using Reflector:
private void Insert(TKey key, TValue value, bool add)
{
int freeList;
if (key == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
}
if (this.buckets == null)
{
this.Initialize(0);
}
int num = this.comparer.GetHashCode(key) & 0x7fffffff;
int index = num % this.buckets.Length;
for (int i = this.buckets[index]; i >= 0; i = this.entries[i].next)
{
if ((this.entries[i].hashCode == num) && this.comparer.Equals(this.entries[i].key, key))
{
if (add)
{
ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
}
this.entries[i].value = value;
this.version++;
return;
}
}
if (this.freeCount > 0)
{
freeList = this.freeList;
this.freeList = this.entries[freeList].next;
this.freeCount--;
}
else
{
if (this.count == this.entries.Length)
{
this.Resize();
index = num % this.buckets.Length;
}
freeList = this.count;
this.count++;
}
this.entries[freeList].hashCode = num;
this.entries[freeList].next = this.buckets[index];
this.entries[freeList].key = key;
this.entries[freeList].value = value;
this.buckets[index] = freeList;
this.version++;
}
Count is an internal member of this class which is incremented each time you insert an item into the dictionary,
so I believe that there is no difference at all.
The second version can be faster, sometimes. The point is that the condition is reevaluated after every iteration, so if e.g. the getter of "Count" actually counts the elements in an IEnumerable, or interrogates a database, etc., this will slow things down.
So I'd say that if you don't affect the value of "Count" in the "for", the second version is safer.
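To see how often the getter actually runs in each version, a small sketch with a counting property can help. CountingSource and LoopDemo are hypothetical types invented for this illustration; they stand in for whatever dwhSessionDto.KeyValues really is.

```csharp
using System;

// Hypothetical collection whose Count getter records how often it is read.
public sealed class CountingSource
{
    private readonly int _count;
    public CountingSource(int count) { _count = count; }

    public int Reads { get; private set; }

    public int Count
    {
        get { Reads++; return _count; }
    }
}

public static class LoopDemo
{
    // Reads Count in the loop condition, so the getter runs before every iteration
    // plus once more for the final failing check.
    public static int ReadsWhenInline(int size)
    {
        var source = new CountingSource(size);
        for (int i = 0; i < source.Count; i++) { }
        return source.Reads;
    }

    // Reads Count once and loops against the cached value.
    public static int ReadsWhenCached(int size)
    {
        var source = new CountingSource(size);
        int count = source.Count;
        for (int i = 0; i < count; i++) { }
        return source.Reads;
    }
}
```

For a size of 10, the inline version reads the getter 11 times and the cached version reads it once, so the cost difference is exactly the cost of the getter multiplied by the iteration count.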