Adding a Remove(int index) method to the .NET Queue class - c#

I would like to use the generic Queue class from the .NET Framework (3.5),
but I will need a Remove(int index) method to remove items from the queue. Can I achieve this functionality with an extension method? Anyone care to point me in the right direction?

What you want is a List<T> where you call RemoveAt(0) whenever you want to take the next item from the queue. Everything else is the same, really (calling Add adds an item to the end of the queue).
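A minimal sketch of that idea (the variable names are just illustrative):

// A List<T> used as a FIFO queue: Add appends to the tail,
// RemoveAt(0) pops the head, and RemoveAt(i) gives you removal by index.
var queue = new List<string>();
queue.Add("first");              // enqueue
queue.Add("second");
queue.Add("third");
string head = queue[0];          // peek
queue.RemoveAt(0);               // dequeue ("first" is gone)
queue.RemoveAt(1);               // remove by index ("third" is gone)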

Taking both casperOne's and David Anderson's suggestions a step further, the following class inherits from List<T> and hides the methods that would break the FIFO concept, while adding the three Queue methods (Enqueue, Dequeue, Peek).
public class ListQueue<T> : List<T>
{
    new public void Add(T item) { throw new NotSupportedException(); }
    new public void AddRange(IEnumerable<T> collection) { throw new NotSupportedException(); }
    new public void Insert(int index, T item) { throw new NotSupportedException(); }
    new public void InsertRange(int index, IEnumerable<T> collection) { throw new NotSupportedException(); }
    new public void Reverse() { throw new NotSupportedException(); }
    new public void Reverse(int index, int count) { throw new NotSupportedException(); }
    new public void Sort() { throw new NotSupportedException(); }
    new public void Sort(Comparison<T> comparison) { throw new NotSupportedException(); }
    new public void Sort(IComparer<T> comparer) { throw new NotSupportedException(); }
    new public void Sort(int index, int count, IComparer<T> comparer) { throw new NotSupportedException(); }

    public void Enqueue(T item)
    {
        base.Add(item);
    }

    public T Dequeue()
    {
        var t = base[0];
        base.RemoveAt(0);
        return t;
    }

    public T Peek()
    {
        return base[0];
    }
}
Test code:
class Program
{
    static void Main(string[] args)
    {
        ListQueue<string> queue = new ListQueue<string>();

        Console.WriteLine("Item count in ListQueue: {0}", queue.Count);
        Console.WriteLine();

        for (int i = 1; i <= 10; i++)
        {
            var text = String.Format("Test{0}", i);
            queue.Enqueue(text);
            Console.WriteLine("Just enqueued: {0}", text);
        }

        Console.WriteLine();
        Console.WriteLine("Item count in ListQueue: {0}", queue.Count);
        Console.WriteLine();

        var peekText = queue.Peek();
        Console.WriteLine("Just peeked at: {0}", peekText);
        Console.WriteLine();

        var textToRemove = "Test5";
        queue.Remove(textToRemove);
        Console.WriteLine("Just removed: {0}", textToRemove);
        Console.WriteLine();

        var queueCount = queue.Count;
        for (int i = 0; i < queueCount; i++)
        {
            var text = queue.Dequeue();
            Console.WriteLine("Just dequeued: {0}", text);
        }

        Console.WriteLine();
        Console.WriteLine("Item count in ListQueue: {0}", queue.Count);
        Console.WriteLine();

        Console.WriteLine("Now try to ADD an item...should cause an exception.");
        queue.Add("shouldFail");
    }
}

Here's how you can remove a specific item from the queue with one line of LINQ (it recreates the queue, but for lack of a better method...)
//replace "<string>" with your actual underlying type
myqueue = new Queue<string>(myqueue.Where(s => s != itemToBeRemoved));
I know it's not removing by index, but someone might still find this useful (this question ranks on Google for "remove specific item from a c# queue", so I decided to add this answer - sorry).
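If you actually need removal by index, the same rebuild trick works with the indexed overload of Where (a sketch; indexToRemove is a hypothetical variable):

// Rebuilds the queue without the element at position indexToRemove.
myqueue = new Queue<string>(myqueue.Where((item, index) => index != indexToRemove));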

It's a pretty late answer, but I'm writing it for future readers.
List<T> is exactly what you need, but it has a big disadvantage compared to Queue<T>: it's implemented on top of an array, so Dequeue() is pretty expensive (in terms of time) because all remaining items must be shifted one step back with Array.Copy. Queue<T> also uses an array, but together with two indices (for head and tail).
In your case you also need Remove/RemoveAt, and its performance won't be good either (for the same reason: every item after the removed one has to be shifted).
A better data structure for quick Dequeue/Remove times is a linked list (you sacrifice a little Enqueue performance, but assuming your queue sees a roughly equal number of Enqueue and Dequeue operations you'll gain a lot, especially as it grows).
Let's see a simple skeleton for its implementation (I'll skip the implementation of IEnumerable<T>, IList<T> and other helper methods).
class LinkedQueue<T>
{
    public int Count
    {
        get { return _items.Count; }
    }

    public void Enqueue(T item)
    {
        _items.AddLast(item);
    }

    public T Dequeue()
    {
        if (_items.First == null)
            throw new InvalidOperationException("...");

        var item = _items.First.Value;
        _items.RemoveFirst();

        return item;
    }

    public void Remove(T item)
    {
        _items.Remove(item);
    }

    public void RemoveAt(int index)
    {
        Remove(_items.Skip(index).First());
    }

    private LinkedList<T> _items = new LinkedList<T>();
}
For a quick comparison:

            Queue        List         LinkedList
Enqueue     O(1)/O(n)*   O(1)/O(n)*   O(1)
Dequeue     O(1)         O(n)         O(1)
Remove      n/a          O(n)         O(n)

* O(1) is the typical case, but occasionally it will be O(n) (when the internal array needs to be resized).
Of course you'll pay something for what you gain: memory usage is higher (especially for small T the per-node overhead will be significant). The right implementation (List<T> vs LinkedList<T>) must be chosen carefully for your usage scenario; you may also convert that code to use a singly linked list to cut the per-node memory overhead roughly in half.

Although there isn't a built-in way, you shouldn't need a List or any other structure, provided RemoveAt isn't a frequent operation.
If you normally enqueue and dequeue but only occasionally remove, then you should be able to afford rebuilding the queue on removal.
public static void Remove<T>(this Queue<T> queue, T itemToRemove) where T : class
{
    var list = queue.ToList(); //Needs to be a copy, so we can clear the queue
    queue.Clear();
    foreach (var item in list)
    {
        if (item == itemToRemove)
            continue;

        queue.Enqueue(item);
    }
}

public static void RemoveAt<T>(this Queue<T> queue, int itemIndex)
{
    var list = queue.ToList(); //Needs to be a copy, so we can clear the queue
    queue.Clear();

    for (int i = 0; i < list.Count; i++)
    {
        if (i == itemIndex)
            continue;

        queue.Enqueue(list[i]);
    }
}
The following approach might be more efficient, using less memory, and thus less GC:
public static void RemoveAt<T>(this Queue<T> queue, int itemIndex)
{
    var cycleAmount = queue.Count;

    for (int i = 0; i < cycleAmount; i++)
    {
        T item = queue.Dequeue();
        if (i == itemIndex)
            continue;

        queue.Enqueue(item);
    }
}

Someone will probably come up with a better solution, but from what I can see you will need to return a new Queue object from your Remove method. You'll want to check whether the index is out of bounds, and I may have got the ordering of the items wrong, but here's a quick and dirty example that could be turned into an extension method quite easily.
public class MyQueue<T> : Queue<T>
{
    public MyQueue()
        : base()
    {
        // Default constructor
    }

    public MyQueue(Int32 capacity)
        : base(capacity)
    {
        // Capacity constructor
    }

    /// <summary>
    /// Removes the item at the specified index and returns a new Queue
    /// </summary>
    public MyQueue<T> RemoveAt(Int32 index)
    {
        MyQueue<T> retVal = new MyQueue<T>(Count - 1);

        for (Int32 i = 0; i < this.Count; i++)
        {
            if (i != index)
            {
                retVal.Enqueue(this.ElementAt(i));
            }
        }

        return retVal;
    }
}

I do not believe we should use List<T> to emulate a queue: for a queue, the Enqueue and Dequeue operations should be highly performant, which they would not be with a List<T>. For the RemoveAt method, however, it is acceptable to be less performant, since removal by index is not the primary purpose of a Queue<T>.
My approach to implementing RemoveAt is O(n), but the queue still maintains a largely O(1) Enqueue (occasionally the internal array needs reallocating, which makes that operation O(n)) and an always O(1) Dequeue.
Here is my implementation of a RemoveAt(int) extension method for a Queue<T>:
public static void RemoveAt<T>(this Queue<T> queue, int index)
{
    Contract.Requires(queue != null);
    Contract.Requires(index >= 0);
    Contract.Requires(index < queue.Count);

    var i = 0;

    // Move all the items before the one to remove to the back
    for (; i < index; ++i)
    {
        queue.MoveHeadToTail();
    }

    // Remove the item at the index
    queue.Dequeue();

    // Move all subsequent items to the tail end of the queue.
    var queueCount = queue.Count;
    for (; i < queueCount; ++i)
    {
        queue.MoveHeadToTail();
    }
}
Where MoveHeadToTail is defined as follows:
private static void MoveHeadToTail<T>(this Queue<T> queue)
{
    Contract.Requires(queue != null);

    var dequed = queue.Dequeue();
    queue.Enqueue(dequed);
}
This implementation also modifies the actual Queue<T> rather than returning a new Queue<T> (which I think is more in keeping with other RemoveAt implementations).

In fact, this defeats the whole purpose of a Queue, and the class you'll eventually come up with will violate the FIFO semantics altogether.

If the Queue is being used to preserve the order of the items in the collection, and if you will not have duplicate items, then a SortedSet might be what you are looking for. The SortedSet acts much like a List<T>, but stays ordered. Great for things like drop down selections.
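For example (a minimal sketch; note that a SortedSet keeps items in comparer order and ignores duplicates):

var set = new SortedSet<string>();
set.Add("banana");
set.Add("apple");
set.Remove("apple");        // removal by value, no index needed
foreach (var item in set)   // enumerates in sorted order
    Console.WriteLine(item);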

David Anderson's solution is probably the best, but it has some overhead.
Are you using custom objects in the queue? If so, add a boolean such as Cancelled.
Have the workers that process the queue check that flag and skip the item if it is set, as sketched below.
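A rough sketch of that idea (the WorkItem type and its members are made up for illustration):

class WorkItem
{
    public string Payload;
    public volatile bool Cancelled;   // "removing" an item just sets this flag
}

static void ProcessQueue(Queue<WorkItem> queue)
{
    while (queue.Count > 0)
    {
        var item = queue.Dequeue();
        if (item.Cancelled)
            continue;                 // worker skips logically removed items
        // process item.Payload here
    }
}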

Note that with a list you can make the "removal" process more efficient if you don't actually remove the item but merely mark it as removed. Yes, you have to add a bit of code to deal with that, but the payoff is efficiency.
Just as one example: say you have a List<string>. Then you can, for example, simply set that particular item to null and be done with it.
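For instance (illustrative only):

var list = new List<string> { "a", "b", "c" };
list[1] = null;                 // logically "removed", no shifting cost
foreach (var s in list)
{
    if (s == null) continue;    // consumers just skip the tombstones
    Console.WriteLine(s);
}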

The queue class is so difficult to understand. Use a generic list instead.

Related

Correct Concurrent Collection for storing timed non-recurring structures

I am looking for the right thread-safe collection (concurrent collection) for the following scenario:
I may get requests from an external source that generates GUIDs (so they are unique and non-recurring). I need to store the last 100 or so requests and check whether a duplicate GUID is delivered. Due to some limitations I cannot store much more than 100 GUIDs.
Now the problem is that when this mechanism is used in a service, it must stay bounded to 100 items, and searching by GUID is vital.
I decided to use a ConcurrentDictionary, yet I doubt it is a good decision, since I will have to replace keys once all 100 slots are used. I need a good mechanism to replace the oldest request when the dictionary is full.
Any idea is much appreciated.
A code snippet is provided to show my incomplete implementation
public static ConcurrentDictionary<string, TimedProto> IncidentsCreated = new ConcurrentDictionary<string, TimedProto>(20, 100);

private static bool AddTo_AddedIncidents(proto ReceivedIncident)
{
    try
    {
        int OldestCounter = 0;
        DateTime OldestTime = DateTime.Now;

        if (IncidentsCreated.Count < 100)
        {
            TimedProto tp = new TimedProto();
            tp.IncidentProto = ReceivedIncident;
            tp.time = DateTime.Now;
            IncidentsCreated.AddOrUpdate(ReceivedIncident.IncidentGUID, tp, (s, i) => i);
            return true;
        }
        else //array is full, a replace-oldest mechanism is required
        {
        }
        return true;
    }
    catch (Exception ex)
    {
        LogEvent("AddTo_AddedIncidents\n" + ex.ToString(), EventLogEntryType.Error);
        return false;
    }
}

public struct proto
{
    public string IncidentGUID;
    //other variables
}

public struct TimedProto
{
    public proto IncidentProto;
    public DateTime time;
}
Thanks
Try this one: http://ayende.com/blog/162529/trivial-lru-cache-impl?key=02e8069c-62f8-4042-a7d2-d93806369824&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+AyendeRahien+%28Ayende+%40+Rahien%29
Your implementation is flawed since you use DateTime, which has a granularity of roughly 15 ms. This means that you can accidentally delete even your most recent GUIDs if you have a high inflow.
public class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();

    class Reference<T> where T : struct
    {
        public T Value;
    }

    private class Node
    {
        public TValue Value;
        public volatile Reference<long> Ticks;
    }

    private readonly ConcurrentDictionary<TKey, Node> _nodes = new ConcurrentDictionary<TKey, Node>();

    public LruCache(int capacity)
    {
        Debug.Assert(capacity > 10);
        _capacity = capacity;
    }

    public void Set(TKey key, TValue value)
    {
        var node = new Node
        {
            Value = value,
            Ticks = new Reference<long> { Value = _stopwatch.ElapsedTicks }
        };

        _nodes.AddOrUpdate(key, node, (_, __) => node);
        if (_nodes.Count > _capacity)
        {
            foreach (var source in _nodes.OrderBy(x => x.Value.Ticks.Value).Take(_nodes.Count / 10))
            {
                Node _;
                _nodes.TryRemove(source.Key, out _);
            }
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        Node node;
        if (_nodes.TryGetValue(key, out node))
        {
            node.Ticks = new Reference<long> { Value = _stopwatch.ElapsedTicks };
            value = node.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }
}
I would use a Circular Buffer for this - there are plenty of implementations around, including this one, and making a thread-safe wrapper for one of them wouldn't be hard.
With only 100 or so slots, looking up by key would be reasonably efficient, and inserting would be extremely efficient (no reallocation as old items are discarded and replaced by new ones).
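A minimal sketch of that idea (not thread-safe on its own; the class and member names are illustrative):

class RecentGuidBuffer
{
    private readonly string[] _slots;
    private int _next;   // index of the slot that will be overwritten next

    public RecentGuidBuffer(int capacity)
    {
        _slots = new string[capacity];
    }

    // Returns true if the guid is among the last N remembered entries.
    public bool SeenBefore(string guid)
    {
        return Array.IndexOf(_slots, guid) >= 0;
    }

    public void Remember(string guid)
    {
        _slots[_next] = guid;                 // the oldest entry is overwritten
        _next = (_next + 1) % _slots.Length;
    }
}

Wrapping the two methods in a lock (or a ReaderWriterLockSlim) would make it thread-safe.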

Lock only on an Id

I have a method which needs to run a block of code exclusively, but I want to add this restriction only if it is really required. Depending on an Id value (an Int32) I load/modify distinct objects, so it doesn't make sense to lock access for all threads. Here's a first attempt at doing this -
private static readonly ConcurrentDictionary<int, Object> LockObjects = new ConcurrentDictionary<int, Object>();

void Method(int Id)
{
    lock (LockObjects.GetOrAdd(Id, new Object()))
    {
        //Do the long running task here - db fetches, changes etc
        Object Ref;
        LockObjects.TryRemove(Id, out Ref);
    }
}
I have my doubts whether this will work - the TryRemove can fail (which will cause the ConcurrentDictionary to keep growing).
A more obvious bug is that TryRemove successfully removes the object while other threads (for the same Id) are still waiting (locked out) on it; a new thread with the same Id can then come in, add a new object and start processing, since nobody else is waiting on the object it just added.
Should I be using the TPL or some sort of ConcurrentQueue to queue up my tasks instead? What's the simplest solution?
I use a similar approach to lock resources for related items rather than using a blanket resource lock... it works perfectly.
You're almost there, but you really don't need to remove the object from the dictionary; just let the next caller with that id take the lock on the same object.
Surely there is a limit to the number of unique ids in your application? What is that limit?
The main semantic issue I see is that an object can be locked without being listed in the collection, because the last line inside the lock removes it and a waiting thread can then pick it up and lock it.
Change the collection to be a collection of objects that guard a lock. Do not name it LockedObjects, and do not remove the objects from the collection unless you no longer expect that id to be needed.
I always think of this type of object as a key rather than a lock or a blocked object; the object itself is not locked, it is a key to locked sequences of code.
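In other words, something like this (a sketch of the suggestion above, with the removal step dropped):

private static readonly ConcurrentDictionary<int, object> Gates =
    new ConcurrentDictionary<int, object>();

void Method(int id)
{
    // The gate object for this id stays in the dictionary so later callers reuse it.
    lock (Gates.GetOrAdd(id, _ => new object()))
    {
        // do the long running, per-id work here
    }
}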
I used the following approach: do not key on the original ID directly, but map it to a small int hash-code that selects an existing object to lock on. The number of lockers depends on your situation - the more lockers, the lower the probability of collisions.
class ThreadLocker
{
    const int DEFAULT_LOCKERS_COUNTER = 997;

    int lockersCount;
    object[] lockers;

    public ThreadLocker(int MaxLockersCount)
    {
        if (MaxLockersCount < 1)
            throw new ArgumentOutOfRangeException("MaxLockersCount", MaxLockersCount, "Counter cannot be less than 1");

        lockersCount = MaxLockersCount;
        lockers = Enumerable.Range(0, lockersCount).Select(_ => new object()).ToArray();
    }

    public ThreadLocker() : this(DEFAULT_LOCKERS_COUNTER) { }

    public object GetLocker(int ObjectID)
    {
        var idx = (ObjectID % lockersCount + lockersCount) % lockersCount;
        return lockers[idx];
    }

    public object GetLocker(string ObjectID)
    {
        var hash = ObjectID.GetHashCode();
        return GetLocker(hash);
    }

    public object GetLocker(Guid ObjectID)
    {
        var hash = ObjectID.GetHashCode();
        return GetLocker(hash);
    }
}
Usage:
partial class Program
{
    static ThreadLocker locker = new ThreadLocker();

    static void Main(string[] args)
    {
        var id = 10;
        lock (locker.GetLocker(id))
        {
        }
    }
}
Of course, you can use any hash-code function to get the corresponding array index.
If you want to use the ID itself and not allow collisions caused by the hash-code, you can use the following approach: maintain a Dictionary of lock objects and track the number of threads that currently want to use each ID:
class ThreadLockerByID<T>
{
    Dictionary<T, lockerObject<T>> lockers = new Dictionary<T, lockerObject<T>>();

    public IDisposable AcquireLock(T ID)
    {
        lockerObject<T> locker;
        lock (lockers)
        {
            if (lockers.ContainsKey(ID))
            {
                locker = lockers[ID];
            }
            else
            {
                locker = new lockerObject<T>(this, ID);
                lockers.Add(ID, locker);
            }
            locker.counter++;
        }
        Monitor.Enter(locker);
        return locker;
    }

    protected void ReleaseLock(T ID)
    {
        lock (lockers)
        {
            if (!lockers.ContainsKey(ID))
                return;

            var locker = lockers[ID];
            locker.counter--;

            if (Monitor.IsEntered(locker))
                Monitor.Exit(locker);

            if (locker.counter == 0)
                lockers.Remove(locker.id);
        }
    }

    class lockerObject<T> : IDisposable
    {
        readonly ThreadLockerByID<T> parent;
        internal readonly T id;
        internal int counter = 0;

        public lockerObject(ThreadLockerByID<T> Parent, T ID)
        {
            parent = Parent;
            id = ID;
        }

        public void Dispose()
        {
            parent.ReleaseLock(id);
        }
    }
}
Usage:
partial class Program
{
    static ThreadLockerByID<int> locker = new ThreadLockerByID<int>();

    static void Main(string[] args)
    {
        var id = 10;
        using (locker.AcquireLock(id))
        {
        }
    }
}
There are mini-libraries that do this for you, such as AsyncKeyedLock. I've used it and it saved me a lot of headaches.

Threading and List<> collection

I have a List<string> collection called list.
I have two threads.
One thread enumerates all of the list's elements and adds items to the collection.
The second thread enumerates all of the list's elements and removes items from it.
How can I make this thread safe?
I tried creating global Object "MyLock" and using lock(MyLock) block in each thread function but it didn't work.
Can you help me?
If you have access to .NET 4.0 you can use the ConcurrentQueue class, or a BlockingCollection backed by a ConcurrentQueue. They do exactly what you are trying to do and do not require any locking. The BlockingCollection will make your thread wait if there are no items available in the collection.
An example of removing from the ConcurrentQueue looks something like this:
ConcurrentQueue<MyClass> cq = new ConcurrentQueue<MyClass>();

void GetStuff()
{
    MyClass item;
    if (cq.TryDequeue(out item))
    {
        //Work with item
    }
}
This will try to remove an item, but if none is available it does nothing.
BlockingCollection<MyClass> bc = new BlockingCollection<MyClass>(new ConcurrentQueue<MyClass>());

void GetStuff()
{
    if (!bc.IsCompleted) //check to see if CompleteAdding() was called and the collection is empty
    {
        try
        {
            MyClass item = bc.Take();
            //Work with item
        }
        catch (InvalidOperationException)
        {
            //Adding is marked as completed and the collection is empty, so there is nothing to take
        }
    }
}
This will block and wait on the Take until there is something available to take from the collection. Once you are done adding you can call CompleteAdding(), and Take will throw an exception when the collection becomes empty instead of blocking.
Without knowing more about your program and requirements, I'm going to say that this is a "Bad Idea". Altering a List<> while iterating through its contents will most likely throw an exception.
You're better off using a Queue<> instead of a List<>, as a queue is a much better fit for this producer/consumer pattern.
You should be able to lock directly on your list:
lock (list)
{
    //work with list here
}
However adding/removing from the list while enumerating it will likely cause an exception...
Lock on the SyncRoot of your List<T>:
lock(list.SyncRoot)
{
}
More information on how to use it properly can be found here
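Note that List<T> only exposes SyncRoot through the non-generic ICollection interface, so in practice a cast is needed (a small sketch):

// List<T> implements ICollection.SyncRoot explicitly, hence the cast.
var syncRoot = ((System.Collections.ICollection)list).SyncRoot;
lock (syncRoot)
{
    // work with list here
}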
You could implement your own version of IList<T> that wraps the underlying List<T> to provide locking on every method call.
public class LockingList<T> : IList<T>
{
    public LockingList(IList<T> inner)
    {
        this.Inner = inner;
    }

    private readonly object gate = new object();

    public IList<T> Inner { get; private set; }

    public int IndexOf(T item)
    {
        lock (gate)
        {
            return this.Inner.IndexOf(item);
        }
    }

    public void Insert(int index, T item)
    {
        lock (gate)
        {
            this.Inner.Insert(index, item);
        }
    }

    public void RemoveAt(int index)
    {
        lock (gate)
        {
            this.Inner.RemoveAt(index);
        }
    }

    public T this[int index]
    {
        get
        {
            lock (gate)
            {
                return this.Inner[index];
            }
        }
        set
        {
            lock (gate)
            {
                this.Inner[index] = value;
            }
        }
    }

    public void Add(T item)
    {
        lock (gate)
        {
            this.Inner.Add(item);
        }
    }

    public void Clear()
    {
        lock (gate)
        {
            this.Inner.Clear();
        }
    }

    public bool Contains(T item)
    {
        lock (gate)
        {
            return this.Inner.Contains(item);
        }
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        lock (gate)
        {
            this.Inner.CopyTo(array, arrayIndex);
        }
    }

    public int Count
    {
        get
        {
            lock (gate)
            {
                return this.Inner.Count;
            }
        }
    }

    public bool IsReadOnly
    {
        get
        {
            lock (gate)
            {
                return this.Inner.IsReadOnly;
            }
        }
    }

    public bool Remove(T item)
    {
        lock (gate)
        {
            return this.Inner.Remove(item);
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        lock (gate)
        {
            return this.Inner.ToArray().AsEnumerable().GetEnumerator();
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        lock (gate)
        {
            return this.Inner.ToArray().GetEnumerator();
        }
    }
}
You would use this code like this:
var list = new LockingList<int>(new List<int>());
If you're using large lists and/or performance is an issue then this kind of locking may not be terribly performant, but in most cases it should be fine.
It is very important to notice that the two GetEnumerator methods call .ToArray(). This forces the evaluation of the enumerator before the lock is released thus ensuring that any modifications to the list don't affect the actual enumeration.
Using code like lock (list) { ... } or lock (list.SyncRoot) { ... } does not protect you against list changes occurring during enumeration. Those approaches only guard against concurrent modifications to the list - and only if all callers modify it within a lock. They can also cause your code to hang if some nasty bit of code takes the lock and doesn't release it.
In my solution you'll notice I have a object gate that is a private variable internal to the class that I lock on. Nothing outside the class can lock on this so it is safe.
I hope this helps.
As others have already said, you can use the concurrent collections from the System.Collections.Concurrent namespace. If one of those fits, it is the preferred option.
But if you really want a list that is simply synchronized, you could look at the SynchronizedCollection<T> class in System.Collections.Generic.
Note that you have to reference the System.ServiceModel assembly, which is also the reason I don't like it so much. But sometimes I use it.
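Usage looks just like a plain list; every member takes an internal lock (a brief sketch, assuming the System.ServiceModel reference mentioned above):

var items = new SynchronizedCollection<string>();
items.Add("a");
items.Add("b");
bool found = items.Contains("a");   // each call locks internally
items.Remove("b");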

C# Priority Queue

I'm looking for a priority queue with an interface like this:
class PriorityQueue<T>
{
    public void Enqueue(T item, int priority)
    {
    }

    public T Dequeue()
    {
    }
}
All the implementations I've seen assume that item is an IComparable but I don't like this approach; I want to specify the priority when I'm pushing it onto the queue.
If a ready-made implementation doesn't exist, what's the best way to go about doing this myself? What underlying data structure should I use? Some sort of self-balancing tree, or what? A standard C#.net structure would be nice.
If you have an existing priority queue implementation based on IComparable, you can easily use that to build the structure you need:
public class CustomPriorityQueue<T> // where T need NOT be IComparable
{
    private class PriorityQueueItem : IComparable<PriorityQueueItem>
    {
        private readonly T _item;
        private readonly int _priority;

        // obvious constructor, CompareTo implementation and Item accessor
    }

    // the existing PQ implementation where the item *does* need to be IComparable
    private readonly PriorityQueue<PriorityQueueItem> _inner = new PriorityQueue<PriorityQueueItem>();

    public void Enqueue(T item, int priority)
    {
        _inner.Enqueue(new PriorityQueueItem(item, priority));
    }

    public T Dequeue()
    {
        return _inner.Dequeue().Item;
    }
}
You can add safety checks and what not, but here is a very simple implementation using SortedList:
class PriorityQueue<T>
{
    SortedList<Pair<int>, T> _list;
    int count;

    public PriorityQueue()
    {
        _list = new SortedList<Pair<int>, T>(new PairComparer<int>());
    }

    public void Enqueue(T item, int priority)
    {
        _list.Add(new Pair<int>(priority, count), item);
        count++;
    }

    public T Dequeue()
    {
        T item = _list[_list.Keys[0]];
        _list.RemoveAt(0);
        return item;
    }
}
I'm assuming that smaller priority values correspond to higher-priority items (this is easy to modify).
If multiple threads will be accessing the queue you will also need to add a locking mechanism. This is easy, but let me know if you need guidance here.
Note that SortedList<TKey, TValue> is implemented internally as a pair of sorted arrays (lookups use binary search), not as a binary tree; SortedDictionary<TKey, TValue> is the tree-based alternative.
The above implementation needs the following helper classes. This addresses Lasse V. Karlsen's comment that items with the same priority cannot be added using a naive implementation over a SortedList.
class Pair<T>
{
    public T First { get; private set; }
    public T Second { get; private set; }

    public Pair(T first, T second)
    {
        First = first;
        Second = second;
    }

    public override int GetHashCode()
    {
        return First.GetHashCode() ^ Second.GetHashCode();
    }

    public override bool Equals(object other)
    {
        Pair<T> pair = other as Pair<T>;
        if (pair == null)
        {
            return false;
        }
        return (this.First.Equals(pair.First) && this.Second.Equals(pair.Second));
    }
}

class PairComparer<T> : IComparer<Pair<T>> where T : IComparable
{
    public int Compare(Pair<T> x, Pair<T> y)
    {
        if (x.First.CompareTo(y.First) < 0)
        {
            return -1;
        }
        else if (x.First.CompareTo(y.First) > 0)
        {
            return 1;
        }
        else
        {
            return x.Second.CompareTo(y.Second);
        }
    }
}
You could write a wrapper around one of the existing implementations that modifies the interface to your preference:
using System;

class PriorityQueueThatYouDontLike<T> where T : IComparable<T>
{
    public void Enqueue(T item) { throw new NotImplementedException(); }
    public T Dequeue() { throw new NotImplementedException(); }
}

class PriorityQueue<T>
{
    class ItemWithPriority : IComparable<ItemWithPriority>
    {
        public ItemWithPriority(T t, int priority)
        {
            Item = t;
            Priority = priority;
        }

        public T Item { get; private set; }
        public int Priority { get; private set; }

        public int CompareTo(ItemWithPriority other)
        {
            return Priority.CompareTo(other.Priority);
        }
    }

    PriorityQueueThatYouDontLike<ItemWithPriority> q = new PriorityQueueThatYouDontLike<ItemWithPriority>();

    public void Enqueue(T item, int priority)
    {
        q.Enqueue(new ItemWithPriority(item, priority));
    }

    public T Dequeue()
    {
        return q.Dequeue().Item;
    }
}
This is the same as itowlson's suggestion. I just took longer to write mine because I filled out more of the methods. :-s
Here's a very simple lightweight implementation that has O(log(n)) performance for both push and pop. It uses a heap data structure built on top of a List<T>.
/// <summary>Implements a priority queue of T, where T has an ordering.</summary>
/// Elements may be added to the queue in any order, but when we pull
/// elements out of the queue, they will be returned in 'ascending' order.
/// Adding new elements into the queue may be done at any time, so this is
/// useful to implement a dynamically growing and shrinking queue. Both adding
/// an element and removing the first element are log(N) operations.
///
/// The queue is implemented using a priority-heap data structure. For more
/// details on this elegant and simple data structure see "Programming Pearls"
/// in our library. The tree is implemented atop a list, where 2N and 2N+1 are
/// the child nodes of node N. The tree is balanced and left-aligned so there
/// are no 'holes' in this list.
/// <typeparam name="T">Type T, should implement IComparable[T];</typeparam>
public class PriorityQueue<T> where T : IComparable<T>
{
    /// <summary>Clear all the elements from the priority queue</summary>
    public void Clear()
    {
        mA.Clear();
    }

    /// <summary>Add an element to the priority queue - O(log(n)) time operation.</summary>
    /// <param name="item">The item to be added to the queue</param>
    public void Add(T item)
    {
        // We add the item to the end of the list (at the bottom of the
        // tree). Then, the heap property could be violated between this element
        // and its parent. If this is the case, we swap this element with the
        // parent (a safe operation to do since the element is known to be less
        // than its parent). Now the element moves one level up the tree. We repeat
        // this test with the element and its new parent. The element, if lesser
        // than everybody else in the tree, will eventually bubble all the way up
        // to the root of the tree (or the head of the list). It is easy to see
        // this will take log(N) time, since we are working with a balanced binary
        // tree.
        int n = mA.Count;
        mA.Add(item);
        while (n != 0)
        {
            int p = n / 2;                                // This is the 'parent' of this item
            if (mA[n].CompareTo(mA[p]) >= 0) break;       // Item >= parent
            T tmp = mA[n]; mA[n] = mA[p]; mA[p] = tmp;    // Swap item and parent
            n = p;                                        // And continue
        }
    }

    /// <summary>Returns the number of elements in the queue.</summary>
    public int Count
    {
        get { return mA.Count; }
    }

    /// <summary>Returns true if the queue is empty.</summary>
    /// Trying to call Peek() or Next() on an empty queue will throw an exception.
    /// Check using Empty first before calling these methods.
    public bool Empty
    {
        get { return mA.Count == 0; }
    }

    /// <summary>Allows you to look at the first element waiting in the queue, without removing it.</summary>
    /// This element will be the one that will be returned if you subsequently call Next().
    public T Peek()
    {
        return mA[0];
    }

    /// <summary>Removes and returns the first element from the queue (least element)</summary>
    /// <returns>The first element in the queue, in ascending order.</returns>
    public T Next()
    {
        // The element to return is of course the first element in the array,
        // or the root of the tree. However, this will leave a 'hole' there. We
        // fill up this hole with the last element from the array. This will
        // break the heap property. So we bubble the element downwards by swapping
        // it with its lower child until it reaches its correct level. The lower
        // child (one of the original elements with index 1 or 2) will now be at the
        // head of the queue (root of the tree).
        T val = mA[0];
        int nMax = mA.Count - 1;
        mA[0] = mA[nMax]; mA.RemoveAt(nMax);  // Move the last element to the top

        int p = 0;
        while (true)
        {
            // c is the child we want to swap with. If there
            // is no child at all, then the heap is balanced
            int c = p * 2; if (c >= nMax) break;

            // If the second child is smaller than the first, that's the one
            // we want to swap with this parent.
            if (c + 1 < nMax && mA[c + 1].CompareTo(mA[c]) < 0) c++;

            // If the parent is already smaller than this smaller child, then
            // we are done
            if (mA[p].CompareTo(mA[c]) <= 0) break;

            // Otherwise, swap parent and child, and follow down the parent
            T tmp = mA[p]; mA[p] = mA[c]; mA[c] = tmp;
            p = c;
        }
        return val;
    }

    /// <summary>The List we use for implementation.</summary>
    List<T> mA = new List<T>();
}
That is the exact interface used by my highly optimized C# priority-queue.
It was developed specifically for pathfinding applications (A*, etc.), but should work perfectly for any other application as well.
public class User
{
    public string Name { get; private set; }

    public User(string name)
    {
        Name = name;
    }
}

...

var priorityQueue = new SimplePriorityQueue<User>();
priorityQueue.Enqueue(new User("Jason"), 1);
priorityQueue.Enqueue(new User("Joseph"), 10);

//Because it's a min-priority queue, the following line will return "Jason"
User user = priorityQueue.Dequeue();
What would be so terrible about something like this?
class PriorityQueue<TItem, TPriority> where TPriority : IComparable
{
    private SortedList<TPriority, Queue<TItem>> pq = new SortedList<TPriority, Queue<TItem>>();
    public int Count { get; private set; }

    public void Enqueue(TItem item, TPriority priority)
    {
        ++Count;
        if (!pq.ContainsKey(priority)) pq[priority] = new Queue<TItem>();
        pq[priority].Enqueue(item);
    }

    public TItem Dequeue()
    {
        --Count;
        var queue = pq.ElementAt(0).Value;
        if (queue.Count == 1) pq.RemoveAt(0);
        return queue.Dequeue();
    }
}

class PriorityQueue<TItem> : PriorityQueue<TItem, int> { }
I realise that your question specifically asks for a non-IComparable-based implementation, but I want to point out a recent article from Visual Studio Magazine.
http://visualstudiomagazine.com/articles/2012/11/01/priority-queues-with-c.aspx
This article, combined with itowlson's answer, gives a complete answer.
A little late but I'll add it here for reference
https://github.com/ERufian/Algs4-CSharp
Key-value-pair priority queues are implemented in Algs4/IndexMaxPQ.cs, Algs4/IndexMinPQ.cs and Algs4/IndexPQDictionary.cs
Notes:
If the priorities are not IComparable, an IComparer can be specified in the constructor
Instead of enqueueing the object and its priority, what is enqueued is an index and its priority (and, for the original question, a separate List or T[] would be needed to map that index to the expected result)
.NET 6 finally offers a built-in PriorityQueue<TElement, TPriority> API.
See here
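For reference, basic usage of System.Collections.Generic.PriorityQueue<TElement, TPriority> (available since .NET 6) matches the shape asked for in the question:

var pq = new PriorityQueue<string, int>();
pq.Enqueue("low priority", 10);
pq.Enqueue("high priority", 1);
string next = pq.Dequeue();   // "high priority" - smaller priority values dequeue first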
Seems like you could roll your own with a series of Queues, one for each priority, held in a Dictionary: just add each item to the queue for its priority. A rough sketch follows.
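Here is a sketch of that bucket idea (it uses a SortedDictionary so the lowest priority value is found first; the class name is made up for illustration):

class BucketPriorityQueue<T>
{
    // One FIFO queue per priority value, kept sorted by priority.
    private readonly SortedDictionary<int, Queue<T>> _buckets = new SortedDictionary<int, Queue<T>>();

    public void Enqueue(T item, int priority)
    {
        Queue<T> bucket;
        if (!_buckets.TryGetValue(priority, out bucket))
            _buckets[priority] = bucket = new Queue<T>();
        bucket.Enqueue(item);
    }

    public T Dequeue()
    {
        // Requires System.Linq; assumes the queue is not empty.
        var pair = _buckets.First();
        var item = pair.Value.Dequeue();
        if (pair.Value.Count == 0) _buckets.Remove(pair.Key);
        return item;
    }
}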

Which is the best way to improve memory usage when you collect a large data set before processing it? (.NET)

When I have to fetch GBs of data, store it in a collection and then process it, I run into memory overflows. So instead of:
public class Program
{
    public IEnumerable<SomeClass> GetObjects()
    {
        var list = new List<SomeClass>();
        while ( /* get implementation */ )
        {
            list.Add(obj);
        }
        return list;
    }

    public void ProcessObjects(IEnumerable<SomeClass> objects)
    {
        foreach (var obj in objects)
        {
            // process implementation
        }
    }

    void Main()
    {
        var objects = GetObjects();
        ProcessObjects(objects);
    }
}
I need to:
public class Program
{
    void ProcessObject(SomeClass obj)
    {
        // process implementation
    }

    public void GetAndProcessObjects()
    {
        while ( /* get implementation */ )
        {
            ProcessObject(obj);
        }
    }

    void Main()
    {
        GetAndProcessObjects();
    }
}
Is there a better way?
You ought to leverage C#'s iterator blocks and use the yield return statement to do something like this:
public class Program
{
    public IEnumerable<SomeClass> GetObjects()
    {
        while ( /* get implementation */ )
        {
            yield return obj;
        }
    }

    public void ProcessObjects(IEnumerable<SomeClass> objects)
    {
        foreach (var obj in objects)
        {
            // process implementation
        }
    }

    void Main()
    {
        var objects = GetObjects();
        ProcessObjects(objects);
    }
}
This would allow you to stream each object and not keep the entire sequence in memory - you would only need to keep one object in memory at a time.
Don't use a List, which requires all the data to be present in memory at once. Use IEnumerable<T> and produce the data on demand, or better, use IQueryable<T> and have the entire execution of the query deferred until the data are required.
Alternatively, don't keep the data in memory at all, but rather save the data to a database for processing. When processing is complete, then query the database for the results.
public IEnumerable<SomeClass> GetObjects()
{
    foreach (var obj in GetIQueryableObjects())
        yield return obj;
}
You want to yield!
Delay processing of your enumeration. Build a method that returns an IEnumerable but only returns one record at a time using the yield statement.
The best methodology in this case would be to Get and Process in chunks. You will have to find out how big a chunk to Get and Process by trial and error. So the code would be something like :
public class Program
{
    public IEnumerable<SomeClass> GetObjects(int anchor, int chunkSize)
    {
        var list = new List<SomeClass>();
        while ( /* get implementation for the given anchor and chunkSize */ )
        {
            list.Add(obj);
        }
        return list;
    }

    public void ProcessObjects(IEnumerable<SomeClass> objects)
    {
        foreach (var obj in objects)
        {
            // process implementation
        }
    }

    void Main()
    {
        int chunkSize = 5000;
        int totalSize = /* get total number of rows */;
        int anchor = /* get first row to process as anchor */;

        while (anchor < totalSize)
        {
            var objects = GetObjects(anchor, chunkSize);
            ProcessObjects(objects);
            anchor += chunkSize;
        }
    }
}
