Time-stamped, thread-safe data structure for time-based lookup?

I want to implement a list-like data structure that can be appended to, where each item carries a time-stamp. The point is that I can then get all the data that is newer than a given time-stamp.
I have tried doing this with a ConcurrentDictionary, but I'm not confident this is the best way to do it. I would much prefer a List< decimal[2] >, for reasons I won't go into here: the first element of each array would hold the timestamp and the second the value. Alternatively, I could use a List< TimeStampedObject >. However, apparently there is no such thing as a concurrent list in C#.
For the record, my data is ordered by timestamp.
I want to be able to do things like:
public static Dictionary<DateTime, decimal> GetLatest(DateTime since, Dictionary<DateTime, decimal> requestedDict)
{
// Keep only the entries whose timestamp is strictly newer than 'since'.
return requestedDict.Where(x => x.Key > since).ToDictionary(x => x.Key, x => x.Value);
}
UPDATE:
Here is the List type I have come up with; please let me know if it has any potential pitfalls:
public class ConcurrentList: List<StampedValue>
{
ReaderWriterLockSlim _samplesLock = new ReaderWriterLockSlim();
public ConcurrentList() : base()
{
}
public void AddThreadSafe(StampedValue item){
this._samplesLock.EnterWriteLock();
try
{
this.Add(item);
}
finally
{
this._samplesLock.ExitWriteLock();
}
}
public List<StampedValue> GetLatest(long since){
return this.Where( s => s.Timestamp > since ).ToList();
}
public List<StampedValue> GetLatest(DateTime since){
throw new NotImplementedException();
}
}
public class StampedValue
{
public long Timestamp { get; set; }
public decimal Value { get; set; }
public StampedValue(long t, decimal v){
this.Timestamp = t;
this.Value = v;
}
}

Seems to me your best bet is just a List<T> that you protect with a ReaderWriterLockSlim. For example:
class Sample
{
public DateTime EventTime { get; set; }
public decimal Value { get; set; }
}
List<Sample> _samples = new List<Sample>();
ReaderWriterLockSlim _samplesLock = new ReaderWriterLockSlim();
// to get items after a particular date
List<Sample> GetSamplesAfterDate(DateTime dt)
{
_samplesLock.EnterReadLock();
try
{
return _samples.Where(s => s.EventTime >= dt).ToList();
}
finally
{
_samplesLock.ExitReadLock();
}
}
If your list is known to be in chronological order, then you can improve the performance by using binary search on the list to find the first item that's greater than or equal to your passed time stamp. I just used the LINQ version here because the point is to illustrate the locking.
Appending to the list is similar: acquire the write lock, append, and release the lock:
void AppendSample(Sample s)
{
_samplesLock.EnterWriteLock();
try
{
_samples.Add(s);
}
finally
{
_samplesLock.ExitWriteLock();
}
}
An alternative is to use List<KeyValuePair<DateTime, decimal>> rather than List<Sample>. The locking would remain the same.
This should perform quite well in most situations.
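The binary search mentioned above might look roughly like the following. This is only a sketch, assuming samples are always appended in chronological order; SampleStore is a hypothetical wrapper name that does not appear in the code above, and the search is hand-written so that it returns the first item at or after the requested time even when several samples share a time stamp.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class Sample
{
    public DateTime EventTime { get; set; }
    public decimal Value { get; set; }
}

// Hypothetical wrapper class (the name is an assumption, not from the answer).
public class SampleStore
{
    private readonly List<Sample> _samples = new List<Sample>();
    private readonly ReaderWriterLockSlim _samplesLock = new ReaderWriterLockSlim();

    public void AppendSample(Sample s)
    {
        _samplesLock.EnterWriteLock();
        try { _samples.Add(s); }
        finally { _samplesLock.ExitWriteLock(); }
    }

    public List<Sample> GetSamplesAfterDate(DateTime dt)
    {
        _samplesLock.EnterReadLock();
        try
        {
            // Lower-bound binary search: find the first index whose EventTime >= dt.
            int lo = 0, hi = _samples.Count;
            while (lo < hi)
            {
                int mid = lo + (hi - lo) / 2;
                if (_samples[mid].EventTime < dt) lo = mid + 1;
                else hi = mid;
            }
            // Copy the tail so callers never hold a reference to the shared list.
            return _samples.GetRange(lo, _samples.Count - lo);
        }
        finally { _samplesLock.ExitReadLock(); }
    }
}
```

The lookup is O(log n) plus the cost of copying the matching tail, instead of a full scan on every call. It silently returns wrong results if the list is ever appended out of order.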

Have you looked at the SynchronizedCollection<T> class? It seems to me to be what you are looking for. You could also specialize SynchronizedKeyedCollection<K, T>.
EDIT (2014/May/8):
The documentation I linked to above is not as clear or useful as one would like, so it may be helpful to look at the reference implementation.

Related

wrap dictionary for readability?

I'm building a translator that saves the translations in a dictionary where the first string is an identifier and the second string is the translated string.
It seems to me that the dictionary syntax is not very readable, so I'm thinking about wrapping my dictionary like
class Translation : Dictionary<string,string>{}
and then also the key-value pair like
class SingleTranslation : KeyValuePair<string,string>
But KeyValuePair is sealed (it cannot be inherited from). Does anyone have any suggestions on how I can make my dictionary more readable?
My biggest worry is when I have to iterate over the dictionary with
foreach(KeyValuePair<string,string> kvp in _translation)
{
string whatever = kvp.Value;
// do stuff...
// check kvp.Key and do more stuff...
}
I could of course create a string in the foreach that is called Identifier and set it equal to kvp.Key. But I would prefer something like
foreach(SingleTranslation singleTranslation in _translation)
{
singleTranslation.Identifier ... do stuff...
}

Don't do that. Either use Dictionary directly for complete access, or use composition if you want more control.
Also, use var in foreach loops. There is no value in defining a custom type for that, and it would not even work, because you would be trying to convert a KeyValuePair to a derived class. (By the way, this is one reason why it is sealed.)
If you really want to use custom types and do not want to write much custom code, then something like this could work for you:
class Translation
{
public Dictionary<string, string> Data { get; } = new Dictionary<string, string>();
}
Then you could do:
var t = new Translation(); // Fill some data...
foreach (var item in t.Data) { … }
That way, you can ensure that you don't pass the improper dictionary to functions as you use distinct types for each case:
void DisplayTranslation(Translation t) { … }
If you want, you could improve your Translation class so that it does not expose the internal dictionary but expose appropriate members, properties and interfaces for the desired usage.
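As a rough illustration of that last suggestion, here is a minimal sketch of a Translation class that hides its dictionary and exposes only an indexer, a lookup method and enumeration. The member names (Identifier, TryGetValue) and the immutable SingleTranslation type are assumptions chosen for the example, not something the question prescribes.

```csharp
using System.Collections;
using System.Collections.Generic;

// Small read-only pair type with descriptive property names.
public class SingleTranslation
{
    public SingleTranslation(string identifier, string value)
    {
        Identifier = identifier;
        Value = value;
    }

    public string Identifier { get; }
    public string Value { get; }
}

// Composition: the dictionary is a private detail, not a base class.
public class Translation : IEnumerable<SingleTranslation>
{
    private readonly Dictionary<string, string> _data = new Dictionary<string, string>();

    // Dictionary-like access: translation["hello"] = "hola";
    public string this[string identifier]
    {
        get { return _data[identifier]; }
        set { _data[identifier] = value; }
    }

    public bool TryGetValue(string identifier, out string value)
    {
        return _data.TryGetValue(identifier, out value);
    }

    // Enumerating yields SingleTranslation objects instead of KeyValuePair.
    public IEnumerator<SingleTranslation> GetEnumerator()
    {
        foreach (KeyValuePair<string, string> kvp in _data)
            yield return new SingleTranslation(kvp.Key, kvp.Value);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With this in place, foreach (SingleTranslation st in translation) reads st.Identifier and st.Value, while lookups keep the dictionary's hashing behaviour.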

You could always use something other than a dictionary, like a class that inherits from List and then add an indexer on it so you could still use syntax like translations["myIndex"]. The code below could be optimized, but you can get the idea.
public class Translations : List<SingleTranslation>
{
public SingleTranslation this[string identifier]
{
get
{
return this.FirstOrDefault(p => p.Identifier == identifier);
}
set
{
SingleTranslation translation = this.FirstOrDefault(p => p.Identifier == identifier);
if (translation == null)
{
this.Add(value);
}
else
{
translation.Value = value.Value;
}
}
}
}
public class SingleTranslation
{
public SingleTranslation(string identifier, string value)
{
Identifier = identifier;
Value = value;
}
public string Identifier { get; set; }
public string Value { get; set; }
}
Sample usage:
public class Program
{
public static void Main()
{
Translations translations = new Translations();
translations.Add(new SingleTranslation("hello", "hola"));
translations.Add(new SingleTranslation("day", "día"));
foreach(SingleTranslation translation in translations)
{
Console.WriteLine("{0}: {1}", translation.Identifier, translation.Value);
}
translations["hello"].Value = "salut";
translations["day"].Value = "jour";
foreach(SingleTranslation translation in translations)
{
Console.WriteLine("{0}: {1}", translation.Identifier, translation.Value);
}
}
}

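If readability is simply your issue, you could alias the type with a using directive. Writing the fully qualified name works whether the alias sits at the top of the file or inside the namespace declaration:
using SingleTranslation = System.Collections.Generic.KeyValuePair<string, string>;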

Fastest way to compare two List<CustomObject>

I have two List<CustomObject>, called list1 and list2
public class CustomObject
{
public string foo { get; set; }
public string bar{ get; set; }
}
The goal is to generate a new list with all the entries that have been modified/added in list2.
Because these lists can get pretty long, looping through them is not an option ...
Any ideas?

Adding another answer to accommodate some additional NFRs that have come up in the comments:
Objects can be identified by a hash code
The list is very big, so performance is an issue
The idea is to compare an old list to a new list to see if any new hash codes have popped up.
You will want to store the objects of the old list in a dictionary:
var oldList = new Dictionary<string, CustomObject>();
When you add them, provide the hash as the key:
oldList.Add(customObject.Hash, customObject);
To scan for new ones:
var difference = new List<CustomObject>();
foreach (CustomObject o in newList)
{
// A hash that is missing from the old list means the object is new or modified.
if (!oldList.ContainsKey(o.Hash)) difference.Add(o);
}
Log(String.Format("{0} new hashes found.", difference.Count));
By using the Dictionary you take advantage of the way the keys are stored in a hash table. Finding an item in a hash table is faster than doing a scan-and-compare. Each lookup is O(1) on average, so the whole comparison is roughly O(n) instead of O(n^2).
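Applied to the CustomObject from the question, the same idea could be sketched as follows. It assumes, purely for illustration, that foo is a non-null unique identifier and that bar is the data that may change; the question does not say which property plays which role, and ListDiff is a hypothetical helper name.

```csharp
using System.Collections.Generic;

public class CustomObject
{
    public string foo { get; set; }
    public string bar { get; set; }
}

public static class ListDiff
{
    // Assumption: foo identifies an object, bar is the payload that may change.
    public static List<CustomObject> AddedOrModified(
        List<CustomObject> list1, List<CustomObject> list2)
    {
        // Index the old list once: O(n) to build, O(1) average per lookup.
        var oldByKey = new Dictionary<string, CustomObject>(list1.Count);
        foreach (CustomObject o in list1)
            oldByKey[o.foo] = o; // if keys repeat, the last one wins

        var result = new List<CustomObject>();
        foreach (CustomObject o in list2)
        {
            CustomObject previous;
            // Added: key not present before. Modified: key present but bar differs.
            if (!oldByKey.TryGetValue(o.foo, out previous) || previous.bar != o.bar)
                result.Add(o);
        }
        return result;
    }
}
```

Both lists are still traversed once each, which cannot be avoided, but no element of list2 is ever compared against more than one element of list1.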

Here's a traditional way to do it:
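public class CustomObject : IComparable<CustomObject>
{
public string foo { get; set; }
public string bar { get; set; }
public int CompareTo(CustomObject o)
{
if (this.foo == o.foo && this.bar == o.bar) return 0;
// We have to code for the < and > comparisons too. Could get painful if there are a lot of properties to compare.
if (this.foo == o.foo) return this.bar.CompareTo(o.bar);
return this.foo.CompareTo(o.foo);
}
}
Then use LINQ's Except. Note that Except compares items with the default equality comparer (Equals and GetHashCode), not with CompareTo, so CustomObject also needs to override those two methods, or an IEqualityComparer<CustomObject> has to be passed in: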
listA.Except(listB)
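Because Except relies on Equals and GetHashCode rather than on CompareTo, a version that behaves as intended could look like the sketch below. It treats two objects as equal when both foo and bar match, which is an assumption about what "modified" means here; ListComparison is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomObject : IEquatable<CustomObject>
{
    public string foo { get; set; }
    public string bar { get; set; }

    // Value equality: same foo and same bar.
    public bool Equals(CustomObject other)
    {
        return other != null && foo == other.foo && bar == other.bar;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CustomObject);
    }

    // Must agree with Equals, otherwise hash-based set operations misbehave.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (foo == null ? 0 : foo.GetHashCode());
            hash = hash * 31 + (bar == null ? 0 : bar.GetHashCode());
            return hash;
        }
    }
}

public static class ListComparison
{
    // Everything in list2 that has no equal counterpart in list1.
    public static List<CustomObject> AddedOrModified(
        IEnumerable<CustomObject> list1, IEnumerable<CustomObject> list2)
    {
        return list2.Except(list1).ToList();
    }
}
```

Except builds a hash set internally, so it runs in roughly linear time. It also removes duplicates from its result, which matters if list2 can contain equal entries more than once.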

Trying to Utilise a generic <T> collection

I am using C# and I thought I finally had the chance to understand generic types. I have several strongly typed objects that need the same static method. Rather than create one static method for each type, I thought I could make it generic, something I have never done and really wanted to.
Here is where I invoke it.
bool isDuplicate = Utilities.GetDuplicates<RoomBookingModel>(roomBookings);
Here is my static method which resides in a static class called Utilities.
public static bool GetDuplicates<T>(List<T> pBookings)
{
foreach (var item in pBookings)
{
var myVal = item.bookingId;
}
return true;
}
So I want to get at the values within var item inside the foreach loop so I can do comparisons. pBookings is definitely passed in, because I can hover over it and see a .Count() with a collection of my strongly typed objects. I am missing something here, possibly a cast. I was wondering if anyone could advise me where I am coming up short.
var myVal = item.bookingId - I cannot get the bookingId from item because I am lacking some basic understanding here. bookingId doesn't exist; I just get access to members such as .ToString and .Equals.
ANSWER OF SORTS: here is what I did based on all of your really helpful assistance. I used Anderson Pimentel's answer. I'm probably still off the mark, but wanted to garner anyone's thoughts here.
So basically I have several booking models, and all of them need checking for duplicates. I really wanted to understand generics in this way. So what I did was create a base class.
public class BookingBaseModel
{
public int BookingID { get; set; }
public DateTime BookingStartDateTime { get; set; }
public DateTime BookingEndDateTime { get; set; }
}
Then I had my booking classes all inherit what's common to all of them. Like this...
public class RoomBookingModel : BookingBaseModel
{
public string RoomName{ get; set; }
}
public class VehicleBookingModel : BookingBaseModel
{
public string vehicleName{ get; set; }
}
Then in my utilities static helper I did this..
public static void GetDuplicates<T>(List<T> items) where T : BookingBaseModel
{
foreach (var item in items)
{
int myId = item.BookingID;
DateTime startDateTime = item.BookingStartDateTime;
DateTime endDateTime = item.BookingEndDateTime;
// Do your logic here
}
}
Then finally did something like this in corresponding controller action.
RoomController...
Utilities.GetDuplicates<RoomBookingModel>(roomBookings);
VehicleController....
Utilities.GetDuplicates<VehicleBookingModel>(vehicleBookings);
Is this basically how we go about using generics in this way?

The compiler has no hint of what type T is. If you have a base class (or an interface) that has the bookingId property, like BaseModel, you can constrain the generic type as follows:
public class BaseModel
{
public int Id { get; set; }
}
public static bool GetDuplicates<T>(List<T> items) where T : BaseModel
{
foreach (var item in items)
{
var myId = item.Id;
// Do your logic here
}
return true;
}

Once you're inside your GetDuplicates method, you have lost all knowledge of the RoomBookingModel type. That's the point of generic methods: they should be able to act on whatever type has been passed in to them, i.e. the logic within them should be generic across any type.
So your foreach loop is fine - you know you've been given a list of something, and you know lists can be iterated. But inside that foreach, item is just a T. You don't know what actual type it is because any type could have been passed in. So it doesn't make sense to access a specific property or method off of item - for example, what if I called GetDuplicates passing in a List<int>? It wouldn't have a bookingId property.

As written by others, you don't know anything of T. A classical solution, used by LINQ (see for example GroupBy) is to have your method receive a delegate that does the key-extraction, like:
public static bool GetDuplicates<T, TKey>(List<T> pBookings, Func<T, TKey> selector)
{
foreach (var item in pBookings)
{
TKey key = selector(item);
}
return true;
}
You then use it like:
GetDuplicates(pBookings, p => p.bookingId);

If you want to use a generic method, you also have to provide a function that is able to produce a key from the specified type T. Luckily, LINQ already provides the parts needed to build your generic method. Note that extension methods must live in a static class:
internal static class Extensions
{
public static IEnumerable<T> GetDuplicates<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
{
return source.GroupBy(keySelector)
.Where(group => group.Skip(1).Any())
.SelectMany(group => group);
}
public static bool ContainsDuplicates<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
{
return GetDuplicates(source, keySelector).Any();
}
}
By having this (and type inference) you can use these methods e.g. by calling:
var hasDuplicates = roomBookings.ContainsDuplicates(item => item.bookingId);
if(hasDuplicates)
{
Console.WriteLine("Duplicates found:");
foreach (var duplicate in roomBookings.GetDuplicates(item => item.bookingId))
{
Console.WriteLine(duplicate);
}
}

I wonder if generics is really the tool for the job here. Your needs would be better served if each of your strongly typed objects shared a common interface.
"I have several strongly typed objects that need the same static method."
In this situation, all of the classes must share a common feature, such as, for instance, a property BookingId.
So, you'd need to formalize this by extracting this common interface:
public interface IBooking
{
int BookingId{ get; }
}
Make sure that every one of your strongly typed items implements the interface:
public class RoomBooking : IBooking
{
//etc...
}
And now make your static method accept IBooking instances:
public static bool GetDuplicates(IEnumerable<IBooking> pBookings)
{
//does pBookings contain items with duplicate BookingId values?
return pBookings.GroupBy(b => b.BookingId).Any(g => g.Count() > 1);
}
An easy read that isn't obfuscated by the unnecessary use of generics.

Since there are no constraints or hints about what T is, the compiler does not have enough information. Consider
bool isDuplicate = Utilities.GetDuplicates<int>(roomBookings);
Clearly an int does not have a bookingId member.
Every possible specific type for T would have to have a common base class or interface that has a bookingId, and even then you would have to add a generic constraint to your method signature to access that.

Perhaps you are looking for something like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Duplicates
{
public static class EnumerableExtensions
{
public static bool HasDuplicates<T, I>(this IEnumerable<T> enumerable, Func<T, I> identityGetter, IEqualityComparer<I> comparer )
{
var hashSet = new HashSet<I>(comparer);
foreach (var item in enumerable)
{
var identity = identityGetter(item);
if (hashSet.Contains(identity)) return true;
hashSet.Add(identity);
}
return false;
}
public static bool HasDuplicates<T, I>(this IEnumerable<T> enumerable, Func<T, I> identityGetter)
{
return enumerable.HasDuplicates(identityGetter, EqualityComparer<I>.Default);
}
}
public class Booking
{
public int BookingId { get; set; }
public string BookingName { get; set; }
}
public class Customer
{
public string CustomerId { get; set; }
public string Name { get; set; }
}
class Program
{
static void Main(string[] args)
{
var bookings = new List<Booking>()
{
new Booking { BookingId = 1, BookingName = "Booking 1" },
new Booking { BookingId = 1, BookingName = "Booking 1" }
};
Console.WriteLine("Q: There are duplicate bookings?. A: {0}", bookings.HasDuplicates(x => x.BookingId));
var customers = new List<Customer>()
{
new Customer { CustomerId = "ALFKI", Name = "Alfred Kiss" },
new Customer { CustomerId = "ANATR", Name = "Ana Trorroja" }
};
Console.WriteLine("Q: There are duplicate customers?. A: {0} ", customers.HasDuplicates(x => x.CustomerId));
}
}
}

C# Generics Efficiency, a better way to do this

OK, let's say I have classes such as the following:
public class KPIObject<T> //<--This class where T is the following classes
{
public List<T> Data { get; set; }
public string Caption { get; set; }
}
public class KPICycleCountAccuracyData //<--There are 20 of these with different names and values
{
public string Facility { get; set; }
public string CCAdjustedCases { get; set; }
public string TotalCases { get; set; }
public string CCAdjustedPercent { get; set; }
}
Then I have:
public List<ReportData> ProcessAccountReport(GetAccountReport request)
{
var data = new List<ReportData>();
ProcessKPI(data, request.KPICycleCountAccuracy, "KPICycleCountAccuracy"); //<-- 20 of these
return data;
}
Here is the ProcessKPI method:
private static void ProcessKPI<T>(List<ReportData> data, ICollection<KPIObject<T>> items, string name)
{
if (items == null || items.Count <= 0) return;
foreach (var item in items)
{
if (item.Data == null || item.Data.Count <= 0) continue;
var temp = new List<object>();
temp.AddRange((IEnumerable<object>)item.Data);
data.Add(new ReportData { Data = temp, Name = name, Title = item.Caption });
}
}
All of this works and compiles correctly, I am just wondering if this is the most efficient way of doing this.
Thanks.
EDIT
I changed process KPI to this:
private static void ProcessKPI<T>(ICollection<ReportData> data, ICollection<KPIObject<T>> items, string name)
{
if (items == null || items.Count <= 0) return;
foreach (var item in items.Where(item => item.Data != null && item.Data.Count > 0))
{
data.Add(new ReportData { Data = (IEnumerable<object>)item.Data, Name = name, Title = item.Caption });
}
}

A couple of comments:
There is no need to make data a ref parameter in ProcessKPI. A ref parameter is only meaningful for a class type in C# if you actually assign to it. Here you're just modifying the object, so ref doesn't buy you anything except awkward call syntax.
Even though Count is signed, it will never return a negative value.
I would prefer (IEnumerable<object>)item.Data over the as IEnumerable<object> version. If the latter fails it will result in an ArgumentNullException when really it's a casting issue.

Speed
Assuming you are talking about computational efficiency (i.e. speed), there are two operations that you might be able to improve:
First, you create a copy of the item.Data in the temp variable. When you know that the resulting ReportData will never be modified, you may use the item.Data directly, forgoing the expensive copy operation.
data.Add(new ReportData {
Data = (IEnumerable<object>)item.Data,
Name = name,
Title = item.Caption });
Second, converting to IEnumerable<object> discards the element type, so consumers have to cast each element back later (and the conversion itself only succeeds when T is a reference type). See if it makes sense for your application to add a generic type parameter to ReportData, so you can instantiate it as, for example, new ReportData<KPICycleCountAccuracyData>(). That way the compiler can do a better job of type-checking and optimizing the code.
Memory
By implementing your solution as an iterator you may be able to process one ReportData element at a time instead of all at once, thereby reducing the memory footprint. Have a look at the yield statement to see how to implement such an approach.
Other
For further code quality improvements, JaredPar's answer offers some excellent advice.
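A yield-based version of ProcessKPI might look like the sketch below. The shape of ReportData is inferred from how the question uses it and is therefore an assumption, as are the KpiProcessor class name and the where T : class constraint, which is added so that List<T> converts to IEnumerable<object> without an explicit cast.

```csharp
using System.Collections.Generic;

public class KPIObject<T>
{
    public List<T> Data { get; set; }
    public string Caption { get; set; }
}

// Assumed shape of ReportData, inferred from the question's object initializer.
public class ReportData
{
    public IEnumerable<object> Data { get; set; }
    public string Name { get; set; }
    public string Title { get; set; }
}

public static class KpiProcessor
{
    // Lazily produces one ReportData per non-empty KPI item.
    // Nothing runs until the caller starts enumerating the result.
    public static IEnumerable<ReportData> ProcessKPI<T>(
        IEnumerable<KPIObject<T>> items, string name) where T : class
    {
        if (items == null) yield break;

        foreach (KPIObject<T> item in items)
        {
            if (item.Data == null || item.Data.Count == 0) continue;

            // Reference-type covariance: List<T> is usable as IEnumerable<object>.
            yield return new ReportData { Data = item.Data, Name = name, Title = item.Caption };
        }
    }
}
```

A caller can stream the results with foreach, or materialise them with new List<ReportData>(KpiProcessor.ProcessKPI(items, name)) when a list is really needed. The saving only appears if the consumer handles each element as it arrives.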

LINQ query code for complex merging of data

I've posted this before, but I worded it poorly. I'm trying again with a more well thought out structure.
I have the following code and I am trying to figure out a shorter LINQ expression to do it 'inline'. Please examine the "Run()" method near the bottom. I am attempting to understand how to join two dictionaries together based on a matching identifier in one of the objects, so that I can use the query in this sort of syntax.
var selected = from a in items.List()
// etc. etc.
select a;
so that I can define my structure in code like ...
TModelViewModel = new TModelViewModel
{
TDictionary = from a in items... etc. etc...
}
instead of going through a bunch of foreach loops, extra object declarations, etc.
This is my class structure. The Run() method is what I am trying to simplify. I basically need to do this conversion inline in a couple of places, and I wanted to simplify it a great deal so that I can define it more 'cleanly'.
class TModel
{
public Guid Id { get; set; }
}
class TModels : List<TModel>
{
}
class TValue
{
}
class TStorage
{
public Dictionary<Guid, TValue> Items { get; set; }
}
class TArranged
{
public Dictionary<TModel, TValue> Items { get; set; }
}
static class Repository
{
static public TItem Single<TItem, TCollection>(Predicate<TItem> expression)
{
return default(TItem); // access logic.
}
}
class Sample
{
public void Run()
{
TStorage tStorage = new TStorage();
// access tStorage logic here.
Dictionary<TModel, TValue> d = new Dictionary<TModel, TValue>();
foreach (KeyValuePair<Guid, TValue> kv in tStorage.Items)
{
d.Add(Repository.Single<TModel, TModels>(m => m.Id == kv.Key),kv.Value);
}
}
}

Haven't really tested this, and it's quite ugly, but I think this should work:
Dictionary<TModel, TValue> d = new Dictionary<TModel, TValue>();
d = d.Concat(tStorage
.Items
.Select(i => new KeyValuePair<TModel, TValue>(
new TModel { Id = i.Key }, i.Value))).ToDictionary(i => i.Key, i => i.Value);
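If the goal is to keep the repository lookup from the question's Run() method rather than constructing new TModel instances, ToDictionary can do the whole conversion in one expression. The sketch below is untested against the real repository: findModel is a hypothetical delegate standing in for the question's Repository.Single call, and Arranger is an assumed helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TModel
{
    public Guid Id { get; set; }
}

public class TValue
{
}

public static class Arranger
{
    // findModel is a hypothetical stand-in for the repository lookup;
    // it maps a stored Guid to the TModel that should become the new key.
    public static Dictionary<TModel, TValue> Arrange(
        Dictionary<Guid, TValue> items, Func<Guid, TModel> findModel)
    {
        return items.ToDictionary(kv => findModel(kv.Key), kv => kv.Value);
    }
}
```

With the question's types this would be called as Arranger.Arrange(tStorage.Items, id => Repository.Single<TModel, TModels>(m => m.Id == id)). ToDictionary throws if the lookup returns null or returns the same model for two different ids, so those cases would need handling first.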
