I have two List<CustomObject>, called list1 and list2
public class CustomObject
{
public string foo { get; set; }
public string bar{ get; set; }
}
The goal is to generate a new list with all the entries that have been modified/added in list2.
Because these lists can get pretty long, looping through them is not an option ...
Any ideas?
Adding another answer to accommodate some additional non-functional requirements (NFRs) that have come up in the comments:
Objects can be identified by a hash code
The list is very big, so performance is an issue
The idea is to compare an old list to a new list to see if any new hash codes have popped up.
You will want to store your objects in a dictionary:
var list = new Dictionary<string, CustomObject>();
When you add them, provide the hash as the key:
list.Add(customObject.Hash, customObject);
To scan for new ones:
var difference = new List<CustomObject>();
foreach (CustomObject o in newList)
{
    // A hash that is missing from the old dictionary marks a new entry.
    if (!oldList.ContainsKey(o.Hash)) difference.Add(o);
}
Log(String.Format("{0} new hashes found.", difference.Count));
By using the Dictionary you take advantage of the way the keys are stored in a hash table. Finding an item in a hash table is faster than scanning the whole list and comparing each entry. Each lookup is O(1) on average, so the whole comparison is roughly O(n) instead of O(n^2).
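Here is a minimal, self-contained sketch of the idea above. The Hash property and the HashDiff.FindNew helper are assumptions made up for illustration; the class in the question only shows foo and bar.

```csharp
using System.Collections.Generic;

public class CustomObject
{
    public string Hash { get; set; }   // assumed identifying hash, not part of the original class
    public string foo { get; set; }
    public string bar { get; set; }
}

public static class HashDiff
{
    // Returns the items of newList whose Hash is not a key of oldList.
    public static List<CustomObject> FindNew(
        IDictionary<string, CustomObject> oldList,
        IEnumerable<CustomObject> newList)
    {
        var difference = new List<CustomObject>();
        foreach (CustomObject o in newList)
        {
            // ContainsKey is a hash lookup, so each check is O(1) on average.
            if (!oldList.ContainsKey(o.Hash)) difference.Add(o);
        }
        return difference;
    }
}
```

This still visits every element of the new list once, which is hard to avoid; what the dictionary removes is the inner scan over the old list.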
Here's a traditional way to do it:
public class CustomObject : IComparable<CustomObject>
{
    public string foo { get; set; }
    public string bar { get; set; }
    public int CompareTo(CustomObject o)
    {
        if (this.foo == o.foo && this.bar == o.bar) return 0;
        //We have to code for the < and > comparisons too. Could get painful if there are a lot of properties to compare.
        if (this.foo == o.foo) return this.bar.CompareTo(o.bar);
        return this.foo.CompareTo(o.foo);
    }
}
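Then use LINQ's Except. Note that Except compares elements with the default equality comparer (Equals and GetHashCode), not CompareTo, so the class also needs those overrides for a value comparison to take effect:
list2.Except(list1)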
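Since Except works from equality rather than ordering, a version that behaves as intended could look roughly like this. It is only a sketch: the ListDiff.AddedOrModified helper is a hypothetical name, and it assumes two objects count as "the same" when both foo and bar match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomObject : IEquatable<CustomObject>
{
    public string foo { get; set; }
    public string bar { get; set; }

    public bool Equals(CustomObject other) =>
        other != null && foo == other.foo && bar == other.bar;

    public override bool Equals(object obj) => Equals(obj as CustomObject);

    // Equal objects must produce equal hash codes, or Except will miss matches.
    public override int GetHashCode() => (foo, bar).GetHashCode();
}

public static class ListDiff
{
    // Entries of list2 that have no equal entry in list1, i.e. added or modified.
    public static List<CustomObject> AddedOrModified(
        IEnumerable<CustomObject> list1, IEnumerable<CustomObject> list2) =>
        list2.Except(list1).ToList();
}
```

Keep in mind that Except is a set operation, so duplicates within list2 collapse to a single entry in the result.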
I have a class that has the following fields.
class StandardizedModel
{
public string Case{ get; set; }
public DateTime? CreatedDateLocal { get; set; }
public DateTime? ClosedDateLocal { get; set; }
private string _status;
}
Let's say I make two lists with this custom class.
List1 ( main list)
List2
What I am trying to do is compare the elements in the two lists, if they are different I need to return the original StandardizedModel from list1.
The two lists will always be the same size and the case will always exist in both lists, the dates and status can be different though.
I've tried using a linq's zip to try and compare the elements and then add them to a list but that returns 0.
List<StandardizedModel> testList = new List<StandardizedModel>();
var test = List1.Zip(List2, (a, b) =>
{
if (a != b) { testList.Add(a); }
return testList;
});
You can use Enumerable.Join which is efficient:
var changes = from m1 in list1
join m2 in list2 on m1.Case equals m2.Case
where m1.CreatedDateLocal != m2.CreatedDateLocal
|| m1.ClosedDateLocal != m2.ClosedDateLocal
|| m1._status != m2._status
select m1;
List<StandardizedModel> changeList = changes.ToList();
Your best option for code reusability is to override the equality implementation for your objects, so that you decide when two of them count as equal. By default, equality on a class is by reference, meaning both variables must point to the same instance, which is never the case for two separate instances (this is why the comparison in your Zip does not do what you want). Note also that Zip is lazily evaluated, so the lambda does not run until the result is enumerated. Once equality is overridden, you can use your Zip to create your list.
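A rough sketch of that suggestion follows. It assumes both lists are in the same order, replaces the private _status field with a public Status property so the example compiles on its own, and uses a hypothetical ModelDiff.Changed helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class StandardizedModel : IEquatable<StandardizedModel>
{
    public string Case { get; set; }
    public DateTime? CreatedDateLocal { get; set; }
    public DateTime? ClosedDateLocal { get; set; }
    public string Status { get; set; }   // assumed public stand-in for the private _status field

    public bool Equals(StandardizedModel other) =>
        other != null
        && Case == other.Case
        && CreatedDateLocal == other.CreatedDateLocal
        && ClosedDateLocal == other.ClosedDateLocal
        && Status == other.Status;

    public override bool Equals(object obj) => Equals(obj as StandardizedModel);

    public override int GetHashCode() =>
        (Case, CreatedDateLocal, ClosedDateLocal, Status).GetHashCode();
}

public static class ModelDiff
{
    // Pairs elements by position and keeps the list1 element of every unequal pair.
    // ToList() forces the lazy Zip/Where pipeline to actually run.
    public static List<StandardizedModel> Changed(
        List<StandardizedModel> list1, List<StandardizedModel> list2) =>
        list1.Zip(list2, (a, b) => new { a, b })
             .Where(p => !p.a.Equals(p.b))
             .Select(p => p.a)
             .ToList();
}
```

If the two lists are not guaranteed to be in the same order, pairing by position is unsafe and matching on Case (as in the join above) is the safer route.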
I want to implement a list type data structure that can be appended to, with an associated time-stamp. The point of this is that I can then get all the data that is newer than a certain time-stamp.
I have tried doing this with a ConcurrentDictionary but I'm not confident this is the best way to do it. I would much prefer to have a List<decimal[2]> for applications which I won't go into here. The first value of the array can hold the timestamp and the second will be the value. Alternatively, I could use List<TimeStampedObject>. However, apparently there is no such thing as a concurrent list in C#.
For the record, my data is ordered with regards to timestamp.
I want to be able to do things like:
public static Dictionary<DateTime, decimal> GetLatest(DateTime since, Dictionary<DateTime, decimal> requestedDict)
{
Dictionary<DateTime, decimal> returnList = new Dictionary<DateTime, decimal>();
returnList = requestedDict.Where(x => x.Key > since).ToDictionary(x => x.Key, x => x.Value);
return returnList;
}
UPDATE:
Here is the List item I have come up with; please let me know if this has any potential downfalls:
public class ConcurrentList: List<StampedValue>
{
ReaderWriterLockSlim _samplesLock = new ReaderWriterLockSlim();
public ConcurrentList() : base()
{
}
public void AddThreadSafe(StampedValue item){
this._samplesLock.EnterWriteLock();
try
{
this.Add(item);
}
finally
{
this._samplesLock.ExitWriteLock();
}
}
public List<StampedValue> GetLatest(long since){
return this.Where( s => s.Timestamp > since ).ToList();
}
public List<StampedValue> GetLatest(DateTime since){
throw new NotImplementedException();
}
}
public class StampedValue
{
public long Timestamp { get; set; }
public decimal Value { get; set; }
public StampedValue(long t, decimal v){
this.Timestamp = t;
this.Value = v;
}
}
Seems to me your best bet is just a List<T> that you protect with a ReaderWriterLockSlim. For example:
class Sample
{
public DateTime EventTime { get; set; }
public decimal Value { get; set; }
}
List<Sample> _samples = new List<Sample>();
ReaderWriterLockSlim _samplesLock = new ReaderWriterLockSlim();
// to get items after a particular date
List<Sample> GetSamplesAfterDate(DateTime dt)
{
_samplesLock.EnterReadLock();
try
{
return _samples.Where(s => s.EventTime >= dt).ToList();
}
finally
{
_samplesLock.ExitReadLock();
}
}
If your list is known to be in chronological order, then you can improve the performance by using binary search on the list to find the first item that's greater than or equal to your passed time stamp. I just used the LINQ version here because the point is to illustrate the locking.
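For reference, the binary search mentioned above might look something like this. It is a sketch that assumes the list is sorted ascending by EventTime; SampleSearch, LowerBound and GetSamplesFrom are names invented for the example, and the caller is still expected to hold the read lock while calling them.

```csharp
using System;
using System.Collections.Generic;

public class Sample
{
    public DateTime EventTime { get; set; }
    public decimal Value { get; set; }
}

public static class SampleSearch
{
    // Index of the first sample with EventTime >= dt, or samples.Count if there is none.
    // Assumes samples is sorted ascending by EventTime.
    public static int LowerBound(List<Sample> samples, DateTime dt)
    {
        int lo = 0, hi = samples.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (samples[mid].EventTime < dt) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Copies the tail of the list starting at the first matching sample.
    public static List<Sample> GetSamplesFrom(List<Sample> samples, DateTime dt)
    {
        int start = LowerBound(samples, dt);
        return samples.GetRange(start, samples.Count - start);
    }
}
```

The search itself is O(log n); copying the matching tail is still proportional to the number of samples returned.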
Appending to the list is similar: acquire the write lock, append, and release the lock:
void AppendSample(Sample s)
{
_samplesLock.EnterWriteLock();
try
{
_samples.Add(s);
}
finally
{
_samplesLock.ExitWriteLock();
}
}
An alternative is to use List<KeyValuePair<DateTime, decimal>> rather than List<Sample>. The locking would remain the same.
This should perform quite well in most situations.
Have you looked at the SynchronizedCollection<T> class? It seems to me to be what you are looking for. You could also specialize SynchronizedKeyedCollection<K, T>
EDIT (2014/May/8):
The documentation I linked to above is not as clear or useful as one would like, so it may be helpful to look at the reference implementation.
What's a good collection in C# to store the data below:
I have check boxes that bring in a subjectId, varnumber, varname, and title associated with each checkbox.
I need a collection that can be any size, something like ArrayList maybe with maybe:
list[i][subjectid] = x;
list[i][varnumber] = x;
list[i][varname] = x;
list[i][title] = x;
Any good ideas?
A List<Mumble> where Mumble is a little helper class that stores the properties.
List<Mumble> list = new List<Mumble>();
...
var foo = new Mumble(subjectid);
foo.varnumber = bar;
...
list.Add(foo);
...
list[i].varname = "something else";
public class MyFields
{
public int SubjectID { get; set; }
public int VarNumber { get; set; }
public string VarName { get; set; }
public string Title { get; set; }
}
var myList = new List<MyFields>();
To access a member:
var myVarName = myList[i].VarName;
A generic list, List<YourClass> would be great - where YourClass has properties of subjectid, varnumber etc.
You'd likely want to use a two-dimensional array for this, and allocate positions in the second dimension of the array for each of your values. For instance, list[i][0] would be the subjectid, list[i][1] would be varnumber, and so on.
Choosing a collection typically begins with asking what you want to do with it.
If your only criterion is that it can be any size, then I would consider List<>.
Since this is a Key, Value pair I would recommend you use a generic IDictionary based collection.
// Create a new dictionary of strings, with string keys,
// and access it through the IDictionary generic interface.
IDictionary<string, string> openWith =
new Dictionary<string, string>();
// Add some elements to the dictionary. There are no
// duplicate keys, but some of the values are duplicates.
openWith.Add("txt", "notepad.exe");
openWith.Add("bmp", "paint.exe");
openWith.Add("dib", "paint.exe");
openWith.Add("rtf", "wordpad.exe");
As others have said, it looks like you'd be better creating a class to hold the values so that your list returns an object that contains all the data you need. While two-dimensional arrays can be useful, this doesn't look like one of those situations.
For more information about a better solution and why a two-dimensional array/list in this instance isn't a good idea you might want to read: Create a list of objects instead of many lists of values
If there's an outside chance that the order of [i] is not in a predictable order, or possibly has gaps, but you need to use it as a key:
public class Thing
{
    public int SubjectID { get; set; }
    public int VarNumber { get; set; }
    public string VarName { get; set; }
    public string Title { get; set; }
}
Dictionary<int, Thing> things = new Dictionary<int, Thing>();
things.Add(i, thing);
Then to find a Thing:
var myThing = things[i];
I have a list of items like so {One,Two,Three,One,Four,One,Five,Two,One} and I need a query that takes that list and generates a list based on only unique items, so the list returned would be {One,Two,Three,Four,Five}.
Use the Distinct operator:
var unique = list.Distinct();
The Distinct operator. There's an example at MSDN.
http://msdn.microsoft.com/en-us/library/bb348436.aspx
var x = new string[] {"One", "Two", "Three", "One", "Four", "One", "Five", "Two", "One"}.ToList();
var distinct = x.Distinct();
It's worth noting Distinct() will use the default means of determining equality, which might not suit you if your list contains complex objects rather than primitives.
There is an overload that allows you to specify an IEqualityComparer for providing custom equality logic.
More details on how Distinct determines if two items are equal: http://msdn.microsoft.com/en-us/library/ms224763.aspx
Use Distinct:
List<string> l = new List<string>
{
"One","Two","Three","One","Four","One","Five","Two","One"
};
var rootcategories2 = l.Distinct();
Aside from Distinct, as others have mentioned, you can also use a HashSet:
List<string> distinct = new HashSet<string>(list).ToList();
If you're using LINQ, though, go with Distinct.
Thanks for all your answers, I guess I mis-posed my question slightly. What I really have is a complex class, on which I want the comparison (for Distinct) based on a particular member of the class.
class ComplexClass
{
public string Name{ get; set; }
public string DisplayName{ get; set; }
public int ComplexID{ get; set; }
}
List<ComplexClass> complexClassList = new List<ComplexClass>();
complexClassList.Add(new ComplexClass(){Name="1", DisplayName="One", ComplexID=1});
complexClassList.Add(new ComplexClass(){Name="2", DisplayName="Two", ComplexID=2});
complexClassList.Add(new ComplexClass(){Name="3", DisplayName="One", ComplexID=1});
// This doesn't produce a distinct list, since the comparison is Default
List<ComplexClass> uniqueList = complexClassList.Distinct().ToList();
class ComplexClassNameComparer : IEqualityComparer<ComplexClass>
{
    public bool Equals(ComplexClass x, ComplexClass y)
    {
        return (x.DisplayName == y.DisplayName);
    }
    public int GetHashCode(ComplexClass obj)
    {
        return obj.DisplayName.GetHashCode();
    }
}
// This does produce a distinct list, since the comparison is specific
List<ComplexClass> uniqueList = Enumerable.Distinct(complexClassList, new ComplexClassNameComparer()).ToList();
I had a similar issue. I have a list of importItems with customerName as one of the properties and wanted to generate a list of unique customers from that original list.
I pass the original list to GenCustomers()...using linq I created a unique list of itemCustomers. I then traverse that list to create a new Customer list.
public static List<Customer> GenCustomers(List<ImportItem> importItems)
{
List<Customer> customers = new List<Customer>();
var itemCustomers = importItems.Select(o => new { o.CustomerName }).Distinct();
foreach (var item in itemCustomers)
{
customers.Add(new Customer() { CompanyName = item.CustomerName });
}
return customers;
}
I've got a group of data that looks like this:
001 001 One
001 002 Two
001 003 Three
002 001 One
002 002 Two
002 003 Three
...
Now, certainly, I could create an array of string[x][y] = z, but this array has to be resizable, and I'd prefer to use the string representations of the indexers rather than convert to numeric. The reason is that I will need to look up the data by string, and I don't see the point in needless string->number conversions.
My first thought was this:
Dictionary<string, Dictionary<string, string>> data;
data = new Dictionary<string, Dictionary<string, string>>();
Dictionary<string, string> subdata = new Dictionary<string, string>();
subdata.Add(key, value);
data.Add(key2, subdata);
and this works, but is somewhat cumbersome. It also feels wrong and kludgy and not particularly efficient.
So what's the best way to store this sort of data in a collection?
I also thought of creating my own collection class, but I'd rather not if I don't have to. I'd rather just use the existing tools.
This is a pretty common request, and most people end up writing some variation of a Tuple class. If you're using ASP.NET, you can utilize the Triple class that's already available; otherwise, write something like:
public class Tuple<T, T2, T3>
{
public Tuple(T first, T2 second, T3 third)
{
First = first;
Second = second;
Third = third;
}
public T First { get; set; }
public T2 Second { get; set; }
public T3 Third { get; set; }
}
That's a generic three-tuple class, so you can create a new List<Tuple<string, string, string>>(), create your tuples, and add them. Expand on that basic class with some indexing functionality and you're up and away.
Edit: A list with a dictionary doesn't seem like the correct approach, because each dictionary is only holding one value. There is no multi-entry relationship between the key and values - there is simply one multi-part key and one associated value. The data is equivalent to a database row (or tuple!).
Edit2: Here's an indexable list class you could use for convenience.
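public class MyTupleList : List<Tuple<string, string, string>>
{
    public Tuple<string, string, string> this[string first, string second]
    {
        get
        {
            return (this.Find(x => x.First == first && x.Second == second));
        }
        set
        {
            // Replace the matching entry if there is one, otherwise append.
            // (Assigning to this[first, second] here would recurse forever.)
            int index = this.FindIndex(x => x.First == first && x.Second == second);
            if (index >= 0) base[index] = value;
            else this.Add(value);
        }
    }
}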
I think this really depends on what you are modelling here. If you're planning to use an object-oriented approach, you shouldn't be thinking of these as arbitrary items inside a data structure.
I'm guessing from looking at this that the first two columns are serving as a "key" for the other items. Define a simple struct, and create a dictionary like so:
struct Key {
public int Val1 { get; set; }
public int Val2 { get; set; }
}
....
Dictionary<Key, string> values;
Obviously Key and the items inside it should be mapped to something closer to what you are representing.
Given a suitable Pair<A,B> class*, left as an exercise for the reader, you could use a Dictionary<Pair<string, string>, string>.
* A class with equality and hash code overrides, nothing terribly hard.
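For anyone who would rather not do the exercise, one possible shape for such a class is sketched below. The member names First and Second and the PairDemo helper are my own choices for illustration, not something prescribed above.

```csharp
using System;
using System.Collections.Generic;

public sealed class Pair<A, B> : IEquatable<Pair<A, B>>
{
    public A First { get; }
    public B Second { get; }

    public Pair(A first, B second)
    {
        First = first;
        Second = second;
    }

    // Two pairs are equal when both components are equal.
    public bool Equals(Pair<A, B> other) =>
        other != null
        && EqualityComparer<A>.Default.Equals(First, other.First)
        && EqualityComparer<B>.Default.Equals(Second, other.Second);

    public override bool Equals(object obj) => Equals(obj as Pair<A, B>);

    // Combines both components so equal pairs land in the same hash bucket.
    public override int GetHashCode() => (First, Second).GetHashCode();
}

public static class PairDemo
{
    // Builds a small lookup keyed by the two string columns.
    public static Dictionary<Pair<string, string>, string> Build()
    {
        var data = new Dictionary<Pair<string, string>, string>();
        data[new Pair<string, string>("001", "001")] = "One";
        data[new Pair<string, string>("002", "003")] = "Three";
        return data;
    }
}
```

On newer C# versions a value tuple key, as in Dictionary<(string, string), string>, should give the same lookup behaviour without a hand-written class.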
Would a List<List<T>> work for you? Still kludgy, but better than dictionaries IMHO.
EDIT: What about a Dictionary<string,string> and mapping the two keys to a single string?
var data = new Dictionary<string,string>(StringComparer.Ordinal);
data[GetKey("002", "001")] = "One";
with
string GetKey(string a, string b) {
return a + "\0" + b;
}
List<List<string>> is really your best bet in this case. But I agree, it's kludgy. Personally, I would create a custom class that implements a two-dimensional indexer and maybe use a List<List<T>> internally.
For example:
public class DynamicTwoDimensonalArray<T>
{
private List<List<T>> Items = new List<List<T>>();
public T this[int i1, int i2]
{
get
{
return Items[i1][i2];
}
set
{
Items[i1][i2] = value;
}
}
}
This is a basic idea to get you going; clearly the setter needs to deal with bounds issues. But it's a start.
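One way the bounds handling could go is sketched here, assuming that unwritten cells should read back as default(T) and that writes should grow the structure on demand. GrowableGrid is a hypothetical name used only for this example.

```csharp
using System;
using System.Collections.Generic;

public class GrowableGrid<T>
{
    private readonly List<List<T>> _rows = new List<List<T>>();

    public T this[int row, int col]
    {
        get
        {
            if (row < 0 || col < 0) throw new ArgumentOutOfRangeException();
            // Cells that were never written read back as default(T).
            if (row >= _rows.Count || col >= _rows[row].Count) return default(T);
            return _rows[row][col];
        }
        set
        {
            if (row < 0 || col < 0) throw new ArgumentOutOfRangeException();
            // Grow the outer list, then the target row, until the cell exists.
            while (_rows.Count <= row) _rows.Add(new List<T>());
            while (_rows[row].Count <= col) _rows[row].Add(default(T));
            _rows[row][col] = value;
        }
    }
}
```

Growing on write keeps the indexer convenient, but a single write to a very large index allocates every cell before it, so this only suits fairly dense data.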
Edit:
No. As I said, I would prefer to index them by string. And they may not always be sequential (might have a missing number in the middle). - Mystere Man
Hmm... this is interesting. If that's the case, your best bet would be to create some sort of concatenation of the combination of the two indexers and use that as the key in a single-level dictionary. I would still use a custom class to make using the indexing easier. For example:
public class TwoDimensionalDictionary
{
private Dictionary<string, string> Items = new Dictionary<string, string>();
public string this[string i1, string i2]
{
get
{
// insert null checks here
return Items[BuildKey(i1, i2)];
}
set
{
Items[BuildKey(i1, i2)] = value;
}
}
public string BuildKey(string i1, string i2)
{
return "I1: " + i1 + " I2: " + i2;
}
}
If you are only ever going to need to find z by a given (x, y) (and not, for example, find all y by a given x), then use this:
Dictionary<KeyValuePair<string, string>, string>
Otherwise, your dictionary is fine as is.