To avoid running out of memory by bringing in the whole table at once, I am loading it in chunks of LOAD_SIZE records.
Here is how I am doing it. I suspect some of the indexes are off by one record, and that there are performance improvements I could make.
So I wanted to get your opinion on this approach.
int totalCount = repo.Context.Employees.Count();
int startRow = 0;
while (startRow <= totalCount)
{
repo.PaginateEmployees(startRow, LOAD_SIZE);
startRow = startRow + LOAD_SIZE ;
}
public List<EmpsSummary> PaginateEmployees(int startRow, int loadSize)
{
var query = (from p in this.Context.Employees
.Skip(startRow).Take(loadSize)
select new EmpsSummary
{
FirstName = p.FirstName,
LastName = p.LastName,
Phone = p.Phone
});
return query.ToList();
}
Because of how LINQ works (lazy evaluation and hash-based comparisons), if you formulate your statements well it will manage memory much better than you could yourself.
From your comments (which should be added to the question) I offer this solution which should manage memory for you just fine.
This example code is not intended to compile -- it is given as an example
// insert list
List<EmpsSummary> insertList;
// add stuff to insertList
List<EmpsSummary> filteredList = insertList.Except(this.Context.Employees).ToList();
This assumes that this.Context.Employees is of type EmpsSummary. If it isn't you have to cast it to the correct type.
Also, you will need to be able to compare EmpsSummary objects. To do so, implement IEquatable<EmpsSummary> like this:
This example code is not intended to compile -- it is given as an example
public class EmpsSummary : IEquatable<EmpsSummary>
{
public string FirstName { get; set; }
public string LastName { get; set; }
public string Phone { get; set; }
public bool Equals(EmpsSummary other)
{
//Check whether the compared object is null.
if (Object.ReferenceEquals(other, null)) return false;
//Check whether the compared object references the same data.
if (Object.ReferenceEquals(this, other)) return true;
//Check whether the objects' properties are equal (null-safe).
return string.Equals(FirstName, other.FirstName) &&
string.Equals(LastName, other.LastName) &&
string.Equals(Phone, other.Phone);
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public override int GetHashCode()
{
int hashProductFirstName = FirstName == null ? 0 : FirstName.GetHashCode();
int hashProductLastName = LastName == null ? 0 : LastName.GetHashCode();
int hashProductPhone = Phone == null ? 0 : Phone.GetHashCode();
//Calculate the hash code
return hashProductFirstName ^ hashProductLastName ^ hashProductPhone;
}
}
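On the off-by-one worry raised in the question: with Skip/Take the loop can stop as soon as startRow reaches totalCount, so the condition is `<` rather than `<=`. With `<=`, an extra empty page is requested whenever the count is an exact multiple of the page size. Paging also needs a stable ordering, otherwise pages may overlap or skip rows. The following is only a sketch against an in-memory sequence; `Pager` and `PageThrough` are hypothetical names, and with a database context the same shape would presumably apply to the ordered query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Pager
{
    // Hypothetical helper: walks 'source' in pages of 'loadSize' items
    // and returns the pages it fetched, in order.
    public static List<List<T>> PageThrough<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> orderKey, int loadSize)
    {
        if (loadSize <= 0) throw new ArgumentOutOfRangeException(nameof(loadSize));

        var pages = new List<List<T>>();
        int totalCount = source.Count();

        // '<' rather than '<=': when startRow == totalCount nothing is left to load.
        for (int startRow = 0; startRow < totalCount; startRow += loadSize)
        {
            // A stable ordering keeps consecutive pages from overlapping or skipping rows.
            var page = source.OrderBy(orderKey).Skip(startRow).Take(loadSize).ToList();
            pages.Add(page);
        }
        return pages;
    }
}
```

For example, `Pager.PageThrough(Enumerable.Range(1, 10), x => x, 5)` fetches two pages of five items each and no trailing empty page.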
Let's say I have a domain model of an assembly line that has different orders on it. The user can change the value of an order, but each order's value must be greater than the one in front of it and less than the one behind it. I have created an aggregate root called Line to enforce this invariant. A simplified version of that code is below.
public class Line : IAggregateRoot
{
public int Id { get; }
public List<Order> Orders { get; }
public Line(int id, List<Order> orders)
{
Id = id;
Orders = orders;
}
public void SetOrderValue(int orderId, int newOrderValue)
{
var orderPos = Orders.FindIndex(o => o.Id == orderId);
if (orderPos != -1 && IsValidOrderValue(orderPos,newOrderValue))
{
Orders[orderPos].Value = newOrderValue;
}
}
private bool IsValidOrderValue(int orderPos, int newOrderValue)
{
var lessThanAfter = orderPos == Orders.Count - 1 ? true : newOrderValue <= Orders[orderPos + 1].Value;
var greaterThanBefore = orderPos == 0 ? true : newOrderValue >= Orders[orderPos - 1].Value;
return lessThanAfter && greaterThanBefore;
}
}
public class Order : IEntity<int>
{
public int Id { get; }
public int Value { get; set; }
/*
* Other info about the order goes here
*/
public Order(int id, int value)
{
Id = id;
Value = value;
}
}
The issue that I have is that any object that references Line can also change the value of any order and break the invariant.
line.Orders[0].Value = 10;
I know that in DDD, the aggregate root shouldn't allow references to the inner entities so I thought about making the orders list private. However, then when I try to store the Line aggregate root in a repository, the repository has no way of being able to fetch and save the list of orders. Is there a recommended way in DDD to protect the Order objects from outside objects being able to change their values while at the same time keeping the Order info public so the repository can save it in the database?
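One shape that is sometimes suggested for this kind of problem is to keep the list private, expose it only as a read-only view, and make Order immutable so that a leaked reference cannot be used to change a value. The sketch below is just an illustration of that idea, not an established recommendation from the question; `TrySetOrderValue` and `WithValue` are hypothetical names, and the IAggregateRoot and IEntity<int> interfaces are left out to keep it self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public int Id { get; }
    public int Value { get; }

    public Order(int id, int value) { Id = id; Value = value; }

    // Returns a changed copy; existing instances are never mutated.
    public Order WithValue(int newValue) => new Order(Id, newValue);
}

public class Line
{
    private readonly List<Order> _orders;

    public int Id { get; }

    // Read-only view, e.g. for a repository that needs to persist the orders.
    public IReadOnlyList<Order> Orders => _orders.AsReadOnly();

    public Line(int id, IEnumerable<Order> orders)
    {
        Id = id;
        _orders = orders.ToList(); // defensive copy of the caller's collection
    }

    // Changes the value only if the ordering invariant still holds afterwards.
    public bool TrySetOrderValue(int orderId, int newValue)
    {
        int pos = _orders.FindIndex(o => o.Id == orderId);
        if (pos == -1) return false;

        bool okBefore = pos == 0 || newValue >= _orders[pos - 1].Value;
        bool okAfter = pos == _orders.Count - 1 || newValue <= _orders[pos + 1].Value;
        if (!(okBefore && okAfter)) return false;

        _orders[pos] = _orders[pos].WithValue(newValue);
        return true;
    }
}
```

With this shape, `line.Orders[0].Value = 10;` no longer compiles, while a repository can still read every order through `line.Orders`.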
I am comparing two objects of the same type by implementing the IEquatable interface on the class. If they are not equal, I update the DB; otherwise, I leave it as it is. The context is that I need to update the table with data coming from an Excel sheet, compare the data, and update only when something has changed.
Below is the code for the same
var constructionMaterialTypes = package.GetObjectsFromSheet<ConstructionMaterialTypeExcelDto>(ConstructionDataSheetNames.CONSTRUCTION_MATERIAL_TYPE,
ConstructionDataTableNames.ConstructionMaterialType);
var materialTypes = new List<ConstructionMaterialType>();
foreach (var materialType in constructionMaterialTypes)
{
var materialTypeId = GetGuidFromMD5Hash(materialType.name);
List<string> materialTypeNotes = new();
if (!string.IsNullOrEmpty(materialType.notes))
{
materialTypeNotes.Add(materialType.notes);
}
var existingMaterialType = ctx.ConstructionMaterialTypes.SingleOrDefault(cm => cm.Id == materialTypeId);
var constructionMaterialType = new ConstructionMaterialType
{
Id = materialTypeId,
Name = materialType.name,
NotesHTML = materialTypeNotes
};
if (existingMaterialType != default)
{
if (existingMaterialType != constructionMaterialType) // Object comparison happening here
{
existingMaterialType.Name = materialType.name;
existingMaterialType.NotesHTML = materialTypeNotes;
}
}
else
{
materialTypes.Add(constructionMaterialType);
}
}
And below is the actual class where I am implementing the IEquatable interface:
public sealed class ConstructionMaterialType : IIdentity<Guid>, IEquatable<ConstructionMaterialType>
{
[Key]
public Guid Id { get; set; }
public string Name { get; set; }
public List<string> NotesHTML { get; set; }
public bool Equals(ConstructionMaterialType other)
{
if (other is null)
return false;
return this.Id == other.Id
&& this.Name == other.Name
&& this.NotesHTML == other.NotesHTML;
}
public override bool Equals(object obj) => Equals(obj as ConstructionMaterialType);
public override int GetHashCode()
{
int hash = 19;
hash = hash * 31 + (Id == default ? 0 : Id.GetHashCode());
hash = hash * 31 + (Name == null ? 0 : Name.GetHashCode(StringComparison.OrdinalIgnoreCase));
hash = hash * 31 + (NotesHTML == null ? 0 : NotesHTML.GetHashCode());
return hash;
}
}
The condition existingMaterialType != constructionMaterialType is always true, even if both objects hold the same values. I have attached images as well for reference.
I am not sure what I am doing wrong in the above code. Could anyone please point me in the right direction?
Many thanks in advance
You did not override the != operator, but you can use !existingMaterialType.Equals(constructionMaterialType) instead.
this.NotesHTML == other.NotesHTML does a reference comparison of the two lists, so even if both contain exactly the same strings, it returns false if the two lists are different instances. You might want to use this.NotesHTML.SequenceEqual(other.NotesHTML) instead (this might need some adaptation if NotesHTML can be null).
Note: GetHashCode must deliver the same result for all objects that compare equal. So if you change anything in the Equals method, you probably also have to change GetHashCode. As it is not necessary that objects that compare non-equal have different hash codes, it is an option to just not take into account some properties. Here: just omit the line with NotesHTML.
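Putting the three points above together might look roughly like the sketch below. It uses a trimmed-down stand-in class (`MaterialType` with a `Notes` list, both hypothetical names) rather than the entity from the question, and it assumes a runtime where `System.HashCode` is available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class MaterialType : IEquatable<MaterialType>
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public List<string> Notes { get; set; }

    public bool Equals(MaterialType other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;
        return Id == other.Id
            && Name == other.Name
            && NotesEqual(Notes, other.Notes);
    }

    // Null-safe, element-by-element comparison instead of a reference comparison.
    private static bool NotesEqual(List<string> a, List<string> b)
    {
        if (a is null || b is null) return a is null && b is null;
        return a.SequenceEqual(b);
    }

    public override bool Equals(object obj) => Equals(obj as MaterialType);

    // Only Id and Name are hashed; objects that compare equal still hash equally.
    public override int GetHashCode() => HashCode.Combine(Id, Name);

    // Without these overloads, == and != fall back to reference comparison.
    public static bool operator ==(MaterialType left, MaterialType right)
        => left is null ? right is null : left.Equals(right);

    public static bool operator !=(MaterialType left, MaterialType right)
        => !(left == right);
}
```

With this in place, two instances that carry the same Id, Name and notes compare equal with `==`, even when their lists are different List<string> instances.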
I have two arrays of ArrayList.
public class ProductDetails
{
public string id;
public string description;
public float rate;
}
ArrayList products1 = new ArrayList();
ArrayList products2 = new ArrayList();
ArrayList duplicateProducts = new ArrayList();
Now what I want is to get all the products (with all the fields of ProductDetails class) having duplicate description in both products1 and products2.
I can run two nested for/while loops in the traditional way, but that would be very slow, especially if I have over 10k elements in both arrays.
So probably something can be done with LINQ.
If you want to use LINQ, you need to write your own equality comparer (an IEqualityComparer<ProductDetails>) that implements both Equals and GetHashCode:
public class ProductDetails
{
public string id {get; set;}
public string description {get; set;}
public float rate {get; set;}
}
public class ProductComparer : IEqualityComparer<ProductDetails>
{
public bool Equals(ProductDetails x, ProductDetails y)
{
//Check whether the objects are the same object.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether the products' properties are equal.
return x != null && y != null && x.id.Equals(y.id) && x.description.Equals(y.description);
}
public int GetHashCode(ProductDetails obj)
{
//Get hash code for the description field if it is not null.
int hashProductDesc = obj.description == null ? 0 : obj.description.GetHashCode();
//Get hash code for the id field.
int hashProductId = obj.id.GetHashCode();
//Calculate the hash code for the product.
return hashProductDesc ^ hashProductId ;
}
}
Now, supposing you have these objects:
ProductDetails[] items1 = { new ProductDetails { description = "aa", id = "9", rate = 2.0f },
new ProductDetails { description = "b", id = "4", rate = 2.0f } };
ProductDetails[] items2 = { new ProductDetails { description = "aa", id = "9", rate = 1.0f },
new ProductDetails { description = "c", id = "12", rate = 2.0f } };
IEnumerable<ProductDetails> duplicates =
items1.Intersect(items2, new ProductComparer());
Consider overriding the System.Object.Equals method.
public class ProductDetails
{
public string id;
public string description;
public float rate;
public override bool Equals(object obj)
{
if (!(obj is ProductDetails))
return false;
if(ReferenceEquals(obj,this))
return true;
ProductDetails p = (ProductDetails)obj;
return description == p.description;
}
}
Filtering would then be as simple as:
var result = products1.Cast<ProductDetails>().Where(product => products2.Contains(product));
EDIT:
Do consider that this implementation is not optimal.
Moreover, it has been proposed in the comments to your question that you use a database.
That way performance will be optimized by the database implementation, and in any case the overhead will not be yours.
However, you can optimize this code by using a Dictionary or a HashSet:
Override the System.Object.GetHashCode method:
public override int GetHashCode()
{
return description.GetHashCode();
}
You can now do this:
var hashSet = new HashSet<ProductDetails>(products1.Cast<ProductDetails>());
var result = products2.Cast<ProductDetails>().Where(product => hashSet.Contains(product));
Which will boost your performance to an extent since lookup will be less costly.
10k elements is nothing; however, make sure you use proper collection types. ArrayList has long been superseded by the generic collections, so use List<ProductDetails>.
Next step is implementing proper Equals and GetHashCode overrides for your class. The assumption here is that description is the key since that's what you care about from a duplication point of view:
public class ProductDetails
{
public string id;
public string description;
public float rate;
public override bool Equals(object obj)
{
var p = obj as ProductDetails;
return ReferenceEquals(p, null) ? false : description == p.description;
}
public override int GetHashCode() => description.GetHashCode();
}
Now we have options. One easy and efficient way of doing this is using a hash set:
var set = new HashSet<ProductDetails>();
var products1 = new List<ProductDetails>(); // fill it
var products2 = new List<ProductDetails>(); // fill it
// shove everything in the first list in the set
foreach(var item in products1)
set.Add(item);
// and simply test the elements in the second list
foreach(var item in products2)
if(set.Contains(item))
{
// item.description was already used in products1, handle it here
}
This gives you linear (O(n)) time complexity, which is the best you can get.
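If overriding Equals and GetHashCode on the class itself is not desirable, the same linear-time idea can be sketched with a set of descriptions instead of a set of products. This is only an illustration; `DuplicateFinder` and `FindByDescription` are hypothetical names, and it assumes that "duplicate" means the description occurs in both lists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ProductDetails
{
    public string id;
    public string description;
    public float rate;
}

public static class DuplicateFinder
{
    // Returns every product, from either list, whose description occurs in both lists.
    public static List<ProductDetails> FindByDescription(
        List<ProductDetails> products1, List<ProductDetails> products2)
    {
        var shared = new HashSet<string>(products1.Select(p => p.description));
        var descriptions2 = new HashSet<string>(products2.Select(p => p.description));

        // Keep only the descriptions that are present on both sides.
        shared.IntersectWith(descriptions2);

        return products1.Concat(products2)
            .Where(p => shared.Contains(p.description))
            .ToList();
    }
}
```

Each list is traversed a constant number of times and every lookup is a hash lookup, so the overall cost stays linear in the total number of products.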
This is my query:
rows.GroupBy(row => new TaxGroupObject
{
EnvelopeID = row.Field<int>("EnvelopeID"),
PolicyNumber = row.Field<string>("PolicyNumber"),
TZ = row.Field<string>("TZ")
})
.Select(row =>
{
int i;
if (row.Key.EnvelopeID == 5713 && row.Key.PolicyNumber == "50002617" && row.Key.TZ == "50002617")
i=1+1;
var newRow = structure.NewRow();
newRow["PolicyNumber"]=row.Key.PolicyNumber;
newRow["TZ"]=row.Key.TZ;
newRow["CreditPremiaTaxParagraph45"] = row.Sum(x => decimal.Parse(x["CreditPremiaTaxParagraph45"].ToString()));
newRow["WorklossTax"] = row.Sum(x => decimal.Parse(x["WorklossTax"].ToString()));
newRow["MiscTax"] = row.Sum(x => decimal.Parse(x["MiscTax"].ToString()));
newRow["EnvelopeID"] = row.Key.EnvelopeID;
return newRow;
}
);
internal class TaxGroupObject
{
public long? EnvelopeID{ get; set; }
public string PolicyNumber { get; set; }
public string TZ { get; set; }
}
I put a breakpoint on the line with "i=1+1", which sits after an if condition comparing all the keys I used for the group by against some hard-coded values. That breakpoint is hit twice, although the group by is supposed to group all rows with the same keys together. For most of the values in the table the grouping works just fine, and I can't understand how that is possible. Any help would be highly appreciated.
The problem is that TaxGroupObject does not implement GetHashCode and Equals. These methods are used by GroupBy to determine what makes one TaxGroupObject object equal to another. By default, it's by reference equality, not property equality.
This should work, using the GetHashCode algorithm from What is the best algorithm for an overridden System.Object.GetHashCode?:
internal class TaxGroupObject
{
public long? EnvelopeID { get; set; }
public string PolicyNumber { get; set; }
public string TZ { get; set; }
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
hash = hash * 23 + EnvelopeID.GetHashCode();
hash = hash * 23 + (PolicyNumber != null ? PolicyNumber.GetHashCode() : -2);
hash = hash * 23 + (TZ != null ? TZ.GetHashCode() : -1);
return hash;
}
}
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != typeof(TaxGroupObject))
return false;
var other = (TaxGroupObject)obj;
return this.EnvelopeID == other.EnvelopeID &&
this.PolicyNumber == other.PolicyNumber &&
this.TZ == other.TZ;
}
}
Also, you should only use immutable objects in something like a grouping or dictionary. At a minimum, you must be sure that the objects here do not change during your grouping.
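An alternative that avoids writing Equals and GetHashCode by hand is to group by a key type that already compares by value, such as a value tuple (an anonymous type behaves similarly). The sketch below assumes a plain `TaxRow` class in place of the DataRow from the question; `TaxRow`, `TaxGrouping` and `SumMiscTax` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class TaxRow
{
    public long? EnvelopeID { get; set; }
    public string PolicyNumber { get; set; }
    public string TZ { get; set; }
    public decimal MiscTax { get; set; }
}

public static class TaxGrouping
{
    // Groups by a value tuple, which compares its components rather than references,
    // and sums MiscTax per group.
    public static Dictionary<(long? EnvelopeID, string PolicyNumber, string TZ), decimal>
        SumMiscTax(IEnumerable<TaxRow> rows)
    {
        return rows
            .GroupBy(r => (r.EnvelopeID, r.PolicyNumber, r.TZ))
            .ToDictionary(g => g.Key, g => g.Sum(r => r.MiscTax));
    }
}
```

Because the tuple key is immutable once created, this also sidesteps the concern about keys changing during the grouping.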
Eventually I found it simpler to give up inheritance and use a struct instead of a class. It also works, since a struct is a value type and therefore doesn't need an Equals override. I am interested in which of these approaches performs better, if anybody knows. Intuitively it seems like structs are more efficient, but I am not sure, and I currently don't have the time to try out the two options or do the proper (Google) research.
Thanks
Question:
Can anyone tell me why my unit test is failing with this error message?
CollectionAssert.AreEquivalent failed. The expected collection contains 1
occurrence(s) of . The actual
collection contains 0 occurrence(s).
Goal:
I'd like to check if two lists are identical. They are identical if both contain the same elements with the same property values. The order is irrelevant.
Code example:
This is the code which produces the error. list1 and list2 are identical, i.e. a copy-paste of each other.
[TestMethod]
public void TestListOfT()
{
var list1 = new List<MyPerson>()
{
new MyPerson()
{
Name = "A",
Age = 20
},
new MyPerson()
{
Name = "B",
Age = 30
}
};
var list2 = new List<MyPerson>()
{
new MyPerson()
{
Name = "A",
Age = 20
},
new MyPerson()
{
Name = "B",
Age = 30
}
};
CollectionAssert.AreEquivalent(list1.ToList(), list2.ToList());
}
public class MyPerson
{
public string Name { get; set; }
public int Age { get; set; }
}
I've also tried this line (source)
CollectionAssert.AreEquivalent(list1.ToList(), list2.ToList());
and this line (source)
CollectionAssert.AreEquivalent(list1.ToArray(), list2.ToArray());
P.S.
Related Stack Overflow questions:
I've seen both these questions, but the answers didn't help.
CollectionAssert use with generics?
Unit-testing IList with CollectionAssert
You are absolutely right. Unless you provide something like an IEqualityComparer<MyPerson> or implement MyPerson.Equals(), the two MyPerson objects will be compared with object.Equals, just like any other object. Since the objects are different, the Assert will fail.
It works if I add an IEqualityComparer<T> as described on MSDN and if I use Enumerable.SequenceEqual. Note however, that now the order of the elements is relevant.
In the unit test
//CollectionAssert.AreEquivalent(list1, list2); // Does not work
Assert.IsTrue(list1.SequenceEqual(list2, new MyPersonEqualityComparer())); // Works
IEqualityComparer
public class MyPersonEqualityComparer : IEqualityComparer<MyPerson>
{
public bool Equals(MyPerson x, MyPerson y)
{
if (object.ReferenceEquals(x, y)) return true;
if (object.ReferenceEquals(x, null) || object.ReferenceEquals(y, null)) return false;
return x.Name == y.Name && x.Age == y.Age;
}
public int GetHashCode(MyPerson obj)
{
if (object.ReferenceEquals(obj, null)) return 0;
int hashCodeName = obj.Name == null ? 0 : obj.Name.GetHashCode();
int hashCodeAge = obj.Age.GetHashCode();
return hashCodeName ^ hashCodeAge;
}
}
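Since SequenceEqual makes the order relevant, an order-insensitive check with the same kind of comparer can be sketched by counting occurrences in a dictionary keyed through the comparer. This is only one possible approach; `CollectionCompare` and `AreEquivalent` are hypothetical names, and the sketch assumes the collections contain no null elements (a Dictionary does not accept null keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyPerson
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class MyPersonEqualityComparer : IEqualityComparer<MyPerson>
{
    public bool Equals(MyPerson x, MyPerson y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name && x.Age == y.Age;
    }

    public int GetHashCode(MyPerson obj)
    {
        if (obj is null) return 0;
        int hashCodeName = obj.Name == null ? 0 : obj.Name.GetHashCode();
        return hashCodeName ^ obj.Age.GetHashCode();
    }
}

public static class CollectionCompare
{
    // True when both sequences hold the same elements the same number of times,
    // regardless of order. Assumes no null elements.
    public static bool AreEquivalent<T>(
        IEnumerable<T> first, IEnumerable<T> second, IEqualityComparer<T> comparer)
    {
        var counts = new Dictionary<T, int>(comparer);

        // Count how often each element occurs in the first sequence.
        foreach (var item in first)
        {
            counts.TryGetValue(item, out int seen);
            counts[item] = seen + 1;
        }

        // Consume one occurrence for every element of the second sequence.
        foreach (var item in second)
        {
            if (!counts.TryGetValue(item, out int left) || left == 0) return false;
            counts[item] = left - 1;
        }

        // Anything left over was in 'first' but not in 'second'.
        return counts.Values.All(left => left == 0);
    }
}
```

In the unit test this could be used as `Assert.IsTrue(CollectionCompare.AreEquivalent(list1, list2, new MyPersonEqualityComparer()));`.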
I was getting this same error when testing a collection persisted by nHibernate. I was able to get this to work by overriding both the Equals and GetHashCode methods. If I didn't override both I still got the same error you mentioned:
CollectionAssert.AreEquivalent failed. The expected collection contains 1 occurrence(s) of .
The actual collection contains 0 occurrence(s).
I had the following object:
public class EVProjectLedger
{
public virtual long Id { get; protected set; }
public virtual string ProjId { get; set; }
public virtual string Ledger { get; set; }
public virtual AccountRule AccountRule { get; set; }
public virtual int AccountLength { get; set; }
public virtual string AccountSubstrMethod { get; set; }
private Iesi.Collections.Generic.ISet<Contract> myContracts = new HashedSet<Contract>();
public virtual Iesi.Collections.Generic.ISet<Contract> Contracts
{
get { return myContracts; }
set { myContracts = value; }
}
public override bool Equals(object obj)
{
var other = obj as EVProjectLedger;
return other != null && ProjId == other.ProjId && Ledger == other.Ledger;
}
public override int GetHashCode()
{
return new { ProjId, Ledger }.GetHashCode();
}
}
Which I tested using the following:
using (ITransaction tx = session.BeginTransaction())
{
var evProject = session.Get<EVProject>("C0G");
CollectionAssert.AreEquivalent(TestData._evProjectLedgers.ToList(), evProject.EVProjectLedgers.ToList());
tx.Commit();
}
I'm using nHibernate which encourages overriding these methods anyways. The one drawback I can see is that my Equals method is based on the business key of the object and therefore tests equality using the business key and no other fields. You could override Equals however you want but beware of equality pollution mentioned in this post:
CollectionAssert.AreEquivalent failing... can't figure out why
If you would like to achieve this without having to write an equality comparer, there is a unit testing library called FluentAssertions that you can use:
https://fluentassertions.com/documentation/
It has many built-in equality extension methods, including ones for collections. You can install it through NuGet, and it's really easy to use.
Taking the example in the question above all you have to write in the end is
list1.Should().BeEquivalentTo(list2);
By default, the order of the items in the two collections is ignored; this can be changed (for example with WithStrictOrdering()) if the order should matter.
I wrote this to test collections where the order is not important:
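public static bool AreCollectionsEquivalent<T>(ICollection<T> collectionA, ICollection<T> collectionB, IEqualityComparer<T> comparer)
{
if (collectionA.Count != collectionB.Count)
return false;
// Track the unmatched items of B so that duplicates are matched one-to-one.
var remaining = new List<T>(collectionB);
foreach (var a in collectionA)
{
int index = remaining.FindIndex(b => comparer.Equals(a, b));
if (index < 0)
return false;
remaining.RemoveAt(index);
}
return true;
}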
Not as elegant as using SequenceEqual, but it works.
Of course to use it you simply do:
Assert.IsTrue(AreCollectionsEquivalent<MyType>(collectionA, collectionB, comparer));