I have some data which I serialized with protobuf-net. The data is a map and contains some duplicates (which happened because my key type didn't implement IEquatable).
I want to deserialize the data into a dictionary and ignore duplicates.
There seems to be an attribute for that, [ProtoMap(DisableMap=false)], whose documentation says:
Disable "map" handling; dictionaries will use .Add(key, value) instead
of [key] = value. ...
Basically I want the behavior to be [key] = value, but apparently the attribute is ignored.
Am I doing anything wrong? Is there any way to achieve the desired (and documented) behavior of ignoring duplicates?
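For reference, the difference between the two insertion styles named in the documentation quote can be reproduced with a plain Dictionary, independent of protobuf-net. This is a minimal sketch; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class AddVersusIndexer
{
    // Indexer assignment: a second write to the same key replaces the value.
    public static string WriteTwiceWithIndexer()
    {
        var map = new Dictionary<string, string>();
        map["X"] = "A";
        map["X"] = "B";
        return map["X"];
    }

    // Add: a second insert with the same key throws ArgumentException.
    public static bool AddRejectsDuplicate()
    {
        var map = new Dictionary<string, string> { { "X", "A" } };
        try
        {
            map.Add("X", "B");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteTwiceWithIndexer()); // B
        Console.WriteLine(AddRejectsDuplicate());   // True
    }
}
```

The exception in the second method is the same kind of failure as the "An item with the same key has already been added" error shown further down.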
Example code:
1. Produce data with duplicates:
// ------------- ------------- ------------- ------------- ------------- ------------- -------------
// The following part generated the bytes, which requires the key NOT to implement IEquatable
// ------------- ------------- ------------- ------------- ------------- ------------- -------------
var cache = new MyTestClass() { Dictionary = new Dictionary<MyTestKey, string>() };
cache.Dictionary[new MyTestKey { Value = "X" }] = "A";
cache.Dictionary[new MyTestKey { Value = "X" }] = "B";
var bytes = cache.Serialize();
var bytesStr = string.Join(",", bytes); // "10,8,10,3,10,1,88,18,1,65,10,8,10,3,10,1,88,18,1,66";
//..
[DataContract]
public class MyTestKey
{
[DataMember(Order = 1)]
public string Value { get; set; }
}
[DataContract]
public class MyTestClass
{
[DataMember(Order = 1)]
[ProtoMap(DisableMap = false)]
public Dictionary<MyTestKey, string> Dictionary { get; set; }
}
2. Try to deserialize the data with a key that now implements IEquatable, which fails:
[DataContract]
public class MyTestKey : IEquatable<MyTestKey>
{
[DataMember(Order = 1)]
public string Value { get; set; }
public bool Equals(MyTestKey other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
return Value == other.Value;
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != this.GetType()) return false;
return Equals((MyTestKey) obj);
}
public override int GetHashCode()
{
return (Value != null ? Value.GetHashCode() : 0);
}
}
//..
var bytesStr2 = "10,8,10,3,10,1,88,18,1,65,10,8,10,3,10,1,88,18,1,66";
var bytes2 = bytesStr2.Split(',').Select(byte.Parse).ToArray();
var cache = bytes2.DeserializeTo<MyTestClass>();
Exception: An item with the same key has already been added.
public static class SerializationExtensions
{
public static T DeserializeTo<T>(this byte[] bytes)
{
if (bytes == null)
return default(T);
using (var ms = new MemoryStream(bytes))
{
return Serializer.Deserialize<T>(ms);
}
}
public static byte[] Serialize<T>(this T setup)
{
using (var ms = new MemoryStream())
{
Serializer.Serialize(ms, setup);
return ms.ToArray();
}
}
}
There are a few different things going on here; "map" mode is actually the one you want here - so it isn't that you're trying to disable map, but actually you're trying to force it on (it is now on by default in most common dictionary scenarios).
There are some complications:
1. the library only processes [ProtoMap(...)] when processing a [ProtoMember(...)] for a [ProtoContract(...)]
2. even then, it only processes [ProtoMap(...)] for key-types that are valid as "map" keys in the proto specification
3. you can turn it on manually (not via the attributes), but in v2.* it enforces the same check as #2 at runtime, which means it will fail
The manual enable from #3 works in v3.* (currently in alpha):
RuntimeTypeModel.Default[typeof(MyTestClass)][1].IsMap = true;
however, this is obviously inelegant, and today requires using an alpha build (we've been using it in production here at Stack Overflow for an extended period; I just need to get a release together - docs, etc).
Given that it works, I'm tempted to propose that we should soften #2 in v3.*, such that while the default behavior remains the same, it would still check for [ProtoMap(...)] for custom types, and enable that mode. I'm on the fence about whether to soften #1.
I'd be interested in your thoughts on these things!
But to confirm: the following works fine in v3.* and outputs "B" (minor explanation of the code: in protobuf, append===merge for root objects, so serializing two payloads one after the other has the same effect as serializing a dictionary with the combined content, so the two Serialize calls spoof a payload with two identical keys):
static class P
{
static void Main()
{
using var ms = new MemoryStream();
var key = new MyTestKey { Value = "X" };
RuntimeTypeModel.Default[typeof(MyTestClass)][1].IsMap = true;
Serializer.Serialize(ms, new MyTestClass() { Dictionary =
new Dictionary<MyTestKey, string> { { key, "A" } } });
Serializer.Serialize(ms, new MyTestClass() { Dictionary =
new Dictionary<MyTestKey, string> { { key, "B" } } });
ms.Position = 0;
var val = Serializer.Deserialize<MyTestClass>(ms).Dictionary[key];
Console.WriteLine(val); // B
}
}
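To see that the payload from the question really does carry the same key twice, the bytes can be walked by hand. The sketch below is not a general protobuf reader: it assumes exactly the layout of that payload (field 1 repeated, each entry holding a key message at field 1 and a string value at field 2, all lengths fitting in one byte). The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class PayloadDump
{
    // Walks repeated length-delimited entries of the form
    // [10, entryLen, 10, keyLen, 10, strLen, <utf8 key>, 18, valueLen, <utf8 value>].
    public static List<string> DescribeEntries(byte[] bytes)
    {
        var result = new List<string>();
        int pos = 0;
        while (pos < bytes.Length)
        {
            int tag = bytes[pos++];          // 10 = (field 1 << 3) | wire type 2
            int entryLen = bytes[pos++];     // single-byte length (assumption)
            int keyLen = bytes[pos + 1];     // length of the nested key message
            int keyStrLen = bytes[pos + 3];  // length of MyTestKey.Value
            string key = Encoding.UTF8.GetString(bytes, pos + 4, keyStrLen);
            int valueStart = pos + 2 + keyLen;   // position of tag 18 (field 2)
            int valueLen = bytes[valueStart + 1];
            string value = Encoding.UTF8.GetString(bytes, valueStart + 2, valueLen);
            result.Add($"field {tag >> 3}: key.Value={key}, value={value}");
            pos += entryLen;
        }
        return result;
    }

    public static void Main()
    {
        byte[] bytes = "10,8,10,3,10,1,88,18,1,65,10,8,10,3,10,1,88,18,1,66"
            .Split(',').Select(s => byte.Parse(s)).ToArray();
        foreach (string line in DescribeEntries(bytes))
        {
            Console.WriteLine(line);
        }
        // field 1: key.Value=X, value=A
        // field 1: key.Value=X, value=B
    }
}
```

Both entries decode to the key "X", which is why an Add-based reader fails on the second one while an indexer-based reader keeps "B".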
I think what I'd like is if, in v3.*, it worked without the IsMap = true line, with:
[ProtoContract]
public class MyTestClass
{
[ProtoMember(1)]
[ProtoMap] // explicit enable here, because not a normal map type
public Dictionary<MyTestKey, string> Dictionary { get; set; }
}
I am creating a gRPC service and we decided to choose the code-first approach with protobuf-net.
Now I am running into a scenario where we have a couple of classes that need to be wrapped.
We do not want to define KnownTypes in the MyMessage class (just a sample name to illustrate the problem).
So I am trying to use the Any type which currently gives me some struggle with packing.
The sample code has MyMessage, which defines some header values and has the possibility to deliver any type as payload.
[ProtoContract]
public class MyMessage
{
[ProtoMember(1)] public int HeaderValue1 { get; set; }
[ProtoMember(2)] public string HeaderValue2 { get; set; }
[ProtoMember(3)] public Google.Protobuf.WellKnownTypes.Any Payload { get; set; }
}
[ProtoContract]
public class Payload1
{
[ProtoMember(1)] public bool Data1 { get; set; }
[ProtoMember(2)] public string Data2 { get; set; }
}
[ProtoContract]
public class Payload2
{
[ProtoMember(1)] public string Data1 { get; set; }
[ProtoMember(2)] public string Data2 { get; set; }
}
Somewhere in the code I construct my message with a payload ...
Payload2 payload = new Payload2 {
Data1 = "abc",
Data2 = "def"
};
MyMessage msg = new MyMessage
{
HeaderValue1 = 123,
HeaderValue2 = "iAmHeaderValue2",
Payload = Google.Protobuf.WellKnownTypes.Any.Pack(payload)
};
Which doesn't work because Payload1 and Payload2 need to implement Google.Protobuf.IMessage.
Since I can't figure out how, and can't find much information on how to do it at all, I am wondering whether I am on the wrong path.
How is Any intended to be used in protobuf-net?
Is there a simple (yet compatible) way to pack a C# code-first class into Google.Protobuf.WellKnownTypes.Any?
Do I really need to implement Google.Protobuf.IMessage?
Firstly, since you say "where we have a couple of classes that need to be wrapped" (emphasis mine), I wonder if what you actually want here is oneof rather than Any. protobuf-net has support for the oneof concept, although it isn't obvious from a code-first perspective. But imagine we had (in a contract-first sense):
syntax = "proto3";
message SomeType {
oneof Content {
Foo foo = 1;
Bar bar = 2;
Blap blap = 3;
}
}
message Foo {}
message Bar {}
message Blap {}
This would be implemented (via the protobuf-net schema tools) as:
private global::ProtoBuf.DiscriminatedUnionObject __pbn__Content;
[global::ProtoBuf.ProtoMember(1, Name = @"foo")]
public Foo Foo
{
get => __pbn__Content.Is(1) ? ((Foo)__pbn__Content.Object) : default;
set => __pbn__Content = new global::ProtoBuf.DiscriminatedUnionObject(1, value);
}
public bool ShouldSerializeFoo() => __pbn__Content.Is(1);
public void ResetFoo() => global::ProtoBuf.DiscriminatedUnionObject.Reset(ref __pbn__Content, 1);
[global::ProtoBuf.ProtoMember(2, Name = @"bar")]
public Bar Bar
{
get => __pbn__Content.Is(2) ? ((Bar)__pbn__Content.Object) : default;
set => __pbn__Content = new global::ProtoBuf.DiscriminatedUnionObject(2, value);
}
public bool ShouldSerializeBar() => __pbn__Content.Is(2);
public void ResetBar() => global::ProtoBuf.DiscriminatedUnionObject.Reset(ref __pbn__Content, 2);
[global::ProtoBuf.ProtoMember(3, Name = @"blap")]
public Blap Blap
{
get => __pbn__Content.Is(3) ? ((Blap)__pbn__Content.Object) : default;
set => __pbn__Content = new global::ProtoBuf.DiscriminatedUnionObject(3, value);
}
public bool ShouldSerializeBlap() => __pbn__Content.Is(3);
public void ResetBlap() => global::ProtoBuf.DiscriminatedUnionObject.Reset(ref __pbn__Content, 3);
optionally with an enum to help:
public ContentOneofCase ContentCase => (ContentOneofCase)__pbn__Content.Discriminator;
public enum ContentOneofCase
{
None = 0,
Foo = 1,
Bar = 2,
Blap = 3,
}
This approach may be easier and preferable to Any.
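The "only one member is set at a time" behaviour of this pattern can be illustrated without any library. In the sketch below, UnionSlot is a hypothetical stand-in written for this example; it is not protobuf-net's DiscriminatedUnionObject and carries no serialization attributes.

```csharp
using System;

// Hypothetical stand-in for the shared union storage (not a protobuf-net type).
public readonly struct UnionSlot
{
    public int Discriminator { get; }
    public object Value { get; }

    public UnionSlot(int discriminator, object value)
    {
        Discriminator = discriminator;
        Value = value;
    }

    public bool Is(int discriminator) => Discriminator == discriminator;
}

public class Foo { }
public class Bar { }

public class SomeType
{
    // One shared slot: assigning any member overwrites whatever was there before.
    private UnionSlot content;

    public Foo Foo
    {
        get => content.Is(1) ? (Foo)content.Value : null;
        set => content = new UnionSlot(1, value);
    }

    public Bar Bar
    {
        get => content.Is(2) ? (Bar)content.Value : null;
        set => content = new UnionSlot(2, value);
    }

    // 0 means nothing has been assigned yet.
    public int ContentCase => content.Discriminator;
}

public static class UnionDemo
{
    public static void Main()
    {
        var item = new SomeType { Foo = new Foo() };
        item.Bar = new Bar();                 // replaces Foo
        Console.WriteLine(item.Foo == null);  // True
        Console.WriteLine(item.ContentCase);  // 2
    }
}
```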
On Any:
Short version: protobuf-net has not, to date, had any particular request to implement Any. It probably isn't a huge amount of work - simply: it hasn't yet happened. It looks like you're referencing both protobuf-net and the Google libs here, and using the Google implementation of Any. That's fine, but protobuf-net isn't going to use it at all - it doesn't know about the Google APIs in this context, so: implementing IMessage won't actually help you.
I'd be more than happy to look at Any with you, from the protobuf-net side. Ultimately, time/availability is always the limiting factor, so I prioritise features that are seeing demand. I think you may actually be the first person asking me about Any in protobuf-net.
My Any implementation.
[ProtoContract(Name = "type.googleapis.com/google.protobuf.Any")]
public class Any
{
/// <summary>Pack <paramref name="value"/></summary>
public static Any Pack(object? value)
{
// Handle null
if (value == null) return new Any { TypeUrl = null!, Value = Array.Empty<byte>() };
// Get type
System.Type type = value.GetType();
// Write here
MemoryStream ms = new MemoryStream();
// Serialize
RuntimeTypeModel.Default.Serialize(ms, value);
// Create any
Any any = new Any
{
TypeUrl = $"{type.Assembly.GetName().Name}/{type.FullName}",
Value = ms.ToArray()
};
// Return
return any;
}
/// <summary>Unpack any record</summary>
public object? Unpack()
{
// Handle null
if (TypeUrl == null || Value == null || Value.Length == 0) return null;
// Find '/'
int slashIx = TypeUrl.IndexOf('/');
// Convert to C# type name
string typename = slashIx >= 0 ? $"{TypeUrl.Substring(slashIx + 1)}, {TypeUrl.Substring(0, slashIx)}" : TypeUrl;
// Get type (Note security issue here!)
System.Type type = System.Type.GetType(typename, true)!;
// Deserialize
object value = RuntimeTypeModel.Default.Deserialize(type, Value.AsMemory());
// Return
return value;
}
/// <summary>Test type</summary>
public bool Is(System.Type type) => $"{type.Assembly.GetName().Name}/{type.FullName}" == TypeUrl;
/// <summary>Type url (using C# type names)</summary>
[ProtoMember(1)]
public string TypeUrl = null!;
/// <summary>Data serialization</summary>
[ProtoMember(2)]
public byte[] Value = null!;
/// <summary></summary>
public static implicit operator Container(Any value) => new Container(value.Unpack()! );
/// <summary></summary>
public static implicit operator Any(Container value) => Any.Pack(value.Value);
/// <summary></summary>
public struct Container
{
/// <summary></summary>
public object? Value;
/// <summary></summary>
public Container()
{
this.Value = null;
}
/// <summary></summary>
public Container(object? value)
{
this.Value = value;
}
}
}
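Regarding the "Note security issue here!" comment in Unpack: resolving an arbitrary type name taken from the payload lets the sender choose which type gets instantiated. One possible mitigation is to resolve type URLs only through an explicit allow-list. The sketch below assumes the same "assembly/full name" URL format as the class above; TypeUrlRegistry and its members are hypothetical names, not part of protobuf-net.

```csharp
using System;
using System.Collections.Generic;

public class Payload1 { }
public class Payload2 { }

public static class TypeUrlRegistry
{
    // Only types registered here can be resolved from a type URL.
    private static readonly Dictionary<string, Type> Known =
        new Dictionary<string, Type>(StringComparer.Ordinal);

    // Same URL format as Any.Pack above: "<assembly name>/<full type name>".
    public static string ToTypeUrl(Type type) =>
        $"{type.Assembly.GetName().Name}/{type.FullName}";

    public static void Register(Type type) => Known[ToTypeUrl(type)] = type;

    public static bool TryResolve(string typeUrl, out Type type)
    {
        type = null;
        return typeUrl != null && Known.TryGetValue(typeUrl, out type);
    }
}

public static class RegistryDemo
{
    public static void Main()
    {
        TypeUrlRegistry.Register(typeof(Payload1));

        string known = TypeUrlRegistry.ToTypeUrl(typeof(Payload1));
        string unknown = TypeUrlRegistry.ToTypeUrl(typeof(Payload2));

        Console.WriteLine(TypeUrlRegistry.TryResolve(known, out Type resolved)
            && resolved == typeof(Payload1));                          // True
        Console.WriteLine(TypeUrlRegistry.TryResolve(unknown, out _)); // False
    }
}
```

Unpack could then call TryResolve in place of System.Type.GetType and return null (or throw) for unregistered URLs.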
'System.Object' can be used as a field or property in a surrounding Container record:
[DataContract]
public class Container
{
/// <summary></summary>
[DataMember(Order = 1, Name = nameof(Value))]
public Any.Container Any { get => new Any.Container(Value); set => Value = value.Value; }
/// <summary>Object</summary>
public object? Value;
}
Serialization
RuntimeTypeModel.Default.Add(typeof(Any.Container), false).SetSurrogate(typeof(Any));
var ms = new MemoryStream();
RuntimeTypeModel.Default.Serialize(ms, new Container { Value = "Hello world" });
Container dummy = RuntimeTypeModel.Default.Deserialize(typeof(Container), ms.ToArray().AsMemory()) as Container;
I'm looking for a non-intrusive way to force deserialization to fail under the following circumstances:
The type is not defined in a strongly named assembly.
BinaryFormatter is used.
Since serialized, the type has been modified (e.g. a property has been added).
Below is an illustration/repro of the problem in form of a failing NUnit test. I'm looking for a generic way to make this pass without modifying the Data class, preferably by just setting up the BinaryFormatter during serialization and/or deserialization. I also don't want to involve serialization surrogates, as this is likely to require specific knowledge for each affected type.
Can't find anything in the MSDN docs that helps me though.
[Serializable]
public class Data
{
public string S { get; set; }
}
public class DataSerializationTests
{
/// <summary>
/// This string contains a Base64 encoded serialized instance of the
/// original version of the Data class with no members:
/// [Serializable]
/// public class Data
/// { }
/// </summary>
private const string Base64EncodedEmptyDataVersion =
"AAEAAAD/////AQAAAAAAAAAMAgAAAEtTc2MuU3Rvcm0uRGF0YS5UZXN0cywgV"+
"mVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2"+
"VuPW51bGwFAQAAABlTc2MuU3Rvcm0uRGF0YS5UZXN0cy5EYXRhAAAAAAIAAAAL";
[Test]
public void Deserialize_FromOriginalEmptyVersionFails()
{
var binaryFormatter = new BinaryFormatter();
var memoryStream = new MemoryStream(Convert.FromBase64String(Base64EncodedEmptyDataVersion));
memoryStream.Seek(0L, SeekOrigin.Begin);
Assert.That(
() => binaryFormatter.Deserialize(memoryStream),
Throws.Exception
);
}
}
I'd recommend the "Java" way here: declare an int field in every serializable class, such as private int _Serializable = 0;, check that the current version and the serialized version match, and increase it manually whenever you change the properties. If you insist on an automated approach, you'll have to store a lot of metadata and check whether the current metadata matches the persisted metadata (an extra burden on performance and on the size of the serialized data).
Here is the automatic descriptor. Basically, you store a TypeDescriptor instance as part of your binary data and, on retrieval, check whether the persisted TypeDescriptor is valid for serialization (IsValidForSerialization) against the current TypeDescriptor.
var persistedDescriptor = ...;
var currentDescriptor = Describe(typeof(Foo));
bool isValid = persistedDescriptor.IsValidForSerialization(currentDescriptor);
[Serializable]
[DataContract]
public class TypeDescriptor
{
[DataMember]
public string TypeName { get; set; }
[DataMember]
public IList<FieldDescriptor> Fields { get; set; }
public TypeDescriptor()
{
Fields = new List<FieldDescriptor>();
}
public bool IsValidForSerialization(TypeDescriptor currentType)
{
if (!string.Equals(TypeName, currentType.TypeName, StringComparison.Ordinal))
{
return false;
}
foreach(var field in Fields)
{
var mirrorField = currentType.Fields.FirstOrDefault(f => string.Equals(f.FieldName, field.FieldName, StringComparison.Ordinal));
if (mirrorField == null)
{
return false;
}
if (!field.Type.IsValidForSerialization(mirrorField.Type))
{
return false;
}
}
return true;
}
}
[Serializable]
[DataContract]
public class FieldDescriptor
{
[DataMember]
public TypeDescriptor Type { get; set; }
[DataMember]
public string FieldName { get; set; }
}
private static TypeDescriptor Describe(Type type, IDictionary<Type, TypeDescriptor> knownTypes)
{
if (knownTypes.ContainsKey(type))
{
return knownTypes[type];
}
var descriptor = new TypeDescriptor { TypeName = type.FullName, Fields = new List<FieldDescriptor>() };
knownTypes.Add(type, descriptor);
if (!type.IsPrimitive && type != typeof(string))
{
foreach (var field in type.GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public).OrderBy(f => f.Name))
{
var attributes = field.GetCustomAttributes(typeof(NonSerializedAttribute), false);
if (attributes != null && attributes.Length > 0)
{
continue;
}
descriptor.Fields.Add(new FieldDescriptor { FieldName = field.Name, Type = Describe(field.FieldType, knownTypes) });
}
}
return descriptor;
}
public static TypeDescriptor Describe(Type type)
{
return Describe(type, new Dictionary<Type, TypeDescriptor>());
}
I also thought about some mechanism for reducing the size of the persisted metadata, such as calculating an MD5 hash of the XML-serialized or JSON-serialized TypeDescriptor; but in that case any new property or field will mark your object as incompatible for serialization.
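A compact variant of that hashing idea could look like the sketch below. It is only an illustration under a few assumptions: it hashes a flat "name:type" list of the instance fields rather than a serialized TypeDescriptor, it does not recurse into field types, and it uses SHA-256 instead of MD5. TypeFingerprint, V1 and V2 are made-up names.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Security.Cryptography;
using System.Text;

public class V1 { }
public class V2 { public string S { get; set; } }

public static class TypeFingerprint
{
    // Builds "TypeName|field:type;field:type" from the instance fields
    // (sorted by name, [NonSerialized] fields skipped) and hashes it.
    public static string Compute(Type type)
    {
        var fields = type
            .GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public)
            .Where(f => !f.IsDefined(typeof(NonSerializedAttribute), false))
            .OrderBy(f => f.Name, StringComparer.Ordinal)
            .Select(f => f.Name + ":" + f.FieldType.FullName);

        string layout = type.FullName + "|" + string.Join(";", fields);

        using (var sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(Encoding.UTF8.GetBytes(layout)));
        }
    }
}

public static class FingerprintDemo
{
    public static void Main()
    {
        // The same type always yields the same fingerprint.
        Console.WriteLine(TypeFingerprint.Compute(typeof(V1)) == TypeFingerprint.Compute(typeof(V1))); // True
        // Types with different field layouts yield different fingerprints.
        Console.WriteLine(TypeFingerprint.Compute(typeof(V1)) == TypeFingerprint.Compute(typeof(V2))); // False
    }
}
```

Storing the fingerprint next to the serialized data and comparing it before deserializing would reject any payload written by a version of the type with a different field layout, including one that merely gained a field.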
I have the following code:
[Serializable]
public class CustomClass
{
public CustomClass()
{
this.Init();
}
public void Init()
{
foreach (PropertyInfo p in this.GetType().GetProperties())
{
DescriptionAttribute da = null;
DefaultValueAttribute dv = null;
foreach (Attribute attr in p.GetCustomAttributes(true))
{
if (attr is DescriptionAttribute)
{
da = (DescriptionAttribute) attr;
}
if (attr is DefaultValueAttribute)
{
dv = (DefaultValueAttribute) attr;
}
}
UInt32 value = 0;
if (da != null && !String.IsNullOrEmpty(da.Description))
{
value = Factory.Instance.SelectByCode(da.Description, 3);
}
if (dv != null && value == 0)
{
value = (UInt32) dv.Value;
}
p.SetValue(this, value, null);
}
}
private UInt32 name;
[Description("name")]
[DefaultValue(41)]
public UInt32 Name
{
get { return this.name; }
set { this.name = value; }
}
(30 more properties)
}
Now the weird thing is: when I try to serialize this class I will get an empty node CustomClass!
<CustomClass />
And when I remove Init from the constructor it works as expected! I get the full XML representation of the class, but of course without values (all with value 0).
<CustomClass>
<Name>0</Name>
...
</CustomClass>
Also, when I comment out the body of Init, I will get the same as above (the one with default values)
I've tried it with a public method, with a helper class, everything, but it does not work. That is, instead of the expected:
<CustomClass>
<Name>15</Name>
...
</CustomClass>
I will get
<CustomClass />
It seems when I use reflection in this class, serialization is not possible.
Or to summarize: when I call Init or when I fill my properties with reflection -> Serialization fails, when I remove this code part -> Serialization works but of course without my values.
Is this true? And does somebody know an alternative for my solution?
It should automatically get something from the database based on the Description and when this returns nothing it falls back to the DefaultValue...
PS1: I am using the XmlSerializer
PS2: When I set a breakpoint before the serialization, I can see that all the properties are filled with the good values (like 71, 72 etc).
Now the weird thing is: when I try to serialize this class I will get an empty node CustomClass!
XmlSerializer uses DefaultValue to decide which values to serialize - if it matches the default value, it doesn't store it. This approach is consistent with similar models such as data-binding / model-binding.
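The effect is easy to reproduce in isolation. The following minimal sketch uses a made-up Sample class with an int property (rather than the UInt32 from the question) and serializes it once with the default value and once with a different one.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Xml.Serialization;

public class Sample
{
    [DefaultValue(41)]
    public int Name { get; set; }
}

public static class DefaultValueDemo
{
    public static string ToXml(Sample value)
    {
        using (var writer = new StringWriter())
        {
            new XmlSerializer(typeof(Sample)).Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Equal to the declared default: the element is omitted.
        Console.WriteLine(ToXml(new Sample { Name = 41 }).Contains("<Name>")); // False
        // Different from the declared default: the element is written.
        Console.WriteLine(ToXml(new Sample { Name = 15 }).Contains("<Name>")); // True
    }
}
```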
Frankly, I would say that in this case both DefaultValueAttribute and DescriptionAttribute are poor choices. Write your own - perhaps EavInitAttribute - then use something like:
[EavInit(41, "name")]
public uint Name {get;set;}
Note that there are other ways of controlling this conditional serialization - you could write a method like:
public bool ShouldSerializeName() { return true; }
which will also work to convince it to write the value (this is another pattern recognised by various serialization and data-binding APIs) - but frankly this is even more work (it is per-property, and needs to be public, so it makes a mess of the API).
Finally, I would say that hitting the database multiple times (once per property) for every new object construction is very expensive - especially since many of those values are likely to be assigned values in a moment anyway (so looking them up is wasted effort). I would put a lot of thought into making this both "lazy" and "cached" if it was me.
An example of a lazy and "sparse" implementation:
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Xml.Serialization;
static class Program
{
static void Main()
{
var obj = new CustomClass();
Console.WriteLine(obj.Name);
// show it working via XmlSerializer
new XmlSerializer(obj.GetType()).Serialize(Console.Out, obj);
}
}
public class CustomClass : EavBase
{
[EavInit(42, "name")]
public uint Name
{
get { return GetEav(); }
set { SetEav(value); }
}
}
public abstract class EavBase
{
private Dictionary<string, uint> values;
protected uint GetEav([CallerMemberName] string propertyName = null)
{
if (values == null) values = new Dictionary<string, uint>();
uint value;
if (!values.TryGetValue(propertyName, out value))
{
value = 0;
var prop = GetType().GetProperty(propertyName);
if (prop != null)
{
var attrib = (EavInitAttribute)Attribute.GetCustomAttribute(
prop, typeof(EavInitAttribute));
if (attrib != null)
{
value = attrib.DefaultValue;
if (!string.IsNullOrEmpty(attrib.Key))
{
value = LookupDefaultValueFromDatabase(attrib.Key);
}
}
}
values.Add(propertyName, value);
}
return value;
}
protected void SetEav(uint value, [CallerMemberName] string propertyName = null)
{
(values ?? (values = new Dictionary<string, uint>()))[propertyName] = value;
}
private static uint LookupDefaultValueFromDatabase(string key)
{
// TODO: real code here
switch (key)
{
case "name":
return 7;
default:
return 234;
}
}
[AttributeUsage(AttributeTargets.Property, AllowMultiple = false, Inherited = true)]
protected class EavInitAttribute : Attribute
{
public uint DefaultValue { get; private set; }
public string Key { get; private set; }
public EavInitAttribute(uint defaultValue) : this(defaultValue, "") { }
public EavInitAttribute(string key) : this(0, key) { }
public EavInitAttribute(uint defaultValue, string key)
{
DefaultValue = defaultValue;
Key = key ?? "";
}
}
}
The default values for classes generated with protogen don't seem to be serialized when UseImplicitZeroDefaults = false.
I have a small .proto file:
package protobuf;
option java_package = "com.company.protobuf";
option java_outer_classname = "Test";
message TestMessage{
optional string Message = 1;
optional bool ABool = 2;
optional int32 AnInt = 3;
}
Using protogen.exe, I've generated a TestMessage class that I'm trying to send back and forth across the wire to a Java app. I can't seem to get protobuf-net to serialize a value of zero for AnInt or false for ABool, including setting UseImplicitZeroDefaults=false. However, using annotated classes for serialization with that setting works. Here's an equivalent class to the one I generated:
[ProtoContract]
class Test2
{
[ProtoMember(1)]
public string Message { get; set; }
[ProtoMember(2)]
public bool ABool { get; set; }
[ProtoMember(3)]
public int AnInt { get; set; }
}
Initializing the two classes with the same data and serializing to byte[] shows that four extra bytes are coming from the annotated class.
...
private static readonly RuntimeTypeModel serializer;
static Program()
{
serializer = TypeModel.Create();
serializer.UseImplicitZeroDefaults = false;
Console.WriteLine(serializer.UseImplicitZeroDefaults); //prints false
}
static void SendMessages(ITopic topic, ISession session)
{
Console.WriteLine(serializer.UseImplicitZeroDefaults);
TestMessage t = new TestMessage();
t.ABool = false;
t.AnInt = 0;
t.Message = "Test Message";
using (var o = new MemoryStream())
{
serializer.Serialize(o, t);
Console.WriteLine(string.Format("Tx: Message={0} ABool={1} AnInt={2}", t.Message, t.ABool, t.AnInt));
Console.WriteLine(o.ToArray().Length);
}
Test2 t2 = new Test2();
t2.ABool = false;
t2.AnInt = 0;
t2.Message = "Test Message";
using (var o = new MemoryStream())
{
serializer.Serialize(o, t2);
Console.WriteLine(string.Format("Tx: Message={0} ABool={1} AnInt={2}", t2.Message, t2.ABool, t2.AnInt));
Console.WriteLine(o.ToArray().Length);
}
}
Output:
False
Tx: Message=Test Message ABool=False AnInt=0
14
Tx: Message=Test Message ABool=False AnInt=0
18
Is there a setting I'm missing? Or do classes generated from .proto files use a different mechanism for serialization? In an ideal world, I would expect the UseImplicitZeroDefaults setting to get picked up by both the annotated and generated classes on their way through the serializer.
If you add -p:detectMissing to your call to protogen, it should emit code following a different pattern that allows for better tracking of these. Basically, it should do what you want then.
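As a rough illustration of what "tracking" means here: the idea is to remember whether a value was ever assigned, so that an explicitly assigned 0 or false can be told apart from "never set". The sketch below is hand-written and hypothetical; the code protogen actually emits may differ in names and details, and the serialization attributes are omitted.

```csharp
using System;

// Hypothetical sketch of the "was this value assigned?" pattern.
public class TestMessageSketch
{
    private int? anInt;   // null means "never assigned"

    public int AnInt
    {
        get => anInt ?? 0;
        set => anInt = value;
    }

    // True once a value has been assigned, even if that value is 0.
    public bool AnIntSpecified => anInt.HasValue;

    // Conditional-serialization hook: write the field only when it was assigned.
    public bool ShouldSerializeAnInt() => AnIntSpecified;

    public void ResetAnInt() => anInt = null;
}

public static class DetectMissingDemo
{
    public static void Main()
    {
        var message = new TestMessageSketch();
        Console.WriteLine(message.ShouldSerializeAnInt()); // False

        message.AnInt = 0;
        Console.WriteLine(message.ShouldSerializeAnInt()); // True

        message.ResetAnInt();
        Console.WriteLine(message.ShouldSerializeAnInt()); // False
    }
}
```

The four-byte difference reported in the question is consistent with this: the annotated class writes ABool and AnInt, and each of those costs one tag byte plus one value byte.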
Question:
Can anyone tell me why my unit test is failing with this error message?
CollectionAssert.AreEquivalent failed. The expected collection contains 1
occurrence(s) of . The actual
collection contains 0 occurrence(s).
Goal:
I'd like to check if two lists are identical. They are identical if both contain the same elements with the same property values. The order is irrelevant.
Code example:
This is the code which produces the error. list1 and list2 are identical, i.e. a copy-paste of each other.
[TestMethod]
public void TestListOfT()
{
var list1 = new List<MyPerson>()
{
new MyPerson()
{
Name = "A",
Age = 20
},
new MyPerson()
{
Name = "B",
Age = 30
}
};
var list2 = new List<MyPerson>()
{
new MyPerson()
{
Name = "A",
Age = 20
},
new MyPerson()
{
Name = "B",
Age = 30
}
};
CollectionAssert.AreEquivalent(list1.ToList(), list2.ToList());
}
public class MyPerson
{
public string Name { get; set; }
public int Age { get; set; }
}
I've also tried this line (source)
CollectionAssert.AreEquivalent(list1.ToList(), list2.ToList());
and this line (source)
CollectionAssert.AreEquivalent(list1.ToArray(), list2.ToArray());
P.S.
Related Stack Overflow questions:
I've seen both these questions, but the answers didn't help.
CollectionAssert use with generics?
Unit-testing IList with CollectionAssert
You are absolutely right. Unless you provide something like an IEqualityComparer<MyPerson> or implement MyPerson.Equals(), the two MyPerson objects will be compared with object.Equals, just like any other object. Since the objects are different, the Assert will fail.
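The default behaviour is easy to see outside any test framework. In this minimal sketch, PlainPerson and ValuePerson are made-up names: the first relies on the inherited object.Equals, the second overrides Equals and GetHashCode.

```csharp
using System;

// No Equals override: instances are compared by reference.
public class PlainPerson
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Equals and GetHashCode overridden: instances are compared by value.
public class ValuePerson
{
    public string Name { get; set; }
    public int Age { get; set; }

    public override bool Equals(object obj) =>
        obj is ValuePerson other && Name == other.Name && Age == other.Age;

    public override int GetHashCode() =>
        (Name == null ? 0 : Name.GetHashCode()) ^ Age.GetHashCode();
}

public static class EqualityDemo
{
    public static void Main()
    {
        var plainA = new PlainPerson { Name = "A", Age = 20 };
        var plainB = new PlainPerson { Name = "A", Age = 20 };
        Console.WriteLine(plainA.Equals(plainB)); // False

        var valueA = new ValuePerson { Name = "A", Age = 20 };
        var valueB = new ValuePerson { Name = "A", Age = 20 };
        Console.WriteLine(valueA.Equals(valueB)); // True
    }
}
```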
It works if I add an IEqualityComparer<T> as described on MSDN and if I use Enumerable.SequenceEqual. Note however, that now the order of the elements is relevant.
In the unit test
//CollectionAssert.AreEquivalent(list1, list2); // Does not work
Assert.IsTrue(list1.SequenceEqual(list2, new MyPersonEqualityComparer())); // Works
IEqualityComparer
public class MyPersonEqualityComparer : IEqualityComparer<MyPerson>
{
public bool Equals(MyPerson x, MyPerson y)
{
if (object.ReferenceEquals(x, y)) return true;
if (object.ReferenceEquals(x, null) || object.ReferenceEquals(y, null)) return false;
return x.Name == y.Name && x.Age == y.Age;
}
public int GetHashCode(MyPerson obj)
{
if (object.ReferenceEquals(obj, null)) return 0;
int hashCodeName = obj.Name == null ? 0 : obj.Name.GetHashCode();
int hashCodeAge = obj.Age.GetHashCode();
return hashCodeName ^ hashCodeAge;
}
}
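Since SequenceEqual is order-sensitive, one way to get back to "order is irrelevant" is to bring both lists into a canonical order first. The sketch below is one possible approach, not the only one: it projects each person to a (Name, Age) value tuple, which compares by value, sorts both sides, and then compares. PersonLists and SameItemsAnyOrder are hypothetical names, and the helper assumes the lists contain no null elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyPerson
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class PersonLists
{
    // True when both sequences hold the same (Name, Age) pairs,
    // with the same multiplicity, in any order.
    public static bool SameItemsAnyOrder(IEnumerable<MyPerson> left, IEnumerable<MyPerson> right)
    {
        return Normalize(left).SequenceEqual(Normalize(right));
    }

    private static IEnumerable<(string Name, int Age)> Normalize(IEnumerable<MyPerson> people)
    {
        return people
            .Select(p => (p.Name, p.Age))
            .OrderBy(t => t.Name, StringComparer.Ordinal)
            .ThenBy(t => t.Age);
    }
}

public static class OrderDemo
{
    public static void Main()
    {
        var list1 = new List<MyPerson>
        {
            new MyPerson { Name = "A", Age = 20 },
            new MyPerson { Name = "B", Age = 30 }
        };
        var list2 = new List<MyPerson>
        {
            new MyPerson { Name = "B", Age = 30 },
            new MyPerson { Name = "A", Age = 20 }
        };
        Console.WriteLine(PersonLists.SameItemsAnyOrder(list1, list2)); // True
    }
}
```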
I was getting this same error when testing a collection persisted by NHibernate. I was able to get this to work by overriding both the Equals and GetHashCode methods. If I didn't override both, I still got the same error you mentioned:
CollectionAssert.AreEquivalent failed. The expected collection contains 1 occurrence(s) of .
The actual collection contains 0 occurrence(s).
I had the following object:
public class EVProjectLedger
{
public virtual long Id { get; protected set; }
public virtual string ProjId { get; set; }
public virtual string Ledger { get; set; }
public virtual AccountRule AccountRule { get; set; }
public virtual int AccountLength { get; set; }
public virtual string AccountSubstrMethod { get; set; }
private Iesi.Collections.Generic.ISet<Contract> myContracts = new HashedSet<Contract>();
public virtual Iesi.Collections.Generic.ISet<Contract> Contracts
{
get { return myContracts; }
set { myContracts = value; }
}
public override bool Equals(object obj)
{
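// Guard against null and unrelated types instead of letting the cast throw.
var evProjectLedger = obj as EVProjectLedger;
if (evProjectLedger == null) return false;
return ProjId == evProjectLedger.ProjId && Ledger == evProjectLedger.Ledger;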
}
public override int GetHashCode()
{
return new { ProjId, Ledger }.GetHashCode();
}
}
Which I tested using the following:
using (ITransaction tx = session.BeginTransaction())
{
var evProject = session.Get<EVProject>("C0G");
CollectionAssert.AreEquivalent(TestData._evProjectLedgers.ToList(), evProject.EVProjectLedgers.ToList());
tx.Commit();
}
I'm using NHibernate, which encourages overriding these methods anyway. The one drawback I can see is that my Equals method is based on the business key of the object and therefore tests equality using the business key and no other fields. You could override Equals however you want, but beware of the equality pollution mentioned in this post:
CollectionAssert.AreEquivalent failing... can't figure out why
If you would like to achieve this without having to write an equality comparer, there is a unit testing library called FluentAssertions that you can use:
https://fluentassertions.com/documentation/
It has many built-in equality extension methods, including ones for collections. You can install it through NuGet and it's really easy to use.
Taking the example in the question above all you have to write in the end is
list1.Should().BeEquivalentTo(list2);
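By default, BeEquivalentTo ignores the order of the items in the two collections; if the order matters, this can be changed through the assertion options (for example WithStrictOrdering()).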
I wrote this to test collections where the order is not important:
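public static bool AreCollectionsEquivalent<T>(ICollection<T> collectionA, ICollection<T> collectionB, IEqualityComparer<T> comparer)
{
if (collectionA.Count != collectionB.Count)
return false;
// Work on a copy so that each element of B is matched at most once;
// otherwise { a, a, b } and { a, b, b } would wrongly be reported as equivalent.
var remaining = new List<T>(collectionB);
foreach (var a in collectionA)
{
int index = remaining.FindIndex(b => comparer.Equals(a, b));
if (index < 0)
return false;
remaining.RemoveAt(index);
}
return true;
}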
Not as elegant as using SequenceEqual, but it works.
Of course to use it you simply do:
Assert.IsTrue(AreCollectionsEquivalent<MyType>(collectionA, collectionB, comparer));