Deserialize class properties to derived class instead of base abstract class


The structure here is a bit convoluted, so please bear with me. I hope there's a way to do what I want to do, but if it's not possible then feel free to tell me!
Unfortunately, I was not involved in the design of the XML file spec from the beginning. The goalposts have been moved a dozen times, and the spec can't be amended just to make things easier for me, because any spec amendments are extortionately priced (yes, even renaming XML elements!).
Anyway I digress.
There are two different types of purchase order that have a slightly different file structure and slightly different processing logic and I am trying to handle them both with the same code instead of having two different projects for what is nearly identical.
Both types of purchase order derive from a set of abstract classes that determine basic business logic that both PO types share (quantity, cost, PO number, etc.).
Base classes
public abstract class PO {
    [XmlIgnore]
    public abstract POType PurchaseOrderType {get;}
    [XmlIgnore]
    public abstract PO_Head Header {get;set;}
    [XmlIgnore]
    public abstract List<PO_Line> LineItems {get;set;}
}
public abstract class PO_Head {
    [XmlIgnore]
    public abstract string PONumber {get;set;}
}
public abstract class PO_Line {
    [XmlIgnore]
    public abstract string ItemCode {get;set;}
    [XmlIgnore]
    public abstract decimal UnitCost {get;set;}
    [XmlIgnore]
    public abstract int OrderQty {get;set;}
}
Derived classes
class POR : PO {
// POR implementations
}
class POR_Head : PO_Head {
// POR implementations
}
class POR_Line : PO_Line {
// POR implementations
}
class CPO : PO {
// CPO implementations
}
class CPO_Head : PO_Head {
// CPO implementations
}
class CPO_Line : PO_Line {
// CPO implementations
}
The base abstract class is then used in the code to process each purchase order and import them into our accounting system.
for (int i = pending.Transactions.Count -1; i > -1; i--) {
PO pendingOrder = (PO)pending.Transactions[i];
// Import PO type
}
The problem is that when I attempt to deserialize into each derived class as appropriate, the serializer attempts to deserialize Header and LineItems into the abstract base types PO_Head and PO_Line. Is there any way I can explicitly tell the XmlSerializer to treat Header and LineItems as the derived class versions (CPO_Head and CPO_Line, or POR_Head and POR_Line), depending on the class that is being deserialized (CPO or POR respectively)? Something akin to the below.
class CPO : PO {
[XmlIgnore]
override POType PurchaseOrderType => POType.CPO;
[XmlElement("CPO_Head")]
[XmlDeserializerType(typeof(CPO_Head))]
override PO_Head Header {get;set;}
[XmlArray("CPO_Lines")]
[XmlArrayItem("CPO_Line")]
[XmlDeserializerType(typeof(CPO_Line))]
override List<PO_Line> LineItems {get;set;}
}
Kind of dug myself a hole with this one (first time using deserialization) so I'm hoping there's an easy way out that would save me having to rewrite a ton of the work I've done!
TIA
EDIT - code used to deserialize/serialize as requested
public static void SerializeToXmlFile(object obj, FileInfo dstFile)
{
XmlSerializer xmlSerializer = new XmlSerializer(obj.GetType());
using (FileStream fs = dstFile.Open(FileMode.Create, FileAccess.Write))
{
xmlSerializer.Serialize(fs, obj);
}
}
public static object DeserializeFromXmlFile(FileInfo srcFile, Type type)
{
XmlSerializer xmlSerializer = new XmlSerializer(type);
using (FileStream fs = srcFile.Open(FileMode.Open, FileAccess.Read))
{
return xmlSerializer.Deserialize(fs);
}
}
public static void Main(string[] args)
{
    // Deserialize from XML file
    FileInfo srcFile = new FileInfo(@"path\to\file");
    // Pick the concrete PO type from the file name prefix
    Type t;
    if (srcFile.Name.StartsWith("CPO", StringComparison.Ordinal))
        t = typeof(CPO);
    else if (srcFile.Name.StartsWith("POR", StringComparison.Ordinal))
        t = typeof(POR);
    else
        throw new InvalidOperationException("Unknown purchase order type: " + srcFile.Name);
    PO po = (PO)DeserializeFromXmlFile(srcFile, t);
    // Process the file
    // ...
    // Serialize back to file
    FileInfo dstFile = new FileInfo(@"new\path\to\file");
    SerializeToXmlFile(po, dstFile);
}
EDIT - fixed
As per the marked correct answer I was able to resolve this by specifying the Type in the XmlElement and XmlArrayItem attributes.
class POR : PO {
    [XmlElement(typeof(POR_Head), ElementName="PO_Head")]
    public override PO_Head Header {get;set;} = new POR_Head();
    [XmlArray("PO_Lines")]
    [XmlArrayItem(typeof(POR_Line), ElementName="PO_Line")]
    public override List<PO_Line> LineItems {get;set;} = new List<PO_Line>();
}
class CPO : PO {
    [XmlElement(typeof(CPO_Head), ElementName="PO_Head")]
    public override PO_Head Header {get;set;} = new CPO_Head();
    [XmlArray("PO_Lines")]
    [XmlArrayItem(typeof(CPO_Line), ElementName="PO_Line")]
    public override List<PO_Line> LineItems {get;set;} = new List<PO_Line>();
}

I think you are looking for the XmlArrayItemAttribute. According to the docs:
You can apply the XmlArrayItemAttribute to any public read/write member that returns an array, or provides access to one. For example, a field that returns an array of objects, a collection, an ArrayList, or any class that implements the IEnumerable interface.
The XmlArrayItemAttribute supports polymorphism--in other words, it allows the XmlSerializer to add derived objects to an array.
If I understand the example correctly, the attribute should be written with a list of the possible derived types that are allowed in the serialized enumerable, e.g.
[XmlArrayItem (typeof(PO_Line), ElementName = "PO_Line"),
XmlArrayItem (typeof(CPO_Line),ElementName = "CPO_Line")]
Since you are only passing a string, the attribute interprets that as the ElementName, not the type. The type must be passed using typeof(ClassName).
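As a rough illustration of the point above, here is a minimal round-trip sketch. The class names (PoLine, CpoLine, CpoOrder, XmlRoundTrip) are hypothetical, simplified stand-ins and not the asker's actual types; the only thing it is meant to show is that passing typeof(...) to XmlArrayItem lets a list declared with the base type be filled with derived instances.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical, simplified stand-ins for the question's classes.
public abstract class PoLine
{
    public string ItemCode { get; set; }
}

public class CpoLine : PoLine
{
    public int OrderQty { get; set; }
}

[XmlRoot("CPO")]
public class CpoOrder
{
    // The list is declared with the base type, but every <PO_Line>
    // element is mapped to the concrete CpoLine type.
    [XmlArray("PO_Lines")]
    [XmlArrayItem("PO_Line", typeof(CpoLine))]
    public List<PoLine> LineItems { get; set; } = new List<PoLine>();
}

public static class XmlRoundTrip
{
    // Serializes the order to an XML string.
    public static string Serialize(CpoOrder order)
    {
        var serializer = new XmlSerializer(typeof(CpoOrder));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }

    // Reads the XML string back into a CpoOrder.
    public static CpoOrder Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(CpoOrder));
        using (var reader = new StringReader(xml))
        {
            return (CpoOrder)serializer.Deserialize(reader);
        }
    }
}
```

Calling XmlRoundTrip.Deserialize(XmlRoundTrip.Serialize(order)) on an order holding CpoLine items should give back a list whose elements are CpoLine instances rather than failing on the abstract base type.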

Related

Serializing Interface array

I am trying to implement a way to save a set of objects to file, and then read it back to objects again.
I want to serialize the objects to XML (or JSON). The objects consists of one master object which holds an array of all the other objects. The array is of the type Interface, to allow several different types of child objects with some common functionality.
Obviously, there will be a problem during deserialization because the type of the interface object is not known.
Example:
[Serializable]
public class MasterClass
{
public ImyInterface[] subObjects;
}
public interface ImyInterface
{
}
How can I serialize/deserialize these objects?
My suggestions:
Add information about the object type in the serialized data.
Use a different solution than interface.

This is not the only way to serialize your data, but it is a ready to use solution from the framework:
DataContractSerializer supports this if you don't mind adding attributes for each of the available implementations of the interface:
[DataContract]
[KnownType(typeof(MyImpl))] // You'd have to do this for every implementation of ImyInterface
public class MasterClass
{
[DataMember]
public ImyInterface[] subObjects;
}
public interface ImyInterface
{
}
public class MyImpl : ImyInterface
{
...
}
Serializing/deserializing:
MasterClass mc = ...
using (var stream = new MemoryStream())
{
DataContractSerializer ser = new DataContractSerializer(typeof(MasterClass));
ser.WriteObject(stream, mc);
stream.Position = 0;
var deserialized = ser.ReadObject(stream);
}
For JSON you could use DataContractJsonSerializer instead.
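If decorating the class with one [KnownType] attribute per implementation is inconvenient, the known types can also be handed to the serializer's constructor. The sketch below assumes hypothetical types (IShape, Circle, Drawing) and a hypothetical helper (DataContractRoundTrip.Copy); it is an illustration of the constructor overload, not the answer author's code.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical interface and implementation used only for illustration.
public interface IShape { }

[DataContract]
public class Circle : IShape
{
    [DataMember]
    public double Radius { get; set; }
}

[DataContract]
public class Drawing
{
    // Declared as an interface array; concrete types are resolved
    // through the known-types list given to the serializer.
    [DataMember]
    public IShape[] Shapes { get; set; }
}

public static class DataContractRoundTrip
{
    // Writes the object to a memory stream and reads it back.
    public static Drawing Copy(Drawing source)
    {
        // Known types are supplied here instead of via [KnownType].
        var serializer = new DataContractSerializer(
            typeof(Drawing), new[] { typeof(Circle) });
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (Drawing)serializer.ReadObject(stream);
        }
    }
}
```

The trade-off is that the list of implementations now lives wherever the serializer is created, so it has to be kept in sync there rather than on the class itself.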

One solution is to use an abstract class instead of an interface:
public class MasterClass
{
public MyAbstractClass[] subObjects;
}
[XmlInclude(typeof(MyImpl))] // Include all classes that inherit from the abstract class
public abstract class MyAbstractClass
{
}
public class MyImpl : MyAbstractClass
{
...
}
It can be serialized/deserialized with the XmlSerializer:
MasterClass mc = ...
using (FileStream fs = File.Create("objects.xml"))
{
XmlSerializer xs = new XmlSerializer(typeof(MasterClass));
xs.Serialize(fs, mc);
}
using (StreamReader file = new StreamReader("objects.xml"))
{
XmlSerializer reader = new XmlSerializer(typeof(MasterClass));
var deserialized = reader.Deserialize(file);
}

Inheritance in protobuf.net, adding a lower base class still backward compatible?

I have been using protobuf.net for a while and it is excellent. I can have a class which is inherited from a base class, I can serialise the derived class by using ProtoInclude statements in the base class. If my base class originally had only say two ProtoInclude statements when the object was serialised, say
[ProtoInclude(100, typeof(Vol_SurfaceObject))]
[ProtoInclude(200, typeof(CurveObject))]
internal abstract class MarketDataObject
I can still deserialise that same object in to code that has evolved to have more derivations:
[ProtoInclude(100, typeof(Vol_SurfaceObject))]
[ProtoInclude(200, typeof(CurveObject))]
[ProtoInclude(300, typeof(StaticDataObject))]
internal abstract class MarketDataObject
So far so good (in fact excellent, thanks Marc). However, what if I now want to have a base class even lower than my current base class here (in this case, MarketDataObject), such that I would have
[ProtoInclude(100, typeof(Vol_SurfaceObject))]
[ProtoInclude(200, typeof(CurveObject))]
[ProtoInclude(300, typeof(StaticDataObject))]
internal abstract class MarketDataObject : LowerStillBaseClass
{ blah }
[ProtoInclude(10, typeof(MarketDataObject))]
internal abstract class LowerStillBaseClass
{ blah }
Whilst the code will of course work, will I still be able to deserialise the initial objects that were serialised when the object had only 2 ProtoInclude statements to this new form of the MarketDataObject class?

This will not work purely with static protobuf-net attributes. Simplifying somewhat, imagine you start with the following:
namespace V1
{
[ProtoContract]
internal class MarketDataObject
{
[ProtoMember(1)]
public string Id { get; set; }
}
}
And refactor it to be the following:
namespace V2
{
[ProtoInclude(10, typeof(MarketDataObject))]
[ProtoContract]
internal abstract class LowerStillBaseClass
{
[ProtoMember(1)]
public string LowerStillBaseClassProperty { get; set; }
}
[ProtoContract]
internal class MarketDataObject : LowerStillBaseClass
{
[ProtoMember(1)]
public string Id { get; set; }
}
}
Next, try to deserialize a message created from the V1 class into a V2 class. You will fail with the following exception:
ProtoBuf.ProtoException: No parameterless constructor found for LowerStillBaseClass
The reason this does not work is that type hierarchies are serialized base-first rather than derived-first. To see this, dump the protobuf-net contracts for each type by calling Console.WriteLine(RuntimeTypeModel.Default.GetSchema(type)); For V1.MarketDataObject we get:
message MarketDataObject {
optional string Id = 1;
}
And for V2.MarketDataObject:
message LowerStillBaseClass {
optional string LowerStillBaseClassProperty = 1;
// the following represent sub-types; at most 1 should have a value
optional MarketDataObject MarketDataObject = 10;
}
message MarketDataObject {
optional string Id = 1;
}
MarketDataObject is getting encoded into a message with its base type fields first, at the top level, then derived type fields are recursively encapsulated inside a nested optional message with a field id that represents its subtype. So when a V1 message is deserialized to a V2 object, no subtype field is encountered, the correct derived type is not inferred, and derived type values are lost.
One workaround is to avoid using [ProtoInclude(10, typeof(MarketDataObject))] and instead populate the base class members in the derived type's contract programmatically using the RuntimeTypeModel API:
namespace V3
{
[ProtoContract]
internal abstract class LowerStillBaseClass
{
[ProtoMember(1)]
public string LowerStillBaseClassProperty { get; set; }
}
[ProtoContract]
internal class MarketDataObject : LowerStillBaseClass
{
static MarketDataObject()
{
AddBaseTypeProtoMembers(RuntimeTypeModel.Default);
}
const int BaseTypeIncrement = 11000;
public static void AddBaseTypeProtoMembers(RuntimeTypeModel runtimeTypeModel)
{
var myType = runtimeTypeModel[typeof(MarketDataObject)];
var baseType = runtimeTypeModel[typeof(MarketDataObject).BaseType];
if (!baseType.GetSubtypes().Any(s => s.DerivedType == myType))
{
foreach (var field in baseType.GetFields())
{
myType.Add(field.FieldNumber + BaseTypeIncrement, field.Name);
}
}
}
[ProtoMember(1)]
public string Id { get; set; }
}
}
(Here I am populating the contract inside the static constructor for MarketDataObject. You might want to do it elsewhere.) The schema for V3.MarketDataObject looks like:
message MarketDataObject {
optional string Id = 1;
optional string LowerStillBaseClassProperty = 11001;
}
This schema is compatible with the V1 schema, so a V1 message can be deserialized into a V3 class without data loss.
Of course, if you are moving a member from MarketDataObject to LowerStillBaseClass you will need to ensure that the field id stays the same.
The disadvantage of this workaround is that you lose the ability to deserialize an object of type LowerStillBaseClass and have protobuf-net automatically infer the correct derived type.

Pass List<child> as argument?

I want to create a method that serializes lists of different child types. Each child type gets its own XML file. Doing this for a single object was easy:
public interface ISerializable
{
}
public class Item : ISerializable
{
public string name { get; set; }
public int number { get; set; }
}
public class Weapon : ISerializable
{
public string name { get; set; }
public int damage {get; set;}
}
public static void SerializeToXML(ISerializable child)
{
XmlSerializer serializer = new XmlSerializer(child.GetType());
using (TextWriter textWriter = new StreamWriter("test.xml"))
{
serializer.Serialize(textWriter, child);
}
}
I can put anything derived from ISerializable into the serialize method to get the desired result. However, when I use List<ISerializable> as the parameter type, it does not compile: Cannot convert List<Item> to List<ISerializable>. Is there any way to solve this?
public static void SerializeListToXML(List<ISerializable> listToXML)
{
XmlSerializer serializer = new XmlSerializer(listToXML.GetType());
using (TextWriter textWriter = new StreamWriter("test.xml"))
{
serializer.Serialize(textWriter, listToXML);
}
}
The main reason I'm doing this is to get XML templates of much larger structures so I can manually add XML to load into my program. I figured I would benefit from creating a serializer first so I have the correct XML structure to start with. Perhaps this will come in handy later for saving user data too.

SerializeListToXML should take an IEnumerable<ISerializable> instead of a List<ISerializable>.
A List<Item> cannot be cast to List<ISerializable>. If you could do that, then you'd be able to add any ISerializable to the list, not just Items. With an IEnumerable<>, you cannot modify the collection, only iterate over it. Since SerializeListToXML does not need to modify the collection, it can accept an IEnumerable<ISerializable>, which is less restrictive. For background, see the C# documentation on covariance and contravariance in generics.
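The suggestion above can be sketched roughly as follows. ListSerializer and its method are hypothetical names, and the sketch returns the XML as a string instead of writing test.xml so that it stays self-contained; Item mirrors the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public interface ISerializable { }

public class Item : ISerializable
{
    public string name { get; set; }
    public int number { get; set; }
}

public static class ListSerializer
{
    // IEnumerable<out T> is covariant, so a List<Item> can be passed
    // where an IEnumerable<ISerializable> is expected.
    public static string SerializeListToXml(IEnumerable<ISerializable> items)
    {
        // GetType() returns the runtime type (e.g. List<Item>), which is
        // what XmlSerializer needs; this assumes the caller passes a
        // concrete, serializable collection rather than a lazy LINQ query.
        var serializer = new XmlSerializer(items.GetType());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, items);
            return writer.ToString();
        }
    }
}
```

A call such as ListSerializer.SerializeListToXml(new List<Item> { ... }) then compiles without any cast, because the conversion from List<Item> to IEnumerable<ISerializable> is implicit for reference types.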

Getting ServiceStack to retain type information

I'm using ServiceStack to serialize and deserialize some objects to JSON. Consider this example:
public class Container
{
public Animal Animal { get; set; }
}
public class Animal
{
}
public class Dog : Animal
{
public void Speak() { Console.WriteLine("Woof!"); }
}
var container = new Container { Animal = new Dog() };
var json = JsonSerializer.SerializeToString(container);
var container2 = JsonSerializer.DeserializeFromString<Container>(json);
((Dog)container.Animal).Speak(); //Works
((Dog)container2.Animal).Speak(); //InvalidCastException
The last line throws an InvalidCastException, because the Animal property is instantiated as an Animal type, not a Dog type. Is there any way I can tell ServiceStack to retain the information that this particular instance was of the Dog type?

Inheritance in DTOs is a bad idea - DTOs should be as self-describing as possible, and by using inheritance clients effectively have no idea what the service ultimately returns. This is why your DTO classes will fail to de/serialize properly in most 'standards-based' serializers.
There's no good reason for having interfaces in DTOs (and very few reasons to have them on POCO models). It's a cargo-cult habit of using interfaces to reduce coupling in application code that's being thoughtlessly leaked into DTOs. But across process boundaries, interfaces only add coupling (it's only reduced in code): the consumer has no idea what concrete type to deserialize into, so the serializer has to emit serialization-specific implementation hints that embed C# concerns on the wire (so now even C# namespaces will break serialization) and constrain your response to be used by a particular serializer. Leaking C# concerns on the wire violates one of the core goals of services: enabling interoperability.
As there is no concept of 'type info' in the JSON spec, in order for inheritance to work in JSON Serializers they need to emit proprietary extensions to the JSON wireformat to include this type info - which now couples your JSON payload to a specific JSON serializer implementation.
ServiceStack's JsonSerializer stores this type info in the __type property and since it can considerably bloat the payload, will only emit this type information for types that need it, i.e. Interfaces, late-bound object types or abstract classes.
With that said the solution would be to change Animal to either be an Interface or an abstract class, the recommendation however is not to use inheritance in DTOs.

You are serializing only the properties of the animal object, whether the serialized object is dog or not. Even if you add a public property to the dog class, like "Name", it will not be serialized so when you deserialize you will only have properties of an "Animal" class.
If you change it to the following, it will work:
public class Container<T> where T: Animal
{
public T Animal { get; set; }
}
public class Animal
{
}
public class Dog : Animal
{
public void Speak() { Console.WriteLine("Woof!"); }
public string Name { get; set; }
}
var c = new Container<Dog> { Animal = new Dog() { Name = "dog1" } };
var json = JsonSerializer.SerializeToString<Container<Dog>>(c);
var c2 = JsonSerializer.DeserializeFromString<Container<Dog>>(json);
c.Animal.Speak(); //Works
c2.Animal.Speak();

This may be off-topic, but the Newtonsoft serializer can do this with the following option:
var serializer = new JsonSerializer();
serializer.TypeNameHandling = TypeNameHandling.All;
It will create a property inside the JSON called $type containing the full type name of the object. When you call the deserializer, that value will be used to rebuild the object with the same types. The following test passes using Newtonsoft with type name handling, but not with ServiceStack:
[TestFixture]
public class ServiceStackTests
{
[TestCase]
public void Foo()
{
FakeB b = new FakeB();
b.Property1 = "1";
b.Property2 = "2";
string raw = b.ToJson();
FakeA a=raw.FromJson<FakeA>();
Assert.IsNotNull(a);
Assert.AreEqual(a.GetType(), typeof(FakeB));
}
}
public abstract class FakeA
{
public string Property1 { get; set; }
}
public class FakeB:FakeA
{
public string Property2 { get; set; }
}

How can I read data from a serialized object whose definition isn't in my code base any more?

Say I have the following class:
[Serializable]
public class A
{
public string B { get; set; }
}
and the following method was used to serialize it:
public void Serialize()
{
BinaryFormatter b = new BinaryFormatter();
b.Serialize(new FileStream(@"C:\Temp\out.dat", FileMode.Create), new A());
}
If at some point, someone came along and modified the class definition to contain an extra property (or remove one):
[Serializable]
public class A // Same class
{
public string B { get; set; }
public string C { get; set; } // New property
}
then the following will break:
public void Deserialize()
{
BinaryFormatter b = new BinaryFormatter();
A deserialized = (A)b.Deserialize(new FileStream(@"C:\Temp\out.dat", FileMode.Open));
}
because the serialized version of the class does not match the class definition of the new class.
I really dislike the idea of serialization as a persistence mechanism because it's so fragile. I would have dealt with this much earlier if I had been involved in the decision.
Is there any way to get my serialized data into a form I can read it without reverting all of my current class definitions to their serialized state?
Even if it's hacky "to the max", I was hoping I could do this, because I would hopefully only do it once (or until I could figure out how to fix the root of the problem).
edit
There are dozens of these classes that have been serialized and then modified in my system already. It is not feasible to use version control to see exactly when and how each individual class changed.
I'm currently trying to figure out a way I can retain "old A"'s settings before the user tries to deserialize the object to the "new A" format and I have to 1) try, 2) catch, 3) swallow the error, and finally 4) recreate A with default values (based on the new object definition).

I believe that you can decorate newly added members with the OptionalFieldAttribute (System.Runtime.Serialization namespace), which will allow you to deserialize instances that were serialized before that member was added. The attribute can only be applied to fields, so for an auto-property it has to target the compiler-generated backing field:
[Serializable]
public class A // Same class
{
    public string B { get; set; }
    [field: OptionalField] public string C { get; set; } // New property
}
