How to do a polymorphic deserialization in C# given a XSD?

How to do a polymorphic deserialization in C# given a XSD? - c#

I have the following given:
1) A XML Schema, XSD-file, compiled to C# classes using the XSD.EXE tool.
2) A RabbitMQ message queue containing well formed messages in XML of any type defined in the XML Schema. Here are two snippets of different messages:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<UserReport xmlns=".../v5.1"; ... >
... User report message content...
</UserReport>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CaptureReport xmlns=".../v5.1"; ...>
... Capture report message content...
</CaptureReport>
3) Experience using the XmlSerializer .Net class to deserialize, when the type is known.
The question is how to deserialize messages from XML to a an object, when the type is unknown. It's not possible to instantiate the XmlSerializer, because the type is unknown.
One way is to loop through all possible types until deserialiation succeed, which is a bad solution because there are many different types defined in the XML Schema.
Is there any other alternatives?

There are a few approaches you can take depending on how exactly you've achieved your polymorphism in the XML itself.
Element name is the type name (reflection approach)
You could get the root element name like this:
string rootElement = null;
using (XmlReader reader = XmlReader.Create(xmlFileName))
{
while (reader.Read())
{
// We won't have to read much of the file to find the root element as it will be the first one found
if (reader.NodeType == XmlNodeType.Element)
{
rootElement = reader.Name;
break;
}
}
}
Then you could find the type by reflection like this (adjust reflection as necessary if your classes are in a different assembly):
var serializableType = Type.GetType("MyApp." + rootElement);
var serializer = new XmlSerializer(serializableType);
You would be advised to cache the mapping from the element name to the XML serializer if performance is important.
Element name maps to the type name
If the XML element names are different from the type names, or you don't want to do reflection, you could instead create a Dictionary mapping from the element names in the XML to the XmlSerializer objects, but still look-up the root element name using the snippet above.
Common root element with polymorphism through xsi:type
If your XML messages all have the same root element name, and the polymorphism is achieved by having types identified using xsi:type, then you can do something like this:
using System;
using System.Xml.Serialization;
namespace XmlTest
{
public abstract class RootElement
{
}
public class TypeA : RootElement
{
public string AData
{
get;
set;
}
}
public class TypeB : RootElement
{
public int BData
{
get;
set;
}
}
class Program
{
static void Main(string[] args)
{
var serializer = new System.Xml.Serialization.XmlSerializer(typeof(RootElement),
new Type[]
{
typeof(TypeA),
typeof(TypeB)
});
RootElement rootElement = null;
string axml = "<RootElement xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"TypeA\"><AData>Hello A</AData></RootElement>";
string bxml = "<RootElement xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"TypeB\"><BData>1234</BData></RootElement>";
foreach (var s in new string[] { axml, bxml })
{
using (var reader = new System.IO.StringReader(s))
{
rootElement = (RootElement)serializer.Deserialize(reader);
}
TypeA a = rootElement as TypeA;
if (a != null)
{
Console.WriteLine("TypeA: {0}", a.AData);
}
else
{
TypeB b = rootElement as TypeB;
if (b != null)
{
Console.WriteLine("TypeB: {0}", b.BData);
}
else
{
Console.Error.WriteLine("Unexpected type.");
}
}
}
}
}
}
Note the second parameter to the XmlSerializer constructor which is an array of additional types that you want the .NET serializer to know about.

This is an answer based on #softwariness one, but it provides some automation.
If your classes are generated via xsd then all root types are decorated with XmlRootAttribute so we can use it:
public class XmlVariantFactory
{
private readonly Dictionary<string, Type> _xmlRoots;
public XmlVariantFactory() : this(Assembly.GetExecutingAssembly().GetTypes())
{
}
public XmlVariantFactory(IEnumerable<Type> types)
{
_xmlRoots = types
.Select(t => new {t, t.GetCustomAttribute<XmlRootAttribute>()?.ElementName})
.Where(x => !string.IsNullOrEmpty(x.ElementName))
.ToDictionary(x => x.ElementName, x => x.t);
}
public Type GetSerializationType(XmlReader reader)
{
while (reader.Read())
{
if (reader.NodeType == XmlNodeType.Element)
{
return _xmlRoots[reader.LocalName];
}
}
throw new ArgumentException("No known root type found for passed XML");
}
}
It scans all type in executing assembly and finds all possible XML roots. You can do it for all assemblies:
public XmlVariantFactory() : this(AppDomain.CurrentDomain.SelectMany(a => a.GetTypes())
{
}
And then you juse use it:
var input = new StringReader(TestResource.firstRequestResponse);
var serializationType = new XmlVariantFactory().GetSerializationType(XmlReader.Create(input));
var xmlReader = XmlReader.Create(input);
bool canDeserialize = new XmlSerializer(serializationType).CanDeserialize(xmlReader);
Assert.True(canDeserialize);

The responsible for the XML Schema have added the xml-tag to a content field in the RabbitMQ protocol header. The header holds the tag for the dto, data transfer object, sent and serialized to xml. This means that a IOC container becomes handy. I have coded a dto builder interface and its implementation by a generic builder. Thus the builder will build a dto when the dto class is specified for generic part. Note that the dto-class is generated by the xsd-tool. In a IOC container like MS Unity I registered the builder interface implementations for dto all classes and added the xml-tag to the register call. The IOC container’s resolver function is called with the actual received xml-tag from the RabbitMQ header in order to instantiate the specific builder of the dto.

Related

How to deserialize two different XML types in to one class

I have two different XML documents. Their structure is almost identical, but they have some different elements.
I would like to deserialize the incoming documents in to one class which is a superset of both classes. There is no need to ever serialize the class, I only need to deserialize the documents.
The XML document types have a different root element, lets say the root of the first is <CLASSA>and the other is <CLASSB>.
I am looking for something like this, where both <CLASSA> and <CLASSB> xml documents are mapped to ClassAandB:
[XmlRoot(ElementName="CLASSA,CLASSB")]
public class ClassAandB {
[XmlElement(ElementName="syntaxid")]
public Syntaxid Syntaxid{ get; set; }
[XmlElement(ElementName="email")]
public Email Email { get; set; }
[XmlElement(ElementName="envelope")]
public Envelope Envelope { get; set; }
[XmlElement(ElementName="header")]
public Header Header { get; set; }
}
I can then find out which of the two types it is by reading the Syntaxid property. This helps me because a lot of the processing is the same for both types.
Any suggestions how to do this?

Since the xml root element name might depend on the content of the xml document, you'll have to configure the XmlSerializer at runtime with this xml root element name to be used.
In this case, there's no need anymore to apply an XmlRootAttribute.
This can be done via the constructor overload accepting an XmlRootAttribute argument, via which you pass the root element name.
public XmlSerializer (Type type, System.Xml.Serialization.XmlRootAttribute root);
You might know the root element name in front eg. depending on the source of the xml document, or you might discover it at runtime from the xml document itself.
The following from the example below shows how the xml root element name gets set.
String rootName = "CLASSA"; // "CLASSB"
var serializer = new XmlSerializer(typeof(ClassAandB), new XmlRootAttribute(rootName));
An simplified example using an XmlReader as source and retrieving the root xml element name from the content.
public class ClassAandB
{
[XmlElement(ElementName="syntaxid")]
public String Syntaxid{ get; set; }
[XmlElement(ElementName="email")]
public String Email { get; set; }
[XmlElement(ElementName="header")]
public String Header { get; set; }
}
var classA = Deserialize(XmlReader.Create(
new StringReader("<CLASSA><syntaxid>A</syntaxid></CLASSA>"))
);
Console.WriteLine(classA.Syntaxid); // A
var classB = Deserialize(
XmlReader.Create(new StringReader("<CLASSB><syntaxid>B</syntaxid></CLASSB>"))
);
Console.WriteLine(classB.Syntaxid); // B
public static ClassAandB Deserialize(XmlReader reader)
{
reader.MoveToContent();
string rootName = reader.Name;
var serializer = new XmlSerializer(typeof(ClassAandB),
new XmlRootAttribute(rootName)
);
var deserialized = serializer.Deserialize(reader) as ClassAandB;
return deserialized;
}

I suggest you to remove XmlRoot attribute and use:
var doc = new XmlDocument();
doc.Load("file.xml");
XmlElement root = xmlDoc.DocumentElement;
var serializer = new XmlSerializer(typeof(ClassAandB), new XmlRootAttribute(root.ToString()));

<xmlns='' > was not expected. But cannot define XMLroot as it changes. C#

I have a site that imports bookings from an external XML file and stores them as nodes in an Umbraco site.
The issue:
Basically, The system in which I exported the bookings from has changed how it exports the XML and the original root node.
I am assuming this is an issue related to this root node as many answers on stack have mentioned this in regards to the error in the subject title, but none that I could see have covered how to deal with a dynamic rootnode.
What I used to have in the XML file:
<Bookings>
<row id="1">
<FirstBookingAttribute>....
</row>
</Bookings>
Now is dynamic and can end up looking like:
<BookingsFebruary2017 company="CompanyNameHere">
<row id="1">
<firstBookingAttribute>....
</row>
</BookingsFebruary2017>
Because of this I am not entirely sure how to set the XMLroot node in C#.
Incase this issue is not related to the root node, I have pasted the erroring code below:
Controller File
var reader = new StreamReader(tempFile);
var deserializer = new XmlSerializer(typeof(Bookings));
var xml = deserializer.Deserialize(reader);
var studentBookings = (Bookings) xml;
Model of Bookings
public class Bookings
{
[XmlElement("row")]
public List<Booking> StudentBookings { get; set; }
}
Thank you

XmlSerializer doesn't allow for dynamic root names out of the box. The expected root name is fixed, and is defined by the type name, the presence of an [XmlRoot] or [XmlType] attribute, and possibly the XmlSerializer constructor used.
Thus an easy solution is to pre-load the XML into an XDocument, modify the root name, and then deserialize:
XDocument doc;
using (var xmlReader = XmlReader.Create(tempFile))
{
doc = XDocument.Load(xmlReader);
}
var rootName = doc.Root.Name;
doc.Root.Name = "Bookings";
var deserializer = new XmlSerializer(typeof(Bookings));
Bookings bookings;
using (var xmlReader = doc.CreateReader())
{
bookings = (Bookings)deserializer.Deserialize(xmlReader);
}
bookings.RootName = rootName.LocalName;
Note the use of CreateReader() to deserialize directly from the loaded XML.
Another option would be to create a custom XmlSerializer for each file using new XmlSerializer(typeof(Bookings), new XmlRootAttribute { ElementName = rootName }), however there are two issues with this:
Each serializer so created must be cached in a hash table to prevent a severe memory leak (loading of duplicate dynamically created assemblies), as is explained in this answer and also the documentation.
But even if you do this, if you are reading many different files with many different root names, your memory use will grow rapidly and continually due to constant loading of new unique dynamically created XmlSerializer assemblies.
However, if you only had to load one single file, you could do something like this:
Bookings bookings = null;
using (var xmlReader = XmlReader.Create(tempFile))
{
while (xmlReader.Read())
{
if (xmlReader.NodeType == XmlNodeType.Element)
{
var rootName = xmlReader.LocalName;
var deserializer = new XmlSerializer(typeof(Bookings), new XmlRootAttribute { ElementName = rootName });
bookings = (Bookings)deserializer.Deserialize(xmlReader);
bookings.RootName = rootName;
break;
}
}
}
If both solutions require too much memory, you could consider creating a subclassed XmlTextReader that allows for the the root element to be renamed on the fly while reading.
In both cases I modified Bookings to look as follows:
public class Bookings
{
[XmlIgnore]
public string RootName { get; set; }
[XmlAttribute("company")]
public string Company { get; set; }
[XmlElement("row")]
public List<Booking> StudentBookings { get; set; }
}

How To Track Down Deserialize XML to Object Error in XSD?

This is a sample of my xml file:
<IFX xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="finalizacaoOrcamentoVO">
<dadosOrcamento>...</dadosOrcamento>
<faturamento>...</faturamento>
</IFX>
This is my auto-generated by Visual Studio object class:
/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
[System.Xml.Serialization.XmlRootAttribute(Namespace = "", IsNullable = false)]
public partial class IFX
{
private IFXDadosOrcamento dadosOrcamentoField;
private IFXFaturamento faturamentoField;
But I've been getting this error every time I try to deserialize:
Message "Error document XML (1, 57)." string
This is my deserialize method:
IFX document;
XmlSerializer serializer = new XmlSerializer(typeof(object));
using (var reader = XmlReader.Create(file.InputStream))
{
document = (IFX)serializer.Deserialize(reader);
}
Any hint on what should be fixed?
Thanks in advance!
These are my subclasses:
ClassObject

The xsi:type attribute, short for {http://www.w3.org/2001/XMLSchema-instance}type, is a w3c standard attribute that is used to explicitly assert the type of its element. As explained in Xsi:type Attribute Binding Support, XmlSerializer interprets this to mean that the element is serialized from a polymorphic derived type of the expected type.
I.e. if your XML looks like:
<IFX xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="finalizacaoOrcamentoVO">
<dadosOrcamento>
<IFXDadosOrcamentoValue>A IFXDadosOrcamentoValue</IFXDadosOrcamentoValue>
</dadosOrcamento>
</IFX>
Then XmlSerializer expects that the following classes will exist:
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
[System.Xml.Serialization.XmlRootAttribute(Namespace = "", IsNullable = false)]
[XmlInclude(typeof(finalizacaoOrcamentoVO))]
public partial class IFX
{
private IFXDadosOrcamento dadosOrcamentoField;
public IFXDadosOrcamento dadosOrcamento { get { return dadosOrcamentoField; } set { dadosOrcamentoField = value; } }
}
public class finalizacaoOrcamentoVO : IFX
{
// This derived type may have some or all of the properties shown as elements in the XML file.
}
public class IFXDadosOrcamento
{
public string IFXDadosOrcamentoValue { get; set; }
}
Where finalizacaoOrcamentoVO is a type that inherits from IFX.
Note the presence of [XmlInclude(typeof(finalizacaoOrcamentoVO))]. This attribute informs the serializer of the subtypes that might be encountered and must be present for every allowed subtype.
Having done this, the XML can now be deserialized via:
IFX document;
XmlSerializer serializer = new XmlSerializer(typeof(IFX));
using (var reader = XmlReader.Create(inputStream))
{
document = (IFX)serializer.Deserialize(reader);
}
An instance of finalizacaoOrcamentoVO will actually get created thereby.
That being said, in your comment you state that your auto-generated classes do not include a derived type finalizacaoOrcamentoVO. If you generated the classes from an XSD, then the XSD and XML do not match and you should get that resolved. If you generated the classes by pasting this very XML into Visual Studio, then there might be a bug or limitation in Visual Studio's code generation, which can be fixed manually as shown above.
If you really want to ignore the xsi:type attribute, then you will have to do so manually, since support for it is built-in to the serializer. One way is to load into an intermediate XDocument:
XDocument xDoc;
using (var reader = XmlReader.Create(inputStream))
{
xDoc = XDocument.Load(reader);
}
var attr = xDoc.Root.Attribute("{http://www.w3.org/2001/XMLSchema-instance}type");
if (attr != null)
attr.Remove();
var document = xDoc.Deserialize<IFX>();
Using the extension method:
public static class XObjectExtensions
{
public static T Deserialize<T>(this XContainer element, XmlSerializer serializer = null)
{
if (element == null)
throw new ArgumentNullException();
using (var reader = element.CreateReader())
return (T)(serializer ?? new XmlSerializer(typeof(T))).Deserialize(reader);
}
}

ServiceStack Parse Xml Request to Dictionary

One of our requirements is to have a decoupled architecture where we need to map data from one system to another, and the intermediate mapping is handled by a ServiceStack service request. Our issue is that a vendor can only provide data via Xml that does not conform to the standard dictionary request that ServiceStack offers like below:
<Lead xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
<d2p1:KeyValueOfstringstring>
<d2p1:Key>String</d2p1:Key>
<d2p1:Value>String</d2p1:Value>
</d2p1:KeyValueOfstringstring>
</Lead>
Instead they need a mechanism like:
<Lead>
<LeadId>Value</LeadId>
<FirstName>First Name</FirstName>
<LastName>Last Name</LastName>
...
</Lead>
Since the nodes in their xml request may change over time, and we're simply acting as a middle-man, is there a native way to accept a dynamic request or handle this as Dictionary with data similar to what's below?
Dictionary<string, string>
{
{ "LeadId", "Value" },
{ "FirstName", "First Name" },
{ "LastName", "Last Name" }
...
};

The default XML Serialization doesn't provide any way that you could transparently infer an XML fragment into a string dictionary so you're going to need to manually parse the XML which you can do in ServieStack by telling ServiceStack to skip built-in Serialization by implementing IRequiresRequestStream which ServiceStack will inject with the Request Stream so you can deserialize it yourself, e.g:
public class Lead : IRequiresRequestStream
{
public Stream RequestStream { get; set; }
}
In your Service you'd then manually parse the raw XML and convert it to the data collections you want, e.g:
public class RawServices : Service
{
public object Any(Lead request)
{
var xml = request.RequestStream.ReadFully().FromUtf8Bytes();
var map = new Dictionary<string, string>();
var rootEl = (XElement)XDocument.Parse(xml).FirstNode;
foreach (var node in rootEl.Nodes())
{
var el = node as XElement;
if (el == null) continue;
map[el.Name.LocalName] = el.Value;
}
return new LeadResponse {
Results = map
}
}
}

Problem with C# XmlSerialization

I have xml file:
<?xml version="1.0" encoding="utf-8"?>
<LabelTypesCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance="xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<LabelTypes>
<LabelType>
<Name>LabelTypeProduct</Name>
</LabelType>
<LabelType>
<Name>LabelTypeClient</Name>
</LabelType>
</LabelTypes>
</LabelTypesCollection>
And 2 c# classes:
[Serializable]
[XmlRoot("LabelTypesCollection")]
public class LabelTypesCollection
{
private static string _labelTypesCollectionPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData), Path.Combine(Program.ProgramName, "LabelTypesCollection.xml"));
[XmlArray("LabelTypes", ElementName="LabelType")]
public List<LabelType> LabelTypes { get; set; }
public static LabelTypesCollection LoadAllLabelTypes()
{
FileInfo fi = new FileInfo(_labelTypesCollectionPath);
if (!fi.Exists)
{
Logger.WriteLog("Could not find size_types_collection.xml file.", new Exception("Could not find size_types_collection.xml file."));
return new LabelTypesCollection();
}
try
{
using (FileStream fs = fi.OpenRead())
{
XmlSerializer serializer = new XmlSerializer(typeof(LabelTypesCollection));
LabelTypesCollection labelTypesCollection = (LabelTypesCollection)serializer.Deserialize(fs);
return labelTypesCollection;
}
}
catch (Exception ex)
{
Logger.WriteLog("Error during loading LabelTypesCollection", ex);
return null;
}
}
}
[Serializable]
public class LabelType
{
[XmlElement("Name")]
public string Name { get; set; }
[XmlIgnore]
public string TranslatedName
{
get
{
string translated = Common.Resources.GetValue(Name);
return (translated == null) ? Name : translated;
}
}
}
And when I call:
LabelTypesCollection.LoadAllLabelTypes();
I get LabelTypeCollection object with empty LabelTypes list. There is no error or anything. Could anyone point me to the problem?

Change this
[XmlArray("LabelTypes", ElementName="LabelType")]
to this
[XmlArray]
The ElementName of an XmlArrayAttribute specifies the element name of the container, and is actually what you specify in the first parameter to the ctor! So the ctor you have says "this class serializes as a container named LabelTypes; no wait actually I want the container to be named LabelType". The named parameter is overwriting what the first unnamed parameter says.
And in fact, since you want the container element to be named LabelTypes, which is what the member is actually called, you don't need to specify it at all.
You may have been thinking of XmlArrayItemAttribute, which controls what the individual members of a serialized collection are named - but you don't need that here either.
My usual approach for working out xml serializer stuff is to build objects manually then look at the xml they serialize to. In this case, using the code you currently have produces xml like this:
<?xml version="1.0" encoding="utf-16"?>
<LabelTypesCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<LabelType>
<LabelType>
<Name>one</Name>
</LabelType>
<LabelType>
<Name>two</Name>
</LabelType>
</LabelType>
</LabelTypesCollection>
which is what tipped me off to the incorrect LabelType specifier.
Note that you also don't need the XmlRoot on LabelTypesCollection, or the XmlElement on Name, since you are just specifying what the xml serializer will come up with anyway.

Here's a suggestion.
Write a small test program that creates an instance of LabelTypesCollection, and adds some LabelType objects into it.
Then use an XmlSerializer to write the object to a file, and look at the Xml you get, to ensure that your input Xml is in the correct schema.
Maybe there's something wrong with one of your Xml elements.

I really think you get an empty list because your code can't find the xml file. Also try instantiating your list. If you have the xml path correctly.
public List<LabelType> LabelTypes = new List<LabelType>();

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

How to do a polymorphic deserialization in C# given a XSD? - c#

Related

How to deserialize two different XML types in to one class

<xmlns='' > was not expected. But cannot define XMLroot as it changes. C#

How To Track Down Deserialize XML to Object Error in XSD?

ServiceStack Parse Xml Request to Dictionary

Problem with C# XmlSerialization

Categories

Resources