XmlDeserialize cdata and its siblings

I'm deserializing XML that follows a custom, inflexible schema into C# objects so that I can traverse and migrate the data within.
A brief example:
<Source>
...
<Provider>
<![CDATA[read 1]]>
<Identifier><![CDATA[read 2]]></Identifier>
<IdentificationScheme><![CDATA[read 3]]></IdentificationScheme>
</Provider>
...
</Source>
I'm looking to deserialize the Provider element together with its first CDATA value, read 1, and its sibling element values, read 2 and read 3.
Using http://xmltocsharp.azurewebsites.net/ produces the following classes:
[XmlRoot(ElementName = "Provider")]
public class Provider
{
[XmlElement(ElementName = "Identifier")]
public string Identifier { get; set; }
[XmlElement(ElementName = "IdentificationScheme")]
public string IdentificationScheme { get; set; }
}
[XmlRoot(ElementName = "Source")]
public class Source
{
[XmlElement(ElementName = "Provider")]
public Provider Provider { get; set; }
}
But it fails to account for the CDATA value; in fact, I think that when deserializing like this the value would not be reachable at all.
I think this may also depend on which XML deserializer is used. I was planning on RestSharp's (as the website already uses that library) or System.Xml.Linq.XDocument, but I'm not sure whether either can handle this scenario.
In my searches I couldn't find an example either, but Stack Overflow did suggest "<![CDATA[]]> and <ELEMENT> in a xml element", which describes precisely the same schema.
Thanks so much in advance for any help.
EDIT 1
As far as I can tell, [XmlText] is the solution required, as pointed out in Marc Gravell's answer below. However, it does not appear to work (or to be implemented) in RestSharp's XmlDeserializer; further testing would be required to confirm that.
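For the System.Xml.Linq.XDocument route mentioned above, a minimal sketch might look like the following. The ProviderReader class and its Read method are hypothetical names introduced only for illustration, and the sketch assumes a single Provider element somewhere in the document.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of the original question): reads the Provider
// element's own text plus its two child elements using LINQ to XML.
public static class ProviderReader
{
    public static (string Text, string Identifier, string Scheme) Read(string xml)
    {
        // Assumes at least one Provider element exists; First() throws otherwise.
        XElement provider = XDocument.Parse(xml).Descendants("Provider").First();

        // XCData derives from XText, so this picks up CDATA and plain text nodes
        // that are direct children of Provider (not text inside child elements).
        string text = string.Concat(
            provider.Nodes().OfType<XText>().Select(t => t.Value)).Trim();

        // The explicit string conversion returns an element's text content.
        return (text,
                (string)provider.Element("Identifier"),
                (string)provider.Element("IdentificationScheme"));
    }
}
```

The Trim() call is there because line breaks around the CDATA section may or may not be kept, depending on how the document is loaded.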

The CDATA is essentially just escaping syntax and is handled by most readers. What you are looking for is:
[XmlText]
public string WhateverThisIs { get; set; }
on the object that has raw content. By adding that to Provider, WhateverThisIs gets the value of "read 1". The other 2 properties already deserialize correctly as "read 2" and "read 3" without you having to do anything.
For reference, everything here would behave almost the same without the CDATA (there are some whitespace issues):
<Provider>
read 1
<Identifier>read 2</Identifier>
<IdentificationScheme>read 3</IdentificationScheme>
</Provider>
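Putting the pieces together, a minimal sketch with System.Xml.Serialization.XmlSerializer could look like this. The property name Text and the ProviderDemo helper are assumptions made for illustration; the value is trimmed by the caller because of the whitespace caveat mentioned above.

```csharp
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Source")]
public class Source
{
    [XmlElement("Provider")]
    public Provider Provider { get; set; }
}

public class Provider
{
    // Receives the element's own text content, whether or not it is wrapped in CDATA.
    [XmlText]
    public string Text { get; set; }

    [XmlElement("Identifier")]
    public string Identifier { get; set; }

    [XmlElement("IdentificationScheme")]
    public string IdentificationScheme { get; set; }
}

// Hypothetical helper, named here only for the example.
public static class ProviderDemo
{
    public static Source Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Source));
        using (var reader = new StringReader(xml))
        {
            return (Source)serializer.Deserialize(reader);
        }
    }
}
```

A caller would then read source.Provider.Text.Trim() alongside source.Provider.Identifier and source.Provider.IdentificationScheme.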

Related

Include raw XML when deserializing XML

I have the following XML and the classes that it is being deserialized into:
<Alerts>
<io id="1">
<name>Foo</name>
<status>Active</status>
</io>
<io id="2">
<name>Bar</name>
<status>Inactive</status>
</io>
</Alerts>
[XmlRoot("Alerts")]
[Serializable]
public class Alerts
{
[XmlElement("io")]
public List<Alert> Items { get; set; } // property name was missing; it cannot be "Alerts", the enclosing type's name
}
public class Alert
{
[XmlElement("name")]
public string Name { get; set; }
[XmlElement("status")]
public string Status { get; set; }
}
What I require is a property in my Alert class, that upon deserialization contains the XML of its node. For example, after deserializing the provided XML, I end up with a list of 2 Alert objects. I would need the first alert to have a property that contains this as a string:
<io id="1">
<name>Foo</name>
<status>Active</status>
</io>
Any ideas how I can achieve this?

I think the only way to actually accomplish this is to have a string property, and then in your XML replace the reserved characters with entity references so the content is read as text rather than parsed as XML. So your XML would look something like:
<io id="1">
<name>Foo</name>
<status>Active</status>
<innerAlert>
&lt;io id="3"&gt;&lt;name&gt;FooBar&lt;/name&gt;&lt;status&gt;Inactive&lt;/status&gt;&lt;innerAlert&gt;&lt;/innerAlert&gt;&lt;/io&gt;
</innerAlert>
</io>
I still keep going back to my comment and thinking that you might be better off adding a property of type Alert or IEnumerable<Alert> to your Alert class and letting the tree deserialize all the way down, but maybe that's just not an option for you.
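If changing the XML is not possible, a different technique from the one described above is to walk the document with LINQ to XML and deserialize each io element on its own, keeping its markup as a string. This is only a sketch: the RawXml and Id properties and the AlertLoader class are hypothetical additions, not part of the original classes.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.Serialization;

public class Alert
{
    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlElement("name")]
    public string Name { get; set; }

    [XmlElement("status")]
    public string Status { get; set; }

    // Filled in by AlertLoader, not by the serializer.
    [XmlIgnore]
    public string RawXml { get; set; }
}

public static class AlertLoader
{
    // Created once and reused; the root name is overridden to "io" so that a
    // single <io> element can be deserialized as an Alert.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(Alert), new XmlRootAttribute("io"));

    public static List<Alert> Load(string xml)
    {
        var alerts = new List<Alert>();
        foreach (XElement io in XDocument.Parse(xml).Root.Elements("io"))
        {
            using (var reader = io.CreateReader())
            {
                var alert = (Alert)Serializer.Deserialize(reader);
                // DisableFormatting writes the element on one line without indentation.
                alert.RawXml = io.ToString(SaveOptions.DisableFormatting);
                alerts.Add(alert);
            }
        }
        return alerts;
    }
}
```

The raw string is re-serialized from the parsed tree, so it matches the source in content but not necessarily in whitespace.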

Use the type's root name when serializing collection items

Just a contrived example but I have some classes that I want to control the names of when serialized.
[XmlRoot("item")]
public class ItemWrapper
{
[XmlArray("parts")]
public List<PartWrapper> Parts { get; set; }
}
[XmlRoot("part")]
public class PartWrapper
{
[XmlAttribute("name")]
public string Name { get; set; }
}
Now when I serialize an ItemWrapper that has some Parts, I was expecting each of the PartWrapper items to be serialized using the root name ("part") but it instead uses the type's name (what it would use by default if the XmlRootAttribute wasn't specified).
<item>
<parts>
<PartWrapper name="foo" />
<PartWrapper name="bar" />
</parts>
</item>
I know that I can get what I want by adding the XmlArrayItemAttribute to the Parts property to specify the name of each item, but that feels redundant. I just want to use the type's root name as I specified it.
[XmlRoot("item")]
public class Item
{
[XmlArray("parts")]
[XmlArrayItem("part", Type=typeof(PartWrapper))]
public List<PartWrapper> Parts { get; set; }
}
Is there a way that I can tell the serializer that I want to use the root name for the collection items? If so, how?
Why wouldn't it just use the root name? Any good reason why it shouldn't?
It was my understanding that the XmlRootAttribute allowed you to name the element exactly like an XmlElementAttribute does, only it had to be applied to the class. I guess I'm mistaken.

From the documentation, it seems that XmlRootAttribute is used only when that element is the root (which is why it works for "item" in your case):
Controls XML serialization of the attribute target as an XML root element
- MSDN
It seems to me like perhaps you want XmlTypeAttribute for things that are not root.
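As a sketch of that suggestion, the example below swaps XmlRoot for XmlType on PartWrapper and leaves the Parts property without an XmlArrayItem attribute. The ItemDemo helper is a hypothetical name used only to make the example self-contained.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("item")]
public class ItemWrapper
{
    [XmlArray("parts")]
    public List<PartWrapper> Parts { get; set; } = new List<PartWrapper>();
}

// XmlType sets the type's XML name, which is what array items use by default.
[XmlType("part")]
public class PartWrapper
{
    [XmlAttribute("name")]
    public string Name { get; set; }
}

public static class ItemDemo
{
    public static string Serialize(ItemWrapper item)
    {
        var serializer = new XmlSerializer(typeof(ItemWrapper));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, item);
            return writer.ToString();
        }
    }
}
```

With this change the items inside parts are written as part elements rather than PartWrapper elements. If PartWrapper also needs to be serialized on its own as a root, both attributes can be applied to the class.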

Deserialize XML Array Where Root is Array and Elements Don't Follow Conventions

The XML I am getting is provided by an outside source, so I can't easily reformat it. I would like to use XML attributes on my entities instead of writing a LINQ query that has to know how both the XML and the entity are structured. Here is an example:
<?xml version="1.0"?>
<TERMS>
<TERM>
<ID>2013-2</ID>
<DESC>Spring 2013</DESC>
</TERM>
<TERM>
<ID>2013-3</ID>
<DESC>Summer 2013 Jun&amp;Jul</DESC>
</TERM>
</TERMS>
I know that the XmlSerializer expects ArrayOfTerm instead of TERMS, for example, but that I can tweak my entity to use a different element name with the XML attributes, such as this:
public class TermData
{
[XmlArray("TERMS")]
[XmlArrayItem("TERM")]
public List<Term> terms;
}
public class Term
{
[XmlElement("ID")]
public string id;
[XmlElement("DESC")]
public string desc;
}
and I am deserializing the data like so:
TermData data;
XmlSerializer serializer = new XmlSerializer(typeof(TermData));
using (StringReader reader = new StringReader(xml))
{
data = (TermData)serializer.Deserialize(reader);
}
return View(data.terms);
The problem I am facing is that TERMS is the root and the array itself. If the XML were to have a root element that was not the array, I could edit my TermData class like so and it would deserialize correctly (already tested).
[XmlRoot("ROOT")]
public class TermData
{
[XmlArray("TERMS")]
[XmlArrayItem("TERM")]
public List<Term> terms;
}
Note that using TERMS as the XmlRoot does not work. Right now, my code is throwing
InvalidOperationException: There is an error in XML document (2,2).
InnerException: <TERMS xmlns=''> was not expected.
This would lead me to believe that the XML is not formatted correctly, but from my understanding the example I gave is perfectly valid XML.
This would all be trivial if I could edit the source XML, but there could be tons of other responses like this and I need to be able to flex for whatever I might get. What I'm trying to confirm is whether or not the XmlSerializer can support this type of XML structure. I've tested just about everything and can't get it to deserialize without editing the XML. It would also be convenient if I didn't have to define a wrapper class (TermData) to hold the list, but this seems to work only if the XML follows the serializer's naming conventions (ArrayOfTerm, etc.).

Maybe you can try this, which maps TERMS to the class itself and each TERM directly to a list item:
[XmlRoot("TERMS")]
public class TermData
{
public TermData()
{
terms = new List<Term>();
}
[XmlElement("TERM")]
public List<Term> terms{get;set;}
}
public class Term
{
[XmlElement("ID")]
public string id{get;set;}
[XmlElement("DESC")]
public string desc{get;set;}
}
Hope this helps.
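For the wrapper-free variant the question asks about, one possible sketch is to deserialize straight into a List<Term>, overriding the root name when constructing the serializer and naming the item element with XmlType. This is a variation on the answer above rather than the same technique, and TermLoader is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// XmlType makes list items match <TERM> instead of the default <Term>.
[XmlType("TERM")]
public class Term
{
    [XmlElement("ID")]
    public string Id { get; set; }

    [XmlElement("DESC")]
    public string Desc { get; set; }
}

public static class TermLoader
{
    // The root override replaces the default "ArrayOf..." root name with TERMS.
    // Serializers built with this constructor are best created once and reused.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<Term>), new XmlRootAttribute("TERMS"));

    public static List<Term> Load(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (List<Term>)Serializer.Deserialize(reader);
        }
    }
}
```

Note that the input must be well-formed: an ampersand in a description has to arrive as &amp;amp; or the reader will reject the document before any mapping takes place.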

How to deserialize with RestSharp when I have an attribute named "value"?

I am using RestSharp to deserialize an XML file where some of the nodes look like this:
<element value="something" />
The elements with an attribute called 'value' will not deserialize. Any ideas on how to get RestSharp to deserialize this?
The object used to deserialize is like:
public class Object
{
public string Value { get; set; }
}
Please note, the XML is returned from a web service so I do not have the option of changing the attribute name to something different.

Okay, I have found a solution. I think this is somewhat an edge case.
I renamed the variable
public string Value {get;set;}
to
public string value {get;set;}
And now it deserializes perfectly. I guess the uppercase Value is reserved for the text content of an XML element.

XML deserialize null elements?

I'm trying to deserialize an XML string, but I always run into problems with empty elements like these:
<Taxable />
<DefaultPurchasePrice />
My C# code snippet:
[XmlRoot(ElementName = "Product", Namespace = "http://api.test.com/version/1", IsNullable = false)]
public class Product
{
public Guid Guid { get; set; }
public string ProductName { get; set; }
public bool Taxable { get; set; }
public Decimal DefautSellPrice { get; set; }
[XmlElement("DefaultPurchasePrice")]
public string DefaultPurchasePriceElement
{
get
{
if (DefaultPurchasePrice == null)
return String.Empty;
else
return DefaultPurchasePrice.ToString();
}
set
{
if (string.IsNullOrEmpty(value)) // the original non-short-circuit | dereferenced a null value
DefaultPurchasePrice = null;
else
DefaultPurchasePrice = Convert.ToDecimal(value);
}
}
[XmlIgnore]
public decimal? DefaultPurchasePrice{ get; set;}
}
It seems that the
xsi:nil="true"
attribute in the XML should solve my problem. However, the XML is provided by a REST server that we are testing as part of an API, so we have no direct control over how it is constructed, although we can give them feedback. So I think I should explicitly ask them to fix their XML, since the problem lies in their XML, right?
In the meantime, I can get individual elements deserialized with the following code:
[XmlElement("DefaultPurchasePrice")]
public string DefaultPurchasePriceElement
{
get
{
if (DefaultPurchasePrice == null)
return String.Empty;
else
return DefaultPurchasePrice.ToString();
}
set
{
if (string.IsNullOrEmpty(value)) // the original non-short-circuit | dereferenced a null value
DefaultPurchasePrice = null;
else
DefaultPurchasePrice = Convert.ToDecimal(value);
}
}
[XmlIgnore]
public decimal? DefaultPurchasePrice{ get; set;}
But there are quite a few null elements in the XML string, and again, if the other party fixed their XML, I wouldn't need to change my deserialization code at all, right?
In any case, could I do something in my code before deserialization to add a proper xsi:nil="true" attribute to the null elements, so that I can quickly fix their XML without doing much in my C# code?
I am thinking about @Ryan's solution, the second-to-last answer here: deserialize-xml-with-empty-elements-in-c, but I'm not sure whether there are better solutions.
EDIT:
I just did a small test: adding xsi:nil='true' to the null elements in the XML does indeed work with my existing C# code.
But I do need to make sure that the C# class mapped from the XML uses nullable data types for the elements that arrive with xsi:nil='true'. That makes sense: when a field coming from the XML might be null, I need to explicitly define the corresponding data type as nullable. I am much happier with that than with my current solution.

I don't know the answer to your problem, but it seems to me that asking your colleagues to fix their XML isn't the right answer. It is common wisdom when writing network and file format code to follow the robustness principle: "Be conservative in what you send, be liberal in what you accept."
That is, you should be prepared to receive just about ANYTHING in your incoming XML stream. If the XML is well-formed and contains the elements and attributes you require, you should be able to parse it correctly. If it has elements you don't permit, you should gracefully either ignore them or raise an error condition. If the XML is not well-formed, you should raise an error.
Otherwise your program won't be robust in the face of errors coming in from the other end, and could have security holes.
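The pre-processing step that the question asks about could be sketched as follows: load the XML with LINQ to XML, mark every empty leaf element with xsi:nil="true", and only then hand it to XmlSerializer. This is a simplified illustration that assumes no XML namespace on Product; the trimmed-down Product class and the NilFixer helper are hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

[XmlRoot("Product")]
public class Product
{
    public string ProductName { get; set; }

    // Nullable types are required so that xsi:nil="true" can map to null.
    [XmlElement(IsNullable = true)]
    public bool? Taxable { get; set; }

    [XmlElement(IsNullable = true)]
    public decimal? DefaultPurchasePrice { get; set; }
}

public static class NilFixer
{
    private static readonly XNamespace Xsi = "http://www.w3.org/2001/XMLSchema-instance";

    public static Product Load(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Collect first, then modify: empty leaf elements without attributes.
        var empties = doc.Descendants()
            .Where(e => !e.HasElements && !e.HasAttributes && e.Value.Length == 0)
            .ToList();
        foreach (XElement e in empties)
        {
            e.SetAttributeValue(Xsi + "nil", "true");
        }

        var serializer = new XmlSerializer(typeof(Product));
        using (var reader = doc.CreateReader())
        {
            return (Product)serializer.Deserialize(reader);
        }
    }
}
```

This treats every empty element as null, which may be too broad if an empty string is a meaningful value for some fields; in that case the filter could be narrowed to specific element names.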
