Best way to implement IXmlSerializable ReadXml() using XPath - c#

I'm implementing the ReadXml() method of IXmlSerializable and I figure using XPath is probably the nicest way to do this.
However, ReadXml() needs to handle the reader position properly.
So given that my WriteXml() produces something like this:
<ObjectName>
<SubNode>2</SubNode>
<SubNode>2</SubNode>
</ObjectName>
Is there a better way than (this terrible way) below to ensure the Reader is correctly positioned afterwards?
public override void ReadXml(System.Xml.XmlReader reader)
{
reader.Read(); /* Read Opening tag */
/* Using reader.ReadStartElement("ObjectName") reads forward a node,
i.e. current node becomes the first <SubNode>
whereas Read() doesn't even when the documentation says they both do
*/
XPathNavigator n = MakeXPathNavigator(reader.ReadSubtree());
XPathNodeIterator nodes = n.Select(".//SubNode");
while (nodes.MoveNext())
{
/* Do stuff with nodes */
_values.Add(nodes.Current.ValueAsInt);
}
reader.Skip(); /* Skip reader forward */
}
public static XPathNavigator MakeXPathNavigator(XmlReader reader)
{
try
{
return new XPathDocument(reader).CreateNavigator();
}
catch(XmlException e)
{
throw; /* Rethrow without resetting the stack trace; maybe hide/deal with exception */
}
}

I suspect you might run into some performance issues if using that approach routinely. Since your xml is relatively simple, I strongly suspect that you would do better just using XmlReader directly...
...but doing so isn't easy; IMO, it is better to avoid implementing IXmlSerializable at all (just use regular collection properties etc.). It is a common cause of bugs and frustration.
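For the simple two-level document in the question, reading with XmlReader directly might look like the sketch below. The class name `ObjectName` and the `Values` list are assumptions made for illustration; only the element names come from the question. The point is that each call leaves the reader in a known position, so no final `Skip()` is needed.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical type for illustration; the element names match the question's XML.
[XmlRoot("ObjectName")]
public class ObjectName : IXmlSerializable
{
    public List<int> Values { get; } = new List<int>();

    public XmlSchema GetSchema() => null;

    public void ReadXml(XmlReader reader)
    {
        // On entry the reader sits on the <ObjectName> start tag.
        bool isEmpty = reader.IsEmptyElement;
        reader.ReadStartElement();          // move past <ObjectName> or <ObjectName />
        if (isEmpty) return;

        // MoveToContent skips whitespace and comments between child elements.
        while (reader.MoveToContent() == XmlNodeType.Element)
        {
            if (reader.LocalName == "SubNode")
                Values.Add(reader.ReadElementContentAsInt()); // also consumes </SubNode>
            else
                reader.Skip();              // ignore elements this type does not know
        }

        reader.ReadEndElement();            // consume </ObjectName>
    }

    public void WriteXml(XmlWriter writer)
    {
        // The serializer writes the wrapping <ObjectName> element itself.
        foreach (int value in Values)
            writer.WriteElementString("SubNode", XmlConvert.ToString(value));
    }
}
```

One way to check the positioning is to embed the type as a member of another serialized class, followed by a sibling element. If the sibling deserializes correctly, the reader was left in the right place.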

Related

Serialization xml with double xml tags where order is important in c#

I want to serialize XML, which works, but I have a problem. Sometimes the same tag appears multiple times in different places. The order is important, so I can't just put the values in a list.
For example I have the following xml:
<?xml version="1.0" encoding="utf-16"?>
<commands>
<execute>some statement</execute>
<wait>5</wait>
<execute>some statement</execute>
<wait>5</wait>
<execute>some statement</execute>
<execute>some statement</execute>
</commands>
Then my object would look something like this:
[XmlRoot(ElementName="commands")]
public class Commands {
[XmlElement(ElementName="execute")]
public List<string> Execute { get; set; }
[XmlElement(ElementName="wait")]
public List<int> Wait { get; set; }
}
If I then serialize it with the following function:
var xmlSerializer = new XmlSerializer(obj.GetType());
using (var writer = new Utf8StringWriter())
{
xmlSerializer.Serialize(writer, obj);
return writer.ToString();
}
The order will not be the same: it serializes all the execute tags first and then the wait tags, but the order is important.
Does someone have a clue on how to tackle this problem?
Ps. changing the xml is not a solution as I'm tied to that....
Thanks in advance!
After searching for a while, I tackled the problem as follows. The wait and execute elements are basically commands, so I now serialize a list of commands. Of course the serializer complains that it can't serialize this, because it is a list of interfaces (ICommand), so I implemented IXmlSerializable to tell it how to serialize the list.
This actually worked quite well.
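A different route, not the one the poster took, avoids IXmlSerializable by putting both element names on a single list of a common base class. The sketch below assumes hypothetical types `Command`, `ExecuteCommand`, and `WaitCommand`; only the element names `commands`, `execute`, and `wait` come from the question. It uses an abstract class rather than an interface because XmlSerializer cannot serialize interface-typed members.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical command types; only the element names come from the question.
public abstract class Command { }

public class ExecuteCommand : Command
{
    [XmlText] public string Statement { get; set; }
}

public class WaitCommand : Command
{
    [XmlText] public int Seconds { get; set; }
}

[XmlRoot("commands")]
public class Commands
{
    // One list, two element names: the serializer picks the element name
    // from the runtime type of each item and keeps the list order.
    [XmlElement("execute", typeof(ExecuteCommand))]
    [XmlElement("wait", typeof(WaitCommand))]
    public List<Command> Items { get; set; } = new List<Command>();
}

public static class CommandsDemo
{
    public static string Serialize(Commands commands)
    {
        var serializer = new XmlSerializer(typeof(Commands));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, commands);
            return writer.ToString();
        }
    }
}
```

Adding an `ExecuteCommand`, a `WaitCommand`, and another `ExecuteCommand` to `Items` should produce the elements in that same order, and deserializing the result should rebuild the list in document order.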

Editing XML Output With LINQ

I have an XML output file from a process being run that needs the contents of various fields edited according to a collection of tables in our database. For example, what's included in
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfUserReportPreviewListDto xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<UserReportPreviewListDto>
<ExtensionData />
<Id>previewReportFieldsId</Id>
<Field0>-7</Field0>
<Field1>L</Field1>
<Field2>Lab Work Complete</Field2>
<Field3>False</Field3>
<Field4>LabWorkComplete</Field4>
<Field6>False</Field6>
</UserReportPreviewListDto>
<UserReportPreviewListDto>
<ExtensionData />
<Id>previewReportFieldsId</Id>
<Field0>-6</Field0>
<Field1>S</Field1>
<Field2>Sent to Lab</Field2>
<Field3>False</Field3>
<Field4>SentToLab</Field4>
<Field6>False</Field6>
</UserReportPreviewListDto>
<UserReportPreviewListDto>
<ExtensionData />
<Id>previewReportFieldsId</Id>
<Field0>-5</Field0>
<Field1>V</Field1>
<Field2>Void</Field2>
<Field3>False</Field3>
<Field4>Void</Field4>
<Field6>True</Field6>
<Field7>12/11/2013</Field7>
<Field9>769</Field9>
</UserReportPreviewListDto>
would need Field4 changed from LabWorkComplete (tblEnum.FieldTypeDesc) to 2 (tblEnum.FieldTypeNum).
I'm very new to using LINQ, and am not even completely sure it's the best route for this. I've created a DataSet in the project, with a DataTable populated from the database with what I need to work with. And...that's as far as I've got. Right now I'm using a massive list of tedious If statements to accomplish this, and am thinking this avenue may be more efficient than a collection of statements like this.
var xe = XElement.Load("serializer.xml");
string field4Value = xe.XPathSelectElement(@"/UserReportPreviewListDto/Field4").Value;
if (field4Value == "Incomplete")
{
xe.XPathSelectElement(@"/UserReportPreviewListDto/Field4").Value = "0";
}
else if (field4Value == "SendToLab")
{
xe.XPathSelectElement(@"/UserReportPreviewListDto/Field4").Value = "1";
}
else if (field4Value == "LabWorkComplete")
{
xe.XPathSelectElement(@"/UserReportPreviewListDto/Field4").Value = "2";
}
So that's where I am. If LINQ wouldn't be the best avenue, what would be? If it would be, what would be the best way to do it? Additionally, any particularly helpful resources along these lines that can be recommended would be appreciated; I'd much rather learn code than copy code. I'd hate to have to ask this again next week, after all.
Your XML structure is weird. Field0...Field6 is not common, there are usually meaningful names in there. You can always write a function that encapsulates string to integer string conversion, and just provide an xpath as an argument. Then go higher level, provide xpath + conversion delegate, and from this point it's as easy as one line per property. Here is an implementation example:
using System;
using System.Xml.Linq;
using System.Xml.XPath;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var xe = XElement.Load("serializer.xml");
ConvertValue(xe, @"/UserReportPreviewListDto/Field4", TranslateValueField4);
}
private static void ConvertValue(XElement xe, string xpath, TranslateValue translator)
{
string field4Value = xe.XPathSelectElement(xpath).Value;
xe.XPathSelectElement(xpath).Value = translator(field4Value);
}
private delegate string TranslateValue(string value);
private static string TranslateValueField4(string value)
{
switch (value)
{
case "Incomplete" :
return "0";
case "SendToLab" :
return "1";
case "LabWorkComplete":
return "2";
default:
throw new NotImplementedException(); //or provide handling for unknown values
}
}
}
}
You can also avoid using xpath, and just iterate using foreach:
static void Main(string[] args)
{
var doc = XDocument.Load(@"input.xml");
foreach (var xe in doc.Root.Elements("UserReportPreviewListDto"))
{
ConvertValue(xe, "Field4", TranslateValueField4);
}
//doc.Save(path);
}
private static void ConvertValue(XElement xe, string fieldName, TranslateValue translator)
{
//.Element returns null if the element is missing; you may want to handle this case
XElement field = xe.Element(fieldName);
string converted = translator(field.Value); // use the delegate that was passed in
field.SetValue(converted);
}
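Since the question says the mapping lives in a database table, a table-driven variant of the loop above may be a better fit than a switch. The sketch below assumes the lookup has already been loaded into a dictionary keyed by the description (for example `LabWorkComplete` mapped to `2`); the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class FieldTranslator
{
    // Replaces the text of <fieldName> in every record using a lookup table.
    // Values missing from the table are left unchanged.
    public static void Translate(XDocument doc, string recordName, string fieldName,
                                 IReadOnlyDictionary<string, string> lookup)
    {
        foreach (XElement record in doc.Root.Elements(recordName))
        {
            XElement field = record.Element(fieldName);   // null when the field is absent
            if (field != null && lookup.TryGetValue(field.Value, out string replacement))
                field.Value = replacement;
        }
    }
}
```

With this shape, supporting another field is one more call with a different dictionary, and new enum values only require new rows in the table rather than new code.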
I always, always, always prefer to store XML in custom classes and then work with them inside my C# environment. It makes the process of modifying it feel much more natural. Please look at my question here to see the best way to do this. It takes a bit more time, but it makes things SO much easier in the long run. You said you wanted the best route and to learn, right? ;)

Rss20FeedFormatter and RSS2 extensions

I want to use the RSS2 extensions feature to add my own non-standard elements to my RSS feed as described here:
http://cyber.law.harvard.edu/rss/rss.html#extendingRss:
However I don't think that the .Net Rss20FeedFormatter class supports this feature.
My code looks something like this:
public Rss20FeedFormatter GetRSS()
{
var feed = new SyndicationFeed(....);
feed.Items = new List<SyndicationItem>();
// add items to feed
return new Rss20FeedFormatter(feed);
}
If it doesn't support it, is there any alternative to just creating the XML element by element?
Here are my findings. It took me a while to figure it all out.
This is what you do. Your feed has to have a namespace:
XNamespace extxmlns = "http://www.yoursite.com/someurl";
feed.AttributeExtensions.Add(new XmlQualifiedName("ext", XNamespace.Xmlns.NamespaceName), extxmlns.NamespaceName);
feed.ElementExtensions.Add(new XElement(extxmlns + "link", new XAttribute("rel", "self"), new XAttribute("type", "application/rss+xml")));
return new Rss20FeedFormatter(feed, false);
Your items need to be a derived class, and you write the extended properties in WriteElementExtensions, making sure you prefix them with the namespace (you don't have to but this is what is required to make it valid RSS).
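class TNSyndicationItem : SyndicationItem
{
// The extended properties written out below
public string Abstract { get; set; }
public string Channel { get; set; }
protected override void WriteElementExtensions(XmlWriter writer, string version)
{
writer.WriteElementString("ext:abstract", this.Abstract);
writer.WriteElementString("ext:channel", this.Channel);
}
}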
The extended properties are ignored if you look at the feed in an RSS reader such as Firefox; you'll need to write code to read them as well.
The url http://www.yoursite.com/someurl doesn't have to exist, but you need it to define the namespace and make the RSS valid. Normally you'll just put a page there which says something about what the feed should look like.

How do I use XmlSerializer to handle different namespace versions?

I am using the .NET XmlSerializer class to deserialize GPX files.
There are two versions of the GPX standard:
<gpx xmlns="http://www.topografix.com/GPX/1/0"> ... </gpx>
<gpx xmlns="http://www.topografix.com/GPX/1/1"> ... </gpx>
Also, some GPX files do not specify a default namespace:
<gpx> ... </gpx>
My code needs to handle all three cases, but I can't work out how to get XmlSerializer to do it.
I am sure there must be a simple solution because this is a common scenario; for example, KML has the same issue.
I have done something similar to this a few times before, and this might be of use to you if you only have to deal with a small number of namespaces and you know them all beforehand. Create a simple inheritance hierarchy of classes, and add attributes to the different classes for the different namespaces. See the following code sample. If you run this program it gives the output:
Deserialized, type=XmlSerializerExample.GpxV1, data=1
Deserialized, type=XmlSerializerExample.GpxV2, data=2
Deserialized, type=XmlSerializerExample.Gpx, data=3
Here is the code:
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
[XmlRoot("gpx")]
public class Gpx {
[XmlElement("data")] public int Data;
}
[XmlRoot("gpx", Namespace = "http://www.topografix.com/GPX/1/0")]
public class GpxV1 : Gpx {}
[XmlRoot("gpx", Namespace = "http://www.topografix.com/GPX/1/1")]
public class GpxV2 : Gpx {}
internal class Program {
private static void Main() {
var xmlExamples = new[] {
"<gpx xmlns='http://www.topografix.com/GPX/1/0'><data>1</data></gpx>",
"<gpx xmlns='http://www.topografix.com/GPX/1/1'><data>2</data></gpx>",
"<gpx><data>3</data></gpx>",
};
var serializers = new[] {
new XmlSerializer(typeof (Gpx)),
new XmlSerializer(typeof (GpxV1)),
new XmlSerializer(typeof (GpxV2)),
};
foreach (var xml in xmlExamples) {
var textReader = new StringReader(xml);
var xmlReader = XmlReader.Create(textReader);
foreach (var serializer in serializers) {
if (serializer.CanDeserialize(xmlReader)) {
var gpx = (Gpx)serializer.Deserialize(xmlReader);
Console.WriteLine("Deserialized, type={0}, data={1}", gpx.GetType(), gpx.Data);
}
}
}
}
}
Here's the solution I came up with before the other suggestions came through:
var settings = new XmlReaderSettings();
settings.IgnoreComments = true;
settings.IgnoreProcessingInstructions = true;
settings.IgnoreWhitespace = true;
using (var reader = XmlReader.Create(filePath, settings))
{
if (reader.IsStartElement("gpx"))
{
string defaultNamespace = reader["xmlns"];
XmlSerializer serializer = new XmlSerializer(typeof(Gpx), defaultNamespace);
gpx = (Gpx)serializer.Deserialize(reader);
}
}
This example accepts any namespace, but you could easily make it filter for a specific list of known namespaces.
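Filtering for a known list could be sketched as follows. This is an illustration rather than the poster's code: the `GpxLoader` class and its `KnownNamespaces` set are assumptions, and it reads the root's namespace from `reader.NamespaceURI` instead of the `xmlns` attribute so that a prefixed root element is handled as well. The `Gpx` class mirrors the minimal one used in the earlier answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot("gpx")]
public class Gpx
{
    [XmlElement("data")] public int Data;
}

public static class GpxLoader
{
    // Hypothetical allow-list: the two GPX versions plus "no namespace".
    private static readonly HashSet<string> KnownNamespaces = new HashSet<string>
    {
        "",
        "http://www.topografix.com/GPX/1/0",
        "http://www.topografix.com/GPX/1/1",
    };

    public static Gpx Load(TextReader input)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();               // skip the XML declaration, comments, whitespace
            string ns = reader.NamespaceURI;      // "" when the root has no namespace
            if (reader.LocalName != "gpx" || !KnownNamespaces.Contains(ns))
                throw new InvalidDataException("Unexpected root element or namespace: " + ns);

            // The second argument is the default namespace applied to all elements.
            var serializer = new XmlSerializer(typeof(Gpx), ns);
            return (Gpx)serializer.Deserialize(reader);
        }
    }
}
```

Rejecting unknown namespaces up front gives a clearer error than letting the serializer fail on a root element it does not recognise.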
Oddly enough you can't solve this nicely. Have a look at the deserialize section in this troubleshooting article. Especially where it states:
Only a few error conditions lead to exceptions during the deserialization process. The most common ones are:
• The name of the root element or its namespace did not match the expected name.
...
The workaround I use for this is to set the first namespace, try/catch the deserialize operation and if it fails because of the namespace I try it with the next one. Only if all namespace options fail do I throw the error.
From a really strict point of view you can argue that this behavior is correct, since the type you deserialize to should represent a specific schema/namespace, and then it doesn't make sense that it should also be able to read data from another schema/namespace. In practice this is utterly annoying, though. File extensions rarely change when versions change, so the only way to tell whether a .gpx file is version 1.0 or 1.1 is to read the XML contents, but XmlSerializer won't deserialize it unless you tell it up front which version to expect.

What about the using construct in c#

I see this:
using (StreamWriter sw = new StreamWriter("file.txt"))
{
// d0 w0rk s0n
}
Everything I find when I search for this does not explain what it is doing, and instead gives me stuff about namespaces.
You want to check out documentation for the using statement (instead of the using directive which is about namespaces).
Basically it means that the block is transformed into a try/finally block, and sw.Dispose() gets called in the finally block (with a suitable nullity check).
You can use a using statement wherever you deal with a type implementing IDisposable - and usually you should use it for any disposable object you take responsibility for.
A few interesting bits about the syntax:
You can acquire multiple resources in one statement:
using (Stream input = File.OpenRead("input.txt"),
output = File.OpenWrite("output.txt"))
{
// Stuff
}
You don't have to assign to a variable:
// For some suitable type returning a lock token etc
using (padlock.Acquire())
{
// Stuff
}
You can nest them without braces, which is handy for avoiding indentation:
using (TextReader reader = File.OpenText("input.txt"))
using (TextWriter writer = File.CreateText("output.txt"))
{
// Stuff
}
The using construct is essentially a syntactic wrapper around automatically calling dispose on the object within the using. For example your above code roughly translates into the following
StreamWriter sw = new StreamWriter("file.txt");
try {
// do work
} finally {
if ( sw != null ) {
sw.Dispose();
}
}
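To see that expansion at work, here is a small sketch with a hypothetical `Tracked` resource that logs when it is created and disposed. It shows that `Dispose` runs even when the body throws, and that stacked `using` statements are disposed in reverse order of acquisition.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resource that records when it is disposed.
public sealed class Tracked : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Tracked(string name, List<string> log)
    {
        _name = name;
        _log = log;
        _log.Add("open " + _name);
    }

    public void Dispose() => _log.Add("dispose " + _name);
}

public static class UsingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        try
        {
            using (var outer = new Tracked("outer", log))
            using (var inner = new Tracked("inner", log))
            {
                throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

`UsingDemo.Run()` returns the entries `open outer`, `open inner`, `dispose inner`, `dispose outer`, `caught`, in that order: both finally blocks run before the catch body does.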
Your question is answered by section 8.13 of the specification.
Here you go: http://msdn.microsoft.com/en-us/library/yh598w02.aspx
Basically, it automatically calls the Dispose member of an IDisposable interface at the end of the using scope.
check this Using statement
