I have a class InputConfig which contains a List<IncludeExcludeRule>:
public class InputConfig
{
// The rest of the class omitted
private List<IncludeExcludeRule> includeExcludeRules;
public List<IncludeExcludeRule> IncludeExcludeRules
{
get { return includeExcludeRules; }
set { includeExcludeRules = value; }
}
}
public class IncludeExcludeRule
{
// Other members omitted
private int idx;
private string function;
public int Idx
{
get { return idx; }
set { idx = value; }
}
public string Function
{
get { return function; }
set { function = value; }
}
}
Using ...
FileStream fs = new FileStream(path, FileMode.Create);
XmlSerializer xmlSerializer = new XmlSerializer(typeof(InputConfig));
xmlSerializer.Serialize(fs, this);
fs.Close();
... and ...
StreamReader sr = new StreamReader(path);
XmlSerializer reader = new XmlSerializer(typeof(InputConfig));
InputConfig inputConfig = (InputConfig)reader.Deserialize(sr);
It works like a champ! Easy stuff, except that I need to preserve whitespace in the Function member when deserializing. The generated XML file demonstrates that the whitespace was preserved when serializing, but it is lost on deserializing.
<IncludeExcludeRules>
<IncludeExcludeRule>
<Idx>17</Idx>
<Name>LIEN</Name>
<Operation>E =</Operation>
<Function> </Function>
</IncludeExcludeRule>
</IncludeExcludeRules>
The MSDN documentation for XmlAttributeAttribute seems to address this very issue under the header Remarks, yet I don't understand how to put it to use. It provides this example:
// Set this to 'default' or 'preserve'.
[XmlAttribute("space",
Namespace = "http://www.w3.org/XML/1998/namespace")]
public string Space
Huh? Set what to 'default' or 'preserve'? I'm sure I'm close, but this just isn't making sense. I have to think there's just a single line XmlAttribute to insert in the class before the member to preserve whitespace on deserialize.
There are many instances of similar questions here and elsewhere, but they all seem to involve the use of XmlReader and XmlDocument, or mucking about with individual nodes and such. I'd like to avoid that depth.
To preserve all whitespace during XML deserialization, simply create and use an XmlReader:
StreamReader sr = new StreamReader(path);
XmlReader xr = XmlReader.Create(sr);
XmlSerializer reader = new XmlSerializer(typeof(InputConfig));
InputConfig inputConfig = (InputConfig)reader.Deserialize(xr);
Unlike XmlSerializer.Deserialize(XmlReader), XmlSerializer.Deserialize(TextReader) preserves only significant whitespace marked by the xml:space="preserve" attribute.
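As a minimal sketch of the difference, the following compares the two overloads on a whitespace-only element. The Rule and WhitespaceDemo names are hypothetical stand-ins for the question's types, and the XML is an inline string rather than a file:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for IncludeExcludeRule, reduced to the one member of interest.
public class Rule
{
    public string Function { get; set; }
}

public static class WhitespaceDemo
{
    // The Function element contains three spaces and nothing else.
    const string Xml = "<Rule><Function>   </Function></Rule>";

    // Deserializes through the TextReader overload.
    public static Rule ReadViaTextReader()
    {
        var serializer = new XmlSerializer(typeof(Rule));
        using (var sr = new StringReader(Xml))
            return (Rule)serializer.Deserialize(sr);
    }

    // Deserializes through an XmlReader created with default settings.
    public static Rule ReadViaXmlReader()
    {
        var serializer = new XmlSerializer(typeof(Rule));
        using (var sr = new StringReader(Xml))
        using (var xr = XmlReader.Create(sr))
            return (Rule)serializer.Deserialize(xr);
    }

    public static void Main()
    {
        // Brackets make any surviving whitespace visible in the console.
        Console.WriteLine("TextReader: [{0}]", ReadViaTextReader().Function);
        Console.WriteLine("XmlReader:  [{0}]", ReadViaXmlReader().Function);
    }
}
```

If the behavior described above holds on your framework version, only the XmlReader line should show the three spaces between the brackets.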
The cryptic documentation means that you need to specify an additional field with the [XmlAttribute("space", Namespace = "http://www.w3.org/XML/1998/namespace")] whose value is default or preserve. XmlAttribute controls the name of the generated attribute for a field or property. The attribute's value is the field's value.
For example, this class:
public class Group
{
[XmlAttribute (Namespace = "http://www.cpandl.com")]
public string GroupName;
[XmlAttribute(DataType = "base64Binary")]
public Byte [] GroupNumber;
[XmlAttribute(DataType = "date", AttributeName = "CreationDate")]
public DateTime Today;
[XmlAttribute("space", Namespace = "http://www.w3.org/XML/1998/namespace")]
public string Space ="preserve";
}
Will be serialized to:
<?xml version="1.0" encoding="utf-16"?>
<Group xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
d1p1:GroupName=".NET"
GroupNumber="ZDI="
CreationDate="2001-01-10"
xml:space="preserve"
xmlns:d1p1="http://www.cpandl.com" />
I believe the part you are missing is to add the xml:space="preserve" to the field, e.g.:
<Function xml:space="preserve"> </Function>
For more details, here is the relevant section in the XML Specification
With annotation in the class definition, according to the MSDN blog it should be:
[XmlAttribute("space=preserve")]
but I remember it being
[XmlAttribute("xml:space=preserve")]
Michael Liu's answer above worked for me, but with one caveat. I would have commented on his answer, but my "reputation" is not adequate enough.
I found that using XmlReader did not fully fix the issue, and the reason for this is that the .net property in question had the attribute:
XmlText(DataType="normalizedString")
To rectify this I found that adding the additional attribute worked:
[XmlAttribute("xml:space=preserve")]
Obviously, if you have no control over the .net class then you have a problem.
When setting the InnerXml of an XmlElement having a default namespace, all tags without explicit namespaces are parsed as if they have xmlns="", instead of inheriting that XmlElement's default namespace (which is what happens when parsing a real XML document).
My question is: how to parse a complex XML string as a document fragment and assign it to an XmlElement, and inheriting the target XmlElement's namespace prefixes, default namespace, etc. when parsing that string?
Disclaimer:
I am totally aware of what XML namespaces are and what is the exact behavior of XmlElement.InnerXml regarding to XML namespaces. I'm not asking why XmlElement.InnerXml is doing what it currently does, or whether such behavior is good or bad. I'm asking if I can change this behavior or use some other techniques to achieve what I've described above.
I'm implementing some kind of XML template system, which allows users to insert some rather complex XML strings as fragments into another XML document. It will be insane to require users to always use explicit namespaces (the overhead of writing redundant namespace declarations can easily defeat the benefit of templating). I want a method to parse them and insert the resulting fragments into the main document as if they are literally copy-and-pasted into the target.
I'm aware that it is possible to preserve the default namespaces with pure DOM operations (like XmlDocument.CreateElement), but I don't want to manually implement an XML parser and convert XML strings into DOM operations.
I also don't want to do "serialize the whole XML document, do string manipulation, and parse it back" kind of things.
Is it possible?
As mentioned in this comment by Martin Honnen as well as this answer by Gideon Engelberth to Can I use predefined namespaces when loading an XDocument?, you can use an XmlParserContext to predefine a value for a namespace (including the default namespace) when parsing XML via XmlReader. Using an XmlReader configured with an appropriate context, you can load your inner XML directly into your XmlElement and inherit any required namespaces from its scope.
To inherit just the default namespace, create the following extension methods:
public static partial class XmlNodeExtensions
{
public static void SetInnerXmlAndInheritDefaultNamespace(this XmlElement xmlElement, string innerXml)
{
using (var textReader = new StringReader(innerXml))
xmlElement.SetInnerXmlAndInheritDefaultNamespace(textReader);
}
public static void SetInnerXmlAndInheritDefaultNamespace(this XmlElement xmlElement, TextReader innerTextReader)
{
XmlNamespaceManager mgr = new XmlNamespaceManager(new NameTable());
mgr.AddNamespace("", xmlElement.GetNamespaceOfPrefix(""));
XmlParserContext ctx = new XmlParserContext(null, mgr, null, XmlSpace.Default);
using (var reader = XmlReader.Create(innerTextReader, new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment, CloseInput = false }, ctx))
using (var writer = xmlElement.CreateNavigator().AppendChild())
{
writer.WriteNode(reader, true);
}
}
}
To inherit all namespaces (the requirements in your question aren't entirely clear), create the following:
public static partial class XmlNodeExtensions
{
public static void SetInnerXmlAndInheritNamespaces(this XmlElement xmlElement, string innerXml)
{
using (var textReader = new StringReader(innerXml))
xmlElement.SetInnerXmlAndInheritNamespaces(textReader);
}
public static void SetInnerXmlAndInheritNamespaces(this XmlElement xmlElement, TextReader innerTextReader)
{
XmlNamespaceManager mgr = new XmlNamespaceManager(new NameTable());
var navigator = xmlElement.CreateNavigator();
foreach (var pair in navigator.GetNamespacesInScope(XmlNamespaceScope.ExcludeXml))
mgr.AddNamespace(pair.Key, pair.Value);
XmlParserContext ctx = new XmlParserContext(null, mgr, null, XmlSpace.Default);
using (var writer = navigator.AppendChild())
using (var reader = XmlReader.Create(innerTextReader, new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment, CloseInput = false }, ctx))
{
writer.WriteNode(reader, true);
}
}
}
Assuming xmlElement is the existing XmlElement to which you want to insert an innerXml string, you can do:
xmlElement.SetInnerXmlAndInheritDefaultNamespace(innerXml);
E.g. if your XML Document looks like:
<Root xmlns="defaultNameSpace" xmlns:d="dataNodeNamespace"><d:Data></d:Data></Root>
And you add the following XML to <d:Data>:
<ElementToAdd Id="10101"><InnerValue>my inner value</InnerValue></ElementToAdd><ElementToAdd Id="20202"><InnerValue>another inner value</InnerValue></ElementToAdd>
The result will be:
<Root xmlns="defaultNameSpace" xmlns:d="dataNodeNamespace">
<d:Data>
<ElementToAdd Id="10101">
<InnerValue>my inner value</InnerValue>
</ElementToAdd>
<ElementToAdd Id="20202">
<InnerValue>another inner value</InnerValue>
</ElementToAdd>
</d:Data>
</Root>
Demo fiddle here and here.
I'm having to recreate a vendor's XML file. I don't have access to their code, schema, or anything, so I'm doing this using the XmlSerializer and attributes. I'm doing it this way because the system is using a generic XmlWriter I've built to write other system XML files, so I'm killing two birds with one stone. Everything has been working out great, with the exception of one property value. The vendor XML looks like this:
<TextOutlTxt>
<p style="text-align:left;margin-top:0pt;margin-bottom:0pt;">
<span>SUBSTA SF6 CIRCUIT BKR CONC FDN "C"</span>
</p>
</TextOutlTxt>
Here's my property configuration:
private string _value;
[XmlElement("TextOutlTxt")]
public XmlNode Value
{
get
{
string text = _value;
text = Regex.Replace(text, @"[\a\b\f\n\r\t\v\\""'&<>]", m => string.Join(string.Empty, m.Value.Select(c => string.Format("&#x{0:X};", Convert.ToInt32(c))).ToArray()));
string value = "\n<p style=\"text-align:left;margin-top:0pt;margin-bottom:0pt;\">\n<span>ReplaceMe</span>\n</p>\n";
XmlDocument document = new XmlDocument();
document.InnerXml = "<root>" + value + "</root>";
XmlNode innerNode = document.DocumentElement.FirstChild;
innerNode.InnerText = text;
return innerNode;
}
set
{ }
}
And this gives me:
<TextOutlTxt>
<p style="text-align:left;margin-top:0pt;margin-bottom:0pt;" xmlns="">SUBSTA SF6 CIRCUIT BKR CONC FDN "C"</p>
</TextOutlTxt>
So I'm close, but no cigar. There is an unwanted xmlns="..." attribute; it must not be present. In my XmlWriter, I have done the following to remove the namespace unless found atop the object it is serializing:
protected override void OnWrite<T>(T sourceData, Stream outputStream)
{
IKnownTypesLocator knownTypesLocator = KnownTypesLocator.Instance;
//Let's see if we can get the default namespace
XmlRootAttribute xmlRootAttribute = sourceData.GetType().GetCustomAttributes<XmlRootAttribute>().FirstOrDefault();
XmlSerializer serializer = null;
if (xmlRootAttribute != null)
{
string nameSpace = xmlRootAttribute.Namespace ?? string.Empty;
XmlSerializerNamespaces nameSpaces = new XmlSerializerNamespaces();
nameSpaces.Add(string.Empty, nameSpace);
serializer = new XmlSerializer(typeof(T), new XmlAttributeOverrides(), knownTypesLocator.XmlItems.ToArray(), xmlRootAttribute, nameSpace);
//Now we can serialize
using (StreamWriter writer = new StreamWriter(outputStream))
{
serializer.Serialize(writer, sourceData, nameSpaces);
}
}
else
{
serializer = new XmlSerializer(typeof(T), knownTypesLocator.XmlItems.ToArray());
//Now we can serialize
using (StreamWriter writer = new StreamWriter(outputStream))
{
serializer.Serialize(writer, sourceData);
}
}
}
I'm sure I'm overlooking something. Any help would be greatly appreciated!
UPDATE 9/26/2017
So... I've been asked to provide more detail, specifically an explanation of the purpose of my code, and a reproducible example. So here's both:
The purpose for the XML. I am writing an interface UI between two systems. I read data from one, give users options to massage the data, and then give them the ability to export the data into files the second system can import. It's regarding a bill of material system where system one is the CAD drawings and the objects in those drawings, and system two is an enterprise estimating system that is also being configured to support electronic bills of material. I was given the XMLs from the vendor to recreate.
Fully functional example code.... I've tried generalizing the code in a reproducible form.
[XmlRoot("OutlTxt", Namespace = "http://www.mynamespace/09262017")]
public class OutlineText
{
private string _value;
[XmlElement("TextOutlTxt")]
public XmlNode Value
{
get
{
string text = _value;
text = Regex.Replace(text, @"[\a\b\f\n\r\t\v\\""'&<>]", m => string.Join(string.Empty, m.Value.Select(c => string.Format("&#x{0:X};", Convert.ToInt32(c))).ToArray()));
string value = "\n<p style=\"text-align:left;margin-top:0pt;margin-bottom:0pt;\">\n<span>ReplaceMe</span>\n</p>\n";
XmlDocument document = new XmlDocument();
document.InnerXml = "<root>" + value + "</root>";
XmlNode innerNode = document.DocumentElement.FirstChild;
innerNode.InnerText = text;
return innerNode;
}
set
{ }
}
private OutlineText()
{ }
public OutlineText(string text)
{
_value = text;
}
}
public class XmlFileWriter
{
public void Write<T>(T sourceData, FileInfo targetFile) where T : class
{
//This is actually retrieved through a locator object, but surely no one will mind an empty
//collection for the sake of an example
Type[] knownTypes = new Type[] { };
using (FileStream targetStream = targetFile.OpenWrite())
{
//Let's see if we can get the default namespace
XmlRootAttribute xmlRootAttribute = sourceData.GetType().GetCustomAttributes<XmlRootAttribute>().FirstOrDefault();
XmlSerializer serializer = null;
if (xmlRootAttribute != null)
{
string nameSpace = xmlRootAttribute.Namespace ?? string.Empty;
XmlSerializerNamespaces nameSpaces = new XmlSerializerNamespaces();
nameSpaces.Add(string.Empty, nameSpace);
serializer = new XmlSerializer(typeof(T), new XmlAttributeOverrides(), knownTypes, xmlRootAttribute, nameSpace);
//Now we can serialize
using (StreamWriter writer = new StreamWriter(targetStream))
{
serializer.Serialize(writer, sourceData, nameSpaces);
}
}
else
{
serializer = new XmlSerializer(typeof(T), knownTypes);
//Now we can serialize
using (StreamWriter writer = new StreamWriter(targetStream))
{
serializer.Serialize(writer, sourceData);
}
}
}
}
}
public static void Main()
{
OutlineText outlineText = new OutlineText(@"SUBSTA SF6 CIRCUIT BKR CONC FDN ""C""");
XmlFileWriter fileWriter = new XmlFileWriter();
fileWriter.Write<OutlineText>(outlineText, new FileInfo(@"C:\MyDirectory\MyXml.xml"));
Console.ReadLine();
}
The result produced:
<?xml version="1.0" encoding="utf-8"?>
<OutlTxt xmlns="http://www.mynamespace/09262017">
<TextOutlTxt>
<p style="text-align:left;margin-top:0pt;margin-bottom:0pt;" xmlns="">SUBSTA SF6 CIRCUIT BKR CONC FDN "C"</p>
</TextOutlTxt>
</OutlTxt>
Edit 9/27/2017
Per the request in the solution below, a secondary issue I've run into is keeping the hexadecimal codes. To illustrate this issue based on the above example, let's say the value between is
SUBSTA SF6 CIRCUIT BKR CONC FDN "C"
The vendor file is expecting the literals to be in their hex code format like so
SUBSTA SF6 CIRCUIT BKR CONC FDN &#x22;C&#x22;
I've rearranged the sample code Value property to be like so:
private string _value;
[XmlAnyElement("TextOutlTxt", Namespace = "http://www.mynamespace/09262017")]
public XElement Value
{
get
{
string value = string.Format("<p xmlns=\"{0}\" style=\"text-align:left;margin-top:0pt;margin-bottom:0pt;\"><span>{1}</span></p>", "http://www.mynamespace/09262017", _value);
string innerXml = string.Format("<TextOutlTxt xmlns=\"{0}\">{1}</TextOutlTxt>", "http://www.mynamespace/09262017", value);
XElement element = XElement.Parse(innerXml);
//Remove redundant xmlns attributes
foreach (XElement descendant in element.DescendantsAndSelf())
{
descendant.Attributes().Where(att => att.IsNamespaceDeclaration && att.Value == "http://www.mynamespace/09262017").Remove();
}
return element;
}
set
{
_value = value == null ? null : value.ToString();
}
}
if I use the code
string text = Regex.Replace(element.Value, @"[\a\b\f\n\r\t\v\\""'&<>]", m => string.Join(string.Empty, m.Value.Select(c => string.Format("&#x{0:X};", Convert.ToInt32(c))).ToArray()));
to create the hex code values ahead of the XElement.Parse(), the XElement converts them back to their literal values. If I try to set the XElement.Value directly after the XElement.Parse() (or through SetValue()), it changes the &#x22; to &amp;#x22;. Not only that, but it seems to mess with the element output and adds additional elements, throwing it all out of whack.
Edit 9/27/2017 #2 to clarify, the original implementation had a related problem, namely that the escaped text was re-escaped. I.e. I was getting
SUBSTA SF6 CIRCUIT BKR CONC FDN &amp;#x22;C&amp;#x22;
But wanted
SUBSTA SF6 CIRCUIT BKR CONC FDN &#x22;C&#x22;
The reason you are getting xmlns="" added to your embedded XML is that your container elements <OutlTxt> and <TextOutlTxt> are declared to be in the "http://www.mynamespace/09262017" namespace through the XmlRootAttribute.Namespace property, whereas the embedded literal XML elements are in the empty namespace. To fix this, your embedded XML literal must be in the same namespace as its parent elements.
Here is the XML literal. Notice there is no xmlns="..." declaration anywhere in the XML:
<p style="text-align:left;margin-top:0pt;margin-bottom:0pt;">
<span>ReplaceMe</span>
</p>
Lacking such a declaration, the <p> element is in the empty namespace. Conversely, your OutlineText type is decorated with an [XmlRoot] attribute:
[XmlRoot("OutlTxt", Namespace = "http://www.mynamespace/09262017")]
public class OutlineText
{
}
Thus the corresponding OutlTxt root element will be in the http://www.mynamespace/09262017 namespace. All its child elements will default to this namespace as well unless overridden. Placing the embedded XmlNode in the empty namespace counts as overriding the parent namespace, and so an xmlns="" attribute is required.
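The same rule can be seen in isolation with LINQ to XML. This is only an illustrative sketch: the NamespaceDemo class and its method names are made up for the example, and the element content is a placeholder.

```csharp
using System;
using System.Xml.Linq;

public static class NamespaceDemo
{
    static readonly XNamespace Ns = "http://www.mynamespace/09262017";

    // Child <p> has no namespace, so it must opt out of the parent's default namespace.
    public static string ChildInEmptyNamespace()
    {
        var root = new XElement(Ns + "OutlTxt", new XElement("p", "text"));
        return root.ToString(SaveOptions.DisableFormatting);
    }

    // Child <p> shares the parent's namespace, so no extra declaration is needed.
    public static string ChildInParentNamespace()
    {
        var root = new XElement(Ns + "OutlTxt", new XElement(Ns + "p", "text"));
        return root.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        Console.WriteLine(ChildInEmptyNamespace());
        // <OutlTxt xmlns="http://www.mynamespace/09262017"><p xmlns="">text</p></OutlTxt>
        Console.WriteLine(ChildInParentNamespace());
        // <OutlTxt xmlns="http://www.mynamespace/09262017"><p>text</p></OutlTxt>
    }
}
```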
The simplest way to avoid this problem is for your embedded XML string literal to place itself in the correct namespace as follows:
<p xmlns="http://www.mynamespace/09262017" style="text-align:left;margin-top:0pt;margin-bottom:0pt;">
<span>ReplaceMe</span>
</p>
Then, in your Value method, strip redundant namespace declarations. This is somewhat easier to do with the LINQ to XML API:
[XmlRoot("OutlTxt", Namespace = OutlineText.Namespace)]
public class OutlineText
{
public const string Namespace = "http://www.mynamespace/09262017";
private string _value;
[XmlAnyElement("TextOutlTxt", Namespace = OutlineText.Namespace)]
public XElement Value
{
get
{
var escapedValue = EscapeTextValue(_value);
var nestedXml = string.Format("<p xmlns=\"{0}\" style=\"text-align:left;margin-top:0pt;margin-bottom:0pt;\"><span>{1}</span></p>", Namespace, escapedValue);
var outerXml = string.Format("<TextOutlTxt xmlns=\"{0}\">{1}</TextOutlTxt>", Namespace, nestedXml);
var element = XElement.Parse(outerXml);
//Remove redundant xmlns attributes
element.DescendantsAndSelf().SelectMany(e => e.Attributes()).Where(a => a.IsNamespaceDeclaration && a.Value == Namespace).Remove();
return element;
}
set
{
_value = value == null ? null : value.Value;
}
}
static string EscapeTextValue(string text)
{
return Regex.Replace(text, @"[\a\b\f\n\r\t\v\\""'&<>]", m => string.Join(string.Empty, m.Value.Select(c => string.Format("&#x{0:X};", Convert.ToInt32(c))).ToArray()));
}
private OutlineText()
{ }
public OutlineText(string text)
{
_value = text;
}
}
And the resulting XML will look like:
<OutlTxt xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.mynamespace/09262017">
<TextOutlTxt>
<p style="text-align:left;margin-top:0pt;margin-bottom:0pt;">
<span>SUBSTA SF6 CIRCUIT BKR CONC FDN "C"</span>
</p>
</TextOutlTxt>
</OutlTxt>
Note that I have changed the attribute on Value from [XmlElement] to [XmlAnyElement]. I did this because it appears your value XML might contain multiple mixed content nodes at the root level, e.g.:
Start Text <p>Middle Text</p> End Text
Using [XmlAnyElement] enables this by allowing a container node to be returned without causing an extra level of XML element nesting.
Sample working .Net fiddle.
Your question now has two requirements:
Suppress certain xmlns="..." attributes on an embedded XElement or XmlNode while serializing, AND
Force certain characters inside element text to be escaped (e.g. " => &#x22;). Even though this is not required by the XML standard, your legacy receiving system apparently needs this.
Issue #1 can be addressed as in this answer
For issue #2, however, there is no way to force certain characters to be unnecessarily escaped using XmlNode or XElement because escaping is handled at the level of XmlWriter during output. And Microsoft's built-in implementations of XmlWriter seem not to have any settings that can force certain characters that do not need to be escaped to nevertheless be escaped. You would need to try to subclass XmlWriter or XmlTextWriter (as described e.g. here and here) then intercept string values as they are written and escape quote characters as desired.
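To make the writer-level behavior concrete, here is a small sketch (the EscapingDemo class and WriteSpan helper are hypothetical names) that writes the same kind of text through WriteString(), which applies the writer's own escaping rules, and through WriteRaw(), which emits its argument unchanged:

```csharp
using System;
using System.Text;
using System.Xml;

public static class EscapingDemo
{
    // Writes <span>...</span> around the text, either escaped by the writer or raw.
    public static string WriteSpan(string text, bool raw)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("span");
            if (raw)
                writer.WriteRaw(text);    // caller is responsible for well-formedness
            else
                writer.WriteString(text); // writer escapes only what XML requires
            writer.WriteEndElement();
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Quotes are legal in element text, so the writer leaves them alone.
        Console.WriteLine(WriteSpan("\"C\" & <", false));
        // <span>"C" &amp; &lt;</span>

        // Pre-escaped text survives only when written raw.
        Console.WriteLine(WriteSpan("&#x22;C&#x22;", true));
        // <span>&#x22;C&#x22;</span>
    }
}
```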
Thus, as an alternate approach that solves both #1 and #2, you could implement IXmlSerializable and write your desired XML directly with XmlWriter.WriteRaw():
[XmlRoot("OutlTxt", Namespace = OutlineText.Namespace)]
public class OutlineText : IXmlSerializable
{
public const string Namespace = "http://www.mynamespace/09262017";
private string _value;
// For debugging purposes.
internal string InnerValue { get { return _value; } }
static string EscapeTextValue(string text)
{
return Regex.Replace(text, @"[\a\b\f\n\r\t\v\\""'&<>]", m => string.Join(string.Empty, m.Value.Select(c => string.Format("&#x{0:X};", Convert.ToInt32(c))).ToArray()));
}
private OutlineText()
{ }
public OutlineText(string text)
{
_value = text;
}
#region IXmlSerializable Members
XmlSchema IXmlSerializable.GetSchema()
{
return null;
}
void IXmlSerializable.ReadXml(XmlReader reader)
{
_value = ((XElement)XNode.ReadFrom(reader)).Value;
}
void IXmlSerializable.WriteXml(XmlWriter writer)
{
var escapedValue = EscapeTextValue(_value);
var nestedXml = string.Format("<p style=\"text-align:left;margin-top:0pt;margin-bottom:0pt;\"><span>{0}</span></p>", escapedValue);
writer.WriteRaw(nestedXml);
}
#endregion
}
And the output will be
<OutlTxt xmlns="http://www.mynamespace/09262017"><p style="text-align:left;margin-top:0pt;margin-bottom:0pt;"><span>SUBSTA SF6 CIRCUIT BKR CONC FDN &#x22;C&#x22;</span></p></OutlTxt>
Note that, if you use WriteRaw(), you can easily generate invalid XML simply by writing markup characters embedded in text values. You should be sure to add unit tests that verify that does not occur, e.g. that new OutlineText(@"<") does not cause problems. (A quick check seems to show your Regex is escaping < and > appropriately.)
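One possible shape for such a check, sketched under the assumption that the escaping logic is copied into a standalone helper (EscapeCheck and RoundTrips are hypothetical names, and the Regex callback is simplified to a single character since each match is one character long):

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class EscapeCheck
{
    // Same character class as the answer's EscapeTextValue.
    public static string EscapeTextValue(string text)
    {
        return Regex.Replace(text, @"[\a\b\f\n\r\t\v\\""'&<>]",
            m => string.Format("&#x{0:X};", (int)m.Value[0]));
    }

    // Wraps the escaped text in an element, parses it, and compares the
    // parsed value with the original text.
    public static bool RoundTrips(string text)
    {
        var xml = "<span>" + EscapeTextValue(text) + "</span>";
        return XElement.Parse(xml).Value == text;
    }

    public static void Main()
    {
        Console.WriteLine(EscapeTextValue("<"));          // &#x3C;
        Console.WriteLine(RoundTrips("a < b & \"c\""));   // True
    }
}
```

Be aware that the pattern also matches control characters such as \a, whose character references are not legal in XML 1.0, so XElement.Parse throws for those inputs instead of round-tripping them.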
New sample .Net fiddle.
I am having a weird issue I am banging my head against...
I have a class like this:
[XmlRoot("DoesntWork")]
class Root
{
[XmlElement(ElementName="WontWork", Order=1)]
public string xmlOutPropertyName
{...}
}
and I am serializing with this:
textBox1.Clear();
Root rt = new Root();
rt.xmlOutPropertyName = "[0000000001]";
XmlSerializer serializer = new XmlSerializer();
textBox1.Text = serializer.Serialize(rt);
but I always get xml that returns the names of class and property and not the name I want.
<Root>
<xmlOutPropertyName>[0000000001]</xmlOutPropertyName>
</Root>
Any idea why this is happening??
Dumb mistake, I was not paying attention and using the wrong serialization library.
I am writing a set of objects that must serialize to and from Xml, following a strict specification that I cannot change. One element in this specification can contain a mix of strings and elements in-line.
A simple example of this Xml output would be this:
<root>Leading text <tag>tag1</tag> <tag>tag2</tag></root>
Note the whitespace characters between the closing of the first tag, and the start of the second tag. Here are the objects that represents this structure:
[XmlRoot("root")]
public class Root
{
[XmlText(typeof(string))]
[XmlElement("tag", typeof(Tag))]
public List<object> Elements { get; set; }
//this is simply for the sake of example.
//gives us four objects in the elements array
public static Root Create()
{
Root root = new Root();
root.Elements.Add("Leading text ");
root.Elements.Add(new Tag() { Text = "tag1" });
root.Elements.Add(" ");
root.Elements.Add(new Tag() { Text = "tag2" });
return root;
}
public Root()
{
Elements = new List<object>();
}
}
public class Tag
{
[XmlText]
public string Text {get;set;}
}
Calling Root.Create(), and saving to a file using this method looks perfect:
public XDocument SerializeToXml(Root obj)
{
XmlSerializer serializer = new XmlSerializer(typeof(Root));
XDocument doc = new XDocument();
using (var writer = doc.CreateWriter())
{
serializer.Serialize(writer, obj);
}
return doc;
}
Serialization looks exactly like the xml structure at the beginning of this post.
Now when I want to deserialize an xml file back into a Root object, I call this:
public static Root FromFile(string file)
{
XmlSerializer serializer = new XmlSerializer(typeof(Root));
XmlReaderSettings settings = new XmlReaderSettings();
XmlReader reader = XmlTextReader.Create(file, settings);
//whitespace gone here
Root root = serializer.Deserialize(reader) as Root;
return root;
}
The problem is here. The whitespace string is eliminated. When I call Root.Create(), there are four objects in the Elements array. One of them is a space. This serializes just fine, but when deserializing, there are only 3 objects in Elements. The whitespace string gets eliminated.
Any ideas on what I'm doing wrong? I've tried using xml:space="preserve", as well as a host of XmlReader, XmlTextReader, etc. variations. Note that when I use a StringBuilder to read the XmlTextReader, the xml contains the spaces as I'd expect. Only when calling Deserialize(stream) do I lose the spaces.
Here's a link to an entire working example. It's LinqPad friendly, just copy/paste: http://pastebin.com/8MkUQviB The example opens two files, one a perfect serialized xml file, the second being a deserialized then reserialized version of the first file. Note you'll have to reference System.Xml.Serialization.
Thanks for reading this novel. I hope someone has some ideas. Thank you!
It looks like a bug. A workaround seems to be to replace all whitespace and CR/LF characters in XML text nodes with the &#x20;, &#xA; and &#xD; entities. The semantically equivalent decimal entities (&#32;, &#10;, &#13;) do not work.
<root>Leading text <tag>tag1</tag>&#x20;<tag>tag2</tag></root>
is working for me.
I'm new to C#. I'm building an application that persists an XML file with a list of elements. The structure of my XML file is as follows:
<Elements>
<Element>
<Name>Value</Name>
<Type>Value</Type>
<Color>Value</Color>
</Element>
<Element>
<Name>Value</Name>
<Type>Value</Type>
<Color>Value</Color>
</Element>
<Element>
<Name>Value</Name>
<Type>Value</Type>
<Color>Value</Color>
</Element>
</Elements>
I have < 100 of those items, and it's a single list (so I'm considering a DB solution to be overkill, even SQLite). When my application loads, I want to read this list of elements to memory. At present, after browsing the web a bit, I'm using XmlTextReader.
However, and maybe I'm using it in the wrong way, I read the data tag-by-tag, and thus expect the tags to be in a certain order (otherwise the code will be messy). What I would like to do is read complete "Element" structures and extract tags from them by name. I'm sure it's possible, but how?
To clarify, the main difference is that the way I'm using XmlTextReader today, it's not tolerant to scenarios such as wrong order of tags (e.g. Type comes before Name in a certain Element).
What's the best practice for loading such structures to memory in C#?
It's really easy to do in LINQ to XML. Are you using .NET 3.5? Here's a sample:
using System;
using System.Xml.Linq;
using System.Linq;
class Test
{
[STAThread]
static void Main()
{
XDocument document = XDocument.Load("test.xml");
var items = document.Root
.Elements("Element")
.Select(element => new {
Name = (string)element.Element("Name"),
Type = (string)element.Element("Type"),
Color = (string)element.Element("Color")})
.ToList();
foreach (var x in items)
{
Console.WriteLine(x);
}
}
}
You probably want to create your own data structure to hold each element, but you just need to change the "Select" call to use that.
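For instance, a sketch of that change might look like the following, where ElementInfo and ElementLoader are hypothetical names and the XML is parsed from an inline string for brevity. Because children are looked up by name, the order of the tags inside each Element does not matter, and a missing tag simply yields null:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical data structure holding one <Element>.
public class ElementInfo
{
    public string Name { get; set; }
    public string Type { get; set; }
    public string Color { get; set; }
}

public static class ElementLoader
{
    public static List<ElementInfo> Parse(string xml)
    {
        return XDocument.Parse(xml).Root
            .Elements("Element")
            .Select(e => new ElementInfo
            {
                // The explicit string conversion returns null for a missing child.
                Name = (string)e.Element("Name"),
                Type = (string)e.Element("Type"),
                Color = (string)e.Element("Color")
            })
            .ToList();
    }

    public static void Main()
    {
        // Type comes before Name here, and Color is absent.
        var xml = "<Elements><Element><Type>T1</Type><Name>N1</Name></Element></Elements>";
        foreach (var item in Parse(xml))
            Console.WriteLine("{0} / {1} / {2}", item.Name, item.Type, item.Color ?? "(none)");
        // N1 / T1 / (none)
    }
}
```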
Any particular reason you're not using XmlDocument?
XmlDocument myDoc = new XmlDocument();
myDoc.Load(fileName);
foreach(XmlElement elem in myDoc.SelectNodes("Elements/Element"))
{
XmlNode nodeName = elem.SelectSingleNode("Name/text()");
XmlNode nodeType = elem.SelectSingleNode("Type/text()");
XmlNode nodeColor = elem.SelectSingleNode("Color/text()");
string name = nodeName!=null ? nodeName.Value : String.Empty;
string type = nodeType!=null ? nodeType.Value : String.Empty;
string color = nodeColor!=null ? nodeColor.Value : String.Empty;
// Here you use the values for something...
}
It sounds like XDocument, and XElement might be better suited for this task. They might not have the absolute speed of XmlTextReader, but for your cases they sound like they would be appropriate and it would make dealing with fixed structures a lot easier. Parsing out elements would work like so:
XDocument xml = XDocument.Load("test.xml"); // placeholder path; load your own file here
foreach (XElement el in xml.Element("Elements").Elements("Element")) {
var name = el.Element("Name").Value;
// etc.
}
You can even get a bit fancier with Linq:
XDocument xml = XDocument.Load("test.xml"); // placeholder path; load your own file here
var collection = from el in xml.Element("Elements").Elements("Element")
select new { Name = el.Element("Name").Value,
Color = el.Element("Color").Value,
Type = el.Element("Type").Value
};
foreach (var item in collection) {
// here you can use item.Color, item.Name, etc..
}
You could use XmlSerializer class (http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.aspx)
public class Element
{
public string Name { get; set; }
public string Type { get; set; }
public string Color { get; set; }
}
class Program
{
static void Main(string[] args)
{
string xml =
@"<Elements>
<Element>
<Name>Value</Name>
<Type>Value</Type>
<Color>Value</Color>
</Element>(...)</Elements>";
XmlSerializer serializer = new XmlSerializer(typeof(Element[]), new XmlRootAttribute("Elements"));
Element[] result = (Element[])serializer.Deserialize(new StringReader(xml));
}
}
You should check out Linq2Xml, http://www.hookedonlinq.com/LINQtoXML5MinuteOverview.ashx