xml serialization and encoding - c#

i want the xml encoding to be:
<?xml version="1.0" encoding="windows-1252"?>
To generate encoding like encoding="windows-1252" I wrote this code.
var myns = OS.xmlns;
using (var stringWriter = new StringWriter())
{
var settings = new XmlWriterSettings
{
Encoding = Encoding.GetEncoding(1252),
OmitXmlDeclaration = false
};
using (var writer = XmlWriter.Create(stringWriter, settings))
{
var ns = new XmlSerializerNamespaces();
ns.Add(string.Empty, myns);
var xmlSerializer = new XmlSerializer(OS.GetType(), myns);
xmlSerializer.Serialize(writer, OS,ns);
}
xmlString= stringWriter.ToString();
}
But I am still not getting my expected encoding what am I missing? Please guide me to generate encoding like encoding="windows-1252"?. What do I need to change in my code?

As long as you output the XML directly to a String (through a StringBuilder or a StringWriter) you'll always get UTF-8 or UTF-16 encondings. This is because strings in .NET are internally represented as Unicode characters.
In order to get the proper encoding you'll have to switch to a binary output, such as a Stream.
Here's a quick example:
var settings = new XmlWriterSettings
{
Encoding = Encoding.GetEncoding(1252)
};
using (var buffer = new MemoryStream())
{
using (var writer = XmlWriter.Create(buffer, settings))
{
writer.WriteRaw("<sample></sample>");
}
buffer.Position = 0;
using (var reader = new StreamReader(buffer))
{
Console.WriteLine(reader.ReadToEnd());
Console.Read();
}
}
Related resources:
C# in Depth: Strings in C# and .NET

Related

XML Serialization changes in upgraded project (.NET 3.5 to 4.5)

Recently upgraded a 3.5 project to 4.5. There is a chunk of data that we are serializing and storing in the database, but everytime a save occurs in the upgraded project, the XML formatting has changed, throwing errors, and I can't seem to figure out the core issue. There are 2 SO questions in particular that mention encoding changes, but I've tried switching to UTF8 (in a few different ways specified in the answers on those questions), without any success - with UTF8 I just got a mess of strange characters throughout the entire file.
The main issues that I can see occurring are:
A leading ? character is added to the XML (which I've come to find out is a valid character, but we aren't handling apparently)
Child nodes aren't being included with some of the nodes.
Here is our serialization method:
public static string SerializeXml<T>(T instance)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
MemoryStream memStream = new MemoryStream();
XmlTextWriter xmlWriter = new XmlTextWriter(memStream, Encoding.Unicode);
serializer.Serialize(xmlWriter, instance);
memStream = (MemoryStream)xmlWriter.BaseStream;
return UnicodeEncoding.Unicode.GetString(memStream.ToArray()).Replace("<?xml version=\"1.0\" encoding=\"utf-16\"?>", "");
}
and our deserialization method:
public static T DeserializeXml<T>(string xml)
{
XmlSerializer xs = new XmlSerializer(typeof(T));
StringReader reader = new StringReader(xml);
return (T)xs.Deserialize(reader);
}
Any help would be appreciated, I am not too familiar with serialization or encoding. Just curious what may have changed with the upgrade to 4.5, or if there is something I need to take a closer look at.
If you want to Serialize to a String you need to use UTF16. If you want to Serialize with UTF8 you need to serialize to a byte[]. Strings in C# are UTF16 so in the code you posted I believe all your data is encoded with UTF16 but because you are omitting the Xml Declaration code assumes it is UTF8.
I would recommend using functions like this and not omitting the XmlDeclaration:
public static string SerializeXmlToString<T>(T instance)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.Unicode;
StringBuilder builder = new StringBuilder();
using (StringWriter writer = new StringWriter(builder))
using (XmlWriter xmlWriter = XmlWriter.Create(writer, settings))
{
serializer.Serialize(xmlWriter, instance);
}
return builder.ToString();
}
public static byte[] SerializeXml<T>(T instance)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
using (MemoryStream memStream = new MemoryStream())
{
using (XmlWriter xmlWriter = XmlWriter.Create(memStream, settings))
{
serializer.Serialize(xmlWriter, instance);
}
return memStream.ToArray();
}
}
public static T DeserializeXml<T>(string data)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
using (StringReader reader = new StringReader(data))
{
return (T)serializer.Deserialize(reader);
}
}
public static T DeserializeXml<T>(byte[] bytes)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
using(MemoryStream stream = new MemoryStream(bytes))
{
return (T)serializer.Deserialize(stream);
}
}

Round trip XML serializing with .Net DataContractSerializer fails

I seem to be getting some junk at the head of my serialized XML string. I have a simple extension method
public static string ToXML(this object This)
{
DataContractSerializer ser = new DataContractSerializer(This.GetType());
var settings = new XmlWriterSettings { Indent = true };
using (MemoryStream ms = new MemoryStream())
using (var w = XmlWriter.Create(ms, settings))
{
ser.WriteObject(w, This);
w.Flush();
return UTF8Encoding.Default.GetString(ms.ToArray());
}
}
and when I apply it to my object I get the string
<?xml version="1.0" encoding="utf-8"?>
<RootModelType xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/WeinCad.Data">
<MoineauPump xmlns:d2p1="http://schemas.datacontract.org/2004/07/Weingartner.Numerics">
<d2p1:Rotor>
<d2p1:Equidistance>0.0025</d2p1:Equidistance>
<d2p1:Lobes>2</d2p1:Lobes>
<d2p1:MajorRadius>0.04</d2p1:MajorRadius>
<d2p1:MinorRadius>0.03</d2p1:MinorRadius>
</d2p1:Rotor>
</MoineauPump>
</RootModelType>
Note the junk at the beginning. When I try to deserialize this
I get an error. If I copy paste the XML into my source minus
the junk prefix I can deserialize it. What is the junk text
and how can I remove it or handle it?
Note my deserialization code is
public static RootModelType Load(Stream data)
{
DataContractSerializer ser = new DataContractSerializer(typeof(RootModelType));
return (RootModelType)ser.ReadObject(data);
}
public static RootModelType Load(string data)
{
using(var stream = new MemoryStream(Encoding.UTF8.GetBytes(data))){
return Load(stream);
}
}
This fix seems to work
public static string ToXML(this object obj)
{
var settings = new XmlWriterSettings { Indent = true };
using (MemoryStream memoryStream = new MemoryStream())
using (StreamReader reader = new StreamReader(memoryStream))
using(XmlWriter writer = XmlWriter.Create(memoryStream, settings))
{
DataContractSerializer serializer =
new DataContractSerializer(obj.GetType());
serializer.WriteObject(writer, obj);
writer.Flush();
memoryStream.Position = 0;
return reader.ReadToEnd();
}
}

Indentation of XML not working as expected

I am trying to create an XML file using string data. (Which is itself in XML format.) But the main problem is that the XML that I have created is not properly formatted. I have used XmlWriterSettings to format the XML, but it does not seem to be working. Can anyone tell me what is wrong with this code.
string unformattedXml = #"<datas><data1>sampledata1</data1><datas>";
XmlWriterSettings xmlSettingsWithIndentation = new XmlWriterSettings { Indent = true};
using (XmlWriter writer = XmlWriter.Create(Console.Out, xmlSettingsWithIndentation))
{
writer.WriteRaw(unformattedXml);
}
Actually when I load this string in an XmlDocument and then saves it as a file, it was formatted. I just wanted to know why it was not working with XmlWriter.
You help would be much appreciated.
Thanks,
Alex
To ignore white space, try this:
private static string FormatXml(string unformattedXml)
{
//First read the xml, ignoring whitespace.
var readeroptions = new XmlReaderSettings { IgnoreWhitespace = true };
var reader = XmlReader.Create(new StringReader(unformattedXml), readeroptions);
//Then write it out with indentation.
var sb = new StringBuilder();
var xmlSettingsWithIndentation = new XmlWriterSettings { Indent = true };
using (var writer = XmlWriter.Create(sb, xmlSettingsWithIndentation))
writer.WriteNode(reader, true);
return sb.ToString();
}
It should work if you use a XmlReader instead of a raw string.
(I expect it is a typo when your last XML element is not closed property, and that by formatting you refer to correct indentation):
class Program
{
static void Main(string[] args)
{
string unformattedXml = #"<datas><data1>sampledata1</data1></datas>";
var rdr = XmlReader.Create(new StringReader(unformattedXml));
var sb = new StringBuilder();
var xmlSettingsWithIndentation = new XmlWriterSettings
{
Indent = true
};
using (var writer = XmlWriter.Create(sb, xmlSettingsWithIndentation))
writer.WriteNode(rdr, true);
Console.WriteLine(sb);
Console.ReadKey();
}
}
It outputs:
<?xml version="1.0" encoding="utf-16"?>
<datas>
<data1>sampledata1</data1>
</datas>
Please cf. similar questions:
XmlWriter.WriteRaw indentation
XML indenting when injecting an XML string into an XmlWriter

Format XML string to print friendly XML string

I have an XML string as such:
<?xml version='1.0'?><response><error code='1'> Success</error></response>
There are no lines between one element and another, and thus is very difficult to read. I want a function that formats the above string:
<?xml version='1.0'?>
<response>
<error code='1'> Success</error>
</response>
Without resorting to manually write the format function myself, is there any .Net library or code snippet that I can use offhand?
You will have to parse the content somehow ... I find using LINQ the most easy way to do it. Again, it all depends on your exact scenario. Here's a working example using LINQ to format an input XML string.
string FormatXml(string xml)
{
try
{
XDocument doc = XDocument.Parse(xml);
return doc.ToString();
}
catch (Exception)
{
// Handle and throw if fatal exception here; don't just ignore them
return xml;
}
}
[using statements are ommitted for brevity]
Use XmlTextWriter...
public static string PrintXML(string xml)
{
string result = "";
MemoryStream mStream = new MemoryStream();
XmlTextWriter writer = new XmlTextWriter(mStream, Encoding.Unicode);
XmlDocument document = new XmlDocument();
try
{
// Load the XmlDocument with the XML.
document.LoadXml(xml);
writer.Formatting = Formatting.Indented;
// Write the XML into a formatting XmlTextWriter
document.WriteContentTo(writer);
writer.Flush();
mStream.Flush();
// Have to rewind the MemoryStream in order to read
// its contents.
mStream.Position = 0;
// Read MemoryStream contents into a StreamReader.
StreamReader sReader = new StreamReader(mStream);
// Extract the text from the StreamReader.
string formattedXml = sReader.ReadToEnd();
result = formattedXml;
}
catch (XmlException)
{
// Handle the exception
}
mStream.Close();
writer.Close();
return result;
}
This one, from kristopherjohnson is heaps better:
It doesn't require an XML document header either.
Has clearer exceptions
Adds extra behaviour options: OmitXmlDeclaration = true, NewLineOnAttributes = true
Less lines of code
static string PrettyXml(string xml)
{
var stringBuilder = new StringBuilder();
var element = XElement.Parse(xml);
var settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
settings.NewLineOnAttributes = true;
using (var xmlWriter = XmlWriter.Create(stringBuilder, settings))
{
element.Save(xmlWriter);
}
return stringBuilder.ToString();
}
The simple solution that is working for me:
XmlDocument xmlDoc = new XmlDocument();
StringWriter sw = new StringWriter();
xmlDoc.LoadXml(rawStringXML);
xmlDoc.Save(sw);
String formattedXml = sw.ToString();
Check the following link: How to pretty-print XML (Unfortunately, the link now returns 404 :()
The method in the link takes an XML string as an argument and returns a well-formed (indented) XML string.
I just copied the sample code from the link to make this answer more comprehensive and convenient.
public static String PrettyPrint(String XML)
{
String Result = "";
MemoryStream MS = new MemoryStream();
XmlTextWriter W = new XmlTextWriter(MS, Encoding.Unicode);
XmlDocument D = new XmlDocument();
try
{
// Load the XmlDocument with the XML.
D.LoadXml(XML);
W.Formatting = Formatting.Indented;
// Write the XML into a formatting XmlTextWriter
D.WriteContentTo(W);
W.Flush();
MS.Flush();
// Have to rewind the MemoryStream in order to read
// its contents.
MS.Position = 0;
// Read MemoryStream contents into a StreamReader.
StreamReader SR = new StreamReader(MS);
// Extract the text from the StreamReader.
String FormattedXML = SR.ReadToEnd();
Result = FormattedXML;
}
catch (XmlException)
{
}
MS.Close();
W.Close();
return Result;
}
I tried:
internal static void IndentedNewWSDLString(string filePath)
{
var xml = File.ReadAllText(filePath);
XDocument doc = XDocument.Parse(xml);
File.WriteAllText(filePath, doc.ToString());
}
it is working fine as expected.
.NET 2.0 ignoring name resolving, and with proper resource-disposal, indentation, preserve-whitespace and custom encoding:
public static string Beautify(System.Xml.XmlDocument doc)
{
string strRetValue = null;
System.Text.Encoding enc = System.Text.Encoding.UTF8;
// enc = new System.Text.UTF8Encoding(false);
System.Xml.XmlWriterSettings xmlWriterSettings = new System.Xml.XmlWriterSettings();
xmlWriterSettings.Encoding = enc;
xmlWriterSettings.Indent = true;
xmlWriterSettings.IndentChars = " ";
xmlWriterSettings.NewLineChars = "\r\n";
xmlWriterSettings.NewLineHandling = System.Xml.NewLineHandling.Replace;
//xmlWriterSettings.OmitXmlDeclaration = true;
xmlWriterSettings.ConformanceLevel = System.Xml.ConformanceLevel.Document;
using (System.IO.MemoryStream ms = new System.IO.MemoryStream())
{
using (System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(ms, xmlWriterSettings))
{
doc.Save(writer);
writer.Flush();
ms.Flush();
writer.Close();
} // End Using writer
ms.Position = 0;
using (System.IO.StreamReader sr = new System.IO.StreamReader(ms, enc))
{
// Extract the text from the StreamReader.
strRetValue = sr.ReadToEnd();
sr.Close();
} // End Using sr
ms.Close();
} // End Using ms
/*
System.Text.StringBuilder sb = new System.Text.StringBuilder(); // Always yields UTF-16, no matter the set encoding
using (System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(sb, settings))
{
doc.Save(writer);
writer.Close();
} // End Using writer
strRetValue = sb.ToString();
sb.Length = 0;
sb = null;
*/
xmlWriterSettings = null;
return strRetValue;
} // End Function Beautify
Usage:
System.Xml.XmlDocument xmlDoc = new System.Xml.XmlDocument();
xmlDoc.XmlResolver = null;
xmlDoc.PreserveWhitespace = true;
xmlDoc.Load("C:\Test.svg");
string SVG = Beautify(xmlDoc);
Customizable Pretty XML output with UTF-8 XML declaration
The following class definition gives a simple method to convert an input XML string into formatted output XML with the xml declaration as UTF-8. It supports all the configuration options that the XmlWriterSettings class offers.
using System;
using System.Text;
using System.Xml;
using System.IO;
namespace CJBS.Demo
{
/// <summary>
/// Supports formatting for XML in a format that is easily human-readable.
/// </summary>
public static class PrettyXmlFormatter
{
/// <summary>
/// Generates formatted UTF-8 XML for the content in the <paramref name="doc"/>
/// </summary>
/// <param name="doc">XmlDocument for which content will be returned as a formatted string</param>
/// <returns>Formatted (indented) XML string</returns>
public static string GetPrettyXml(XmlDocument doc)
{
// Configure how XML is to be formatted
XmlWriterSettings settings = new XmlWriterSettings
{
Indent = true
, IndentChars = " "
, NewLineChars = System.Environment.NewLine
, NewLineHandling = NewLineHandling.Replace
//,NewLineOnAttributes = true
//,OmitXmlDeclaration = false
};
// Use wrapper class that supports UTF-8 encoding
StringWriterWithEncoding sw = new StringWriterWithEncoding(Encoding.UTF8);
// Output formatted XML to StringWriter
using (XmlWriter writer = XmlWriter.Create(sw, settings))
{
doc.Save(writer);
}
// Get formatted text from writer
return sw.ToString();
}
/// <summary>
/// Wrapper class around <see cref="StringWriter"/> that supports encoding.
/// Attribution: http://stackoverflow.com/a/427737/3063884
/// </summary>
private sealed class StringWriterWithEncoding : StringWriter
{
private readonly Encoding encoding;
/// <summary>
/// Creates a new <see cref="PrettyXmlFormatter"/> with the specified encoding
/// </summary>
/// <param name="encoding"></param>
public StringWriterWithEncoding(Encoding encoding)
{
this.encoding = encoding;
}
/// <summary>
/// Encoding to use when dealing with text
/// </summary>
public override Encoding Encoding
{
get { return encoding; }
}
}
}
}
Possibilities for further improvement:-
An additional method GetPrettyXml(XmlDocument doc, XmlWriterSettings settings) could be created that allows the caller to customize the output.
An additional method GetPrettyXml(String rawXml) could be added that supports parsing raw text, rather than have the client use the XmlDocument. In my case, I needed to manipulate the XML using the XmlDocument, hence I didn't add this.
Usage:
String myFormattedXml = null;
XmlDocument doc = new XmlDocument();
try
{
doc.LoadXml(myRawXmlString);
myFormattedXml = PrettyXmlFormatter.GetPrettyXml(doc);
}
catch(XmlException ex)
{
// Failed to parse XML -- use original XML as formatted XML
myFormattedXml = myRawXmlString;
}
Check the following link: Format an XML file so it looks nice in C#
// Format the XML text.
StringWriter string_writer = new StringWriter();
XmlTextWriter xml_text_writer = new XmlTextWriter(string_writer);
xml_text_writer.Formatting = Formatting.Indented;
xml_document.WriteTo(xml_text_writer);
// Display the result.
txtResult.Text = string_writer.ToString();
It is possible to pretty-print an XML string via a streaming transformation with XmlWriter.WriteNode(XmlReader, true). This method
copies everything from the reader to the writer and moves the reader to the start of the next sibling.
Define the following extension methods:
public static class XmlExtensions
{
public static string FormatXml(this string xml, bool indent = true, bool newLineOnAttributes = false, string indentChars = " ", ConformanceLevel conformanceLevel = ConformanceLevel.Document) =>
xml.FormatXml( new XmlWriterSettings { Indent = indent, NewLineOnAttributes = newLineOnAttributes, IndentChars = indentChars, ConformanceLevel = conformanceLevel });
public static string FormatXml(this string xml, XmlWriterSettings settings)
{
using (var textReader = new StringReader(xml))
using (var xmlReader = XmlReader.Create(textReader, new XmlReaderSettings { ConformanceLevel = settings.ConformanceLevel } ))
using (var textWriter = new StringWriter())
{
using (var xmlWriter = XmlWriter.Create(textWriter, settings))
xmlWriter.WriteNode(xmlReader, true);
return textWriter.ToString();
}
}
}
And now you will be able to do:
var inXml = #"<?xml version='1.0'?><response><error code='1'> Success</error></response>";
var newXml = inXml.FormatXml(indentChars : "", newLineOnAttributes : false); // Or true, if you prefer
Console.WriteLine(newXml);
Which prints
<?xml version='1.0'?>
<response>
<error code="1"> Success</error>
</response>
Notes:
Other answers load the XML into some Document Object Model such as XmlDocument or XDocument/XElement, then re-serialize the DOM with indentation enabled.
This streaming solution completely avoids the added memory overhead of a DOM.
In your question you do not add any indentation for the nested <error code='1'> Success</error> node, so I set indentChars : "". Generally an indentation of two spaces per level of nesting is customary.
Attribute delimiters will be unconditionally transformed to double-quotes if currently single-quotes. (I believe this is true of other answers as well.)
Passing conformanceLevel : ConformanceLevel.Fragment allows strings containing sequences of XML fragments to be formatted.
Other than ConformanceLevel.Fragment, the input XML string must be well-formed. If it is not, XmlReader will throw an exception.
Demo fiddle here.
if you load up the XMLDoc I'm pretty sure the .ToString() function posses an overload for this.
But is this for debugging? The reason that it is sent like that is to take up less space (i.e stripping unneccessary whitespace from the XML).
Hi why don't you just try this:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.PreserveWhitespace = false;
....
....
xmlDoc.Save(fileName);
PreserveWhitespace = false; that option can be used xml beautifier as well.

What is the simplest way to get indented XML with line breaks from XmlDocument?

When I build XML up from scratch with XmlDocument, the OuterXml property already has everything nicely indented with line breaks. However, if I call LoadXml on some very "compressed" XML (no line breaks or indention) then the output of OuterXml stays that way. So ...
What is the simplest way to get beautified XML output from an instance of XmlDocument?
Based on the other answers, I looked into XmlTextWriter and came up with the following helper method:
static public string Beautify(this XmlDocument doc)
{
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings
{
Indent = true,
IndentChars = " ",
NewLineChars = "\r\n",
NewLineHandling = NewLineHandling.Replace
};
using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
doc.Save(writer);
}
return sb.ToString();
}
It's a bit more code than I hoped for, but it works just peachy.
As adapted from Erika Ehrli's blog, this should do it:
XmlDocument doc = new XmlDocument();
doc.LoadXml("<item><name>wrench</name></item>");
// Save the document to a file and auto-indent the output.
using (XmlTextWriter writer = new XmlTextWriter("data.xml", null)) {
writer.Formatting = Formatting.Indented;
doc.Save(writer);
}
Or even easier if you have access to Linq
try
{
RequestPane.Text = System.Xml.Linq.XElement.Parse(RequestPane.Text).ToString();
}
catch (System.Xml.XmlException xex)
{
displayException("Problem with formating text in Request Pane: ", xex);
}
A shorter extension method version
public static string ToIndentedString( this XmlDocument doc )
{
var stringWriter = new StringWriter(new StringBuilder());
var xmlTextWriter = new XmlTextWriter(stringWriter) {Formatting = Formatting.Indented};
doc.Save( xmlTextWriter );
return stringWriter.ToString();
}
If the above Beautify method is being called for an XmlDocument that already contains an XmlProcessingInstruction child node the following exception is thrown:
Cannot write XML declaration.
WriteStartDocument method has already
written it.
This is my modified version of the original one to get rid of the exception:
private static string beautify(
XmlDocument doc)
{
var sb = new StringBuilder();
var settings =
new XmlWriterSettings
{
Indent = true,
IndentChars = #" ",
NewLineChars = Environment.NewLine,
NewLineHandling = NewLineHandling.Replace,
};
using (var writer = XmlWriter.Create(sb, settings))
{
if (doc.ChildNodes[0] is XmlProcessingInstruction)
{
doc.RemoveChild(doc.ChildNodes[0]);
}
doc.Save(writer);
return sb.ToString();
}
}
It works for me now, probably you would need to scan all child nodes for the XmlProcessingInstruction node, not just the first one?
Update April 2015:
Since I had another case where the encoding was wrong, I searched for how to enforce UTF-8 without BOM. I found this blog post and created a function based on it:
private static string beautify(string xml)
{
var doc = new XmlDocument();
doc.LoadXml(xml);
var settings = new XmlWriterSettings
{
Indent = true,
IndentChars = "\t",
NewLineChars = Environment.NewLine,
NewLineHandling = NewLineHandling.Replace,
Encoding = new UTF8Encoding(false)
};
using (var ms = new MemoryStream())
using (var writer = XmlWriter.Create(ms, settings))
{
doc.Save(writer);
var xmlString = Encoding.UTF8.GetString(ms.ToArray());
return xmlString;
}
}
XmlTextWriter xw = new XmlTextWriter(writer);
xw.Formatting = Formatting.Indented;
public static string FormatXml(string xml)
{
try
{
var doc = XDocument.Parse(xml);
return doc.ToString();
}
catch (Exception)
{
return xml;
}
}
A simple way is to use:
writer.WriteRaw(space_char);
Like this sample code, this code is what I used to create a tree view like structure using XMLWriter :
private void generateXML(string filename)
{
using (XmlWriter writer = XmlWriter.Create(filename))
{
writer.WriteStartDocument();
//new line
writer.WriteRaw("\n");
writer.WriteStartElement("treeitems");
//new line
writer.WriteRaw("\n");
foreach (RootItem root in roots)
{
//indent
writer.WriteRaw("\t");
writer.WriteStartElement("treeitem");
writer.WriteAttributeString("name", root.name);
writer.WriteAttributeString("uri", root.uri);
writer.WriteAttributeString("fontsize", root.fontsize);
writer.WriteAttributeString("icon", root.icon);
if (root.children.Count != 0)
{
foreach (ChildItem child in children)
{
//indent
writer.WriteRaw("\t");
writer.WriteStartElement("treeitem");
writer.WriteAttributeString("name", child.name);
writer.WriteAttributeString("uri", child.uri);
writer.WriteAttributeString("fontsize", child.fontsize);
writer.WriteAttributeString("icon", child.icon);
writer.WriteEndElement();
//new line
writer.WriteRaw("\n");
}
}
writer.WriteEndElement();
//new line
writer.WriteRaw("\n");
}
writer.WriteEndElement();
writer.WriteEndDocument();
}
}
This way you can add tab or line breaks in the way you are normally used to, i.e. \t or \n
When implementing the suggestions posted here, I had trouble with the text encoding. It seems the encoding of the XmlWriterSettings is ignored, and always overridden by the encoding of the stream. When using a StringBuilder, this is always the text encoding used internally in C#, namely UTF-16.
So here's a version which supports other encodings as well.
IMPORTANT NOTE: The formatting is completely ignored if your XMLDocument object has its preserveWhitespace property enabled when loading the document. This had me stumped for a while, so make sure not to enable that.
My final code:
public static void SaveFormattedXml(XmlDocument doc, String outputPath, Encoding encoding)
{
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = "\t";
settings.NewLineChars = "\r\n";
settings.NewLineHandling = NewLineHandling.Replace;
using (MemoryStream memstream = new MemoryStream())
using (StreamWriter sr = new StreamWriter(memstream, encoding))
using (XmlWriter writer = XmlWriter.Create(sr, settings))
using (FileStream fileWriter = new FileStream(outputPath, FileMode.Create))
{
if (doc.ChildNodes.Count > 0 && doc.ChildNodes[0] is XmlProcessingInstruction)
doc.RemoveChild(doc.ChildNodes[0]);
// save xml to XmlWriter made on encoding-specified text writer
doc.Save(writer);
// Flush the streams (not sure if this is really needed for pure mem operations)
writer.Flush();
// Write the underlying stream of the XmlWriter to file.
fileWriter.Write(memstream.GetBuffer(), 0, (Int32)memstream.Length);
}
}
This will save the formatted xml to disk, with the given text encoding.
If you have a string of XML, rather than a doc ready for use, you can do it this way:
var xmlString = "<xml>...</xml>"; // Your original XML string that needs indenting.
xmlString = this.PrettifyXml(xmlString);
private string PrettifyXml(string xmlString)
{
var prettyXmlString = new StringBuilder();
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xmlString);
var xmlSettings = new XmlWriterSettings()
{
Indent = true,
IndentChars = " ",
NewLineChars = "\r\n",
NewLineHandling = NewLineHandling.Replace
};
using (XmlWriter writer = XmlWriter.Create(prettyXmlString, xmlSettings))
{
xmlDoc.Save(writer);
}
return prettyXmlString.ToString();
}
A more simplified approach based on the accepted answer:
static public string Beautify(this XmlDocument doc) {
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings
{
Indent = true
};
using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
doc.Save(writer);
}
return sb.ToString();
}
Setting the new line is not necessary. Indent characters also has the default two spaces so I preferred not to set it as well.
Set PreserveWhitespace to true before Load.
var document = new XmlDocument();
document.PreserveWhitespace = true;
document.Load(filename);

Categories

Resources