Formatting XML with tabulation and removing element ending space? - c#

I am trying to do two things:
1. Get the output XML formatted with tabs instead of spaces.
2. Remove the trailing space it generates at the end of the video element, i.e. change
" />
to
"/>
I have tried
xmlWriter.Formatting = Formatting.Indented;
as well as
IndentChar
but neither worked for me, and I don't know why.
This is my current code. I would also welcome any advice and suggestions for improving it:
XmlDocument xmlDoc = new XmlDocument();
XmlTextWriter xmlWriter = new XmlTextWriter(filename, System.Text.Encoding.UTF8);
xmlWriter.WriteProcessingInstruction("xml", "version='1.0' encoding='UTF-8' standalone='yes'");
xmlWriter.WriteComment(@" This file was made by @author");
xmlWriter.WriteStartElement("videos");
xmlWriter.Close();
xmlDoc.Load(filename);
XmlNode root = xmlDoc.DocumentElement;
foreach (int myID in ExportListIDs)
{
    XmlElement video = xmlDoc.CreateElement("video");
    root.AppendChild(video);
    video.SetAttribute("videoID", myID.ToString());
}
xmlDoc.Save(filename);

I have managed to solve question 1 with the code below, but I still don't know whether it is possible to remove the space between " and /> at the end of an element (question 2).
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.Indent = true;
settings.IndentChars = "\t";
XmlWriter writeXML = XmlWriter.Create("test.xml", settings);
writeXML.WriteStartDocument();
writeXML.WriteComment(@" This file was made by @author");
writeXML.WriteStartElement("videos");
foreach (var item in myList)
{
    writeXML.WriteStartElement("video");
    writeXML.WriteAttributeString("ID", item.Key.ToString());
    writeXML.WriteAttributeString("Name", item.Value);
    writeXML.WriteStartElement("object");
    writeXML.WriteAttributeString("A", item.Key.ToString());
    writeXML.WriteAttributeString("B", item.Value);
    writeXML.WriteEndElement(); // closes <object>
    writeXML.WriteEndElement(); // closes <video>
}
writeXML.WriteEndElement();
writeXML.WriteEndDocument();
writeXML.Close();

Related

Xml not remembering the line break before child elements?

I have an xml file which is like so.
<Root>
<Child123>
more nodes inside
</Child123>
<Child123></Child123>
<Child123></Child123>
<Child123></Child123>
</Root>
My code generates the file below, which is not formatted correctly: it omits the line break before some of the <Child123> start tags. As you can see, the multi-line Child123 element starts at the end of the previous element's line instead of on a line of its own.
<Root>
<Child123></Child123>
<Child123></Child123><Child123>
more nodes inside
</Child123><Child123>
more nodes inside
</Child123>
</Root>
This is my code. I read the file into a list view and let the user pick some lines; clicking Generate then produces the file above:
public static string Seralize<T>(T dataToSerlize)
{
    var ns = new XmlSerializerNamespaces();
    ns.Add("", "");
    var seralize = new XmlSerializer(dataToSerlize.GetType());
    var settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.OmitXmlDeclaration = true;
    settings.NewLineChars = "\n";
    settings.NewLineHandling = NewLineHandling.Replace;
    using (var stream = new StringWriter())
    {
        using (var test = XmlWriter.Create(stream, settings))
        {
            seralize.Serialize(test, dataToSerlize, ns);
            return stream.ToString();
        }
    }
}
But as you can see, the generated XML file does not keep the formatting. How do I ensure that it retains the layout of the first XML file?
PS: I also tried
settings.Encoding = Encoding.UTF8;
which I thought might be the issue.
I also tried
settings.NewLineChars = "\n";
but still no joy.
One approach would be to focus your efforts on generating valid XML and let XDocument handle the formatting. Assuming your generated XML is valid, you can do this with it:
using System.Xml.Linq;
string rawXml =
@"<Root>
<Child123></Child123>
<Child123></Child123><Child123>
</Child123><Child123>
</Child123>
</Root>";
XDocument formattedXml = XDocument.Parse(rawXml);
The output of formattedXml.ToString() is:
<Root>
  <Child123></Child123>
  <Child123></Child123>
  <Child123></Child123>
  <Child123></Child123>
</Root>
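If the default two-space indentation is not what you want, the parsed document can also be written through an XmlWriter with explicit settings. The following is a minimal sketch, assuming the input is well-formed XML; the XmlFormatting class and the Reformat method are hypothetical names chosen for this example, not part of any library:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class XmlFormatting
{
    // Hypothetical helper: parses rawXml and writes it back out,
    // indented with the given characters.
    public static string Reformat(string rawXml, string indentChars)
    {
        // By default XDocument.Parse discards insignificant whitespace,
        // so the original (broken) line breaks do not survive parsing.
        XDocument doc = XDocument.Parse(rawXml);

        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = indentChars,
            NewLineChars = "\n",
            OmitXmlDeclaration = true
        };

        var sw = new StringWriter();
        using (XmlWriter xw = XmlWriter.Create(sw, settings))
        {
            doc.Save(xw);
        }
        // Read the text only after the writer has been disposed (flushed).
        return sw.ToString();
    }

    static void Main()
    {
        string raw = "<Root><Child>a</Child><Child>b</Child></Root>";
        Console.WriteLine(Reformat(raw, "\t"));
    }
}
```

The writer settings, not the document, decide the indentation here, so the same parsed document can be written with tabs, spaces, or no indentation at all.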

Token StartElement in state Epilog would result in an invalid XML document

I am getting the error "Token StartElement in state Epilog would result in an invalid XML document." when I read the data from a DataTable and try to convert it to an XML file.
Code :
DataTable dtTest = new DataTable();
dtTest.Columns.Add("Name");
dtTest.Columns.Add("NickName");
dtTest.Columns.Add("Code");
dtTest.Columns.Add("reference");
dtTest.Rows.Add("Yash", "POPs", "Vapi", "None1");
dtTest.Rows.Add("shilpa", "shilpa", "valsad", "None2");
dtTest.Rows.Add("Dinesh", "dinu", "pune", "None3");
dtTest.Rows.Add("rahul", "mady", "pardi", "None4");
XmlWriterSettings settings = new XmlWriterSettings();
StringWriter stringwriter = new StringWriter();
XmlTextWriter xmlTextWriter = new XmlTextWriter(stringwriter);
xmlTextWriter.Formatting = Formatting.Indented;
xmlTextWriter.WriteStartDocument();
foreach (var row in dtTest.AsEnumerable())
{
    xmlTextWriter.WriteStartElement("");
    xmlTextWriter.WriteAttributeString("orderid", row.Field<string>("Name"));
    xmlTextWriter.WriteElementString("type", row.Field<string>("Name"));
    xmlTextWriter.WriteElementString("status", row.Field<string>("Name"));
    xmlTextWriter.WriteElementString("productno", row.Field<string>("Name"));
    xmlTextWriter.WriteEndElement();
}
XmlDocument docSave = new XmlDocument();
docSave.LoadXml(stringwriter.ToString());
What is the cause of this error, and how can it be fixed?
You have a few problems here:
1. You are writing multiple root elements to your document, one for each call to xmlTextWriter.WriteStartElement(""). However, a well-formed XML document must have one and only one root element, so you need to wrap your row elements in some container element.
(You might have been thinking that WriteStartDocument() would write the root element, but it does not. It just writes the XML declaration.)
2. You are using XmlTextWriter, but this class is no longer the recommended way to write XML. From the docs:
Starting with the .NET Framework 2.0, we recommend that you create XmlWriter instances by using the XmlWriter.Create method and the XmlWriterSettings class to take advantage of new functionality.
If you switch to XmlWriter you will get clearer error messages and better error checking.
3. You are trying to write elements with no name:
xmlTextWriter.WriteStartElement("");
This would result in malformed XML. XmlTextWriter seems not to check for this, but XmlWriter does.
Putting all of the above together, your code can be modified as follows:
var stringwriter = new StringWriter();
using (var xmlWriter = XmlWriter.Create(stringwriter, new XmlWriterSettings { Indent = true }))
{
    xmlWriter.WriteStartDocument();
    xmlWriter.WriteStartElement("Root"); // single container (root) element
    foreach (var row in dtTest.AsEnumerable())
    {
        xmlWriter.WriteStartElement("Row");
        xmlWriter.WriteAttributeString("orderid", row.Field<string>("Name"));
        xmlWriter.WriteElementString("type", row.Field<string>("Name"));
        xmlWriter.WriteElementString("status", row.Field<string>("Name"));
        xmlWriter.WriteElementString("productno", row.Field<string>("Name"));
        xmlWriter.WriteEndElement(); // closes <Row>
    }
    xmlWriter.WriteEndElement(); // closes <Root>
}
var xml = stringwriter.ToString();

Why are all of my line-breaks changing from "\r\n" to "\n" and how can I stop this from happening?

I am saving my files as XML documents using XDocument.Save(path), and after saving and loading a document all of the line breaks have changed from "\r\n" to "\n". Why is this happening and how can I fix it?
You can use XmlWriterSettings to control what your line-break characters are:
XmlWriterSettings xws = new XmlWriterSettings();
xws.NewLineChars = "\r\n";
using (XmlWriter xw = XmlWriter.Create("whatever.xml", xws))
{
    xmlDocumentInstance.Save(xw);
}
Whatever you're using to read in your XML might be normalizing your line endings.
If you set the PreserveWhitespace property on your XmlDocument object before calling Load() and Save() then this will not happen:
var doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.Load("foo.xml");
...
doc.Save("bar.xml"); // Line endings will not be altered
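Since the question uses XDocument rather than XmlDocument, here is a sketch of the XmlWriterSettings approach applied to an XDocument. It assumes the line breaks in question sit inside text content; XML parsers normalize "\r\n" to "\n" when reading, so the in-memory text holds "\n" and the writer settings decide what is written back. The XmlNewlines class and SaveWithCrLf method are hypothetical names used only for illustration:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class XmlNewlines
{
    // Hypothetical helper: serializes doc so that line breaks in
    // text content are written as "\r\n".
    public static string SaveWithCrLf(XDocument doc)
    {
        var settings = new XmlWriterSettings
        {
            NewLineChars = "\r\n",
            // Replace is the default; it is stated here to make the intent explicit.
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true
        };

        var sw = new StringWriter();
        using (XmlWriter xw = XmlWriter.Create(sw, settings))
        {
            doc.Save(xw);
        }
        return sw.ToString();
    }

    static void Main()
    {
        // Text content holding a bare "\n", as it would after loading a file.
        var doc = new XDocument(new XElement("note", "line one\nline two"));
        string xml = SaveWithCrLf(doc);
        Console.WriteLine(xml.Contains("line one\r\nline two"));
    }
}
```

To write to a file instead of a string, XmlWriter.Create also accepts a file path in place of the StringWriter.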

XML indenting when injecting an XML string into an XmlWriter

I have an XmlTextWriter writing to a file and an XmlWriter using that text writer. This text writer is set to output tab-indented XML:
XmlTextWriter xtw = new XmlTextWriter("foo.xml", Encoding.UTF8);
xtw.Formatting = Formatting.Indented;
xtw.IndentChar = '\t';
xtw.Indentation = 1;
XmlWriter xw = XmlWriter.Create(xtw);
Changed per Jeff's MSDN link:
XmlWriterSettings set = new XmlWriterSettings();
set.Indent = true;
set.IndentChars = "\t";
set.Encoding = Encoding.UTF8;
xw = XmlWriter.Create(f, set);
This does not change the end result.
Now I'm at an arbitrary depth in my XmlWriter and I'm getting a string of XML from elsewhere (that I cannot control) that is single-line, non-indented XML. If I call xw.WriteRaw() then that string is injected verbatim and does not follow the indentation I want.
...
string xml = ExternalMethod();
xw.WriteRaw(xml);
...
Essentially, I want a WriteRaw that will parse the XML string and go through all the WriteStartElement, etc. so that it gets reformatted per the XmlTextWriter's settings.
My preference is a way to do this with the setup I already have and to do this without having to reload the final XML just to reformat it. I'd also prefer not to parse the XML string with the likes of XmlReader and then mimic what it finds into my XmlWriter (very very manual process).
At the end of this I'd rather have a simple solution than one that follows my preferences. (Best solution, naturally, would be simple and follows my preferences.)
How about using an XmlReader to read the XML as XML nodes?
string xml = ExternalMethod();
XmlReader reader = XmlReader.Create(new StringReader(xml));
xw.WriteNode(reader, true);
You shouldn't use XmlTextWriter, as indicated in MSDN where it states:
In the .NET Framework version 2.0 release, the recommended practice is to create XmlWriter instances using the XmlWriter.Create method and the XmlWriterSettings class. This allows you to take full advantage of all the new features introduced in this release. For more information, see Creating XML Writers.
Instead, you should use XmlWriter.Create to get your writer. You can then use the XmlWriterSettings class to specify things like indentation.
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = "\t";
Update
I think you can just use WriteNode. You take your xml string and load it into an XDocument or XmlReader and then use the node from that to write it into your XmlWriter.
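The WriteNode idea can be sketched end to end as follows. This is a minimal illustration, assuming the external string is a well-formed fragment with a single root element and no XML declaration; the XmlInjection class, the WrapIndented method, and the "wrapper" element are hypothetical names standing in for the surrounding document:

```csharp
using System;
using System.IO;
using System.Xml;

static class XmlInjection
{
    // Hypothetical helper: opens a <wrapper> element and re-emits the
    // fragment inside it, so the writer's own indentation settings apply.
    public static string WrapIndented(string fragment)
    {
        var writerSettings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "\t",
            OmitXmlDeclaration = true
        };
        // Ignore whitespace in the incoming string so that it cannot
        // interfere with the indentation produced by the writer.
        var readerSettings = new XmlReaderSettings { IgnoreWhitespace = true };

        var sw = new StringWriter();
        using (XmlWriter xw = XmlWriter.Create(sw, writerSettings))
        using (XmlReader reader = XmlReader.Create(new StringReader(fragment), readerSettings))
        {
            xw.WriteStartElement("wrapper");
            // Copies every node from the reader into the writer,
            // including attributes.
            xw.WriteNode(reader, true);
            xw.WriteEndElement();
        }
        return sw.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WrapIndented("<a><b x=\"1\"/><c>text</c></a>"));
    }
}
```

If the external string can contain several top-level elements, the reader would additionally need ConformanceLevel.Fragment in its settings.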
This is the best I've got so far. A very manual process that only supports what is written. My string XML is nothing more than tags, attributes, and text data. If it supported namespaces, CDATA, etc. then this would have to grow accordingly.
Very manual, very messy and very likely prone to bugs but it does accomplish my preferences.
private static void PipeXMLIntoWriter(XmlWriter xw, string xml)
{
    byte[] dat = new System.Text.UTF8Encoding().GetBytes(xml);
    MemoryStream m = new MemoryStream();
    m.Write(dat, 0, dat.Length);
    m.Seek(0, SeekOrigin.Begin);
    XmlReader r = XmlReader.Create(m);
    while (r.Read())
    {
        switch (r.NodeType)
        {
            case XmlNodeType.Element:
                xw.WriteStartElement(r.Name);
                if (r.HasAttributes)
                {
                    for (int i = 0; i < r.AttributeCount; i++)
                    {
                        r.MoveToAttribute(i);
                        xw.WriteAttributeString(r.Name, r.Value);
                    }
                }
                if (r.IsEmptyElement)
                {
                    xw.WriteEndElement();
                }
                break;
            case XmlNodeType.EndElement:
                xw.WriteEndElement();
                break;
            case XmlNodeType.Text:
                xw.WriteString(r.Value);
                break;
            default:
                throw new Exception("Unrecognized node type: " + r.NodeType);
        }
    }
}
Combining the answers above, I found that this works:
private static string FormatXML(string unformattedXml)
{
    // First read the XML, ignoring whitespace
    XmlReaderSettings readeroptions = new XmlReaderSettings { IgnoreWhitespace = true };
    XmlReader reader = XmlReader.Create(new StringReader(unformattedXml), readeroptions);
    // Then write it out with indentation
    StringBuilder sb = new StringBuilder();
    XmlWriterSettings xmlSettingsWithIndentation = new XmlWriterSettings { Indent = true };
    using (XmlWriter writer = XmlWriter.Create(sb, xmlSettingsWithIndentation))
    {
        writer.WriteNode(reader, true);
    }
    return sb.ToString();
}
How about:
string xml = ExternalMethod();
var xd = XDocument.Parse(xml);
xd.WriteTo(xw);
I was looking for an answer to this issue, but in VB.NET.
Thanks to Colin Burnett, I solved it. I made two corrections: first, the XmlReader has to ignore whitespace (settings.IgnoreWhitespace); second, the reader has to move back to the element after it reads the attributes. The code is shown below.
I also tried GreyCloud's solution, but the generated XML contained some annoying empty xmlns attributes.
Private Sub PipeXMLIntoWriter(xw As XmlWriter, xml As String)
    Dim dat As Byte() = New System.Text.UTF8Encoding().GetBytes(xml)
    Dim m As New MemoryStream()
    m.Write(dat, 0, dat.Length)
    m.Seek(0, SeekOrigin.Begin)
    Dim settings As New XmlReaderSettings
    settings.IgnoreWhitespace = True
    settings.IgnoreComments = True
    Dim r As XmlReader = XmlReader.Create(m, settings)
    While r.Read()
        Select Case r.NodeType
            Case XmlNodeType.Element
                xw.WriteStartElement(r.Name)
                If r.HasAttributes Then
                    For i As Integer = 0 To r.AttributeCount - 1
                        r.MoveToAttribute(i)
                        xw.WriteAttributeString(r.Name, r.Value)
                    Next
                    ' Move back to the element so IsEmptyElement is evaluated on it
                    r.MoveToElement()
                End If
                If r.IsEmptyElement Then
                    xw.WriteEndElement()
                End If
                Exit Select
            Case XmlNodeType.EndElement
                xw.WriteEndElement()
                Exit Select
            Case XmlNodeType.Text
                xw.WriteString(r.Value)
                Exit Select
            Case Else
                Throw New Exception("Unrecognized node type: " & r.NodeType.ToString())
        End Select
    End While
End Sub

XDocument.Save() without header

I am editing csproj files with LINQ to XML and need to save the XML without the <?xml ?> declaration header.
As XDocument.Save() is missing the necessary option, what's the best way to do this?
You can do this with XmlWriterSettings, and saving the document to an XmlWriter:
XDocument doc = new XDocument(new XElement("foo",
    new XAttribute("hello","world")));
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
StringWriter sw = new StringWriter();
using (XmlWriter xw = XmlWriter.Create(sw, settings))
// or to write to a file...
//using (XmlWriter xw = XmlWriter.Create(filePath, settings))
{
    doc.Save(xw);
}
string s = sw.ToString();
A simpler solution than the accepted answer is to use XDocument.ToString() to get the XML text without the header.
Example:
// Load the file
XDocument xDocument = XDocument.Load(fileName);
// Edit the XML...
// Save the edited XML text to file
File.WriteAllText(fileName, xDocument.ToString());
