XML not remembering the line break before child elements? - C#

I have an XML file that looks like this:
<Root>
<Child123>
more nodes inside
</Child123>
<Child123></Child123>
<Child123></Child123>
<Child123></Child123>
</Root>
My code generates the file below, but the formatting is wrong: it omits the line break before some of the <Child123> start tags. As you can see, the third Child123 starts on the same line as the second one when it should start on the next line.
<Root>
<Child123></Child123>
<Child123></Child123><Child123>
more nodes inside
</Child123><Child123>
more nodes inside
</Child123>
</Root>
This is my code. I read the file into a list view and let the user pick some lines; clicking Generate then produces the file above.
public static string Seralize<T>(T dataToSerlize)
{
    var ns = new XmlSerializerNamespaces();
    ns.Add("", "");
    var seralize = new XmlSerializer(dataToSerlize.GetType());
    var settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.OmitXmlDeclaration = true;
    settings.NewLineChars = "\n";
    settings.NewLineHandling = NewLineHandling.Replace;
    using (var stream = new StringWriter())
    {
        using (var test = XmlWriter.Create(stream, settings))
        {
            seralize.Serialize(test, dataToSerlize, ns);
            return stream.ToString();
        }
    }
}
But as you can see, the generated XML file does not keep the formatting. How do I ensure that it retains the layout of the first XML file?
PS I also tried
settings.Encoding = Encoding.UTF8;
Which I thought may be the issue.
I also tried
settings.NewLineChars = "\n";
But still no joy.

One approach would be to focus your efforts on generating valid XML and let XDocument handle the formatting. Assuming your generated XML is valid, you can do this with it:
using System.Xml.Linq;
string rawXml =
@"<Root>
<Child123></Child123>
<Child123></Child123><Child123>
</Child123><Child123>
</Child123>
</Root>";
XDocument formattedXml = XDocument.Parse(rawXml);
The output of formattedXml.ToString() is:
<Root>
  <Child123></Child123>
  <Child123></Child123>
  <Child123></Child123>
  <Child123></Child123>
</Root>

Related

How to check if XMLReader has valid XML without reading so as to write complete using XMLWriter?

I'm trying to write strings (which are nothing but XML nodes) into a new XML file using XmlWriter. Some of the strings are valid XML content, while others aren't.
String Input:
1.
<Test>
<A a="Hello"></A>
<B b="Hello"></B>
</Test>
2.
Hello This is Sample String but not XML
Code :
using (XmlWriter writer = XmlWriter.Create(@"C:\Test.XML"))
{
    writer.WriteStartDocument();
    string scontent = "Hello This is Sample String but not XML";
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ConformanceLevel = ConformanceLevel.Fragment;
    try
    {
        using (StringReader stringReader = new StringReader(scontent))
        using (XmlReader xmlReader = XmlReader.Create(stringReader, settings))
        {
            // The start tag is written before the content has been checked.
            writer.WriteStartElement("Test");
            writer.WriteNode(xmlReader, true);
            writer.WriteEndElement();
        }
    }
    catch (XmlException exception) { }
}
Expected Output:
The Test element must not be created if the exception occurs. If I call xmlReader.Read() (or similar) first to check the content, the reader's position moves to a node, so writer.WriteNode(xmlReader, true) no longer writes all the nodes when there is more than one top-level node, for example <A a="Hello"></A><B b="Hello"></B>. To write every node with WriteNode, the XmlReader must still be in its initial state (XmlReader.ReadState is ReadState.Initial).
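One possible way around this, sketched here under assumptions rather than taken from the thread, is to check the string with a throwaway reader first and then create a second, fresh reader for the actual WriteNode call, so the reader handed to the writer is still in its initial state. The class and method names (FragmentWriter, IsXmlFragment, WriteWrapped) are hypothetical. Note that plain text is itself a well-formed fragment under ConformanceLevel.Fragment, so this sketch additionally assumes that valid input must contain at least one element.

```csharp
using System;
using System.IO;
using System.Xml;

static class FragmentWriter
{
    // Hypothetical helper: true only if 'content' parses as an XML fragment
    // and contains at least one element (assumption: bare text is rejected).
    public static bool IsXmlFragment(string content)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        try
        {
            bool sawElement = false;
            using (var reader = XmlReader.Create(new StringReader(content), settings))
            {
                // Dry run: read to the end; malformed input throws XmlException.
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element)
                        sawElement = true;
                }
            }
            return sawElement;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    // Writes <Test>...</Test> only when the content passed the dry run.
    public static bool WriteWrapped(XmlWriter writer, string content)
    {
        if (!IsXmlFragment(content))
            return false;

        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        // A fresh reader, still in its initial state, so WriteNode copies every node.
        using (var reader = XmlReader.Create(new StringReader(content), settings))
        {
            writer.WriteStartElement("Test");
            writer.WriteNode(reader, true);
            writer.WriteEndElement();
        }
        return true;
    }
}
```

The input is parsed twice, which is the price of never emitting the Test start tag for content that later turns out to be invalid.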

Error: The XML declaration must be the first node in the document

I am getting the error "Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it" while trying to load an XML file. Both my C# code and the contents of the XML file are given below. The XML declaration appears on line 6 of the XML file, hence the error.
I cannot control what is in the XML file, so how can I edit or rewrite it using C# so that the XML declaration comes first, followed by the comments, and the file loads without any error?
// xmlFilePath is the path/name of the XML file passed to this method
// (the method name LoadXml is illustrative; the original post did not name it)
static void LoadXml(string xmlFilePath)
{
    XmlReaderSettings readerSettings = new XmlReaderSettings();
    readerSettings.IgnoreComments = true;
    readerSettings.IgnoreWhitespace = true;
    XmlReader reader = XmlReader.Create(xmlFilePath, readerSettings);
    XmlDocument xml = new XmlDocument();
    xml.Load(reader);
}
XmlDoc.xml
<!-- Customer ID: 1 -->
<!-- Import file: XmlDoc.xml -->
<!-- Start time: 8/14/12 3:15 AM -->
<!-- End time: 8/14/12 3:18 AM -->
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
-----
As the error states, when an XML declaration is present it must be the very first thing in the document: the text has to begin with <?xml, with nothing before it, not even whitespace or a comment. Comments are allowed before the root element, but only after the declaration, so the four comment lines at the top of this file make it invalid XML.
EDIT: Something like this should be able to rearrange the rows, given the file format from the OP:
var lines = new List<string>();
using (var fileStream = File.Open(xmlFilePath, FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(fileStream)) // TextReader is abstract, so use StreamReader
{
    string line;
    while ((line = reader.ReadLine()) != null)
        lines.Add(line);
}
// Move the declaration line to the top, if it was found and is not already first.
var i = lines.FindIndex(s => s.StartsWith("<?xml"));
if (i > 0)
{
    var xmlLine = lines[i];
    lines.RemoveAt(i);
    lines.Insert(0, xmlLine);
}
using (var fileStream = File.Open(xmlFilePath, FileMode.Truncate, FileAccess.Write))
using (var writer = new StreamWriter(fileStream))
{
    foreach (var line in lines)
        writer.WriteLine(line); // WriteLine keeps each row on its own line
    writer.Flush();
}
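If rewriting the file on disk is not desirable, the same rearrangement can be done on the text in memory before it is parsed. The following is only a sketch: the class and method names (XmlDeclarationFixer, MoveDeclarationFirst) are hypothetical, and it assumes the file contains at most one XML declaration, written as "<?xml " followed by its attributes.

```csharp
using System;

static class XmlDeclarationFixer
{
    // Hypothetical helper: returns 'text' with its XML declaration, if any,
    // moved to the very beginning. Everything else keeps its original order.
    public static string MoveDeclarationFirst(string text)
    {
        // The trailing space avoids matching instructions such as <?xml-stylesheet ...?>.
        int start = text.IndexOf("<?xml ", StringComparison.Ordinal);
        if (start <= 0)
            return text; // no declaration, or it is already first

        int end = text.IndexOf("?>", start, StringComparison.Ordinal);
        if (end < 0)
            return text; // unterminated declaration: leave the text untouched

        end += 2; // include the closing "?>"
        string declaration = text.Substring(start, end - start);
        string rest = text.Remove(start, end - start);
        return declaration + Environment.NewLine + rest.TrimStart();
    }
}
```

The returned string can then be passed to XmlDocument.LoadXml; the comments that used to precede the declaration now follow it, where they are allowed.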
That is not valid XML.
As the error clearly states, the XML declaration (<?xml ... ?>) must come first.
I'm using the following function to remove whitespace from xml:
public static void DoRemovespace(string strFile)
{
string str = System.IO.File.ReadAllText(strFile);
str = str.Replace("\n", "");
str = str.Replace("\r", "");
Regex regex = new Regex(@">\s*<");
string cleanedXml = regex.Replace(str, "><");
System.IO.File.WriteAllText(strFile, cleanedXml);
}
Don't put any comments before the XML declaration at the beginning of your file!

C# XMLElement.OuterXML in a single line rather than format

I am trying to log some XML responses from a WCF Service using log4net.
I want the output of the XML file to the log to be in properly formed XML. The request comes in as an XMLElement.
Example:
The request comes in as this:
<?xml version="1.0" encoding="utf-8"?>
<ApplicationEvent xmlns="http://courts.wa.gov/INH_TV/ApplicationEvent.xsd">
<Severity xmlns="">Information</Severity>
<Application xmlns="">Application1</Application>
<Category xmlns="">Timings</Category>
<EventID xmlns="">1000</EventID>
<DateTime xmlns="">2012-09-02T12:05:15.234Z</DateTime>
<MachineName xmlns="">Server1</MachineName>
<MessageID xmlns="">10000000-0000-0000-0000-000000000000</MessageID>
<Program xmlns="">Progam1</Program>
<Action xmlns="">Entry</Action>
<UserID xmlns="">User1</UserID>
</ApplicationEvent>
Then if I output this value to log4net.
logger.Info(request.OuterXml);
I get the entire document logged in a single line like so:
<ApplicationEvent xmlns="http://courts.wa.gov/INH_TV/ApplicationEvent.xsd"><Severity xmlns="">Information</Severity><Application xmlns="">Application1</Application><Category xmlns="">Timings</Category><EventID xmlns="">1000</EventID><DateTime xmlns="">2012-09-02T12:05:15.234Z</DateTime><MachineName xmlns="">Server1</MachineName><MessageID xmlns="">10000000-0000-0000-0000-000000000000</MessageID><Program xmlns="">Progam1</Program><Action xmlns="">Entry</Action><UserID xmlns="">User1</UserID></ApplicationEvent>
I would like it to display in the log.txt file formatted correctly as it came in. So far the only way I have found to do this is to convert it to an XElement like so:
XmlDocument logXML = new XmlDocument();
logXML.AppendChild(logXML.ImportNode(request, true));
XElement logMe = XElement.Parse(logXML.InnerXml);
logger.Info(logMe.ToString());
This doesn't seem like good programming to me. I have been searching the documentation and I can't find a built-in way to output this correctly without converting it.
Is there an obvious, better way that I am just missing?
edit1: Removed ToString() since OuterXML is a String value.
edit2: I answered my own question:
So I did some more research, and I guess I missed a piece of code in the documentation.
http://msdn.microsoft.com/en-us/library/system.xml.xmlnode.outerxml.aspx
I have it down to:
using (MemoryStream ms = new MemoryStream())
{
XmlWriterSettings xws = new XmlWriterSettings();
xws.Indent = true;
using (XmlWriter xmlWriter = XmlWriter.Create(ms, xws))
{
request.WriteTo(xmlWriter);
}
ms.Position = 0;
StreamReader sr = new StreamReader(ms);
string s = sr.ReadToEnd(); // s will contain indented xml
logger.Info(s);
}
Which is a little more efficient than my current method despite being more verbose.
XElement parse is the cleanest way. You can save a line or two with:
logger.Info(XElement.Parse(request.OuterXml).ToString());
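A middle ground between the two approaches, sketched here as an assumption rather than something proposed in the thread, is to point the indenting XmlWriter at a StringBuilder. That skips both the MemoryStream round trip and the second parse through XElement. The class and method names (XmlLogFormatter, ToIndentedXml) are hypothetical.

```csharp
using System;
using System.Text;
using System.Xml;

static class XmlLogFormatter
{
    // Hypothetical helper: serializes any XmlNode (such as the incoming
    // XmlElement) as indented text without re-parsing it.
    public static string ToIndentedXml(XmlNode node)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            // A writer over a StringBuilder would otherwise declare utf-16.
            OmitXmlDeclaration = true
        };
        var builder = new StringBuilder();
        using (var writer = XmlWriter.Create(builder, settings))
        {
            node.WriteTo(writer);
        }
        return builder.ToString();
    }
}
```

With this in place, the logging call becomes logger.Info(XmlLogFormatter.ToIndentedXml(request));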

XML Canonicalization returns empty elements in the transformed output

I have a related post asking how to select nodes from an XmlDocument using an XPath statement.
The only way I could get the SelectNodes to work was to create a non default namespace "x" and then explicitly reference the nodes in the XPath statement.
Whilst this works and provides me with a node list, the canonicalization then fails to produce any content to my selected nodes in the output.
I've tried using XmlDsigExcC14NTransform and specifying the namespace but this produces the same output.
Below is an example of the xml output produced (using the XML in my related post):
<Applications xmlns="http://www.myApps.co.uk/">
<Application>
<ApplicantDetails>
<Title>
</Title>
<Forename>
</Forename>
<Middlenames>
<Middlename>
</Middlename>
</Middlenames>
<PresentSurname>
</PresentSurname>
<CurrentAddress>
<Address>
<AddressLine1>
</AddressLine1>
<AddressLine2>
</AddressLine2>
<AddressTown>
</AddressTown>
<AddressCounty>
</AddressCounty>
<Postcode>
</Postcode>
<CountryCode>
</CountryCode>
</Address>
<ResidentFromGyearMonth>
</ResidentFromGyearMonth>
</CurrentAddress>
</ApplicantDetails>
</Application>
<Application>
<ApplicantDetails>
<Title>
</Title>
<Forename>
</Forename>
<Middlenames>
<Middlename>
</Middlename>
</Middlenames>
<PresentSurname>
</PresentSurname>
<CurrentAddress>
<Address>
<AddressLine1>
</AddressLine1>
<AddressLine2>
</AddressLine2>
<AddressTown>
</AddressTown>
<AddressCounty>
</AddressCounty>
<Postcode>
</Postcode>
<CountryCode>
</CountryCode>
</Address>
<ResidentFromGyearMonth>
</ResidentFromGyearMonth>
</CurrentAddress>
</ApplicantDetails>
</Application>
</Applications>
Another StackOverflow user has had a similar problem here
Playing around with this new code, I found that the results differ depending upon how you pass the nodes into the LoadInput method. Implementing the code below worked.
I'm still curious as to why it works one way and not another but will leave that for a rainy day
static void Main(string[] args)
{
string path = @"..\..\TestFiles\Test_1.xml";
if (File.Exists(path) == true)
{
XmlDocument xDoc = new XmlDocument();
xDoc.PreserveWhitespace = true;
using (FileStream fs = new FileStream(path, FileMode.Open))
{
xDoc.Load(fs);
}
//Instantiate an XmlNamespaceManager object.
System.Xml.XmlNamespaceManager xmlnsManager = new System.Xml.XmlNamespaceManager(xDoc.NameTable);
//Add the namespaces used to the XmlNamespaceManager.
xmlnsManager.AddNamespace("x", "http://www.myApps.co.uk/");
// Create a list of nodes to have the Canonical treatment
XmlNodeList nodeList = xDoc.SelectNodes("/x:ApplicationsBatch/x:Applications|/x:ApplicationsBatch/x:Applications//*", xmlnsManager);
//Initialise the stream to read the node list
MemoryStream nodeStream = new MemoryStream();
XmlWriter xw = XmlWriter.Create(nodeStream);
nodeList[0].WriteTo(xw);
xw.Flush();
nodeStream.Position = 0;
// Perform the C14N transform on the nodes in the stream
XmlDsigC14NTransform transform = new XmlDsigC14NTransform();
transform.LoadInput(nodeStream);
// use a new memory stream for output of the transformed xml
// this could be done numerous ways if you don't wish to use a memory stream
MemoryStream outputStream = (MemoryStream)transform.GetOutput(typeof(Stream));
File.WriteAllBytes(@"..\..\TestFiles\CleanTest_1.xml", outputStream.ToArray());
}
}

Formatting XML with tabulation and removing element ending space?

I am trying to do 2 things:
1. Get the output XML formatted with tabs instead of spaces.
2. Remove the trailing space it generates at the end of the video element, changing
" />
to
"/>
I have tried to use
xmlWriter.Formatting = Formatting.Indented;
as well as
IndentChar
but they did not work for me, and I don't know why.
This is the code I currently have. I would also like to hear advice and suggestions on how to improve it:
XmlDocument xmlDoc = new XmlDocument();
XmlTextWriter xmlWriter = new XmlTextWriter(filename, System.Text.Encoding.UTF8);
xmlWriter.WriteProcessingInstruction("xml", "version='1.0' encoding='UTF-8' standalone='yes'");
xmlWriter.WriteComment(@" This file was made by #author");
xmlWriter.WriteStartElement("videos");
xmlWriter.Close();
xmlDoc.Load(filename);
XmlNode root = xmlDoc.DocumentElement;
foreach (int myID in ExportListIDs)
{
XmlElement video = xmlDoc.CreateElement("video");
root.AppendChild(video);
video.SetAttribute("videoID", myID.ToString());
}
xmlDoc.Save(filename);
I have managed to solve question 1 with the code below, but I still don't know whether it is possible to remove the space between " and /> at the end of an element, as asked in question 2.
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.Indent = true;
settings.IndentChars = "\t";
XmlWriter writeXML = XmlWriter.Create("test.xml", settings);
writeXML.WriteStartDocument();
writeXML.WriteComment(@" This file was made by #author");
writeXML.WriteStartElement("videos");
foreach (var item in myList)
{
writeXML.WriteStartElement("video");
writeXML.WriteAttributeString("ID", item.Key.ToString());
writeXML.WriteAttributeString("Name", item.Value);
writeXML.WriteStartElement("object");
writeXML.WriteAttributeString("A", item.Key.ToString());
writeXML.WriteAttributeString("B", item.Value);
writeXML.WriteEndElement();
writeXML.WriteEndElement();
}
writeXML.WriteEndElement();
writeXML.WriteEndDocument();
writeXML.Close();
