XML pretty printing/indentation for multiline string [duplicate] - c#

I have a string property which will contain text with newlines. This text has some of the properties of HTML text in that whitespace is disregarded.
If I serialize this type using XML serialization, the newlines are serialized properly, but the indentation is "wrong". I want the serialization process to indent the lines to keep the formatting of the XML, since those whitespace characters will be disregarded later anyway.
Here's an example LINQPad program:
void Main()
{
var d = new Dummy();
d.Text = @"Line 1
Line 2
Line 3";
var serializer = new XmlSerializer(typeof(Dummy));
var ns = new XmlSerializerNamespaces();
ns.Add("", "");
using (var writer = new StringWriter())
{
serializer.Serialize(writer, d, ns);
writer.ToString().Dump();
}
}
[XmlType("dummy")]
public class Dummy
{
[XmlElement("text")]
public string Text
{
get;
set;
}
}
Actual output:
<?xml version="1.0" encoding="utf-16"?>
<dummy>
<text>Line 1
Line 2
Line 3</text>
</dummy>
Desired output:
<?xml version="1.0" encoding="utf-16"?>
<dummy>
<text>
Line 1
Line 2
Line 3
</text>
</dummy>
Is this possible? If so, how? I'd rather not do the hackish way of just adding the whitespace in myself.
The reason for this is that this XML will be viewed and edited by people, so I'd like for the initial output to be better formatted for them out of the box.

I bumped into the same issue. In the end I came up with a custom writer:
public class IndentTextXmlWriter : XmlTextWriter
{
private int indentLevel;
private bool isInsideAttribute;
public IndentTextXmlWriter(TextWriter textWriter): base(textWriter)
{
}
public bool IndentText { get; set; }
public override void WriteStartAttribute(string prefix, string localName, string ns)
{
isInsideAttribute = true;
base.WriteStartAttribute(prefix, localName, ns);
}
public override void WriteEndAttribute()
{
isInsideAttribute = false;
base.WriteEndAttribute();
}
public override void WriteStartElement(string prefix, string localName, string ns)
{
indentLevel++;
base.WriteStartElement(prefix, localName, ns);
}
public override void WriteEndElement()
{
indentLevel--;
base.WriteEndElement();
}
public override void WriteString(string text)
{
if (String.IsNullOrEmpty(text) || isInsideAttribute || Formatting != Formatting.Indented || !IndentText || XmlSpace == XmlSpace.Preserve)
{
base.WriteString(text);
return;
}
string[] lines = text.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
string indent = new string(IndentChar, indentLevel * Indentation);
foreach (string line in lines)
{
WriteRaw(Environment.NewLine);
WriteRaw(indent);
WriteRaw(line.Trim());
}
WriteRaw(Environment.NewLine);
WriteRaw(new string(IndentChar, (indentLevel - 1) * Indentation));
}
}
You can use it like this:
[TestMethod]
public void WriteIndentedText()
{
var result = new StringBuilder();
using (var writer = new IndentTextXmlWriter(new StringWriter(result)){Formatting = Formatting.Indented, IndentText = true})
{
string text = @" Line 1
Line 2
Line 3 ";
// some root
writer.WriteStartDocument();
writer.WriteStartElement("root");
writer.WriteStartElement("child");
// test auto-indenting
writer.WriteStartElement("elementIndented");
writer.WriteString(text);
writer.WriteEndElement();
// test space preserving
writer.WriteStartElement("elementPreserved");
writer.WriteAttributeString("xml", "space", null, "preserve");
writer.WriteString(text);
writer.WriteEndDocument();
}
Debug.WriteLine(result.ToString());
}
And the output:
<?xml version="1.0" encoding="utf-16"?>
<root>
<child>
<elementIndented>
Line 1
Line 2
Line 3
</elementIndented>
<elementPreserved xml:space="preserve"> Line 1
Line 2
Line 3 </elementPreserved>
</child>
</root>
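The line-splitting step inside WriteString can be tried on its own, without a writer. The following is a minimal sketch of that step only; TextIndenter and IndentLines are hypothetical names, and it assumes that lines may end in either "\r\n" or "\n".

```csharp
using System;
using System.Linq;

public static class TextIndenter
{
    // Hypothetical helper: trims every non-empty line and prefixes it with
    // level * indentation copies of indentChar.
    public static string IndentLines(string text, int level, int indentation = 2, char indentChar = ' ')
    {
        string indent = new string(indentChar, level * indentation);
        // Split on both Windows and Unix line endings (an assumption of this sketch)
        string[] lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join("\n", lines.Select(line => indent + line.Trim()));
    }
}
```

Unlike the writer above, this does not emit anything as raw XML, so the result can still be passed through the normal escaping of WriteString.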

Related

c# comparing XML format

I am trying to compare XML user input against a valid XML string. I strip the values from the user input and compare the result to the valid XML; my code is at the bottom. As the examples show, the user input's goodslines element has two goodsline children, and the second one has fewer children than the template. How can I alter my code so that this case returns true when compared? Thanks in advance.
Valid XML
<?xml version="1.0" encoding="Windows-1252"?>
<goodslines>
<goodsline>
<unitamount></unitamount>
<unit_id matchmode="1"></unit_id>
<product_id matchmode="1"></product_id>
<weight></weight>
<loadingmeter></loadingmeter>
<volume></volume>
<length></length>
<width></width>
<height></height>
</goodsline>
</goodslines>
User input
<?xml version="1.0" encoding="Windows-1252"?>
<goodslines>
<goodsline>
<unitamount>5</unitamount>
<unit_id matchmode="1">colli</unit_id>
<product_id matchmode="1">1109</product_id>
<weight>50</weight>
<loadingmeter>0.2</loadingmeter>
<volume>0.036</volume>
<length>20</length>
<width>20</width>
<height>90</height>
</goodsline>
<goodsline>
<unitamount>12</unitamount>
<unit_id matchmode="1">drums</unit_id>
<product_id matchmode="1">1109</product_id>
<weight>345</weight>
</goodsline>
</goodslines>
Code
public static string Format(string xml)
{
try
{
var stringBuilder = new StringBuilder();
var element = XDocument.Parse(xml);
var settings = new XmlWriterSettings
{
OmitXmlDeclaration = true,
Indent = true,
IndentChars = new string(' ', 3),
NewLineChars = Environment.NewLine,
NewLineOnAttributes = false,
NewLineHandling = NewLineHandling.Replace
};
using (var xmlWriter = XmlWriter.Create(stringBuilder, settings))
element.Save(xmlWriter);
return stringBuilder.ToString();
}
catch(Exception ex)
{
return "Unable to format XML" + ex;
}
}
public static bool Compare(string xmlA, string xmlB)
{
if(xmlA == null || xmlB == null)
return false;
var xmlFormattedA = Format(xmlA);
var xmlFormattedB = Format(xmlB);
return xmlFormattedA.Equals(xmlFormattedB, StringComparison.InvariantCultureIgnoreCase);
}
public static string NoText(string request)
{
string pattern = @"<.*?>";
Regex rg = new Regex(pattern);
var noTextArr = rg.Matches(request)
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
string noText = string.Join("", noTextArr);
return noText;
}
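One possible approach, offered as a sketch rather than a definitive answer: instead of comparing formatted strings, walk both trees and check that every element in the user input is allowed by the template. XmlStructure and Conforms are hypothetical names. The sketch assumes that repeated elements and missing children are acceptable, and it ignores attributes and values.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XmlStructure
{
    // Hypothetical helper: true when 'input' uses only elements that the
    // template allows at the same position in the tree.
    public static bool Conforms(XElement template, XElement input)
    {
        if (template.Name != input.Name) return false;
        // Every child of the input needs a same-named counterpart in the template;
        // repeats and omissions are accepted.
        return input.Elements().All(child =>
        {
            XElement allowed = template.Element(child.Name);
            return allowed != null && Conforms(allowed, child);
        });
    }
}
```

With the documents above this would be called as XmlStructure.Conforms(XDocument.Parse(validXml).Root, XDocument.Parse(userXml).Root), where validXml and userXml stand for the two strings.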

Get var from Other file in c# [closed]

I have three files and want to use variable values from one file in another. The first file is:
public class Dialogs
{
public Dictionary<string, Phrase> Phrases = new Dictionary<string, Phrase>();
}
public class Phrase
{
public string PhraseID = null;
public string Role = null;
}
The second one is:
public class DiaLoader
{
public Dialogs LoadDialog()
{
// Variables
List<Phrase> phrasesList = new List<Phrase>();
Dialogs resultDialog = new Dialogs();
XmlDocument doc = new XmlDocument();
doc.LoadXml("<Phrase Role="2"></Phrase>
<Phrase Role="2"></Phrase>
<Phrase Role="1"></Phrase>
<Phrase Role="1"></Phrase>
<Phrase Role="2"></Phrase>
<Phrase Role="1"></Phrase>
<Phrase Role="2"></Phrase>");
//get all the phrases
XmlNodeList phrases = xml.GetElementsByTagName("Phrase");
foreach (XmlNode phraseNode in phrases)
{
Phrase phrase = NodeToPhrase(phraseNode);
phrasesList.Add(phrase);
}
//Phrase node to phrase
private Phrase NodeToPhrase(XmlNode node)
{
Phrase phrase = new Phrase();
XmlNode roleNode = node.Attributes["Role"];
if (roleNode != null && !string.IsNullOrEmpty(roleNode.Value))
phrase.Role = roleNode.Value;
return phrase;
}
}
As you can see, I set the string Role value from XML in the second file. In the third file, I want to create a variable and get the value of Role. How can I do that? My code is here:
DiaLoader dia= new DiaLoader();
public void Export(dia.LoadDialog()) {
XmlDocument doc = new XmlDocument();
string myVar = Phrase.Role;//this can't the value of role
}
I see in your example that you have some phrases in an XML file, something like:
<Phrases>
<Phrase Role="2">Example 1</Phrase>
<Phrase Role="2">Example 2</Phrase>
<Phrase Role="1">Example 3</Phrase>
<Phrase Role="1">Example 4</Phrase>
<Phrase Role="2">Example 5</Phrase>
<Phrase Role="1">Example 6</Phrase>
<Phrase Role="2">Example 7</Phrase>
</Phrases>
You want to read all these phrases into a dictionary and later retrieve the text for a certain role.
So I modified your code a little so that it compiles.
using System.Collections.Generic;
using System.Linq;
using System.Xml;
namespace Test001
{
public class Dialogs
{
private static string DEFAULT_DATA =
"<Phrases>" +
"<Phrase Role=\"2\">Example 1</Phrase>" +
"<Phrase Role=\"2\">Example 2</Phrase>" +
"<Phrase Role=\"1\">Example 3</Phrase>" +
"<Phrase Role=\"1\">Example 4</Phrase>" +
"<Phrase Role=\"2\">Example 5</Phrase>" +
"<Phrase Role=\"1\">Example 6</Phrase>" +
"<Phrase Role=\"2\">Example 7</Phrase>" +
"</Phrases>"
;
private int nextID;
private Dictionary<string, Phrase> Phrases = new Dictionary<string, Phrase>();
public List<Phrase> PhrasesList
{
get
{
return this.Phrases.Values.ToList();
}
}
public Dialogs()
{
this.Phrases = new Dictionary<string, Phrase>();
this.nextID = 0;
}
public bool Load(string filename = null)
{
this.Phrases.Clear();
this.nextID = 0;
XmlDocument doc = new XmlDocument();
try
{
if (filename == null)
{
doc.LoadXml(DEFAULT_DATA);
}
else
{
doc.Load(filename);
}
}
catch
{
// Error loading data
return false;
}
// Get all the phrases
XmlNodeList phrases = doc.GetElementsByTagName("Phrase");
foreach (XmlNode phraseNode in phrases)
{
Phrase phrase = NodeToPhrase(phraseNode);
this.Add(phrase);
}
return true;
}
public void Add(Phrase phrase)
{
this.Phrases.Add(this.nextID.ToString(), phrase);
this.nextID++;
}
// Parse a xml node to a phrase
private Phrase NodeToPhrase(XmlNode node)
{
Phrase phrase = new Phrase();
XmlNode roleNode = node.Attributes["Role"];
if (roleNode != null && !string.IsNullOrEmpty(roleNode.Value))
{
phrase.Role = roleNode.Value;
phrase.PhraseID = this.nextID.ToString();
if (node.HasChildNodes)
{
phrase.Text = node.FirstChild.Value;
}
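// nextID is advanced in Add(), so it is not incremented here;
// otherwise PhraseID and the dictionary key would drift apart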
}
return phrase;
}
}
}
I left the Phrase class unchanged, except for a new field that stores the text:
public class Phrase
{
public string PhraseID = null;
public string Role = null;
public string Text = null;
}
And for the usage, it would be something like this:
Dialogs dia = new Dialogs();
// dia.Load("full_path_to_your_nice_xml_file.xml")
dia.Load(); // Load default xml data just for testing purposes
var myVar = dia.PhrasesList.Find(phrase => phrase.Role == "2").Text;
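If only the lookup by role is needed, the same data can also be read with LINQ to XML. This is a minimal sketch, not a replacement for the classes above; PhraseReader and TextsForRole are hypothetical names, and it assumes the <Phrases>/<Phrase Role="..."> layout shown earlier.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PhraseReader
{
    // Hypothetical helper: returns the text of every <Phrase> whose Role
    // attribute equals 'role', in document order.
    public static List<string> TextsForRole(string xml, string role)
    {
        return XDocument.Parse(xml)
            .Descendants("Phrase")
            // Casting a missing attribute to string yields null, so no null check is needed
            .Where(p => (string)p.Attribute("Role") == role)
            .Select(p => p.Value)
            .ToList();
    }
}
```

Because it returns a list, a role with no phrases gives an empty result instead of a null reference.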

How to get the unescaped length of XElement inner text?

I am trying to parse the following Java resources file, which is XML.
I am parsing it with C# and XDocument, so this is not a Java question.
<?xml version="1.0" encoding="utf-8"?>
<resources>
<string name="problem">&#160;test&#160;</string>
<string name="no_problem"> test </string>
</resources>
The problem is that the XDocument.Load(string path) method loads this as an XDocument with 2 XElements that look identical.
I load the file.
string filePath = @"c:\res.xml"; // whatever
var xDocument = XDocument.Load(filePath);
When I parse the XDocument object, here is the problem.
foreach (var node in xDocument.Root.Nodes())
{
if (node.NodeType == XmlNodeType.Element)
{
var xElement = node as XElement;
if (xElement != null) // just to be sure
{
var elementText = xElement.Value;
Console.WriteLine("Text = '{0}', Length = {1}",
elementText, elementText.Length);
}
}
}
This produces the following 2 lines :
"Text = ' test ', Length = 6"
"Text = ' test ', Length = 6"
I want to get the following 2 lines :
"Text = ' test ', Length = 6"
"Text = '&#160;test&#160;', Length = 16"
Document encoding is UTF8, if this is relevant somehow.
string filePath = @"c:\res.xml"; // whatever
var xDocument = XDocument.Load(filePath);
String one = (xDocument.Root.Nodes().ElementAt(0) as XElement).Value;//< test >
String two = (xDocument.Root.Nodes().ElementAt(1) as XElement).Value;//< test >
Console.WriteLine(one == two); //false
Console.WriteLine(String.Format("{0} {1}", (int)one[0], (int)two[0]));//160 32
You have two different strings: the &#160; is there, but as the actual Unicode character (code 160, a non-breaking space) rather than as the escape sequence.
One possible way to get things back is to manually replace the non-breaking space with " ":
String result = one.Replace(((char) 160).ToString(), " ");
Thanks to Dmitry, following his suggestion, I have made a function to make stuff work for a list of unicode codes.
private static readonly List<int> UnicodeCharCodesReplace =
new List<int>() { 160 }; // put integers here
public static string UnicodeUnescape(this string input)
{
var chars = input.ToCharArray();
var sb = new StringBuilder();
foreach (var c in chars)
{
if (UnicodeCharCodesReplace.Contains(c))
{
// Append &#code; instead of character
sb.Append("&#");
sb.Append(((int) c).ToString());
sb.Append(";");
}
else
{
// Append character itself
sb.Append(c);
}
}
return sb.ToString();
}
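The behaviour described in this thread can be reproduced without a file. The sketch below is an illustration under assumptions: NbspDemo and Escape are hypothetical names, and the set of character codes to escape is passed in by the caller.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Xml.Linq;

public static class NbspDemo
{
    // Hypothetical helper: writes the selected characters back as numeric
    // character references such as &#160; and copies everything else as-is.
    public static string Escape(string input, ICollection<int> codes)
    {
        var sb = new StringBuilder();
        foreach (char c in input)
        {
            if (codes.Contains(c)) sb.Append("&#").Append((int)c).Append(';');
            else sb.Append(c);
        }
        return sb.ToString();
    }

    // Parses a one-element document and returns the decoded element text.
    public static string Parsed(string xml)
    {
        return XElement.Parse(xml).Value;
    }
}
```

For example, NbspDemo.Parsed("<string>&#160;test&#160;</string>") has length 6 because the parser has already decoded each reference to a single character, and passing that value to NbspDemo.Escape with new[] { 160 } gives back the 16-character escaped form.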

read only particular instance using xmlreader

I have an XML file that looks like this:
<Name>AAA</Name>
<Age>23</Age>
<I1>
<Element1>A</Element1>
<Element2>B</Element2>
<Element3>C</Element3>
</I1>
<I2>
<Element1>AA</Element1>
<Element2>BB</Element2>
<Element3>CC</Element3>
</I2>
I am reading the values of all elements using XmlReader in C# 3.0. Now I need to change this so that only the values between a particular start and end tag are read. For the XML file above, I need to read <Name> and <Age> by default, and I have a function that returns either "I1" or "I2", which are element names. If it returns "I1", I should read only the elements between <I1> and </I1> and skip <I2>, and vice versa. The code structure would be something like this (just the logic; please ignore the syntax errors):
/******function that returns element name I1 or I2*********/
string elementName = func(a,b);
xmlreader reader = reader.create("test.xml");
while(reader.read())
{
switch(reader.nodetype)
{
case xmlnodetype.element:
string nodeName = reader.name
break;
case xmlnodetype.text
switch(nodeName)
{
/*************Read default name and age values*******/
case "Name":
string personName = reader.value
break;
case "Age":
string personAge = reader.value;
break;
/*******End of reading default values**************/
/*** read only elements between the value returned by function name above
If function returns I1 then i need to read only values between <I1> </I1> else read </I2> and </I2>**/
}
}
}
Thanks!
Since there are no other tags to go on, I am assuming that your complete file looks something like this:
<?xml version="1.0" encoding="utf-8" ?>
<RootElement>
<UserInformation>
<Name>AAA</Name>
<Age>23</Age>
<I1>
<Element1>A</Element1>
<Element2>B</Element2>
<Element3>C</Element3>
</I1>
<I2>
<Element1>AA</Element1>
<Element2>BB</Element2>
<Element3>CC</Element3>
</I2>
</UserInformation>
</RootElement>
Then, to read it:
System.IO.StreamReader sr = new System.IO.StreamReader("test.xml");
String xmlText = sr.ReadToEnd();
sr.Close();
List<UserInfo> finalList = readXMLDoc(xmlText);
if(finalList != null)
{
//do something
}
private List<UserInfo> readXMLDoc(String fileText)
{
//create a list of Strings to hold our user information
List<UserInfo> userList = new List<UserInfo>();
try
{
//create a XmlDocument Object
XmlDocument xDoc = new XmlDocument();
//load the text of the file into the XmlDocument Object
xDoc.LoadXml(fileText);
//Create a XmlNode object to hold the root node of the XmlDocument
XmlNode rootNode = null;
//get the root element in the xml document
for (int i = 0; i < xDoc.ChildNodes.Count; i++)
{
//check to see if we hit the root element
if (xDoc.ChildNodes[i].Name == "RootElement")
{
//assign the root node
rootNode = xDoc.ChildNodes[i];
break;
}
}
//Loop through each of the child nodes of the root node
for (int j = 0; j < rootNode.ChildNodes.Count; j++)
{
//check for the UserInformation tag
if (rootNode.ChildNodes[j].Name == "UserInformation")
{
//assign the item node
XmlNode userNode = rootNode.ChildNodes[j];
//create userInfo object to hold results
UserInfo userInfo = new UserInfo();
//loop through each if the user tag's elements
foreach (XmlNode subNode in userNode.ChildNodes)
{
//check for the name tag
if (subNode.Name == "Name")
{
userInfo._name = subNode.InnerText;
}
//check for the age tag
if (subNode.Name == "Age")
{
userInfo._age = subNode.InnerText;
}
String tagToLookFor = "CallTheMethodThatReturnsTheCorrectTag";
//check for the tag
if (subNode.Name == tagToLookFor)
{
foreach (XmlNode elementNode in subNode.ChildNodes)
{
//check for the element1 tag
if (elementNode.Name == "Element1")
{
userInfo._element1 = elementNode.InnerText;
}
//check for the element2 tag
if (elementNode.Name == "Element2")
{
userInfo._element2 = elementNode.InnerText;
}
//check for the element3 tag
if (elementNode.Name == "Element3")
{
userInfo._element3 = elementNode.InnerText;
}
}
}
}
//add the userInfo to the list
userList.Add(userInfo);
}
}
}
catch (Exception e)
{
System.Windows.Forms.MessageBox.Show(e.Message);
return null;
}
//return the list
return userList;
}
//struct to hold information
struct UserInfo
{
public String _name;
public String _age;
public String _element1;
public String _element2;
public String _element3;
}
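Since the question asked about XmlReader specifically, here is a streaming sketch of the same idea. It is only one way to do it: SectionReader and Read are hypothetical names, it assumes the document has a single root element as in the sample above, and it keeps a list of open element names so that each text node can be matched against its parent.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SectionReader
{
    // Hypothetical helper: collects Name, Age and the children of the
    // requested section (for example "I1"), ignoring all other sections.
    public static Dictionary<string, string> Read(string xml, string sectionName)
    {
        var values = new Dictionary<string, string>();
        var path = new List<string>(); // names of the currently open elements
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Empty elements such as <a/> produce no EndElement node
                        if (!reader.IsEmptyElement) path.Add(reader.Name);
                        break;
                    case XmlNodeType.EndElement:
                        path.RemoveAt(path.Count - 1);
                        break;
                    case XmlNodeType.Text:
                        string element = path[path.Count - 1];
                        string parent = path.Count > 1 ? path[path.Count - 2] : null;
                        if (element == "Name" || element == "Age" || parent == sectionName)
                            values[element] = reader.Value;
                        break;
                }
            }
        }
        return values;
    }
}
```

The loop only ever calls Read(), so it avoids the common pitfall of mixing Read() with methods that advance the reader themselves.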

XDocument.ToString() drops XML Encoding Tag

Is there any way to get the XML encoding declaration in the output of the ToString() method?
Example:
xml.Save("myfile.xml");
leads to
<?xml version="1.0" encoding="utf-8"?>
<Cooperations>
<Cooperation>
<CooperationId>xxx</CooperationId>
<CooperationName>Allianz Konzern</CooperationName>
<LogicalCustomers>
But
tb_output.Text = xml.ToString();
leads to an output like this
<Cooperations>
<Cooperation>
<CooperationId>xxx</CooperationId>
<CooperationName>Allianz Konzern</CooperationName>
<LogicalCustomers>
...
Either explicitly write out the declaration, or use a StringWriter and call Save():
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;
class Test
{
static void Main()
{
string xml = @"<?xml version='1.0' encoding='utf-8'?>
<Cooperations>
<Cooperation />
</Cooperations>";
XDocument doc = XDocument.Parse(xml);
StringBuilder builder = new StringBuilder();
using (TextWriter writer = new StringWriter(builder))
{
doc.Save(writer);
}
Console.WriteLine(builder);
}
}
You could easily add that as an extension method:
public static string ToStringWithDeclaration(this XDocument doc)
{
if (doc == null)
{
throw new ArgumentNullException("doc");
}
StringBuilder builder = new StringBuilder();
using (TextWriter writer = new StringWriter(builder))
{
doc.Save(writer);
}
return builder.ToString();
}
This has the advantage that it won't go bang if there isn't a declaration :)
Then you can use:
string x = doc.ToStringWithDeclaration();
Note that that will use utf-16 as the encoding, because that's the implicit encoding in StringWriter. You can influence that yourself though by creating a subclass of StringWriter, e.g. to always use UTF-8.
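The StringWriter subclass mentioned above could look like the following minimal sketch. Utf8StringWriter and ToUtf8DeclaredString are hypothetical names chosen for illustration. Only the encoding reported in the declaration changes; the result is still an ordinary .NET string.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

public static class XDocumentSaveExtensions
{
    // Saves the document through the UTF-8 writer so that the declaration
    // says encoding="utf-8".
    public static string ToUtf8DeclaredString(this XDocument doc)
    {
        using (var writer = new Utf8StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Usage would then be string x = doc.ToUtf8DeclaredString();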
The Declaration property will contain the XML declaration. To get the contents plus declaration, you can do the following:
tb_output.Text = xml.Declaration.ToString() + xml.ToString();
Use this:
output.Text = String.Concat(xml.Declaration.ToString(), xml.ToString());
I did it like this:
string distributorInfo = string.Empty;
XDocument distributors = new XDocument();
//below is important else distributors.Declaration.ToString() throws null exception
distributors.Declaration = new XDeclaration("1.0", "utf-8", "yes");
XElement rootElement = new XElement("Distributors");
XElement distributor = null;
XAttribute id = null;
distributor = new XElement("Distributor");
id = new XAttribute("Id", "12345678");
distributor.Add(id);
rootElement.Add(distributor);
distributor = new XElement("Distributor");
id = new XAttribute("Id", "22222222");
distributor.Add(id);
rootElement.Add(distributor);
distributors.Add(rootElement);
distributorInfo = String.Concat(distributors.Declaration.ToString(), distributors.ToString());
This is what I get in distributorInfo:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Distributors>
<Distributor Id="12345678" />
<Distributor Id="22222222" />
</Distributors>
Similar to the other +1 answers, but a bit more detail about the declaration, and a slightly more accurate concatenation.
The <?xml ?> declaration should be on its own line in formatted XML, so I make sure a newline is added.
NOTE: this uses Environment.NewLine, so it produces the platform-specific newline.
// Parse XML declaration method
XDocument document1 =
XDocument.Parse(@"<?xml version=""1.0"" encoding=""iso-8859-1""?><rss version=""2.0""></rss>");
string result1 =
document1.Declaration.ToString() +
Environment.NewLine +
document1.ToString() ;
// Declare xml declaration method
XDocument document2 =
XDocument.Parse(@"<rss version=""2.0""></rss>");
document2.Declaration =
new XDeclaration("1.0", "iso-8859-1", null);
string result2 =
document2.Declaration.ToString() +
Environment.NewLine +
document2.ToString() ;
Both results produce:
<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0"></rss>
A few of these answers solve the poster's request, but seem overly complicated. Here's a simple extension method that avoids the need for a separate writer, handles a missing declaration and supports the standard ToString SaveOptions parameter.
public static string ToXmlString(this XDocument xdoc, SaveOptions options = SaveOptions.None)
{
var newLine = (options & SaveOptions.DisableFormatting) == SaveOptions.DisableFormatting ? "" : Environment.NewLine;
return xdoc.Declaration == null ? xdoc.ToString(options) : xdoc.Declaration + newLine + xdoc.ToString(options);
}
To use the extension, just replace xml.ToString() with xml.ToXmlString()
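You can also use an XmlWriter and call its WriteStartDocument() method, which writes the XML declaration. (WriteDocType() writes a <!DOCTYPE> declaration, not the XML declaration.)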
string uploadCode = "UploadCode";
string LabName = "LabName";
XElement root = new XElement("TestLabs");
foreach (var item in returnList)
{
root.Add(new XElement("TestLab",
new XElement(uploadCode, item.UploadCode),
new XElement(LabName, item.LabName)
)
);
}
XDocument returnXML = new XDocument(new XDeclaration("1.0", "UTF-8","yes"),
root);
string returnVal;
using (var stream = new MemoryStream())
{
using (var strw = new StreamWriter(stream, System.Text.Encoding.UTF8))
{
returnXML.Save(strw);
strw.Flush(); // make sure buffered text reaches the MemoryStream before it is read
returnVal = System.Text.Encoding.UTF8.GetString(stream.ToArray());
}
}
// returnVal now holds the XML data, including the XML declaration
Here is an extension method that includes the XML declaration. It uses string interpolation and adds a newline after the declaration, since that seems to be the usual convention.
public static class XDocumentExtensions
{
public static string ToStringIncludeXmlDeclaration(this XDocument doc)
{
// Prepend the declaration and a newline only when the document has one
string declaration = doc.Declaration != null ? doc.Declaration.ToString() + Environment.NewLine : string.Empty;
return $"{declaration}{doc}";
}
}
Usage:
tb_output.Text = xml.ToStringIncludeXmlDeclaration();
