Save XDocument without any formatting changes - c#

I have an XML file in which I have to replace the value of a single element. For this I load the XML file into an XDocument: var camtXml = XDocument.Load(fileStream); After I make my changes and save the XDocument to a file, there are several changes that shouldn't have been made. As you can see in the following picture (left side: file from XDocument, right side: original file):
The UTF-8 was changed from upper to lower case, CR line feeds were added, and
the indentation was changed by removing whitespace. I really want to use XDocument because its library makes it easy to create and iterate through XElements. But the formatting changes are a showstopper. Is there a way to prevent these formatting changes, or is there an alternative to XDocument with the same options like XPath, XElement etc.?
I found this, but it didn't solve my problem:
XDocument how to save without Byte Order Mark AND preseve formatting/whitespace

Related

What's the best way to update xml in a file?

I have been looking all over for the best way to update xml in a file. I have just switched over to using XmlReader (coming from the XDocument method) for speed (not having to read the entire file in memory).
My XmlReader method works perfectly: when I need to read a value, it opens the XML, starts reading, reads ONLY up to the node needed, and then closes everything. It's very fast and effective.
Now that I have that working I want to make a method that UPDATES xml that is already in place. I would like to keep to the same idea and ONLY read in memory what is needed. So the idea would be, read up until the node I'm changing then use the writer to UPDATE that value.
Everything I have seen has an XmlReader reading while an XmlWriter writes everything out. If I did that, I assume I would have to let it run through the entire file, just like XDocument would. As an example, see this answer.
Is it possible to maybe just use the reader and read up to the node I'm trying to edit then change the innerxml or something?
What's the fastest and most efficient method to update XML in a file?
I would like to only read into memory what I'm trying to edit, not the whole file.
I would also like to account for nodes that do not exist (that need to be added).
By design, XmlReader represents a "read-only, forward-only" view of the document and cannot be used to update the content. Using the Load method of XmlDocument, XDocument or XElement will still cause the entire file to be read into memory. (Under the hood, XDocument and XElement still use an XmlReader.) However, you can combine a raw XmlReader with XElement by using the overloads of the Load method that take an XmlReader.
You don't describe your XML structure, but you would want to do something similar to this:
var reader = XmlReader.Create(@"file://c:\test.xml");
var document = XElement.Load(reader);
document.Add(new XElement("branch", "leaves"));
document.Save("Tree.xml");
To find a specific node (for example, with a specific attribute value), you'd want to do something similar to this:
var node = document.Descendants("branch")
.SingleOrDefault(e => (string)e.Attribute("name") == "foo");

Convert all HTML entities not predefined for XML to unicode

I am trying to manipulate a string containing HTML code and then save the content to an .htm file. Afterwards the .htm file is imported into a Word file. The goal is to append a document formatted in HTML to a Word document. This process is part of a much larger program and I cannot modify the given parameters.
To easily modify the HTML code, I thought using XDocument would be a great idea.
So I tried this:
void AppendContent(string content, Document doc)
{
string filePath = ...; //somewhere in /AppData/Local
var xDoc = XDocument.Parse(content);
// code left out because irrelevant
// Finding all "img" elements, in order to
// extract the embedded picture and save it as external file
FileHelper.SaveToFile(filePath, xDoc.ToString());
//... After this, the file is appended to the word file (the one in doc)
}
The first attempt actually worked with a small test HTML. Using any of the big documents I'm trying to append to the Word document causes an exception to be thrown:
XDocument.Parse cannot parse entities like "nbsp" or "uuml" (German ü). I already found out that XML only supports a handful of predefined entities, so I would have to manually add the definitions to the HTML file. This is not an option, because this operation is supposed to work with ANY HTML file.
I found following fix:
var decodedContent = WebUtility.HtmlDecode(content);
var xDoc = XDocument.Parse(decodedContent);
This converts all entities to the characters they represent. So "uuml" is converted to "ü", etc. This worked until I hit a document that contained the "amp" entity, which is then converted to "&"... and so XDocument.Parse complains again.
I'm looking for a way to convert HTML entities to a Unicode representation ("\0x1234") or an HTML decode that does not decode the XML-predefined entities.
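One possible approach to the request above, as a minimal sketch: rewrite every named entity except the five XML-predefined ones (amp, lt, gt, quot, apos) as a numeric character reference, which an XML parser accepts without a DTD. The class name HtmlEntityConverter and the method name ToXmlSafe are hypothetical; the sketch relies on WebUtility.HtmlDecode to look up entity names, so entities it does not recognize are left untouched and would still fail to parse.

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class HtmlEntityConverter
{
    // Matches named entities such as &nbsp; or &uuml; (not numeric ones like &#160;).
    private static readonly Regex NamedEntity =
        new Regex(@"&([A-Za-z][A-Za-z0-9]*);", RegexOptions.Compiled);

    // Hypothetical helper: replaces HTML-only named entities with numeric
    // character references and keeps the XML-predefined entities as they are.
    public static string ToXmlSafe(string html)
    {
        return NamedEntity.Replace(html, match =>
        {
            switch (match.Groups[1].Value)
            {
                case "amp":
                case "lt":
                case "gt":
                case "quot":
                case "apos":
                    return match.Value; // already valid in XML
            }

            string decoded = WebUtility.HtmlDecode(match.Value);
            if (decoded == match.Value)
            {
                return match.Value; // name not known to HtmlDecode, leave as-is
            }

            // Emit one &#x...; reference per code point of the decoded text.
            var result = new StringBuilder();
            for (int i = 0; i < decoded.Length; i += char.IsSurrogatePair(decoded, i) ? 2 : 1)
            {
                result.Append("&#x")
                      .Append(char.ConvertToUtf32(decoded, i).ToString("X"))
                      .Append(';');
            }
            return result.ToString();
        });
    }
}
```

The converted string can then be passed to XDocument.Parse in place of the output of WebUtility.HtmlDecode(content). Note that this only addresses entities; HTML that is not well-formed XML for other reasons (unclosed tags, unquoted attributes) will still be rejected.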

When saving XML file with XElement, alignment in file changes as well, how to avoid?

I am using
XElement root = XElement.Load(filepath);
to load XML file, then finding things that I need.
IEnumerable<XElement> commands = from command in MyCommands
where (string) command.Attribute("Number") == Number
select command;
foreach (XElement command in commands)
{
command.SetAttributeValue("Group", GroupFound);
}
When I am done with my changes, I save the file with the following code.
root.Save(filepath);
When the file is saved, all the lines in my XML file are affected. Visual Studio aligns all the lines by default, but I need to keep the original file format.
I cannot alter any part of the document except the Group attribute values set by command.SetAttributeValue("Group", GroupFound).
You would need to do:
XElement root = XElement.Load(filepath, LoadOptions.PreserveWhitespace);
then do:
root.Save(filepath, SaveOptions.DisableFormatting);
This will preserve your original whitespace through the use of LoadOptions and SaveOptions.
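A minimal end-to-end sketch of that suggestion, working on strings so it can be tried without a file. The class name WhitespaceRoundTrip, the method name SetGroup and the element name Command are assumptions for illustration; the question does not show the real element names. XElement.Parse and ToString take the same LoadOptions and SaveOptions as Load and Save.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class WhitespaceRoundTrip
{
    // Sets Group on every <Command> whose Number attribute matches, while
    // keeping the whitespace text between elements as it was in the input.
    public static string SetGroup(string xml, string number, string group)
    {
        // PreserveWhitespace keeps insignificant whitespace as text nodes.
        XElement root = XElement.Parse(xml, LoadOptions.PreserveWhitespace);

        foreach (XElement command in root.Descendants("Command")
                     .Where(c => (string)c.Attribute("Number") == number))
        {
            command.SetAttributeValue("Group", group);
        }

        // DisableFormatting stops the writer from adding its own indentation.
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

This keeps whitespace between elements only. Details inside a tag, such as single versus double attribute quotes or extra spaces between attributes, are not stored in the tree and are written in the writer's own style, and line endings may still be normalized by the underlying writer.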
The information you're looking to preserve is lost to begin with in the XDocument.
XDocument doesn't care whether your elements had tabs or spaces on the line in front of them, whether there are multiple whitespace characters between attributes, and so on. If you want to rely on the Save() method, you have to give up the idea that you can preserve formatting.
To preserve formatting you'll need to add custom processing and figure out precisely where to make changes. Alternatively, you may be able to adjust your save options to match the formatting you have, if your XML comes from a machine and is not human-edited.
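For the second option, adjusting the save options to match machine-generated formatting, a sketch using XmlWriterSettings might look like this. The class name LayoutMatchingSave is hypothetical, and the chosen values (tab indentation, LF line endings, no declaration, UTF-8 without a byte order mark) are assumptions standing in for whatever the original file actually uses.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class LayoutMatchingSave
{
    // Writes the tree with an explicitly chosen layout instead of the
    // defaults that XElement.Save(string) would apply.
    public static void Save(XElement root, Stream output)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "\t",                 // assumed: original uses tabs
            NewLineChars = "\n",                // assumed: original uses LF
            OmitXmlDeclaration = true,          // assumed: original has no declaration
            Encoding = new UTF8Encoding(false)  // UTF-8 without a byte order mark
        };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            root.Save(writer);
        }
    }
}
```

This only works when the original layout is regular enough to be described by a handful of settings; hand-edited files with irregular indentation cannot be reproduced this way.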

Remove Empty Line From XML Document

I am currently facing a problem. I am loading an XML file in C#, removing some nodes from it and appending some nodes. The problem is that when I remove nodes from the XML file, some empty lines are created automatically, and I want to remove these lines.
And when I append nodes to the parent node in the XML, I want a new line after each ending tag.
For Eg. My Xml file is
<intro id="S0001">
<title>Introduction Title</title>
<para>This is a paragraph. Note that paragraphs can contain other block–level objects, such as lists, as well as directly containing text.</para>
<para>The introduction can contain all of the text objects that a section can contain, except that it cannot be divided into parts, sections and sub–sections.</para>
<para>The introduction can contain tables:</para>
</intro><part>
<no>Part A</no> Article Structure <sup>&lpar;Part Title&rpar;</sup><section1 id="S0002">
<no>Sect 1</no>
<title>First Section in Part 1 <sup>&lpar;Section 1 Title&rpar;</sup></title>
<shortsectionhead>Short Section Header</shortsectionhead>
<para>This is a section in the first part of the article.</para>
</section1><section1 id="S0003">
Code:
XmlNode partNnode = xmlDoc.SelectSingleNode("//part");
XmlNode introNode=xmlDoc.SelectSingleNode("//intro");
XmlDocumentFragment newNode=xmlDoc.CreateDocumentFragment();
newNode.InnerXml=partNnode.OuterXml;
introNode.ParentNode.InsertAfter(newNode,introNode);
partNnode.ParentNode.RemoveChild(partNnode);
partNnode = xmlDoc.SelectSingleNode("//part");
XmlNodeList nodeList = xmlDoc.SelectNodes("//section1");
foreach (XmlNode refrangeNode in nodeList)
{
newNode=xmlDoc.CreateDocumentFragment();
newNode.InnerXml=refrangeNode.OuterXml;
partNnode.AppendChild(newNode);
}
Please help me
Thanks in advance
If you load and save an XML file with C#, then the XML should be formatted correctly (an easy way to format strange-looking XML files is just to load and save them with some C# code).
If I understand your question correctly, then you are just not happy with the format of the XML file?
Like you want (A):
</intro><part>
But you get (B):
</intro>
<part>
If that is the question, then, in my eyes, you just want a strange thing. Because...
a) Code doesn't care how the XML file is formatted and
b) The format in (B) is the correct one
If you, for whatever reason, want to change it, then you have to process the XML file yourself, opening it as a string and checking manually for closing and opening tags.

Appending an XElement to an xml file without saving the whole file. C#

Is it possible in C# to append an XElement to an already existing XML file without saving the whole XML, writing just the new element?
So I don't want something like this, since it will write the whole XML to disk.
XDocument document = XDocument.Load("filename");
document.Root.Add(new XElement("name", "content"));
document.Save("filename");
thanks in advance.
Yes, but only by getting a bit more low level than in your example.
In an XML file you can only have one root element, so if you simply append to the file to add a new element, you will create a broken XML file.
However, you could read from the end of the file and parse it to find the start of the root element's end-tag (which would give you a file Position). Then you could open the file as a FileStream for writing, set the write Position to the start of the root-end-tag, and then write your new element to the stream as normal. Then you'd have to complete the file "manually" by appending text to add a new root-end-tag.
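The steps above can be sketched roughly as follows. This is an illustration under several assumptions, not a robust implementation: the file is UTF-8, the root element is not self-closing, the last "</" in the file starts the root end-tag (so no comment or processing instruction follows it), and the end-tag lies within the last tailSize bytes. The class name XmlTailAppender and the method name AppendToRoot are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlTailAppender
{
    // Appends 'element' as the last child of the root by overwriting the
    // file from the root end-tag onwards; the rest of the file is not rewritten.
    public static void AppendToRoot(string path, XElement element, int tailSize = 4096)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            // Read only the tail of the file.
            int length = (int)Math.Min(tailSize, stream.Length);
            long tailStart = stream.Length - length;
            var tail = new byte[length];
            stream.Seek(tailStart, SeekOrigin.Begin);
            int read = 0;
            while (read < length)
            {
                int n = stream.Read(tail, read, length - read);
                if (n == 0) break;
                read += n;
            }

            // Scan backwards for "</", assumed to start the root end-tag.
            int index = -1;
            for (int i = read - 2; i >= 0; i--)
            {
                if (tail[i] == (byte)'<' && tail[i + 1] == (byte)'/')
                {
                    index = i;
                    break;
                }
            }
            if (index < 0)
            {
                throw new InvalidDataException("No end-tag found in the tail of the file.");
            }

            // Keep the end-tag and anything after it (e.g. a trailing newline).
            string endTag = Encoding.UTF8.GetString(tail, index, read - index);
            byte[] payload = Encoding.UTF8.GetBytes(
                element.ToString(SaveOptions.DisableFormatting) + endTag);

            // Overwrite from the end-tag position: new element, then end-tag.
            stream.Seek(tailStart + index, SeekOrigin.Begin);
            stream.Write(payload, 0, payload.Length);
            stream.SetLength(stream.Position);
        }
    }
}
```

A call such as XmlTailAppender.AppendToRoot("filename", new XElement("name", "content")) would then touch only the end of the file. Because the document is never parsed, nothing here verifies that the file is well-formed XML before or after the write.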
