Append XML to file without writing it from scratch? [duplicate] - c#

The standard way to append to an XML file in LINQ-to-XML is to read it in, modify the in-memory document, and write the whole file out from scratch. For example (here ns is an XNamespace and childId is the new attribute value):
XDocument doc = XDocument.Load("pathToDoc.xml");
doc.Root.Add(new XElement(ns + "anotherChild", new XAttribute("child-id", childId)));
doc.Save("pathToDoc.xml");
Or, with a FileStream:
using (FileStream fs = new FileStream("pathToDoc.xml", FileMode.Open, FileAccess.ReadWrite)) {
    XDocument doc = XDocument.Load(fs);
    // ns is an XNamespace; "namespace" itself is a reserved word in C#
    doc.Root.Add(new XElement(ns + "anotherChild", new XAttribute("child-id", childId)));
    fs.SetLength(0); // truncate the file before rewriting it
    using (var writer = new StreamWriter(fs, new UTF8Encoding(false))) {
        doc.Save(writer);
    }
}
However, in both cases, the existing XML document is being loaded into memory, modified in memory, and then written from scratch to the XML file. For small XML files this is OK, but for large files with hundreds or thousands of nodes this seems like a very inefficient process. Is there any way to make XDocument (or perhaps something like an XmlWriter) just append the necessary additional nodes to the existing XML document rather than blanking it out and starting from scratch?

This depends entirely on where the additional elements need to go. You can certainly implement something that removes the closing "</root>" tag, writes the additional elements, and then writes "</root>" again. However, such code is highly specific to your use case, and you probably won't find a library for it.
Your code could look like this (quick and dirty, without input checking, and assuming that the file ends exactly with the closing tag, so the root is never the self-closing <root/> and nothing, not even a newline, follows </root>):
using System.IO;
using System.Xml.Linq;
namespace XmlAddElementWithoutLoading
{
    class Program
    {
        static void Main()
        {
            var rootelement = "root";
            var doc = GetDocumentWithNewNodes(rootelement);
            var newNodes = GetXmlOfNewNodes(doc);
            using (var fs = new FileStream("pathToDoc.xml", FileMode.Open, FileAccess.ReadWrite))
            {
                using (var writer = new StreamWriter(fs))
                {
                    // Cut off the closing tag, append the new nodes, then restore the tag
                    RemoveClosingRootNode(fs, rootelement);
                    writer.Write(newNodes);
                    writer.Write("</" + rootelement + ">");
                }
            }
        }
        // Truncates the closing root tag. Assumes the tag is the last thing in the file
        // and that each of its characters occupies a single byte (ASCII root name).
        private static void RemoveClosingRootNode(FileStream fs, string rootelement)
        {
            fs.SetLength(fs.Length - ("</" + rootelement + ">").Length);
            fs.Seek(0, SeekOrigin.End);
        }
        // Returns the serialized children of the temporary root element
        private static string GetXmlOfNewNodes(XDocument doc)
        {
            var reader = doc.Root.CreateReader();
            reader.MoveToContent();
            return reader.ReadInnerXml();
        }
        // Builds an in-memory document that holds only the nodes to append
        private static XDocument GetDocumentWithNewNodes(string rootelement)
        {
            var doc = XDocument.Parse("<" + rootelement + "/>");
            var childId = "2";
            XNamespace ns = "namespace";
            doc.Root.Add(new XElement(ns + "anotherChild", new XAttribute("child-id", childId)));
            return doc;
        }
    }
}
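The answer above assumes the file ends exactly with the closing tag. As a minimal sketch of a slightly more tolerant variant, the helper below searches the last few hundred bytes for the closing tag, so trailing whitespace or a final newline does not break the truncation. The names XmlTailAppender, AppendBeforeClosingTag and tailSize are hypothetical, and the sketch assumes an ASCII-compatible encoding such as UTF-8 and a closing tag written without inner whitespace.

```csharp
using System;
using System.IO;
using System.Text;

static class XmlTailAppender
{
    // Hypothetical helper: inserts `fragment` just before the last "</rootName>" in the file.
    // Assumes an ASCII-compatible encoding and that the closing tag lies within
    // the last `tailSize` bytes of the file.
    public static void AppendBeforeClosingTag(string path, string rootName, string fragment, int tailSize = 256)
    {
        string closingTag = "</" + rootName + ">";
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            // Read only the tail of the file, not the whole document
            int count = (int)Math.Min(tailSize, fs.Length);
            var tail = new byte[count];
            fs.Seek(-count, SeekOrigin.End);
            int read = 0;
            while (read < count)
            {
                int n = fs.Read(tail, read, count - read);
                if (n == 0) break;
                read += n;
            }

            // ASCII decoding yields one char per byte, so string indexes equal byte offsets
            string text = Encoding.ASCII.GetString(tail, 0, read);
            int idx = text.LastIndexOf(closingTag, StringComparison.Ordinal);
            if (idx < 0)
                throw new InvalidDataException("Closing tag not found near the end of the file.");

            // Drop the closing tag and anything after it, then write fragment + closing tag
            fs.SetLength(fs.Length - read + idx);
            fs.Seek(0, SeekOrigin.End);
            byte[] bytes = new UTF8Encoding(false).GetBytes(fragment + closingTag);
            fs.Write(bytes, 0, bytes.Length);
        }
    }
}
```

A call such as XmlTailAppender.AppendBeforeClosingTag("pathToDoc.xml", "root", newNodes) would then take the place of the truncate-and-write steps in Main. Anything that followed the closing tag (for example a trailing newline) is discarded.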

Related

Reading XML file from website from console application

What I want to accomplish is reading an xml file from a website (http://xml.buienradar.nl/). I have been reading about what to use, but I can't see the forest for the trees! Should I be using WebRequest, or XmlDocument, or XDocument, or XmlReader, or XmlTextReader, or? I read that XmlDocument and XDocument read the whole file into memory, and XmlReader doesn't. But is that a problem in this case? What if indeed the xml file is huge?
Can someone help me find a way?
Thanks!
To read a huge XML file without loading all of it into memory, you can use the XmlReader class. Note, however, that this approach requires more code than the XDocument or even the XmlDocument solution.
var h = WebRequest.CreateHttp("http://xml.buienradar.nl/");
using (var r = h.GetResponse())
using (var resp = r.GetResponseStream())
using (var sr = new StreamReader(resp))
using (var xr = new XmlTextReader(sr))
{
    while (xr.Read())
    {
        // do something with xr,
        // for example print its current node value
        Console.WriteLine(xr.Value);
    }
}
If you want to test with a large XML file, you can try the one at http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz.
It is over 30 MB gzipped. With this method, XML processing doesn't require much memory, and it doesn't even wait for the whole file to finish downloading.
Test code:
var h = WebRequest.CreateHttp("http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz");
using (var r = h.GetResponse())
using (var resp = r.GetResponseStream())
using (var decompressed = new GZipStream(resp, CompressionMode.Decompress))
using (var sr = new StreamReader(decompressed))
using (var xr = new XmlTextReader(sr))
{
    while (xr.Read())
    {
        // do something with xr,
        // for example print its current node value
        Console.WriteLine(xr.Value);
    }
}
XmlTextReader provides fast, forward-only access to XML, and it can open a URL directly:
string url = "http://xml.buienradar.nl/";
XmlTextReader xml = new XmlTextReader(url);
while (xml.Read())
{
    Console.WriteLine(xml.Value);
}

Advantages of linq-to-xml compared to OfficeOpenXML.WordProcessing namespace

I'm working on an application that needs to generate Word documents based on user input, database values and a template. I've looked online for examples, and found many different approaches to generate word documents but I've made up my mind and decided to stick with the official Office Open XML SDK 2.5. Now I've just written a simple program that inserts a table (stored in a .xml file) into a word document:
Edit: Question down at the bottom if not interested in the code
static void Main(string[] args)
{
    XNamespace ns = XNamespace
        .Get(@"http://schemas.openxmlformats.org/wordprocessingml/2006/main");
    byte[] byteArray = File.ReadAllBytes(@"C:/Users/Alexander/Downloads/WordTest.docx");
    using (var stream = new MemoryStream())
    {
        XDocument xdoc;
        stream.Write(byteArray, 0, byteArray.Length);
        using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
        {
Then I can do 2 different things which will generate the same output.
1) Using the Open XML SDK's strongly typed classes (in the SDK they live in the DocumentFormat.OpenXml.Wordprocessing namespace):
#region Openxml.WordProcessing
var paragraphs = doc.MainDocumentPart.Document.Body.ToList();
Table tbl = new Table(File.ReadAllText(@"C:/users/alexander/downloads/tablecontent.xml"));
var bookmark = paragraphs.SelectMany(p => p.Descendants<BookmarkStart>()
    .Where(bm => bm.Id == "0")).FirstOrDefault();
doc.MainDocumentPart.Document.Body.ReplaceChild(tbl, bookmark.Parent);
#endregion Openxml.WordProcessing
2) Using Linq-To-XML:
#region LINQ TO XML
XElement xtbl;
// Dispose the file stream once the table XML has been loaded
using (var tableStream = new FileStream(@"C:/users/alexander/downloads/tablecontent.xml", FileMode.Open))
    xtbl = XElement.Load(tableStream);
using (StreamReader sr = new StreamReader(doc.MainDocumentPart.GetStream()))
using (XmlReader xr = XmlReader.Create(sr))
    xdoc = XDocument.Load(xr);
//Document - Body - Paragraphs - Runs/Bookmarks/etc.
//any way to write this more clearly in linq-to-xml?
var test = xdoc.Elements().First().Elements().First().Elements()
    .SelectMany(e => e.Elements()).ToList();
var startBookmark = test.Where(p => p.Name == XName.Get("bookmarkStart", ns.NamespaceName)
    && p.Attribute(XName.Get("id", ns.NamespaceName)).Value == "0").First();
startBookmark.Parent.ReplaceWith(xtbl);
// FileMode.Create truncates the part first, so no stale bytes remain if the new XML is shorter
using (XmlWriter xw = XmlWriter.Create(doc.MainDocumentPart.GetStream(FileMode.Create, FileAccess.Write)))
    xdoc.Save(xw);
#endregion LINQ TO XML
And finally I write the document to a new file:
using (FileStream fs =
    new FileStream(@"C:/users/alexander/downloads/WordTestModified.docx", FileMode.Create))
{
    stream.WriteTo(fs);
}
As far as I can see, the first option is easier and the code is clearer to read (no use of XName and no need for an extra StreamReader/XmlReader/XmlWriter), but are there any distinct advantages that LINQ-to-XML has over this approach? This is going to be a big application and I don't want to be limited later on.
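On the inline question in the LINQ-to-XML code (whether the chain of Elements().First() calls can be written more clearly), one possible sketch is to search with Descendants and the namespace's + operator instead of walking the tree level by level. The class and method names below are hypothetical; the only assumption about the document is that bookmarks are w:bookmarkStart elements with a w:id attribute, as in the question's code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class BookmarkLookup
{
    // WordprocessingML main namespace, as used in the question
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns the first w:bookmarkStart whose w:id equals `id`,
    // or null when there is none. The (string) cast yields null for a missing attribute
    // instead of throwing.
    public static XElement FindBookmarkStart(XDocument xdoc, string id)
    {
        return xdoc.Descendants(W + "bookmarkStart")
                   .FirstOrDefault(b => (string)b.Attribute(W + "id") == id);
    }
}
```

With such a helper, the lookup in the question would reduce to something like var startBookmark = BookmarkLookup.FindBookmarkStart(xdoc, "0"); followed by a null check before calling startBookmark.Parent.ReplaceWith(xtbl). Unlike the original traversal, Descendants searches at any depth, which may or may not be what is wanted.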

How to use XmlDocument object instead of reading XML file from drive?

I didn't know that I could use an XSD schema to deserialize a received XML file. I used xsd.exe to generate a C# class from the XSD file, and now I need to use that class to read the data into its properties, but I'm missing one thing and need help.
This is the code:
private void ParseDataFromXmlDocument_UsingSerializerClass(XmlDocument doc)
{
    XmlSerializer ser = new XmlSerializer(typeof(ClassFromXsd));
    string filename = Path.Combine("C:\\myxmls\\test", "xmlname.xml");
    ClassFromXsd myClass;
    // Dispose the file stream once deserialization is done
    using (var fs = new FileStream(filename, FileMode.Open))
    {
        myClass = ser.Deserialize(fs) as ClassFromXsd;
    }
    if (myClass != null)
    {
        // to do
    }
...
Here I read the XML file from the drive, but I want to use the XmlDocument that is passed in as the parameter. How can I adapt this code to use doc instead of the XML file on the drive?
You could write the XmlDocument to a MemoryStream and then deserialize it as you already do:
XmlDocument doc = new XmlDocument(); // stands in for the document passed in as the parameter
ClassFromXsd obj = null;
using (var s = new MemoryStream())
{
    doc.Save(s);
    var ser = new XmlSerializer(typeof (ClassFromXsd));
    s.Seek(0, SeekOrigin.Begin); // rewind before reading the stream back
    obj = (ClassFromXsd)ser.Deserialize(s);
}
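An alternative that avoids the intermediate MemoryStream is to let the serializer read the document's nodes directly through an XmlNodeReader. The sketch below is only an illustration: the ClassFromXsd shown here is a stand-in with a single made-up Name property (the real class is the one generated by xsd.exe), and DocDeserializer is a hypothetical helper name.

```csharp
using System.Xml;
using System.Xml.Serialization;

// Stand-in for the xsd.exe-generated class; the real one has its own members
public class ClassFromXsd
{
    public string Name { get; set; }
}

public static class DocDeserializer
{
    // Hypothetical helper: deserializes straight from the in-memory document
    public static ClassFromXsd FromDocument(XmlDocument doc)
    {
        var ser = new XmlSerializer(typeof(ClassFromXsd));
        // XmlNodeReader exposes the existing DOM nodes as an XmlReader,
        // so the document is not re-serialized to bytes first
        using (var reader = new XmlNodeReader(doc.DocumentElement))
        {
            return (ClassFromXsd)ser.Deserialize(reader);
        }
    }
}
```

Inside the question's method this would become a single call, for example var myClass = DocDeserializer.FromDocument(doc);.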

getting "Stream Was Not Writable" while saving an XML document that I read via a stream from a zip file

I have a zip file that contains an XML file. I load this XML file into an XmlDocument through a stream, without extracting the file, and then modify the inner text of some nodes.
The problem is that I get the exception mentioned in the title when I try to save to the stream. Here's the code (I'm using DotNetZip):
ZipFile zipFile = ZipFile.Read(zipPath); // the path is my desktop
foreach (ZipEntry entry in zipFile)
{
    if (entry.FileName == "myXML.xml")
    {
        //creating the stream and loading the xml doc from the zip file:
        Stream stream = zipFile[entry.FileName].OpenReader();
        XmlReader xReader = XmlReader.Create(stream);
        XmlDocument xDoc = new XmlDocument();
        xDoc.Load(xReader);
        //changing the inner text of the doc nodes:
        xDoc.DocumentElement.SelectSingleNode("Account/Name").InnerText = "VeXe";
        xDoc.DocumentElement.SelectSingleNode("Account/Money").InnerText = "Million$";
        xDoc.Save(stream); // here's where I got the exception.
        break;
    }
}
I'm not a pro coder, but I noticed that Save can also take an XmlWriter as a parameter instead of a stream, so I tried creating an XmlWriter immediately after instantiating the XmlReader and calling xDoc.Save(XmlWriter).
That gave me an exception saying something like "Cannot Write After Reading".
How can I successfully save the xDoc?
ADDED:
I had the idea of saving the XML file somewhere else, such as a temp folder, then adding that saved file to the zip (overwriting the old one) and deleting the temp file.
But that's not what I want: I want to deal directly with the zip file, in and out, with no intermediary.
You're attempting to write to the same stream you read the document from, and that stream is read-only. You cannot do that.
Perhaps try something like this:
ZipFile zipFile = ZipFile.Read(zipPath); // the path is my desktop
foreach (ZipEntry entry in zipFile)
{
    if (entry.FileName == "myXML.xml")
    {
        // declared outside the using block so it is still in scope below
        XmlDocument xDoc = new XmlDocument();
        //creating the stream and loading the xml doc from the zip file:
        using (Stream stream = zipFile[entry.FileName].OpenReader()) {
            XmlReader xReader = XmlReader.Create(stream);
            xDoc.Load(xReader);
        }
        //changing the inner text of the doc nodes:
        xDoc.DocumentElement.SelectSingleNode("Account/Name").InnerText = "VeXe";
        xDoc.DocumentElement.SelectSingleNode("Account/Money").InnerText = "Million$";
        // pathToSaveTo is a placeholder for wherever the modified XML should be written
        using (StreamWriter streamWriter = new StreamWriter(pathToSaveTo)) {
            xDoc.Save(streamWriter);
            break;
        }
    }
}
A quick look at the docs leads me to believe that you should do it something like this:
using(ZipFile zipFile = ZipFile.Read(zipPath))
foreach (ZipEntry entry in zipFile)
{
    if (entry.FileName == "myXML.xml")
    {
        XmlDocument xDoc = new XmlDocument();
        //creating the stream and loading the xml doc from the zip file:
        using(Stream stream = zipFile[entry.FileName].OpenReader())
        using(XmlReader xReader = XmlReader.Create(stream))
        {
            xDoc.Load(xReader);
        }
        //changing the inner text of the doc nodes:
        xDoc.DocumentElement.SelectSingleNode("Account/Name").InnerText = "VeXe";
        xDoc.DocumentElement.SelectSingleNode("Account/Money").InnerText = "Million$";
        using(var ms=new MemoryStream())
        using(var sw=new StreamWriter(ms))
        {
            xDoc.Save(sw);
            sw.Flush();
            ms.Position=0;
            zipFile.UpdateEntry(entry.FileName,ms);
            // The entry's content is read from ms when the archive is saved,
            // so save while the stream is still open
            zipFile.Save();
        }
        break;
    }
}

XmlWriter.WriteRaw escapes xml when the writer was created via XElement.CreateWriter

I have noticed that XmlWriter.WriteRaw does not appear to work properly (it escapes XML characters) when the writer is created using XElement.CreateWriter. The test case below reproduces the problem. Is my usage incorrect? Does anyone know how to achieve the desired behavior? I need to be able to write a raw XML string to an XmlWriter and incorporate that XML into an XElement.
[Test]
public void XElementWriterTest()
{
    var xelement = new XElement("test");
    using (var writer = xelement.CreateWriter())
    {
        writer.WriteRaw(@"<some raw='xml' />");
    }
    Assert.That(xelement.ToString(), Is.EqualTo(@"<test><some raw='xml' /></test>"));
    // actual : "<test>&lt;some raw='xml' /&gt;</test>"
}
Is XElement.Parse() an option for you at all?
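[TestMethod]
public void XElementWriterTest()
{
    var xelement = new XElement("test");
    const string newXML = @"<some raw='xml' />";
    // Parse the raw string into an element instead of writing it through WriteRaw
    var child = XElement.Parse(newXML);
    xelement.Add(child);
    // MSTest's Assert.AreEqual takes the expected value first, then the actual one
    Assert.AreEqual(@"<test><some raw=""xml"" /></test>", xelement.ToString(SaveOptions.DisableFormatting));
}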
