What I want to accomplish is reading an XML file from a website (http://xml.buienradar.nl/). I have been reading about what to use, but I can't see the forest for the trees! Should I be using WebRequest, XmlDocument, XDocument, XmlReader, XmlTextReader, or something else? I read that XmlDocument and XDocument read the whole file into memory, and XmlReader doesn't. But is that a problem in this case? What if the XML file is indeed huge?
Can someone help me find a way?
Thanks!
To read huge XML without loading all of it into memory, you can use the XmlReader class. Note, however, that this approach requires more code than the XDocument or even the XmlDocument solution.
var h = WebRequest.CreateHttp("http://xml.buienradar.nl/");
using (var r = h.GetResponse())
using (var resp = r.GetResponseStream())
using (var sr = new StreamReader(resp))
using (var xr = new XmlTextReader(sr))
{
while (xr.Read())
{
// doing something with xr
// for example, print its current node value
Console.WriteLine(xr.Value);
}
}
If you want to test with a large XML file, you can try the one at http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz.
It is over 30 MB gzipped. With this method, XML processing doesn't require much memory, and it doesn't even wait for the whole file to finish downloading.
Test code:
var h = WebRequest.CreateHttp("http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz");
using (var r = h.GetResponse())
using (var resp = r.GetResponseStream())
using (var decompressed = new GZipStream(resp, CompressionMode.Decompress))
using (var sr = new StreamReader(decompressed))
using (var xr = new XmlTextReader(sr))
{
while (xr.Read())
{
// doing something with xr
// for example, print its current node value
Console.WriteLine(xr.Value);
}
}
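As a complement to the loops above, here is a minimal sketch of pulling specific elements out of a stream with XmlReader.Create, which Microsoft recommends over constructing XmlTextReader directly. The sample XML and the element names ("station", "name", "temp") are hypothetical stand-ins and are not taken from the real feed; to read from the web, the response stream could be passed to XmlReader.Create in place of the StringReader.

```csharp
using System;
using System.IO;
using System.Xml;

class Program
{
    static void Main()
    {
        // Hypothetical sample data; the real feed's element names may differ.
        const string xml =
            "<stations>" +
            "<station><name>De Bilt</name><temp>12.3</temp></station>" +
            "<station><name>Schiphol</name><temp>11.8</temp></station>" +
            "</stations>";

        var settings = new XmlReaderSettings
        {
            IgnoreWhitespace = true,
            IgnoreComments = true
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // ReadToFollowing advances to the next <name> element, so only
            // the current node is held in memory at any time.
            while (reader.ReadToFollowing("name"))
            {
                Console.WriteLine(reader.ReadElementContentAsString());
            }
        }
    }
}
```

Note that ReadToFollowing always advances before it tests a node, so this loop would skip a <name> element that immediately follows another one; that does not occur in the sample above.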
XmlTextReader provides fast, forward-only access to XML and can open a URL directly:
string url="http://xml.buienradar.nl/";
XmlTextReader xml=new XmlTextReader(url);
while(xml.Read())
{
Console.WriteLine(xml.Value);
}
The standard way to append to an XML file in LINQ-to-XML is to read it in, modify the in-memory document, and write the whole file out from scratch. For example:
XDocument doc = XDocument.Load("pathToDoc.xml");
doc.Root.Add(new XElement(ns + "anotherChild", new XAttribute("child-id", childId))); // ns is an XNamespace; "namespace" itself is a reserved word in C#
doc.Save("pathToDoc.xml");
Or, with a FileStream:
using (FileStream fs = new FileStream("pathToDoc.xml", FileMode.Open, FileAccess.ReadWrite)) {
XDocument doc = XDocument.Load(fs);
doc.Root.Add(new XElement(ns + "anotherChild", new XAttribute("child-id", childId))); // ns is an XNamespace; "namespace" itself is a reserved word in C#
fs.SetLength(0);
using (var writer = new StreamWriter(fs, new UTF8Encoding(false))) {
doc.Save(writer);
}
}
However, in both cases, the existing XML document is being loaded into memory, modified in memory, and then written from scratch to the XML file. For small XML files this is OK, but for large files with hundreds or thousands of nodes this seems like a very inefficient process. Is there any way to make XDocument (or perhaps something like an XmlWriter) just append the necessary additional nodes to the existing XML document rather than blanking it out and starting from scratch?
This depends entirely on where you need to add the additional elements. You can of course implement something that removes the closing "</root>" tag, writes the additional elements, and then adds "</root>" again. However, such code is highly specific to your use case, and you probably won't find a library for it.
Your code could look like this (quick and dirty, without input checking, assuming the root is never written as the self-closing <root/> and that the file ends exactly with the closing tag, with no trailing whitespace):
using System.IO;
using System.Xml.Linq;
namespace XmlAddElementWithoutLoading
{
class Program
{
static void Main()
{
var rootelement = "root";
var doc = GetDocumentWithNewNodes(rootelement);
var newNodes = GetXmlOfNewNodes(doc);
using (var fs = new FileStream("pathToDoc.xml", FileMode.Open, FileAccess.ReadWrite))
{
using (var writer = new StreamWriter(fs))
{
RemoveClosingRootNode(fs, rootelement);
writer.Write(newNodes);
writer.Write("</"+rootelement+">");
}
}
}
private static void RemoveClosingRootNode(FileStream fs, string rootelement)
{
fs.SetLength(fs.Length - ("</" + rootelement + ">").Length);
fs.Seek(0, SeekOrigin.End);
}
private static string GetXmlOfNewNodes(XDocument doc)
{
var reader = doc.Root.CreateReader();
reader.MoveToContent();
return reader.ReadInnerXml();
}
private static XDocument GetDocumentWithNewNodes(string rootelement)
{
var doc = XDocument.Parse("<" + rootelement + "/>");
var childId = "2";
XNamespace ns = "namespace";
doc.Root.Add(new XElement(ns + "anotherChild", new XAttribute("child-id", childId)));
return doc;
}
}
}
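If the file may end with a newline or other whitespace after the closing tag, the truncation step above cuts at the wrong offset. A slightly more defensive sketch is to search the tail of the file for the closing tag and truncate there. The helper name AppendBeforeClosingTag, the 256-byte tail window, and the UTF-8 assumption are all illustrative choices of mine and are not part of the answer above.

```csharp
using System;
using System.IO;
using System.Text;

static class XmlAppender
{
    // Inserts 'fragment' just before the last closing root tag.
    // Assumes UTF-8 (or ASCII) content and that the closing tag lies
    // within the last 'tailSize' bytes of the file.
    public static void AppendBeforeClosingTag(
        string path, string rootName, string fragment, int tailSize = 256)
    {
        byte[] closing = Encoding.UTF8.GetBytes("</" + rootName + ">");
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            int toRead = (int)Math.Min(tailSize, fs.Length);
            long tailStart = fs.Length - toRead;
            var tail = new byte[toRead];
            fs.Seek(tailStart, SeekOrigin.Begin);

            int read = 0;
            while (read < toRead)
            {
                int n = fs.Read(tail, read, toRead - read);
                if (n == 0) break;
                read += n;
            }

            int idx = LastIndexOf(tail, read, closing);
            if (idx < 0)
                throw new InvalidDataException("Closing tag not found near end of file.");

            // Cut the file at the closing tag, then write the new nodes and the tag.
            fs.SetLength(tailStart + idx);
            fs.Seek(0, SeekOrigin.End);
            byte[] payload = Encoding.UTF8.GetBytes(fragment + "</" + rootName + ">");
            fs.Write(payload, 0, payload.Length);
        }
    }

    private static int LastIndexOf(byte[] haystack, int length, byte[] needle)
    {
        for (int i = length - needle.Length; i >= 0; i--)
        {
            bool match = true;
            for (int j = 0; j < needle.Length; j++)
            {
                if (haystack[i + j] != needle[j]) { match = false; break; }
            }
            if (match) return i;
        }
        return -1;
    }
}

class Program
{
    static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "<root><child id=\"1\" /></root>\n");
        XmlAppender.AppendBeforeClosingTag(path, "root", "<child id=\"2\" />");
        Console.WriteLine(File.ReadAllText(path));
        File.Delete(path);
    }
}
```

Any whitespace that followed the original closing tag is dropped by this sketch, and like the version above it never parses the existing content, so it cannot detect a file that was already malformed.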
I'm working on an application that needs to generate Word documents based on user input, database values and a template. I've looked online for examples and found many different approaches to generating Word documents, but I've decided to stick with the official Open XML SDK 2.5. I've just written a simple program that inserts a table (stored in an .xml file) into a Word document:
Edit: Question down at the bottom if not interested in the code
static void Main(string[] args)
{
XNamespace ns = XNamespace
.Get(@"http://schemas.openxmlformats.org/wordprocessingml/2006/main");
byte[] byteArray = File.ReadAllBytes(@"C:/Users/Alexander/Downloads/WordTest.docx");
using (var stream = new MemoryStream())
{
XDocument xdoc;
stream.Write(byteArray, 0, byteArray.Length);
using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
{
Then I can do 2 different things which will generate the same output.
1) Using DocumentFormat.OpenXml.Wordprocessing namespace methods:
#region Openxml.WordProcessing
var paragraphs = doc.MainDocumentPart.Document.Body.ToList();
Table tbl = new Table(File.ReadAllText(@"C:/users/alexander/downloads/tablecontent.xml"));
var bookmark = paragraphs.SelectMany(p => p.Descendants<BookmarkStart>()
.Where(bm => bm.Id == "0")).FirstOrDefault();
doc.MainDocumentPart.Document.Body.ReplaceChild(tbl, bookmark.Parent);
#endregion Openxml.WordProcessing
2) Using Linq-To-XML:
#region LINQ TO XML
XElement xtbl = XElement.Load(
new FileStream(@"C:/users/alexander/downloads/tablecontent.xml", FileMode.Open));
using (StreamReader sr = new StreamReader(doc.MainDocumentPart.GetStream()))
using (XmlReader xr = XmlReader.Create(sr))
xdoc = XDocument.Load(xr);
//Document - Body - Paragraphs - Runs/Bookmarks/etc.
//any way to write this more clearly in linq-to-xml?
var test = xdoc.Elements().First().Elements().First().Elements()
.SelectMany(e => e.Elements()).ToList();
var startBookmark = test.Where(p => p.Name == XName.Get("bookmarkStart", ns.NamespaceName)
&& p.Attribute(XName.Get("id", ns.NamespaceName)).Value == "0").First();
startBookmark.Parent.ReplaceWith(xtbl);
using (XmlWriter xw = XmlWriter.Create(doc.MainDocumentPart.GetStream()))
xdoc.Save(xw);
#endregion LINQ TO XML
And finally I write the document to a new file:
using (FileStream fs =
new FileStream(@"C:/users/alexander/downloads/WordTestModified.docx", FileMode.Create))
{
stream.WriteTo(fs);
}
As far as I can see, the first option is easier and the code is clearer to read (no use of XName and no need for an extra StreamReader/XmlReader/XmlWriter), but does LINQ to XML have any distinct advantages over this approach? This is going to be a big application and I don't want to be limited later on.
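On the side question in the code comment (whether the chained Elements().First() navigation can be written more clearly), one possible sketch uses Descendants together with the XNamespace + name operator. The in-memory document below is a hypothetical stand-in for the main document part, and the empty w:tbl element stands in for the table loaded from tablecontent.xml.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

        // Hypothetical stand-in for the XML of the main document part.
        XDocument xdoc = XDocument.Parse(
            "<w:document xmlns:w='" + w.NamespaceName + "'><w:body>" +
            "<w:p><w:bookmarkStart w:id='0' w:name='placeholder'/>" +
            "<w:bookmarkEnd w:id='0'/></w:p>" +
            "</w:body></w:document>");

        // Stand-in for the table element loaded from the .xml file.
        var xtbl = new XElement(w + "tbl");

        // "w + name" builds the XName, so XName.Get is not needed.
        XElement startBookmark = xdoc
            .Descendants(w + "bookmarkStart")
            .First(b => (string)b.Attribute(w + "id") == "0");

        // Replace the paragraph that contains the bookmark with the table.
        startBookmark.Parent.ReplaceWith(xtbl);

        Console.WriteLine(xdoc);
    }
}
```

Unlike the original chain, Descendants matches the bookmark at any depth, which may or may not be what is wanted if bookmarks can also appear inside table cells or other nested structures.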
I have a zip file that contains an XML file.
I'm loading this XML file into an XmlDocument through a stream, without extracting the file.
After that, I modify the inner text of some nodes.
The problem is that I get an exception when I try to save the document back to the stream. Here's the code:
(I'm using DotNetZip here)
ZipFile zipFile = ZipFile.Read(zipPath); // the path is my desktop
foreach (ZipEntry entry in zipFile)
{
if (entry.FileName == "myXML.xml")
{
//creating the stream and loading the xml doc from the zip file:
Stream stream = zipFile[entry.FileName].OpenReader();
XmlReader xReader = XmlReader.Create(stream);
XmlDocument xDoc = new XmlDocument();
xDoc.Load(xReader);
//changing the inner text of the doc nodes:
xDoc.DocumentElement.SelectSingleNode("Account/Name").InnerText = "VeXe";
xDoc.DocumentElement.SelectSingleNode("Account/Money").InnerText = "Million$";
xDoc.Save(stream); // here's where I got the exception.
break;
}
}
I'm not a pro coder, but I noticed that instead of xDoc.Save(stream); Save can also take an XmlWriter as a parameter, so I tried creating an XmlWriter immediately after instantiating the XmlReader.
I tried doing this: xDoc.Save(XmlWriter)
I got an exception saying something like: "Cannot Write After Reading"
How can I successfully save the xDoc?
ADDED:
I had the idea of saving the XML file somewhere else, such as a temp folder,
then adding that saved file to the zip, overwriting the old one, and finally deleting the temporary XML file.
But that's not what I want. I want to deal directly with the zip file, in and out, with no intermediate files.
You're attempting to write to the same stream that you opened for reading. You cannot do that.
Perhaps try something like this:
ZipFile zipFile = ZipFile.Read(zipPath); // the path is my desktop
foreach (ZipEntry entry in zipFile)
{
if (entry.FileName == "myXML.xml")
{
//creating the stream and loading the xml doc from the zip file:
// declared outside the using block so it is still in scope below
XmlDocument xDoc = new XmlDocument();
using (Stream stream = zipFile[entry.FileName].OpenReader())
using (XmlReader xReader = XmlReader.Create(stream)) {
xDoc.Load(xReader);
}
//changing the inner text of the doc nodes:
xDoc.DocumentElement.SelectSingleNode("Account/Name").InnerText = "VeXe";
xDoc.DocumentElement.SelectSingleNode("Account/Money").InnerText = "Million$";
using (StreamWriter streamWriter = new StreamWriter(pathToSaveTo)) {
xDoc.Save(streamWriter);
break;
}
}
}
A quick look at the docs leads me to believe that you should do it something like this:
using(ZipFile zipFile = ZipFile.Read(zipPath))
foreach (ZipEntry entry in zipFile)
{
if (entry.FileName == "myXML.xml")
{
XmlDocument xDoc = new XmlDocument();
//creating the stream and loading the xml doc from the zip file:
using(Stream stream = zipFile[entry.FileName].OpenReader())
using(XmlReader xReader = XmlReader.Create(stream))
{
xDoc.Load(xReader);
}
//changing the inner text of the doc nodes:
xDoc.DocumentElement.SelectSingleNode("Account/Name").InnerText = "VeXe";
xDoc.DocumentElement.SelectSingleNode("Account/Money").InnerText = "Million$";
using(var ms=new MemoryStream())
{
xDoc.Save(ms);
ms.Position=0;
zipFile.UpdateEntry(entry.FileName,ms);
// DotNetZip reads the stream when Save() runs, so save before ms is disposed
zipFile.Save();
}
break;
}
}
I have an XDocument object which is populated with XML (the definition for a report, in RDL). I would like to give the contents of this XDocument to the report viewer.
this.reportViewer1.LocalReport.LoadReportDefinition();
LoadReportDefinition only seems to take either TextReader or Stream arguments, but my report definition is already loaded in my XDocument. How can I stream the contents of my XDocument?
You can use the StringReader class like so:
using (var textReader = new StringReader(xDocument.ToString()))
{
this.reportViewer1.LocalReport.LoadReportDefinition(textReader);
}
Or alternatively use a Stream:
using (var stream = new MemoryStream())
{
xDocument.Save(stream);
stream.Position = 0;
this.reportViewer1.LocalReport.LoadReportDefinition(stream);
}
I have a method which loops through an XML document using an XmlReader (validating at the same time), extracting specific pieces of information. I also need to compress the entire XML document in preparation for storing it in a database. The code I have to do this is below. Is this (passing the entire XmlReader to StreamWriter.Write()) the appropriate or most efficient way to achieve this? I didn't see a clear way to use the while(validatingReader.Read()) loop to achieve the same result.
XmlSchemaSet schemaSet = new XmlSchemaSet();
schemaSet.Add("schemaNamespace", "schemaLocation");
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.ValidationType = ValidationType.Schema;
readerSettings.Schemas.Add(schemaSet);
readerSettings.ValidationEventHandler
+= new ValidationEventHandler(XMLValidationError);
using (XmlReader documentReader = requestXML.CreateNavigator().ReadSubtree())
{
using (XmlReader validatingReader =
XmlReader.Create(documentReader, readerSettings))
{
using (MemoryStream output = new MemoryStream())
{
using (DeflateStream gzip =
new DeflateStream(output, CompressionMode.Compress))
{
using (StreamWriter writer =
new StreamWriter(gzip, System.Text.Encoding.UTF8))
{
writer.Write(validatingReader);
this.compressedXMLRequest
= Encoding.UTF8.GetString(output.ToArray());
}
}
}
while (validatingReader.Read())
{
// extract specific element contents
}
}
}
The compression portion looks fine. MemoryStream may not be the best choice for large documents, but check whether performance is acceptable for your scenarios before changing it.
The "extract specific element" portion will not read anything: the reader is forward-only, so all content has already been consumed by the time that loop runs. You may need to recreate the reader.
For future reference:
The code in the question does not work properly. StreamWriter.Write has no overload that takes an XmlReader, so the call binds to Write(object) and writes the result of the reader's ToString() rather than the XML. In the end I did not combine compression with validation in this way, so I don't have "correct" code to show for this, but I didn't want to leave the question dangling.
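For anyone landing here later, one way this could be approached is to copy the nodes from the reader into an XmlWriter that sits on top of the compression stream, using XmlWriter.WriteNode. This is only a sketch under assumptions: the sample XML is made up, the reader settings are left at their defaults (the schema set and validation handler from the question would be configured on them), and Base64 is my substitution for Encoding.UTF8.GetString, which corrupts arbitrary compressed bytes.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml;

class Program
{
    static void Main()
    {
        // Hypothetical input; in the question this comes from requestXML.
        string xml = "<request><id>42</id></request>";

        // Configure Schemas, ValidationType and ValidationEventHandler here
        // to validate while the nodes are being copied.
        var readerSettings = new XmlReaderSettings();

        byte[] compressed;
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), readerSettings))
            using (XmlWriter writer = XmlWriter.Create(deflate))
            {
                // Starting from the reader's initial state, WriteNode copies
                // the whole document to the writer.
                writer.WriteNode(reader, true);
            }
            // Read the buffer only after the compression stream is closed,
            // so that all compressed data has been flushed.
            compressed = output.ToArray();
        }

        // Compressed data is binary, so store it as Base64 (or as raw bytes).
        string stored = Convert.ToBase64String(compressed);

        // Round trip to check the result.
        using (var input = new MemoryStream(Convert.FromBase64String(stored)))
        using (var inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var text = new StreamReader(inflate))
        {
            Console.WriteLine(text.ReadToEnd());
        }
    }
}
```

Because the reader is forward-only, extracting specific element contents would still need either a second reader over the same source or a manual Read() loop that both inspects each node and writes it out.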