I'm writing an XML document based on a stream of data. This part has been accomplished using the XmlTextWriter and the XElement classes.
Now, when I come to read the document back in, I want to be able to 'delay-load' it so that certain nodes (i.e. the ones which contain large binary chunks) are skipped and then loaded only when required.
Is this possible using the XmlDocument class? Or will I have to do things in a more manual way using the XmlTextReader class?
Thanks.
Nick.
Not possible with XmlDocument, as the whole document needs to be loaded into memory before it is parsed as a tree.
XmlTextReader/SAX is the standard solution.
This is not possible with either XmlDocument or XDocument.
Note that if you want to use XmlTextReader, it is forward-only, i.e. once you have skipped a node you can't come back to it.
See MSDN on this.
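To sketch what that forward-only approach could look like, here is a rough example of skipping the heavy elements while streaming; the element name binaryChunk and the flat layout under the root are assumptions for illustration, not taken from the question:
using System;
using System.Xml;
using System.Xml.Linq;

static XElement LoadSkeleton(string path)
{
    var root = new XElement("document");            // placeholder root, name assumed
    using (var reader = XmlReader.Create(path))
    {
        reader.MoveToContent();                     // position on the document element
        reader.ReadStartElement();                  // step inside it
        while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
        {
            if (reader.NodeType != XmlNodeType.Element)
            {
                reader.Read();                      // skip whitespace, comments, etc.
            }
            else if (reader.Name == "binaryChunk")  // assumed name of the heavy element
            {
                reader.Skip();                      // jump past the whole subtree unread
                root.Add(new XElement("binaryChunk", new XAttribute("deferred", true)));
            }
            else
            {
                root.Add((XElement)XNode.ReadFrom(reader)); // ReadFrom also advances the reader
            }
        }
    }
    return root;
}
The deferred chunks could then be fetched on demand in a second pass, for example by re-opening a reader and using ReadToFollowing to jump straight to the element that is needed.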
I want to copy or clone a specific node or element from XML. I tried many code samples, but none of them worked. I am programming in C#.
Here is my XML, I hope my problem is clear!
This is my XML before
I want this XML
I can't do this manually, because I need to do it for more than 30 more Tools.
It really depends on what you are using to parse the XML.
I will give you info for the two most-used classes for parsing XML in .NET.
XmlDocument: then you can use .CloneNode
XDocument: then you can do something like this:
XElement toCopy = ...;
XElement copy = XElement.Parse(toCopy.ToString());
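For the XmlDocument route mentioned above, a rough sketch looks like this (the file name and the Tool element are assumptions for illustration, not taken from the question's XML):
using System.Xml;

var doc = new XmlDocument();
doc.Load("tools.xml");                                // assumed file name

XmlNode original = doc.SelectSingleNode("//Tool");    // the node to duplicate (assumed name)
if (original != null)
{
    XmlNode copy = original.CloneNode(true);          // true = deep clone, children included
    original.ParentNode.AppendChild(copy);            // put the copy next to the original
    doc.Save("tools.xml");
}
With XDocument, the XElement copy constructor (new XElement(toCopy)) does a deep copy as well and avoids the Parse/ToString round trip.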
If you are not familiar with XML processing in .NET, there is enough information on MSDN for both XDocument and XmlDocument.
I have been looking all over for the best way to update XML in a file. I have just switched over to using XmlReader (coming from the XDocument method) for speed (not having to read the entire file into memory).
My XmlReader method works perfectly: when I need to read a value, it opens the XML, starts reading, reads ONLY up to the node needed, then closes everything. It's very fast and effective.
Now that I have that working, I want to make a method that UPDATES XML that is already in place. I would like to keep to the same idea and ONLY read into memory what is needed. So the idea would be: read up until the node I'm changing, then use the writer to UPDATE that value.
Everything I have seen has an XmlReader reading while an XmlWriter writes everything back out. If I did that, I assume I would have to let it run through the entire file, just like XDocument would do. See, as an example, this answer.
Is it possible to maybe just use the reader, read up to the node I'm trying to edit, and then change the InnerXml or something?
What's the fastest and most efficient method to update XML in a file?
I would like to only read into memory what I'm trying to edit, not the whole file.
I would also like to account for nodes that do not exist (that need to be added).
By design, XmlReader represents a "read-only forward-only" view of the document and cannot be used to update the content. Using the Load method of either XmlDocument, XDocument or XElement will still cause the entire file to be read into memory. (Under the hood, XDocument and XElement still use an XmlReader.) However, you can combine a raw XmlReader and XElement together using the overloads of the Load method which take an XmlReader.
You don't describe your XML structure, but you would want to do something similar to this:
var reader = XmlReader.Create(@"file://c:\test.xml");
var document = XElement.Load(reader);
document.Add(new XElement("branch", "leaves"));
document.Save("Tree.xml");
To find a specific node (for example, with a specific attribute value), you'd want to do something similar to this:
var node = document.Descendants("branch")
.SingleOrDefault(e => (string)e.Attribute("name") == "foo");
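From there, updating the matched node, or adding it when it does not exist yet, could look like this (still using the answer's made-up element and attribute names, not the asker's real file):
if (node != null)
{
    node.SetValue("new leaves");                  // update the existing node's value
}
else
{
    document.Add(new XElement("branch",
        new XAttribute("name", "foo"),
        "new leaves"));                           // create the missing node
}
document.Save("Tree.xml");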
How can I add a new node, update an existing node and remove an existing node of an XML document without loading the whole document into memory?
I have an XML document that I'm treating as the memory of my application, so I need to be able to do hundreds of reads and writes quickly without loading the whole document.
Its structure is like this:
<spiderMemory>
  <profileSite profileId="" siteId="">
    <links>
      <link>
        <originalUrl></originalUrl>
        <isCrawled></isCrawled>
        <isBroken></isBroken>
        <isHtmlPage></isHtmlPage>
        <firstAppearedLevel></firstAppearedLevel>
      </link>
    </links>
  </profileSite>
</spiderMemory>
How would that be possible with XDocument?
Thanks
If you want to do hundreds of reads and writes quickly... you might be using the wrong technology. Have you tried using a plain old RDBMS?
If you still need the XML representation, then you can create an export method to produce it from the database.
XML isn't really a good substitute for this kind of problem. Just saying.
Also... what is wrong with having the whole thing in memory? How big can it possibly get? Say 1GB? Suck it up. Say 1TB? Oops. But then XML is wrong, wrong, wrong anyway in that case ;) way too verbose!
You can use XmlReader, something like this:
using (FileStream stream = new FileStream("test.xml", FileMode.Open))
using (XmlReader reader = new XmlTextReader(stream))
{
    while (reader.Read())
    {
        Console.WriteLine(reader.Value);
    }
}
Here is a more elaborate example: http://msdn.microsoft.com/en-us/library/cc189056%28v=vs.95%29.aspx
As Daren Thomas said, the proper solution is to use an RDBMS instead of XML for your needs. I have a partial solution using XML and Java. The StAX parser does not parse the whole document into memory and is a lot faster than DOM (still, XML parsing will always be slow). A 'pull parser' (e.g. StAX) allows you to control what gets parsed. A less clean way is to throw an exception in a SAX parser when you get the element(s) you need.
To modify, the simplest (but slow) way is to use XPath. Another (untested) option is to treat the XML file as text and then 'search and replace' stuff. Here you can use all kinds of text search optimization.
I'm developing a Windows app using C#. I chose XML for data storage.
I need to read the XML file, make small changes, and then write it back to disk.
Now, what is the easiest way of doing this?
XLinq (LINQ to XML) is much more comfortable than the ordinary XML classes, because it is much more object oriented, supports LINQ, has lots of implicit casts, and serializes to the standard ISO format.
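A minimal LINQ to XML round trip for the load / small change / save scenario might look like this (the file and element names are invented for the example):
using System.Xml.Linq;

var doc = XDocument.Load("settings.xml");     // read the file
var theme = doc.Root.Element("theme");        // assumed element name
if (theme != null)
{
    theme.Value = "dark";                     // the small change
}
doc.Save("settings.xml");                     // write it back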
The best way is to use XML serialization, where the XML is loaded into a class (with various classes representing all the elements/attributes). You can then change the values in code and serialize back to XML.
To create the classes, the best thing to do is to use xsd.exe, which will generate the C# classes for you from an existing XML document.
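As a sketch of that round trip, assuming xsd.exe produced a class called Settings with a Theme property (both names are invented here):
using System.IO;
using System.Xml.Serialization;

var serializer = new XmlSerializer(typeof(Settings));

Settings settings;
using (var reader = new StreamReader("settings.xml"))
{
    settings = (Settings)serializer.Deserialize(reader);   // XML -> object
}

settings.Theme = "dark";                                    // change values in code

using (var writer = new StreamWriter("settings.xml"))
{
    serializer.Serialize(writer, settings);                 // object -> XML
}

public class Settings                                       // normally generated by xsd.exe
{
    public string Theme { get; set; }
}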
I think the easiest way of doing it is to use the XmlDocument class:
var doc = new XmlDocument();
doc.Load("filename or Stream or TextReader or XmlReader");
//do something
doc.Save("filename or Stream or TextWriter or XmlWriter");
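For example, the //do something step might be a small XPath-based edit (the price element is just an assumption):
XmlNode price = doc.SelectSingleNode("//price");
if (price != null)
{
    price.InnerText = "42";
}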
I think I found the easiest way; check out this project on CodeProject. It is easy to use, as XML elements are accessed similarly to array elements, using name strings as indexes.
Code sample to write bool property to XML:
Xmlconfig xcfg = new Xmlconfig("config.xml", true);
xcfg.Settings[this.Name]["AddDateStamp"]["bool"].boolValue = checkBoxAddStamp.Checked;
xcfg.Save("config.xml");
Sample to read the property:
Xmlconfig xcfg = new Xmlconfig("config.xml", true);
checkBoxAddStamp.Checked = xcfg.Settings[this.Name]["AddDateStamp"]["bool"].boolValue;
To write a string use .Value; for an int, use .intValue.
You can use LINQ to read XML files as described here...
LINQ to read XML
Check out LINQ to XML
I am refactoring some code in an existing system. The goal is to remove all instances of XmlDocument to reduce the memory footprint. However, we use XPath to manipulate the XML when certain rules apply. Is there a way to use XPath without using a class that loads the entire document into memory? We've replaced all other instances with XmlTextReader, but those only worked because there is no XPath involved and the reading is very simple.
Some of the XPath expressions use the values of other nodes to base their decisions on. For instance, the value of the message node may be based on the value of the amount node, so there is a need to access multiple nodes at the same time.
If your XPath expression is based on accessing multiple nodes, you're just going to have to read the XML into a DOM. Two things, though. First, you don't have to read all of it into a DOM, just the part you're querying. Second, which DOM you use makes a difference; XPathDocument is read-only and tuned for XPath query speed, unlike the more general-purpose but expensive XmlDocument.
I suppose that using System.Xml.Linq.XDocument is also prohibited? Otherwise, it would be a good choice, as it is faster than XmlDocument (as far as I remember).
Supporting XPath means supporting queries like:
//address[/states/state[@code=current()/@code]='California']
or
//item[@id != preceding-sibling::item/@id]
which require the XPath processor to be able to look everywhere in the document. You're not going to find a forward-only XPath processor.
The way to do this is to use XPathDocument, which can take a stream - therefore you can use StringReader.
This returns the value in a forward read way without the overhead of loading the whole XML DOM into memory with XmlDocument.
Here is an example which returns the value of the first node that satisfies the XPath query:
using System.IO;
using System.Xml.XPath;

public string extract(string input_xml)
{
    // Build a read-only, XPath-optimised document from the XML string
    XPathDocument document = new XPathDocument(new StringReader(input_xml));
    XPathNavigator navigator = document.CreateNavigator();

    // Evaluate the XPath query and return the value of the first matching node
    XPathNodeIterator node_iterator = navigator.Select(SEARCH_EXPRESSION);
    node_iterator.MoveNext();
    return node_iterator.Current.Value;
}
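Hypothetically, if SEARCH_EXPRESSION were set to an XPath such as "//message", calling the method could look like this:
string xml = "<order><message>Hello</message><amount>10</amount></order>";
string value = extract(xml);   // "Hello" with that expression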