I gather that XmlDocument is considered a deprecated class and that we should use XDocument instead. But TileUpdateManager.GetTemplateContent from the Windows 8 UX core returns an XmlDocument. Since this is a new API, I'm wondering what the reason is for using XmlDocument here.
(I am using XDocument to refer to LINQ to XML and XmlDocument to refer to the System.Xml APIs.)
Reasons I can see:
XDocument requires LINQ, which is .NET specific. Windows Store apps can be built in JavaScript too, so making XDocument work across both JavaScript and .NET would mean porting LINQ as well, and then the engineering task becomes massive.
XDocument works with XML in a functional way, which is far from standard; the standard way to work with XML is via a DOM model. So XmlDocument is more closely aligned with the way XML is handled elsewhere.
I am guessing you think of them as separate APIs, but XDocument builds on top of the normal XmlReader classes. So unless you completely rewrote XmlDocument you would always need it, and what is the value of rewriting something that works well just so you can hide it inside something else in the new version?
While a lot of cleanup and improvement has gone into the new APIs, remember that they are still built on top of COM and a lot of the built-in Windows APIs we currently have (this abstraction means that, should that change in the future, it doesn't impact us, but currently it is not talking directly to the kernel). So they are likely leveraging the existing tools and libraries under the covers, which would all be DOM based and better aligned with XmlDocument.
The XmlDocument class has been in .NET for a long time, and they might have preferred to keep it that way. Another reason might be multi-language support.
You can wrap/decorate the function with another. There are various ways to convert an XmlDocument to an XDocument:
private static XDocument DocumentToXDocument(XmlDocument doc)
{
    return XDocument.Parse(doc.OuterXml);
}
private static XDocument DocumentToXDocumentNavigator(XmlDocument doc)
{
    return XDocument.Load(doc.CreateNavigator().ReadSubtree());
}
private static XDocument DocumentToXDocumentReader(XmlDocument doc)
{
    return XDocument.Load(new XmlNodeReader(doc));
}
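As a rough sketch of the "wrap the function" idea, the snippet below wraps any function that returns a System.Xml.XmlDocument and converts the result once. LoadTemplate and AsXDocument are hypothetical names made up for this illustration; LoadTemplate merely stands in for the real API call. Note that the WinRT XmlDocument returned by TileUpdateManager is a different type (Windows.Data.Xml.Dom.XmlDocument), so there you would most likely have to go through its serialized XML text instead; treat that part as an assumption to verify.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

static class TemplateWrapper
{
    // Hypothetical stand-in for an API that hands back an XmlDocument.
    public static XmlDocument LoadTemplate()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<tile><visual><text id=\"1\">Hello</text></visual></tile>");
        return doc;
    }

    // Wrapper: call the XmlDocument-returning function and convert the result.
    public static XDocument AsXDocument(Func<XmlDocument> source)
    {
        using (var reader = new XmlNodeReader(source()))
        {
            return XDocument.Load(reader);
        }
    }

    static void Main()
    {
        XDocument template = AsXDocument(LoadTemplate);

        // From here on the LINQ to XML API is available.
        template.Root.Element("visual").Element("text").Value = "Updated";
        Console.WriteLine(template);
    }
}
```

The conversion copies the tree, so changes made to the XDocument are not reflected in the original XmlDocument.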
I saw this question:
XPathDocument vs. XmlDocument
But it doesn't have the information I'm looking for.
I know that XPathDocument loads the complete XML into memory. My question starts from the point where the XML is already loaded: which of these will find the desired elements faster?
XPathDocument with XPathNavigator
or
XmlReader with if conditions
If by "the xml is already loaded" you mean that it's already going to be loaded into an XPathDocument or XmlDocument, then the performance of using either the XPathNavigator or XmlReader will be the same. Both will be traversing already-parsed, in-memory nodes representing the XPath data model.
The main difference between the two is that the XmlReader provides forward-only access, whereas the XPathNavigator provides cursor access to the document. Interacting directly with XmlReader is very useful when you don't want to incur the cost of loading the entire document into memory. It's not so useful otherwise.
I'd strongly suggest using the XPathNavigator.
There are two primary ways you can interact with the XPathNavigator:
Build your own state machine (using if/elses). One huge plus of doing this with the XPathNavigator rather than the XmlReader is that, thanks to the cursor access model of XPathNavigator, your state machine will be vastly simpler. For example: need to see if the parent has a specific attribute? Just navigate to it and take a look.
Use XPath queries to find the data you're looking for. This may not be as fast, but it will probably be less error prone than building your own state machine. Of course, this requires you to be versed in XPath.
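To make both styles concrete, here is a minimal sketch that first finds a node with an XPath query and then uses cursor navigation to inspect its parent. The sample XML, the class name and the RegionOfOrder method are invented for the illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class NavigatorDemo
{
    const string Xml =
        "<orders region=\"EU\"><order id=\"1\"/><order id=\"2\"/></orders>";

    // Returns the "region" attribute of the parent of the order with the
    // given id, or null if there is no such order.
    // The id is assumed to contain no quote characters.
    public static string RegionOfOrder(string id)
    {
        var doc = new XPathDocument(new StringReader(Xml));
        XPathNavigator nav = doc.CreateNavigator();

        // XPath query: jump straight to the element of interest.
        XPathNavigator order = nav.SelectSingleNode("/orders/order[@id='" + id + "']");
        if (order == null)
        {
            return null;
        }

        // Cursor access: step up to the parent and look at its attribute.
        order.MoveToParent();
        return order.GetAttribute("region", "");
    }

    static void Main()
    {
        Console.WriteLine(RegionOfOrder("2"));
    }
}
```

With a forward-only XmlReader the parent's attributes would have to be remembered while reading, because the reader cannot move back up the tree.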
I'm in the position to parse XML in .NET. Now I have the choice between at least XmlTextReader and XDocument. Are there any comparisons between those two (or any other XML parsers contained in the framework)?
Maybe this could help me to decide without trying both of them in depth.
The XML files are expected to be rather small, so speed and memory usage are minor issues compared to ease of use. :-)
(I'm going to use them from C# and/or IronPython.)
Thanks!
If you're happy reading everything into memory, use XDocument. It'll make your life much easier. LINQ to XML is a lovely API.
Use an XmlReader (such as XmlTextReader) if you need to handle huge XML files in a streaming fashion, basically. It's a much more painful API, but it allows streaming (i.e. only dealing with data as you need it, so you can go through a huge document and only have a small amount in memory at a time).
There's a hybrid approach, however - if you have a huge document made up of small elements, you can create an XElement from an XmlReader positioned at the start of the element, deal with the element using LINQ to XML, then move the XmlReader onto the next element and start again.
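A minimal sketch of that hybrid approach, assuming the elements of interest are not nested inside one another. StreamElements is a name made up for this example; the key call is XNode.ReadFrom, which builds one XElement from the reader and leaves the reader positioned after it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class Streaming
{
    // Lazily yields each element with the given local name as an XElement,
    // so only one such element is materialized at a time.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string name)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == name)
            {
                // ReadFrom consumes the whole element and advances the reader
                // past it, so no extra Read() call is needed in this branch.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    static void Main()
    {
        const string xml = "<items><item id=\"1\"/><item id=\"2\"/></items>";
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            foreach (XElement item in StreamElements(reader, "item"))
            {
                Console.WriteLine((string)item.Attribute("id"));
            }
        }
    }
}
```

For a real file you would pass a path or stream to XmlReader.Create instead of the StringReader used here.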
XmlTextReader is effectively deprecated (although not formally marked obsolete), so do not use it.
From the MSDN blogs by the XML team:
Effective Xml Part 1: Choose the right API
Avoid using XmlTextReader. It contains quite a few bugs that could not be fixed without breaking existing applications already using it.
The world has moved on, have you? Xml APIs you should avoid using.
Obsolete APIs are easy since the compiler helps identifying them but there are two more APIs you should avoid using – namely XmlTextReader and XmlTextWriter. We found a number of bugs in these classes which we could not fix without breaking existing applications. The easy route would be to deprecate these classes and ask people to use replacement APIs instead. Unfortunately these two classes cannot be marked as obsolete because they are part of ECMA-335 (Common Language Infrastructure) standard (http://www.ecma-international.org/publications/standards/Ecma-335.htm) – the companion CLILibrary.xml file which is a part of Partition IV).
The good news is that even though these classes are not deprecated there are replacement APIs for these in .NET Framework already and moving to them is relatively easy. First it is necessary to find the places where XmlTextReader or XmlTextWriter is being used (unfortunately it is a manual step). Now all the occurrences of XmlTextReader should be replaced with XmlReader and all the occurrences of XmlTextWriter should be replaced with XmlWriter (note that XmlTextReader derives from XmlReader and XmlTextWriter derives from XmlWriter so the app can already be using these e.g. as formal parameters). The last step is to change the way the XmlReader/XmlWriter objects are instantiated – instead of creating the reader/writer directly it is necessary to use the static factory method .Create() present on both XmlReader and XmlWriter APIs.
Furthermore, IntelliSense in Visual Studio doesn't list XmlTextReader under the System.Xml namespace. The class is defined as:
[EditorBrowsable(EditorBrowsableState.Never)]
public class XmlTextReader : XmlReader, IXmlLineInfo, IXmlNamespaceResolver
The XmlReader.Create factory methods return other internal implementations of the abstract class XmlReader depending on the settings passed.
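For illustration, a small sketch of creating a reader through the factory with an XmlReaderSettings object instead of instantiating XmlTextReader directly. The ElementNames helper and the sample XML are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ReaderFactoryDemo
{
    // Collects the names of all elements, in document order.
    public static List<string> ElementNames(string xml)
    {
        var settings = new XmlReaderSettings
        {
            IgnoreWhitespace = true,
            IgnoreComments = true
        };

        var names = new List<string>();
        // The factory decides which concrete XmlReader implementation to return.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.Name);
                }
            }
        }
        return names;
    }

    static void Main()
    {
        foreach (string name in ElementNames("<a><!-- note --><b>1</b></a>"))
        {
            Console.WriteLine(name);
        }
    }
}
```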
For a forward-only streaming API (i.e. one that doesn't load the entire thing into memory), use XmlReader via the XmlReader.Create method.
For an easier API to work with, go for XDocument aka LINQ To XML. Find XDocument vs XmlDocument here and here.
I would like to write a parser that tells me which part of a string is a method header. What is the best way to do this in C#?
The language grammar specification can be found here. I don't think this is proper BNF/EBNF, so perhaps there is a way to transform it into such (like an HTML parser that puts it into proper BNF).
Should I use regular expressions or a custom built parser somehow? I am restricted in that I need to build it myself without the help of outside tools.
I found the NRefactory library, part of the open-source SharpDevelop tool, to be very good at parsing C# modules into an abstract syntax tree. Once you have that you can scan through very easily to find the method headers, the locations, and so on.
Though its primary use is for within SharpDevelop (A GUI tool), it is a standalone DLL, and it can be used within any .NET app. The documentation isn't very thorough, as far as I could tell, but Reflector let me examine it and figure things out pretty easily.
Some code:
internal static string CreateAstSexpression(string filename)
{
    using (var fs = File.OpenRead(filename))
    {
        using (var parser = ParserFactory.CreateParser(SupportedLanguage.CSharp,
                                                       new StreamReader(fs)))
        {
            parser.Parse();
            // RetrieveSpecials() returns an IList<ISpecial>
            // parser.Lexer.SpecialTracker.RetrieveSpecials()...
            // "specials" == comments, preprocessor directives, etc.
            // parser.CompilationUnit retrieves the root node of the result AST
            return SexpressionGenerator.Generate(parser.CompilationUnit).ToString();
        }
    }
}
The ParserFactory class is part of NRefactory.
In my case I wanted a lisp s-expression describing the C# buffer, so I wrote an S-expression generator that walked through the "CompilationUnit". It's just a tree of nodes, starting with namespace, then class/struct/enum. Within the class/struct node, there are method nodes (as well as field, property, etc).
If that finished DLL is not of interest, then maybe this is.
Before finding and embracing NRefactory, I tried to produce a wisent grammar for C#. This was for use within Emacs, which has a wisent engine.
I never could get it to work properly.
Maybe it's of use to you.
You said that you didn't want to use "outside tools". I'm not sure of the motivation for that restriction; if it is homework, then I guess it makes sense, but for other purposes it really would be a shame not to use the well-tested and well-understood tools that are already out there.
If you take either of the suggestions I've made here, you're building on something that is an outside tool. But some of the options are a little better than others.
I need to manipulate an existing XML document, and create a new one from it, removing a few nodes and attributes, and perhaps adding new ones, what would be the best group of classes to accomplish this?
There are a lot of .NET classes for XML manipulation, and I'm not sure what would be the optimal way to do it.
If it is a really huge XML document that cannot fit into memory, you should use XmlReader/XmlWriter. If not, LINQ to XML is very easy to use. If you don't have .NET 3.5, you could use XmlDocument.
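Here's an example of removing a node:
using System.Xml.Linq;
using System.Xml.XPath;
var doc = XElement.Load("test.xml");
// XPathSelectElement returns null when nothing matches, so guard the call.
var customer = doc.XPathSelectElement("//customer");
if (customer != null)
{
    customer.Remove();
}
doc.Save("test.xml");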
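Since the question also mentions removing attributes and adding new nodes, here is a slightly fuller sketch using only LINQ to XML. The element and attribute names (customer, note, temp) and the CleanUp method are invented for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class EditDemo
{
    // Builds a new document from the input: drops <note> elements and "temp"
    // attributes, then appends one new <customer> element.
    public static XDocument CleanUp(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Remove every <note> element anywhere in the tree.
        doc.Descendants("note").Remove();

        // Remove the "temp" attribute from every <customer> element.
        doc.Descendants("customer").Attributes("temp").Remove();

        // Add a new element with an attribute under the root.
        doc.Root.Add(new XElement("customer", new XAttribute("id", 2)));

        return doc;
    }

    static void Main()
    {
        XDocument result = CleanUp(
            "<customers><customer id=\"1\" temp=\"x\"><note/></customer></customers>");
        Console.WriteLine(result);
    }
}
```

To write the result to a new file rather than the console, call result.Save with the target path.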
Use LINQ to XML. You can see the XDocument class here.
Transforming the document with an XSLT style sheet might be the easiest option if it is just a conversion process.
Here is how to use XSLT in .NET.
and
Here is an introduction to XSLT.
It confused me a bit at first, but now I pretty much use XSLT to do all my XML conversions.
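A minimal sketch of what such a conversion can look like with XslCompiledTransform. The style sheet is the usual identity transform plus one empty template that drops the nodes you don't want; the element and attribute names (note, temp) and the Strip method are made up for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class XsltDemo
{
    // Identity transform that copies everything except <note> elements
    // and "temp" attributes.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <xsl:template match='note|@temp'/>
</xsl:stylesheet>";

    public static string Strip(string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsltReader);
        }

        var output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Strip(
            "<customers><customer id='1' temp='x'><note/></customer></customers>"));
    }
}
```

In practice the style sheet would normally live in its own .xslt file and be loaded by path.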
If you have an official schema, you can use the XmlSerializer. Otherwise it is best to use the XmlDocument, XmlNode, XmlElement etc classes.
Otherwise it could also depend on what you are using the xml for, i.e. marking up some document, representing objects etc.
I'm having to use .NET 2.0 so can't use any of the nice XDocument stuff.
I'm wondering if anyone has seen any helper/utility methods that still use XmlDocument but make xml creation a bit less tedious?
You could look at the XmlHandler class in Pluto.
It uses XmlDocument internally, but allows very simple reading and writing of values, including handling arrays, classes, etc, with reading and writing to specific locations via XPath queries.
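If pulling in a library is not an option, even a tiny helper of your own takes away much of the CreateElement/AppendChild noise. The sketch below is not Pluto's XmlHandler; XmlHelper and AppendElement are hypothetical names, and it avoids extension methods and var so that it should compile with a C# 2.0 compiler.

```csharp
using System;
using System.Xml;

static class XmlHelper
{
    // Creates an element, optionally sets its text, appends it to the parent
    // and returns it so that calls can be chained.
    public static XmlElement AppendElement(XmlNode parent, string name, string text)
    {
        // OwnerDocument is null when the parent is the document itself.
        XmlDocument doc = parent as XmlDocument ?? parent.OwnerDocument;
        XmlElement element = doc.CreateElement(name);
        if (text != null)
        {
            element.InnerText = text;
        }
        parent.AppendChild(element);
        return element;
    }
}

static class Program
{
    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = XmlHelper.AppendElement(doc, "config", null);
        XmlHelper.AppendElement(root, "name", "demo").SetAttribute("lang", "en");
        Console.WriteLine(doc.OuterXml);
    }
}
```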