I'm looking to parse an XML file using Python, and I was wondering if there is any way to automate the task instead of manually walking through all the XML nodes/attributes with the xml.dom.minidom library.
Essentially, what would be sweet is if I could load an XML schema for the file I am reading and have that automatically generate some kind of data structure holding all of the data in the XML.
In C# land this is possible by creating a strongly typed DataSet class from an XML schema and then using this DataSet to read the XML file in.
Is there any equivalent in Python?
lxml is a super-robust XML parsing package. It includes a subpackage, lxml.objectify, that will make an object tree from your XML.
It doesn't generate a class from the schema -- that's probably more a C#/Java thing -- but it does do schema validation so you know what kind of object you're getting back (see "asserting a schema").
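To give a rough feel for what an "object tree" means here, the sketch below imitates the idea using only the standard library's xml.etree.ElementTree rather than lxml.objectify itself. The `Node` wrapper class and the sample document are hypothetical, made up purely for illustration; lxml.objectify's real API is richer and differs in detail.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper: reach child elements as attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal lookup fails. Private names are refused
        # so a missing _element can never cause infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        return Node(child)

    @property
    def text(self):
        # Text content of the wrapped element (shadows a child named "text").
        return self._element.text

    def attrib(self, key):
        # Value of an XML attribute on the wrapped element, or None.
        return self._element.get(key)


# Assumed sample document, not taken from the question.
SAMPLE = (
    '<order id="42">'
    "<customer><name>Ada</name></customer>"
    "<total>19.99</total>"
    "</order>"
)

root = Node(ET.fromstring(SAMPLE))
print(root.customer.name.text)
print(root.attrib("id"))
```

Every value comes back as a string here; converting `root.total.text` to a number is left to the caller, which is exactly the kind of work a schema-aware tool would take off your hands.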
You might take a look at lxml.objectify, particularly the E-factory. It's not really an equivalent to the ADO tools, but you may find it useful nonetheless.
Hey dude, try BeautifulSoup. It is a super library. Head over to the site scraperwiki.com; they can help you!
Related
I have an XSD file, which seems rather complex (I am very new to working with XSD).
My task is to create a program which generates XML files based on the XSD schema (in more detail: we will get a CSV file with the data, and this needs to be serialized into XML). I did some research and tried various techniques for generating a C# class from the XSD file, of which the most 'compact' was the xsd2code plugin for Visual Studio.
Nonetheless, this plugin generated over 7,000 lines of code, which quite shocked me as it was just one giant mess (for me).
My question now is: is there a better way (or maybe some switch I forgot to check) that will generate a more compact C# class? If not, what is the next step people take once they have the C# class? Do they have to do additional manual post-processing so that the file is more 'programmer-friendly', or ...?
Thank you for your guidance; any help or tip will be highly appreciated!
I want to pass an XML object from the code-behind file of an .aspx page to a class library. How can I create an XML object for that?
Please, it's urgent.
.NET includes multiple XML APIs: XmlDocument (a typical DOM implementation), a streaming API, an XPath-oriented API, and LINQ to XML. So there is a lot to choose from.
Without more detail it is impossible to say which is your best approach. I would suggest starting by reading MSDN at "XML Documents and Data".
Load an XML file from disk
http://msdn.microsoft.com/en-us/library/875kz807.aspx
or some XML from a string
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.loadxml.aspx
Is there something available that could help me convert an XSD into SQL relational tables? The XSD is rather big (in my world anyway), and I could save time and boring typing if something gave me a head start rather than starting from scratch.
The XSD is here if you want to have a look. It's a standardized/localized format to exchange MSDS.
Altova's XML Spy has a feature that will generate SQL DDL Script from an XSD file. XML Spy will cost you some money though.
Interestingly enough, a developer used a really clever trick of using an XSLT translation to create the DDL script from an XSD file. They have outlined it in two parts here and here.
I might have to try this out myself for future use...
EDIT: Just found this question asked previously here...
There is a command-line tool called XSD2DB, available on SourceForge, that generates a database from XSD files.
For more info: please refer to this existing question How can I create database tables from XSD files?
You can use an XSLT transform. See, for example, here: Generating SQL from XSD and XSL stylesheets with XSLT.
Microsoft has a command-line tool for performing XSLT transformations: Microsoft Command-Line tool for XSLT.
It is also easy to integrate the transforms into a build process using MSBuild or Grunt.
Here is the reference for the Microsoft documentation: XML Standards Reference, including XSD, XSLT, etc.
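The answers above rely on XSLT or dedicated tools. Purely to illustrate the underlying idea (walk the schema, emit one table per top-level element), here is a minimal sketch in plain Python instead of XSLT. The type mapping, the sample schema, and the `xsd_to_ddl` function are all assumptions for illustration; it handles only flat `xs:sequence` content with the `xs:` prefix and ignores nesting, keys, and cardinality, so a real XSD would need far more.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed mapping from a few XSD built-in types to SQL column types.
SQL_TYPES = {
    "xs:string": "TEXT",
    "xs:int": "INTEGER",
    "xs:integer": "INTEGER",
    "xs:decimal": "NUMERIC",
    "xs:date": "DATE",
}

# Hypothetical sample schema, not the MSDS schema from the question.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="price" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def xsd_to_ddl(xsd_text):
    """Emit one CREATE TABLE per top-level element with a flat sequence."""
    schema = ET.fromstring(xsd_text)
    statements = []
    for table in schema.findall(f"{XS}element"):
        fields = table.findall(f"{XS}complexType/{XS}sequence/{XS}element")
        if not fields:
            continue  # skip simple elements; they have no columns
        columns = [
            f"  {field.get('name')} {SQL_TYPES.get(field.get('type'), 'TEXT')}"
            for field in fields
        ]
        statements.append(
            f"CREATE TABLE {table.get('name')} (\n" + ",\n".join(columns) + "\n);"
        )
    return "\n\n".join(statements)


print(xsd_to_ddl(SAMPLE_XSD))
```

Unknown types fall back to TEXT here, which is a deliberate simplification; for a large schema the XSLT or XSD2DB routes mentioned above are likely to get you further.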
Using JAXB in Java, it is easy to generate a set of Java classes from an XML schema file, so that XML conforming to that schema can be deserialized into them.
Is there some C# equivalent of JAXB? I know that LINQ can serialize and deserialize classes to/from XML files. But how can I generate C# classes from an XML schema file and then use these classes with LINQ?
If you're using Visual Studio, try the XML Schema Definition Tool. It takes your schema definitions and produces C# classes -- or it can go the other way and produce schema definitions from classes. It also has a number of other XML-related transformations.
There is a better tool from Microsoft called XsdObjectGen, the XSD Object Code Generator. It is like xsd.exe, but better. Also free, it is not part of the .NET SDK, but is a separate download.
Also see the SO question: XSDObjectGen vs Xsd.exe
Look into using DataSet. It's a bit of a different concept from using "Java Beans". The entire XML document is treated as a hierarchical set of tables, all in a single class. The good part is that the theory of encapsulation for OOP is actually enforced. Wow, Microsoft got something right that Sun pooched.
Anyway. You can also look at typed DataSets if you want to make things more interesting. I've used this on major projects with great success.
What are the best functions, practices, and/or techniques to read/write XML with C#/.NET?
If you are working with .NET 3.5, LINQ to XML is certainly a very good way to interact with XML.
MSDN Link
There are classes to read XML:
XmlDocument is slow and memory-intensive: it parses the XML and loads it into an in-RAM DOM, which is good if you want to edit it.
XmlReader is less memory-intensive: it scans the XML from front to back, never needing to keep all of it in RAM at once.
Similarly, for writing you can construct an XmlDocument and then save it, or use an XmlWriter.
Since I wrote the above, a new set of APIs has appeared which is easier to use: for example, the XDocument and XElement classes.
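The trade-off described in this answer, loading a whole DOM versus streaming front to back, is not specific to .NET. As a rough sketch of the same idea in Python (not the C# classes named above), the standard library's `ET.fromstring` builds the full tree in memory, while `ET.iterparse` yields elements as they are completed so each can be discarded after use. The sample data and both function names are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Assumed sample data: five items priced 1 to 5.
SAMPLE = (
    "<items>"
    + "".join(f"<item><price>{i}</price></item>" for i in range(1, 6))
    + "</items>"
).encode("utf-8")


def total_in_memory(data):
    """DOM-style: parse everything first, then query the full tree."""
    root = ET.fromstring(data)
    return sum(int(price.text) for price in root.iter("price"))


def total_streaming(stream):
    """Streaming-style: handle each <item> as soon as it has been read."""
    total = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "item":
            total += int(elem.findtext("price"))
            elem.clear()  # drop the item's children to keep memory low
    return total


print(total_in_memory(SAMPLE))
print(total_streaming(io.BytesIO(SAMPLE)))
```

Both functions compute the same sum; the streaming version is more awkward to write but never needs the whole document's content in memory at once, which mirrors the XmlReader side of the comparison.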
By far the simplest method I've found for dealing with XML in C# is to use the XML Serialization tools. For example: http://www.dotnetjohn.com/articles.aspx?articleid=173.
Essentially, you can define C# classes that match your XML file (in fact, you can have them created for you if you have an XML definition file) and then you simply initialize instances of those classes directly from the XML file. Once you have them as instances, you can manipulate them as you wish and rewrite them back into XML files just as easily.
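The round trip described here (XML to class instances, modify, back to XML) can be sketched in Python to show the shape of the idea; this is an analogy, not the .NET XML Serialization tools themselves. The `Book` class, its fields, and the element names are hypothetical, and the mapping is written by hand here, whereas the .NET tools derive it from the class definition.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical class matching a <book><title/><year/></book> document."""

    title: str
    year: int

    @classmethod
    def from_xml(cls, text):
        # Build an instance directly from the XML text.
        elem = ET.fromstring(text)
        return cls(title=elem.findtext("title"), year=int(elem.findtext("year")))

    def to_xml(self):
        # Write the instance back out using the same element names.
        elem = ET.Element("book")
        ET.SubElement(elem, "title").text = self.title
        ET.SubElement(elem, "year").text = str(self.year)
        return ET.tostring(elem, encoding="unicode")


book = Book.from_xml("<book><title>Dune</title><year>1965</year></book>")
book.year += 1  # manipulate it as an ordinary object
print(book.to_xml())
```

Once the data lives in a typed object, the rest of the program never touches XML nodes directly, which is the main appeal of the serialization approach.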
In a performance-critical application, XmlReader/XmlWriter are a good choice (see here), at the cost of the simplicity offered by LINQ to XML and XmlDocument.
I've found the MvpXml project very useful in past scenarios where performance is a consideration. There's a wealth of knowledge about good practice within their project pages: http://www.codeplex.com/MVPXML