I need to validate a small fragment of an XML file against a schema. Essentially, I'd like to ask the question "Does element X in XML document Y conform to its type as defined in schema Z?" and, if not, get a message describing why. This has to account for restrictions placed on those types as well (e.g. maxLength, minInclusive).
Is this possible?
I don't know about doing this from C#, but it's easily done in XQuery or XSLT 2.0. In XSLT 2.0 it's:
<xsl:copy-of select="doc('doc.xml')//selected/element" validation="strict"/>
and in XQuery it's
validate strict {doc('doc.xml')//selected/element}
All you need is a schema-aware XQuery or XSLT 2.0 processor that runs in your chosen environment.
It turns out this was much easier than I expected. The solution was to create a new schema that contains as its root the one element I want to verify. Once this schema is added to the schemaset, you can simply validate the fragment as you would any complete document.
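A minimal sketch of that approach in C#. The wrapper schema, the `code` element and the `FragmentValidator` class are made-up names for illustration; in practice the wrapper schema would declare your own element and reference its existing type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class FragmentValidator
{
    // Hypothetical wrapper schema: its single global element is the one to check.
    public const string WrapperXsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='code'>
    <xs:simpleType>
      <xs:restriction base='xs:string'>
        <xs:maxLength value='3'/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>";

    // Returns one message per validation error; an empty list means no errors were reported.
    public static List<string> Validate(string fragmentXml, string xsd)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (var reader = XmlReader.Create(new StringReader(fragmentXml), settings))
        {
            while (reader.Read()) { } // validation happens as the reader advances
        }
        return errors;
    }
}
```

Calling `FragmentValidator.Validate("<code>ABCD</code>", FragmentValidator.WrapperXsd)` should return a message about the maxLength facet, while `<code>AB</code>` should return an empty list.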
This Microsoft Knowledge Base article describes validating XML fragments and may be useful:
http://support.microsoft.com/kb/318504
Related
I have an incoming XML file that must contain a few specific tags; without them I cannot process the file. How can I check whether those tags are present? I tried XSD validation, but the file format keeps changing: the sender keeps adding tags that I do not need in order to process the file. Those additional tags do not harm my process.
Is there a way to write the XSD in a way that it only looks for a few tags and ignore the others?
You can create an XSD that declares all of the elements you require. By default an element has minOccurs=1, which means it is required. Then, to ignore everything else, add <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>, which says that the XML may contain any number of additional elements that do not need to be validated. (Without minOccurs="0" the wildcard itself defaults to minOccurs=1, so at least one extra element would be required.)
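Here is a rough C# sketch of such a schema in use. The `order`, `id` and `name` elements and the `RequiredTagsCheck` class are invented for the example. Note that inside an xs:sequence the wildcard only tolerates extra elements that come after the required ones; extras mixed in between would still fail.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class RequiredTagsCheck
{
    // Hypothetical schema: <id> and <name> are required, anything after them is tolerated.
    public const string Xsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='order'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='id' type='xs:int'/>
        <xs:element name='name' type='xs:string'/>
        <xs:any processContents='lax' minOccurs='0' maxOccurs='unbounded'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // True when the reader reports no validation errors against the schema above.
    public static bool IsValid(string xml)
    {
        bool valid = true;
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler += (sender, e) => valid = false;

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // reading to the end drives validation
        }
        return valid;
    }
}
```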
Consider forgoing an XSD and instead writing XPath checks against the XML to test the known-invariant properties of your XML. XSD is better when you have a known, relatively static grammar. Ad hoc XPath assertions or Schematron would be better for XML that can't be held to a definitive grammar.
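As a sketch of the XPath route in C#, assuming LINQ to XML is available; the `RequiredTags` class and the paths passed to it are made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.XPath;

public static class RequiredTags
{
    // Returns the XPath expressions that matched no element in the document.
    public static List<string> Missing(string xml, params string[] requiredPaths)
    {
        var doc = XDocument.Parse(xml);
        var missing = new List<string>();
        foreach (string path in requiredPaths)
        {
            if (doc.XPathSelectElement(path) == null)
                missing.Add(path);
        }
        return missing;
    }
}
```

For example, `RequiredTags.Missing(xml, "/order/id", "/order/name")` returns the paths that are absent, and any tags not listed are simply never looked at.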
I've been given the task of writing a complex XML file (I do have the XML schema, XSD) in C#, which can be quite large depending on the situation. I'd like to implement streaming since the file can be large, so it looks like the best option is to use XmlWriter. Before I go down the path of extending the XmlWriter class and writing a bunch of custom code, I was wondering whether it is possible to leverage the XML schema I have. I know I can convert my schema to C# objects using the XML Schema Definition Tool in Visual Studio, but I don't know if this is something I can use with XmlWriter. I've converted an XML schema to C# objects and serialized them using XmlSerializer in the past, but not with XmlWriter.
See Generating XML Documents from XML Schemas
http://msdn.microsoft.com/en-us/library/aa302296.aspx
Summary: Priya Lakshminarayanan shows how you can use the classes in the System.Xml.Schema namespace of the Microsoft .NET Framework to build a tool that generates sample XML documents that conform to a given schema.
In the Visual Studio Schema Explorer you are able to generate an instance document from any element definition in your Schema. This article exposes the underlying code that makes that happen. I should note that Altova's XMLSpy has a more flexible tool for generating instances from the Schema, allowing you to set various parameters about the depth, repetition, and generated text values.
I used the XMLGenerator code included in the article to create a class that generates new XML document instances from my Schema for the 20 types of documents that we define. I added hints in my Schema as attributes in my own namespace to help the XMLGenerator generate a minimal valid document with some default text to help the users get started with the new document. So there is a lot you can do with the XmlGenerator.
When I am editing an XML document that has an XmlSchema, how can I programmatically determine the elements that can be inserted next? I am using C# and I already know which element I am in. Is there an MSXML method I can call or something else? Thanks.
Sounds like you are after the .NET Schema Object Model (SOM).
Schema Object Model
Here is an article on how to work with the SOM.
Example 1
Tarzan,
As I understand it, you are trying to determine the legal XML that can be added at a specific place in the document, based on the schema being used. If that is correct, it is a very difficult problem to solve. If you have an "any" element in your XSD, the complexity increases because the next element can literally be any element! Also, XSD types can be derived from other types (i.e., an element definition structure based on another structure), which introduces more complexity. There are only a couple of products (Oxygen, Visual Studio) that have attempted this with any success (that I know of).
If your schema is fairly simple, and doesn't include any of these deal breakers, you might be able to use the Schema Object Model to find the legal elements at your current location, but only if you know what portion of the XSD applies to your current element.
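For the simple case, a sketch of walking the SOM in C# might look like the following. It only lists the children declared in a plain xs:sequence of a global element, so it says what is allowed inside the element, not what may come next at a given position. The `SchemaChildren` class and its method are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaChildren
{
    // Lists the child element names declared in the sequence of a global element's complex type.
    public static List<string> ChildNames(string xsd, string elementName)
    {
        var set = new XmlSchemaSet();
        set.Add(null, XmlReader.Create(new StringReader(xsd)));
        set.Compile(); // post-compilation properties are only populated after this call

        var names = new List<string>();
        var element = set.GlobalElements[new XmlQualifiedName(elementName)] as XmlSchemaElement;
        var type = element?.ElementSchemaType as XmlSchemaComplexType;
        var sequence = type?.ContentTypeParticle as XmlSchemaSequence;
        if (sequence == null) return names; // choice, all and simple content are not handled here

        foreach (XmlSchemaObject item in sequence.Items)
        {
            // Wildcards and nested groups are skipped in this sketch.
            if (item is XmlSchemaElement child)
                names.Add(child.QualifiedName.Name);
        }
        return names;
    }
}
```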
Does this make sense?
Erick
I've done some XML serialization before, but I used attributes; I'm not sure this is doable for my next assignment. Here's a brief list of the XML manipulation requirements:
General-purpose XML manipulation, tied to a treeview, no schema.
Load/Save XML.
Load/Save Attributes as well as Values (I believe the term is Element Text?), and be mindful of the Node's name.
Comments can be safely ignored, as can document info markup (i.e., the UTF-8 and schema tags).
Any suggestions on how best to handle this?
I probably wouldn't bother with an object model and IXmlSerializable - it sounds like you might just as well talk in terms of an XmlElement / XmlDocument - i.e. pass the data around as a block of xml. Since you have no schema it would be pointless to shred it out; you might as well do it via an xml DOM.
When you say treeview - is this winforms, asp.net, wpf? I believe the asp.net treeview can take an xml source, but for winforms you'd have to iterate the nodes yourself.
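Iterating the nodes yourself could look roughly like this in C#. The sketch builds an indented text outline instead of real tree nodes so that it stays UI-independent; the `XmlOutline` class is a made-up name, and in WinForms you would create a TreeNode where the sketch appends a line.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlOutline
{
    // Produces an indented outline: element names, attributes and text values.
    public static string Describe(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var sb = new StringBuilder();
        Walk(doc.DocumentElement, 0, sb);
        return sb.ToString();
    }

    private static void Walk(XmlNode node, int depth, StringBuilder sb)
    {
        string indent = new string(' ', depth * 2);
        switch (node.NodeType)
        {
            case XmlNodeType.Element:
                sb.Append(indent).Append(node.Name);
                foreach (XmlAttribute attr in node.Attributes)
                    sb.Append(' ').Append(attr.Name).Append('=').Append(attr.Value);
                sb.Append('\n');
                foreach (XmlNode child in node.ChildNodes)
                    Walk(child, depth + 1, sb);
                break;
            case XmlNodeType.Text:
                sb.Append(indent).Append('"').Append(node.Value).Append("\"\n");
                break;
            // Comments and other node types are skipped, as the question allows.
        }
    }
}
```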
Don't know what exactly you mean with "before but i used Attributes" but I would recommend XmlSerializer too:
With "simple" classes it works usually out of the box.
Collections might need some more work, but it depends on your requirements and object structure.
There are other built-in XML serializers like XAML or the WCF DataContractSerializer. All have pros and cons. But if you want to fine-tune your XML format, XmlSerializer is the most flexible one.
You can approach your format step by step: if the default looks good, you're done. If not, in most cases you just have to add some attributes.
If you want complete control, you can still implement IXmlSerializable to fine-tune your format.
Everything applies on a per class basis: Use the default where appropriate, add some attributes where required and implement IXmlSerializable as required.
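A small sketch of the "add some attributes" step, using a made-up `Book` class. The serialization attributes are the standard ones from System.Xml.Serialization; the class and helper names are only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("book")]
public class Book
{
    [XmlAttribute("id")]          // written as an XML attribute on <book>
    public int Id { get; set; }

    [XmlElement("title")]         // written as a child element
    public string Title { get; set; }

    [XmlArray("tags"), XmlArrayItem("tag")] // collection wrapped in <tags>, one <tag> per item
    public List<string> Tags { get; set; } = new List<string>();
}

public static class BookXml
{
    public static string Save(Book book)
    {
        var serializer = new XmlSerializer(typeof(Book));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, book);
            return writer.ToString();
        }
    }

    public static Book Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Book));
        using (var reader = new StringReader(xml))
        {
            return (Book)serializer.Deserialize(reader);
        }
    }
}
```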
I would suggest using the simple XML serialization supported by the .NET Framework.
Go through these MSDN articles:
How to Serialize an object
How to Deserialize an object
The following questions are about XML serialization/deserialization and schema validation for a .net library of types which are to be used for data exchange.
First question, if I have a custom xml namespace say "http://mydomain/mynamespace" do I have to add a
[XmlRoot(Namespace = "http://mydomain/mynamespace")]
to every class in my library, or is there a way to define this namespace as the default for the whole assembly?
Second question, is there a reason behind the always added namespaces
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
even if there is no actual reference to any of the namespaces? I just feel they add noise to the resulting XML. Is there a way to remove them and only have the custom namespace in the resulting XML?
Third question, are there tools to support the generation of schema definitions (e.g. for all public [Serializable] classes of an assembly) and the validation of xml against specific schemas available?
If there are, would you recommend XML Schema from W3C or RELAX NG?
Just to add - the "xsi" etc is there to support things like xsi:nil on values later on - a well-known pattern for nullable values. It has to write the stream "forwards only", and it doesn't know (when it writes the first bit) whether it will need nil or not, so it assumes that writing it unnecessarily once is better than having to use the full namespace potentially lots of times.
1) XmlRoot can only be set at the class/struct/interface level (or on return values). So you can't use it on the assembly level. What you're looking for is the XmlnsDefinitionAttribute, but I believe that only is used by the XamlWriter.
2) If you're worried about clutter you should avoid XML. Well-formed XML is full of clutter. I believe there are ways to interact with the XML produced by the serializer, but not directly with the XmlSerializer. You have much more control over the XML produced with the XmlWriter class. Check here for how you can use the XmlWriter to handle namespaces.
3) XSD.exe can be used to generate schemas for POCOs, I believe (I've always written them by hand; I may be using this soon to write up LOTS, tho!).
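On point 2, one commonly used way to drop the default xsi/xsd declarations is to pass your own XmlSerializerNamespaces to Serialize. A minimal sketch, where the `Item` class and `CleanSerializer` helper are made up and the namespace URI is the one from the question:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("item", Namespace = "http://mydomain/mynamespace")]
public class Item
{
    public string Name { get; set; }
}

public static class CleanSerializer
{
    public static string Serialize(Item item)
    {
        // Supplying our own namespace table replaces the default xsi/xsd declarations.
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", "http://mydomain/mynamespace");

        var serializer = new XmlSerializer(typeof(Item));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, item, namespaces);
            return writer.ToString();
        }
    }
}
```

The [XmlRoot] attribute still has to go on each root class; this only affects which namespace declarations are written.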
Tools,
- xsd.exe, with a command line like
xsd /c /n:myNamespace.Schema.v2_0 myschema_v2_0.xsd
I put the schema in a separate project.
- Liquid XML, which is useful if there are several schemas, or if you want full support of the schema features (DateTimes with offsets, positive/negative decimals) and cross-platform generation.