Validation Patterns for Custom XML Documents - c#

I have a web application that generates a medium-sized XML dataset to be consumed by a third party.
I thought it would be a good idea to provide some form of schema document for the XML that I generate, so I pasted the XML into Visual Studio and got it to generate an XSD.
The annoying thing is that my XML doesn't validate against the XSD that was generated!
Is it better to roll your own XSD?
What about different schema docs like DTDs, Relax NG, or Schematron?
The key is that I would like to be able to validate my document using C#.
What are your XML validation strategies?

Whether you choose XSD and/or Schematron depends on what you are trying to validate. XSD is probably the most common validation strategy, but there are limits on what it can validate. If all you want to do is ensure that the right type of data is in each field, XSD should work for you. If you need to assert, for example, that the value of the <small> element is less than the value of the <big> element, or even more complex business rules involving multiple fields, you probably want Schematron or a hybrid approach.
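As a rough illustration of the hybrid approach, a cross-field rule such as "small must be less than big" can be checked in plain C# after the XSD has confirmed the data types. This is only a sketch: the function name CheckSmallLessThanBig and the sample document are made up for illustration, and it assumes both elements contain numeric text.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical sample: the second <range> breaks the small < big rule.
var doc = XDocument.Parse(
    "<ranges>" +
    "<range><small>1</small><big>10</big></range>" +
    "<range><small>7</small><big>3</big></range>" +
    "</ranges>");

var problems = CheckSmallLessThanBig(doc);

// Returns one message per parent element whose <small> is not below its <big>.
List<string> CheckSmallLessThanBig(XDocument document)
{
    var found = new List<string>();
    foreach (var parent in document.Descendants())
    {
        var small = parent.Element("small");
        var big = parent.Element("big");
        if (small is null || big is null) continue;

        // The explicit XElement-to-decimal conversion throws FormatException
        // on non-numeric text, so run this after type validation.
        if ((decimal)small >= (decimal)big)
            found.Add($"<{parent.Name.LocalName}>: small ({small.Value}) must be less than big ({big.Value})");
    }
    return found;
}
```

Keeping such rules in code is a design choice; Schematron expresses the same kind of assertion declaratively, at the cost of an extra toolchain.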

You can validate your XML against either an XML Schema or a DTD using C#. DTD is the older standard of the two.
So, I recommend an XML Schema approach.
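A minimal sketch of XSD validation in C#, using XmlSchemaSet together with the XDocument.Validate extension method from System.Xml.Schema. The schema, the sample documents and the helper name ValidateAgainstXsd are assumptions made for the example, not part of the original question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical schema: a <range> element holding two integers.
const string sampleXsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='range'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='small' type='xs:int'/>
        <xs:element name='big' type='xs:int'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

var okErrors = ValidateAgainstXsd("<range><small>1</small><big>10</big></range>", sampleXsd);
var badErrors = ValidateAgainstXsd("<range><small>1</small><big>ten</big></range>", sampleXsd);

// Collects validation messages instead of throwing on the first problem.
List<string> ValidateAgainstXsd(string xmlText, string xsdText)
{
    var schemas = new XmlSchemaSet();
    // Passing null uses the targetNamespace declared in the schema itself.
    schemas.Add(null, XmlReader.Create(new StringReader(xsdText)));

    var errors = new List<string>();
    var parsed = XDocument.Parse(xmlText);
    parsed.Validate(schemas, (sender, e) => errors.Add($"{e.Severity}: {e.Message}"));
    return errors;
}
```

Collecting the messages this way also makes it easier to see why a generated XSD rejects the XML it was generated from.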

Related

How to deal with a XML based protocol where the response may conform to one of two XSDs?

I have to read and write data through a protocol where the response XML may be different according to the error state of the server application. If the response is good it uses, let's say, Xml_1 with a specific schema, but if the response indicates an error it uses Xml_2 with a completely different schema. The good design, in my opinion, would be to incorporate the error state into the first schema, but we are just consumers of this service and we don't have access to the design of the server application. My solution is to (using C#) read the XML response as a string, do some searching in order to understand which XML schema is in use, and then use the appropriate XmlSerializer to convert the response to an object. Is there a more elegant solution?
Is the union of the two schemas a valid schema? (This will typically be the case, for example, if they use different namespaces, but it's likely not to be the case if they are both no-namespace schemas or if they are two versions of the same schema).
If the union is a valid schema, then you could consider validating against that.
Otherwise peeking at the start of the file will often be enough to tell you which vocabulary is in use.
It's possible to parse an XML document without validation, inspect it, and then validate the already parsed document. It's even possible to do this in a single pipeline without putting the whole document in memory. But the details depend on the toolkit you are using. You've tagged the question C# - I'm not sure if this is possible using the Microsoft tools, but it should be possible I think using Saxon-CS. [Disclaimer, my product].
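With the Microsoft tools, peeking at the start of the document can be sketched with XmlReader.MoveToContent, which skips the XML declaration, comments and whitespace and stops at the root element. The element name, the namespace and the helper name PeekRootElement below are invented for illustration.

```csharp
using System.IO;
using System.Xml;

// Hypothetical error response; the element name and namespace are invented.
var errorResponse =
    "<?xml version='1.0'?><Error xmlns='urn:example:errors'><Message>Boom</Message></Error>";

var root = PeekRootElement(errorResponse);
bool isError = root.LocalName == "Error" && root.NamespaceUri == "urn:example:errors";
// Next step (not shown): choose the XmlSerializer for the matching class based on isError.

// Reads only as far as the root element and reports its name and namespace.
(string LocalName, string NamespaceUri) PeekRootElement(string xmlText)
{
    using var reader = XmlReader.Create(new StringReader(xmlText));
    reader.MoveToContent();
    return (reader.LocalName, reader.NamespaceURI);
}
```

This avoids ad hoc string searching, because the decision is based on the parsed root element rather than on text that might also appear elsewhere in the response.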

Validate population of generated classes against XSD

I've got an XSD which I've turned into classes using the VS XSD.exe
The code will populate these classes before getting to a stage when I then want to validate that they are "correctly" populated - for example, if a field is mandatory in the XSD, but it is a string object in the generated class, will anything stop me from leaving it blank? - if nothing stops me it will be invalid! (Something like this question)
I'm trying to avoid generating my XML and then reading it back in and validating the actual XML against the XSD if this is at all possible. I want to know that the population of the classes has been done wrong before I even attempt to generate the XML.
Any thoughts or examples would be great!
There are lots of articles about validating XML against an XSD but I can't find anything helpful about validating the population of the generated classes against an XSD. I don't know if this is possible!
xsd.exe can add some extra boolean properties (for example: ThisStringSpecified) which you can use when validating mandatory strings. Set the flag to true only when the value is actually present:
myClass.ThisStringSpecified = !string.IsNullOrEmpty(myClass.ThisString);
You can test it afterwards by serializing the object and using the XSD to validate the result (if you want to be really sure).

Validation of XML using XSD

I am creating a small application in which I read data from XML.
I am using XmlSerializer to read the data.
Before reading, I validate the XML using an XSD.
However, there are some cases which I think cannot be implemented using XSD.
Some of the validation is based on the value of another element.
I want to perform all of these validations before I read the data from the XML.
So is there any way to validate the XML before reading the data, including the cases that cannot be implemented using XSD?
Thanks for your support.
I don't think that's possible with XSD.
See the other answers:
XML Schema - child elements dependent on existence of optional attribute
Restricting XML Elements Based on Another Element via XSD

Overriding or ignoring undeclared entities in C# using LINQ

I have a little utility that runs through looking for certain things in XML files using LINQ. It processes a MASSIVE collection of them rather quickly and nicely. However, about 20% of a certain batch of files fail to be read and are skipped, failing because of the degree symbol's presence as &deg; in the files. This is the "Reference to undeclared entity 'deg'." error that a previous question was about.
The solutions offered in the previous question cannot be directly applied here. I am not at liberty to go around modifying the files, and making copies of them and replacing instances or inserting tags in the copies seems inefficient. What would be the best way to go about getting LINQ to ignore the undeclared entities, which have absolutely no bearing on what my program does anyway? Or is there perhaps a good way of getting an XDocument.Load to be fed some entity declarations beforehand?
Unfortunately entities form part of the well-formedness rules for XML (2.1 Well-Formed XML Documents). It seems like you're saying you want the XDocument.Load to load what is notionally an XML file, but does not in fact conform to the rules, which it won't do, quite reasonably.
If your users are passing you what are supposed to be XML files, but that have undefined entities, then either you have to get them to provide the files in a valid format, or manage the incorrectness yourself at load time, in the ways that have been suggested.
It seems to me, from your restrictions, that the neatest approach would be to follow the linked example and create some settings to pass into the XmlReader, along the lines of (Validating an XML Document in the DOM).
If there are entities which aren't defined and aren't listed in public schemas, you'll need to create your own schema (in practice a DTD, since that is where entities are declared) which defines all the entities you need. So, create a generic settings object for the XmlReader which references your own custom schema. Add the necessary entities to it as files fail to load, and you'll build up a list of all the entities that you need to define in order for the XML files to be valid.
Then, for each document you try to load, create an XmlReader for the file using the settings above and call the XDocument.Load(XmlReader) overload.
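One way to feed entity declarations to XDocument.Load without touching the files is to hand the reader an XmlParserContext whose internal DTD subset declares the missing entities. The sketch below assumes the files carry no DOCTYPE of their own; the helper name LoadWithEntities and the sample input are hypothetical.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Entity declarations collected so far; extend this as more files fail to load.
const string entityDeclarations = "<!ENTITY deg \"&#176;\">";

var loaded = LoadWithEntities(
    new StringReader("<reading>20&deg;C</reading>"), "reading", entityDeclarations);

// Loads a document while supplying entity declarations through the parser context.
XDocument LoadWithEntities(TextReader input, string rootName, string internalSubset)
{
    var settings = new XmlReaderSettings
    {
        DtdProcessing = DtdProcessing.Parse, // the default (Prohibit) rejects any DTD information
        XmlResolver = null                   // do not fetch external DTDs or entities
    };
    var context = new XmlParserContext(
        null, null, rootName, null, null, internalSubset, "", "", XmlSpace.Default);

    using var reader = XmlReader.Create(input, settings, context);
    return XDocument.Load(reader);
}
```

For files on disk, a StreamReader over each file can be passed as the input, so the same declarations are reused across the whole batch.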

Read DTD or Schema and list all valid child elements or attributes for a given element

I want to develop an application that is something like an XML editor, providing an IntelliSense-like feature: when the user types an element, the application reads the DTD or schema and lists the valid child elements and attributes (something like Oxygen XML Editor).
Is there an API that I can use to get this done?
I'm not familiar with an API that performs this task.
If you choose to implement this yourself, however, here's a couple of thoughts:
An XML schema is itself an XML file, that is structured according to the meta-schema. You can easily use one of the existing APIs to unmarshal a schema into an object structure that you can easily work with in-memory.
A DTD is not an XML structure, but any DTD can be represented as a simple schema. Therefore you should try and find a way to convert a DTD into a schema (and apply your schema solution).
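For .NET, one possible way to get the in-memory object structure described above is the Schema Object Model in System.Xml.Schema. The sketch below compiles a schema and lists the child elements and attributes declared for a global element. The schema and the helper names are made up for the example, and wildcards (xs:any) are simply ignored.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Schema;

// Hypothetical schema used only for this demonstration.
const string editorXsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='book'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='title' type='xs:string'/>
        <xs:choice>
          <xs:element name='author' type='xs:string'/>
          <xs:element name='editor' type='xs:string'/>
        </xs:choice>
      </xs:sequence>
      <xs:attribute name='isbn' type='xs:string'/>
    </xs:complexType>
  </xs:element>
</xs:schema>";

var schemaSet = new XmlSchemaSet();
schemaSet.Add(null, XmlReader.Create(new StringReader(editorXsd)));
schemaSet.Compile(); // populates the post-compilation properties used below

var childNames = ChildElementNames(schemaSet, new XmlQualifiedName("book"));
var attributeNames = AttributeNames(schemaSet, new XmlQualifiedName("book"));

// Lists the element names allowed inside the given global element.
List<string> ChildElementNames(XmlSchemaSet schemas, XmlQualifiedName elementName)
{
    var names = new List<string>();
    if (schemas.GlobalElements[elementName] is XmlSchemaElement { ElementSchemaType: XmlSchemaComplexType type })
        CollectElementNames(type.ContentTypeParticle, names);
    return names;
}

// Lists the attribute names declared for the given global element.
List<string> AttributeNames(XmlSchemaSet schemas, XmlQualifiedName elementName)
{
    if (schemas.GlobalElements[elementName] is XmlSchemaElement { ElementSchemaType: XmlSchemaComplexType type })
        return type.AttributeUses.Values.Cast<XmlSchemaAttribute>()
                   .Select(a => a.QualifiedName.Name).ToList();
    return new List<string>();
}

// Walks sequence/choice/all groups recursively and records element names.
void CollectElementNames(XmlSchemaParticle particle, List<string> output)
{
    switch (particle)
    {
        case XmlSchemaElement element:
            output.Add(element.QualifiedName.Name);
            break;
        case XmlSchemaGroupBase group:
            foreach (XmlSchemaObject item in group.Items)
                if (item is XmlSchemaParticle child) CollectElementNames(child, output);
            break;
    }
}
```

An editor would additionally need to track the cursor position in the content model to offer only the elements valid at that point; this sketch lists every element the type can contain.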
HTH
You might find XSD4J useful:
XSD4J is a library to parse XML Schema files into a structure of Java objects, convert those back into an XML DOM tree (and hence plain text) again, and allow for performing several queries on the XSD objects. The library currently supports most real-world features such as simple and complex types, type restrictions and attributes.
