How to write an XSD to validate random element names? - c#

I have written an application which receives many variations of XML requests. In our business we have to validate each XML document against an XSD at the beginning of any request.
The problem:
As I said, I have to validate them up front. These XML documents have almost the same schema, so I need to write one general XSD for them.
I have provided some prototype XML for my question:
XML1:
<_9D94DEB4-7C2D-45A5-A4FB-89FB1CF20672>
<Param1>value</Param1>
<Type>Category</Type>
</_9D94DEB4-7C2D-45A5-A4FB-89FB1CF20672>
XML2: Almost the same schema but root element name is different and it has an extra child element.
<_7603DCD1-F270-43EA-86E3-0FB3161478F6>
<Param1>value</Param1>
<Type>Page</Type>
<SearchText>Sample</SearchText>
</_7603DCD1-F270-43EA-86E3-0FB3161478F6>
As you can see, the root element names are different but their schema is almost the same. How could I write a general XSD for them?
Thanks in advance.

The only thing these two XML instances have in common is that both have a Type element whose value is a string. Calling that "almost the same schema" seems rather an exaggeration. But perhaps there is more commonality than you have shown us?
In principle XSD allows you to validate the instance against a global type in your schema, irrespective of the element name. Whether your particular schema processor provides an API to do that is another question.
Your schema could then simply define the top-level type:
<xs:complexType name="myTopLevelType">
  <xs:sequence>
    <xs:element name="Param1" type="xs:string"/>
    <xs:element name="Type" type="xs:string"/>
    <!-- etc. -->
  </xs:sequence>
</xs:complexType>
If you choose Saxon as your schema validator then you can invoke "validation-by-type" from the Java API but not from the command line. In fact, probably the easiest way to do it is to invoke the validation from XSLT:
<xsl:import-schema schema-location="mySchema.xsd"/>
then:
<xsl:copy-of select="doc('instance.xml')/*" type="myTopLevelType"/>
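Put together, the two XSLT fragments above might sit in a stylesheet like the following. This is only a sketch: it assumes a schema-aware XSLT 2.0 processor, a no-namespace schema in mySchema.xsd that defines myTopLevelType, and an initial template named main (a hypothetical name chosen here).

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Makes the type definitions in the schema known to the stylesheet -->
  <xsl:import-schema schema-location="mySchema.xsd"/>

  <!-- "main" is a hypothetical entry point, invoked as the initial template -->
  <xsl:template name="main">
    <!-- Copies the root element, whatever its name, and validates the copy
         against the named type; an invalid instance raises an error -->
    <xsl:copy-of select="doc('instance.xml')/*" type="myTopLevelType"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the type attribute names a type rather than an element declaration, the root element's name plays no part in the validation.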

All you need to do is declare each of the root elements as a direct child of your schema element and define the types in your XSD.
For example:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="_9D94DEB4-7C2D-45A5-A4FB-89FB1CF20672">
    ...
    <xs:complexType>
      <xs:attribute name="Param2" type="Param2" use="required"/>
      <xs:attribute name="Type" type="Type" use="required"/>
      <xs:attribute name="SearchText" type="SearchText" use="required"/>
    </xs:complexType>
    ...
  </xs:element>
  <xs:element name="_7603DCD1-F270-43EA-86E3-0FB3161478F6">
    ...
    <xs:complexType>
      <xs:attribute name="Param1" type="Param1" use="required"/>
      <xs:attribute name="Type" type="Type" use="required"/>
    </xs:complexType>
    ...
  </xs:element>
  <!-- Your definition and restriction of types -->
  <xs:simpleType name="Param1">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:simpleType name="Param2">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:simpleType name="Type">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:simpleType name="SearchText">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>
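Since the sample requests in the question carry their data in child elements, a variant of this idea could declare one shared complex type and point every known root element at it. The sketch below is an assumption about what such a schema might look like; RequestType is a hypothetical name, and SearchText is made optional only because one of the two samples omits it.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- RequestType is a hypothetical name for the shared content model -->
  <xs:complexType name="RequestType">
    <xs:sequence>
      <xs:element name="Param1" type="xs:string"/>
      <xs:element name="Type" type="xs:string"/>
      <!-- Optional, because only some requests carry it -->
      <xs:element name="SearchText" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- One global element declaration per known root element name -->
  <xs:element name="_9D94DEB4-7C2D-45A5-A4FB-89FB1CF20672" type="RequestType"/>
  <xs:element name="_7603DCD1-F270-43EA-86E3-0FB3161478F6" type="RequestType"/>

</xs:schema>
```

This only works when the set of root names is known in advance; a new root name needs a new element declaration.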

Related

XML Schema Union getting "string" is not valid for the element error

I am an experienced programmer but just recently took on a job maintaining an app that uses xml schema. They want to add some validation on an item that accepts Longitude. They want to continue to accept a blank and also 0, 0.0000000, or if another value is entered they want to make sure that at the least it is in the United States. (i.e. between -125 and -67)
The current xml schema simply allows any value.
<xs:element name="Location">
<xs:complexType>
<xs:sequence>
<xs:element name="LocLongitude"/>
</xs:sequence>
</xs:complexType>
</xs:element>
There are multiple vendors sending this info in. Here is an example of what they may send:
<Location>
<LocLongitude xsi:type="xsd:string"></LocLongitude>
</Location>
Now looking at what the users want I found that I can use a union to encapsulate multiple checks. This is what I am using now.
<xs:element name="Location">
<xs:complexType>
<xs:sequence>
<xs:element name="LocLongitude" nillable="true">
<xs:simpleType>
<xs:union>
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value=""/>
<xs:enumeration value="0"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType>
<xs:restriction base="xs:double">
<xs:minInclusive value="0.00000000"/>
<xs:maxInclusive value="0.00000000"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType>
<xs:restriction base="xs:double">
<xs:minInclusive value="-125"/>
<xs:maxInclusive value="-67"/>
</xs:restriction>
</xs:simpleType>
</xs:union>
</xs:simpleType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
It validates correctly if I use:
<Location>
<LocLongitude />
</Location>
Now if I use what the current vendors are using (see below):
<Location>
<LocLongitude xsi:type="xsd:string"></LocLongitude>
</Location>
We get an error:
THE XSI:TYPE ATTRIBUTE VALUE '' IS NOT VALID FOR THE ELEMENT 'LOCLONGITUDE', EITHER BECAUSE IT IS NOT A TYPE VALIDLY DERIVED FROM THE TYPE IN THE SCHEMA, OR BECAUSE IT HAS XSI:TYPE DERIVATION BLOCKED.
My question is, can I get this to work while still allowing the vendors to include xsi:type="xsd:string"?
No, the type named in xsi:type must be validly derived from the type declared for the associated element. You cannot on the one hand define a type that restricts the value space and on the other hand support a broader xsi:type declaration.
See also: How to restrict the value of an XML element using xsi:type in XSD?
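To illustrate the rule, here is how a few instances would be expected to fare against the union-based schema from the question. The value -98.5 is an arbitrary example, and the namespace declarations on the last instance are assumed, since the vendor sample does not show them.

```xml
<!-- Accepted: empty content matches the "" enumeration -->
<Location>
  <LocLongitude/>
</Location>

<!-- Accepted: -98.5 lies within the -125 to -67 range -->
<Location>
  <LocLongitude>-98.5</LocLongitude>
</Location>

<!-- Rejected: xsd:string is not derived from the element's union type -->
<Location xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <LocLongitude xsi:type="xsd:string"></LocLongitude>
</Location>
```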

Allow XSD date element to be empty string

I would like to allow an element to be an xs:date or an empty string.
Here's an XML Schema that I've tried:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:lp="urn:oio:ebst:diadem:lokalplan:1"
targetNamespace="urn:oio:ebst:diadem:lokalplan:1"
elementFormDefault="qualified" xml:lang="DA"
xmlns:m="urn:oio:ebst:diadem:metadata:1">
<xs:import schemaLocation="../key.xsd" namespace="urn:oio:ebst:diadem:metadata:1" />
<xs:element name="DatoVedtaget" type="lp:DatoVedtagetType" />
<xs:complexType name="DatoVedtagetType">
<xs:simpleContent>
<xs:extension base="xs:date">
<xs:attribute ref="m:key" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="DatoVedtagetTypeString">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute ref="m:key" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:schema>
I want the element to be of type DatoVedtagetType when it contains a value, and DatoVedtagetTypeString when it is empty. How can I implement such conditional behavior in this schema?
Per the comments on the question, the goal is to have DatoVedtaget be an xs:date or empty. Here is a way to express such a constraint:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:lp="urn:oio:ebst:diadem:lokalplan:1"
xmlns:m="urn:oio:ebst:diadem:metadata:1"
targetNamespace="urn:oio:ebst:diadem:lokalplan:1"
elementFormDefault="qualified"
xml:lang="DA">
<xs:import schemaLocation="../key.xsd" namespace="urn:oio:ebst:diadem:metadata:1" />
<xs:element name="DatoVedtaget" type="lp:DatoVedtagetType" />
<xs:simpleType name="empty">
<xs:restriction base="xs:string">
<xs:enumeration value=""/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="dateOrEmpty">
<xs:union memberTypes="xs:date lp:empty"/>
</xs:simpleType>
<xs:complexType name="DatoVedtagetType">
<xs:simpleContent>
<xs:extension base="lp:dateOrEmpty">
<xs:attribute ref="m:key" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:schema>
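For illustration, these are instances one would expect the schema above to accept or reject. The date value is made up, and the optional m:key attribute is left out.

```xml
<!-- Valid: content is an xs:date -->
<DatoVedtaget xmlns="urn:oio:ebst:diadem:lokalplan:1">2012-06-01</DatoVedtaget>

<!-- Valid: empty content matches the lp:empty member of the union -->
<DatoVedtaget xmlns="urn:oio:ebst:diadem:lokalplan:1"/>

<!-- Invalid: neither a date nor the empty string -->
<DatoVedtaget xmlns="urn:oio:ebst:diadem:lokalplan:1">not a date</DatoVedtaget>
```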
Rather than using a union type as @kjhughes does, my own preferred solution is to use a list type allowing zero or one occurrences:
<xs:simpleType name="dateOrEmpty">
  <xs:restriction>
    <xs:simpleType><xs:list itemType="xs:date"/></xs:simpleType>
    <xs:maxLength value="1"/>
  </xs:restriction>
</xs:simpleType>
One reason for the preference is that it's less code. Another reason is that if you're writing schema-aware XSLT or XQuery code, the resulting value is easier to manipulate (the atomized value has type xs:date? which is easier to manipulate, e.g. to test for empty, than a union type).

WCF Schema common elements always null

I am using xsd to generate the object available in the OperationContract. The address, city, state and zip elements of the XSD are common
<xs:element name="Address" nillable="true">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="50"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
and used throughout the XML.
<xs:element ref="Address" />
When I compile the schema the classes generate correctly using the common elements.
When I run the service the OperationContext contains the expected request from the client:
<NameLast>Last</NameLast>
<NameFirst>First</NameFirst>
<Address xmlns="http://tempuri.org/">123 2nd St</Address>
<City xmlns="http://tempuri.org/">Somewhere</City>
However the common elements have the xmlns attribute (shown above) and in the received object all common elements contain null values.
My reputation is not high enough to post a screenshot, but all data NOT in common elements is passed correctly. For example, NameLast = "Last", but Address = null.
I am new to using Schemas and would appreciate any direction. Thanks.
I believe what you want is this, placed before your closing </xs:schema> tag:
<xs:simpleType name="AddressType">
<xs:restriction base="xs:string">
<xs:maxLength value="50"/>
</xs:restriction>
</xs:simpleType>
and used throughout the schema:
<xs:element name="Address" type="AddressType" />
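In context, that suggestion might look like the sketch below. The Customer wrapper element and the xs:string types on the name elements are assumptions, since the original schema is not shown in full.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Customer is a hypothetical container; the real one is not shown -->
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="NameLast" type="xs:string"/>
        <xs:element name="NameFirst" type="xs:string"/>
        <!-- Local declaration reusing the shared type, instead of ref="Address" -->
        <xs:element name="Address" type="AddressType" nillable="true"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- The shared restriction, defined once and referenced by name -->
  <xs:simpleType name="AddressType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="50"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

One possible reason this helps: when a schema has a target namespace and leaves elementFormDefault at its default of unqualified, elements pulled in with ref are global and therefore namespace-qualified, while locally declared elements are not. That would be consistent with the xmlns attributes seen only on the common elements, though whether it is the cause here is an assumption.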

Xml validation error The value is invalid according to its datatype 'CopyFrom' - The Enumeration constraint failed

I am trying to validate an XML file.
My xsd schema fragment:
<xs:attribute name="PostIndex" use="optional">
<xs:annotation>
<xs:documentation>Post Index</xs:documentation>
</xs:annotation>
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="6"/>
<xs:pattern value="\d{0}|\d{6}"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
XML file fragment:
<Atr1>
<Atr2 Atr="A9F130BE-3974-4698-B9F9-72037BC0E97F" PostIndex="123456" />
<Atr2 Atr3="123" Atr4="11111" />
</Atr1>
When I run the validation code, I get this error during schema validation:
The 'PostIndex' attribute is invalid - The value '123456' is invalid
according to its datatype 'String' - The Enumeration constraint
failed.
Here are the XSD and XML I used; they work fine, so please post your entire XSD and XML.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="computer">
<xs:annotation>
<xs:documentation xml:lang="it-IT">Definizione di un computer</xs:documentation>
<xs:documentation xml:lang="en-US">Definition of a computer</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:attribute name="PostInt">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="6"/>
<xs:pattern value="\d{0}|\d{6}"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:complexType>
</xs:element>
</xs:schema>
XML
<computer PostInt="123456" />
I validated these with an online validator.
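A side note on the facets in the fragment: the length facet requires exactly six characters, so the empty alternative in the pattern can never match. If the intent is "empty or six digits" (an assumption, since the question does not say so), a sketch without the length facet could look like this:

```xml
<xs:attribute name="PostIndex" use="optional">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- No xs:length facet here: it would reject the empty string -->
      <xs:pattern value="\d{0}|\d{6}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:attribute>
```

This does not explain the reported enumeration error, which points at a facet that is not part of the posted fragment.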

Validation xml to xsd to only catch specific errors

I have an import file that needs to skip and continue on specific errors. I want to ignore errors about data type, min/max length, and required fields. I want to catch and display errors about items that are not formatted correctly or are in the wrong location.
In this case the file contains a collection of people.
The errors I want to catch are:
1: A Children node outside of a Person node.
2: A Child outside of a Person node.
3: A Person outside of the People node.
The errors I want to ignore are:
1: Child does not have a name.
2: Person does not have a birth date.
<xs:element name="People">
<xs:complexType>
<xs:sequence>
<xs:element name="Person" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:all>
<xs:element name="FirstName" minOccurs="1" maxOccurs="1">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="LastName" minOccurs="1" maxOccurs="1">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="BirthDate" type="Date" minOccurs="1" maxOccurs="1"/>
<xs:element name="Children">
<xs:complexType>
<xs:sequence>
<xs:element name="Child" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:all>
<xs:element name="FirstName" minOccurs="0" maxOccurs="1">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="BirthDate" type="Date" minOccurs="1" maxOccurs="1"/>
</xs:all>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:all>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
Change your schema as follows:
In the FirstName element declaration under Child, add the attribute type="xs:string" and remove all the content of the element declaration (simpleType and so on). You can make the tag self-closing if you want.
In the BirthDate element declaration, change minOccurs from 1 to 0.
The first change removes the restriction currently placed on the child's name that the content be at least one character long. Adding the type attribute is necessary because you're removing the current definition of the element type.
The second change tells the validator that the birth date is not required.
With those changes, the XML that you want to validate should pass.
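Applied to the schema in the question, the two changes would produce declarations roughly like these. They are shown as isolated fragments; everything around them is assumed to stay as posted, and the BirthDate shown is the one under Person.

```xml
<!-- Child's FirstName: the minLength restriction is replaced by a plain string type -->
<xs:element name="FirstName" type="xs:string" minOccurs="0" maxOccurs="1"/>

<!-- Person's BirthDate: no longer required -->
<xs:element name="BirthDate" type="Date" minOccurs="0" maxOccurs="1"/>
```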
Whether to stop or carry on when invalid input is presented is, in principle, entirely the consuming software's choice, so what you are describing is logically coherent, though perhaps a bit unusual. If you can get the information you need through the API you're using, no reason not to make the software behave as you describe, and that's probably the preferable option. (But I can't help you with it.)
If you can't get the required information through the API (some APIs do assume that validation is just a yes/no kind of thing), one possible fallback alternative would be to validate using a separate schema weakened as described in ssamuel's answer, so that the only validation errors are the ones you wish to regard as fatal.
That is, there are two ways to solve this problem: (1) move past the idea that you must always abort on invalid input, and get the validator to give you more information so you can decide whether to stop or continue; or (2) move past the idea that there is a single schema that applies to all processing of the document, and use one schema for document creation and a different schema for deciding what to do with the input.
