xsd validation limit root element - c#

I have this xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
<xsd:element name="F">
<xsd:complexType>
<xsd:sequence>
<xsd:element maxOccurs="unbounded" name="A" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
This XML passes validation, but it should be rejected:
<F><F>
<A/>
</F></F>
Only this XML should be valid:
<F>
<A/>
</F>
How can I do this in XSD?
C# code:
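// xmlPath and xsdpath are the paths of the XML and XSD files.
XmlDocument xml = new XmlDocument();
using (MemoryStream ms = new MemoryStream(File.ReadAllBytes(xmlPath)))
{
    xml.Load(ms);
}
XmlSchemaSet schemas = new XmlSchemaSet();
schemas.Add("", xsdpath);
XDocument _xml = XDocument.Parse(xml.OuterXml);
// Note: this empty handler silently swallows every validation error.
_xml.Validate(schemas, (o, e) => { });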

Result.
Validate() reports only errors, not warnings.
XmlReader offers more options for checking XML against an XSD.
If the XML and the XSD use different namespaces, Validate() always succeeds.
To fix this, remove the namespaces from both files or declare the same namespace in both.
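A minimal sketch of the XmlReader route mentioned above, assuming the schema and the document are available as strings. The names XsdCheck and Validate are made up for illustration. It turns on ReportValidationWarnings so that a namespace mismatch shows up as warnings instead of passing silently:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class XsdCheck
{
    // Hypothetical helper: returns every validation event (errors and warnings).
    public static List<ValidationEventArgs> Validate(string xsd, string xml)
    {
        var events = new List<ValidationEventArgs>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        // Without this flag, missing schema information is not reported at all.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;

        using (var xsdReader = XmlReader.Create(new StringReader(xsd)))
        {
            // null means: use the targetNamespace declared in the schema itself.
            settings.Schemas.Add(null, xsdReader);
        }
        settings.ValidationEventHandler += (sender, e) => events.Add(e);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens while reading
        }
        return events;
    }
}
```

With the schema from the question, the nested <F><F><A/></F></F> document should produce at least one event with Severity == XmlSeverityType.Error, and a document in a namespace the schema does not cover should produce warnings only.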

Related

Xml Cleaning based on XSD in C#

How to clean an XML file removing all elements not present in a provided XSD?
This does not work:
public static void Main()
{
XmlTextReader xsdReader = new XmlTextReader(@"books.xsd");
XmlSchema schema = XmlSchema.Read(xsdReader, null);
XmlReaderSettings settings = new XmlReaderSettings();
settings.Schemas.Add(schema);
settings.ValidationType = ValidationType.Schema;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);
XmlReader xmlReader = XmlReader.Create(@"books.xml", settings);
XmlWriter xmlWriter = XmlWriter.Create(@"books_clean.xml");
xmlWriter.WriteNode(xmlReader, true);
xmlWriter.Close();
xmlReader.Close();
}
private static void ValidationCallBack(object sender, ValidationEventArgs args)
{
((XmlReader)sender).Skip();
}
When I use the above, instead of removing all "junk" tags, it removes only the first junk tag and leaves the second one. As for why I need to accept this file: I am using an old SQL Server 2012 instance, which requires the XML to match the XSD exactly, even if the extra elements in the XML are not used by the application. I do not have control over the source XML, which is produced by a third-party tool with an unpublished XSD.
Sample Files:
Books.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="bookstore">
<xs:complexType>
<xs:sequence>
<xs:element name="book" maxOccurs="unbounded" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="title"/>
<xs:element type="xs:float" name="price"/>
</xs:sequence>
<xs:attribute type="xs:string" name="genre" use="optional"/>
<xs:attribute type="xs:string" name="ISBN" use="optional"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Books.xml
<bookstore>
<book genre='novel' ISBN='10-861003-324'>
<title>The Handmaid's Tale</title>
<price>19.95</price>
<junk>skdjgklsdg</junk>
<junk2>skdjgklsdg</junk2>
</book>
<book genre='novel' ISBN='1-861001-57-5'>
<title>Pride And Prejudice</title>
<price>24.95</price>
<junk>skdjgssklsdg</junk>
</book>
</bookstore>
Code mostly copied from: Validating an XML against referenced XSD in C#
If it's simply a question of removing all elements whose names don't appear anywhere in the schema, then it is probably feasible, as described below. However, in the general case (a) this doesn't ensure the instance will be valid against the schema (the elements might be in the wrong order, for example), and (b) it might remove elements that the schema actually allows (because of wildcards).
If the approach of removing unknown elements looks useful, you could do it as follows:
(a) write an XSLT stylesheet that extracts all the element names from the schema by looking for xs:element[@name] declarations, generating a document with the format:
<allowedElements>
<allow name="book" namespace=""/>
<allow name="isbn" namespace=""/>
</allowedElements>
(b) write a second (streamable) XSLT stylesheet:
<xsl:transform version="3.0" xmlns:xsl="....">
<xsl:mode on-no-match="shallow-copy" streamable="yes"/>
<xsl:key name="k" match="allow" use="@name, @namespace" composite="yes"/>
<xsl:template match="*[not(key('k', (local-name(), namespace-uri()), doc('allowed-elements.xml')))]"/>
</xsl:transform>
The code below removes all of the junk tags from the provided examples. The second xsl:template has the more specific match pattern, so it takes priority for every element that is not explicitly whitelisted and produces no output for it. The first xsl:template then copies the remaining nodes to the XmlWriter.
Code:
public static void Main()
{
XmlReader xmlReader = XmlReader.Create("books.xml");
XslCompiledTransform myXslTrans = new XslCompiledTransform();
myXslTrans.Load("books.xslt");
XmlTextWriter myWriter = new XmlTextWriter("books_clean.xml", null);
myXslTrans.Transform(xmlReader, null, myWriter);
xmlReader.Close();
myWriter.Close();
}
books.xslt
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:mode streamable="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[
not(name()='bookstore') and
not(name()='book') and
not(name()='title') and
not(name()='price')
]" />
</xsl:stylesheet>
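The whitelist in books.xslt is hard-coded. As an alternative sketch (LINQ to XML rather than XSLT, so not the method of either answer above), the allowed names could be read from the xs:element declarations in books.xsd. XmlCleaner and RemoveUndeclaredElements are hypothetical names, and the earlier caveats still apply: namespaces, element order, and wildcards are ignored.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlCleaner
{
    private static readonly XNamespace Xs = "http://www.w3.org/2001/XMLSchema";

    // Removes every element whose local name is not declared by name in the schema.
    public static XDocument RemoveUndeclaredElements(XDocument schema, XDocument instance)
    {
        // Collect the name attribute of every xs:element declaration.
        var allowed = new HashSet<string>(
            schema.Descendants(Xs + "element")
                  .Select(e => (string)e.Attribute("name"))
                  .Where(name => name != null));

        // The Remove() extension snapshots the sequence before detaching nodes,
        // so it is safe to call on a live Descendants() query.
        instance.Descendants()
                .Where(e => !allowed.Contains(e.Name.LocalName))
                .Remove();

        return instance;
    }
}
```

Assuming the sample files from the question, it could be called as XmlCleaner.RemoveUndeclaredElements(XDocument.Load("books.xsd"), XDocument.Load("books.xml")).Save("books_clean.xml").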

"The element is missing.." trying to generate class with <element ref = />

Using the XSD tool included with VS 2013, I receive the following message trying to generate a class from an xsd that contains <xsd:element ref=.../> -
Schema validation warning: The 'http://www.w3.org/2000/09/xmldsig#:KeyName' element is not declared. Line 14, position 8.
Warning: Schema could not be validated. Class generation may fail or may produce incorrect results.
Error: Error generating classes for schema 'test'.
- The element 'http://www.w3.org/2000/09/xmldsig#:Signature' is missing.
This is a cut down xsd that demonstrates the problem:
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema id="test"
targetNamespace="http://tempuri.org/test.xsd"
elementFormDefault="qualified"
xmlns="http://tempuri.org/test.xsd"
xmlns:mstns="http://tempuri.org/test.xsd"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:sig="http://www.w3.org/2000/09/xmldsig#"
>
<xsd:import schemaLocation="xmldsig-core-schema.xsd" namespace="http://www.w3.org/2000/09/xmldsig#" />
<xsd:complexType name="test" >
<xsd:sequence >
<xsd:element ref="sig:Signature" minOccurs="0" maxOccurs="unbounded"></xsd:element>
</xsd:sequence>
</xsd:complexType>
<xsd:element type="test" name="top"/>
</xsd:schema>
I'm pretty sure the import and namespaces are okay. Resharper and the VS Schema Designer do not complain. I suspect that this is something that the tool just doesn't do.
Any ideas how I can proceed?
It turns out that this has been answered here.
https://stackoverflow.com/a/17278163/2516770
I need to add the imported file to the file list of the xsd command line parameters:
xsd test.xsd xmldsig-core-schema.xsd /c

xsd to c# without included xsd

Are the included/imported XSD files required in order to generate classes from an XSD file? Is it possible to generate the classes without them? I ask because I want to write my own XSD-to-C# class generator.
<?xml version="1.0"?>
<xsd:schema xmlns="" xmlns:Extended="" xmlns:xsd="" targetNamespace="">
<xsd:include schemaLocation="file2.xsd"/>
<xsd:import namespace="" schemaLocation="file2.xsd"/>
<xsd:complexType name="ActualEndTimeType">
<xsd:simpleContent>
<xsd:restriction base="DateTimeType"/>
</xsd:simpleContent>
</xsd:complexType>
<xsd:complexType name="ActualFinishTimeType">
<xsd:simpleContent>
<xsd:restriction base="DateTimeType"/>
</xsd:simpleContent>
</xsd:complexType>
<xsd:complexType name="ActualStartTimeType">
<xsd:simpleContent>
<xsd:restriction base="DateTimeType"/>
</xsd:simpleContent>
</xsd:complexType>
<!-- etc... -->
</xsd:schema>

Why is my XmlReader not validating against the Schema?

I'm trying to read an XML file and validate against the schema specified by that file. I will not know the schema's location ahead of time, so I need to use the schema specified by the xml file.
Here's the relevant code (inspired by this answer):
var settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationFailed);
//settings.Schemas.Add("http://www.publishing.org", new XmlTextReader(@"C:\path\to\schema\Book.xsd"));
validatingReader = XmlReader.Create(xmlInputReader, settings);
while (validatingReader.Read()) ;
If I uncomment the settings.Schemas.Add line and comment the settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation out, everything works. I have also tested both the schema and the XML against an external validator.
The event handler message reports "Cannot load the schema for the namespace 'http://www.publishing.org' - Specified argument was out of the range of valid values. Parameter name: baseUri." It occurs on line 2 (at the root element) and is followed by "Could not find schema information for the element 'http://www.publishing.org:[each element]'."
My first thought (and still the only thing I know it can be) was that the URI wasn't pointing to the xsd, but I've used 1) A full path via file:///C:\path\to\schema\Book.xsd, 2) A URI relative to the xml file, and 3) A URI relative to the application's current directory. The Visual Studio XML editor has no problem with any of these, but the XmlReader can't seem to find any of them.
Here's a simple schema and an xml instance (my actual schema is more complex, but this fails too):
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org"
version="1.0" elementFormDefault="qualified">
<xsd:element name="Book" type="BookType"/>
<xsd:complexType name="BookType">
<xsd:sequence>
<xsd:element name="Title" type="xsd:string" minOccurs="1" maxOccurs="1"/>
<xsd:element name="Author" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
<xsd:element name="Date" type="xsd:string" minOccurs="1" maxOccurs="1"/>
<xsd:element name="ISBN" type="xsd:string" minOccurs="1" maxOccurs="1"/>
<xsd:element name="Publisher" type="xsd:string" minOccurs="1" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
<?xml version="1.0" encoding="UTF-8"?>
<Book xmlns="http://www.publishing.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.publishing.org ../etc/Book.xsd">
<!-- Book.xsd file:///C:\path\to\schema\Book.xsd -->
<Title>Historic Doubts Relative to Napoleon Bonaparte</Title>
<Author>Richard Whately</Author>
<Author>Whately, Richard</Author>
<Date>1849</Date>
<ISBN>1465554777</ISBN>
<Publisher>Warren P. Draper</Publisher>
</Book>
I think everything is correct concerning my namespaces. I have also tried loading through an XmlDocument, but I get the same results. It has to be a problem locating the XSD, right?
I agree that it is probably a path problem.
I was able to use your code (and the example you used :) ).
I tested the validation against a local copy of the XSD by putting my local file path inside the XML.
It reported nothing when I used your exact XML, and raised a validation error when I changed a tag.
My xsi:schemaLocation looks like:
xsi:schemaLocation="http://www.publishing.org C:\Users\Mike\Desktop\xml_test_files\test.xsd"
Did you try that simple local folder path?
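One possible cause, offered as an assumption and not a confirmed diagnosis: the reader in the question is created from xmlInputReader, which may carry no base URI, so a relative xsi:schemaLocation has nothing to resolve against. A sketch that opens the document by path instead, so that XmlReader.BaseURI is set. SchemaLocationCheck and ValidateFile are hypothetical names, and the XmlUrlResolver is assigned explicitly so the sketch does not depend on the runtime's default resolver:

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;

public static class SchemaLocationCheck
{
    // Validates the file against whatever schema its xsi:schemaLocation points to.
    public static List<string> ValidateFile(string xmlPath)
    {
        var messages = new List<string>();
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            XmlResolver = new XmlUrlResolver() // used to fetch the referenced schema
        };
        settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler +=
            (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        // Creating the reader from a path gives it a base URI, which relative
        // schemaLocation values are resolved against.
        using (var reader = XmlReader.Create(xmlPath, settings))
        {
            while (reader.Read()) { }
        }
        return messages;
    }
}
```

If the document has to come from a stream or TextReader, the XmlReader.Create overloads that take a baseUri string argument are another option to try.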

Validating XML against XSD with <xs:any/> - Warning : Could not find schema information for the element

I am trying to validate my XML files against an XSD to check if the files have the correct format.
In my XSD file, I want the Row element to accept any number of elements of any kind, hence the any element.
Using an online validator, I checked that the XSD itself is valid and validated one of my files against it. Everything was valid.
The online validator is this one: http://www.utilities-online.info/xsdvalidation/
I based my parsing code on this topic: c# XML Schema validation
My code reports that the files are not valid: Could not find schema information for the element <MYELEMENT>
The elements that are not found are the ones inside my Row element.
The complete .XSD is :
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element name="Row" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:any minOccurs='1' maxOccurs='unbounded' processContents="lax" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
The XML I tested with is :
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Root>
<Row>
<MODE_SAISIE_CT>'DEGRADE'</MODE_SAISIE_CT>
<MODE_STATUT>'F'</MODE_STATUT>
<MODE_LIBELLE>'Dégradé'</MODE_LIBELLE>
<DATE_MODE_DEGRADE>'17/08/2011 15:28:17'</DATE_MODE_DEGRADE>
</Row>
<Row>
<MODE_SAISIE_CT>'STANDARD'</MODE_SAISIE_CT>
<MODE_STATUT>'V'</MODE_STATUT>
<MODE_LIBELLE>'Standard'</MODE_LIBELLE>
<DATE_MODE_DEGRADE>'17/08/2011 15:53:06'</DATE_MODE_DEGRADE>
</Row>
</Root>
How can I manage the parsing if I have an any element in my schema?
Without seeing a complete XSD and input XML that exhibit the issue, it's unclear what to recommend, but perhaps this working example will help you identify your problem:
This input XML:
<?xml version="1.0" encoding="utf-8"?>
<root>
<Row>
<MYELEMENT/>
</Row>
</root>
Is valid against this XSD:
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="root">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Row">
<xsd:complexType>
<xsd:sequence>
<xsd:any processContents="lax" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
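In .NET, "Could not find schema information for the element" is typically reported as a warning, not an error, for elements matched by a lax xs:any wildcard, and only when ReportValidationWarnings is set. A sketch that treats only errors as failures, with the schema and document passed as strings for brevity. LaxValidation and IsValid are hypothetical names:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class LaxValidation
{
    // Returns true when validation raises no event of severity Error.
    public static bool IsValid(string xsd, string xml)
    {
        bool hasErrors = false;
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;

        using (var schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            settings.Schemas.Add(null, schemaReader);
        }

        settings.ValidationEventHandler += (sender, e) =>
        {
            // Warnings (such as undeclared elements under a lax wildcard) are ignored.
            if (e.Severity == XmlSeverityType.Error)
            {
                hasErrors = true;
            }
        };

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return !hasErrors;
    }
}
```

Checking e.Severity in the handler keeps the warnings available for logging while letting the document count as valid.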
