I searched and did not find any questions addressing this problem.
I am attempting to validate various XML against a schema and it seems to be validating ALL well-formed XML, instead of just XML that conforms to the schema. I have posted the code I am using, the Schema, a sample valid XML and a sample invalid XML.
I have been struggling with this for a while, and I am in the dark on most of it. I've had to learn how to write an XSD, write the XSD, then learn how to parse XML in C#, none of which I have ever done before. I have used many tutorials and the Microsoft website to come up with the following. I think this should work, but it doesn't.
What am I doing wrong?
private bool ValidateXmlAgainstSchema(string sourceXml, string schemaUri)
{
bool validated = false;
try
{
// REF:
// This should create a SCHEMA-VALIDATING XMLREADER
// http://msdn.microsoft.com/en-us/library/w5aahf2a(v=vs.110).aspx
XmlReaderSettings xmlSettings = new XmlReaderSettings();
xmlSettings.Schemas.Add("MySchema.xsd", schemaUri);
xmlSettings.ValidationType = ValidationType.Schema;
xmlSettings.ValidationFlags = XmlSchemaValidationFlags.None;
XmlReader xmlReader = XmlReader.Create(new StringReader(sourceXml), xmlSettings);
// parse the input (not sure this is needed)
while (xmlReader.Read()) ;
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlReader);
validated = true;
}
catch (XmlException e)
{
// load or parse error in the XML
validated = false;
}
catch (XmlSchemaValidationException e)
{
// Validation failure in XML
validated = false;
}
catch (Exception e)
{
validated = false;
}
return validated;
}
The XSD / Schema. The intent is to accept XML that contains either an Incident or a PersonOfInterest.
<?xml version="1.0" encoding="utf-8"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="MySchema.xsd"
xmlns="MySchema.xsd"
elementFormDefault="qualified"
>
<xs:element name="Incident" type="IncidentType"/>
<xs:element name="PersonOfInterest" type="PersonOfInterestType"/>
<xs:complexType name="IncidentType">
<xs:sequence>
<xs:element name="Description" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="PersonOfInterest" type="PersonOfInterestType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="PersonOfInterestType">
<xs:sequence>
<xs:element name="Name" type="xs:string" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Here is a sample of valid XML
<?xml version="1.0" encoding="utf-8" ?>
<Incident
xmlns="MySchema.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com MySchema.xsd"
>
<Description>something happened</Description>
<PersonOfInterest>
<Name>Joe</Name>
</PersonOfInterest>
<PersonOfInterest>
<Name>Sue</Name>
</PersonOfInterest>
</Incident>
This is a sample of well-formed invalid XML which should throw an exception (I thought), but when I try it, the code returns true, indicating it is valid against the schema.
<ghost>Boo</ghost>
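The reason your <ghost>Boo</ghost> validates is that the parser cannot find any schema that matches the XML. When no schema applies, the parser assumes validity, provided the XML is well-formed. In .NET the missing schema is reported only as a warning, and only if XmlSchemaValidationFlags.ReportValidationWarnings is set, which your code does not do. It's counter-intuitive, I know, and the behaviour will probably differ between parser implementations.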
This notwithstanding, there are several problems with your code:
Two Root Elements
XSD itself permits several global element declarations, and any of them can serve as the document root, so this is not strictly an error. A single root element is still the more common convention and makes it unambiguous what a valid document looks like, which is why the revised schema below wraps both in one Root element.
Use of schemaLocation attribute
This should take the value "namespace URI", that is, a pair in which the namespace is the targetNamespace of the schema and the URI is the location of the schema document. Your instance pairs http://www.w3schools.com with MySchema.xsd, which does not match the schema's targetNamespace, and you also appear to be using the schema file name as the target namespace. In addition, since your code already loads the schema into the XmlReaderSettings, you don't need the schemaLocation attribute at all. It is only a hint, and some parsers ignore it completely.
I would suggest the following changes:
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://MyMoreMeaningfulNamespace"
xmlns="http://MyMoreMeaningfulNamespace"
elementFormDefault="qualified"
>
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Incident" type="IncidentType"/>
<xs:element maxOccurs="unbounded" name="PersonOfInterest" type="PersonOfInterestType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="IncidentType">
<xs:sequence>
<xs:element name="Description" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="PersonOfInterest" type="PersonOfInterestType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="PersonOfInterestType">
<xs:sequence>
<xs:element name="Name" type="xs:string" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Which validates this instance
<Root xmlns="http://MyMoreMeaningfulNamespace">
<Incident>
<Description>something happened</Description>
<PersonOfInterest>
<Name>Joe</Name>
</PersonOfInterest>
<PersonOfInterest>
<Name>Sue</Name>
</PersonOfInterest>
</Incident>
<Incident>
...
</Incident>
<PersonOfInterest>
<Name>Manny</Name>
</PersonOfInterest>
<PersonOfInterest>
...
</PersonOfInterest>
</Root>
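To illustrate the point about unmatched schemas, here is a minimal C# sketch that reports warnings as well as errors, so that a document with no matching schema is flagged instead of being silently accepted. The class and method names (SchemaCheck, Validate) are made up for this example, and the schema is passed in as a string purely for convenience.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaCheck
{
    // Hypothetical helper: returns every validation message (errors and warnings).
    // An empty list means the document matched the schema.
    public static List<string> Validate(string sourceXml, string schemaXml, string targetNamespace)
    {
        var problems = new List<string>();

        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        using (var schemaReader = XmlReader.Create(new StringReader(schemaXml)))
        {
            settings.Schemas.Add(targetNamespace, schemaReader);
        }

        // Without this flag, "no schema information for this element" stays a silent warning.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) =>
            problems.Add(e.Severity + ": " + e.Message);

        using (var reader = XmlReader.Create(new StringReader(sourceXml), settings))
        {
            while (reader.Read()) { }   // reading the document drives the validation
        }
        return problems;
    }
}
```

With the revised schema above, calling SchemaCheck.Validate("<ghost>Boo</ghost>", xsd, "http://MyMoreMeaningfulNamespace") should come back with at least one warning, whereas the Root instance should come back empty once the "..." placeholders are filled in.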
I used xsd.exe to generate a .cs class.
The xsd file as below
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="SendComments">
<xs:complexType>
<xs:sequence>
<xs:element name="Input">
<xs:complexType>
<xs:sequence>
<xs:element name="TransId" maxOccurs="1" minOccurs="0" type="xs:string"/>
<xs:element name="SampleId" minOccurs="0" maxOccurs="1" type="xs:long"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Output" type="xs:string" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
In my generated class, the correct field is generated.
However, when I call the serializer, the SampleId field is ignored.
Serializer code segment:
var serializer = new XmlSerializer(typeof(SendComments));
using (StringWriter stringWriter = new StringWriter())
{
serializer.Serialize(stringWriter, SPCComment);
return stringWriter.ToString();
}
Result:
<?xml version="1.0" encoding="utf-16"?>
<SPCSendComments xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Input>
<TransId>-</TransId>
</Input>
</SPCSendComments>
I tried other .xsd files, and every primitive type (bool, int, long) is ignored when serializing.
I wonder what causes primitive types to be ignored.
Your generated class has an extra field, SampleIdSpecified, that tells the serializer whether SampleId has a value that should be written. Set it to true and the field will be serialized.
If you set TransId to null, that will also be ignored.
They're being ignored because they're optional fields in your schema. They have minOccurs = 0 which means that they don't need to be there for the XML to be valid.
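The *Specified convention can be shown in isolation. The Input class below is a hand-written stand-in for what xsd.exe generates (the real generated class carries more attributes), and SpecifiedDemo is a hypothetical helper.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the xsd.exe output.
public class Input
{
    public string TransId { get; set; }

    public long SampleId { get; set; }

    // XmlSerializer checks this flag before writing SampleId;
    // the flag itself is never written to the XML.
    [XmlIgnore]
    public bool SampleIdSpecified { get; set; }
}

public static class SpecifiedDemo
{
    public static string Serialize(Input input)
    {
        var serializer = new XmlSerializer(typeof(Input));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, input);
            return writer.ToString();
        }
    }
}
```

Serializing new Input { TransId = "-", SampleId = 42 } leaves the SampleId element out, while adding SampleIdSpecified = true to the initializer makes it appear.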
Our software spits out a number of xml files and I need to determine which is which. For example, there are three different types of xml file (heavily abbreviated):
"IQ.xml"
<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog xmlns:dt="urn:schemas-microsoft-com:datatypes">
<Rec>
<ITEM dt:dt="string"></ITEM>
<QTY dt:dt="string"></QTY>
</Rec>
</Catalog>
"IMR.xml"
<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog xmlns:dt="urn:schemas-microsoft-com:datatypes">
<Rec>
<ITEMS dt:dt="string"></ITEMS>
<MFG dt:dt="string"></MFG>
<ROUTE dt:dt="string"></ROUTE>
</Rec>
</Catalog>
"RP.xml"
<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog xmlns:dt="urn:schemas-microsoft-com:datatypes">
<Rec>
<REF dt:dt="string"></REF>
<PON dt:dt="string"></PON>
</Rec>
</Catalog>
Any one of these could be output at any time, and I need a way to determine where to pass each file. What is the best way to achieve this? Could a schema be used to test the XML file against the fields, with a result passed back?
My initial thought was to test against the first schema and, if it doesn't match, move on to the second, and so on. That would be hard-coded, though, and could not be changed when different XML file types are added later, so I'm not too keen on it. I'm not sure at this stage whether this is even the best approach.
This will be coded in C# so I'm not sure whether there are any inbuilt functions which can help or whether it will have to be custom written.
Has anyone needed to do this before? How did you tackle this?
What I would suggest is to validate the XML file against a schema (as you suggested yourself).
As for making your code flexible enough to support other schemas later, there are many options; which one fits depends on what you want to do.
For example, you can keep all your schemas in a config file, and when you validate a new XML file you can run it programmatically through the supported schemas. If there is no match, you can throw an exception (unsupported XML file structure, for example).
You might also statically define combinations of certain XML files and certain schemas, which you can later look up programmatically.
Of course, when you want to support new schemas you'll need to change the code, but that's normal.
To create a fully generic and automated method of handling any kind of XML file and any kind of schema will be difficult and you'll need to probably use some sort of naming convention where you would deduce the associated schema from the name or from some information embedded inside the XML file. This could be done at runtime but even then you'll probably support only a limited number of behaviors and you'll need new code when you want to expand your application.
Use an XmlReader with an XmlReaderSettings which specifies the type of validation to perform and a ValidationEventHandler. This can be wrapped into a method that will give you the schema or schemas against which the XML document was successfully validated.
If you're concerned about new schemas being added in the future, then just store them in a central location like a directory and grab them at runtime. New schemas could simply be dropped into the directory as needed.
void Main()
{
var rootDirectory = @"C:\Testing";
var schemaDirectory = Path.Combine(rootDirectory, "Schemas");
var dataDirectory = Path.Combine(rootDirectory, "Data");
var schemaFiles = new[] {
Path.Combine(schemaDirectory, "IQ.xsd"),
Path.Combine(schemaDirectory, "IMR.xsd"),
Path.Combine(schemaDirectory, "RP.xsd")
};
var dataFiles = new[] {
Path.Combine(dataDirectory, "IQ.xml"),
Path.Combine(dataDirectory, "IMR.xml"),
Path.Combine(dataDirectory, "RP.xml")
};
var results = FindMatchingSchemas(dataFiles[1], schemaFiles).Dump();
Console.WriteLine("Matching schema is: {0}", results.First(r => r.Value).Key);
}
private static Dictionary<string, bool> FindMatchingSchemas(string dataFile, string[] schemaFiles)
{
var results = new Dictionary<string, bool>();
foreach (var schemaFile in schemaFiles)
{
results.Add(schemaFile, true);
// Set the validation settings.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
settings.ValidationEventHandler += new ValidationEventHandler((object sender, ValidationEventArgs args) =>
{
// Validation error
results[schemaFile] = false;
});
settings.Schemas.Add(null, schemaFile);
// Create the XmlReader object.
using (XmlReader reader = XmlReader.Create(dataFile, settings))
{
// Parse the file; reading it drives the validation.
while (reader.Read()) { }
}
}
return results;
}
// Output: Matching schema is: C:\Testing\Schemas\IMR.xsd
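The "drop new schemas into a directory" idea can be sketched as a variation on the code above. This version discovers the .xsd files at runtime rather than listing them; SchemaRouter and MatchingSchemas are hypothetical names, and the sketch assumes every schema file in the directory is itself valid.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaRouter
{
    // Hypothetical helper: returns the path of every .xsd in schemaDirectory
    // that dataFile validates against without errors or warnings.
    public static List<string> MatchingSchemas(string dataFile, string schemaDirectory)
    {
        var matches = new List<string>();
        foreach (var schemaFile in Directory.GetFiles(schemaDirectory, "*.xsd"))
        {
            bool valid = true;
            var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
            // Report warnings too, so "no schema found for this element" counts as a mismatch.
            settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
            settings.Schemas.Add(null, schemaFile);
            settings.ValidationEventHandler += (sender, e) => valid = false;

            using (var reader = XmlReader.Create(dataFile, settings))
            {
                while (reader.Read()) { }   // reading drives validation
            }
            if (valid) matches.Add(schemaFile);
        }
        return matches;
    }
}
```

Supporting a new file type then only requires copying its schema into the directory. An empty result means the file matched none of the known schemas and can be rejected or logged.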
There is a free website which can generate XSD documents from XML documents. http://www.freeformatter.com/xsd-generator.html
IQ.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Catalog">
<xs:complexType>
<xs:sequence>
<xs:element name="Rec">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="ITEM"/>
<xs:element type="xs:short" name="QTY"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
IMR.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Catalog">
<xs:complexType>
<xs:sequence>
<xs:element name="Rec">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:short" name="ITEMS"/>
<xs:element type="xs:string" name="MFG"/>
<xs:element type="xs:string" name="ROUTE"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
RP.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Catalog">
<xs:complexType>
<xs:sequence>
<xs:element name="Rec">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="REF"/>
<xs:element type="xs:short" name="PON"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Derived from Validating an XML against referenced XSD in C#
I'm having a problem validating XML against schema. Simplified code and examples:
Verification code:
public static void ValidateXmlAgainstSchema(StreamReader xml, XmlSchema xmlSchema)
{
var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };
settings.Schemas.Add(xmlSchema);
settings.ValidationType = ValidationType.Schema;
settings.ValidationEventHandler += (obj, args) => { if (args.Exception != null) throw args.Exception; };
using (var reader = XmlReader.Create(xml, settings))
using (XmlReader validatingReader = XmlReader.Create(reader, settings))
{
while (validatingReader.Read()){}
}
}
Schema:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://foo.com/"
xmlns="http://foo.com/">
<xs:simpleType name="myBool">
<xs:restriction base="xs:string">
<xs:enumeration value="true"/>
<xs:enumeration value="false"/>
<xs:enumeration value="file_not_found"/>
</xs:restriction>
</xs:simpleType>
<xs:complexType name="dataType">
<xs:sequence>
<xs:element name="id" type="xs:string" minOccurs="1" maxOccurs="1" />
<xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1" />
</xs:sequence>
</xs:complexType>
<xs:element name="foo">
<xs:complexType>
<xs:sequence>
<xs:element name="data" type="dataType" minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="myBool" type="myBool" use="optional" />
</xs:complexType>
</xs:element>
</xs:schema>
XML:
1.
<?xml version="1.0"?>
<foo xmlns="http://foo.com/" myBool="true">
<data>
<id>1</id>
<name>abc</name>
</data>
</foo>
This example throws an exception:
System.Xml.Schema.XmlSchemaValidationException: The element 'foo' in namespace 'http://foo.com/' has invalid child element 'data'
in namespace 'http://foo.com/'. List of possible elements expected: 'data'.
My understanding is that if the namespace is defined for an element, all child elements will have the same namespace, unless defined otherwise. It doesn't work though. I can make it validate by adding elementFormDefault="qualified" to the schema, which makes all elements default to targetNamespace. Is that a good way of doing it?
2.
<?xml version="1.0"?>
<a:foo xmlns:a="http://foo.com/" a:myBool="true">
<a:data>
<a:id>1</a:id>
<a:name>abc</a:name>
</a:data>
</a:foo>
This example fails with the message:
The 'http://foo.com/:myBool' attribute is not declared.
Every element and attribute has an explicit namespace, so the XML should be valid. Even the error message suggests the parser is looking for the attribute I expect it to, but fails to find it. I can make it validate by changing a:myBool to myBool. Why doesn't it work in the first form but work in the other?
elementFormDefault has no effect on attributes; the equivalent setting for those is attributeFormDefault. By default, both are set to "unqualified".
The reason approach 2, a:myBool="true", failed is that the attributeFormDefault value wasn't overridden. If you want namespace-qualified attributes, you can either set it to "qualified" or set the form attribute on the attribute declaration itself to "qualified", like so:
<xs:attribute name="myBool" type="myBool" use="optional" form="qualified"/>
This should make this a valid element start for approach 2:
<a:foo xmlns:a="http://foo.com/" a:myBool="true">
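Approach 1 failed for the matching reason on the element side: elementFormDefault also defaults to "unqualified", so the locally declared data, id and name elements are expected in no namespace, whereas the default xmlns="http://foo.com/" declaration in your instance puts them in http://foo.com/. Setting elementFormDefault="qualified" on the root XSD element, as you found, is the usual way to fix that. You can state the attribute default explicitly at the same time, like so:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://foo.com/"
elementFormDefault="qualified"
attributeFormDefault="unqualified"
xmlns="http://foo.com/">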
Hello all, I have a scenario with two XML files.
One XML file, say "Books.xml", creates the schema for a DataSet, and I use the other XML file to load data into that DataSet; all data in the second file is inserted into the DataSet.
I have done this much, but I am stuck at validating the second XML file.
That is, the data entered into the DataSet should be valid according to the DataSet schema, which is derived from the first XML file.
I have tried using an XSD, but I am not sure whether an XSD can validate the type of the data. For example, if a number is expected and my XML contains a string, it should throw an exception.
My XML looks like this:
<DATA>
<HISTORY>
<book1>SOmebook</book1>
</HISTORY>
<POETRY>
<book2>Books</book2>
</POETRY>
</DATA>
I am generating my XSD using Visual Studio.
To validate it, I am using a method something like this:
XmlReaderSettings settings = //dont know exact settings
string data = null;
XmlReader Reader = XmlReader.Create(File.Open(@"C:\books.XML", FileMode.Open), settings, data);
DataSet ds = new DataSet();
ds.ReadXml(Reader);
Here's a quick example on how you can use a schema to validate. I'm using the shiporder example from W3Schools.
Schema:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="item" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string" minOccurs="0"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderid" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire Burlesque</title>
<note>Special Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your heart</title>
<quantity>1</quantity> <!--Change to "one" to see validation error-->
<price>9.90</price>
</item>
</shiporder>
And here's the validation using LINQ to XML:
XDocument document = XDocument.Load("shiporder.xml");
XmlSchemaSet schemas = new XmlSchemaSet();
schemas.Add("", XmlReader.Create("schema.xsd"));
bool errors = false;
document.Validate(schemas, (o, e) =>
{
Console.WriteLine("Validation error: {0}", e.Message);
errors = true;
});
if (!errors)
{
Console.WriteLine("XML document successfully validated.");
}
else
{
Console.WriteLine("XML document does not validate.");
}
Change the value of the <quantity> node as mentioned in the comment to see how the XML fails to validate because of an invalid type.
I need to use a complex type defined in a different assembly in my XSD schema. Both of my .xsd schemas are embedded resources, and I have tried to link the schema I need to import into the assembly that needs it, without success.
Basically, when I need to validate one of my XML documents, I call this function, but it is not able to add, in cascade, the schemas of the types inside Operations.
public static XmlSchema GetDocumentSchema(this Document doc)
{
var actualType = doc.GetType();
var stream = actualType.Assembly.GetManifestResourceStream(actualType.FullName);
if (stream == null)
{
throw new FileNotFoundException("Unable to load the embedded file [" + actualType.FullName + "]");
}
var documentSchema = XmlSchema.Read(stream, null);
foreach (XmlSchemaExternal xmlInclude in documentSchema.Includes)
{
var includeStream = xmlInclude.SchemaLocation != "Operations.xsd"
? actualType.Assembly.GetManifestResourceStream(xmlInclude.Id)
: typeof (Operations).Assembly.GetManifestResourceStream(xmlInclude.Id);
if (includeStream == null)
{
throw new FileNotFoundException("Unable to load the embedded include file [" + xmlInclude.Id + "]");
}
xmlInclude.Schema = XmlSchema.Read(includeStream, null);
}
return documentSchema;
}
This is the main schema:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="ExampleSheet"
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:include id="Operations" schemaLocation="Operations.xsd"/>
<xs:element name="ExampleSheet">
<xs:complexType>
<xs:sequence>
<xs:element name="Operations" type="Operations"/>
</xs:sequence>
<xs:attribute name="version" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
And this is the schema of Operations:
<xs:schema id="Operations"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
<xs:element name="Operations" type="Operations"/>
<xs:complexType name="Operations">
<xs:choice minOccurs="1" maxOccurs="unbounded">
<xs:element name="Insert" type="Insert"/>
<xs:element name="InsertUpdate" type="InsertUpdate"/>
<xs:element name="Update" type="Update"/>
<xs:element name="Delete" type="Delete"/>
</xs:choice>
<xs:attribute name="version" type="xs:string" use="required"/>
<xs:attribute name="store" type="xs:string" use="required"/>
<xs:attribute name="chain" type="xs:string" use="optional"/>
</xs:complexType>
</xs:schema>
For example, if I have an ExampleSheet with an Insert, it is not recognized. Operations and Insert are classes that implement IXmlSerializable, and the first one retrieves the schema sets of the inner types using a custom XmlSchemaProvider.
Am I doing something wrong?
How can I get my ExampleSheet to accept the members of Operations?
Should ExampleSheet implement IXmlSerializable so that I can build the reader and writer as I want, and would the schema still be useful?
Instead of XmlSchema, have you looked into the XmlSchemaSet class?
I have not done a lot with XML Serialization, so I don't know if it will fit your current application, but I've used it before in a similar situation where I have to refer to types defined in 3 separate schemas.
The complete XmlSchemaSet object will then have access to all of the types in each of the schemas.
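As a rough sketch of that idea, the helper below adds several schema documents to one XmlSchemaSet and compiles them together. SchemaSetSketch and Build are hypothetical names, and the schemas are passed as strings for brevity; with embedded resources, the stream returned by GetManifestResourceStream could be handed to XmlReader.Create instead. The sketch assumes each schema is added to the set directly and does not exercise xs:include resolution.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaSetSketch
{
    // Hypothetical helper: builds one compiled set from several schema documents.
    public static XmlSchemaSet Build(params string[] schemaDocuments)
    {
        var set = new XmlSchemaSet();
        foreach (var document in schemaDocuments)
        {
            using (var reader = XmlReader.Create(new StringReader(document)))
            {
                // null: use the targetNamespace declared in the schema itself
                set.Add(null, reader);
            }
        }
        // Type references are resolved across all schemas in the set.
        set.Compile();
        return set;
    }
}
```

The compiled set can then be assigned to XmlReaderSettings.Schemas, or passed to XDocument.Validate, in place of a single XmlSchema.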