Validate an XML against a specific XSD schema - c#

I have a webservice that gets specific XML which does not have a schema specified in the file itself.
I do have XSD schemas in my project which will be used to test the obtained XML files against them.
The problem is that whatever I do the validator seems to accept the files even when they aren't valid.
The code I'm using is this (some parts omitted to make it easier):
var schemaReader = XmlReader.Create(new StringReader(xmlSchemeInput));
var xmlSchema = XmlSchema.Read(schemaReader, ValidationHandler);
var xmlReaderSettings = new XmlReaderSettings();
xmlReaderSettings.Schemas.Add(xmlSchema);
xmlReaderSettings.ValidationEventHandler += ValidationHandler;
xmlReaderSettings.ValidationType = ValidationType.Schema;
xmlReaderSettings.ValidationFlags |= XmlSchemaValidationFlags.ProcessIdentityConstraints;
xmlReaderSettings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
xmlReaderSettings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
using(var xmlReader = XmlReader.Create(new StringReader(xmlInput), xmlReaderSettings))
{
while (xmlReader.Read()) { }
}
// return if the xml is valid or not
I've found several solutions with an inline specified schema which work great, but with a schema specified like this (which I assume should work) I can't seem to find any.
Am I doing something wrong? Or am I just wrong in assuming this is how it should work?
Thanks!

Try adding
xmlReaderSettings.Schemas.Compile();
after
xmlReaderSettings.Schemas.Add(xmlSchema);
That worked for me in that situation.
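For reference, here is a minimal sketch of the whole flow with the schema supplied as a string instead of being referenced from the document. The names XsdValidation and Validate are made up for illustration. It collects warnings as well as errors, because when no loaded schema matches the namespace of the root element the validator typically raises only warnings, and a handler that looks only at errors will then treat the file as valid.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class XsdValidation
{
    // Hypothetical helper: returns every validation message.
    // An empty list means the document validated cleanly.
    public static List<string> Validate(string xmlInput, string xsdInput)
    {
        var messages = new List<string>();
        ValidationEventHandler handler =
            (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        var schemaSet = new XmlSchemaSet();
        schemaSet.ValidationEventHandler += handler;
        using (var schemaReader = XmlReader.Create(new StringReader(xsdInput)))
        {
            // null: use the targetNamespace declared inside the schema itself.
            schemaSet.Add(null, schemaReader);
        }
        schemaSet.Compile();

        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemaSet
        };
        // Report warnings too, e.g. "could not find schema information".
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += handler;

        using (var reader = XmlReader.Create(new StringReader(xmlInput), settings))
        {
            while (reader.Read()) { }
        }
        return messages;
    }
}
```

Calling XsdValidation.Validate(xmlInput, xmlSchemeInput) and checking whether the returned list is empty gives the "is it valid" answer the question asks for.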

Related

Validating xml against an xsd that has include and import in c#

I would like to cache the XSD and then validate against the cached copy, rather than loading the XSD every time for each XML, in order to improve performance. However, I could not manage to do it. My guess is that compilation does not pull in the include and import elements of the XSD files, which is why I get the error below.
Here are the steps:
First I added the XSD file to an XmlSchemaSet:
public XmlSchemaSet SchemaSet;
public XmlValidator()
{
this.SchemaSet = new XmlSchemaSet();
using (XmlReader xsd = XmlReader.Create(xsdPath))
{
this.SchemaSet.Add(null, xsd);
}
this.SchemaSet.Compile();
}
Then I used this XmlSchemaSet to validate the xml as follows:
public void ValidateSchema(byte[] xml)
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
settings.ValidationType = ValidationType.Schema;
settings.DtdProcessing = DtdProcessing.Parse;
// need to add dtd first. it should not be added to the schema set above
using (XmlReader dtd = XmlReader.Create(dtdPath, settings))
{
settings.Schemas.Add(null, dtd);
}
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.Schemas.Add(this.SchemaSet); // add the schema set
using (MemoryStream memoryStream = new MemoryStream(xml, false))
{
using (XmlReader validator = XmlReader.Create(memoryStream, settings))
{
while (validator.Read());
}
}
}
Notes: dtdPath, xsdPath are valid, input xml is valid, xsd files are valid
The error is:
The 'http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader:StandardBusinessDocument' element is not declared.
I created an XmlReaderSettings with the following options:
XmlReaderSettings settings = new XmlReaderSettings();
settings.XmlResolver = new XmlXsdResolver(); // Need this for resolving include and import
settings.ValidationType = ValidationType.Schema; // This might not be needed, I am using same settings to validate the input xml
settings.DtdProcessing = DtdProcessing.Parse; // I have an include that is dtd. maybe I should prohibit dtd after I compile the xsd files.
Then I used it with an XmlReader to read the XSD. The important part is that I had to pass a basePath so that the XmlXsdResolver can find the other XSD files.
using (XmlReader xsd = XmlReader.Create(new FileStream(xsdPath, FileMode.Open, FileAccess.Read), settings, basePath))
{
settings.Schemas.Add(null, xsd);
}
settings.Schemas.Compile();
This is the XmlXsdResolver to find included and imported xsd files:
protected class XmlXsdResolver : XmlUrlResolver
{
public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
{
return base.GetEntity(absoluteUri, role, ofObjectToReturn);
}
}
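As a rough sketch of the caching idea, the schema set can be built and compiled once and then reused for every document. The class name CachedXmlValidator is invented here, and the sketch assumes the included and imported XSD files sit at paths relative to the main XSD. Adding the schema by URI (instead of through a bare stream) gives it a base URI, so relative include and import locations can be resolved without a separate basePath argument. DTD handling is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical wrapper: compiles the schema set once, reuses it per document.
public sealed class CachedXmlValidator
{
    private readonly XmlSchemaSet _schemas;

    public CachedXmlValidator(string xsdPath)
    {
        _schemas = new XmlSchemaSet();
        // Resolver used to fetch xs:include / xs:import targets.
        _schemas.XmlResolver = new XmlUrlResolver();
        // An absolute URI lets relative schemaLocation values resolve
        // against the folder that contains the main XSD.
        string schemaUri = new Uri(Path.GetFullPath(xsdPath)).AbsoluteUri;
        _schemas.Add(null, schemaUri);
        _schemas.Compile();
    }

    // Returns the validation error messages; empty means valid.
    public List<string> Validate(Stream xml)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = _schemas // the cached, already compiled set
        };
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(xml, settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

One instance can be created at startup and its Validate method called for each incoming document, so the XSD files are read from disk only once.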

XML validation against XSD that includes more XSDs: type is not declared

I have a problem validating XML against an XSD when the base XSD imports other XSDs from a website. For example, the following element throws an error.
<link:linkbase xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:link = 'http://www.xbrl.org/2003/linkbase' xmlns:xbrli = 'http://www.xbrl.org/2003/instance' xmlns:xlink = 'http://www.w3.org/1999/xlink' xsi:schemaLocation = 'http://www.xbrl.org/2003/linkbase http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd' >
Is there a way to import the XSD when running the release version of the DLLs? I am using the following C# code to validate the XML against the XSD. The same code works when I run it from Visual Studio.
var schemas = new XmlSchemaSet();
schemas.Add(null, xsdFilePath);
var readerSettings = new XmlReaderSettings();
readerSettings.ValidationType = ValidationType.Schema;
readerSettings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
readerSettings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
readerSettings.Schemas.Add(schemas);
using (var xmlReader = XmlReader.Create(xmlFilePath, readerSettings))
{
while (xmlReader.Read())
{
}
}
Obviously, the parser cannot find the schema xbrl-instance-2003-12-31. From the W3C XML Schema specification:
(xsi:schemaLocation) records the author's warrant with pairs of URI references (one for the namespace name, and one for a hint as to the location of a schema document defining names for that namespace name)
That is, the first value in each schemaLocation pair is the namespace name and the second is a hint for where to find the schema document. If the parser doesn't already know where to find the schema for a namespace (here http://www.xbrl.org/2003/instance), you must provide it with the location. For example:
<xs:import
namespace='http://www.xbrl.org/2003/instance'
schemaLocation='http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd'/>

Handle error caused by reading a schema twice

Everything works if the XML document has no reference to an XML Schema:
<Application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.companyname.com/blabla"
xmlns="http://www.companyname.com/blabla">
But if the XML has a reference to a schema on the local machine, like this:
<Application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.companyname.com/blabla
Schemas\myschema.xsd"
xmlns="http://www.companyname.com/blabla">
This results in the error "The global element 'TopElementName' has already been declared."
XmlReaderSettings xrs = new XmlReaderSettings();
xrs.ValidationType = ValidationType.Schema;
xrs.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
xrs.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
xrs.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
// The xsd is installed in the same location as myapp.exe.
string startLoc = System.Reflection.Assembly.GetExecutingAssembly().Location;
string xsd = Path.Combine(Path.GetDirectoryName(startLoc), "myschema.xsd");
using (Stream schemaStr = new FileStream(xsd, FileMode.Open))
{
XmlSchema s = XmlSchema.Read(schemaStr, null);
xrs.Schemas.Add(s);
}
xrs.Schemas.Compile();
using (XmlReader r = XmlReader.Create(xmlPath, xrs))
{
while (r.Read()){}
r.Close();
}
How to avoid this error?
The simplest solution: Invoke your schema processor with options that tell it to read the schema documents you specify at invocation time, and to ignore the xsi:schemaLocation hints in the input being validated. (If your schema validator doesn't have such options, get a new schema validator.)
The bogus xsi:schemaLocation in your first example should be fixed, independently of the validation options.
I just deleted the following flag:
XmlSchemaValidationFlags.ProcessSchemaLocation

Validating an XML against an embedded XSD in C#

Using the following MSDN documentation I validate an XML file against a schema: http://msdn.microsoft.com/en-us/library/8f0h7att%28v=vs.100%29.aspx
This works fine as long as the XML contains a reference to the schema location or the inline schema. Is it possible to embed the schema "hard-coded" into the application, i.e. the XSD won't reside as a file and thus the XML does not need to reference it?
I'm talking about something like:
Load XML to be validated (without schema location).
Load XSD as a resource or whatever.
Do the validation.
Try this:
Stream objStream = objFile.PostedFile.InputStream;
// Open XML file
XmlTextReader xtrFile = new XmlTextReader(objStream);
// Create validator
XmlValidatingReader xvrValidator = new XmlValidatingReader(xtrFile);
xvrValidator.ValidationType = ValidationType.Schema;
// Add XSD to validator
XmlSchemaCollection xscSchema = new XmlSchemaCollection();
xscSchema.Add("xxxxx", Server.MapPath(@"/zzz/XSD/yyyyy.xsd"));
xvrValidator.Schemas.Add(xscSchema);
try
{
while (xvrValidator.Read())
{
}
}
catch (Exception ex)
{
// Error on validation
}
You can use the XmlReaderSettings.Schemas property to specify which schema to use. The schema can be loaded from a Stream.
var schemaSet = new XmlSchemaSet();
schemaSet.Add("http://www.contoso.com/books", new XmlTextReader(xsdStream));
var settings = new XmlReaderSettings();
settings.Schemas = schemaSet;
using (var reader = XmlReader.Create(xmlStream, settings))
{
while (reader.Read());
}
You could declare the XSD as an embedded resource and load it via GetManifestResourceStream as described in this article: How to read embedded resource text file
Yes, this is possible. Read the embedded resource file to string and then create your XmlSchemaSet object adding the schema to it. Use it in your XmlReaderSettings when validating.
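A small sketch of the embedded-resource route, assuming the XSD was added to the project with its build action set to Embedded Resource. The class name EmbeddedSchema and the example resource name in the comment are hypothetical; the real resource name depends on the project's default namespace and folder layout.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Schema;

public static class EmbeddedSchema
{
    // resourceName is project specific, e.g. "MyApp.Schemas.books.xsd" (made-up name).
    public static XmlSchemaSet Load(Assembly assembly, string resourceName)
    {
        using (Stream xsdStream = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null when the name does not match.
            if (xsdStream == null)
                throw new FileNotFoundException("Embedded schema not found: " + resourceName);

            using (XmlReader xsdReader = XmlReader.Create(xsdStream))
            {
                var schemaSet = new XmlSchemaSet();
                // null: take the targetNamespace from the schema itself.
                schemaSet.Add(null, xsdReader);
                schemaSet.Compile();
                return schemaSet;
            }
        }
    }
}
```

The returned set can be assigned to XmlReaderSettings.Schemas, with ValidationType set to ValidationType.Schema, before creating the validating reader. If the name is in doubt, Assembly.GetManifestResourceNames() lists the names that are actually embedded.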

Can't find "schemaLocation" attribute

I'm creating a schema to validate some XML, but when it comes down to actually reading in the document, I'm getting the error:
The 'http://www.w3.org/2001/XMLSchema:schemaLocation' attribute is not declared.
This is what the beginning of one of the XML files using the schema looks like.
<?xml version="1.0"?>
<envelope xsi:schemaLocation="C:\LocalPath MySchema.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema"
xmlns="http://tempuri.org/MySchema.xsd">
...
</envelope>
My validation code looks like this:
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add(@"http://tempuri.org/MySchema.xsd",
@"C:\LocalPath\ MySchema.xsd");
XmlReader reader = XmlReader.Create(@"C:\LocalPath\testxml\somefile.xml", settings);
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(reader);
ValidationEventHandler eventHandler = new ValidationEventHandler(validationHandler);
xmlDoc.Validate(eventHandler);
The namespace http://www.w3.org/2001/XMLSchema (with conventional prefix xsd or xs) is for schema documents; the schemaLocation attribute you want is in the namespace http://www.w3.org/2001/XMLSchema-instance (which has the conventional prefix xsi for "XML Schema Instance namespace").
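To illustrate the corrected namespace, here is a short sketch that builds the root element with LINQ to XML. The class name EnvelopeSample is invented, and the relative location "MySchema.xsd" is an assumption. Note that the value of xsi:schemaLocation is a list of pairs, namespace name first and schema location second, so the first token should be the target namespace, not a folder path.

```csharp
using System;
using System.Xml.Linq;

public static class EnvelopeSample
{
    // The instance namespace, conventionally bound to the xsi prefix.
    public static readonly XNamespace Xsi = "http://www.w3.org/2001/XMLSchema-instance";

    public static XElement Build()
    {
        XNamespace ns = "http://tempuri.org/MySchema.xsd";
        return new XElement(ns + "envelope",
            // Declares xmlns:xsi so the attribute below is written with the xsi prefix.
            new XAttribute(XNamespace.Xmlns + "xsi", Xsi),
            // Pair: namespace name, then a location hint (assumed relative path).
            new XAttribute(Xsi + "schemaLocation",
                "http://tempuri.org/MySchema.xsd MySchema.xsd"));
    }
}
```

Printing EnvelopeSample.Build() shows an envelope element whose xsi prefix is bound to the XMLSchema-instance namespace, which is what the validator expects.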
