I'm trying to validate an XML file against a DTD, but I get the following error:
'ENTITY' is an unexpected token. The expected token is 'DOCTYPE'. Line 538, position 3.
public static void Validate(string xmlFilename, string schemaFilename)
{
XmlTextReader r = new XmlTextReader(xmlFilename);
XmlValidatingReader validator = new XmlValidatingReader(r);
validator.ValidationType = ValidationType.Schema;
XmlSchemaCollection schemas = new XmlSchemaCollection();
schemas.Add(null, schemaFilename);
validator.ValidationEventHandler += new ValidationEventHandler(ValidationEventHandler);
try
{
while (validator.Read())
{ }
}
catch (XmlException err)
{
Console.WriteLine(err.Message);
}
finally
{
validator.Close();
}
}
The DTD I'm using to validate is http://www.editeur.org/onix/2.1/reference/onix-international.dtd
I hope someone can help me. Thanks!
I realise this is a really old question, but for anyone else struggling with this problem, here's what I did.
I gave up trying to validate with the DTD.
Instead, I ended up using the onix 2.1 xsd available at http://www.editeur.org/15/Previous-Releases/#R%202.1%20Downloads. I had to set the default namespace:
var nt = new NameTable();
var ns = new XmlNamespaceManager(nt);
ns.AddNamespace(string.Empty, "http://www.editeur.org/onix/2.1/reference");
var context = new XmlParserContext(null, ns, null, XmlSpace.None);
and then, when loading the XML, turn off DTD parsing (this is using .NET 4):
var settings = new XmlReaderSettings
{
ValidationType = System.Xml.ValidationType.Schema,
DtdProcessing = DtdProcessing.Ignore
};
using(var reader = XmlReader.Create("path to xml file", settings)) { ... }
Edit:
Just noticed: your validation type is also set wrong. Try setting it to ValidationType.DTD instead of Schema.
ValidationType at MSDN
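As a rough sketch of the ValidationType.DTD suggestion, here is how it might look with the non-obsolete XmlReader.Create API (.NET 4 or later). The class and method names are illustrative and not taken from the question, and the reader relies on the DOCTYPE declaration inside the XML itself to locate the DTD:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class DtdValidation
{
    // Hypothetical helper: validates the XML against the DTD named in its own
    // DOCTYPE declaration and returns every validation message.
    // An empty list means the document is valid.
    public static List<string> Validate(TextReader xml)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            // The default, DtdProcessing.Prohibit, throws on any DOCTYPE.
            DtdProcessing = DtdProcessing.Parse,
            ValidationType = ValidationType.DTD
        };
        // With a handler attached, validation errors are collected
        // instead of being thrown as exceptions.
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(xml, settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

If the DOCTYPE points at an external DTD, the reader also has to be allowed to fetch it. Assigning settings.XmlResolver = new XmlUrlResolver() is one way to do that, though whether it is needed depends on your framework version and environment.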
--
The error means exactly what it states: the DTD that is referenced is not well formed, as DOCTYPE should be present before any other declarations in a DTD.
Document Type Definition (Wikipedia)
Introduction to DTD (w3schools)
You might be able to get around this by downloading a local copy, modifying it to add in the expected root element yourself, and then referencing your edited version in your source.
EDIT.
I have a problem with XmlDsigXPathTransform validation. Sadly, even when I copied the example from the docs 1:1, the XPath validation still fails. What am I missing? I can't figure this out when even the docs example fails.
https://learn.microsoft.com/pl-pl/dotnet/api/system.security.cryptography.xml.xmldsigxpathtransform?view=netframework-4.6.1
var signatureReference = new Reference { Uri = "", };
XmlDsigXPathTransform XPathTransform =
CreateXPathTransform(XPathString);
signatureReference.DigestMethod = "http://www.w3.org/2001/04/xmlenc#sha256";
signatureReference.AddTransform(XPathTransform);
signedXml.AddReference(signatureReference);
private static XmlDsigXPathTransform CreateXPathTransform(string XPathString)
{
XmlDocument doc = new XmlDocument();
XmlElement xPathElem = doc.CreateElement("XPath");
xPathElem.InnerText = XPathString;
XmlDsigXPathTransform xForm = new XmlDsigXPathTransform();
xForm.LoadInnerXml(xPathElem.SelectNodes("."));
return xForm;
}
The XmlDsigXPathTransform is no longer considered safe, so any document using it is automatically considered to have an invalid signature.
https://referencesource.microsoft.com/#System.Security/system/security/cryptography/xml/signedxml.cs,8b616077b30145cd
If you really want to use it, you have to enable it in the Windows Registry on whatever computers are going to call CheckSignature.
https://support.microsoft.com/en-us/topic/after-you-apply-security-update-3141780-net-framework-applications-encounter-exception-errors-or-unexpected-failures-while-processing-files-that-contain-signedxml-922edd45-a91e-c755-bb30-2604acf37362
SignedXml is old and outdated. My recommendation is not to use it at all unless you have to for compatibility (the .NET team calls it legacy and says it is not being invested in, e.g. https://github.com/dotnet/runtime/issues/44674#issuecomment-875163316).
I have a C# script that validates an XML document against an XSD document, as follows:
static bool IsValidXml(string xmlFilePath, string xsdFilePath)
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.Schemas.Add(null, xsdFilePath);
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Compile();
try
{
XmlReader xmlRead = XmlReader.Create(xmlFilePath, settings);
while (xmlRead.Read())
{ };
xmlRead.Close();
}
catch (Exception e)
{
return false;
}
return true;
}
I've compiled this after looking at a number of MSDN articles and questions here where this is the solution. It does correctly validate that the XSD is formed well (returns false if I mess with the file) and checks that the XML is formed well (also returns false when messed with).
I've also tried the following, but it does the exact same thing:
static bool IsValidXml(string xmlFilePath, string xsdFilePath)
{
XDocument xdoc = XDocument.Load(xmlFilePath);
XmlSchemaSet schemas = new XmlSchemaSet();
schemas.Add(null, xsdFilePath);
try
{
xdoc.Validate(schemas, null);
}
catch (XmlSchemaValidationException e)
{
return false;
}
return true;
}
I've even pulled a completely random XSD off the internet and thrown it into both scripts, and it still validates on both. What am I missing here?
Using .NET 3.5 within an SSIS job.
In .NET you have to check yourself if the validator actually matches a schema component; if it doesn't, there is no exception thrown, and so your code will not work as you expect.
A match means one or both of the following:
there is one global element in your schema set with a qualified name that is the same as your XML document element's qualified name.
the document element has an xsi:type attribute, that is a qualified name pointing to a global type in your schema set.
In streaming mode, you can do this check easily. This pseudo-kind-of-code should give you an idea (error handling not shown, etc.):
using (XmlReader reader = XmlReader.Create(xmlfile, settings))
{
reader.MoveToContent();
var qn = new XmlQualifiedName(reader.LocalName, reader.NamespaceURI);
// element test: schemas.GlobalElements.Contains(qn);
// check if there's an xsi:type attribute: reader["type", XmlSchema.InstanceNamespace] != null;
// if exists, resolve the value of the xsi:type attribute to an XmlQualifiedName
// type test: schemas.GlobalTypes.Contains(qn);
// if all good, keep reading; otherwise, break here after setting your error flag, etc.
}
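The outline above could be fleshed out roughly as follows. This is only a sketch: the class and method names are made up for illustration, and it assumes the schema set can be compiled, since GlobalElements and GlobalTypes are only populated after compilation.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class RootCheck
{
    // Hypothetical helper: returns true when the document element is declared
    // as a global element in the schema set, or carries an xsi:type attribute
    // that names a global type in the schema set.
    public static bool RootMatchesSchema(XmlReader reader, XmlSchemaSet schemas)
    {
        if (!schemas.IsCompiled)
        {
            schemas.Compile();
        }

        // Skip the XML declaration, comments and whitespace up to the root element.
        reader.MoveToContent();
        var elementName = new XmlQualifiedName(reader.LocalName, reader.NamespaceURI);
        if (schemas.GlobalElements.Contains(elementName))
        {
            return true;
        }

        // Fall back to xsi:type, whose value is a QName such as "p:MyType".
        string xsiType = reader.GetAttribute("type", XmlSchema.InstanceNamespace);
        if (xsiType == null)
        {
            return false;
        }

        xsiType = xsiType.Trim();
        int colon = xsiType.IndexOf(':');
        string prefix = colon < 0 ? string.Empty : xsiType.Substring(0, colon);
        string local = colon < 0 ? xsiType : xsiType.Substring(colon + 1);
        string ns = reader.LookupNamespace(prefix) ?? string.Empty;
        return schemas.GlobalTypes.Contains(new XmlQualifiedName(local, ns));
    }
}
```

After the call, the reader is positioned on the root element, so a validating reader can simply keep reading from there if the check succeeds.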
You might also consider the XmlNode.SchemaInfo which represents the post schema validation infoset that has been assigned to a node as a result of schema validation. I would test different conditions and see how it works for your scenario. The first method is recommended to reduce the attack surface in DoS attacks, as it is the fastest way to detect completely bogus payloads.
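A complementary option, sketched here with illustrative names, is to turn on XmlSchemaValidationFlags.ReportValidationWarnings and treat warnings as failures. When no schema in the set covers the root element's namespace, the validating reader only raises a warning, which is dropped unless this flag is set:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class StrictValidation
{
    // Hypothetical helper: collects both errors and warnings.
    // A warning usually means no schema information was found for an element,
    // which is what happens when an unrelated XSD is supplied.
    public static List<string> Validate(TextReader xml, XmlSchemaSet schemas)
    {
        var problems = new List<string>();
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };
        // Warnings are not reported by default; opt in explicitly.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler +=
            (sender, e) => problems.Add(e.Severity + ": " + e.Message);

        using (XmlReader reader = XmlReader.Create(xml, settings))
        {
            while (reader.Read()) { }
        }
        return problems;
    }
}
```

An empty list then means the document was actually checked against a matching schema and passed, rather than merely not being checked at all.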
I am using the following code to validate XML against the XSD:
public static bool IsValidXmlOld(string xmlFilePath, string xsdFilePath)
{
if (File.Exists(xmlFilePath) && File.Exists(xsdFilePath))
{
try
{
XDocument xdocXml = XDocument.Load(xmlFilePath);
var schemas = new XmlSchemaSet();
schemas.Add(null, xsdFilePath);
Boolean result = true;
xdocXml.Validate(schemas, (sender, e) =>
{
result = false;
});
return result;
}
catch (Exception ex)
{
// Logging logic + error handling logic
throw new Exception(ex.Message);
}
}
throw new Exception("Either the Schema or the XML file does not exist Please check");
}
For some reason, it always returns true even if the XML is not valid for the given XSD. I picked up this code from the following link:
Validate XML against XSD in a single method. It looks as if result = false never gets executed, even if the XML is completely invalid.
I have a pair of valid and invalid XML files that go with a particular XSD:
1. Valid XML
2. XSD
3. Invalid XML
If I validate them on this web site, the valid XML passes validation against the XSD and the invalid XML fails. However, the code above passes both XML files.
At the same time, it does fail validation when I use some basic XML like the following:
XDocument doc2 = new XDocument(
new XElement("Root",
new XElement("Child1", "content1"),
new XElement("Child3", "content1")
)
);
with following error:
The 'Root' element is not declared.: {0}
This clearly demonstrates that the code is capable of failing a validation. So what is so special about item 3, the invalid XML, that the code passes it when the web site clearly fails it?
I have a Web Method (within a SOAP Web Service) with a signature of:
public msgResponse myWebMethod([XmlAnyElement] XmlElement msgRequest)
I chose to use the XmlElement parameter after reading that it would allow me to perform my own XSD validation on the parameter. The problem is that the parameter can be quite large (up to 80 MB of XML), so reading XmlElement.OuterXml as suggested in the link isn't very practical.
Is there another way to validate the XmlElement object against an XSD?
More generally, is this an inappropriate approach for implementing a web service expecting large amounts of XML? I've come across some hints at using SoapExtensions for gaining access to the input stream directly but am not sure this is the correct approach for my situation.
Note: Unfortunately, I'm chained to an existing WSDL and XSD that I have no power to alter which is why I went with a non-WCF implementation in the first place.
Here's a quick example. Just pass your XmlElement to this method:
private static void TheAnswer(IXPathNavigable inputElement)
{
var schemas = new XmlSchemaSet();
schemas.Add("http://foo.org/importvalidator.xsd",
@"..\..\validator.xsd");
var settings = new XmlReaderSettings
{
Schemas = schemas,
ValidationFlags =
XmlSchemaValidationFlags.
ProcessIdentityConstraints |
XmlSchemaValidationFlags.
ReportValidationWarnings,
ValidationType = ValidationType.Schema
};
settings.ValidationEventHandler +=
(sender, e) =>
Console.WriteLine("{0}: {1}", e.Severity, e.Message);
using (
XmlReader documentReader =
inputElement.CreateNavigator().ReadSubtree())
{
using (
XmlReader validatingReader = XmlReader.Create(
documentReader, settings))
{
while (validatingReader.Read())
{
}
}
}
}
It would be fantastic if you could help me get rid of the warnings below.
I have not been able to find good documentation. Since the warnings are concentrated in the private void ValidateConfiguration( XmlNode section ) method, hopefully this is not terribly hard to answer if you have encountered it before.
Thanks!
'System.Configuration.ConfigurationException.ConfigurationException(string)' is obsolete: 'This class is obsolete, to create a new exception create a System.Configuration!System.Configuration.ConfigurationErrorsException'
'System.Xml.XmlValidatingReader' is obsolete: 'Use XmlReader created by XmlReader.Create() method using appropriate XmlReaderSettings instead. http://go.microsoft.com/fwlink/?linkid=14202'
private void ValidateConfiguration( XmlNode section )
{
// throw if there is no configuration node.
if( null == section )
{
throw new ConfigurationException("The configuration section passed within the ... class was null ... there must be a configuration file defined.", section );
}
//Validate the document using a schema
XmlValidatingReader vreader = new XmlValidatingReader( new XmlTextReader( new StringReader( section.OuterXml ) ) );
// open stream on Resources; the XSD is set as an "embedded resource" so Resource can open a stream on it
using (Stream xsdFile = XYZ.GetStream("ABC.xsd"))
using (StreamReader sr = new StreamReader(xsdFile))
{
vreader.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);
vreader.Schemas.Add(XmlSchema.Read(new XmlTextReader(sr), null));
vreader.ValidationType = ValidationType.Schema;
// Validate the document
while (vreader.Read()) { }
if (!_isValidDocument)
{
_schemaErrors = _sb.ToString();
throw new ConfigurationException("XML Document not valid");
}
}
}
// Does not cause warnings.
private void ValidationCallBack( object sender, ValidationEventArgs args )
{
// check what KIND of problem the schema validation reader has;
// on FX 1.0, it gives a warning for "<xs:any...skip" sections. Don't worry about those, only set validation false
// for real errors
if( args.Severity == XmlSeverityType.Error )
{
_isValidDocument = false;
_sb.Append( args.Message + Environment.NewLine );
}
}
Replace
throw new ConfigurationException(....)
with
throw new ConfigurationErrorsException(....)
Replace XmlValidatingReader vreader = new XmlValidatingReader(...)
with
var vreader = XmlReader.Create(new StringReader(section.OuterXml),
new XmlReaderSettings
{
ValidationType = ValidationType.Schema
});
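Because XmlReader has no Schemas property or ValidationEventHandler event of its own, the schema and the callback have to move onto the XmlReaderSettings as well. A minimal sketch of how the whole validation step might look after the conversion follows. The class and method names are illustrative, and it returns the collected errors instead of throwing, so the caller can decide whether to raise a ConfigurationErrorsException:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Schema;

public static class ConfigSectionValidator
{
    // Hypothetical helper: validates a configuration section against an XSD
    // stream and returns the collected schema errors.
    // An empty string means the section is valid.
    public static string Validate(XmlNode section, Stream xsdStream)
    {
        if (section == null)
        {
            throw new ArgumentNullException("section");
        }

        var errors = new StringBuilder();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };

        // The schema now goes on the settings, not on the reader.
        using (XmlReader schemaReader = XmlReader.Create(xsdStream))
        {
            settings.Schemas.Add(null, schemaReader);
        }

        // As in the original callback, only real errors are recorded.
        settings.ValidationEventHandler += (sender, args) =>
        {
            if (args.Severity == XmlSeverityType.Error)
            {
                errors.AppendLine(args.Message);
            }
        };

        using (var text = new StringReader(section.OuterXml))
        using (XmlReader reader = XmlReader.Create(text, settings))
        {
            while (reader.Read()) { }
        }
        return errors.ToString();
    }
}
```

In the original method, this would be called with the embedded-resource stream for ABC.xsd, and a non-empty result would be stored in _schemaErrors before throwing.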
Basically, it's telling you to use XmlReaderSettings instead of XmlValidatingReader, which was deprecated.
Personally, I'm not going to do the conversion for you; I think doing it yourself will be good for your development as a coder, so here are some resources:
Look at the overloads of the XmlReader.Create() method, specifically this one.
Then have a look at the different properties associated with the XmlReaderSettings class: http://msdn.microsoft.com/en-us/library/system.xml.xmlreadersettings_members.aspx
Give it a try, see what happens, and if you're still having problems, ask another question :)
HTH