I am using this piece of code from MSDN to create an XSD from an XML file:
XmlReader reader = XmlReader.Create("contosoBooks.xml");
XmlSchemaSet schemaSet = new XmlSchemaSet();
XmlSchemaInference schema = new XmlSchemaInference();
schemaSet = schema.InferSchema(reader);
foreach (XmlSchema s in schemaSet.Schemas())
{
textbox.text = s.ToString();
}
I want to output the .xsd based on my XML file. When I generate the .xsd file, the only content I get inside it is: System.Xml.Schema.XmlSchema
When I generate the XSD using the Visual Studio option to create a schema, it comes out properly. However, I have over 150 XML documents that I need to create XSDs for, hence the need for a programmatic option. Can anyone help?
xsd.exe can do what you want:
If you specify an XML file (.xml extension), Xsd.exe infers a schema
from the data in the file and produces an XSD schema. The output file
has the same name as the XML file, but with the .xsd extension.
The following command generates an XML schema from myFile.xml and saves it to the specified directory.
xsd myFile.xml /outputdir:myOutputDir
You can read more about it here and here
OR
You can try programmatically like this:
XmlReader reader = XmlReader.Create(@"yourxml.xml");
XmlSchemaSet schemaSet = new XmlSchemaSet();
XmlSchemaInference schema = new XmlSchemaInference();
schemaSet = schema.InferSchema(reader);
foreach (XmlSchema s in schemaSet.Schemas())
{
using (var stringWriter = new StringWriter())
{
using (var writer = XmlWriter.Create(stringWriter))
{
s.Write(writer);
}
textbox.text = stringWriter.ToString();
}
}
This is what you're missing...
instead of simply doing s.ToString(), do this:
XmlWriter writer;
int count = 0;
foreach (XmlSchema s in schemaSet.Schemas())
{
writer = XmlWriter.Create((count++).ToString() + "_contosobooks.xsd");
s.Write(writer);
writer.Close();
Console.WriteLine("Done " + count);
}
reader.Close();
You can then write proper logic to do the read/write more gracefully, read many xml files and create corresponding xsd files, etc.
I took the contosobooks.xml from here:
https://code.google.com/p/code4cs/source/browse/trunk/AppCase/dNet/Xml/data/contosoBooks.xml?spec=svn135&r=135
and the output xsd is:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="http://www.contoso.com/books" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="bookstore">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="book">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string" />
<xs:element name="author">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" name="name" type="xs:string" />
<xs:element minOccurs="0" name="first-name" type="xs:string" />
<xs:element minOccurs="0" name="last-name" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="price" type="xs:decimal" />
</xs:sequence>
<xs:attribute name="genre" type="xs:string" use="required" />
<xs:attribute name="publicationdate" type="xs:date" use="required" />
<xs:attribute name="ISBN" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
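The batch step mentioned above (reading many XML files and writing a corresponding XSD for each) could look roughly like the sketch below. It is only an illustration: the class name SchemaBatch, the method InferToFiles, and the output naming scheme are assumptions, not part of the original answer. It uses the same XmlSchemaInference and XmlSchema.Write calls shown earlier.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaBatch
{
    // Infers the schema(s) for one XML file and writes each one into outputDir.
    // Returns the number of .xsd files written (one per inferred namespace).
    public static int InferToFiles(string xmlPath, string outputDir)
    {
        XmlSchemaSet schemaSet;
        using (XmlReader reader = XmlReader.Create(xmlPath))
        {
            schemaSet = new XmlSchemaInference().InferSchema(reader);
        }

        string baseName = Path.GetFileNameWithoutExtension(xmlPath);
        var settings = new XmlWriterSettings { Indent = true };
        int count = 0;
        foreach (XmlSchema schema in schemaSet.Schemas())
        {
            // Hypothetical naming scheme: books.xsd, books_1.xsd, books_2.xsd, ...
            string suffix = count == 0 ? "" : "_" + count;
            string xsdPath = Path.Combine(outputDir, baseName + suffix + ".xsd");
            using (XmlWriter writer = XmlWriter.Create(xsdPath, settings))
            {
                schema.Write(writer);
            }
            count++;
        }
        return count;
    }

    public static void Main(string[] args)
    {
        // Assumed command line: <inputDir> [outputDir]
        string inputDir = args.Length > 0 ? args[0] : ".";
        string outputDir = args.Length > 1 ? args[1] : inputDir;
        Directory.CreateDirectory(outputDir);

        foreach (string xmlPath in Directory.GetFiles(inputDir, "*.xml"))
        {
            int written = InferToFiles(xmlPath, outputDir);
            Console.WriteLine("{0}: {1} schema(s) written", Path.GetFileName(xmlPath), written);
        }
    }
}
```

A malformed input file makes InferSchema throw, so for 150 files you may want a try/catch around the call to InferToFiles so that one bad file does not stop the whole run.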
I use this command line:
Administrator:Developer Command Prompt for VS2013
Create xsd file from xml:
xsd FullFilePath.xml /outputdir:myOutputDir
Generate object to represent that xsd file in our project.
xsd /c /n:myNameSpace FullPath\fileName.xsd
then it will tell where the .cs file is created.
Use the command-line tool xsd.exe
Example usage:
xsd nameOf.xsd /c /n:yourNamespace /out:C:\path\to\save
/c parameter creates a class (as opposed to /dataset which creates a strongly typed DataSet)
You can usually find this at: C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin
And then use a batch file or PowerShell to create all 150 classes.
Download Microsoft Visual Studio 2010 or 2012; it's very easy. Right-click the .xml file and open it with Microsoft Visual Studio, then open the XML menu and click Create Schema. It will generate the XSD, which you can then save locally.
Our software spits out a number of xml files and I need to determine which is which. For example, there are three different types of xml file (heavily abbreviated):
"IQ.xml"
<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog xmlns:dt="urn:schemas-microsoft-com:datatypes">
<Rec>
<ITEM dt:dt="string"></ITEM>
<QTY dt:dt="string"></QTY>
</Rec>
</Catalog>
"IMR.xml"
<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog xmlns:dt="urn:schemas-microsoft-com:datatypes">
<Rec>
<ITEMS dt:dt="string"></ITEMS>
<MFG dt:dt="string"></MFG>
<ROUTE dt:dt="string"></ROUTE>
</Rec>
</Catalog>
"RP.xml"
<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog xmlns:dt="urn:schemas-microsoft-com:datatypes">
<Rec>
<REF dt:dt="string"></REF>
<PON dt:dt="string"></PON>
</Rec>
</Catalog>
Any one of these could be passed out at any time, and I need a way to determine where to pass these files to. What is the best way to achieve this? Could a schema be used to test the XML file against the fields and then a result passed back?
My initial thought was to test against a schema and, if it doesn't match the first, move on to the second and so on. That would be hard coded and could not be changed when different XML file types are added later, so I'm not too keen on it. I'm not sure at this stage whether this is even the best approach.
This will be coded in C# so I'm not sure whether there are any inbuilt functions which can help or whether it will have to be custom written.
Has anyone needed to do this before? How did you tackle this?
I would suggest validating the XML file against a schema, as you suggested yourself.
As for keeping your code flexible enough to support other schemas later, there are many options; which one fits depends on what you want to do.
For example, you can list all your schemas in a config file and, when a new XML file arrives, run it programmatically through the supported schemas. If none matches, you can throw an exception (unsupported XML file structure, for example).
You might also statically define pairings between certain XML files and certain schemas, which you can then look up programmatically.
Of course, when you want to support new schemas you'll need to change the code, but that's normal.
Creating a fully generic and automated method of handling any kind of XML file and any kind of schema will be difficult, and you will probably need some sort of naming convention where you deduce the associated schema from the file name or from information embedded inside the XML file. This could be done at runtime, but even then you'll probably support only a limited number of behaviors, and you'll need new code when you want to expand your application.
Use an XmlReader with an XmlReaderSettings which specifies the type of validation to perform and a ValidationEventHandler. This can be wrapped into a method that will give you the schema or schemas against which the XML document was successfully validated.
If you're concerned about new schemas being added in the future, then just store them in a central location like a directory and grab them at runtime. New schemas could simply be dropped into the directory as needed.
void Main()
{
var rootDirectory = @"C:\Testing";
var schemaDirectory = Path.Combine(rootDirectory, "Schemas");
var dataDirectory = Path.Combine(rootDirectory, "Data");
var schemaFiles = new[] {
Path.Combine(schemaDirectory, "IQ.xsd"),
Path.Combine(schemaDirectory, "IMR.xsd"),
Path.Combine(schemaDirectory, "RP.xsd")
};
var dataFiles = new[] {
Path.Combine(dataDirectory, "IQ.xml"),
Path.Combine(dataDirectory, "IMR.xml"),
Path.Combine(dataDirectory, "RP.xml")
};
var results = FindMatchingSchemas(dataFiles[1], schemaFiles).Dump();
// Print only the schema path (Key) of the first entry that validated.
Console.WriteLine("Matching schema is: {0}", results.First(r => r.Value).Key);
}
private static Dictionary<string, bool> FindMatchingSchemas(string dataFile, string[] schemaFiles)
{
var results = new Dictionary<string, bool>();
foreach (var schemaFile in schemaFiles)
{
results.Add(schemaFile, true);
// Set the validation settings.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessSchemaLocation;
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
settings.ValidationEventHandler += new ValidationEventHandler((object sender, ValidationEventArgs args) =>
{
// Validation error
results[schemaFile] = false;
});
settings.Schemas.Add(null, schemaFile);
// Create the XmlReader object and dispose it once the file has been parsed.
using (XmlReader reader = XmlReader.Create(dataFile, settings))
{
// Parse the file; validation events fire while reading.
while (reader.Read()) { }
}
}
return results;
}
// Output: Matching schema is: C:\Testing\Schemas\IMR.xsd
There is a free website which can generate XSD documents from XML documents. http://www.freeformatter.com/xsd-generator.html
IQ.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Catalog">
<xs:complexType>
<xs:sequence>
<xs:element name="Rec">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="ITEM"/>
<xs:element type="xs:short" name="QTY"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
IMR.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Catalog">
<xs:complexType>
<xs:sequence>
<xs:element name="Rec">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:short" name="ITEMS"/>
<xs:element type="xs:string" name="MFG"/>
<xs:element type="xs:string" name="ROUTE"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
RP.xsd
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Catalog">
<xs:complexType>
<xs:sequence>
<xs:element name="Rec">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="REF"/>
<xs:element type="xs:short" name="PON"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Derived from Validating an XML against referenced XSD in C#
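The idea of dropping new schemas into a directory and picking them up at runtime could be sketched as follows. This is a minimal illustration under assumptions: the class SchemaMatcher and the method FindFirstMatchingSchema are hypothetical names, and the method simply returns the first .xsd in the directory that validates the file without errors or warnings.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaMatcher
{
    // Returns the path of the first schema in schemaDirectory that validates
    // dataFile without errors or warnings, or null when none matches.
    public static string FindFirstMatchingSchema(string dataFile, string schemaDirectory)
    {
        // Schemas are discovered at runtime; the order returned by GetFiles is not guaranteed.
        foreach (string schemaFile in Directory.GetFiles(schemaDirectory, "*.xsd"))
        {
            bool valid = true;
            var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
            // Report warnings too, so "no schema found for this element" counts as a mismatch.
            settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
            settings.ValidationEventHandler += (sender, args) => valid = false;
            settings.Schemas.Add(null, schemaFile);

            using (XmlReader reader = XmlReader.Create(dataFile, settings))
            {
                while (reader.Read()) { }
            }

            if (valid)
            {
                return schemaFile;
            }
        }
        return null;
    }

    public static void Main(string[] args)
    {
        // Assumed command line: <dataFile> <schemaDirectory>
        if (args.Length < 2)
        {
            Console.WriteLine("usage: SchemaMatcher <dataFile> <schemaDirectory>");
            return;
        }
        string match = FindFirstMatchingSchema(args[0], args[1]);
        Console.WriteLine(match ?? "No matching schema found");
    }
}
```

With this shape, supporting a new file type means adding one more .xsd to the directory; the code that routes the file can then key off the returned schema path.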
I am testing a switch from .NET version 4.5.1 to 4.6 and ran into a NullReferenceException in the XSD validation when using a unique constraint on an optional attribute.
at System.Xml.Schema.KeySequence.ToString()
at System.Xml.Schema.XmlSchemaValidator.EndElementIdentityConstraints(Object typedValue, String stringValue, XmlSchemaDatatype datatype)
at System.Xml.Schema.XmlSchemaValidator.InternalValidateEndElement(XmlSchemaInfo schemaInfo, Object typedValue)
at System.Xml.XsdValidatingReader.ProcessEndElementEvent()
at System.Xml.XsdValidatingReader.ProcessElementEvent()
at System.Xml.XsdValidatingReader.ProcessReaderEvent()
at System.Xml.XsdValidatingReader.Read()
at ConsoleApplication.Program.Main(String[] args)
This is stripped code which runs when targeting v4.5.x but fails with a NullReferenceException when using 4.6. (Tested on Win7 with VS2013 and VS2015). Is this legal in xml? Even if not it should raise some XmlException.
Schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Enumerations">
<xs:complexType>
<xs:sequence>
<xs:element name="Enum" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="id" type="xs:string" use="optional"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:unique name="unique_EnumId_contraint">
<xs:selector xpath="Enum"/>
<xs:field xpath="@id"/>
</xs:unique>
</xs:element>
</xs:schema>
XML:
<?xml version="1.0" encoding="utf-8"?>
<Enumerations>
<Enum />
<Enum />
</Enumerations>
C# code:
var settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add(null, "enumerations.xsd");
using (var xmlReader = XmlReader.Create("enumerations.xml", settings))
{
while (xmlReader.Read())
{
if (xmlReader.NodeType == XmlNodeType.Element)
{
Console.CursorLeft = xmlReader.Depth * 4;
Console.WriteLine(xmlReader.Name);
}
}
}
I can reproduce this. It looks like a bug to me (<rant>.NET 4.6 has a lot...</rant>). You should report it on Microsoft Connect.
Until this is fixed, you can check the source here: http://referencesource.microsoft.com/#System.Xml/System/Xml/Schema/ConstraintStruct.cs,091791a9542f1952
It shows that the problem can be worked around with an AppContext switch, so add this line before any other code runs and it will work:
AppContext.SetSwitch("Switch.System.Xml.IgnoreEmptyKeySequences", true);
More on this switch is available here: Mitigation: XML Schema Validation - note the sentence: "The impact of this change should be minimal" :-)
PS: I believe you can also change these switches using the proper .config file.
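For the .config route, a fragment along these lines should be equivalent to the SetSwitch call above. It is a sketch that assumes a .NET Framework application whose app.config (or web.config) you can edit; the switch name is the same one used in the code.

```xml
<configuration>
  <runtime>
    <!-- Same switch as the AppContext.SetSwitch call, set declaratively -->
    <AppContextSwitchOverrides value="Switch.System.Xml.IgnoreEmptyKeySequences=true" />
  </runtime>
</configuration>
```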
I searched and did not find any questions addressing this problem.
I am attempting to validate various XML against a schema and it seems to be validating ALL well-formed XML, instead of just XML that conforms to the schema. I have posted the code I am using, the Schema, a sample valid XML and a sample invalid XML.
I have been struggling with this for a while and am in the dark on most of it. I've had to learn how to write an XSD, write the XSD, and then learn how to parse XML in C#, none of which I have done before. I have used many tutorials and the Microsoft website to come up with the following. I think this should work, but it doesn't.
What am I doing wrong?
private bool ValidateXmlAgainstSchema(string sourceXml, string schemaUri)
{
bool validated = false;
try
{
// REF:
// This should create a SCHEMA-VALIDATING XMLREADER
// http://msdn.microsoft.com/en-us/library/w5aahf2a(v=vs.110).aspx
XmlReaderSettings xmlSettings = new XmlReaderSettings();
xmlSettings.Schemas.Add("MySchema.xsd", schemaUri);
xmlSettings.ValidationType = ValidationType.Schema;
xmlSettings.ValidationFlags = XmlSchemaValidationFlags.None;
XmlReader xmlReader = XmlReader.Create(new StringReader(sourceXml), xmlSettings);
// parse the input (not sure this is needed)
while (xmlReader.Read()) ;
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlReader);
validated = true;
}
catch (XmlException e)
{
// load or parse error in the XML
validated = false;
}
catch (XmlSchemaValidationException e)
{
// Validation failure in XML
validated = false;
}
catch (Exception e)
{
validated = false;
}
return validated;
}
The XSD / Schema. The intent is to accept XML that contains either an Incident or a PersonOfInterest.
<?xml version="1.0" encoding="utf-8"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="MySchema.xsd"
xmlns="MySchema.xsd"
elementFormDefault="qualified"
>
<xs:element name="Incident" type="IncidentType"/>
<xs:element name="PersonOfInterest" type="PersonOfInterestType"/>
<xs:complexType name="IncidentType">
<xs:sequence>
<xs:element name="Description" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="PersonOfInterest" type="PersonOfInterestType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="PersonOfInterestType">
<xs:sequence>
<xs:element name="Name" type="xs:string" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Here is a sample of valid XML
<?xml version="1.0" encoding="utf-8" ?>
<Incident
xmlns="MySchema.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com MySchema.xsd"
>
<Description>something happened</Description>
<PersonOfInterest>
<Name>Joe</Name>
</PersonOfInterest>
<PersonOfInterest>
<Name>Sue</Name>
</PersonOfInterest>
</Incident>
This is a sample of well-formed invalid XML which should throw an exception (I thought), but when I try it, the code returns true, indicating it is valid against the schema.
<ghost>Boo</ghost>
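The reason your <ghost>Boo</ghost> validates is that the parser cannot find any schema matching the XML. If there is no matching schema, the parser assumes validity, provided the XML is well-formed. In .NET this case is reported only as a validation warning, and warnings are suppressed unless XmlSchemaValidationFlags.ReportValidationWarnings is set; your code sets ValidationFlags to None. It's counter-intuitive, I know, and the behavior may differ between parser implementations.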
This notwithstanding, there are several problems with your code:
Two Root Elements
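XSD itself allows more than one global element declaration, and any of them may then appear as the root of an instance document, so your schema accepts either an Incident or a PersonOfInterest as the root. If you want exactly one permitted document root, declare a single wrapper element and nest the others inside it, as in the schema suggested below.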
Use of schemaLocation attribute
This should take the value (namespace) (URI) where the namespace is the targetNamespace of the schema and the URI is the location of the schema. In your case you appear to be using the schema file name as your target namespace. Additionally, looking at your code, you are loading the schema into your xml reader so you don't actually need the schemaLocation attribute at all. This is an optional attribute and some parsers completely ignore it.
I would suggest the following changes:
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://MyMoreMeaningfulNamespace"
xmlns="http://MyMoreMeaningfulNamespace"
elementFormDefault="qualified"
>
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Incident" type="IncidentType"/>
<xs:element maxOccurs="unbounded" name="PersonOfInterest" type="PersonOfInterestType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="IncidentType">
<xs:sequence>
<xs:element name="Description" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="PersonOfInterest" type="PersonOfInterestType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="PersonOfInterestType">
<xs:sequence>
<xs:element name="Name" type="xs:string" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Which validates this instance
<Root xmlns="http://MyMoreMeaningfulNamespace">
<Incident>
<Description>something happened</Description>
<PersonOfInterest>
<Name>Joe</Name>
</PersonOfInterest>
<PersonOfInterest>
<Name>Sue</Name>
</PersonOfInterest>
</Incident>
<Incident>
...
</Incident>
<PersonOfInterest>
<Name>Manny</Name>
</PersonOfInterest>
<PersonOfInterest>
...
</PersonOfInterest>
</Root>
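To make a document such as <ghost>Boo</ghost> fail, one option is to report validation warnings and treat any validation event as a failure. The sketch below illustrates that idea; the class and method names and the urn:example:incident namespace are made up for the example, and the schema is a cut-down stand-in rather than the schema from the question.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class StrictValidation
{
    // Returns true only if sourceXml validates against schemaXml with no errors
    // and no warnings. Malformed XML still throws XmlException.
    public static bool IsValid(string sourceXml, string schemaXml)
    {
        bool valid = true;
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        // Without this flag, "could not find schema information" is silently ignored.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) => valid = false;

        using (XmlReader schemaReader = XmlReader.Create(new StringReader(schemaXml)))
        {
            // null means: use the targetNamespace declared in the schema itself.
            settings.Schemas.Add(null, schemaReader);
        }

        using (XmlReader reader = XmlReader.Create(new StringReader(sourceXml), settings))
        {
            while (reader.Read()) { }
        }
        return valid;
    }

    // Hypothetical, cut-down schema used only for this demonstration.
    public const string Schema =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
        "targetNamespace='urn:example:incident' xmlns='urn:example:incident' " +
        "elementFormDefault='qualified'>" +
        "<xs:element name='Incident'><xs:complexType><xs:sequence>" +
        "<xs:element name='Description' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:schema>";

    public static void Main()
    {
        string good = "<Incident xmlns='urn:example:incident'><Description>x</Description></Incident>";
        string ghost = "<ghost>Boo</ghost>";
        Console.WriteLine(IsValid(good, Schema));  // prints True
        Console.WriteLine(IsValid(ghost, Schema)); // prints False
    }
}
```

Treating warnings as failures is a deliberate design choice here: it rejects any element for which no schema information exists, which is what the question expected to happen.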
I need to use a complex type defined in a different assembly in my XSD schema. Both of my .xsd schemas are embedded resources, and I've tried to link the schema I need to import into the assembly that needs it, with no results.
Basically, when I need to validate one of my XML documents I call this function, but it isn't able to add, in cascade, the XML schema sets of the types inside Operations.
public static XmlSchema GetDocumentSchema(this Document doc)
{
var actualType = doc.GetType();
var stream = actualType.Assembly.GetManifestResourceStream(actualType.FullName);
if (stream == null)
{
throw new FileNotFoundException("Unable to load the embedded file [" + actualType.FullName + "]");
}
var documentSchema = XmlSchema.Read(stream, null);
foreach (XmlSchemaExternal xmlInclude in documentSchema.Includes)
{
var includeStream = xmlInclude.SchemaLocation != "Operations.xsd"
? actualType.Assembly.GetManifestResourceStream(xmlInclude.Id)
: typeof (Operations).Assembly.GetManifestResourceStream(xmlInclude.Id);
if (includeStream == null)
{
throw new FileNotFoundException("Unable to load the embedded include file [" + xmlInclude.Id + "]");
}
xmlInclude.Schema = XmlSchema.Read(includeStream, null);
}
return documentSchema;
}
This is the main schema:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="ExampleSheet"
attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:include id="Operations" schemaLocation="Operations.xsd"/>
<xs:element name="ExampleSheet">
<xs:complexType>
<xs:sequence>
<xs:element name="Operations" type="Operations"/>
</xs:sequence>
<xs:attribute name="version" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
And this is the schema of Operations:
<xs:schema id="Operations"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
<xs:element name="Operations" type="Operations"/>
<xs:complexType name="Operations">
<xs:choice minOccurs="1" maxOccurs="unbounded">
<xs:element name="Insert" type="Insert"/>
<xs:element name="InsertUpdate" type="InsertUpdate"/>
<xs:element name="Update" type="Update"/>
<xs:element name="Delete" type="Delete"/>
</xs:choice>
<xs:attribute name="version" type="xs:string" use="required"/>
<xs:attribute name="store" type="xs:string" use="required"/>
<xs:attribute name="chain" type="xs:string" use="optional"/>
</xs:complexType>
</xs:schema>
For example, if I have an ExampleSheet containing an Insert, it isn't recognized. Operations and Insert are classes that implement IXmlSerializable, and the first one retrieves the schema sets of the inner types using a custom XmlSchemaProvider.
Am I doing something wrong?
How can I get my ExampleSheet to accept the members of Operations?
Should ExampleSheet implement IXmlSerializable so I can build the reader and writer as I want, and would the schema still be useful?
Instead of XmlSchema, have you looked into the XmlSchemaSet class?
I have not done a lot with XML Serialization, so I don't know if it will fit your current application, but I've used it before in a similar situation where I have to refer to types defined in 3 separate schemas.
The complete XmlSchemaSet object will then have access to all of the types in each of the schemas.
I have an XML document containing types from two XML schemas. One (theirs.xsd) is a proprietary schema that I am integrating with (and cannot edit). To do this I am defining my own type (mine.xsd), which appears as an element inside an 'any' element of the proprietary type.
I use Visual Studio's xsd.exe to generate C# classes from the schemas. However, the 'any' element in the proprietary type is generated as XmlElement[], and therefore my type doesn't get deserialized.
So I guess I can go one of two ways: either generate classes that will deserialize my type rather than keeping it as an XmlElement, or take the XmlElements and deserialize them individually. To deserialize I need an XmlReader, so I would need to go from an XmlElement to an XmlReader, which I'm not sure how to do. Thanks.
Example:
File: theirs.xsd
<xs:element name="ProprietaryContainer">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
File: mine.xsd
<xs:element name="MyPairType">
<xs:complexType>
<xs:sequence>
<xs:element name="key" type="xs:string"/>
<xs:element name="value" type="xs:long"/>
</xs:sequence>
</xs:complexType>
</xs:element>
File: message.xml
<their:ProprietaryContainer>
<their:name>pairContainer</their:name>
<mine:MyPairType>
<mine:key>abc</mine:key>
<mine:value>long</mine:value>
</mine:MyPairType>
</their:ProprietaryContainer>
From the question:
To deserialize I need an XmlReader, so I would need to go from an XmlElement to an XmlReader which I'm not sure how to do
using(XmlReader reader = new XmlNodeReader(element)) {
//... use reader
}
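Putting that together with XmlSerializer, deserializing a single XmlElement might look like the sketch below. The MyPairType class here is hand-written for the example, and the namespace urn:example:mine is an assumption (the question does not show the real namespace); in practice you would use the class that xsd.exe generated from mine.xsd.

```csharp
using System;
using System.Xml;
using System.Xml.Serialization;

// Hand-written stand-in for the class xsd.exe would generate from mine.xsd.
// The namespace "urn:example:mine" is an assumption for this example.
[XmlRoot("MyPairType", Namespace = "urn:example:mine")]
public class MyPairType
{
    [XmlElement("key", Namespace = "urn:example:mine")]
    public string Key { get; set; }

    [XmlElement("value", Namespace = "urn:example:mine")]
    public long Value { get; set; }
}

public static class AnyElementDemo
{
    // Wraps the element in an XmlNodeReader and hands it to XmlSerializer.
    public static MyPairType FromElement(XmlElement element)
    {
        var serializer = new XmlSerializer(typeof(MyPairType));
        using (XmlReader reader = new XmlNodeReader(element))
        {
            return (MyPairType)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // Stands in for one entry of the XmlElement[] produced for the 'any' element.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<m:MyPairType xmlns:m='urn:example:mine'>" +
            "<m:key>abc</m:key><m:value>42</m:value>" +
            "</m:MyPairType>");

        MyPairType pair = FromElement(doc.DocumentElement);
        Console.WriteLine(pair.Key + "=" + pair.Value); // prints abc=42
    }
}
```

In the real application you would loop over the XmlElement[] property of the generated ProprietaryContainer class and call a helper like FromElement for each element whose name and namespace match your type.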