Process XML in C# using external entity file - c#

I am processing an XML file (which does not contain any DTD or ENT declarations) in C# that contains entities such as &eacute; and &agrave;. I receive the following exception when attempting to load the XML file:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(record);
Reference to undeclared entity
'eacute'.
I was able to track down the proper ent file here. How do I tell XmlDocument to use this ent file when loading my XML file?

In versions of the framework prior to .NET 4, set the ProhibitDtd property of an XmlReaderSettings instance to false.
var settings = new XmlReaderSettings();
settings.ProhibitDtd = false;
string DTD = @"<!DOCTYPE doc [
<!ENTITY % iso-lat1 PUBLIC ""ISO 8879:1986//ENTITIES Added Latin 1//EN//XML""
""http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-lat1.ent"">
%iso-lat1;
]> ";
string xml = string.Concat(DTD,"<xml><txt>ren&eacute;</txt></xml>");
XmlDocument xd = new XmlDocument();
xd.Load(XmlReader.Create(new MemoryStream(
UTF8Encoding.UTF8.GetBytes(xml)), settings));
From .NET 4.0 onward, use the DtdProcessing property with a value of DtdProcessing.Parse, which you set on the XmlTextReader.
XmlDocument xd = new XmlDocument();
using (var rdr = new XmlTextReader(new StringReader(xml)))
{
rdr.DtdProcessing = DtdProcessing.Parse;
xd.Load(rdr);
}

I ran into the same problem, and not wanting to modify my XML (or DTD), I decided to create my own XmlResolver to add entities on the fly.
My implementation actually reads entities from the config file, but this should be enough to do what you're asking for. In this example, I'm converting a right single curly quote into an apostrophe.
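// Derives from XmlUrlResolver because XmlResolver.GetEntity is abstract,
// so there would be no base implementation to fall back on.
class XmlEntityResolver : XmlUrlResolver {
public override object GetEntity(Uri absoluteUri,
string role,
Type ofObjectToReturn)
{
if (absoluteUri.ToString() == "-//MY PUB ID") {
// Serve the entity declaration from memory instead of fetching it.
MemoryStream ms = new MemoryStream();
StreamWriter sw = new StreamWriter(ms);
sw.Write("<!ENTITY rsquo \"'\">");
sw.Flush();
ms.Position = 0;
return ms;
}
else {
return base.GetEntity(absoluteUri, role, ofObjectToReturn);
}
}
}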
Then, when you declare your XmlDocument, just set the resolver prior to load.
XmlDocument doc = new XmlDocument();
doc.XmlResolver = new XmlEntityResolver();
doc.Load(XML_FILE);

&eacute; is not a valid XML entity by default, whereas it is a valid HTML entity by default.
You would need to define &eacute; as a valid XML entity for XML parsing purposes.
EDIT:
To add a reference to your external ent file you need to do that within the XML file itself. Save the ent file to disk and place it within the same directory as the document being parsed.
<!ENTITY % stuff SYSTEM "iso-lat1.ent">
%stuff;
If you want to go a different route check out the information on ENTITY declaration.
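As an alternative to an external ent file, the entities can also be declared in an internal DTD subset that is prepended to the document before parsing. The following is a minimal sketch, assuming .NET 4 or later; the class and method names are made up for illustration, and only the two entities from the question are declared.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InlineEntityDemo
{
    // Prepends a DOCTYPE whose internal subset declares the entities,
    // loads the result, and returns the text of the root element.
    public static string LoadWithInlineEntities(string body)
    {
        string xml =
            "<!DOCTYPE doc [\n" +
            "  <!ENTITY eacute \"&#233;\">\n" +
            "  <!ENTITY agrave \"&#224;\">\n" +
            "]>\n" + body;

        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // allow the internal subset
            XmlResolver = null                   // nothing external is fetched
        };

        var doc = new XmlDocument();
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            doc.Load(reader);
        }
        return doc.DocumentElement.InnerText;
    }
}
```

Calling InlineEntityDemo.LoadWithInlineEntities("<doc><txt>ren&eacute;</txt></doc>") should then load without the "undeclared entity" error, and no network access is needed.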

According to this, you have to reference them within the file; you cannot tell LoadXml to do this for you.

Your question was already answered back in 2004 in an MSDN article. You can find it here:
http://msdn.microsoft.com/en-us/library/aa302289.aspx

Related

How to configure the XML parser to disable external entity resolution in c#

var xDoc = XDocument.Load(fileName);
I am using the above code in a function to load an XML file. Functionally it works fine, but a Veracode scan reports the following flaw.
Description
The product processes an XML document that can contain XML entities with URLs that resolve to documents outside
of the intended sphere of control, causing the product to embed incorrect documents into its output. By default, the
XML entity resolver will attempt to resolve and retrieve external references. If attacker-controlled XML can be
submitted to one of these functions, then the attacker could gain access to information about an internal network, local
filesystem, or other sensitive data. This is known as an XML eXternal Entity (XXE) attack.
Recommendations
Configure the XML parser to disable external entity resolution.
What do I need to do to resolve it?
If you are not using external entity references in your XML, you can disable the resolver by setting it to null, from How to prevent XXE attack ( XmlDocument in .net)
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.XmlResolver = null;
xmlDoc.LoadXml(OurOutputXMLString);
If you are expecting the document to contain entity references, then you will need to create a custom resolver and whitelist what you are expecting, especially any references to websites that you do not control.
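Since the question uses XDocument, which has no XmlResolver property of its own, the same idea can be applied through the XmlReader that feeds it. This is a sketch under that assumption; the SafeXml class name is hypothetical, and DtdProcessing.Prohibit (available from .NET 4) goes one step further than the answer above by rejecting any DOCTYPE outright.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class SafeXml
{
    // Parses XML text into an XDocument with DTDs and external resolution disabled.
    public static XDocument Parse(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // a DOCTYPE causes an XmlException
            XmlResolver = null                      // never resolve external resources
        };
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return XDocument.Load(reader);
        }
    }
}
```

For a file, XmlReader.Create(fileName, settings) can be used in place of the StringReader.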
Implement a custom XmlResolver and use it for reading the XML. By default, the XmlUrlResolver is used, which automatically downloads the resolved references.
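public class CustomResolver : XmlUrlResolver
{
public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
{
// Check absoluteUri against what you expect here and throw for anything else.
// base calls XmlUrlResolver.DownloadManager.GetStream(...) here
return base.GetEntity(absoluteUri, role, ofObjectToReturn);
}
}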
And use it like this:
var settings = new XmlReaderSettings { XmlResolver = new CustomResolver() };
var reader = XmlReader.Create(fileName, settings);
var xDoc = XDocument.Load(reader);
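One way to fill in such a resolver is an allow list of exact URIs. The sketch below is only an illustration: the AllowListResolver name and its IsAllowed helper are hypothetical, and it assumes that exact-match comparison of absolute URIs is strict enough for your case.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class AllowListResolver : XmlUrlResolver
{
    private readonly HashSet<string> allowed =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public AllowListResolver(IEnumerable<string> allowedUris)
    {
        // Normalize each entry once so comparisons use the same canonical form.
        foreach (string u in allowedUris)
            allowed.Add(new Uri(u, UriKind.Absolute).AbsoluteUri);
    }

    public bool IsAllowed(Uri uri)
    {
        return uri != null && uri.IsAbsoluteUri && allowed.Contains(uri.AbsoluteUri);
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Refuse anything that is not explicitly listed, before any download happens.
        if (!IsAllowed(absoluteUri))
            throw new XmlException("External entity not allowed: " + absoluteUri);
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}
```

It would be passed to XmlReaderSettings.XmlResolver in the same way as CustomResolver, together with DtdProcessing.Parse if the document carries a DOCTYPE.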
According to the official OWASP documentation you need to do this:
Use of XercesDOMParser do this to prevent XXE:
XercesDOMParser *parser = new XercesDOMParser;
parser->setCreateEntityReferenceNodes(false);
Use of SAXParser, do this to prevent XXE:
SAXParser* parser = new SAXParser;
parser->setDisableDefaultEntityResolution(true);
Use of SAX2XMLReader, do this to prevent XXE:
SAX2XMLReader* reader = XMLReaderFactory::createXMLReader();
reader->setFeature(XMLUni::fgXercesDisableDefaultEntityResolution, true);
Take a look at this guide: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
You can try it this way:
XmlDocument doc = new XmlDocument() { XmlResolver = null };
// XmlReader.Create(string, ...) opens the file; a StringReader would treat fileName as the XML text itself.
XmlReader reader = XmlReader.Create(fileName, new XmlReaderSettings() { XmlResolver = null });
doc.Load(reader);
// Java (JAXP) equivalent, hardening DocumentBuilderFactory:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// to be compliant, completely disable DOCTYPE declaration:
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// or completely disable external entities declarations:
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// or prohibit the use of all protocols by external entities:
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");

How to resolve IOException "file used by another process" while saving XmlDocument?

When I try to save the XML document I edited, an IOException ("file used by another process") occurs.
Any ideas how to solve this?
Note: This method is called every time a new element should be written to the XmlDocument.
public void saveRectangleAsXMLFragment()
{
XmlDocument doc = new XmlDocument();
doc.Load("test.xml");
XmlDocumentFragment xmlDocFrag = doc.CreateDocumentFragment();
String input = generateXMLInput();
xmlDocFrag.InnerXml = input;
XmlElement mapElement = doc.DocumentElement;
mapElement.AppendChild(xmlDocFrag);
input = null;
mapElement = null;
xmlDocFrag = null;
doc.Save("test.xml");
}
It's probably one of your other methods, or some other part of the code, that opened the file and didn't close it properly. Try searching for that kind of problem.
Try this if your application is the only one accessing that .xml file:
1. Create an object globally
object lockData = new object();
2. Use that object in a lock statement wherever you save and load the XML
lock(lockData )
{
doc.Load("test.xml");
}
lock(lockData )
{
doc.Save("test.xml");
}
From Jon Skeet's related answer (see https://stackoverflow.com/a/8354736/4151626)
There seems to be a bug in XmlDocument.Save()'s treatment of the file stream, where it becomes pinned and is neither Closed() nor Disposed(). By taking direct control of the creation and disposition of the stream outside of the XmlDocument.Save() I was able to get around this halting error.
// e.g.
XmlWriter xw = XmlWriter.Create("test.xml"); // Create is a static factory method, so no "new"
doc.Save(xw);
xw.Close();
xw.Dispose();
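The same idea can be written with using blocks, so that the file handle is released even if Save throws. This is a sketch rather than a confirmed fix for the original problem; the XmlSaver helper is a hypothetical name, and FileShare.None is an assumption that no other reader should have the file open while it is being written.

```csharp
using System.IO;
using System.Xml;

public static class XmlSaver
{
    // Saves the document through a stream that this method owns and always closes.
    public static void Save(XmlDocument doc, string path)
    {
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
        using (XmlWriter xw = XmlWriter.Create(fs))
        {
            doc.Save(xw);
        } // xw is flushed and disposed first, then fs is closed
    }
}
```

If the exception still occurs with this in place, the file is most likely held open somewhere else in the application.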

Validating an XML against an embedded XSD in C#

Using the following MSDN documentation I validate an XML file against a schema: http://msdn.microsoft.com/en-us/library/8f0h7att%28v=vs.100%29.aspx
This works fine as long as the XML contains a reference to the schema location or the inline schema. Is it possible to embed the schema "hard-coded" into the application, i.e. the XSD won't reside as a file and thus the XML does not need to reference it?
I'm talking about something like:
Load XML to be validated (without schema location).
Load XSD as a resource or whatever.
Do the validation.
Try this:
Stream objStream = objFile.PostedFile.InputStream;
// Open XML file
XmlTextReader xtrFile = new XmlTextReader(objStream);
// Create validator
XmlValidatingReader xvrValidator = new XmlValidatingReader(xtrFile);
xvrValidator.ValidationType = ValidationType.Schema;
// Add XSD to validator
XmlSchemaCollection xscSchema = new XmlSchemaCollection();
xscSchema.Add("xxxxx", Server.MapPath(@"/zzz/XSD/yyyyy.xsd"));
xvrValidator.Schemas.Add(xscSchema);
try
{
while (xvrValidator.Read())
{
}
}
catch (Exception ex)
{
// Error on validation
}
You can use the XmlReaderSettings.Schemas property to specify which schema to use. The schema can be loaded from a Stream.
var schemaSet = new XmlSchemaSet();
schemaSet.Add("http://www.contoso.com/books", new XmlTextReader(xsdStream));
var settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema; // the default is None, which skips validation
settings.Schemas = schemaSet;
using (var reader = XmlReader.Create(xmlStream, settings))
{
while (reader.Read());
}
You could declare the XSD as an embedded resource and load it via GetManifestResourceStream as described in this article: How to read embedded resource text file
Yes, this is possible. Read the embedded resource file to string and then create your XmlSchemaSet object adding the schema to it. Use it in your XmlReaderSettings when validating.
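Putting those two suggestions together might look like the sketch below. The EmbeddedXsdValidator class, its method names, and the resource name in the usage note are all hypothetical; the XSD is assumed to be added to the project with its build action set to Embedded Resource.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Schema;

public static class EmbeddedXsdValidator
{
    // Opens an embedded resource by its full name and fails loudly if it is missing.
    public static Stream OpenResource(Assembly assembly, string resourceName)
    {
        Stream s = assembly.GetManifestResourceStream(resourceName);
        if (s == null)
            throw new FileNotFoundException("Embedded resource not found: " + resourceName);
        return s;
    }

    // Validates the XML against the XSD and returns the validation error messages.
    public static List<string> Validate(TextReader xsd, TextReader xml)
    {
        var schemas = new XmlSchemaSet();
        using (XmlReader xsdReader = XmlReader.Create(xsd))
        {
            schemas.Add(null, xsdReader); // null: use the schema's own targetNamespace
        }

        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(xml, settings))
        {
            while (reader.Read()) { } // reading the document drives the validation
        }
        return errors;
    }
}
```

A call could then look like Validate(new StreamReader(OpenResource(Assembly.GetExecutingAssembly(), "MyApp.Schemas.books.xsd")), new StreamReader(xmlStream)), where the resource name is a placeholder. An empty list means that no validation errors were reported.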

Creating an XML with multiple root elements

I'm trying to create an XML with multiple root elements. I can't change that because that is the way I'm supposed to send the XML to the server. This is the error I get when I try to run the code:
System.InvalidOperationException: This operation would create an incorrectly structured document.
Is there a way to overwrite this error and have it so that it ignores this?
Alright so let me explain this better:
Here is what I have
XmlDocument doc = new XmlDocument();
doc.LoadXml(_application_data);
Now that creates the XML document and I can add a fake root element to it so that it works. However, I need to get rid of that and convert it into a DocumentElement object.
How would I go about doing that?
Specify Fragment when creating XmlWriter as shown here
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.ConformanceLevel = ConformanceLevel.Fragment;
settings.CloseOutput = false;
// Create the XmlWriter object and write some content.
MemoryStream strm = new MemoryStream();
using (XmlWriter writer = XmlWriter.Create(strm, settings))
{
writer.WriteElementString("orderID", "1-456-ab");
writer.WriteElementString("orderID", "2-36-00a");
writer.Flush();
}
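To get the fragment back as a string for sending, the writer can target a StringWriter instead of a MemoryStream. This is a small sketch along the lines of the answer above; the FragmentWriter class and WriteOrders method are hypothetical names, and the orderID element is reused from the example.

```csharp
using System.IO;
using System.Xml;

public static class FragmentWriter
{
    // Writes one <orderID> element per id, with no enclosing root element.
    public static string WriteOrders(params string[] orderIds)
    {
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            ConformanceLevel = ConformanceLevel.Fragment // permits several top-level elements
        };

        var output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            foreach (string id in orderIds)
            {
                writer.WriteElementString("orderID", id);
            }
        } // disposing the writer flushes it into the StringWriter
        return output.ToString();
    }
}
```

The returned string holds the sibling elements one after another, which is what the server in the question appears to expect.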
If it has multiple root elements, it's not XML. If it resembles XML in other ways, you could place everything under a root element; then, when you send the string to the server, you just combine the serialized child elements of this root element or, as @Austin points out, use an inner XML method if available.
Just create an XML document with a single root, then get its content as XML text.
You are talking about an XML fragment anyway, since well-formed XML has only one root.
Here is a sample to help you get started:
var xml = new XmlDocument();
var root = xml.CreateElement("root");
root.AppendChild(xml.CreateElement("a"));
root.AppendChild(xml.CreateElement("b"));
Console.WriteLine(root.InnerXml); // outputs "<a /><b />"

XslCompiledTransform and Serialization

I am trying to implement some functions that will convert one object to another with XslCompiledTransform.
I found some implementations for Serializing an object to XML string and DeSerialize the XML string to an object.
Another function does the XslCompiledTransform from object1 to object2.
To generate the XSLT file I used Altova MapForce: I just loaded the XML of the serialized objects and mapped some attributes.
Now for the problems:
First, I noticed that XslCompiledTransform doesn't work with XSLT version 2.0. Are there any newer functions that do work with XSLT 2.0? Maybe some settings?
Second, I get an exception when trying to deserialize the XML to an object:
"There was an error deserializing the object of type myObject Input string was not in a correct format."
I don't understand where the problem is.
Does anybody have sample code that does such a thing? All I find on Google are transformations of HTML code, not objects.
Here are the functions:
private static string runXSLT(string xsltFile, string inputXML)
{
XmlDocument XmlDoc = new XmlDocument();
// Load the style sheet.
XslCompiledTransform xslt = new XslCompiledTransform(true);
xslt.Load(xsltFile);
StringReader StrReader = new StringReader(inputXML);
XmlTextReader XmlReader = new XmlTextReader(StrReader);
//Create an XmlTextWriter which outputs to memory stream
Stream stream = new MemoryStream();
XmlWriter writer = new XmlTextWriter(stream, Encoding.UTF8);
// Execute the transform and output the results to a file.
xslt.Transform(XmlReader, writer);
stream.Position = 0;
XmlDoc.Load(stream);
return XmlDoc.InnerXml;
}
public static string SerializeAnObject(object AnObject)
{
XmlDocument XmlDoc = new XmlDocument();
DataContractSerializer xmlDataContractSerializer = new DataContractSerializer(AnObject.GetType());
MemoryStream MemStream = new MemoryStream();
try
{
xmlDataContractSerializer.WriteObject(MemStream, AnObject);
MemStream.Position = 0;
XmlDoc.Load(MemStream);
return XmlDoc.InnerXml;
}
finally
{
MemStream.Close();
}
}
public static Object DeSerializeAnObject(string XmlOfAnObject, Type ObjectType)
{
StringReader StrReader = new StringReader(XmlOfAnObject);
DataContractSerializer xmlDataContractSerializer = new DataContractSerializer(ObjectType);
XmlTextReader XmlReader = new XmlTextReader(StrReader);
try
{
Object AnObject = xmlDataContractSerializer.ReadObject(XmlReader);
return AnObject;
}
finally
{
XmlReader.Close();
StrReader.Close();
}
}
Thanks a lot,
Omri.
XslCompiledTransform does not support XSLT 2.0. In fact, XSLT 2.0 is not supported within the .NET Framework at all (you could try the Saxon version for .NET, but be aware that this is just the Java version running inside IKVM).
From your description I did not understand why you are taking the detour via XML to convert one object into another. Why don't you simply provide a constructor in your target object that takes your input object as a parameter? Then you can code all the mapping inside that constructor. This is not only far more efficient than serializing, transforming and deserializing your objects; you also get the type safety of C#.
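A mapping constructor of that kind could look like the following sketch. Both classes and all of their properties are invented for illustration, since the question does not show the actual object types.

```csharp
using System;
using System.Globalization;

// Hypothetical source type.
public class CustomerDto
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Age { get; set; }
}

// Hypothetical target type; all mapping logic lives in its constructor.
public class Customer
{
    public string FullName { get; private set; }
    public int Age { get; private set; }

    public Customer(CustomerDto source)
    {
        if (source == null) throw new ArgumentNullException("source");

        FullName = (source.FirstName + " " + source.LastName).Trim();
        // A format problem surfaces here, at the exact field being converted.
        Age = int.Parse(source.Age, CultureInfo.InvariantCulture);
    }
}
```

With this approach, a conversion failure such as "Input string was not in a correct format" points directly at the offending property instead of at the deserializer.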
