Version number XML 1.1 is invalid [duplicate] - c#

This is in the context of Web Services (client end).
I need to interface with a back-end system (Java) and it is a requirement to transmit some control characters in the  and  range.
I'm well aware that XML 1.0 doesn't support this, but am interested to know if the .NET 4 platform or .NET 4.5 web services framework support conversations in XML 1.1.

No, it doesn't look like XmlReader (the core of much of the XML support in .NET) supports 1.1:
using System;
using System.IO;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        // The document declares XML 1.1 and carries a control character reference.
        string xml = "<?xml version=\"1.1\" ?><tag>&#x1;</tag>";
        var reader = XmlReader.Create(new StringReader(xml));
        while (reader.Read());
    }
}
Output:
Unhandled Exception: System.Xml.XmlException: Version number '1.1' is invalid.
Line 1, position 16.
I've looked at XmlReaderSettings to see if anything there would help, but I don't think it does. Basically I think you're stuck for the moment :(
EDIT: Reading around XML 1.1 a bit, it looks like it's not widely deployed or recommended, so I'm not particularly surprised that it's not supported in .NET 4.5. My guess is that it never will be, given that it's not a particularly new recommendation. The most recent version is the 2nd edition which dates back to 2006. If it's not supported 7 years later, I suspect there'd have to be some significant event to make it worth supporting in the future.
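If the underlying need is only to get control characters through .NET's own XML stack, one setting that may be worth trying is XmlReaderSettings.CheckCharacters. The following is a minimal sketch, assuming the document stays declared as XML 1.0 and the control characters arrive as numeric character references. Setting CheckCharacters to false turns off checking of character references; it does not add XML 1.1 support, and such a document is not well-formed XML 1.0, so a strict parser on the Java side may still reject it. ControlCharDemo and ReadTagText are illustrative names, not part of the original answer.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ControlCharDemo
{
    // Reads the text content of the root element with
    // character-reference checking turned off.
    public static string ReadTagText(string xml)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();                      // skip the XML declaration
            return reader.ReadElementContentAsString();  // text inside <tag>
        }
    }

    public static void Main()
    {
        // Declared as XML 1.0: version="1.1" is still rejected by XmlReader.
        string text = ReadTagText("<?xml version=\"1.0\" ?><tag>&#x1;</tag>");
        Console.WriteLine("Code point read: {0}", (int)text[0]);
    }
}
```

If both ends must exchange strictly valid XML, encoding the payload (for example as Base64) before it goes into the document is the more conservative route.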

I am sure this is not the best option, but if you download IKVM you can use Java classes from your .NET code after referencing a few assemblies (and yes, this really is .NET code):
var fXmlFile = new java.io.File(xmlfile);
var dbFactory = javax.xml.parsers.DocumentBuilderFactory.newInstance();
var dBuilder = dbFactory.newDocumentBuilder();
var doc = dBuilder.parse(fXmlFile);
var nList = doc.getElementsByTagName("controlcharacters");
var chars = nList.item(0).getTextContent().ToCharArray();
XML File:
<?xml version="1.1" ?>
<root>
<controlcharacters></controlcharacters>
</root>

Related

cXML .net XMLReader error The parameter entity replacement text must nest properly within markup declarations

Is there something I need to configure in the XmlReaderSettings to encourage .net (4.8, 6, 7) to handle some cXML without throwing the following exception:
Unhandled exception. System.Xml.Schema.XmlSchemaException: The parameter entity replacement text must nest properly within markup declarations.
Sample cXML input
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.041/cXML.dtd">
<cXML payloadID="donkeys@example.com" timestamp="2023-02-13T01:01:01Z">
  <Header>
  </Header>
  <Request deploymentMode="production">
  </Request>
</cXML>
Sample Application
using System.IO;
using System.Xml;
using System.Xml.Linq;

namespace Donkeys
{
    internal class Program
    {
        static void Main()
        {
            XmlReaderSettings settings = new()
            {
                XmlResolver = new XmlUrlResolver(),
                DtdProcessing = DtdProcessing.Parse,
                ValidationType = ValidationType.DTD,
            };
            FileStream fs = File.OpenRead("test.xml"); // sample cXML from question
            XmlReader reader = XmlReader.Create(fs, settings);
            XDocument.Load(reader); // this blows up
        }
    }
}
I'm looking to use the XmlUrlResolver to cache the DTDs, but unless I ignore the validation error I get the exception above, and I'm not really sure why.
So far I've tried different validation flags but they don't validate at all unless I use ValidationType.DTD which goes pop.
The actual resolver seems to work fine; if I subclass it, it is returning the DTD (as a MemoryStream) as expected.
I can add an event handler to ignore the issue but this feels lamer than I'd like.
using System.Xml;
using System.Xml.Linq;

namespace Donkeys
{
    internal class Program
    {
        static void Main()
        {
            XmlReaderSettings settings = new()
            {
                XmlResolver = new XmlUrlResolver(),
                DtdProcessing = DtdProcessing.Parse,
                ValidationType = ValidationType.DTD,
                IgnoreComments = true
            };
            settings.ValidationEventHandler += Settings_ValidationEventHandler;
            FileStream fs = File.OpenRead("test.xml");
            XmlReader reader = XmlReader.Create(fs, settings);
            XDocument dogs = XDocument.Load(reader);
        }

        private static void Settings_ValidationEventHandler(object? sender, System.Xml.Schema.ValidationEventArgs e)
        {
            // this seems fragile
            if (e.Message.ToLower() == "The parameter entity replacement text must nest properly within markup declarations.".ToLower()) // and this would be a const
                return;
            throw e.Exception;
        }
    }
}
I've spent some time over the last few days looking into this and trying to get my head around what's going on here.
As far as I can tell, the error "The parameter entity replacement text must nest properly within markup declarations" is being reported incorrectly. My understanding of the spec is that this message means that you have mismatched < and > characters in the replacement text of a parameter entity in a DTD.
The following example is taken from this O'Reilly book sample page and demonstrates something that genuinely should reproduce this error:
<!ENTITY % finish_it ">">
<!ENTITY % bad "won't work" %finish_it;
Indeed the .NET DTD parser reports the same error for these two lines of DTD.
This doesn't mean you can't have < and > characters in parameter entity replacement text at all: the following two lines will declare an empty element with name Z, albeit in a somewhat round-about way:
<!ENTITY % Nested "<!ELEMENT Z EMPTY>">
%Nested;
The .NET DTD parser parses this successfully.
However, the .NET DTD parser appears to be objecting to this line in the cXML DTD, which defines the Object.ANY parameter entity:
<!ENTITY % Object.ANY '|xades:QualifyingProperties|cXMLSignedInfo|Extrinsic'>
There are of course no < and > characters in the replacement text, so the error is baffling.
This is by no means a new problem. I found this unanswered Stack Overflow question which basically reports the same problem. Also, this MSDN Forum post basically has the same problem, and it was asked in 2007. So is this unclear but intentional behaviour, or a bug that has been in .NET for 15+ years? I don't know.
For those who do want to look into things further, the following is about the minimum necessary to reproduce the problem. The necessary C# code to read the XML file can be taken from the question and adapted; I don't see the need to repeat it here:
example.dtd:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT A EMPTY>
<!ENTITY % Rest '|A' >
<!ELEMENT example (#PCDATA %Rest;)*>
example.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE example SYSTEM "example.dtd">
<example/>
There are various ways to tweak this to get rid of the error. One way is to move the | character from the parameter entity into the ELEMENT example declaration. Replacing #PCDATA with another element (which you would also have to define) is another way.
But enough of the theory behind the problem. How can you actually move forwards with this?
I would take a local copy of the cXML DTD and adjust it to work around this error. You can download the DTD from the URL in your sample cXML input. The %Object.ANY; parameter entity is only used once in the DTD: I would replace this one occurrence with the replacement text, |xades:QualifyingProperties|cXMLSignedInfo|Extrinsic.
You then need to make the .NET XML parser use your modified copy of the cXML DTD instead of fetching the one from the given URL. You can create a custom URL resolver for this, for example:
using System;
using System.IO;
using System.Xml;

namespace Donkeys
{
    internal class CXmlUrlResolver : XmlResolver
    {
        private static readonly Uri CXml1_2_041 = new Uri("http://xml.cxml.org/schemas/cXML/1.2.041/cXML.dtd");
        private readonly XmlResolver urlResolver;

        public CXmlUrlResolver()
        {
            this.urlResolver = new XmlUrlResolver();
        }

        public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
        {
            if (absoluteUri == CXml1_2_041)
            {
                // Return a Stream that reads from your custom version of the DTD,
                // for example:
                return File.OpenRead(@"SomeFilePathHere\cXML-1.2.041.dtd");
            }

            // Any other URI is passed on to the nested resolver.
            return this.urlResolver.GetEntity(absoluteUri, role, ofObjectToReturn);
        }
    }
}
This checks to see what URI is being requested, and if it matches the cXML URI, returns a stream that reads from your customised copy of the DTD. If some other URI is given, it passes the request to the nested XmlResolver, which then deals with it. You will of course need to use an instance of CXmlUrlResolver instead of XmlUrlResolver when creating your XmlReaderSettings.
I don't know how many versions of cXML you will have to deal with, but if you are dealing with multiple versions, you might have to create a custom copy of the DTD for each version, and have your resolver return the correct local copy for each different URI.
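For the multiple-version case, a rough sketch of how the same idea could be generalised: a resolver that looks the requested URI up in a table of local copies and falls back to XmlUrlResolver for anything else. LocalDtdResolver, the file path and test.xml are illustrative placeholders rather than anything from the cXML tooling, and the nullable-annotated signature assumes .NET 6 or later.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical resolver: serves known DTD URIs from local files and
// delegates every other URI to a nested XmlUrlResolver.
public sealed class LocalDtdResolver : XmlResolver
{
    private readonly IReadOnlyDictionary<Uri, string> localCopies;
    private readonly XmlResolver fallback = new XmlUrlResolver();

    public LocalDtdResolver(IReadOnlyDictionary<Uri, string> localCopies)
    {
        this.localCopies = localCopies;
    }

    public override object? GetEntity(Uri absoluteUri, string? role, Type? ofObjectToReturn)
    {
        if (localCopies.TryGetValue(absoluteUri, out var path))
        {
            // A patched local copy exists for this URI.
            return File.OpenRead(path);
        }
        return fallback.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class Program
{
    public static void Main()
    {
        // One entry per cXML version you need to support (placeholder path).
        var resolver = new LocalDtdResolver(new Dictionary<Uri, string>
        {
            [new Uri("http://xml.cxml.org/schemas/cXML/1.2.041/cXML.dtd")] = @"SomeFilePathHere\cXML-1.2.041.dtd",
        });

        var settings = new XmlReaderSettings
        {
            XmlResolver = resolver,
            DtdProcessing = DtdProcessing.Parse,
            ValidationType = ValidationType.DTD,
        };

        using XmlReader reader = XmlReader.Create("test.xml", settings);
        while (reader.Read()) { }
    }
}
```

Keeping the URI-to-file mapping in data rather than in if statements means supporting another cXML version is a one-line change.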
A similar approach is given at this MSDN Forums post from 2008, which also deals with difficulties parsing cXML with .NET. This features a custom URL resolver created by subclassing XmlUrlResolver. Those who prefer composition over inheritance may prefer my custom URL resolver instead.

ASP.NET Core 3.1 using XmlSerializerFormatters()

I currently have an application based on .NET Core 2.2 which works. I need to move this project forward to .NET Core 3.1, but I cannot seem to get the XML deserialized in the controller. In both apps I created a WCF connected service successfully. The WSDL now has more classes defined, but they are basically the same. I diffed the generated files.
The left-hand side is the newly generated file:
< [System.CodeDom.Compiler.GeneratedCodeAttribute("Microsoft.Tools.ServiceModel.Svcutil", "2.0.2")]
---
> [System.CodeDom.Compiler.GeneratedCodeAttribute("Microsoft.Tools.ServiceModel.Svcutil", "2.0.1-preview-30310-0943")]
This repeats with every class in Reference.cs. My problem is that my Postman tests fail with the new controllers. Using Calculatus Eliminatus, I have managed to track down the difference. The old parser would accept:
<?xml version="1.0" encoding="UTF-8"?>
<ConnectedServiceRequestX xmlns="http://somename.com/api/01">
<Timestamp>2021-04-05T16:35:43</Timestamp>
<ApiKey>TopSecretKey</ApiKey>
<CustomerId>ABC</CustomerId>
</ConnectedServiceRequestX>
The new parser only works if the posted XML is like this:
<?xml version="1.0" encoding="UTF-8"?>
<ConnectedServiceRequestX>
<Timestamp xmlns="http://somename.com/api/01">2021-04-05T16:35:43</Timestamp>
<ApiKey xmlns="http://somename.com/api/01">TopSecretKey</ApiKey>
<CustomerId xmlns="http://somename.com/api/01">ABC</CustomerId>
</ConnectedServiceRequestX>
The new parser throws an exception when putting xmlns="http://somename.com/api/01" at class level XML Item. I need to support the older XML input as I have no ownership of the system accessing our service. This is a case where a big corporation is dictating the interface that they will use to access our data and we are a small outfit.
I am inclined to think there is some option I can supply to .XmlSerializerFormatters() such that the xmlns will default to what namespace is provided on the class level XML Item. Any help is appreciated.
The following should work:
using System;
using System.Xml;
using System.Xml.Serialization;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";

        static void Main(string[] args)
        {
            XmlReader reader = XmlReader.Create(FILENAME);
            XmlSerializer serializer = new XmlSerializer(typeof(ConnectedServiceRequestX));
            ConnectedServiceRequestX request = (ConnectedServiceRequestX)serializer.Deserialize(reader);
        }
    }

    // The root element and its children all live in this namespace.
    [XmlRoot(Namespace = "http://somename.com/api/01")]
    public class ConnectedServiceRequestX
    {
        public DateTime Timestamp { get; set; }
        public string ApiKey { get; set; }
        public string CustomerId { get; set; }
    }
}
After studying the diffs between the generated files (April 2019 and today), I noticed a one-line difference preceding some of the classes. There are 60-plus classes in the C# file generated from the WSDL (I apologize for not noticing earlier). Anyhow, the line is as follows:
[XmlRootAttribute("ConnectedServiceRequestX", Namespace="http://somename.com/api/01", IsNullable = false)]
These lines were added at some point after the class generation but prior to their initial entry into source control. They require:
using System.Xml.Serialization;
to be added as well. This allows the xmlns attribute to be placed on the outer tag (the class-level element) rather than being repeated on each inner tag, as follows:
<?xml version="1.0" encoding="UTF-8"?>
<ConnectedServiceRequestX xmlns="http://somename.com/api/01">
<Timestamp>2021-04-05T16:35:43</Timestamp>
<ApiKey>TopSecretKey</ApiKey>
<CustomerId>ABC</CustomerId>
</ConnectedServiceRequestX>
This is how my legacy Postman tests were designed. I wrote (generated) this code when I was learning ASP.NET Web Services, and I do not remember modifying the generated files to get the Connected Service / XML POST to work. So the short answer is: when generating code (adding a Connected Service - WCF/WSDL) with Visual Studio, you may still need to modify the Reference.cs file to allow friendlier XML to be posted to the endpoints.
I am now up and running (ASP.NET Core 3.1) using my legacy tests which give me confidence to go to production with the updated app. I hope this helps others.
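To make the effect of that attribute concrete, here is a small sketch that can be run outside ASP.NET. It assumes the generated classes carry an XmlType attribute with the service namespace (which would explain why only the inner elements needed the xmlns); the class names RequestWithoutRoot and RequestWithRoot and the helper CanDeserialize are made up for the illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Assumed shape of a generated class: members are in the namespace,
// but there is no XmlRoot, so the root element is expected without one.
[XmlType(Namespace = "http://somename.com/api/01")]
public class RequestWithoutRoot
{
    public string ApiKey { get; set; }
}

// The same class with the hand-added XmlRoot attribute.
[XmlType(Namespace = "http://somename.com/api/01")]
[XmlRoot("RequestWithRoot", Namespace = "http://somename.com/api/01", IsNullable = false)]
public class RequestWithRoot
{
    public string ApiKey { get; set; }
}

public static class XmlRootDemo
{
    // Returns true if the XML deserializes into T, false if XmlSerializer rejects it.
    public static bool CanDeserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        try
        {
            serializer.Deserialize(new StringReader(xml));
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }

    public static void Main()
    {
        const string ns = "http://somename.com/api/01";
        string xmlA = "<RequestWithoutRoot xmlns=\"" + ns + "\"><ApiKey>TopSecretKey</ApiKey></RequestWithoutRoot>";
        string xmlB = "<RequestWithRoot xmlns=\"" + ns + "\"><ApiKey>TopSecretKey</ApiKey></RequestWithRoot>";

        Console.WriteLine("Without XmlRoot: {0}", CanDeserialize<RequestWithoutRoot>(xmlA));
        Console.WriteLine("With XmlRoot:    {0}", CanDeserialize<RequestWithRoot>(xmlB));
    }
}
```

The first document should be rejected because the serializer expects the root element outside the namespace, while the second should load; that matches the behaviour seen with the Postman tests.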

.NET Standard F# library does not store non-English characters properly

I create a .NET Standard F# library with F# 4.3.4 (I also tested with 4.5) with the following code:
namespace ClassLibrary2

module Say =
    let a = "国".Length.ToString()
    let b = sprintf "%A" ("国".ToCharArray() |> Array.map int)
    let c = "国"
When referencing that library from another project (.net core or .net framework):
Console.WriteLine(Say.a); // F# .net standard
Console.WriteLine(Say.b);
Console.WriteLine(Say.c == "国");
I get the following output:
2
[|65533; 65533|]
False
The equivalent C# .NET Standard library:
using System;
using System.Linq;

namespace ClassLibrary1
{
    public static class Class1
    {
        public static string a = "国".Length.ToString();
        public static string b = String.Join(", ", "国".ToCharArray().Select(i => ((int)i).ToString()));
        public static string c = "国";
    }
}
gives the expected output:
1
22269
True
Here's a repo showing the issue: https://github.com/liboz/Kanji-Bug.
This looks likely to be a bug, but I was wondering what a reasonable workaround for this problem would be. Specifically, I want to be able to check equality for strings with something like Say.c = "国", where I might be using non-English characters, while using a .NET Standard library.
So, the issue appears to be that the first file that the dotnet cli generates in an F# library does not use Unicode for its encoding. So, when creating a .NET Standard F# library that file for me was generated with Shift-JIS encoding, likely due to region settings on my own computer. Therefore, the solution to my issue was to simply save the default Library1.fs file with UTF-8 encoding manually so that it would have the same encoding as all the other files.
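The symptom can be reproduced outside the compiler. The sketch below assumes the source file was stored as Shift-JIS and then read as UTF-8 (an inference from the answer above, not something confirmed against the F# compiler), and that the code runs on .NET Core 3.0 or later, where CodePagesEncodingProvider is available. EncodingDemo, MisreadAsUtf8 and ResaveAsUtf8 are illustrative names.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class EncodingDemo
{
    // Encodes the text with its real encoding, then decodes the bytes as
    // UTF-8, which is what happens when a reader guesses the encoding wrong.
    public static int[] MisreadAsUtf8(string text, Encoding actualEncoding)
    {
        byte[] bytes = actualEncoding.GetBytes(text);
        return Encoding.UTF8.GetString(bytes).Select(ch => (int)ch).ToArray();
    }

    // Re-saves a source file as UTF-8 with a byte order mark.
    public static void ResaveAsUtf8(string path, Encoding currentEncoding)
    {
        string content = File.ReadAllText(path, currentEncoding);
        File.WriteAllText(path, content, new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));
    }

    public static void Main()
    {
        // Code page encodings such as Shift-JIS must be registered on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding shiftJis = Encoding.GetEncoding("shift_jis");

        // "\u56FD" is the character from the question, escaped so that this
        // file's own encoding cannot interfere.
        Console.WriteLine(string.Join(", ", MisreadAsUtf8("\u56FD", shiftJis)));
    }
}
```

On that assumption the misread string should consist of two replacement characters (65533), the same values the F# library reported. ResaveAsUtf8 does programmatically what the answer did by hand in the editor.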

Validate HL7 with C# and nHapi for .NET

I'm looking to validate an HL7 2.3 standard message using C# and the .NET version of the nHapi project:
https://github.com/duaneedwards/nHapi
I've downloaded the dll's and added to my project both NHapi.Base.dll and NHapi.Model.V23.dll.
I know I should use:
NHapi.Base.validation.MessageValidator
But I can't figure out how IValidationContext theContext should be configured in order to check 2.3 version.
In addition, I can't find any appropriate API docs for it.
Can someone assist?
Methods to validate the message are embedded in the parser. The implementation of specific rules was intentionally left to implementers (to improve flexibility). What you need to do is create a new context:
public class CustomContext : DefaultValidationContext //:IValidationContext
{
    // Define the rules and rule bindings
}

public class Rule1 : IMessageRule
{
    // Check whatever you want in the fully parsed message.
    // For example, check for mandatory segments, group cardinalities, etc.
}
then
PipeParser p = new PipeParser();
CustomContext myContext = new CustomContext();
p.ValidationContext = myContext;
This is a good starting point: NHapi Documentation
I was also looking for a way to validate HL7 v2 messages using NHapi and could not find any good articles. So I decided to go through the NHapi object model to see if it held anything helpful for validating the structure, and I found something.
The NHapi HL7 v2 IMessage is implemented using IType interface and it has a property called ExtraComponent. NHapi parser does not throw any exceptions on invalid structure but populates the ExtraComponent property. So if you find ExtraComponent.numComponents() to be more than 0 then you have structural issues on the message.
I have written some validator code in C#. You can download it from GitHub.
https://github.com/shivkumarhaldikar/NHapiValidatator

(MfA) - Name property on CollectionDataContractAttribute on Dictionary type is ignored

Please note I am posting this QA here to help others and as a partner to bug report 11881 on the Xamarin.Android Bugzilla area. As a result the type described below is for demonstration purposes only. I have posted an initial answer, also making reference to the same bug report, but hopefully at some point this question can be 'answered' with 'this has been fixed in version x.y'.
I have the following type shared between Mono for Android and Windows RT sources:
[CollectionDataContract(Name = "MyDictionary",
Namespace = "http://foo.bar/schema",
ItemName = "pair",
KeyName = "mykey",
ValueName = "myvalue")]
public class MyDictionary : Dictionary<string, string>
{
}
This is read from our Web API (running on Asp.Net Web API, Framework 4.5) as XML which looks like this:
<MyDictionary
    xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://foo.bar/schema">
  <pair>
    <mykey>message1</mykey>
    <myvalue>Hello</myvalue>
  </pair>
  <pair>
    <mykey>message2</mykey>
    <myvalue>World</myvalue>
  </pair>
</MyDictionary>
When using the class as described above, this XML deserializes correctly on Windows, Windows Phone and Win-RT platforms.
However, on Mono for Android builds I get
System.Runtime.Serialization.SerializationException: Expected element 'ArrayOfpair' in namespace 'http://foo.bar/schema', but found Element node 'MyDictionary' in namespace 'http://foo.bar/schema'.
What have I done wrong?
Assuming I hadn't done anything wrong - I wrote an NUnitLight unit test to test whether an instance of MyDictionary would be serialized correctly by the Mono for Android implementation of the DataContractSerializer:
public string Serialize(object o)
{
    DataContractSerializer ser = new DataContractSerializer(o.GetType());
    using (var ms = new MemoryStream())
    {
        ser.WriteObject(ms, o);
        return Encoding.Default.GetString(ms.ToArray());
    }
}

[Test]
public void ShouldSerializeDictionaryCorrectlyAndDeserialize()
{
    MyDictionary dict = new MyDictionary();
    dict["message1"] = "hello";
    dict["message2"] = "world";
    var s = Serialize(dict);
    Assert.That(s, Is.StringStarting("<MyDictionary"));
}
The unit test fails, with the output string starting 'ArrayOfpair' and not 'MyDictionary', which would be consistent with the original behaviour where deserialization of the correct XML fails because it doesn't start 'ArrayOfpair'.
This is, therefore, a good candidate for a bug in the Mono for Android implementation of the DataContractSerializer (I have reported the bug here) - but until that bug is both confirmed and fixed, a workaround will be needed. In my case I have shared codebase issues (Android, Windows and Monotouch) to contend with and so I don't want to just rewrite this type for Android. If I come up with a decent workaround, I'll post it on this answer.
Please note - I don't yet know if this also applies to Monotouch - we don't yet have a complete enough build of our component to run the same test, so it might do, I just don't know.
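One candidate workaround that might be worth testing is to pass the root name and namespace to the DataContractSerializer constructor explicitly instead of relying on the Name property of the attribute. This is only a sketch: the constructor overload exists in the standard .NET API, but whether the Mono for Android implementation honours it has not been verified here. ExplicitRootDemo is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

[CollectionDataContract(Name = "MyDictionary",
    Namespace = "http://foo.bar/schema",
    ItemName = "pair",
    KeyName = "mykey",
    ValueName = "myvalue")]
public class MyDictionary : Dictionary<string, string>
{
}

public static class ExplicitRootDemo
{
    public static string Serialize(MyDictionary value)
    {
        // Root name and namespace are supplied explicitly rather than
        // being inferred from the CollectionDataContract attribute.
        var serializer = new DataContractSerializer(
            typeof(MyDictionary), "MyDictionary", "http://foo.bar/schema");

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        var dict = new MyDictionary();
        dict["message1"] = "hello";
        dict["message2"] = "world";
        Console.WriteLine(Serialize(dict));
    }
}
```

The same serializer instance would be used for ReadObject, so if this does work on Mono the shared code would not need a platform-specific type.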
