XDocument, XElement.Save incorrect encoding - c#

I'm having a problem with the following code:
string serializedLicence = SerializationHelper.ToXML(licenceInfo);
var licenceFileXml = new XElement("Licence", new XElement("LicenceData", serializedLicence));
XmlDocument signedLicence = SignXml(licenceFileXml.ToString(), Properties.Resources.PRIVATE_KEY);
signedLicence.Save(saveFileDialogXmlLicence.FileName);
The created file has incorrect encoding of the strings sent to the XElement constructors as well as of the signature, which is added by a custom SignXml() method (it creates the signature with XmlDocument.DocumentElement.AppendChild(), but that's irrelevant right now). The output:
<?xml version="1.0" encoding="utf-16" standalone="yes"?>
<Licence>
<LicenceData>&lt;?xml version="1.0" encoding="utf-16"?&gt;
&lt;LicenceInfo
//stuff stuff stuff
&lt;/LicenceInfo&gt;</LicenceData>
<Signature><SignedInfo xmlns="h stuff stuff stuff</Signature>
</Licence>
So basically I'm taking the serialized object string and putting it between tags, and this part gets encoded wrong. The debugger shows that the text in the XElement object holds &lt; and &gt; right after it is created. I could parse it manually, but that's inappropriate.
Note: before that, I was signing the serialized XML directly and it worked fine, so I can't figure out why XDocument uses a different encoding than the XmlSerializer/XmlDocument objects.
Also: I think I could just use an XmlDocument object to build the file, but I'm curious what's wrong.

You're adding serializedLicence as a string, so it's treated as text, not as XML, and that's why it looks like that in your document. Parse it first and add the resulting element instead:
var licenceFileXml = new XElement("Licence",
    new XElement("LicenceData",
        XDocument.Parse(serializedLicence).Root));
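To make the difference concrete, here is a minimal sketch that embeds the same XML string both ways. The class and method names (LicenceDemo, EmbedAsText, EmbedAsElement) are made up for illustration and are not part of the original code; only XElement, XDocument.Parse and SaveOptions come from System.Xml.Linq.

```csharp
using System.Xml.Linq;

public static class LicenceDemo
{
    // Passing a string to the XElement constructor creates a text node,
    // so the markup characters are escaped when the element is written out.
    public static string EmbedAsText(string xml)
    {
        var root = new XElement("Licence", new XElement("LicenceData", xml));
        return root.ToString(SaveOptions.DisableFormatting);
    }

    // Parsing first yields a real element, which is nested as markup.
    public static string EmbedAsElement(string xml)
    {
        var root = new XElement("Licence",
            new XElement("LicenceData", XDocument.Parse(xml).Root));
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The first method should reproduce the escaped output from the question, while the second nests LicenceInfo as a child element of LicenceData.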

Related

C# Parsing XML in ISO-8859-1

I'm working on a tool for validating XML files grabbed from a mainframe. For reasons beyond my control every XML file is encoded in ISO 8859-1.
<?xml version="1.0" encoding="ISO 8859-1"?>
My C# application uses the System.Xml library to parse the XML and eventually extract a message string contained within one of the child nodes.
If I manually remove the XML declaration line it works just fine, but I'd like to find a solution that doesn't require manual intervention. Are there any elegant approaches to solving this? Thanks in advance.
The exception that is thrown reads as:
'System.Xml.XmlException' occurred in System.Xml.dll. System does not support 'ISO 8859-1' encoding. Line 1, position 31
My code is
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("//fileLocation");
As Jeroen pointed out in a comment, the encoding should be:
<?xml version="1.0" encoding="ISO-8859-1"?>
not:
<?xml version="1.0" encoding="ISO 8859-1"?>
(missing dash -).
You can use a StreamReader with an explicit encoding to read the file anyway:
using (var reader = new StreamReader("//fileLocation", Encoding.GetEncoding("ISO-8859-1")))
{
var xmlDoc = new XmlDocument();
xmlDoc.Load(reader);
// ...
}
(from answer by competent_tech in other thread I linked in an earlier comment).
If you do not want the using statement, I guess you can do:
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(File.ReadAllText("//fileLocation", Encoding.GetEncoding("ISO-8859-1")));
Instead of XmlDocument, you can use the XDocument class in the namespace System.Xml.Linq if you reference the assembly System.Xml.Linq.dll (available since .NET 3.5). It has static methods such as Load(Stream), Load(TextReader) and Parse(string) which you can use in the same way as above.
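A rough sketch of the XDocument variant, assuming the file really is ISO-8859-1 encoded. The class name Latin1Xml and the path parameter are illustrative only; the key point is that the StreamReader decodes the bytes with the encoding given in code instead of leaving the choice to the XML parser.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class Latin1Xml
{
    // Decode the bytes with an explicitly chosen encoding, then hand the
    // already-decoded characters to XDocument.
    public static XDocument Load(string path)
    {
        using (var reader = new StreamReader(path, Encoding.GetEncoding("ISO-8859-1")))
        {
            return XDocument.Load(reader);
        }
    }
}
```

Usage would look like `var doc = Latin1Xml.Load(somePath);` followed by ordinary LINQ to XML queries on `doc.Root`.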

How to place XML Processing Instruction on Line 1 using System.Xml.Linq

I am writing a console application that generates an XML file that will be consumed by a server job processing application that was written a long time ago. The server app requires a processing instruction: <?JtJob jobname?>. I'm using C# XDocument to generate my xml:
XDocument xml = new XDocument(new XProcessingInstruction("JtJob", "FieldInspection3_Rejected"),
new XElement("Document",
new XElement("DataFile", tempFileName),
new XElement("FormType","Corrected Form Package"),
new XElement("BYOD_RejectComment",reasonForRejection),
new XElement("BYOD_FromTech",techEmail)
)
);
xml.Save(Path.Combine("C:\\Data", DateTime.Now.ToString("yyyyMMdd_HHmmssffff") + "_Rejected.xml"));
For some reason, the server app requires the processing instruction to be on the first line. If my xml file looks like this:
<?xml version="1.0" encoding="utf-8"?><?JtJob FieldInspection3_Rejected?>
<Document>
<DataFile>C:\Windows\TEMP\tmp387F.tmp</DataFile>
<FormType>Corrected Form Package</FormType>
<BYOD_RejectComment>you're ugly</BYOD_RejectComment>
<BYOD_FromTech>example@gmail.com</BYOD_FromTech>
</Document>
Everything works fine. But when it looks like this:
<?xml version="1.0" encoding="utf-8"?>
<?JtJob FieldInspection3_Rejected?>
<Document>
<DataFile>C:\Windows\TEMP\tmp387F.tmp</DataFile>
<FormType>Corrected Form Package</FormType>
<BYOD_RejectComment>you're ugly</BYOD_RejectComment>
<BYOD_FromTech>example@gmail.com</BYOD_FromTech>
</Document>
It errors. My problem is, using the XDocument code above, it generates the second output.
Without loading my generated xml back in as a string and manipulating the string, is there a way for me to tell XDocument to create the processing instruction on the first line?
I know the blame is definitely to be placed on the server app for not accepting valid XML syntax, but my goal is to get this to work, not fix a 20 year old program.
Edit: Thanks! Using the Save overload preserved the formatting. It didn't make it all one line, but it allowed me to keep the PI on line 1.
Edit 2: Well, that didn't help me either. But I found out what would help me! XDocument.Save() by default outputs UTF-8 with a BOM. I changed it to UTF-8 without a BOM by using XmlTextWriter and that worked.
What if you used the XDocument.Save(String, SaveOptions) method to get an output all on a single line?
So do this instead:
xml.Save(fileName, SaveOptions.DisableFormatting);
This would force the processing instruction onto the first line, with the downside of having the entire document on the first line, but if it works for that program then so be it.
You'll want to use an XDocument.Save() overload that allows you to specify formatting options:
xml.Save(Path.Combine("C:\\Data", DateTime.Now.ToString("yyyyMMdd_HHmmssffff") + "_Rejected.xml"),
SaveOptions.DisableFormatting);
https://msdn.microsoft.com/en-us/library/bb551426(v=vs.110).aspx
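Edit 2 of the question says that dropping the BOM is what finally worked, but does not show the code. One possible way to do that is sketched below, using XmlWriter with a UTF8Encoding that does not emit the byte order mark. The class name NoBomSaver is made up here, and this is not necessarily the exact code the asker ended up with (they mention XmlTextWriter).

```csharp
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class NoBomSaver
{
    public static void Save(XDocument doc, string path)
    {
        var settings = new XmlWriterSettings
        {
            // false = do not write the UTF-8 byte order mark at the start of the file
            Encoding = new UTF8Encoding(false),
            Indent = true
        };
        using (var writer = XmlWriter.Create(path, settings))
        {
            doc.Save(writer);
        }
    }
}
```

The file then starts directly with the `<?xml` declaration bytes rather than with EF BB BF.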

XmlException: Text node cannot appear in this state. Line 1, position 1

Before I get into the issue, I'm aware there is another question that sounds exactly the same as mine. However, I've tried that solution (using Notepad++ to encode the xml file as UTF-8 (without BOM) ) and it doesn't work.
XmlDocument namesDoc = new XmlDocument();
XmlDocument factionsDoc = new XmlDocument();
namesDoc.LoadXml(Application.persistentDataPath + "/names.xml");
factionsDoc.LoadXml(Application.persistentDataPath + "/factions.xml");
Above is the code I have problems with. I'm not sure what the problem is.
<?xml version="1.0" encoding="UTF-8"?>
<factions>
<major id="0">
...
Above is a section of the XML file (the start of it; names.xml is the same except it has no 'id' attribute). Both files are encoded in UTF-8. In the latest Notepad++ version there is no "encode in UTF-8 without BOM" option; as far as I know, UTF-8 there is the same as UTF-8 without BOM.
Does anyone have any idea what the cause may be? Or am I doing something wrong or forgetting something? :/
You are receiving an error because the .LoadXml() method expects a string argument that contains the XML data, not the location of an XML file. If you want to load an XML file then you need to use the .Load() method, not the .LoadXml() method.
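The distinction can be sketched as follows. The wrapper class XmlLoading and its method names are hypothetical; Load and LoadXml are the actual XmlDocument methods.

```csharp
using System.Xml;

public static class XmlLoading
{
    // Load takes a file name (or URL) and reads the XML from there.
    public static XmlDocument FromFile(string path)
    {
        var doc = new XmlDocument();
        doc.Load(path);
        return doc;
    }

    // LoadXml takes the XML markup itself as a string.
    public static XmlDocument FromString(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }
}
```

Passing a path to FromString makes the parser treat the path as document content, which is why it fails at line 1, position 1 with an XmlException.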

Error in XML document (2, 2)

When I run this program, I get this error:
public static object Load(Stream stream, Type newType)
{
    // create a serializer and load the object
    XmlSerializer serializer = new XmlSerializer(newType);
    object newobject = serializer.Deserialize(stream);
    // return the new object
    return newobject;
}
?xml version="1.0"?>
-<Address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <FirstName>ali </FirstName> <FamilyName>bradaran</FamilyName> <UserLevel>عادی</UserLevel> <Password>123</Password> </Address>
Your problem is that there is an error in the XML document you are trying to read.
Open your XML document in Internet Explorer. If it is valid, it will display. If it is not, the error will be described and shown, which should help you track down the problem.
If the XML you posted is a genuine representation of what you're reading, there is a minus character and two semicolon characters that shouldn't be in the file. I'm also not sure you would want the xmlns attributes in your Address element?
I suggest you search for some XML tutorials on the web so you can get a better understanding of how XML must be formed.
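When tracking down this kind of failure it can help to look past the generic message. XmlSerializer.Deserialize reports "There is an error in XML document (line, position)" as an InvalidOperationException, and the InnerException usually carries the specific cause. The sketch below is a variation of the Load method from the question; the Address class, the SafeLoader name and the extra error parameter are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type standing in for the asker's Address class.
public class Address
{
    public string FirstName { get; set; }
    public string FamilyName { get; set; }
}

public static class SafeLoader
{
    // Returns the deserialized object, or null with the underlying
    // cause copied into 'error' when deserialization fails.
    public static object Load(Stream stream, Type newType, out string error)
    {
        error = null;
        var serializer = new XmlSerializer(newType);
        try
        {
            return serializer.Deserialize(stream);
        }
        catch (InvalidOperationException ex)
        {
            // The outer message only gives line and position;
            // the inner exception, if present, explains what was wrong.
            error = ex.InnerException != null ? ex.InnerException.Message : ex.Message;
            return null;
        }
    }
}
```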

Reading contents of XML file without having to remove the XML declaration

I want to read all XML contents from a file. The code below only works when the XML declaration (<?xml version="1.0" encoding="UTF-8"?>) is removed. What is the best way to read the file without removing the XML declaration?
XmlTextReader reader = new XmlTextReader(@"c:\my path\a.xml");
reader.Read();
string rs = reader.ReadOuterXml();
Without removing the XML declaration, reader.ReadOuterXml() returns an empty string.
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing">
<s:Header>
<a:Action s:mustUnderstand="1">http://www.as.com/ver/ver.IClaimver/Car</a:Action>
<a:MessageID>urn:uuid:b22149b6-2e70-46aa-8b01-c2841c70c1c7</a:MessageID>
<ActivityId CorrelationId="16b385f3-34bd-45ff-ad13-8652baeaeb8a" xmlns="http://schemas.microsoft.com/2004/09/ServiceModel/Diagnostics">04eb5b59-cd42-47c6-a946-d840a6cde42b</ActivityId>
<a:ReplyTo>
<a:Address>http://www.w3.org/2005/08/addressing/anonymous</a:Address>
</a:ReplyTo>
<a:To s:mustUnderstand="1">http://localhost/ver.Web/ver2011.svc</a:To>
</s:Header>
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Car xmlns="http://www.as.com/ver">
<carApplication>
<HB_Base xsi:type="HB" xmlns="urn:core">
<Header>
<Advisor>
<AdvisorLocalAuthorityCode>11</AdvisorLocalAuthorityCode>
<AdvisorType>1</AdvisorType>
</Advisor>
</Header>
<General>
<ApplyForHB>yes</ApplyForHB>
<ApplyForCTB>yes</ApplyForCTB>
<ApplyForFSL>yes</ApplyForFSL>
<ConsentSupplied>no</ConsentSupplied>
<SupportingDocumentsSupplied>no</SupportingDocumentsSupplied>
</General>
</HB_Base>
</carApplication>
</Car>
</s:Body>
</s:Envelope>
Update
I know other methods that use NON-xml reader (e.g. by using File.ReadAllText()). But I need to know a way that uses an xml method.
There can be no text or whitespace before the <?xml ?> declaration other than a BOM, and no text between the declaration and the root element other than whitespace such as a line break.
Anything else is an invalid document.
UPDATE:
I think your expectation of XmlTextReader.Read() is incorrect.
Each call to XmlTextReader.Read() steps to the next "token" in the XML document, one token at a time. "Token" means XML elements, whitespace, text, and the XML declaration.
Your call to reader.ReadOuterXml() is returning an empty string because the first token in your XML file is an XML declaration, and an XML declaration does not have an outer XML.
Consider this code:
XmlTextReader reader = new XmlTextReader("test.xml");
reader.Read();
Console.WriteLine(reader.NodeType); // XmlDeclaration
reader.Read();
Console.WriteLine(reader.NodeType); // Whitespace
reader.Read();
Console.WriteLine(reader.NodeType); // Element
string rs = reader.ReadOuterXml();
The code above produces this output:
XmlDeclaration
Whitespace
Element
The first "token" is the XML declaration.
The second "token" encountered is the line break after the XML declaration.
The third "token" encountered is the <s:Envelope> element. From here a call to reader.ReadOuterXml() will return what I think you're expecting to see: the text of the <s:Envelope> element, which is the entire SOAP packet.
If what you really want is to load the XML file into memory as objects, just call
var doc = XDocument.Load("test.xml");
and be done with the parsing in one fell swoop.
Unless you're working with an XML doc that is so monstrously huge that it won't fit in system memory, there's really not a lot of reason to go poking through the XML document one token at a time.
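If a reader-based approach is still wanted, XmlReader.MoveToContent() can stand in for the manual Read() calls, since it skips the declaration and whitespace and stops at the first content node. A minimal sketch, where the class and method names are made up and a TextReader is taken as input so the same code works for files and strings:

```csharp
using System.IO;
using System.Xml;

public static class RootReader
{
    // Skips the XML declaration, whitespace, comments and processing
    // instructions, then returns the markup of the root element.
    public static string ReadRootOuterXml(TextReader input)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            return reader.ReadOuterXml();
        }
    }
}
```

For a file this could be called as `RootReader.ReadRootOuterXml(new StreamReader(path))`.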
What about
XmlDocument doc = new XmlDocument();
doc.Load(@"c:\my path\a.xml");
//Now we have the XML document - convert it to a String
//There are many ways to do this, one should be:
StringWriter sw=new StringWriter();
doc.Save(sw);
String finalresult=sw.ToString();
EDIT: I'm assuming you mean you actually have text between the document declaration and the root element. If that's not the case, please clarify.
Without removing the extra text, it's simply an invalid XML file. I wouldn't expect it to work. You don't have an XML file - you have something a bit like an XML file, but with extraneous stuff before the root element.
IMHO you can't read this file. That's because there's plain text before the root element <s:Envelope>, which makes the whole document invalid.
You're parsing an XML document as XML just to obtain the source text? Why?
If you really want to do that then:
string rs;
using (var rdr = new StreamReader(@"c:\my path\a.xml"))
rs = rdr.ReadToEnd();
Will work, but I'm really not sure that is what you actually want. This pretty much ignores that it's XML and just reads the text. Useful for some things, but not a lot.
