I am serializing a C# object into an XML document and sending the XML document to a third party vendor. The vendor is telling me that the encoding specification in the document is UTF-16, but the XML document contains UTF-8 content and they can't use it. Here is the code I am using to create the XML file, which runs without error and creates an XML document.
// Instantiate xmlSerializer with my object type.
XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyObject));
// Instantiate a new stream and pass file location and mode.
Stream stream = new FileStream(@"C:\doc.xml", FileMode.Create);
// Instantiate xmlWriter and pass stream and encoding.
XmlWriter xmlWriter = new XmlTextWriter(stream, Encoding.Unicode);
// Call serialize method and pass xmlWriter and my object.
xmlSerializer.Serialize(xmlWriter, myObject);
// Close writer and stream.
xmlWriter.Close();
stream.Close();
When I run this, the XML Doc shows this on the first line:
<?xml version="1.0" encoding="UTF-16"?>
I've tried changing the Encoding from Encoding.Unicode to Encoding.UTF8 in the XmlTextWriter, but that doesn't change the first line of the XML Doc and it still shows UTF-16.
I also tried using the Serialize method signature that takes 4 parameters (writer, object, namespaces, encoding) and specified UTF8 as the encoding and that didn't change the XML Doc specification either.
I believe all I need to do is change the encoding that shows in the XML Doc to UTF-8 and the third party vendor will be happy. I can't figure out what I am doing wrong.
If I change from Encoding.Unicode to Encoding.UTF8, the file is generated properly. Perhaps you're looking at an old version of your file?
In an unrelated bit, you should use using for deterministic disposal of objects which implement IDisposable:
XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyObject));
using (Stream stream = new FileStream(@".\doc.xml", FileMode.Create))
using (XmlWriter xmlWriter = new XmlTextWriter(stream, Encoding.UTF8))
{
    xmlSerializer.Serialize(xmlWriter, myObject);
}
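To check what declaration a given encoding produces without touching the file on disk, a small in-memory sketch like the one below may help. It is not the asker's code: MyObject here is a hypothetical stand-in with a single property, EncodingDemo and SerializeWith are made-up names, and it uses XmlWriter.Create with XmlWriterSettings rather than XmlTextWriter.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the asker's type.
public class MyObject
{
    public string Name { get; set; }
}

public static class EncodingDemo
{
    // Serializes to a MemoryStream using the given encoding, then decodes
    // the bytes with that same encoding so the declaration can be inspected.
    public static string SerializeWith(Encoding encoding, MyObject value)
    {
        var settings = new XmlWriterSettings { Encoding = encoding };
        var serializer = new XmlSerializer(typeof(MyObject));

        using (var stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                serializer.Serialize(writer, value);
            }
            // ToArray returns everything written to the stream so far.
            return encoding.GetString(stream.ToArray());
        }
    }
}
```

Calling EncodingDemo.SerializeWith(new UTF8Encoding(false), obj) should give a declaration naming UTF-8, while passing Encoding.Unicode should give one naming UTF-16. If the file on disk disagrees with that, the stale-file explanation above becomes more likely.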
I already searched a lot today about this and I can't find how to Deserialize with UTF-8 encoding.
<?xml version="1.0" encoding="UTF-8"?>
<AvailabilityRequestV2 xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
siteid="0000"
apikey="0000"
async="false" waittime="0">
<Type>4</Type>
<Id>159266</Id>
<Radius>0</Radius>
<Latitude>0</Latitude>
<Longitude>0</Longitude>
</AvailabilityRequestV2>
If I try this
string xmlString = /* contents of the file above */;
XmlSerializer serializer = new XmlSerializer(typeof(AvailabilityRequestV2));
AvailabilityRequestV2 request = (AvailabilityRequestV2)serializer.Deserialize(
new MemoryStream(Encoding.UTF8.GetBytes(xmlString)));
When I hover the mouse over request in the debugger, I get this:
{<?xml version="1.0" encoding="utf-16"?><AvailabilityRequestV2
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
..................
How can I force it to be UTF-8?
I have only found how to do this when serializing, not when deserializing.
You can use a StreamReader and specify UTF-8, you can also tell it to use the BOM if present:
using (StreamReader reader = new StreamReader("my.xml", Encoding.UTF8, true)) {
    XmlSerializer serializer = new XmlSerializer(typeof(SomeType));
    object result = serializer.Deserialize(reader);
}
However, I'm unsure what happens when the XML reader encounters the encoding="utf-16" directive within the XML; it may switch over.
Once you have read the contents of a file into a .NET/CLR string, it is UTF-16 encoded: it has been transformed from its original source encoding. The CLR uses UTF-16 internally, which is why a char is 16 bits.
As a result, the encoding specified in the document's [original] XML Declaration is now at odds with the actual encoding of the document.
Best to pass a StreamReader as recommended by @Lloyd above.
I think the example from @Lloyd needs the new keyword:
using (StreamReader reader = new StreamReader("my.xml",Encoding.UTF8,true)) {
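As a rough illustration of the StreamReader approach, the sketch below deserializes straight from UTF-8 bytes, so the text is decoded only once, by the reader. SampleRequest is a hypothetical, cut-down class (it is not the real AvailabilityRequestV2), and DeserializeDemo and FromUtf8Bytes are names made up for this example.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical, reduced type; the real request class has more members.
public class SampleRequest
{
    public int Id { get; set; }
}

public static class DeserializeDemo
{
    // Decodes the bytes as UTF-8 (honouring a BOM if one is present)
    // and hands the resulting text reader to XmlSerializer.
    public static SampleRequest FromUtf8Bytes(byte[] utf8Xml)
    {
        var serializer = new XmlSerializer(typeof(SampleRequest));

        using (var stream = new MemoryStream(utf8Xml))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return (SampleRequest)serializer.Deserialize(reader);
        }
    }
}
```

The deserialized object itself carries no encoding; the encoding only matters while bytes are being turned into characters, and again whenever the object is written back out.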
I need to generate a huge xml file from different sources (functions). I decide to use XmlTextWriter since it uses less memory than XmlDocument.
First, I instantiate an XmlTextWriter with an underlying MemoryStream:
MemoryStream ms = new MemoryStream();
XmlTextWriter xmlWriter = new XmlTextWriter(ms, new UTF8Encoding(false, false));
xmlWriter.Formatting = Formatting.Indented;
Then I pass the XmlWriter (note xml writer is kept open until the very end) to a function to generate the beginning of the XML file:
xmlWriter.WriteStartDocument();
xmlWriter.WriteStartElement("Root");
// xmlWriter.WriteEndElement(); // Do not write the end of root element in first function, to add more xml elements in following functions
xmlWriter.WriteEndDocument();
xmlWriter.Flush();
But I found that underlying memory stream is empty (by converting byte array to string and output string). Any ideas why?
Also, I have a general question about how to generate a huge XML file from different sources (functions). What I do now is keep the XmlWriter open (I assume the underlying memory stream stays open as well), pass it to each function, and write to it. In the first function, I do not write the end of the root element. After the last function, I manually add the end of the root element with:
string endRoot = "</Root>";
byte[] byteEndRoot = Encoding.ASCII.GetBytes(endRoot);
ms.Write(byteEndRoot, 0, byteEndRoot.Length);
Not sure if this works or not.
Thanks a lot!
Technically you should only ask one question per question, so I'm only going to answer the first one because this is just a quick visit to SO for me at the moment.
You need to call Flush before attempting to read from the Stream I think.
Edit
Just bubbling up my second hunch from the comments below to justify the accepted answer here.
In addition to the call to Flush, if reading from the Stream is done using the Read method and its brethren, then the position in the stream must first be reset back to the start. Otherwise no bytes will be read.
ms.Position = 0; /*reset Position to start*/
StreamReader reader = new StreamReader(ms);
string text = reader.ReadToEnd();
Console.WriteLine(text);
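Putting the two points together (flush, then rewind), a minimal end-to-end sketch might look like this. MemoryStreamXmlDemo, BuildDocument and WriteItems are hypothetical names; WriteItems stands in for the asker's other functions, and the Item element is invented for illustration. The sketch also closes the root with WriteEndElement on the same writer instead of appending "</Root>" bytes by hand, which is an assumption about what the asker ultimately wants.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class MemoryStreamXmlDemo
{
    public static string BuildDocument()
    {
        var ms = new MemoryStream();
        var xmlWriter = new XmlTextWriter(ms, new UTF8Encoding(false, false));

        xmlWriter.WriteStartDocument();
        xmlWriter.WriteStartElement("Root");
        WriteItems(xmlWriter);        // hypothetical: other functions add content here
        xmlWriter.WriteEndElement();  // let the writer close the root element
        xmlWriter.WriteEndDocument();

        // Push anything still buffered in the writer into the stream.
        xmlWriter.Flush();

        // Rewind before reading; otherwise the read starts at the end.
        ms.Position = 0;
        using (var reader = new StreamReader(ms, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical stand-in for one of the content-producing functions.
    private static void WriteItems(XmlWriter writer)
    {
        writer.WriteElementString("Item", "1");
    }
}
```

Keeping every write on the XmlWriter means it can track which elements are open, which is harder to guarantee when raw bytes are written to the stream behind its back.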
Perhaps you need to call Flush() on the XML writer before checking the memory stream.
Make sure you call Flush on the XmlTextWriter before checking the memory stream.
I created an XSD file from Visual Studio and can generate a sample XML as well, but my goal is to use this XSD to create an XML file at runtime.
I used XSD.exe to generate a class from my XSD file and then created a program to populate the object from the "class". How can I serialize the object to an XML file?
Both those examples leave the stream open, and XmlFormatter is part of the BizTalk libs - so XmlSerializer would be more appropriate:
using (Stream stream = File.Open(fileName, FileMode.Create))
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyObject));
    serializer.Serialize(stream, myObject); // myObject is the populated instance
    stream.Flush();
}
When you have created classes to serialize and deserialize the Xml file using the XSD.exe tool you can write your instances back to files using ..
Serialization! (Archive)
Stream stream = File.Open(filename, FileMode.Create);
XmlFormatter formatter = new XmlFormatter (typeof(XmlObjectToSerialize));
formatter.Serialize(stream, xmlObjectToSerialize);
stream.Flush();
Binary format is binary, use the XML version for XML:
XmlFormatter serializer = new XmlFormatter(typeof(MyObject));
serializer.Serialize(stream, object1);
All,
I have a list of objects which I have serialized to an XML document using XmlSerializer.
However I would like to wrap the whole result into two tags:
<message>
<!--My serialized content goes here-->
</message>
Do I need to open it as an XML Document and Add a new root element or is there another way of doing it ?
Rgds,
MK
XmlSerializer writes to an XmlWriter. Write the start tag to the writer first, then serialize, and close your message tag at the end.
Example:
XmlWriter writer = // Your writer
XmlSerializer ser = new XmlSerializer(typeof(DateTime));
writer.WriteStartElement("message");
ser.Serialize(writer,DateTime.Now);
writer.WriteEndElement();
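A fuller version of the same idea, written as a sketch that returns the result as a string so it is easy to inspect. Item is a hypothetical class, WrapDemo and SerializeInsideMessage are made-up names, and omitting the XML declaration is a choice made only to keep the output short.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type used only for this example.
public class Item
{
    public string Name { get; set; }
}

public static class WrapDemo
{
    // Writes <message>, lets XmlSerializer write the object inside it,
    // then closes </message> on the same writer.
    public static string SerializeInsideMessage(Item item)
    {
        var serializer = new XmlSerializer(typeof(Item));
        var output = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("message");
            serializer.Serialize(writer, item);
            writer.WriteEndElement();
        }
        return output.ToString();
    }
}
```

The serialized element normally still carries the usual xmlns:xsi and xmlns:xsd declarations; only the outer wrapper is new.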
This snippet <!--Please don't delete this--> is part of my xml file. After running this method, the resulting xml file does not contain this snippet anymore <!--Please don't delete this-->. Why is this?
Here's my method:
XmlSerializer serializer = new XmlSerializer(typeof(Settings));
TextWriter writer = new StreamWriter(path);
serializer.Serialize(writer, settings);
writer.Close();
Well, this is quite obvious:
the XmlSerializer will parse the XML file and extract all instances of Settings from it - your comment won't be part of any of those objects
when you write those back out again, only the contents of the Settings objects is written out again
Your comment will fall through the cracks - but I don't see any way you could "save" that comment as long as you're using the XmlSerializer approach.
What you need to do is use the XmlReader / XmlWriter instead:
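using (XmlReader reader = XmlReader.Create("yourfile.xml"))
using (XmlWriter writer = XmlWriter.Create("your-new-file.xml"))
{
    // With the reader still in its initial state, WriteNode copies the
    // whole document. Calling it inside a while (reader.Read()) loop can
    // skip nodes, because WriteNode itself advances the reader.
    writer.WriteNode(reader, true);
}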
This will copy all xml nodes - including comments - to the new file.
<!-- --> signifies a comment in XML. You are writing an object out to XML - objects do not have comments as they get compiled out during compilation.
That is, the Settings object (which is probably a de-serialized form of your .config XML) does not hold comments in memory after de-serializing, so they will not get serialized back either. There is nothing you can do about this behavior of the framework as there is no built in mechanism to de-serialize comments using XmlSerializer.