Recently upgraded a 3.5 project to 4.5. There is a chunk of data that we are serializing and storing in the database, but everytime a save occurs in the upgraded project, the XML formatting has changed, throwing errors, and I can't seem to figure out the core issue. There are 2 SO questions in particular that mention encoding changes, but I've tried switching to UTF8 (in a few different ways specified in the answers on those questions), without any success - with UTF8 I just got a mess of strange characters throughout the entire file.
The main issues that I can see occurring are:
A leading ? character is added to the XML (which I've come to find out is a valid character, but we aren't handling apparently)
Child nodes aren't being included with some of the nodes.
Here is our serialization method:
public static string SerializeXml<T>(T instance)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
MemoryStream memStream = new MemoryStream();
XmlTextWriter xmlWriter = new XmlTextWriter(memStream, Encoding.Unicode);
serializer.Serialize(xmlWriter, instance);
memStream = (MemoryStream)xmlWriter.BaseStream;
return UnicodeEncoding.Unicode.GetString(memStream.ToArray()).Replace("<?xml version=\"1.0\" encoding=\"utf-16\"?>", "");
}
and our deserialization method:
public static T DeserializeXml<T>(string xml)
{
XmlSerializer xs = new XmlSerializer(typeof(T));
StringReader reader = new StringReader(xml);
return (T)xs.Deserialize(reader);
}
Any help would be appreciated, I am not too familiar with serialization or encoding. Just curious what may have changed with the upgrade to 4.5, or if there is something I need to take a closer look at.
If you want to Serialize to a String you need to use UTF16. If you want to Serialize with UTF8 you need to serialize to a byte[]. Strings in C# are UTF16 so in the code you posted I believe all your data is encoded with UTF16 but because you are omitting the Xml Declaration code assumes it is UTF8.
I would recommend using functions like this and not omitting the XmlDeclaration:
public static string SerializeXmlToString<T>(T instance)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.Unicode;
StringBuilder builder = new StringBuilder();
using (StringWriter writer = new StringWriter(builder))
using (XmlWriter xmlWriter = XmlWriter.Create(writer, settings))
{
serializer.Serialize(xmlWriter, instance);
}
return builder.ToString();
}
public static byte[] SerializeXml<T>(T instance)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
using (MemoryStream memStream = new MemoryStream())
{
using (XmlWriter xmlWriter = XmlWriter.Create(memStream, settings))
{
serializer.Serialize(xmlWriter, instance);
}
return memStream.ToArray();
}
}
public static T DeserializeXml<T>(string data)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
using (StringReader reader = new StringReader(data))
{
return (T)serializer.Deserialize(reader);
}
}
public static T DeserializeXml<T>(byte[] bytes)
{
XmlSerializer serializer = new XmlSerializer(typeof(T));
using(MemoryStream stream = new MemoryStream(bytes))
{
return (T)serializer.Deserialize(stream);
}
}
Related
I am trying to serialize an object to XML and the problem that I am getting is that the object gets serialized to
<something />
instead of
<something/>
I believe that both are valid XML syntax, but I have to get <something/>
Here is my code
public static string Serialize<T>(T ObjectToSerialize)
{
XmlWriterSettings settings = new XmlWriterSettings()
{
OmitXmlDeclaration = true,
Encoding = Encoding.UTF8,
};
XmlSerializer xmlSerializer = new XmlSerializer(ObjectToSerialize.GetType());
using (StringWriter textWriter = new StringWriter())
{
using (var xw = XmlWriter.Create(textWriter, settings))
{
xmlSerializer.Serialize(xw, ObjectToSerialize);
}
return textWriter.ToString();
}
}
How can I fix it?
May not be the most efficient solution, but you could do a simple String.Replace before returning serialized data in Serialize<T>().
Replacing
return textWriter.ToString();
With
return textWriter.ToString().Replace(" />","/>");
I seem to be getting some junk at the head of my serialized XML string. I have a simple extension method
public static string ToXML(this object This)
{
DataContractSerializer ser = new DataContractSerializer(This.GetType());
var settings = new XmlWriterSettings { Indent = true };
using (MemoryStream ms = new MemoryStream())
using (var w = XmlWriter.Create(ms, settings))
{
ser.WriteObject(w, This);
w.Flush();
return UTF8Encoding.Default.GetString(ms.ToArray());
}
}
and when I apply it to my object I get the string
<?xml version="1.0" encoding="utf-8"?>
<RootModelType xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/WeinCad.Data">
<MoineauPump xmlns:d2p1="http://schemas.datacontract.org/2004/07/Weingartner.Numerics">
<d2p1:Rotor>
<d2p1:Equidistance>0.0025</d2p1:Equidistance>
<d2p1:Lobes>2</d2p1:Lobes>
<d2p1:MajorRadius>0.04</d2p1:MajorRadius>
<d2p1:MinorRadius>0.03</d2p1:MinorRadius>
</d2p1:Rotor>
</MoineauPump>
</RootModelType>
Note the junk at the beginning. When I try to deserialize this
I get an error. If I copy paste the XML into my source minus
the junk prefix I can deserialize it. What is the junk text
and how can I remove it or handle it?
Note my deserialization code is
public static RootModelType Load(Stream data)
{
DataContractSerializer ser = new DataContractSerializer(typeof(RootModelType));
return (RootModelType)ser.ReadObject(data);
}
public static RootModelType Load(string data)
{
using(var stream = new MemoryStream(Encoding.UTF8.GetBytes(data))){
return Load(stream);
}
}
This fix seems to work
public static string ToXML(this object obj)
{
var settings = new XmlWriterSettings { Indent = true };
using (MemoryStream memoryStream = new MemoryStream())
using (StreamReader reader = new StreamReader(memoryStream))
using(XmlWriter writer = XmlWriter.Create(memoryStream, settings))
{
DataContractSerializer serializer =
new DataContractSerializer(obj.GetType());
serializer.WriteObject(writer, obj);
writer.Flush();
memoryStream.Position = 0;
return reader.ReadToEnd();
}
}
I am writing an xml file but am missing some value for specific field. I check that when the object comes which contains the value that specific value exist, but after writing the xml the value doesn't exist.
This is the code that I use, I think XmlTextWriter could be the cause of the wrong xml.
There is another method which could be used for it, that is TextWriter but it failed to convert into memorystream.
string xmlString = null;
MemoryStream memoryStream = new MemoryStream();
XmlSerializer xs = new XmlSerializer(typeof(T));
// XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.ASCII);
TextWriter xmlTextWriter=new StreamWriter(memoryStream,Encoding.ASCII);
xs.Serialize(xmlTextWriter, obj);
memoryStream =(MemoryStream)xmlTextWriter.
//(MemoryStream)xmlTextWriter.BaseStream;
xmlString = ASCIIByteArrayToString(memoryStream.ToArray());
return `xmlString;`
Any idea how I can know why and where the problem occurs.
I think you are over-complicating it with the memory stream. You can serialize to a StringWriter (which derives from TextWriter) then call ToString() if you want to get the XML string.
XmlSerializer xs = new XmlSerializer(typeof(T));
StringWriter sw = new StringWriter();
xs.Serialize(sw, obj);
return sw.ToString();
Try disposing properly your IDisposable resources by wrapping them in using statements:
public string SerializeToXml<T>(T obj)
{
using (var stream = new MemoryStream())
{
var xs = new XmlSerializer(typeof(T));
xs.Serialize(stream, obj);
return Encoding.UTF8.GetString(stream.ToArray());
}
}
i want the xml encoding to be:
<?xml version="1.0" encoding="windows-1252"?>
To generate encoding like encoding="windows-1252" I wrote this code.
var myns = OS.xmlns;
using (var stringWriter = new StringWriter())
{
var settings = new XmlWriterSettings
{
Encoding = Encoding.GetEncoding(1252),
OmitXmlDeclaration = false
};
using (var writer = XmlWriter.Create(stringWriter, settings))
{
var ns = new XmlSerializerNamespaces();
ns.Add(string.Empty, myns);
var xmlSerializer = new XmlSerializer(OS.GetType(), myns);
xmlSerializer.Serialize(writer, OS,ns);
}
xmlString= stringWriter.ToString();
}
But I am still not getting my expected encoding what am I missing? Please guide me to generate encoding like encoding="windows-1252"?. What do I need to change in my code?
As long as you output the XML directly to a String (through a StringBuilder or a StringWriter) you'll always get UTF-8 or UTF-16 encondings. This is because strings in .NET are internally represented as Unicode characters.
In order to get the proper encoding you'll have to switch to a binary output, such as a Stream.
Here's a quick example:
var settings = new XmlWriterSettings
{
Encoding = Encoding.GetEncoding(1252)
};
using (var buffer = new MemoryStream())
{
using (var writer = XmlWriter.Create(buffer, settings))
{
writer.WriteRaw("<sample></sample>");
}
buffer.Position = 0;
using (var reader = new StreamReader(buffer))
{
Console.WriteLine(reader.ReadToEnd());
Console.Read();
}
}
Related resources:
C# in Depth: Strings in C# and .NET
I am sending a request to a web service which requires a string containing XML, of which I have been giving an XSD.
I've ran xsd.exe and created a class based on this but am unsure of the best way to create the xml string to send, for example a stream, XMLDocument or some form of serialization.
UPDATE
I found this here
public static string XmlSerialize(object o)
{
using (var stringWriter = new StringWriter())
{
var settings = new XmlWriterSettings
{
Encoding = Encoding.GetEncoding(1252),
OmitXmlDeclaration = true
};
using (var writer = XmlWriter.Create(stringWriter, settings))
{
var xmlSerializer = new XmlSerializer(o.GetType());
xmlSerializer.Serialize(writer, o);
}
return stringWriter.ToString();
}
}
which lets me control the tag attribute.
What I am doing on several occasions is creating a class/struct to hold the data on the client-side program and serializing the data as a string. Then I make the web request and send it that XML string. Here is the code I use to serialize an object to XML:
public static string SerializeToString(object o)
{
string serialized = "";
System.Text.StringBuilder sb = new System.Text.StringBuilder();
//Serialize to memory stream
System.Xml.Serialization.XmlSerializer ser = new System.Xml.Serialization.XmlSerializer(o.GetType());
System.IO.TextWriter w = new System.IO.StringWriter(sb);
ser.Serialize(w, o);
w.Close();
//Read to string
serialized = sb.ToString();
return serialized;
}
As long as all the contents of the object are serializable it will serialize any object.
Use Xstream framework to generate an xml string. Hope this helps!
Here's what I have done before:
private static string CreateXMLString(object o)
{
XmlSerializer serializer = new XmlSerializer(typeof(object));
var stringBuilder = new StringBuilder();
using (var writer = XmlWriter.Create(stringBuilder))
{
serializer.Serialize(writer, o);
}
return stringBuilder.ToString();
}