I am managing a large project and need to serialize an object and send it in XML format. The object is about 130 MB.
(NOTE: I did not write this project, so making edits outside of this method or drastically changing the architecture is not an option. It normally works well, but when the object is this large it throws an OutOfMemoryException. I need another way to handle large objects.)
The current code is this:
public static string Serialize(object obj)
{
string returnValue = null;
if (null != obj)
{
System.Runtime.Serialization.DataContractSerializer formatter = new System.Runtime.Serialization.DataContractSerializer(obj.GetType());
XDocument document = new XDocument();
System.IO.StringWriter writer = new System.IO.StringWriter();
System.Xml.XmlTextWriter xmlWriter = new XmlTextWriter(writer);
formatter.WriteObject(xmlWriter, obj);
xmlWriter.Close();
returnValue = writer.ToString();
}
return returnValue;
}
It is throwing an out of memory exception right at returnValue = writer.ToString().
I rewrote it to use "using" blocks which I prefer:
public static string Serialize(object obj)
{
string returnValue = null;
if (null != obj)
{
System.Runtime.Serialization.DataContractSerializer formatter = new System.Runtime.Serialization.DataContractSerializer(obj.GetType());
using (System.IO.StringWriter writer = new System.IO.StringWriter())
{
using (System.Xml.XmlTextWriter xmlWriter = new XmlTextWriter(writer))
{
formatter.WriteObject(xmlWriter, obj);
returnValue = writer.ToString();
}
}
}
return returnValue;
}
Researching this, it appears the ToString method on StringWriter needs roughly double the memory, because the string it returns is a copy of the writer's internal buffer. (I have plenty of RAM free, over 4 GB, so I am not sure why I am getting an out of memory error.)
Well, I found the best solution was to serialize directly to a file; then, instead of passing a string along, I pass the file:
public static void Serialize(object obj, FileInfo destination)
{
if (null != obj)
{
using (TextWriter writer = new StreamWriter(destination.FullName, false))
{
XmlTextWriter xmlWriter = null;
try
{
xmlWriter = new XmlTextWriter(writer);
DataContractSerializer formatter = new DataContractSerializer(obj.GetType());
formatter.WriteObject(xmlWriter, obj);
}
finally
{
if (xmlWriter != null)
{
xmlWriter.Flush();
xmlWriter.Close();
}
}
}
}
}
Of course, now I have another problem which I will post ... and that is deserializing the file!
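For the reverse direction, a minimal sketch that reads the object back from the file as a stream, so the XML never has to exist as one large string. The class name LargeXmlFile and the generic Deserialize<T> signature are hypothetical and not part of the original project; it assumes the file was written by the Serialize method above.

```csharp
using System.IO;
using System.Runtime.Serialization;

public static class LargeXmlFile
{
    // Hypothetical counterpart to Serialize(object, FileInfo).
    // Reads from a FileStream so the XML is parsed incrementally
    // instead of being loaded into a string first.
    public static T Deserialize<T>(FileInfo source)
    {
        using (FileStream stream = source.OpenRead())
        {
            var formatter = new DataContractSerializer(typeof(T));
            return (T)formatter.ReadObject(stream);
        }
    }
}
```

The deserialized object graph still has to fit in memory; only the intermediate XML string is avoided.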
I'm using Xsd2Code to serialize my object in order to generate an XML file.
It works fine, but when the file contains a lot of data I get an OutOfMemoryException. Here's the code I use to serialize my object:
/// <summary>
/// Serializes current EntityBase object into an XML document
/// </summary>
/// <returns>string XML value</returns>
public virtual string Serialize() {
System.IO.StreamReader streamReader = null;
System.IO.MemoryStream memoryStream = null;
try {
memoryStream = new System.IO.MemoryStream();
Serializer.Serialize(memoryStream, this);
memoryStream.Seek(0, System.IO.SeekOrigin.Begin);
streamReader = new System.IO.StreamReader(memoryStream);
return streamReader.ReadToEnd();
}
finally {
if (streamReader != null) {
streamReader.Dispose();
}
if (memoryStream != null) {
memoryStream.Dispose();
}
}
}
My question is: how can I extend the memory buffer, or otherwise avoid this exception?
Regards.
You don't show the complete ToString() output of the OutOfMemoryException so it's hard to say for sure how much this will help, but one possibility would be to write directly to a StringWriter without creating an intermediate MemoryStream, like so:
public virtual string Serialize()
{
return this.Serialize(Serializer);
}
Using the extension method:
public static class XmlSerializerExtensions
{
class NullEncodingStringWriter : StringWriter
{
public override Encoding Encoding { get { return null; } }
}
public static string Serialize<T>(this T obj, XmlSerializer serializer = null, bool indent = true)
{
if (serializer == null)
serializer = new XmlSerializer(obj.GetType());
// Precisely emulate the output of http://referencesource.microsoft.com/#System.Xml/System/Xml/Serialization/XmlSerializer.cs,2c706ead96e5c4fb
// - Indent by 2 characters
// - Suppress output of the "encoding" tag.
using (var textWriter = new NullEncodingStringWriter())
{
using (var xmlWriter = new XmlTextWriter(textWriter))
{
if (indent)
{
xmlWriter.Formatting = Formatting.Indented;
xmlWriter.Indentation = 2;
}
serializer.Serialize(xmlWriter, obj);
}
return textWriter.ToString();
}
}
}
You might also consider eliminating the formatting and indentation to save more string memory by setting indent = false.
This will reduce your peak memory footprint somewhat, since it completely eliminates the need to have a large MemoryStream in memory at the same time as the resulting string. It won't reduce your peak memory requirement enormously, however, since the memory taken by the MemoryStream will have been proportional to the memory taken by the final XML string.
Beyond that, I can only suggest trying to stream directly to your database.
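As a rough illustration of that last suggestion, here is a sketch that writes the XML straight to a caller-supplied Stream (a file, a network stream, or a stream-backed database parameter) instead of building a string. The class name XmlStreamingExtensions and the method SerializeTo are hypothetical names introduced for this example; how the stream is connected to a database is left out.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static class XmlStreamingExtensions
{
    // Hypothetical helper: serializes directly to the destination stream,
    // so neither a MemoryStream nor a full XML string is held in memory.
    public static void SerializeTo<T>(this T obj, Stream destination, XmlSerializer serializer = null)
    {
        if (serializer == null)
            serializer = new XmlSerializer(obj.GetType());

        var settings = new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false), // UTF-8 without a byte order mark
            CloseOutput = false                 // the caller owns the stream
        };

        // Disposing the XmlWriter flushes it but leaves the stream open.
        using (XmlWriter writer = XmlWriter.Create(destination, settings))
        {
            serializer.Serialize(writer, obj);
        }
    }
}
```

This changes the method's return type, so it only helps where the caller can accept a stream rather than a string.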
I am trying to return an SqlXml object from a method which initializes it using a method-local memory stream, i.e.
using (Stream memoryStream = new MemoryStream())
{
using (XmlWriter writer = XmlWriter.Create(memoryStream, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
serializer.Serialize(writer, myList.ToArray(), ns);
return new SqlXml(memoryStream);
}
}
Now the method that calls it and tries to access its fields fails with an ObjectDisposedException.
I took a quick look at SqlXml.cs and saw that it just keeps a reference to the stream, which explains the behaviour.
public SqlXml(Stream value) {
// whoever pass in the stream is responsible for closing it
// similar to SqlBytes implementation
if (value == null) {
SetNull();
}
else {
firstCreateReader = true;
m_fNotNull = true;
m_stream = value;
}
}
I would really like to avoid the caller having to pass the stream and being responsible for its lifetime. Is there any other way to fully initialize the SqlXml object and safely dispose of the memory stream?
edit:
One possible solution is to have a temporary SqlXml variable and then use it to initialize the returned object via the constructor that takes an XmlReader:
using (Stream memoryStream = new MemoryStream())
{
using (XmlWriter writer = XmlWriter.Create(memoryStream, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
serializer.Serialize(writer, myList.ToArray(), ns);
SqlXml s = new SqlXml(memoryStream);
return new SqlXml(s.CreateReader());
}
}
But this still looks a bit clunky to me.
The using statement calls Dispose on the stream when the block exits. Take the MemoryStream out of the using block and it will not be disposed before the method returns.
class Program
{
static void Main(string[] args)
{
var s = GetData();
var r = s.CreateReader();
while (r.Read())
{
if (r.NodeType == XmlNodeType.Element)
{
System.Console.WriteLine(r.Name);
}
}
r.Close();
}
private static SqlXml GetData()
{
var mem = new MemoryStream();
//TODO: Deserialize or query data.
return new SqlXml(mem);
}
}
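To fill in the TODO above, here is a minimal sketch of what GetData could look like when the data comes from an XmlSerializer, as in the question. The names SqlXmlFactory and ToSqlXml are hypothetical; the point is only that the XmlWriter is disposed (which flushes it) while the MemoryStream is left open for the returned SqlXml.

```csharp
using System.Data.SqlTypes;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public static class SqlXmlFactory
{
    // Hypothetical helper. The MemoryStream is deliberately not disposed:
    // the returned SqlXml keeps a reference to it and reads from it later.
    public static SqlXml ToSqlXml<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        var memoryStream = new MemoryStream();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        // Disposing the writer flushes it. CloseOutput defaults to false,
        // so the underlying stream stays open.
        using (XmlWriter writer = XmlWriter.Create(memoryStream, settings))
        {
            serializer.Serialize(writer, value);
        }

        memoryStream.Position = 0; // rewind so readers start at the beginning
        return new SqlXml(memoryStream);
    }
}
```

Leaving a MemoryStream undisposed is generally harmless, since it wraps only a managed byte array that the garbage collector reclaims once the SqlXml is no longer referenced.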
I seem to be getting some junk at the head of my serialized XML string. I have a simple extension method
public static string ToXML(this object This)
{
DataContractSerializer ser = new DataContractSerializer(This.GetType());
var settings = new XmlWriterSettings { Indent = true };
using (MemoryStream ms = new MemoryStream())
using (var w = XmlWriter.Create(ms, settings))
{
ser.WriteObject(w, This);
w.Flush();
return UTF8Encoding.Default.GetString(ms.ToArray());
}
}
and when I apply it to my object I get the string
<?xml version="1.0" encoding="utf-8"?>
<RootModelType xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/WeinCad.Data">
<MoineauPump xmlns:d2p1="http://schemas.datacontract.org/2004/07/Weingartner.Numerics">
<d2p1:Rotor>
<d2p1:Equidistance>0.0025</d2p1:Equidistance>
<d2p1:Lobes>2</d2p1:Lobes>
<d2p1:MajorRadius>0.04</d2p1:MajorRadius>
<d2p1:MinorRadius>0.03</d2p1:MinorRadius>
</d2p1:Rotor>
</MoineauPump>
</RootModelType>
Note the junk at the beginning. When I try to deserialize this
I get an error. If I copy paste the XML into my source minus
the junk prefix I can deserialize it. What is the junk text
and how can I remove it or handle it?
Note my deserialization code is
public static RootModelType Load(Stream data)
{
DataContractSerializer ser = new DataContractSerializer(typeof(RootModelType));
return (RootModelType)ser.ReadObject(data);
}
public static RootModelType Load(string data)
{
using(var stream = new MemoryStream(Encoding.UTF8.GetBytes(data))){
return Load(stream);
}
}
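The junk is the UTF-8 byte order mark (BOM) that the XmlWriter emits at the start of the stream. This fix seems to work, because reading the stream back through a StreamReader detects and skips the BOM: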
public static string ToXML(this object obj)
{
var settings = new XmlWriterSettings { Indent = true };
using (MemoryStream memoryStream = new MemoryStream())
using (StreamReader reader = new StreamReader(memoryStream))
using(XmlWriter writer = XmlWriter.Create(memoryStream, settings))
{
DataContractSerializer serializer =
new DataContractSerializer(obj.GetType());
serializer.WriteObject(writer, obj);
writer.Flush();
memoryStream.Position = 0;
return reader.ReadToEnd();
}
}
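An alternative is to stop the BOM from being written in the first place. A minimal sketch, assuming the same DataContractSerializer setup as above: passing a UTF8Encoding constructed with false to XmlWriterSettings.Encoding suppresses the BOM, and the bytes are then decoded explicitly as UTF-8. The names BomFreeXml and ToXmlWithoutBom are hypothetical.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

public static class BomFreeXml
{
    // Hypothetical variant of ToXML that never writes a byte order mark.
    public static string ToXmlWithoutBom(object obj)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            Encoding = new UTF8Encoding(false) // false = do not emit a BOM
        };

        using (var memoryStream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(memoryStream, settings))
            {
                var serializer = new DataContractSerializer(obj.GetType());
                serializer.WriteObject(writer, obj);
            } // disposing the writer flushes everything into the stream

            // Decode with the same encoding that was used for writing.
            return Encoding.UTF8.GetString(memoryStream.ToArray());
        }
    }
}
```

The original code also decoded the bytes with UTF8Encoding.Default, which is the inherited Encoding.Default property and not necessarily UTF-8; decoding with Encoding.UTF8 avoids that mismatch.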
I have a custom type, UserSettingConfig, that I want to save in my database. I want to save it as pure XML, as the type might change later and migrating pure XML is easier than migrating a binary object.
public class Serialize
{
private readonly DataContractSerializer _serializer;
public Serialize()
{
_serializer = new DataContractSerializer(typeof(UserSettingConfig));
}
public string SerializeObject(UserSettingConfig userSettingConfig)
{
using (var memoryStream = new MemoryStream())
{
_serializer.WriteObject(memoryStream, userSettingConfig);
string userSettingXml = memoryStream.ToString();
memoryStream.Close();
return userSettingXml;
}
}
public UserSettingConfig DeSerializeObject(string userSettingXml)
{
UserSettingConfig userSettingConfig;
using (var stream = new MemoryStream(userSettingXml))
{
stream.Position = 0;
userSettingConfig = (UserSettingConfig)_serializer.ReadObject(stream);
}
return userSettingConfig;
}
}
This doesn't work, as the MemoryStream constructor wants a byte array or an int.
I want my Serialize class to return a string (which I can save as varchar(MAX) in my database).
DataContractSerializer.WriteObject has an overload that takes an XmlWriter. You can construct one of those that writes the XML to a StringBuilder:
private static string SerializeToString(object objectToSerialize)
{
var serializer = new DataContractSerializer(objectToSerialize.GetType());
var output = new StringBuilder();
var xmlWriter = XmlWriter.Create(output);
serializer.WriteObject(xmlWriter, objectToSerialize);
xmlWriter.Close();
return output.ToString();
}
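The question also asks for the reverse direction. A minimal sketch of the matching deserialization, reading the string through a StringReader and an XmlReader instead of a MemoryStream; the names DataContractStringSerializer and DeserializeFromString are hypothetical.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

public static class DataContractStringSerializer
{
    // Hypothetical counterpart to SerializeToString: rebuilds the object
    // from the XML string without converting it to bytes first.
    public static T DeserializeFromString<T>(string xml)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stringReader = new StringReader(xml))
        using (XmlReader xmlReader = XmlReader.Create(stringReader))
        {
            return (T)serializer.ReadObject(xmlReader);
        }
    }
}
```

In the question's class this would be called as DeserializeFromString<UserSettingConfig>(userSettingXml).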
You may also consider serializing to JSON instead of XML, using the excellent JSON.NET library which can serialize even the most complex objects easily. JSON is very compact and is still readable.
To serialize:
string json = Newtonsoft.Json.JsonConvert.SerializeObject(anySerializableObject);
To deserialize:
MyClass instance = Newtonsoft.Json.JsonConvert.DeserializeObject<MyClass>(json);
If you need XML without the XML declaration, use XmlWriterSettings with OmitXmlDeclaration set to true, for instance when you need the XML string for a single node rather than an entire XML document.
private static string SerializeToString(object objectToSerialize)
{
var serializer = new DataContractSerializer(objectToSerialize.GetType());
var output = new StringBuilder();
var xmlWriter = XmlWriter.Create(output, new XmlWriterSettings() { OmitXmlDeclaration = true});
serializer.WriteObject(xmlWriter, objectToSerialize);
xmlWriter.Close();
return output.ToString();
}
I am sending a request to a web service which requires a string containing XML, for which I have been given an XSD.
I've run xsd.exe and created a class based on this, but I am unsure of the best way to create the XML string to send: for example a stream, an XmlDocument, or some form of serialization.
UPDATE
I found this here
public static string XmlSerialize(object o)
{
using (var stringWriter = new StringWriter())
{
var settings = new XmlWriterSettings
{
Encoding = Encoding.GetEncoding(1252),
OmitXmlDeclaration = true
};
using (var writer = XmlWriter.Create(stringWriter, settings))
{
var xmlSerializer = new XmlSerializer(o.GetType());
xmlSerializer.Serialize(writer, o);
}
return stringWriter.ToString();
}
}
which lets me control the tag attribute.
What I am doing on several occasions is creating a class/struct to hold the data on the client-side program and serializing the data as a string. Then I make the web request and send it that XML string. Here is the code I use to serialize an object to XML:
public static string SerializeToString(object o)
{
string serialized = "";
System.Text.StringBuilder sb = new System.Text.StringBuilder();
//Serialize into the StringBuilder via a StringWriter
System.Xml.Serialization.XmlSerializer ser = new System.Xml.Serialization.XmlSerializer(o.GetType());
System.IO.TextWriter w = new System.IO.StringWriter(sb);
ser.Serialize(w, o);
w.Close();
//Read to string
serialized = sb.ToString();
return serialized;
}
As long as all the contents of the object are serializable it will serialize any object.
Use the XStream framework to generate an XML string. Hope this helps!
Here's what I have done before:
private static string CreateXMLString(object o)
{
// Use the runtime type; a serializer built for typeof(object) rejects custom classes.
XmlSerializer serializer = new XmlSerializer(o.GetType());
var stringBuilder = new StringBuilder();
using (var writer = XmlWriter.Create(stringBuilder))
{
serializer.Serialize(writer, o);
}
return stringBuilder.ToString();
}