Xsd2Code serialize file OutOfMemoryException - c#

I m using Xsd2Code to serialize my object in order to generate a Xml file.
It works fine, just when the file contains much data, I get an OutOfMemoryException. Here's the code I used to serialize my object :
/// Serializes current EntityBase object into an XML document
/// </summary>
// <returns>string XML value</returns>
public virtual string Serialize() {
System.IO.StreamReader streamReader = null;
System.IO.MemoryStream memoryStream = null;
try {
memoryStream = new System.IO.MemoryStream();
Serializer.Serialize(memoryStream, this);
memoryStream.Seek(0, System.IO.SeekOrigin.Begin);
streamReader = new System.IO.StreamReader(memoryStream);
return streamReader.ReadToEnd();
}
finally {
if (streamReader != null) {
streamReader.Dispose();
}
if (memoryStream != null) {
memoryStream.Dispose();
}
}
}
My request here, is how can I extend the memory buffer, or how can I avoid such an exception?
Regards.

You don't show the complete ToString() output of the OutOfMemoryException so it's hard to say for sure how much this will help, but one possibility would be to write directly to a StringWriter without creating an intermediate MemoryStream, like so:
public virtual string Serialize()
{
return this.Serialize(Serializer);
}
Using the extension method:
public static class XmlSerializerExtensions
{
class NullEncodingStringWriter : StringWriter
{
public override Encoding Encoding { get { return null; } }
}
public static string Serialize<T>(this T obj, XmlSerializer serializer = null, bool indent = true)
{
if (serializer == null)
serializer = new XmlSerializer(obj.GetType());
// Precisely emulate the output of http://referencesource.microsoft.com/#System.Xml/System/Xml/Serialization/XmlSerializer.cs,2c706ead96e5c4fb
// - Indent by 2 characters
// - Suppress output of the "encoding" tag.
using (var textWriter = new NullEncodingStringWriter())
{
using (var xmlWriter = new XmlTextWriter(textWriter))
{
if (indent)
{
xmlWriter.Formatting = Formatting.Indented;
xmlWriter.Indentation = 2;
}
serializer.Serialize(xmlWriter, obj);
}
return textWriter.ToString();
}
}
}
You might also consider eliminating the formatting and indentation to save more string memory by setting indent = false.
This will reduce your peak memory footprint somewhat, since it completely eliminates the need to have a large MemoryStream in memory at the same time as the resulting string. It won't reduce your peak memory requirement enormously, however, since the memory taken by the MemoryStream will have been proportional to the memory taken by the final XML string.
Beyond that, I can only suggest trying to stream directly to your database.

Related

c# serialization xmlwriter stringwriter out of memory for large objecthelp

I am managing a large project and need to serialize and send an object in xml format. The object is ~130 mb.
(NOTE: I did not write this project, so making edits outside of this method, or drastically changing the architecture is not an option. It works great normally, but when the object is this large, it throws out of memory exception. I need to do it another way to handle large objects.)
The current code is this:
public static string Serialize(object obj)
{
string returnValue = null;
if (null != obj)
{
System.Runtime.Serialization.DataContractSerializer formatter = new System.Runtime.Serialization.DataContractSerializer(obj.GetType());
XDocument document = new XDocument();
System.IO.StringWriter writer = new System.IO.StringWriter();
System.Xml.XmlTextWriter xmlWriter = new XmlTextWriter(writer);
formatter.WriteObject(xmlWriter, obj);
xmlWriter.Close();
returnValue = writer.ToString();
}
return returnValue;
}
It is throwing an out of memory exception right at returnValue = writer.ToString().
I rewrote it to use "using" blocks which I prefer:
public static string Serialize(object obj)
{
string returnValue = null;
if (null != obj)
{
System.Runtime.Serialization.DataContractSerializer formatter = new System.Runtime.Serialization.DataContractSerializer(obj.GetType());
using (System.IO.StringWriter writer = new System.IO.StringWriter())
{
using (System.Xml.XmlTextWriter xmlWriter = new XmlTextWriter(writer))
{
formatter.WriteObject(xmlWriter, obj);
returnValue = writer.ToString();
}
}
}
return returnValue;
}
researching this, it appears the ToString method on StringWriter actually uses double the RAM. (I actually have plenty of RAM free, over 4 gb, so not really sure why I am getting an out of memory error).
Well, I found the best solution was to serialize to a file directly, then instead of passing a string along, I pass the file:
public static void Serialize(object obj, FileInfo destination)
{
if (null != obj)
{
using (TextWriter writer = new StreamWriter(destination.FullName, false))
{
XmlTextWriter xmlWriter = null;
try
{
xmlWriter = new XmlTextWriter(writer);
DataContractSerializer formatter = new DataContractSerializer(obj.GetType());
formatter.WriteObject(xmlWriter, obj);
}
finally
{
if (xmlWriter != null)
{
xmlWriter.Flush();
xmlWriter.Close();
}
}
}
}
}
Of course, now I have another problem which I will post ... and that is deserializing the file!

Returning SqlXml initialized with memory stream

I am trying to return an SqlXml object from a method which initializes it using a method local memory stream. I.e.
using (Stream memoryStream = new MemoryStream())
{
using (XmlWriter writer = XmlWriter.Create(memoryStream, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
serializer.Serialize(writer, myList.ToArray(), ns);
return new SqlXml(memoryStream);
}
}
Now the method that calls it and tries to access it's fields fails with an object disposed exception.
I gave a quick glance at SqlXml.cs and saw it is just keeping an reference to the stream which describes the behaviour.
public SqlXml(Stream value) {
// whoever pass in the stream is responsible for closing it
// similar to SqlBytes implementation
if (value == null) {
SetNull();
}
else {
firstCreateReader = true;
m_fNotNull = true;
m_stream = value;
}
I would really like to avoid caller having to pass the stream and being responsible for it's lifetime. Is there any other way to fully initializing the SqlXml object and safely disposing the memory stream?
edit:
One possible solution is to have a temp SqlXml variable and then use it to initialize return object via create reader constructor:
using (Stream memoryStream = new MemoryStream())
{
using (XmlWriter writer = XmlWriter.Create(memoryStream, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
serializer.Serialize(writer, myList.ToArray(), ns);
SqlXml s = new SqlXml(memoryStream);
return new SqlXml(s.CreateReader());
}
}
But this still looks a bit clunky to me.
The using statement will call dispose on the stream when the block exits. Take the MemoryStream out of the using-block and it will not dispose before return.
class Program
{
static void Main(string[] args)
{
var s = GetData();
var r = s.CreateReader();
while (r.Read())
{
if (r.NodeType == XmlNodeType.Element)
{
System.Console.WriteLine(r.Name);
}
}
r.Close();
}
private static SqlXml GetData()
{
var mem = new MemoryStream();
//TODO: Deserialize or query data.
return new SqlXml(mem);
}
}

Convert XDocument to byte array (and byte array to XDocument)

I've taken over a system that stores large XML documents in SQL Server in binary format.
Currently the data is saved by converting it to a string, then converting that string to a byte array. But recently with some large XML documents I'm getting out memory exceptions when attempting to convert to a string, so I want to bypass this process and go straight from the XDocument to a byte array.
The Entity Framework class holding the XML has been extended so that the binary data is accessible as a string like this:
partial class XmlData
{
public string XmlString { get { return Encoding.UTF8.GetString(XmlBinary); } set { XmlBinary = Encoding.UTF8.GetBytes(value); } }
}
I want to further extend the class to look something like this:
partial class XmlData
{
public string XmlString{ get { return Encoding.UTF8.GetString(XmlBinary); } set { XmlBinary = Encoding.UTF8.GetBytes(value); } }
public XDocument XDoc
{
get
{
// Convert XmlBinary to XDocument
}
set
{
// Convert XDocument to XmlBinary
}
}
}
I think I've nearly figured out the conversion, but when I use the partial classes XmlString method to get the XML back from the DB, the XML has always been cut off near the end, always at a different character count:
var memoryStream = new MemoryStream();
var xmlWriter = XmlWriter.Create(memoryStream);
myXDocument.WriteTo(xmlWriter);
XmlData.XmlBinary = memoryStream.ToArray();
SOLUTION
Here's the basic conversion:
var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Encoding = Encoding.UTF8 };
using (var memoryStream = new MemoryStream())
using (var xmlWriter = XmlWriter.Create(memoryStream, settings))
{
myXDocument.WriteTo(xmlWriter);
xmlWriter.Flush();
XmlData.XmlBinary = memoryStream.ToArray();
}
But for some reason in this process, some weird non ascii characters get added to the XML so using my previous XmlString method would load those weird characters and XDocument.Parse() would break, so my new partial class looks like this:
partial class XmlData
{
public string XmlString
{
get
{
var xml = Encoding.UTF8.GetString(XmlBinary);
xml = Regex.Replace(xml, #"[^\u0000-\u007F]", string.Empty); // Removes non ascii characters
return xml;
}
set
{
value = Regex.Replace(value, #"[^\u0000-\u007F]", string.Empty); // Removes non ascii characters
XmlBinary = Encoding.UTF8.GetBytes(value);
}
}
public XDocument XDoc
{
get
{
using (var memoryStream = new MemoryStream(XmlBinary))
using (var xmlReader = XmlReader.Create(memoryStream))
{
var xml = XDocument.Load(xmlReader);
return xml;
}
}
set
{
var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Encoding = Encoding.UTF8 };
using (var memoryStream = new MemoryStream())
using (var xmlWriter = XmlWriter.Create(memoryStream, settings))
{
value.WriteTo(xmlWriter);
xmlWriter.Flush();
XmlBinary = memoryStream.ToArray();
}
}
}
}
That sounds like buffer of one of streams / writers was not flushed during read or write - use using (...) for autoclose, flush and dispose, and also check that in all places where you finished read / write you've done .Flush()

Save datacontract serialized object as string to save in sql db

I have a custom type UserSettingConfig I want to save in my database, I want to save it as pure XML as the type might be changed later and migrating pure xml is easier than a binary objet.
public class Serialize
{
private readonly DataContractSerializer _serializer;
public Serialize()
{
_serializer = new DataContractSerializer(typeof(UserSettingConfig));
}
public string SerializeObject(UserSettingConfig userSettingConfig)
{
using (var memoryStream = new MemoryStream())
{
_serializer.WriteObject(memoryStream, userSettingConfig);
string userSettingXml = memoryStream.ToString();
memoryStream.Close();
return userSettingXml;
}
}
public UserSettingConfig DeSerializeObject(string userSettingXml)
{
UserSettingConfig userSettingConfig;
using (var stream = new MemoryStream(userSettingXml))
{
stream.Position = 0;
userSettingConfig = (UserSettingConfig)_serializer.ReadObject(stream);
}
return userSettingConfig;
}
}
This dont work as the Memory Stream want a byte array or int
I want my Serialize to return a string (I can save as varchar(MAX) in my database)
DataContractSerializer.WriteObject has an overload that takes an XmlWriter. You can construct one of those that writes the XML to a StringBuilder:
private static string SerializeToString(object objectToSerialize)
{
var serializer = new DataContractSerializer(objectToSerialize.GetType());
var output = new StringBuilder();
var xmlWriter = XmlWriter.Create(output);
serializer.WriteObject(xmlWriter, objectToSerialize);
xmlWriter.Close();
return output.ToString();
}
You may also consider serializing to JSON instead of XML, using the excellent JSON.NET library which can serialize even the most complex objects easily. JSON is very compact and is still readable.
To serialize:
string json = Newtonsoft.Json.JavaScriptConvert.SerializeObject(anySerializableObject);
To deserialize:
MyClass instance = (MyClass) Newtonsoft.Json.JavaScriptConvert.DeserializeObject(json, typeof(MyClass));
If you need xml without xml declaration, you should use XmlWriterSettings. For instance when you need xml string for node but not entire xml document.
private static string SerializeToString(object objectToSerialize)
{
var serializer = new DataContractSerializer(objectToSerialize.GetType());
var output = new StringBuilder();
var xmlWriter = XmlWriter.Create(output, new XmlWriterSettings() { OmitXmlDeclaration = true});
serializer.WriteObject(xmlWriter, objectToSerialize);
xmlWriter.Close();
return output.ToString();
}

de-serialze a byte[] that was serialzed with XmlSerializer

I have a byte[] that was serialized with the following code:
// Save an object out to the disk
public static void SerializeObject<T>(this T toSerialize, String filename)
{
XmlSerializer xmlSerializer = new XmlSerializer(toSerialize.GetType());
TextWriter textWriter = new StreamWriter(filename);
xmlSerializer.Serialize(textWriter, toSerialize);
textWriter.Close();
}
problem is the data serialized looks like this:
iVBORw0KGgoAAAANSUhEUgAAAPAAAAFACAIAAAANimYEAAAAAXNSR0IArs4c6QAAAARnQU1BAACx......
when it gets stored in my database it looks like this:
0x89504E470D0A1A0A0000000D49484452000000F00000014008020000000D8A660400000001......
What is the difference, and how can I get the data from the disk back into a byte[]?
Note: the data is a Bitmap formatted as a png like this:
public byte[] ImageAsBytes
{
get
{
if (_image != null)
{
MemoryStream stream = new MemoryStream();
_image .Save(stream, ImageFormat.Png);
return stream.ToArray();
}
else
{
return null;
}
}
set
{
MemoryStream stream = new MemoryStream(value);
_image = new Bitmap(stream);
}
}
iVBORw0KGgoAAAANSUhEUgAAAPAAAAFACAIAAAANimYEAAAAA...
is base64 encoded representation of the binary data.
0x89504E470D0A1A0A0000000D49484452000000F000000140080...
is hexadecimal.
To get the data back from the disk use XmlSerializer and deserialize it back to the original object:
public static T DeserializeObject<T>(string filename)
{
var serializer = new XmlSerializer(typeof(T));
using (var reader = XmlReader.Create(filename))
{
return (T)serializer.Deserialize(reader);
}
}
But if you only have the base64 string representation you could use the FromBase64String method:
byte[] buffer = Convert.FromBase64String("iVBORw0KGgoAAANimYEAAAAA...");
Remark: make sure you always dispose disposable resources such as streams and text readers and writers. This doesn't seem to be the case in your SerializeObject<T> method nor in the getter and setter of the ImageAsBytes property.

Categories

Resources