How to handle XmlWriter.Create in a subclass - c#

I need to change the behaviour of XmlWriter for my project to change the way that empty xml elements are serialised. Currently, my code uses XmlWriter and XmlSerializer like so:
public string Serialize(object o)
{
XmlWriterSettings settings = new XmlWriterSettings();
...
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter, settings);
XmlSerializer serializer = new XmlSerializer(o.GetType());
serializer.Serialize(xmlWriter, o);
return stringWriter.ToString();
}
When serializing my xml, empty elements are being serialized to <emptyElement/>, but I need them to be serialized to <emptyElement></emptyElement>. The best solution I've found was posted on a Microsoft forum years ago: https://social.msdn.microsoft.com/Forums/en-US/979315cf-6727-4979-a554-316218ab8b24/xml-serialize-empty-elements?forum=xmlandnetfx
The faster and safer way of doing this is by writing your own subclass of the XmlWriter and give it to XmlSerializer.
YourXmlWriter would aggregate standard one and would translate all WriteEndElement() calls to WriteFullEndElement() calls.
I've tried writing my own subclass of XmlWriter, overriding the two methods I need to override:
public abstract class CustomXmlWriter : XmlWriter
{
public override void WriteEndElement()
{
WriteFullEndElement();
}
public override Task WriteEndElementAsync()
{
return WriteFullEndElementAsync();
}
}
In theory, I believe this should work. However, when trying to use the code, I'm coming up against a brick wall around XmlWriter.Create. I cannot cast the resulting XmlWriter to my CustomXmlWriter for obvious reasons, and I can't override the method as it's a static method.
How am I meant to deal with the static Create method? The only other way I can think of doing this is to scrap the idea of my own CustomXmlWriter, and to simply manipulate the string at the end of my method, but this feels very wrong. I don't know if what I'm trying to achieve is possible, or if there is a simple setting somewhere that I cannot seem to find anywhere.

Try the following Regex as a temporary fix. The variable input can be the entire xml string, and every occurrence will be replaced. Note that the pattern only matches self-closing elements without attributes:
static void Main(string[] args)
{
string input = "<emptyElement/>";
string patternNullTag = @"\<(?'tagname'\w+)/\>";
string output = Regex.Replace(input, patternNullTag, ReplaceNullElement);
}
static string ReplaceNullElement(Match match)
{
string tagname = match.Value.Replace("<", "").Replace("/>", "");
string newElement = "<" + tagname + ">" + "</" + tagname + ">";
return newElement;
}
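As an alternative to post-processing the string, the subclassing idea quoted in the question can be sketched without XmlWriter.Create at all. This is a minimal sketch, assuming it is acceptable to derive from the concrete XmlTextWriter class instead of wrapping the writer returned by XmlWriter.Create; the names FullEndElementXmlWriter and FullEndElementDemo are made up for illustration. XmlWriterSettings does not apply to XmlTextWriter, so options such as indentation would have to be set through its own properties (for example Formatting).

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical class name. Deriving from the concrete XmlTextWriter means
// only one method has to be overridden.
public class FullEndElementXmlWriter : XmlTextWriter
{
    public FullEndElementXmlWriter(TextWriter output) : base(output) { }

    // Always emit an explicit end tag, even when the element is empty.
    public override void WriteEndElement()
    {
        base.WriteFullEndElement();
    }
}

public static class FullEndElementDemo
{
    // Same shape as the Serialize method in the question, but the writer is
    // constructed directly instead of through XmlWriter.Create.
    public static string Serialize(object o)
    {
        using (var stringWriter = new StringWriter())
        using (var xmlWriter = new FullEndElementXmlWriter(stringWriter))
        {
            var serializer = new XmlSerializer(o.GetType());
            serializer.Serialize(xmlWriter, o);
            xmlWriter.Flush();
            return stringWriter.ToString();
        }
    }
}
```

Because XmlSerializer only talks to the XmlWriter base class, it should not notice the difference; every WriteEndElement call it makes ends up as a full end tag.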

Related

How to make an XML tag mandatory with a self-closing tag, using the serialiser?

I'm working on a C# program and I'm trying to serialise XML.
I have the following tag:
using System.Xml.Serialization;
...
[XmlElement("MV")]
public MultiVerse MultiVerse { get; set; }
When I don't fill in this value, the tag <MV> is not present, but I would like to get a tag <MV/> in that case:
Currently I have <HM><ID>Some_ID</ID></HM>.
I'd like to have <HM><ID>Some_ID</ID><MV/></HM>.
I already tried preceding the line with [Required] but that didn't work, and I think that filling in the IsNullable attribute is the right approach.
Edit1, after some investigation on the internet
On the internet, there is quite a lot of advice on modifying the XmlWriter, but in my project the whole serialisation is done as follows:
public override string ToString()
{
...
using (var stream = new StringWriter())
using (var writer = XmlWriter.Create(stream, settings))
{
var serializer = new XmlSerializer(base.GetType());
serializer.Serialize(writer, this, ns);
return stream.ToString();
}
...
}
As you can see, this is so general that I prefer not to do any modifications in here, hence I'm looking for a way to customise the [XmlElement] directive.
Edit2: XmlWriter settings:
The XmlWriter settings look as follows:
// Remove Declaration
var settings = new XmlWriterSettings
{
Indent = false,
OmitXmlDeclaration = true,
NewLineHandling = NewLineHandling.None,
NewLineOnAttributes = false,
};
Does anybody have an idea?
Thanks in advance
There is https://learn.microsoft.com/en-us/dotnet/api/system.xml.serialization.xmlelementattribute.isnullable?view=net-6.0 so e.g.
[XmlElement("MV", IsNullable=true)]
public MultiVerse MultiVerse { get; set; }
would give you, for a null value, a serialization as <MV xsi:nil="true" /> (or possibly <MV xsi:nil="true"></MV>). The standard writers don't give you control over short versus full tag notation, but in my experience .NET usually uses the short form for empty elements, so the serialization format you want may well be the default one .NET outputs.
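A minimal sketch of the IsNullable behaviour, assuming made-up HM and MultiVerse classes (the question does not show the real ones) and default serializer namespaces:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-ins for the types in the question.
public class MultiVerse
{
    public string Name { get; set; }
}

[XmlRoot("HM")]
public class HM
{
    [XmlElement("ID")]
    public string Id { get; set; }

    // IsNullable = true makes the serializer write the element even when
    // the property is null, marked with xsi:nil="true".
    [XmlElement("MV", IsNullable = true)]
    public MultiVerse MultiVerse { get; set; }
}

public static class NullableElementDemo
{
    public static string Serialize(HM value)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var stream = new StringWriter())
        using (var writer = XmlWriter.Create(stream, settings))
        {
            new XmlSerializer(typeof(HM)).Serialize(writer, value);
            writer.Flush();
            return stream.ToString();
        }
    }
}
```

Calling NullableElementDemo.Serialize(new HM { Id = "Some_ID" }) should produce an MV element carrying a nil="true" attribute rather than omitting the element. The attribute and its namespace declaration are still part of the output, which is not quite the bare <MV/> asked for.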
This is the way I'm currently solving my issue (don't laugh):
public override string ToString()
{
string temp = base.ToString();
temp = temp.Replace(" p2:nil=\"true\"", "");
temp = temp.Replace(" xmlns:p2=\"http://www.w3.org/2001/XMLSchema-instance\"", "");
temp = temp.Replace("MV />", "MV/>");
return temp;
}
This is hideous! Does anybody have a better solution?
Thanks

C# fastest way to serialise object to xml string

Background: I have been tasked with serialising C# objects into an xml string. The xml string is then passed to a webservice and written to disk as an xml file. The serialisation needs to finish within the 5-minute timeframe that the process gets. The consumer webservice only accepts the xml as a string.
I have been researching various ways of creating the xml string: XmlSerializer, XmlWriter, XDocument, StringBuilder, object to json to xml, and LINQ to XML, but I would like to know if anyone has experience of doing something similar. The main aim is to produce the xml string with high performance, without the verbose and error-prone code that comes with building xml by string concatenation.
My object is called Employee and has 18 string/date properties. The objects are created in memory and we get around 4000k objects in total once the process boots up. The process runs for 1 hour a day, loads data from a data file and creates person objects. A number of functions are performed on the objects. Once the objects are ready, they need to be serialised, and the xml data is sent to the webservice and written to an xml file. In short, these objects need to be serialised, saved to disk and sent to the webservice.
Can anyone recommend a high-performance yet easy-to-maintain approach? Apologies for not posting much code; I could create a class and add the xml serialiser code, but I don't think it would add much value at the moment. I am looking for past experience, and I want to avoid going on a wild goose chase and implement the right solution.
I have tried the following serialiser code, but it takes 10+ mins to serialise all 4000k objects.
public static bool Serialize<T>(T value, ref string serializeXml)
{
if (value == null)
{
return false;
}
try
{
XmlSerializer xmlserializer = new XmlSerializer(typeof(T));
StringWriter stringWriter = new StringWriter();
XmlWriter writer = XmlWriter.Create(stringWriter);
xmlserializer.Serialize(writer, value);
serializeXml = stringWriter.ToString();
writer.Close();
return true;
}
catch (Exception ex)
{
return false;
}
}
I have also tried caching the serialiser, but that doesn't give any performance improvement.
According to your requirements, speed matters most, so we need to write a benchmark. As mentioned in the comments, besides XmlSerializer we can use DataContractSerializer for this purpose. There are several Q&As on the difference between the two, e.g.:
DataContractSerializer vs XmlSerializer: Pros and Cons of each serializer
Linq to Xml VS XmlSerializer VS DataContractSerializer
Difference between DataContractSerializer vs XmlSerializer
Another option is to write your XML manually, using either StringBuilder or XmlWriter. Although in the requirements you mentioned:
Main aim is to have a high performant xml string that is not so verbose and error prone like creating xml in string
these two manual approaches are added for comparison. Of course, in the case of StringBuilder, the text must be escaped. Here, I used System.Security.SecurityElement.Escape. The object to be serialized looks like:
//Simple POCO with 11 String properties, 7 DateTime properties
[DataContractAttribute()]
public class Employee
{
[DataMember()]
public string FirstName { set; get; }
[DataMember()]
public string LastName { set; get; }
//...omitted for clarity
[DataMember()]
public DateTime Date03 { set; get; }
[DataMember()]
public DateTime Date04 { set; get; }
}
and all the properties have a value (not null), assigned prior to calling the serializer. The serializer code looks like:
//Serialize using XmlSerializer
public static bool Serialize<T>(T value, ref StringBuilder sb)
{
if (value == null)
return false;
try
{
XmlSerializer xmlserializer = new XmlSerializer(typeof(T));
using (XmlWriter writer = XmlWriter.Create(sb))
{
xmlserializer.Serialize(writer, value);
writer.Close();
}
return true;
}
catch (Exception ex)
{
Console.WriteLine(ex);
return false;
}
}
//Serialize using DataContractSerializer
public static bool SerializeDataContract<T>(T value, ref StringBuilder sb)
{
if (value == null)
return false;
try
{
DataContractSerializer xmlserializer = new DataContractSerializer(typeof(T));
using (XmlWriter writer = XmlWriter.Create(sb))
{
xmlserializer.WriteObject(writer, value);
writer.Close();
}
return true;
}
catch (Exception ex)
{
Console.WriteLine(ex);
return false;
}
}
//Serialize using StringBuilder
public static bool SerializeStringBuilder(Employee obj, ref StringBuilder sb)
{
if (obj == null)
return false;
sb.Append(@"<?xml version=""1.0"" encoding=""utf-16""?>");
sb.Append("<Employee>");
sb.Append("<FirstName>");
sb.Append(SecurityElement.Escape(obj.FirstName));
sb.Append("</FirstName>");
//... Omitted for clarity
sb.Append("</Employee>");
return true;
}
//Serialize using XmlWriter (manually add elements)
public static bool SerializeManual(Employee obj, ref StringBuilder sb)
{
if (obj == null)
return false;
try
{
using (var xtw = XmlWriter.Create(sb))
{
xtw.WriteStartDocument();
xtw.WriteStartElement("Employee");
xtw.WriteStartElement("FirstName");
xtw.WriteString(obj.FirstName);
xtw.WriteEndElement();
//...Omitted for clarity
xtw.WriteEndElement();
xtw.WriteEndDocument();
xtw.Close();
}
return true;
}
catch(Exception ex)
{
Console.WriteLine(ex);
return false;
}
}
In the benchmark, 4M Employee objects are passed as arguments and the XML is written to a preallocated StringBuilder (parameter ref StringBuilder sb). For DataContractSerializer and manual XmlWriter, a benchmark with Parallel.Invoke (3 parallel tasks) was also performed. Required processing time for each serializer:
//Simple POCO with 11 String properties, 7 DateTime properties
XmlSerializer =00:02:37.8151125 = 157 sec: 100% (reference)
DataContractSerializer=00:01:10.3384361 = 70 sec: 45% (3-Parallel: 47sec = 30%)
StringBuilder =00:01:22.5742122 = 82 sec: 52%
Manual XmlWriter =00:00:57.8436860 = 58 sec: 37% (3-Parallel: 40sec = 25%)
Environment: .Net Framework 4.5.2, Intel(R) Core(TM) i5-3337U @ 1.80GHz 1.80GHz, Windows 10, 6.0GB Memory. I expected StringBuilder to be the fastest, but it wasn't. Perhaps the bottleneck is in System.Security.SecurityElement.Escape().
The conclusion: DataContractSerializer is within the requirement; its processing time is 30-45% of XmlSerializer's. The results may differ depending on the environment, so you should run your own benchmark.
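For running your own benchmark, a small Stopwatch harness is enough to get a first impression. This is only a sketch: BenchEmployee is a cut-down, made-up stand-in for the real Employee class, SerializerTiming and its method names are hypothetical, and the loop reuses one StringBuilder instead of accumulating all output, so the numbers will not be comparable to the table above.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical, reduced version of the Employee POCO.
public class BenchEmployee
{
    public string FirstName { get; set; }
    public DateTime Date01 { get; set; }
}

public static class SerializerTiming
{
    // One serializer instance is created up front and reused for every call.
    static readonly XmlSerializer Serializer = new XmlSerializer(typeof(BenchEmployee));

    public static void WithXmlSerializer(BenchEmployee e, StringBuilder sb)
    {
        using (XmlWriter writer = XmlWriter.Create(sb))
        {
            Serializer.Serialize(writer, e);
        }
    }

    public static void WithXmlWriter(BenchEmployee e, StringBuilder sb)
    {
        using (XmlWriter writer = XmlWriter.Create(sb))
        {
            writer.WriteStartElement("BenchEmployee");
            // WriteElementString escapes reserved characters such as '&'.
            writer.WriteElementString("FirstName", e.FirstName);
            writer.WriteElementString("Date01",
                XmlConvert.ToString(e.Date01, XmlDateTimeSerializationMode.RoundtripKind));
            writer.WriteEndElement();
        }
    }

    // Runs the given serializer 'count' times and returns the elapsed time.
    public static TimeSpan Measure(int count, Action<BenchEmployee, StringBuilder> serialize)
    {
        var employee = new BenchEmployee { FirstName = "A & B", Date01 = new DateTime(2020, 1, 2) };
        var sb = new StringBuilder();
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            sb.Clear();
            serialize(employee, sb);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Usage would be along the lines of Console.WriteLine(SerializerTiming.Measure(100000, SerializerTiming.WithXmlSerializer)), repeated for each candidate. It is worth calling each method once before timing so that one-off start-up costs are not counted.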

Serializing an object with a string property containing double quotes

I have an object with a string property whose value contains double quotes. I need to serialize this object and then use that XML. I won't be deserializing this xml.
I am having trouble getting the right content in the XML file. Let me explain with a code sample:
[Serializable]
public class Test {
[XmlElement]
public string obj { get; set; }
}
class Program {
static void Main(string[] args) {
var st ="Priority == \"1\"";
Test test = new Test();
test.obj = st;
//Serialize this object
XmlSerializer xsSubmit = new XmlSerializer(typeof(Test));
StringWriter sww = new StringWriter();
XmlWriter writer = XmlWriter.Create(sww, new XmlWriterSettings {
OmitXmlDeclaration = true
});
var ns = new XmlSerializerNamespaces();//just to make things simpler here
ns.Add(string.Empty, string.Empty);
xsSubmit.Serialize(writer, test, ns);
//My XML
var xml = sww.ToString();
}
}
I need my xml to be:
<Test><obj>Priority=="1"</obj></Test>
I now get:
<Test><obj>Priority==\"1\"</obj></Test>
I even tried to HTML-encode the string using var html = HttpUtility.HtmlEncode(st);
In this case, the variable html is in the right format; however, on serializing I get:
<Test><obj>Priority==&quot;1&quot;</obj></Test>
Need some help please.
There was no issue with the code.
I actually get <Test><obj>Priority=="1"</obj></Test> and this is fine. The mistake I was making was that I was reading the value in the debugger, which displays the string with its quotes escaped. When I write it out somewhere, the content is in the correct format.
The " didn't get converted to &quot; because double quotes are accepted as such in the text content of an XML element. I can work with that in this case!
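To check this outside the debugger, the serialized string can be compared or printed directly. A minimal sketch, assuming a class named QuoteSample (a made-up stand-in for the Test class above, mapped to the same root element name):

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the Test class in the question.
[XmlRoot("Test")]
public class QuoteSample
{
    [XmlElement]
    public string obj { get; set; }
}

public static class QuoteDemo
{
    public static string Serialize(QuoteSample value)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var ns = new XmlSerializerNamespaces();
        ns.Add(string.Empty, string.Empty); // suppress the xsi/xsd declarations

        using (var sww = new StringWriter())
        using (var writer = XmlWriter.Create(sww, settings))
        {
            new XmlSerializer(typeof(QuoteSample)).Serialize(writer, value, ns);
            writer.Flush();
            return sww.ToString();
        }
    }
}
```

Console.WriteLine(QuoteDemo.Serialize(new QuoteSample { obj = "Priority == \"1\"" })) prints the element with plain double quotes. The backslashes only ever exist in the C# source literal and in the debugger's display of the string, not in the string itself.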

XmlWriter.WriteRaw escapes xml when the writer was created via XElement.CreateWriter

I have noticed that XmlWriter.WriteRaw appears to not work properly (it escapes xml characters) when the writer is created using XElement.CreateWriter. The below test case reproduces the problem. Is my usage incorrect? Does anyone know how to achieve the desired behavior? I need to be able to write a raw xml string to an XmlWriter and incorporate that xml into an XElement.
[Test]
public void XElementWriterTest()
{
var xelement = new XElement("test");
using (var writer = xelement.CreateWriter())
{
writer.WriteRaw(@"<some raw='xml' />");
}
Assert.That(xelement.ToString(), Is.EqualTo(@"<test><some raw='xml' /></test>"));
// actual : "<test>&lt;some raw='xml' /&gt;</test>"
}
Is XElement.Parse() an option for you at all?
[TestMethod]
public void XElementWriterTest()
{
var xelement = new XElement("test");
const string newXML = @"<some raw='xml' />";
var child = XElement.Parse(newXML);
xelement.Add(child);
Assert.AreEqual(xelement.ToString(SaveOptions.DisableFormatting), @"<test><some raw=""xml"" /></test>");
}

Does static XML Serializer in C# cause memory over grow?

I just can't find a simple answer to this simple question I have from Dr Google. I have the following serializing function which I put in a static module. It is called many times by my application to serialize lots of XML files. Will this cause memory to over grow? (Ignore the text write part of the code)
public static void SerializeToXML<T>(String inFilename,T t)
{
XmlSerializer serializer = new XmlSerializer(t.GetType());
string FullName = inFilename;
TextWriter textWriter = new StreamWriter(FullName);
serializer.Serialize(textWriter, t);
textWriter.Close();
textWriter.Dispose();
}
Will this cause memory to over grow?
No, memory will not grow because of this. static only lets you call the SerializeToXML method without creating a new instance of the class, nothing else.
So if you're calling this method many times, you even reduce memory usage slightly by using a static method.
Although you asked to ignore the text-writing part, you should wrap disposable resources such as the StreamWriter in a using statement:
public static void SerializeToXML<T>(String inFilename,T t)
{
XmlSerializer serializer = new XmlSerializer(t.GetType());
string FullName = inFilename;
using (TextWriter textWriter = new StreamWriter(FullName))
{
serializer.Serialize(textWriter, t);
textWriter.Close();
}
}
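One related point: the XmlSerializer(Type) constructor used above reuses its generated serialization assembly internally, so calling it repeatedly should not make memory grow. Some of the other constructor overloads (for example the ones taking an XmlRootAttribute) generate a new dynamic assembly on every call, and those assemblies are never unloaded. If such an overload is ever needed, the serializer instances can be cached. A minimal sketch, with XmlSerializerCache and its members being made-up names:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.Serialization;

// Hypothetical helper: keeps one XmlSerializer per type for the lifetime of
// the application.
public static class XmlSerializerCache
{
    static readonly ConcurrentDictionary<Type, XmlSerializer> Cache =
        new ConcurrentDictionary<Type, XmlSerializer>();

    // Returns the cached serializer for the type, creating it on first use.
    public static XmlSerializer For(Type type)
    {
        return Cache.GetOrAdd(type, t => new XmlSerializer(t));
    }

    public static void SerializeToXml<T>(string fileName, T value)
    {
        using (TextWriter writer = new StreamWriter(fileName))
        {
            For(value.GetType()).Serialize(writer, value);
        }
    }
}
```

With the plain XmlSerializer(Type) constructor the cache mainly saves repeated constructor calls; it only becomes important for memory once a non-caching overload is placed inside the factory lambda.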
