Maintaining object relational mapping with serialization - c#

I am trying to figure out a way to make some of my database objects serializable to and from XML files.
I am using an Entity Framework data model for my objects and making them available to my client using WCF RIA Services. I want to be able to take a given object from the database and serialize it to an XML file, and vice-versa.
In the past I have tried this and the problems I run into are as follows:
If I implement IXmlSerializable for each object, then at the time of deserialization each object knows nothing of the other objects being deserialized. It is in a kind of bubble and it has no way of resolving a foreign key ID to an object reference.
For the above problem, the only solution I found was to write one big serialization and deserialization method in which a parent object keeps track of references and assigns them as needed. This feels like a very bad way of doing it, since I have to maintain this large method every time an object changes, instead of each object being responsible for its own serialization.
The standard XML design of nesting objects inside each other does not work well for ORM models. The reason is that some objects may have references to and be used by multiple other objects, so I can't create those objects as sub-elements of a parent object.
Consider the following XML:
<User Name="John Smith">
<FavoriteMovies>
<Movie Name="The Big Lebowski" Year="1998" ... />
</FavoriteMovies>
</User>
<User Name="Robert Jones">
<FavoriteMovies>
<Movie Name="The Big Lebowski" Year="1998" ... />
</FavoriteMovies>
</User>
Clearly I shouldn't have two instances of the same movie. Rather the serialization should look something like this:
<User Name="John Smith">
<FavoriteMovies>
<Id>5</Id>
</FavoriteMovies>
</User>
<User Name="Robert Jones">
<FavoriteMovies>
<Id>5</Id>
</FavoriteMovies>
</User>
<Movies>
<Movie Id="5" Name="The Big Lebowski" Year="1998" ... />
</Movies>
WCF already knows how to serialize and deserialize my objects into SOAP/JSON/etc. using Data Services. Is that something I can just re-use when serializing to an XML file?
It occurs to me that relying on a database foreign key ID probably won't work since in many cases the objects will have the default ID. WCF manages to serialize the objects without relying on these being set, and the IDs are only assigned once it gets saved to the SQL database.

I'm not familiar with EF or how its object model works, but for most objects that are successfully serialized and deserialized over WCF, you can just use the DataContractSerializer directly. Refer to this article for a simple walkthrough.
Since you will not need the XML to be interoperable, you can probably also use the preserveObjectReferences setting to avoid redundant data.
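As a minimal sketch of the suggestion above, the following assumes .NET 4.5 or later, where DataContractSerializerSettings exposes a PreserveObjectReferences property. The Movie and User classes and the GraphXml helper are hypothetical stand-ins for the question's entities, not the actual EF model.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical stand-ins for the entities in the question.
[DataContract]
public class Movie
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int Year { get; set; }
}

[DataContract]
public class User
{
    [DataMember] public string Name { get; set; }
    [DataMember] public List<Movie> FavoriteMovies { get; set; }
}

public static class GraphXml
{
    // PreserveObjectReferences makes the serializer write each object once
    // and emit a reference wherever the same instance appears again.
    private static DataContractSerializer CreateSerializer()
    {
        var settings = new DataContractSerializerSettings { PreserveObjectReferences = true };
        return new DataContractSerializer(typeof(List<User>), settings);
    }

    public static string Serialize(List<User> users)
    {
        using (var stream = new MemoryStream())
        {
            CreateSerializer().WriteObject(stream, users);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static List<User> Deserialize(string xml)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (List<User>)CreateSerializer().ReadObject(stream);
        }
    }
}
```

If two users share one Movie instance before serialization, they should share one instance again after deserialization, without relying on database IDs. The resulting XML is specific to DataContractSerializer and is not meant to be interoperable.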

I highly recommend the Json.NET serializer (http://json.codeplex.com/) over DataContractSerializer, since the latter is a bit buggy. You can also look at this question, for example, for more on why the Json.NET serializer is better.

Related

Extend .NET XmlSerializer?

The question is simple as stated in the title.
I need to somehow extend the existing .NET XMLSerializer.
The reasons are
It already does so many things. I do not want to reinvent the wheel.
I need to deserialize a bunch of tags to a key/value pair. E.g.
<City>
<Suburb1>Test1</Suburb1>
<Suburb2>Test2</Suburb2>
</City>
This needs to be deserialized to
List<KeyValuePair<string, string>> suburbsList = new List<KeyValuePair<string, string>>();
suburbsList.Add(new KeyValuePair<string, string>("Suburb1", "Test1"));
suburbsList.Add(new KeyValuePair<string, string>("Suburb2", "Test2"));
The example here is simplified, so having strongly typed properties is not an option.
I tried UnknownNode event from the XmlSerializer class but it does not seem to do what I'm intending. TBH, I have not explored a lot on this though.
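One possible approach, sketched here as an assumption rather than a confirmed answer, is to let XmlSerializer collect unmapped child elements through the XmlAnyElement attribute and then project them into key/value pairs. The City class and CityLoader helper below are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot("City")]
public class City
{
    // Child elements with no matching member are collected here by XmlSerializer.
    [XmlAnyElement]
    public XmlElement[] UnknownElements;

    // Projection of the collected elements into key/value pairs.
    [XmlIgnore]
    public List<KeyValuePair<string, string>> Suburbs
    {
        get
        {
            var pairs = new List<KeyValuePair<string, string>>();
            if (UnknownElements != null)
            {
                foreach (XmlElement element in UnknownElements)
                {
                    pairs.Add(new KeyValuePair<string, string>(element.LocalName, element.InnerText));
                }
            }
            return pairs;
        }
    }
}

public static class CityLoader
{
    public static City Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(City));
        using (var reader = new StringReader(xml))
        {
            return (City)serializer.Deserialize(reader);
        }
    }
}
```

This keeps the rest of the class under normal XmlSerializer control; only the elements that have no strongly typed member end up in the key/value list.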

How to use flattened ViewModel for a Web API method

I am creating a Web API service that acts as a facade for my clients to a more complex messaging API on the backend. The .XSD that represents the calls I need to make to the backend API is obviously not something I want them to understand. My goal is to flatten out the required elements in a ViewModel class that can be used by the client. My POST might be something like below:
public HttpResponseMessage Post(FlattenedViewModel flattenedViewModel)
{
}
The idea of the flattened view model is to prevent my clients from having to understand any complex structuring of data to call my API. It's a lot easier to submit this (could be JSON or XML):
<PersonFirstName>John</PersonFirstName>
<PersonLastName>Smith</PersonLastName>
<PersonPhone>123-456-7890</PersonPhone>
than this:
<Person>
<Name>
<FirstName>John</FirstName>
<LastName>Smith</LastName>
</Name>
<Communication>
<Type>
<Phone>123-456-7890</Phone>
</Type>
</Communication>
</Person>
I understand creating the class structure to represent the 2nd example is not difficult and easy for all of us to understand. However, my real .XSD is about 50x this example. My goal is to provide an easier interface and ability to have a flattened view, so please use that as a constraint of this question. Imagine it like a user was entering data on a form and pressed submit; a form is like a flattened view of data to be entered.
The hurdles I am encountering are the following:
Having a node that can repeat a finite set of times is solvable. However, nodes with the following constraint on the .xsd: maxOccurs="unbounded" do not appear to be initially doable with a flattened view. Is there another way of doing this so I don't have to introduce a collection? Or can I introduce a collection but still allow the user to not have to understand a complex structure (like my 1st example)? Please provide an example of what that would look like if possible.
I have node names that are repeated among different parts of the .xsd but are unrelated. For example the node ID or Date. My solution is to append the parent node name to the value to create a property like SubmitDate or PersonID. The issue I now have is my ViewModel class property names don't match the ones of my entities that must be mapped to in the domain model. I'm using ValueInjecter, so is there any type of streamlined way I can still map properties to other classes that have different names (i.e. annotation or something)?
Any help is appreciated, thank you!
I believe the answer lies in creating custom injections for ValueInjecter to use and then simply making a call to 'InjectFrom' to invoke them:
_person.InjectFrom<CustomPersonInjection>(flattenedViewModel);
I had a quick look around for some specific examples that might help you but couldn't find anything within a reasonable time frame (they're out there though; google 'valueinjecter custom injections').
Here are some links to get you started:
Deep Cloning example: http://valueinjecter.codeplex.com/wikipage?title=Deep%20Cloning&referringTitle=Home
Custom Convention Injection: Using ValueInjecter to map between objects with different property names

Can I add [XmlElement] attribute to List members without breaking backwards compatibility?

I believe the following:
public List<Vector3> Vectors;
Will serialize out to:
<Vectors>
<Vector3>
<X>0</X>
<Y>0</Y>
<Z>0</Z>
</Vector3>
</Vectors>
I want to remove the encasing tag which I believe I can do like this:
[XmlElement("Vector3")]
public List<Vector3> Vectors;
Which should serialize to:
<Vector3>
<X>0</X>
<Y>0</Y>
<Z>0</Z>
</Vector3>
But I'm afraid that would break old XML files that are still using the "Vectors" tag around the list. Is there a common way to solve this?
EDIT: The list above would be part of a container object, so the full XML might begin with
<Container>
and end with
</Container>
I left that out originally to keep the question shorter.
I don't believe XML has any sort of built in mechanism for versioning. I think your best bet is going to be writing some external mechanism which can detect the "version" as defined by you and deserialize the old version into your new object manually. You probably will also want to define a new version member variable or property which will serialize with your object in case you run into the same problem again, because once you change the schema a 2nd time, you will have 3 versions to worry about.
You can either write a custom deserialization method by implementing IXmlSerializable on your object and defining the ReadXml/WriteXml methods, or you can use some external process to generate the new XML format from the old version. For example, load the XML file into an XmlDocument first, fix it how you want (i.e. move the Vector3 nodes up a level and remove the Vectors node), then take the document's OuterXml value as a string and deserialize it via a MemoryStream.
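The XmlDocument approach described above might look roughly like this. It is a sketch under assumptions: the Container and Vector3 shapes are guessed from the question, ContainerLoader is a hypothetical helper, and a StringReader is used in place of a MemoryStream for brevity.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Vector3
{
    public float X;
    public float Y;
    public float Z;
}

public class Container
{
    // New format: Vector3 elements sit directly under Container.
    [XmlElement("Vector3")]
    public List<Vector3> Vectors = new List<Vector3>();
}

public static class ContainerLoader
{
    public static Container Load(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Old format: the Vector3 elements are wrapped in a <Vectors> element.
        XmlNode wrapper = doc.DocumentElement.SelectSingleNode("Vectors");
        if (wrapper != null)
        {
            XmlNode parent = wrapper.ParentNode;
            // Move each child up one level, then drop the empty wrapper.
            while (wrapper.HasChildNodes)
            {
                parent.InsertBefore(wrapper.FirstChild, wrapper);
            }
            parent.RemoveChild(wrapper);
        }

        var serializer = new XmlSerializer(typeof(Container));
        using (var reader = new StringReader(doc.OuterXml))
        {
            return (Container)serializer.Deserialize(reader);
        }
    }
}
```

Files in either layout go through the same Load call, so the upgrade logic stays in one place and the class itself only describes the new format.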

Is it possible to set a default value when deserializing xml in C# (.NET 3.5)?

I've got a little problem that's slightly frustrating. Is it possible to set a default value when deserializing xml in C# (.NET 3.5)? Basically I'm trying to deserialize some xml that is not under my control and one element looks like this:
<assignee-id type="integer">38628</assignee-id>
it can also look like this:
<assignee-id type="integer" nil="true"></assignee-id>
Now, in my class I have the following property that should receive the data:
[XmlElementAttribute("assignee-id")]
public int AssigneeId { get; set; }
This works fine for the first xml element example, but the second fails. I've tried changing the property type to be int? but this doesn't help. I'll need to serialize it back to that same xml format at some point too, but I'm trying to use the built in serialization support without having to resort to rolling my own.
Does anyone have experience with this kind of problem?
It looks like your source XML is using xsi:type and xsi:nil, but not prefixing them with a namespace.
What you could do is process these with XSLT to turn this:
<assignees>
<assignee>
<assignee-id type="integer">123456</assignee-id>
</assignee>
<assignee>
<assignee-id type="integer" nil="true"></assignee-id>
</assignee>
</assignees>
into this:
<assignees xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<assignee>
<assignee-id xsi:type="integer">123456</assignee-id>
</assignee>
<assignee>
<assignee-id xsi:type="integer" xsi:nil="true" />
</assignee>
</assignees>
This would then be handled correctly by the XmlSerializer without needing any custom code. The XSLT for this is rather trivial, and a fun exercise. Start with one of the many "copy" XSLT samples and simply add a template for the "type" and "nil" attributes to output a namespaced attribute.
If you prefer you could load your XML document into memory and change the attributes but this is not a good idea as the XSLT engine is tuned for performance and can process quite large files without loading them entirely into memory.
You might want to take a look at OnDeserializedAttribute, OnSerializingAttribute, OnSerializedAttribute, and OnDeserializingAttribute to add custom logic to the serialization process.
XmlSerializer uses xsi:nil - so I expect you'd need to do custom IXmlSerializable serialization for this. Sorry.
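As an alternative to a full IXmlSerializable implementation (a different technique from the ones suggested above, shown only as a sketch), the element can be modeled as a small class whose un-namespaced "type" and "nil" attributes are ordinary attributes and whose content is text. The ticket root name and the Ticket, NullableIntElement, and TicketXml names are assumptions made for illustration.

```csharp
using System.Globalization;
using System.IO;
using System.Xml.Serialization;

public class NullableIntElement
{
    [XmlAttribute("type")] public string Type = "integer";
    [XmlAttribute("nil")] public string Nil;   // omitted from output when null
    [XmlText] public string Text;              // raw element content

    // Typed view over the raw text; not serialized itself.
    [XmlIgnore]
    public int? Value
    {
        get
        {
            if (string.IsNullOrEmpty(Text)) return null;
            return int.Parse(Text, CultureInfo.InvariantCulture);
        }
        set
        {
            Text = value.HasValue ? value.Value.ToString(CultureInfo.InvariantCulture) : null;
            Nil = value.HasValue ? null : "true";
        }
    }
}

[XmlRoot("ticket")] // hypothetical root element name
public class Ticket
{
    [XmlElement("assignee-id")]
    public NullableIntElement Assignee = new NullableIntElement();
}

public static class TicketXml
{
    public static Ticket Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Ticket));
        using (var reader = new StringReader(xml))
        {
            return (Ticket)serializer.Deserialize(reader);
        }
    }

    public static string Save(Ticket ticket)
    {
        var serializer = new XmlSerializer(typeof(Ticket));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, ticket);
            return writer.ToString();
        }
    }
}
```

Because the attributes are written back without a namespace prefix, the output should stay close to the original format, at the cost of one extra wrapper type per nullable element.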

Storing Relational Data in XML

I'm wondering what the best practices are for storing a relational data structure in XML. Particularly, I am wondering about best practices for enforcing node order. For example, say I have three objects: School, Course, and Student, which are defined as follows:
class School
{
List<Course> Courses;
List<Student> Students;
}
class Course
{
string Number;
string Description;
}
class Student
{
string Name;
List<Course> EnrolledIn;
}
I would store such a data structure in XML like so:
<School>
<Courses>
<Course Number="ENGL 101" Description="English I" />
<Course Number="CHEM 102" Description="General Inorganic Chemistry" />
<Course Number="MATH 103" Description="Trigonometry" />
</Courses>
<Students>
<Student Name="Jack">
<EnrolledIn>
<Course Number="CHEM 102" />
<Course Number="MATH 103" />
</EnrolledIn>
</Student>
<Student Name="Jill">
<EnrolledIn>
<Course Number="ENGL 101" />
<Course Number="MATH 103" />
</EnrolledIn>
</Student>
</Students>
</School>
With the XML ordered this way, I can parse Courses first. Then, when I parse Students, I can look up each Course listed in EnrolledIn (by its Number) in the School.Courses list. This will give me an object reference to add to the EnrolledIn list in Student. If Students comes before Courses, however, such a lookup to get an object reference is not possible, since School.Courses has not yet been populated.
So what are the best practices for storing relational data in XML?
- Should I enforce that Courses must always come before Students?
- Should I tolerate any ordering and create a stub Course object whenever I encounter one I have not yet seen? (To be expanded when the definition of the Course is eventually reached later.)
- Is there some other way I should be persisting/loading my objects to/from XML? (I am currently implementing Save and Load methods on all my business objects and doing all this manually using System.Xml.XmlDocument and its associated classes.)
I am used to working with relational data out of SQL, but this is my first experience trying to store a non-trivial relational data structure in XML. Any advice you can provide as to how I should proceed would be greatly appreciated.
Don't think in SQL or relational when working with XML, because there are no order constraints.
You can however query using XPath to any portion of the XML document at any time. You want the courses first, then "//Courses/Course". You want the students enrollments next, then "//Students/Student/EnrolledIn/Course".
The bottom line being... just because XML is stored in a file, don't get caught thinking all your accesses are serial.
I posted a separate question, "Can XPath do a foreign key lookup across two subtrees of an XML?", in order to clarify my position. The solution shows how you can use XPath to make relational queries against XML data.
While you can specify order of child elements using a <xsd:sequence>, by requiring child objects to come in specific order you make your system less flexible (i.e., harder to update using notepad).
Best thing to do is to parse out all your data, then perform what actions you need to do. Don't act during the parse.
Obviously, the design of the XML and the data behind it precludes serializing a single POCO to XML. You need to control the serialization and deserialization logic in order to unhook and re-hook objects together.
I'd suggest creating a custom serializer that builds the xml representation of this object graph. It can thereby control not only the order of serialization, but also handle situations where nodes aren't in the expected order. You could do other things such as adding custom attributes to use for linking objects together which don't exist as public properties on the objects being serialized.
Creating the xml would be as simple as iterating over your objects a few times, building up collections of XElements with the expected representation of the objects as xml. When you're done you can stitch them together into an XDocument and grab the xml from it. You can make multiple passes over the xml on the reverse side to re-create your object graph and restore all references.
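The read side of this multi-pass idea could be sketched as follows with LINQ to XML. The classes mirror the question's School, Course, and Student; SchoolLoader is a hypothetical helper, and the sketch assumes every enrolled course Number has a definition under Courses.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Course
{
    public string Number;
    public string Description;
}

public class Student
{
    public string Name;
    public List<Course> EnrolledIn = new List<Course>();
}

public class School
{
    public List<Course> Courses = new List<Course>();
    public List<Student> Students = new List<Student>();
}

public static class SchoolLoader
{
    public static School Load(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        var school = new School();

        // Pass 1: read every course definition, wherever it sits in the document.
        school.Courses = root.Elements("Courses").Elements("Course")
            .Select(c => new Course
            {
                Number = (string)c.Attribute("Number"),
                Description = (string)c.Attribute("Description")
            })
            .ToList();
        Dictionary<string, Course> byNumber = school.Courses.ToDictionary(c => c.Number);

        // Pass 2: read students and resolve each enrollment to the shared Course object.
        foreach (XElement s in root.Elements("Students").Elements("Student"))
        {
            var student = new Student { Name = (string)s.Attribute("Name") };
            foreach (XElement e in s.Elements("EnrolledIn").Elements("Course"))
            {
                student.EnrolledIn.Add(byNumber[(string)e.Attribute("Number")]);
            }
            school.Students.Add(student);
        }
        return school;
    }
}
```

Because the whole document is loaded before either pass runs, it does not matter whether Students appears before or after Courses in the file.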
Node ordering is only important if you need to do forward-only processing of the data, e.g. using an XmlReader or a SAX parser. If you're going to read the XML into a DOM before processing it (which you are if you're using XmlDocument), node order doesn't really matter. What matters more is that the XML be structured so that you can query it with XPath efficiently, i.e. without having to use "//".
If you take a look at the schema that the DataSetGenerator produces, you'll see that there's no ordering associated with the DataTable-level elements. It may be that ADO processes elements in some sequence not represented in the schema (e.g. one DataTable at a time), or it may be that ADO does forward-only processing and doesn't enforce relational constraints until the DataSet is fully read. I don't know. But it's clear that ADO doesn't couple the processing order to the document order.
(And yes, you can specify the order of child elements in an XML schema; that's what xs:sequence does. If you don't want node order to be enforced, you use an unbounded xs:choice.)
The order is not usually important in XML. In this case the Courses could come after Students. You parse the XML and then you make your queries on the entire data.
From experience, XML isn't the best to store relational data. Have you investigated YAML? Do you have the option?
If you don't, a safe way would be to have a strict DTD for the XML and enforce that way. You could also, as you suggest, keep a hash of objects created. That way if a Student creates a Course you keep that Course around for future updating when the tag is hit.
Also remember you can use XPath queries to access specific nodes directly, so you can enforce parsing of courses first regardless of position in the XML document. (making a more complete answer, thanks to dacracot)
You could also use two XML files, one for courses and a second for students. Open and parse the first before you do the second.
It's been a while, but I seem to remember doing a base collection of 'things' in one part of an XML file, and referring to them in another using the schema features keyref and refer. I found a few examples here. My apologies if this is not what you're looking for.
XML is definitely not a friendly place for relational data.
If you absolutely need to do this, then I'd recommend a funky inverted kind of logic.
In your example, you've got Schools, which offer many courses, taken by many students.
Your XML might follow as such:
<School>
<Students>
<Student Name="Jack">
<EnrolledIn>
<Course Number="CHEM 102" Description="General Inorganic Chemistry" />
<Course Number="MATH 103" Description="Trigonometry" />
</EnrolledIn>
</Student>
<Student Name="Jill">
<EnrolledIn>
<Course Number="ENGL 101" Description="English I" />
<Course Number="MATH 103" Description="Trigonometry" />
</EnrolledIn>
</Student>
</Students>
</School>
This obviously isn't the least repetitive way to do this (it's relational data!), but it's easily parse-able.
