How to separate deserialization attributes from my model classes?

I am currently developing an application that relies heavily on the .NET serializer for converting back and forth between objects and XML. It works fine, but embedding serialization/deserialization attributes directly into my model classes seems like a pretty poor design choice.
Is it somehow possible to separate these attributes from the class itself? An example of what I would like to achieve can be seen here
Thanks in advance and have a great day

Unfortunately the answer to this isn't as easy and straightforward as you may hope. Serializers sometimes need hints on how to map a text representation of your data to the conceptual object representation (and vice versa). This is often more true of XML than JSON because it is more structured (elements, attributes, namespaces, schema, etc). The EF fluent model builder example you referred to isn't for serialization, it's for mapping to/from a relational schema, which is quite different from XML serialization.
Even tools like JSON.NET have these kinds of attributes, which are necessary when the names of your serialized members don't quite match the properties on your object, and you don't want to write a custom converter.
If the attribute pollution really bothers you, then you can introduce another layer between the models and the XML. You can then have types which contain the attributes and serve the sole purpose of serializing to and from XML, and then use a tool like AutoMapper or ValueInjecter to convert from that layer into your model layer.
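A minimal sketch of that extra layer, assuming a hypothetical Customer model and a hand-written mapper in place of AutoMapper or ValueInjecter (all type, element, and method names here are made up for illustration). Only the XML-facing type carries attributes:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Domain model (hypothetical): carries no serialization attributes.
public class Customer
{
    public int Id { get; set; }
    public string FullName { get; set; }
}

// XML-facing type (hypothetical): exists only to describe the wire format.
[XmlRoot("customer")]
public class CustomerXml
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("full-name")]
    public string FullName { get; set; }
}

// Hand-written mapping; a mapping library could replace these two methods.
public static class CustomerXmlMapper
{
    public static CustomerXml ToXml(Customer model) =>
        new CustomerXml { Id = model.Id, FullName = model.FullName };

    public static Customer ToModel(CustomerXml xml) =>
        new Customer { Id = xml.Id, FullName = xml.FullName };

    // Serializes the model by first converting it to the XML-facing type.
    public static string Serialize(Customer model)
    {
        var serializer = new XmlSerializer(typeof(CustomerXml));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, ToXml(model));
            return writer.ToString();
        }
    }

    // Reads the XML-facing type and converts it back to the model.
    public static Customer Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(CustomerXml));
        using (var reader = new StringReader(xml))
        {
            return ToModel((CustomerXml)serializer.Deserialize(reader));
        }
    }
}
```

The cost is one extra type and one mapping per model class, which is the trade-off described above.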
I too don't always like attributes polluting my types, for example with MVC model validation attributes and especially for giving hints to EF on how my entity model maps to a relational schema. However this is one instance where I think it can be appropriate, because you get a lot out of it with a pretty minimal amount of code.
There seems to be at least one fluent XML serialization tool out there, but I'm not sure how good it is:
https://fluentlyxml.codeplex.com/
http://trycatchfail.com/blog/post/fluent-xml-serializatione28093introduction.aspx

Related

LINQ to XML, xsd.exe, custom classes

I have 3 classes that map to my database. I need to insert an xml file into the database via these classes. The xml and classes are structured differently. Should I use xsd.exe to generate the classes of the xml and then map these generated classes to my database classes? Or should I use linq to xml to directly map the xml to the classes.

My experience with XSD is that if it works for what you are using it for, it's a very convenient thing, and completely worth doing.
On the other hand, depending on how familiar you are with LINQ, you will probably end up with a better overall solution if you write the conversion directly.
XSD can be very convenient, but I'm not always a fan of how the results are spat out. Personally, I'd lean towards using LINQ.

AutoMapper 2 way mapping

I am using AutoMapper.org to map my DTO objects to Model objects in MVC4. DTO objects are retrieved from SOAP web services. The operations on the services are mostly CRUD.
This works nicely.
I have 2 questions. Firstly, is it bad practice to map both ways (2 way mapping)? That is, when I update on screen, I map the Model to a DTO, in addition to the original mapping of DTO to Model.
Second question, is it possible for AutoMapper to map enums?

I map both ways without issue - I map from the DTOs to the business objects to get the data, and map back the other way to save the data. This is so that the DTOs that are used in my WCF service are reusable (I'm using CSLA framework and the data portal model in CSLA doesn't really let 3rd parties consume the service without having access to my bizobj library).
It does mean that some of the business logic is repeated in the web layer, but since the rules are sparse this isn't a big issue.
In my case I don't think it's a bad thing. I have a very simple data model which is mostly reads, there is only the occasional time when data goes back across to be modified.
As far as I know it maps enums natively (assuming it's a direct enum to enum - since enum is just a primitive underneath), but you can always provide your own custom type converters to resolve any enum issues or if you need to do string parsing for enums.

Entities used to serialize data have changed. How can the serialized data be upgraded for the new entities?

I have a bunch of simple entity instances that I have serialized to a file. I know that the structure of these entities will change in the future (i.e., maybe I will rename Name to Header or something). The thing is, I don't want to lose the data that I have saved in all these old files. What is the proper way to either
load the data from the old entities into new entities
upgrade the old files so that they can be used with new entities
Note: I think I am stuck with binary serialization, not xml serialization.
Thanks in advance!
Edit: So I have an answer for the case I have described. I can use a DataContractSerializer and do something like
[DataMember(Name = "bar")]
private string foo;
and change the name in the code and keep the same name that was used for serialization. But what about the following additional cases:
The original entity has new members which can be serialized
Some serialized members that were in the original entity are removed
Some members have actually changed in function (suppose that the original class had a FirstName and LastName member and it has been refactored to have only a FullName member which combines the two)
To handle these, I need some sort of interpreter/translator deserialization class but I have no idea what I should use

If you have used BinaryFormatter, then note that this is a field serializer, not a property serializer; you can hack around it by not changing the field names. Unless it is an auto-implemented property, in which case you can't.
To be honest, BinaryFormatter is a poor choice if you want the flexibility to mutate the types. A contract-based serializer is much more flexible here. For example, XmlSerializer and DataContractSerializer allow you to control the names via an attribute.
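As a rough illustration of controlling names via an attribute, here is a sketch using DataContractSerializer, assuming a hypothetical Item entity whose property was renamed from Name to Header in code (the Item and ItemStore names are invented for this example). The wire name stays "Name", so data written before the rename still loads:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical entity. Namespace = "" keeps the sample XML short.
[DataContract(Name = "Item", Namespace = "")]
public class Item
{
    // Renamed in code from Name to Header; the serialized name is still "Name".
    [DataMember(Name = "Name")]
    public string Header { get; set; }
}

public static class ItemStore
{
    // Reads an Item from an XML string.
    public static Item Load(string xml)
    {
        var serializer = new DataContractSerializer(typeof(Item));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (Item)serializer.ReadObject(stream);
        }
    }

    // Writes an Item to an XML string.
    public static string Save(Item item)
    {
        var serializer = new DataContractSerializer(typeof(Item));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, item);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

This only covers a pure rename. Added, removed, or merged members need more than an attribute.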
If you want binary, I would go with protobuf-net (perhaps because I wrote it...) - here there are no names - just numeric identifiers. But the protobuf format was designed by Google specifically to allow painless upgrade of APIs.
Of course, you might also look at a DTO as a permanent contract; in which case consider having a v1 DTO, a v2 DTO etc. Not the way I tend to do it myself, but definitely an option.

No matter what serialization mechanism you use, renaming a property is a breaking change. The problem with binary serialization is that you cannot easily upgrade the files to the new format which would be an easier task with text format serialization.

You'll need to write a program that
deserializes the data into the old version of the entity
transforms the old version of the entity into the new version of the entity
serializes the new version of the entity back to the file.
If you've serialized it to XML, you could probably write an XSLT to make the required changes directly.
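The three steps above can be sketched roughly as follows. This uses XmlSerializer as a stand-in for whatever serializer the files were actually written with, and the PersonV1, PersonV2, and PersonMigration names are hypothetical. It follows the FirstName/LastName to FullName example from the question:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Old shape, kept only so existing files can still be read (hypothetical).
[XmlRoot("Person")]
public class PersonV1
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// New shape used by the current code (hypothetical).
[XmlRoot("Person")]
public class PersonV2
{
    public string FullName { get; set; }
}

public static class PersonMigration
{
    // Step 2: transform the old entity into the new one.
    public static PersonV2 Upgrade(PersonV1 old) =>
        new PersonV2 { FullName = (old.FirstName + " " + old.LastName).Trim() };

    public static string UpgradeXml(string oldXml)
    {
        var oldSerializer = new XmlSerializer(typeof(PersonV1));
        var newSerializer = new XmlSerializer(typeof(PersonV2));

        // Step 1: deserialize into the old version of the entity.
        PersonV1 old;
        using (var reader = new StringReader(oldXml))
        {
            old = (PersonV1)oldSerializer.Deserialize(reader);
        }

        // Step 3: serialize the new version back out.
        using (var writer = new StringWriter())
        {
            newSerializer.Serialize(writer, Upgrade(old));
            return writer.ToString();
        }
    }
}
```

Keeping the old type around under a versioned name is what makes step 1 possible after the refactoring.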

Serialization vs LINQ

I am currently writing an application to manage some customers. The customers have some relations like orders. You can imagine this like the northwind database. I want to save the data in an xml file. My application should read, modify and save the data. I think, there are two approaches. The first approach is to save, read and modify the data with the XmlSerializer class. The second approach is to do the operations with LINQ-to-XML. All of my classes are written in simple C# classes. So, I am not sure. What do you think? What should I use for my needs?
Thanks in advance!

LINQ to XML is good for querying XML Documents.
If you're serializing/de-serializing an object, I would leave that to the XmlSerializer class.
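To make the split concrete, here is a small sketch assuming hypothetical Customer and Order classes (CustomerFile and OrderIdsOver are also invented names). XmlSerializer handles the object round trip, while LINQ to XML answers a question about the document without building objects:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical plain classes, similar to what the question describes.
public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public List<Order> Orders { get; set; } = new List<Order>();
}

public static class CustomerFile
{
    // Object to XML: left to XmlSerializer.
    public static string ToXml(Customer customer)
    {
        var serializer = new XmlSerializer(typeof(Customer));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, customer);
            return writer.ToString();
        }
    }

    // XML to object: also left to XmlSerializer.
    public static Customer FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Customer));
        using (var reader = new StringReader(xml))
        {
            return (Customer)serializer.Deserialize(reader);
        }
    }

    // Querying the document: LINQ to XML, no Customer instance needed.
    public static List<int> OrderIdsOver(string xml, decimal threshold)
    {
        return XDocument.Parse(xml)
            .Descendants("Order")
            .Where(o => (decimal)o.Element("Total") > threshold)
            .Select(o => (int)o.Element("Id"))
            .ToList();
    }
}
```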

Are you really, really sure that you want to use XML for this purpose? You can use SQL Server Compact Edition to have SQL Server capabilities that are built into your compiled application with no external footprint. A database is really a much better choice than using XML in the way that you are describing.

There is much to consider when doing serialization. Even though the following is related to C++, it discusses some of the complications of serialization and what data structures to use when serializing.
http://www.parashift.com/c++-faq-lite/serialization.html
Also if another application is going to be using the output serialization, I would avoid using XmlSerializer class and construct my own schema with data migration and backwards-compatibility in mind.

Use LINQ to XML, so that if you later change your mind and move to EF or LINQ to SQL, the switch will be easy.

I would recommend that you use the DataContractSerializer instead of the XmlSerializer. The XmlSerializer is still supported; however, only critical bugs are being fixed (see the comment to XSD.EXE - Incorrect Class Generated for Abstract Type With Derived Types on 10/1/2008):
Unfortunately, we're only proactively fixing the most critical customer impactful issues in XmlSerializer and xsd.exe. If this issue is causing business impact please contact Microsoft Product Support Services and we will be happy to explore various options.
The only downside, if it is one, is that the XmlSerializer allows you detailed control over the format of the XML. If you are only going to use the XML for your own purposes, then this doesn't matter, and the improved speed and feature set of the DataContractSerializer should be attractive to you.
BTW, it allows you "the best of both worlds". It can serialize data to a binary form of XML, which is more compact, but which is still XML, and can be read in by the XmlDictionaryReader class (which is an XmlReader).
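A short sketch of the binary XML route, assuming a hypothetical Customer contract (the BinaryXml helper class is also invented for this example). It pairs DataContractSerializer with XmlDictionaryWriter.CreateBinaryWriter and XmlDictionaryReader.CreateBinaryReader:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical contract type.
[DataContract]
public class Customer
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int OrderCount { get; set; }
}

public static class BinaryXml
{
    // Serializes to the binary XML encoding instead of text XML.
    public static byte[] Write(Customer customer)
    {
        var serializer = new DataContractSerializer(typeof(Customer));
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream))
            {
                serializer.WriteObject(writer, customer);
            } // disposing the writer flushes pending output to the stream
            return stream.ToArray();
        }
    }

    // Reads the binary XML back through an XmlDictionaryReader.
    public static Customer Read(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(Customer));
        using (var reader = XmlDictionaryReader.CreateBinaryReader(data, XmlDictionaryReaderQuotas.Max))
        {
            return (Customer)serializer.ReadObject(reader);
        }
    }
}
```

XmlDictionaryReaderQuotas.Max is used here for brevity; tighter quotas are a sensible choice when the input is not trusted.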

Plain Old CLR Object vs Data Transfer Object

POCO = Plain Old CLR (or better: Class) Object
DTO = Data Transfer Object
In this post there is a difference, but frankly most of the blogs I read describe POCO in the way DTO is defined: DTOs are simple data containers used for moving data between the layers of an application.
Are POCO and DTO the same thing?

A POCO follows the rules of OOP. It should (but doesn't have to) have state and behavior. POCO comes from POJO, coined by Martin Fowler [anecdote here]. He used the term POJO as a way to make it more sexy to reject the framework heavy EJB implementations. POCO should be used in the same context in .Net. Don't let frameworks dictate your object's design.
A DTO's only purpose is to transfer state, and should have no behavior. See Martin Fowler's explanation of a DTO for an example of the use of this pattern.
Here's the difference: POCO describes an approach to programming (good old fashioned object oriented programming), where DTO is a pattern that is used to "transfer data" using objects.
While you can treat POCOs like DTOs, you run the risk of creating an anemic domain model if you do so. Additionally, there's a mismatch in structure, since DTOs should be designed to transfer data, not to represent the true structure of the business domain. The result of this is that DTOs tend to be more flat than your actual domain.
In a domain of any reasonable complexity, you're almost always better off creating separate domain POCOs and translating them to DTOs. DDD (domain driven design) defines the anti-corruption layer (another link here, but best thing to do is buy the book), which is a good structure that makes the segregation clear.
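As a rough sketch of that separation, assuming a hypothetical order domain (Order, OrderLine, OrderSummaryDto, and OrderTranslator are invented names): the domain POCO owns state and behavior and protects its invariants, while the DTO is a flat property bag produced by a translation step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Domain POCO (hypothetical): state plus behavior.
public class OrderLine
{
    public OrderLine(string product, int quantity, decimal unitPrice)
    {
        Product = product;
        Quantity = quantity;
        UnitPrice = unitPrice;
    }

    public string Product { get; }
    public int Quantity { get; }
    public decimal UnitPrice { get; }
}

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    public Order(int id, string customerName)
    {
        Id = id;
        CustomerName = customerName;
    }

    public int Id { get; }
    public string CustomerName { get; }
    public IReadOnlyList<OrderLine> Lines => _lines;
    public decimal Total => _lines.Sum(l => l.Quantity * l.UnitPrice);

    // Behavior: the domain object enforces its own rule.
    public void AddLine(string product, int quantity, decimal unitPrice)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException(nameof(quantity));
        _lines.Add(new OrderLine(product, quantity, unitPrice));
    }
}

// DTO (hypothetical): flat, no behavior, exists only to move data.
public class OrderSummaryDto
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public int LineCount { get; set; }
    public decimal Total { get; set; }
}

// Translation from the domain shape to the transfer shape.
public static class OrderTranslator
{
    public static OrderSummaryDto ToDto(Order order) => new OrderSummaryDto
    {
        Id = order.Id,
        CustomerName = order.CustomerName,
        LineCount = order.Lines.Count,
        Total = order.Total
    };
}
```

Note that the DTO is flatter than the domain object: the line collection is reduced to a count and a total.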

It's probably redundant for me to contribute since I already stated my position in my blog article, but the final paragraph of that article kind of sums things up:
So, in conclusion, learn to love the POCO, and make sure you don’t spread any misinformation about it being the same thing as a DTO. DTOs are simple data containers used for moving data between the layers of an application. POCOs are full fledged business objects with the one requirement that they are Persistence Ignorant (no get or save methods). Lastly, if you haven’t checked out Jimmy Nilsson’s book yet, pick it up from your local university stacks. It has examples in C# and it’s a great read.
BTW, Patrick I read the POCO as a Lifestyle article, and I completely agree, that is a fantastic article. It's actually a section from the Jimmy Nilsson book that I recommended. I had no idea that it was available online. His book really is the best source of information I've found on POCO / DTO / Repository / and other DDD development practices.

POCO is simply an object that does not take a dependency on an external framework. It is PLAIN.
Whether a POCO has behaviour or not is immaterial.
A DTO may be POCO as may a domain object (which would typically be rich in behaviour).
DTOs are more likely to take dependencies on external frameworks (e.g. attributes) for serialisation purposes, as they typically exist at the boundary of a system.
In typical Onion style architectures (often used within a broadly DDD approach) the domain layer is placed at the centre and so its objects should not, at this point, have dependencies outside of that layer.

I wrote an article for that topic: DTO vs Value Object vs POCO.
In short:
DTO != Value Object
DTO ⊂ POCO
Value Object ⊂ POCO

I think a DTO can be a POCO. DTO is more about the usage of the object while POCO is more of the style of the object (decoupled from architectural concepts).
One example where a POCO is something different from a DTO is when you're talking about POCOs inside your domain model/business logic model, which is a nice OO representation of your problem domain. You could use the POCOs throughout the whole application, but this could have undesirable side effects such as knowledge leaks. DTOs are, for instance, used by the service layer that the UI communicates with: they are flat representations of the data, used only for providing the UI with data and communicating changes back to the service layer. The service layer is in charge of mapping the DTOs both ways to the POCO domain objects.
Update Martin Fowler said that this approach is a heavy road to take, and should only be taken if there is a significant mismatch between the domain layer and the user interface.

TL;DR:
A DTO describes the pattern of state transfer. A POCO doesn't describe much of anything except that there is nothing special about it. It's another way of saying "object" in OOP. It comes from POJO (Java), coined by Martin Fowler who literally just describes it as a fancier name for 'object' because 'object' isn't very sexy and people were avoiding it as such.
Expanding...
Okay, to explain this in a far more high-brow way than I ever thought would be needed, beginning with your original question about DTOs:
A DTO is an object pattern used to transfer state between layers of concern. They can have behavior (i.e. can technically be a poco) so long as that behavior doesn't mutate the state. For example, it may have a method that serializes itself. For it to be a proper DTO, it needs to be a simple property bag; it needs to be clear that this object is not a strong model, it has no implied semantic meaning, and it doesn't enforce any form of business rule or invariant. It literally only exists to move data around.
A POCO is a plain object, but what is meant by 'plain' is that it is not special and does not have any specific requirements or conventions. It just means it's a CLR object with no implied pattern to it. A generic term. I've also heard it extended to describe the fact that it also isn't made to work with some other framework. So if your POCO has a bunch of EF decorations all over its properties, for example, then I'd argue that it isn't a simple POCO and that it's more in the realm of DAO, which I would describe as a combination of DTO and additional database concerns (e.g. mapping, etc.). POCOs are free and unencumbered like the objects you learn to create in school.
Here are some examples of different kinds of object patterns to compare:
View Model: used to model data for a view. Usually has data annotations to assist binding and validation for particular view (i.e. generally NOT a shared object), or in this day and age, a particular view component (e.g. React). In MVVM, it also acts as a controller. It's more than a DTO; it's not transferring state, it's presenting it or more specifically, forming that state in a way that is useful to a UI.
Value Object: used to represent values, should be immutable
Aggregate Root: used to manage state and invariants. should not allow references to internal entities other than by ID
Handlers: used to respond to an event/message.
Attributes: used as decorations to deal with cross-cutting concerns. May only be allowed to be used on certain objects levels (e.g. property but not class, method but not property, etc.)
Service: used to perform complex tasks. Typically some form of facade.
Controller: used to control flow of requests and responses. Typically restricted to a particular protocol or acts as some sort of mediator; it has a particular responsibility.
Factory: used to configure and/or assemble complex objects for use when a constructor isn't good enough. Also used to make decisions on which objects need to be created at runtime.
Repository/DAO: used to access data. Typically exposes CRUD operations or is an object that represents the database schema; may be marked up with implementation specific attributes. In fact, one of these schema DAO objects is actually another kind of DTO...
API Contracts: Likely to be marked up with serialization attributes. Typically needs to have public getters and setters and should be lightweight (not an overly complex graph); methods unrelated to serialization are not typical and discouraged.
These can be seen as just objects, but notice that most of them are generally tied to a pattern or have implied restrictions. So you could call them "objects" or you could be more specific about its intent and call it by what it is. This is also why we have design patterns; to describe complex concepts in a few words. DTO is a pattern. Aggregate root is a pattern, View Model is a pattern (e.g. MVC & MVVM).
A POCO doesn't describe a pattern. It is just a different way of referring to classes/objects in OOP which could be anything. Think of it as an abstract concept; they can be referring to anything. IMO, there's a one-way relationship though because once an object reaches the point where it can only serve one purpose cleanly, it is no longer a POCO. For example, once you mark up your class with decorations to make it work with some framework (i.e. 'instrumenting' it), it is no longer a POCO. Therefore I think there are some logical relationships like:
A DTO is a POCO (until it is instrumented)
A POCO might not be a DTO
A View Model is a POCO (until it is instrumented)
A POCO might not be View Model
The point of making a distinction between the two is to keep patterns clear and consistent, so that concerns don't cross and lead to tight coupling. For example, suppose you have a business object that has methods to mutate state, but is also decorated to hell with EF decorations for saving to SQL Server AND JsonProperty so that it can be sent back over an API endpoint. That object would be intolerant to change, and would likely be littered with variants of properties (e.g. UserId, UserPk, UserKey, UserGuid, where some of them are marked up to not be saved to the DB and others marked up to not be serialized to JSON at the API endpoint).
So if you were to tell me something was a DTO, then I'd probably make sure it was never used for anything other than moving state around. If you told me something was a view model, then I'd probably make sure it wasn't getting saved to a database, and I'd know that it's ok to put 'hacky' things in there to make sure the data is usable by a UI. If you told me something was a Domain Model, then I'd probably make sure it had no dependencies on anything outside of the domain and certainly no dependencies on any technical implementation details (databases, services etc.), only abstractions. But if you told me something was a POCO, you wouldn't really be telling me much at all other than it is not and should not be instrumented.
History
Paraphrased from Fowler's explanation: In a world where objects were fancy (e.g. followed a particular pattern, had instrumentation etc.), it somehow encouraged people to avoid using not-fancy objects to capture business logic. So they gave it a fancy name: POJO. If you want an example, the one he refers to is an "Entity Bean", which is one of those kinds of objects that have very specific conventions and requirements, etc. If you don't know what that is --> Java Beans.
In contrast, a POJO/POCO is just the regular ole object that you'd learn how to create in school.

A primary use case for a DTO is in returning data from a web service. In this instance, POCO and DTO are equivalent. Any behavior in the POCO would be removed when it is returned from a web service, so it doesn't really matter whether or not it has behavior.

DTO objects are used to deserialize data into objects from different sources. Those objects are NOT your Model (POCO) objects. You need to transform those objects into your Model (POCO) objects. The transformation is mostly a copy operation. You can fill those POCO objects directly from the source if it's an internal source, but it's not advisable if it's an external source. External sources have APIs with descriptions of the schema they use. It's much easier then to load the request data into a DTO and after that transform it into your POCOs. Yes, it's an extra step, but with a reason. The rule is to load the data from your source into an object. It can be JSON, XML, whatever. Once loaded, transform that data into what you need in your model. So most of the time the DTO is an object image of the external source. Sometimes you even get the schemas from the source providers, and then you can deserialize even more easily; XML works like that with XSDs.

here is the general rule: DTO==evil and indicator of over-engineered software. POCO==good. 'enterprise' patterns have destroyed the brains of a lot of people in the Java EE world. please don't repeat the mistake in .NET land.

Don't even call them DTOs. They're called Models....Period. Models never have behavior. I don't know who came up with this dumb term DTO but it must be a .NET thing is all I can figure. Think of view models in MVC, same dam** thing, models are used to transfer state between layers server side or over the wire, period, they are all models. Properties with data. These are models you pass over the wire. Models, Models, Models. That's it.
I wish the stupid term DTO would go away from our vocabulary.
