I have two separate programs that need to share information. This sharing will be done by one app placing an XML serialized object in a database, and the other app retrieving it on a different machine. The objects share the same variables but the properties and methods are different.
How exact do the classes have to match between the two programs?
Is the match line by line or just variable, property, and method names?
I ended up using the Newtonsoft.Json library instead of XML, with the <JsonObject(MemberSerialization.OptIn)> and JsonProperty() attributes to control what got serialized.
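For readers who want to see what that opt-in setup looks like, here is a minimal C# sketch (the attribute syntax quoted above is the VB.NET form). It assumes the Newtonsoft.Json package is referenced; the Unit class, its members and the JSON names "id" and "name" are made up for illustration.

```csharp
using Newtonsoft.Json;

// With OptIn, only members marked [JsonProperty] are written or read.
[JsonObject(MemberSerialization.OptIn)]
public class Unit
{
    [JsonProperty("id")]
    public int Id { get; set; }

    [JsonProperty("name")]
    public string Name { get; set; }

    // No attribute, so this member never reaches the JSON.
    public string CachedDisplayText { get; set; }
}

public static class OptInExample
{
    public static string ToJson(Unit unit) => JsonConvert.SerializeObject(unit);

    public static Unit FromJson(string json) => JsonConvert.DeserializeObject<Unit>(json);
}
```

The receiving program can declare its own class with completely different methods and extra members, as long as the members it wants filled carry the same JSON names.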
You did not specify which kind of serialization you were after.
The standard .NET binary serializer is not well suited to exchanging data between two different assemblies. When you go to deserialize, you'll get an error similar to [Culture].[Assembly].[Version].SourceClass cannot be deserialized to [Culture].[Assembly].[Version].DestClass. This will happen even if the classes are identical.
There are several ways around this: A) use the same service DLL on both sides to do the serializing; B) trick it into deserializing by using an override to report a matching Culture-Assembly-Version-Class, but that seems dodgy; or C) use XML serialization, which makes for very wordy (though human-readable) output.
For binary serialization, rather than the .NET binary formatter, there is protobuf-net, which is faster, produces much smaller output and uses nearly identical syntax.
How exact do the classes have to match between the two programs?
ProtoBuf uses a numeric index rather than property name, so they shouldn't have to be too similar. Of course there has to be some similarity or the destination may not have a clue what the data represents. The code in the class can be quite different because it stays put.
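A small sketch of the numeric-index idea, assuming the protobuf-net package is referenced. The two record classes and their member names are hypothetical; the point is only that the tag numbers match while the names do not.

```csharp
using System.IO;
using ProtoBuf;

// Class used by the sending program.
[ProtoContract]
public class SenderRecord
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

// Class used by the receiving program: different member names, same tags.
[ProtoContract]
public class ReceiverRecord
{
    [ProtoMember(1)] public int Identifier { get; set; }
    [ProtoMember(2)] public string Label { get; set; }
}

public static class ProtoExample
{
    // Serializes with one class and deserializes with the other,
    // standing in for the trip through the database.
    public static ReceiverRecord Transfer(SenderRecord record)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, record);
            stream.Position = 0;
            return Serializer.Deserialize<ReceiverRecord>(stream);
        }
    }
}
```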
Serialization stores only the data for an object - member variables, properties, etc. As long as the data types are compatible, it should work. You do not need a line by line match for the functions.
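To make that concrete for the XML case in the question, here is a sketch using XmlSerializer from the framework. The class and member names are invented. One assumption worth calling out: XmlSerializer uses the class name as the root element name by default, so two differently named classes need a matching [XmlRoot] (or the same class name) to line up.

```csharp
using System.IO;
using System.Xml.Serialization;

// Sending program's class.
[XmlRoot("Unit")]
public class SenderUnit
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string InternalNote { get; set; }   // the receiver has no such member

    public void Validate() { /* sender-only logic, never serialized */ }
}

// Receiving program's class: same data names, different methods and extras.
[XmlRoot("Unit")]
public class ReceiverUnit
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool IsNew { get; set; }            // absent from the XML, keeps its default

    public string Describe() => Id + ": " + Name;
}

public static class XmlExchangeExample
{
    public static string Send(SenderUnit unit)
    {
        var serializer = new XmlSerializer(typeof(SenderUnit));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, unit);
            return writer.ToString();
        }
    }

    public static ReceiverUnit Receive(string xml)
    {
        var serializer = new XmlSerializer(typeof(ReceiverUnit));
        using (var reader = new StringReader(xml))
        {
            return (ReceiverUnit)serializer.Deserialize(reader);
        }
    }
}
```

Elements the receiver does not know about (InternalNote here) are skipped by default, and members missing from the XML are simply left at their default values.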
It all depends on the serializer you are using. Some require a perfect match, others tend to be more loosely coupled to the objects.
How exact do the classes have to match between the two programs?
Well, not at all. But they should be similar in some way because otherwise the serialization doesn't make sense.
Is the match line by line or variables and method names?
As stated above, there must be some overlap. Usually the property names must be the same. But of course you can also provide a custom mapping.
Take a look at the Newtonsoft library; you can use it (for JSON) like this:
JsonConvert.DeserializeObject<IEnumerable<Unit>>(result);
It's independent of the object method that serialized the string.
Related
I am building a C# integration tool, but I am having some trouble figuring out if I should create different classes for the data that I am receiving from different requests from the source application using REST. The responses are similar in a way that the constructs are the same, but for different information. I.e they would have an "Attributes" tag, but the attributes may vary per class. In the same breath, about 60% or more of the attributes are the same.
It looks like they reused the same constructs, but depending on the data, there may be more things in the result.
My question is, what is the best practice when creating the classes for the JSON deserialisation? Do you create multiple classes with the same name and same content (in different namespaces), or do you combine the classes into a "generic" data type and just include the "extra" attributes, even though they won't all be used by one object?
The assumption is that the "null" values will not be considered in the deserialisation. Thus "extra" fields defined will just be ignored if not found.
The problem comes in the Classes where I would like to be able to define DataType1 and DataType2, but when combining the classes this becomes a problem...
Would like to hear your thoughts :)
Rgs,
Francois
Personally I prefer to deserialize into generic classes (lists and dictionaries or whatever your deserialization library offers) and then manually copy the data to whatever further data structures I use internally. Most of the time the "deserialization classes" really are used just for deserialization, and after that the data is immediately copied to further data structures that don't match the deserialization structures. So there's very little value to them.
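A rough sketch of that approach, using System.Text.Json's JsonDocument as the generic structure (an assumption on my part; any library with a DOM-style API would do). The Customer class and the "Name" attribute are hypothetical; only the "Attributes" tag comes from the question.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Internal model, shaped for the application rather than for the wire format.
public class Customer
{
    public string Name { get; set; }
    public Dictionary<string, string> Extra { get; } = new Dictionary<string, string>();
}

public static class GenericJsonExample
{
    public static Customer FromJson(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            // Throws KeyNotFoundException if the response has no "Attributes" object.
            JsonElement attributes = document.RootElement.GetProperty("Attributes");
            var customer = new Customer();

            foreach (JsonProperty attribute in attributes.EnumerateObject())
            {
                if (attribute.NameEquals("Name"))
                    customer.Name = attribute.Value.GetString();
                else
                    customer.Extra[attribute.Name] = attribute.Value.ToString(); // keep the rest loosely typed
            }
            return customer;
        }
    }
}
```

The attributes shared by every response type get real members, and whatever varies per request ends up in the dictionary without needing another class.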
Well this is basically like a generic binary writer... let's say you have an object, and you don't know what it is, but you have it. How do you write its binary data to a binary file to be able to retrieve it later?
My original idea that I don't know how to do was:
Figure out all the members of the object somehow (reflection maybe)
Unless the members are of types writable by the BinaryWriter, repeat step 1 on the member
Make a header that states the types of the members and how they are assembled into the object (somehow)
Write the header thing
Write all the core level members
I don't know how to use Reflection much so I'm not sure how to do most of the above.
It should be quite doable however.
How should I do this, if it's possible? Or how should I implement the above?
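For what it's worth, steps 1, 2 and 5 of that plan can be sketched with reflection roughly as follows. This is only an illustration under heavy assumptions: no null references, no cycles, no arrays or collections, no inherited private fields, every class has a parameterless constructor, and the runtime type of each member equals its declared type. Because of that last assumption the header from steps 3 and 4 is left out. Shape and Point are made-up sample types.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;

public class Point { public int X; public int Y; }
public class Shape { public string Name; public Point Origin; public double Scale; public bool Visible; }

public static class ReflectionBinary
{
    private const BindingFlags AllInstance =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Fields are sorted by name so the writer and the reader visit them in the same order.
    private static FieldInfo[] FieldsOf(Type type)
    {
        if (type.IsPrimitive || type.IsEnum || type.IsArray)
            throw new NotSupportedException(type + " is not handled by this sketch.");
        return type.GetFields(AllInstance).OrderBy(f => f.Name, StringComparer.Ordinal).ToArray();
    }

    public static void Write(BinaryWriter writer, object value)
    {
        switch (value)
        {
            case null: throw new NotSupportedException("Null references are not handled by this sketch.");
            case int i: writer.Write(i); break;
            case double d: writer.Write(d); break;
            case bool b: writer.Write(b); break;
            case string s: writer.Write(s); break;
            default:
                // Not directly writable: repeat the process on each member.
                foreach (FieldInfo field in FieldsOf(value.GetType()))
                    Write(writer, field.GetValue(value));
                break;
        }
    }

    public static object Read(BinaryReader reader, Type type)
    {
        if (type == typeof(int)) return reader.ReadInt32();
        if (type == typeof(double)) return reader.ReadDouble();
        if (type == typeof(bool)) return reader.ReadBoolean();
        if (type == typeof(string)) return reader.ReadString();

        FieldInfo[] fields = FieldsOf(type);
        object instance = Activator.CreateInstance(type);
        foreach (FieldInfo field in fields)
            field.SetValue(instance, Read(reader, field.FieldType));
        return instance;
    }
}
```

Reading back needs the type to be supplied, e.g. ReflectionBinary.Read(reader, typeof(Shape)), which is exactly the information a header would have to carry for truly unknown objects.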
Simplest approach is to use BinaryFormatter. However you should be very careful with any changes to your classes if you want to load instances saved by previous versions of your application.
The hard aspect is not writing out objects, but reading them back. The .NET framework provides various techniques for serialization and deserialization of class types which are supposed to automate the process, but all of the built-in techniques I'm familiar with have various limitations.
A major problem is that .NET makes no distinction between a storage location which holds a reference to an object for the purpose of identifying an object which is used by other code, for the purpose of only identifying immutable aspects of the object's state other than identity, or for the purpose of encapsulating the object's mutable state. Without knowing what a field is supposed to represent, it's not possible to know how it should be serialized or deserialized. For example, suppose that a particular type has a field of type int[], which holds a reference to a single-element array which holds the value 23. It may be that the purpose of that field is to hold the value 23, or it may be that the purpose of that field is to identify an array whose first element should be incremented every time something happens. In the former scenario, serialization should write out the fact that it's a single element array containing the value 23. In the latter scenario, if serialization is going to be possible at all, it will require knowing what is significant about the array to which the field holds a reference.
While various people have written various methods to automatically serialize various classes, I tend to be skeptical of such things. If one doesn't know what the fields of a class are used for, one should be cautious making any assumptions about what state is encapsulated thereby.
It might be possible with BinaryFormatter. But think of an object structure where you have many of your unknown objects which all reference a common object. If you serialize all of your unknown objects you end up with as many copies of the common object as there are unknown objects.
And there might be many fields of the unknown object which are not relevant as they are set by the constructor or other classes, they could be in an inconsistent state when deserialized.
So it might be not so hard to serialize them, but how do you want to deserialize them?
In one section of my application I use a type generated from an XSD schema. I have two versions of the schema, 2008 and 2009, and the type I use is DatumType. In both schemas this type contains the same properties; they are identical except for the namespaces.
Is there any way to cast DatumType (2008) to DatumType (2009) so that I can work with only one type in my application instead of two?
I am working with C# and WinForms, thanks!
No, there is no way to cast one to the other, because these are two unrelated types, as far as the compiler knows.
If the fields of the target type are assignable, you can write a short method that uses reflection to copy the fields.
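Such a copy method might look like the sketch below. It copies public properties rather than fields, on the assumption that the generated classes expose their data as properties; for public fields the same loop works with GetFields and FieldInfo. The two DatumType classes are simplified stand-ins for the generated ones.

```csharp
using System;
using System.Reflection;

namespace Schema2008 { public class DatumType { public string Name { get; set; } public double Value { get; set; } } }
namespace Schema2009 { public class DatumType { public string Name { get; set; } public double Value { get; set; } } }

public static class PropertyCopier
{
    // Copies every public property that exists on both types under the same name
    // and whose types are assignment-compatible. Anything else is skipped.
    public static TTarget CopyTo<TTarget>(object source) where TTarget : class, new()
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        var target = new TTarget();
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public;

        foreach (PropertyInfo from in source.GetType().GetProperties(flags))
        {
            if (!from.CanRead || from.GetIndexParameters().Length > 0) continue;

            PropertyInfo to = typeof(TTarget).GetProperty(from.Name, flags);
            if (to == null || !to.CanWrite) continue;
            if (!to.PropertyType.IsAssignableFrom(from.PropertyType)) continue;

            to.SetValue(target, from.GetValue(source));
        }
        return target;
    }
}
```

Usage would be along the lines of var datum2009 = PropertyCopier.CopyTo<Schema2009.DatumType>(datum2008). Note that properties whose own types are also generated per namespace are not assignable to each other and get skipped; they would need a recursive copy.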
You could also build code that saves objects of the source type to XML, and reads that XML into the objects of the target type. This is slightly more fragile, because it relies on the presence of identical fields and the fact that they are converted to XML in the same way.
It seems to me the easiest thing to do would be to build a small method to convert one type to the other (since they all share properties), or, if you have access to the source, implement an interface so that you can use the two classes as that interface.
In other words, if we have two classes, B and C, which implement interface A (which contains all the properties we're interested in), we can typecast any object of those two classes as an A.
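A sketch of the interface route. It assumes the generated classes are declared partial (which, as far as I know, is what xsd.exe emits), so the interface can be attached in a separate hand-written file without touching generated code. IDatum, its members and DatumReport are invented for the example.

```csharp
using System.Globalization;

// Shared view of the members both generated types have in common.
public interface IDatum
{
    string Name { get; set; }
    double Value { get; set; }
}

namespace Schema2008
{
    // Stand-in for the generated class.
    public partial class DatumType
    {
        public string Name { get; set; }
        public double Value { get; set; }
    }

    // Hand-written partial: only adds the interface.
    public partial class DatumType : IDatum { }
}

namespace Schema2009
{
    public partial class DatumType
    {
        public string Name { get; set; }
        public double Value { get; set; }
    }

    public partial class DatumType : IDatum { }
}

public static class DatumReport
{
    // Application code is written once, against the interface.
    public static string Describe(IDatum datum) =>
        datum.Name + "=" + datum.Value.ToString(CultureInfo.InvariantCulture);
}
```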
You can pre-process your XML file with a simple XSLT that corrects the namespace differences.
Part 1 of 2 of Identity explains how to do it. Basically, a transform has templates that match elements and produce output for each matched element. The trick is to have a specific template that matches the Datum elements and transforms them, and a generic template that matches all other kinds of nodes and simply copies them.
If you don't have experience with XSLT, don't be afraid. It's easier to learn than you might expect. You can use a tutorial like XSLT tutorial, which will allow you to understand the 'Identity' explanation.
You can use XslCompiledTransform Class to apply the transform.
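Put together, the identity transform plus a namespace-rewriting template could look like this sketch. The two namespace URIs are placeholders (substitute the ones from your 2008 and 2009 schemas), and it rewrites every element in the old namespace rather than Datum alone.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class NamespaceUpgrade
{
    // Placeholder namespace URIs, not the real schema namespaces.
    private const string Stylesheet = @"
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:old='urn:example:datum:2008'>
  <!-- Identity template: copy anything that has no more specific rule. -->
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <!-- Re-create every 2008 element under the 2009 namespace. -->
  <xsl:template match='old:*'>
    <xsl:element name='{local-name()}' namespace='urn:example:datum:2009'>
      <xsl:apply-templates select='@*|node()'/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>";

    public static string To2009(string xml2008)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader sheet = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(sheet);
        }

        var result = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(xml2008)))
        using (XmlWriter output = XmlWriter.Create(result, transform.OutputSettings))
        {
            transform.Transform(input, output);
        }
        return result.ToString();
    }
}
```

The resulting string can then be deserialized with the serializer for the 2009 type, so the rest of the application only ever sees one DatumType.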
You can use Visual Studio to test and debug your XSLT file.
The answer to just about every single question about using C# with JSON seems to be "use JSON.NET", but that's not the answer I'm looking for.
The reason I say that is, from everything I've been able to read in the documentation, JSON.NET is basically just a better performing version of the DataContractSerializer built into the .NET framework...
Which means if I want to deserialize a JSON string, I have to define the full, strongly-typed class for EVERY request I might have. So if I have a need to get categories, posts, authors, tags, etc., I have to define a new class for every one of these things.
This is fine if I built the client and know exactly what the fields are, but I'm using someone else's API, so I have no idea what the contract is unless I download a sample response string and create the class manually from the JSON string.
Is that the only way it's done? Is there not a way to have it create a kind of hashtable that can be read with json["propertyname"]?
Finally, if I do have to build the classes myself, what happens when the API changes and they don't tell me (as twitter seems to be notorious for doing)? I'm guessing my entire project will break until I go in and update the object properties...
So what exactly is the general workflow when working with JSON? And by general I mean library-agnostic. I want to know how it's done in general, not specifically to a target library...
It is very hard to be library-agnostic as you request because how you work with json really depends on the library you use. As an example inside JSON.NET there are multiple ways you could work with JSON. There is the method you talk about with direct serialization into objects. That is type safe but will break if the data from your API changes. However, there is also a LINQ-to-JSON that provides a JObject (which behaves fairly similarly to XElement) that provides a way to do JObject["key"] as you requested in your question. If you are really just looking for a flexible way to work with JSON inside C#, then check out JSON.NET's LINQ-to-JSON.
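A short sketch of the LINQ-to-JSON style, assuming the Newtonsoft.Json package is referenced. The JSON shape (title, author.name, tags) is invented; no classes are defined for the response.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class LooseJsonExample
{
    public static string Summarize(string json)
    {
        JObject post = JObject.Parse(json);

        // The indexer returns null for a missing key instead of throwing.
        string title = (string)post["title"];
        string author = (string)post["author"]?["name"] ?? "(unknown)";

        List<string> tags = post["tags"] is JArray array
            ? array.Select(tag => (string)tag).ToList()
            : new List<string>();

        return title + " by " + author + " [" + string.Join(", ", tags) + "]";
    }
}
```

Fields the API adds later are simply ignored, and fields it removes show up as null, so each lookup can decide for itself how to cope.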
In reality, no matter how you do it, if the API changes your code is likely to break. Even if you strictly use a hashtable-based approach, your code is still likely to break if the data coming back changes.
Edit
JSON.NET Documentation
Examples
If you check out the examples, the second one should give you a good example of how LINQ-to-JSON works. It allows you to work with it without defining any classes. Everything gets converted to standard framework classes (mostly collections and strings). This avoids the need to maintain classes.
I've been a Perl developer for over a decade, and I've just recently started to work in C#. I'm surprised by how much I like it (I don't like Java at all) but one of the most difficult cognitive switches is going from "Everything can be treated as a string and the language takes care of conversions" to "Pre-define your types." In this case string-thinking might be an advantage, because it's what you need to do for the kind of API you're asking for.
You need to write a JSON parser that understands the syntax, which is fairly simple: comma-separated lists, key/value pairs, {} for hashes/objects, [] for arrays, and quoting/escaping constructs. You'll want to create a Hashtable to start because the top-level entity in an API response is usually an object (an array or a bare value is also legal JSON), then scan the JSON string character-by-character. Pull out key/value pairs; if the value starts with { then add it as a new Hashtable, if it starts with [ add it as a new ArrayList, otherwise add it as a string. If you get { or [ you'll need to recursively descend to add the child data elements.
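As an illustration of that recipe, here is a deliberately small recursive-descent sketch. Objects become Hashtable, arrays become ArrayList, and everything else (including numbers, true, false and null) is kept as a string. It is an exercise-level parser and does not validate numbers or literals; TinyJsonParser is a made-up name.

```csharp
using System;
using System.Collections;
using System.Text;

public sealed class TinyJsonParser
{
    private readonly string _text;
    private int _pos;

    private TinyJsonParser(string text) { _text = text; }

    public static object Parse(string json)
    {
        var parser = new TinyJsonParser(json);
        object value = parser.ParseValue();
        parser.SkipWhitespace();
        if (parser._pos != json.Length) throw new FormatException("Trailing characters at " + parser._pos);
        return value;
    }

    private void SkipWhitespace()
    {
        while (_pos < _text.Length && char.IsWhiteSpace(_text[_pos])) _pos++;
    }

    // Returns the next non-whitespace character without consuming it.
    private char Peek()
    {
        SkipWhitespace();
        if (_pos >= _text.Length) throw new FormatException("Unexpected end of input");
        return _text[_pos];
    }

    private void Expect(char expected)
    {
        if (Peek() != expected) throw new FormatException("Expected '" + expected + "' at " + _pos);
        _pos++;
    }

    private object ParseValue()
    {
        switch (Peek())
        {
            case '{': return ParseObject();
            case '[': return ParseArray();
            case '"': return ParseString();
            default: return ParseBareToken();
        }
    }

    private Hashtable ParseObject()
    {
        var table = new Hashtable();
        Expect('{');
        if (Peek() == '}') { _pos++; return table; }
        while (true)
        {
            string key = ParseString();
            Expect(':');
            table[key] = ParseValue();          // recursive descent into the value
            if (Peek() == ',') { _pos++; continue; }
            Expect('}');
            return table;
        }
    }

    private ArrayList ParseArray()
    {
        var list = new ArrayList();
        Expect('[');
        if (Peek() == ']') { _pos++; return list; }
        while (true)
        {
            list.Add(ParseValue());
            if (Peek() == ',') { _pos++; continue; }
            Expect(']');
            return list;
        }
    }

    private string ParseString()
    {
        Expect('"');
        var builder = new StringBuilder();
        while (true)
        {
            if (_pos >= _text.Length) throw new FormatException("Unterminated string");
            char c = _text[_pos++];
            if (c == '"') return builder.ToString();
            if (c != '\\') { builder.Append(c); continue; }
            if (_pos >= _text.Length) throw new FormatException("Unterminated escape");
            char escape = _text[_pos++];
            switch (escape)
            {
                case 'n': builder.Append('\n'); break;
                case 't': builder.Append('\t'); break;
                case 'r': builder.Append('\r'); break;
                case 'b': builder.Append('\b'); break;
                case 'f': builder.Append('\f'); break;
                case 'u':
                    if (_pos + 4 > _text.Length) throw new FormatException("Truncated \\u escape");
                    builder.Append((char)Convert.ToInt32(_text.Substring(_pos, 4), 16));
                    _pos += 4;
                    break;
                default: builder.Append(escape); break;   // covers \" \\ and \/
            }
        }
    }

    // Numbers and the literals true/false/null are returned as their raw text.
    private string ParseBareToken()
    {
        int start = _pos;
        while (_pos < _text.Length && ",]}".IndexOf(_text[_pos]) < 0 && !char.IsWhiteSpace(_text[_pos])) _pos++;
        if (_pos == start) throw new FormatException("Unexpected character at " + _pos);
        return _text.Substring(start, _pos - start);
    }
}
```

With that in place, ((Hashtable)TinyJsonParser.Parse(json))["propertyname"] gives the hashtable-style access asked for in the question, with casts needed for nested objects and arrays.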
If .NET has a good recursive descent parser, you could probably use that to make the job simpler or more robust, but JSON is simple enough to make this a good and reasonably completable exercise.
I write a desktop application that can open / edit / save documents.
Those documents are described by several objects of different types that store references to each other. Of course there is a Document class that serves as the root of this data structure.
The question is how to save this document model into a file.
What I need:
Support for recursive structures.
It must be able to open files even if they were produced from slightly different classes. My users don't want to recreate every document after every release just because I added a field somewhere.
It must deal with classes that are not known at compile time (for plug-in support).
What I tried so far:
XmlSerializer -> Fails the first and last criteria.
BinaryFormatter -> Fails the second criterion.
DataContractSerializer: Similar to XmlSerializer but with support for cyclic (recursive) references. Also it was designed with (forward/backward) compatibility in mind: Data Contract Versioning. [edit]
NetDataContractSerializer: While the DataContractSerializer still requires to know all types in advance (i.e. it can't work very well with inheritance), NetDataContractSerializer stores type information in the output. Other than that the two seem to be equivalent. [edit]
protobuf-net: Didn't have time to experiment with it yet, but it seems similar in function to DataContractSerializer, but using a binary format. [edit]
Handling of unknown types [edit]
There seem to be two philosophies about what to do when the static and dynamic type differ (if you have a field of type object but, let's say, a Person object in it). Basically the dynamic type must somehow get stored in the file.
Use different XML tags for different dynamic types. But since the XML tag to be used for a particular class might not be equal to the class name, it's only possible to go this route if the deserializer knows all possible types in advance (so that it can scan them for attributes).
Store the CLR type (class name, assembly name & version) during serialization. Use this info during deserialization to instantiate the right class. The types need not be known prior to deserialization.
The second one is simpler to use, but the resulting file will be CLR dependent (and less tolerant of code modifications). That's probably why XmlSerializer and DataContractSerializer choose the first way. NetDataContractSerializer is not recommended because it uses the second approach (so does BinaryFormatter, by the way).
Any ideas?
The one you haven't tried is DataContractSerializer. There is a constructor that takes a parameter bool preserveObjectReferences that should handle the first criterion.
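A sketch of reference preservation with a cyclic parent/child structure. Instead of the long constructor overload it sets the same option through DataContractSerializerSettings, which I'm assuming is available on your framework version (.NET 4.5 or later). The Node class is invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Parent { get; set; }          // points back up: a cycle
    [DataMember] public List<Node> Children { get; set; } = new List<Node>();
}

public static class CyclicGraphExample
{
    public static Node RoundTrip(Node root)
    {
        // Each object is written once and later occurrences become references to it.
        var settings = new DataContractSerializerSettings { PreserveObjectReferences = true };
        var serializer = new DataContractSerializer(typeof(Node), settings);

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);
            stream.Position = 0;
            return (Node)serializer.ReadObject(stream);
        }
    }
}
```

Without the setting, serializing a graph like this fails because of the cycle. With it, the deserialized child's Parent is the very same instance as the deserialized root.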
The WCF data contract serializer is probably closest to your needs, although not perfect.
There is only limited support for backwards compatibility (i.e. whether old versions of the program can read documents generated with a newer version). New fields are supported (via IExtensibleDataObject), but new classes or new enum values are not.
I would think the XmlSerializer is your best bet. You won't be able to support everything on your requirements list without a bit of work in your Document classes - but the XmlSerializer architecture gives you extensibility points which should allow you to tap into its mechanism deep enough to do just about anything.
Using the IXmlSerializable interface - by implementing that on your classes you want to store - you should be able to do just about anything, really.
The interface exposes basically two methods, ReadXml and WriteXml:
    public void WriteXml (XmlWriter writer)
    {
        // do what you need to do to write out your XML for this object
    }
    public void ReadXml (XmlReader reader)
    {
        // do what you need to do to read your object from XML
    }
Using these two methods, you should be able to capture the necessary state information from just about any object you might want to store, and turn it into XML that can be persisted to disk - and deserialized back into an object when the time comes!
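A filled-in sketch of those two methods, with a hypothetical DocumentInfo class that stores its state as attributes on its own element. The interface also requires GetSchema, which conventionally returns null. Treating a missing pageCount attribute as 0 is an assumption that illustrates how files written by an older version can still be read.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class DocumentInfo : IXmlSerializable
{
    public string Title { get; set; }
    public int PageCount { get; set; }

    public XmlSchema GetSchema() => null;

    public void WriteXml(XmlWriter writer)
    {
        // The serializer has already written the wrapper element; add our data to it.
        writer.WriteAttributeString("title", Title);
        writer.WriteAttributeString("pageCount", XmlConvert.ToString(PageCount));
    }

    public void ReadXml(XmlReader reader)
    {
        // The reader is positioned on the wrapper element's start tag.
        Title = reader.GetAttribute("title");
        string pages = reader.GetAttribute("pageCount");
        PageCount = pages == null ? 0 : XmlConvert.ToInt32(pages);   // tolerate older files

        reader.Skip();   // consume the whole element so the reader ends up after it
    }
}

public static class DocumentInfoXml
{
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(DocumentInfo));

    public static string Save(DocumentInfo info)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, info);
            return writer.ToString();
        }
    }

    public static DocumentInfo Load(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (DocumentInfo)Serializer.Deserialize(reader);
        }
    }
}
```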
XmlSerializer can work for your first criterion; however, you must provide the recursion for objects like the TreeView control.
BinaryFormatter can work for all 3 criteria. If a class changes, you may have to create a conversion tool to convert old format documents to a new format. Or recognize an older format, deserialize to the old, and then save to the new - keeping your old class format around for a little while.
This will help cover version tolerance which is what I think you're after: MSDN - Version Tolerant Serialization