I am building a C# integration tool, but I am having trouble deciding whether to create different classes for the data I receive from different REST requests to the source application. The responses are similar in that the constructs are the same but carry different information. For example, they all have an "Attributes" tag, but the attributes may vary per class. At the same time, 60% or more of the attributes are the same.
It looks like they reused the same constructs, but depending on the data, there may be more things in the result.
My question is, what is the best practice when creating the classes for the JSON deserialisation? Do you create multiple classes with the same name and same content (in different namespaces), or do you combine the classes into a "generic" data type and just include the "extra" attributes, even though they won't all be used by one object?
The assumption is that the "null" values will not be considered in the deserialisation. Thus "extra" fields defined will just be ignored if not found.
The problem comes in the Classes where I would like to be able to define DataType1 and DataType2, but when combining the classes this becomes a problem...
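To make the trade-off concrete, here is a rough sketch of one possible middle ground: the shared attributes live in a base class, and each response type adds only its extras. All type and property names (CommonAttributes, DataType1Attributes, Owner, Priority and so on) are made up for illustration, and it assumes System.Text.Json; the same shape should carry over to other serializers. Properties that are missing from the JSON are simply left at their default (null).

```csharp
using System.Text.Json;

// Hypothetical: the ~60% of attributes that every response shares.
public class CommonAttributes
{
    public string Name { get; set; }
    public string Status { get; set; }
}

// Hypothetical: each response type adds only what is specific to it.
public class DataType1Attributes : CommonAttributes
{
    public string Owner { get; set; }
}

public class DataType2Attributes : CommonAttributes
{
    public int? Priority { get; set; }
}

// The outer construct is the same for every response, so it is generic.
public class Response<TAttributes> where TAttributes : CommonAttributes
{
    public TAttributes Attributes { get; set; }
}

public static class ResponseReader
{
    public static Response<T> Read<T>(string json) where T : CommonAttributes
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<Response<T>>(json, options)
               ?? throw new JsonException("Response was JSON null.");
    }
}
```

With this, ResponseReader.Read<DataType2Attributes>(json) gives a typed result, and an attribute that is absent from the payload (Status, say) just stays null.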
Would like to hear your thoughts :)
Rgs,
Francois
Personally I prefer to deserialize into generic classes (lists and dictionaries or whatever your deserialization library offers) and then manually copy the data to whatever further data structures I use internally. Most of the time the "deserialization classes" really are used just for deserialization, and after that the data is immediately copied to further data structures that don't match the deserialization structures. So there's very little value to them.
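A minimal sketch of that approach, assuming System.Text.Json and a payload with an "Attributes" object as in the question. The Device type, its members and the "Name" attribute are hypothetical; the point is only that the JSON is walked generically and copied into whatever internal shape you want.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical internal type; it does not mirror the JSON layout.
public sealed class Device
{
    public string Name { get; }
    public IReadOnlyDictionary<string, string> Extras { get; }

    public Device(string name, IReadOnlyDictionary<string, string> extras)
    {
        Name = name;
        Extras = extras;
    }
}

public static class DeviceMapper
{
    public static Device FromJson(string json)
    {
        // Parse into the library's generic document model, no DTO classes.
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement attributes = doc.RootElement.GetProperty("Attributes");

        string name = "";
        var extras = new Dictionary<string, string>();
        foreach (JsonProperty property in attributes.EnumerateObject())
        {
            string text = property.Value.ToString();
            if (property.Name == "Name") name = text;   // known attribute
            else extras[property.Name] = text;          // everything else
        }
        return new Device(name, extras);
    }
}
```

Unknown attributes end up in Extras rather than forcing a new class or property.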
Related
I have two separate programs that need to share information. This sharing will be done by one app placing an XML serialized object in a database, and the other app retrieving it on a different machine. The objects share the same variables but the properties and methods are different.
How exact do the classes have to match between the two programs?
Is the match line by line or just variable, property, and method names?
I ended up using the Newtonsoft.Json library instead of XML and used the <JsonObject(MemberSerialization.OptIn)> and JsonProperty() attributes to control what got serialized.
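For reference, the C# form of that opt-in setup looks roughly like this (the attribute syntax above is VB). The SharedRecord class and its members are hypothetical; only members marked with [JsonProperty] are written out.

```csharp
using Newtonsoft.Json;

// Hypothetical shared type. With OptIn, unmarked members are skipped.
[JsonObject(MemberSerialization.OptIn)]
public class SharedRecord
{
    [JsonProperty("id")]
    public int Id { get; set; }

    [JsonProperty("name")]
    public string Name { get; set; }

    // No [JsonProperty], so this never reaches the database.
    public string LocalCache { get; set; }
}

public static class SharedRecordJson
{
    public static string Save(SharedRecord record) =>
        JsonConvert.SerializeObject(record);

    public static SharedRecord Load(string json) =>
        JsonConvert.DeserializeObject<SharedRecord>(json);
}
```

Each program can keep its own version of the class, with its own methods and extra members, as long as the JsonProperty names agree.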
You did not specify which kind of serialization you were after.
The standard .NET binary serializer is not well suited for data exchange between two different assemblies. When you go to deserialize, you'll get an error similar to [Culture].[Assembly].[Version].SourceClass cannot be deserialized to [Culture].[Assembly].[Version].DestClass. This will happen even if the classes are identical.
There are several ways around this: A) use the same service DLL on both sides to do the serializing; B) trick it into deserializing by using an override to report a matching Culture-Assembly-Version-Class, but that seems dodgy; or C) use XML serialization, which makes for very wordy (though human-readable) output.
For binary serialization, rather than the .NET binary formatter, there is protobuf-net, which is faster, produces much smaller output, and uses nearly identical syntax.
How exact do the classes have to match between the two programs
ProtoBuf uses a numeric index rather than property name, so they shouldn't have to be too similar. Of course there has to be some similarity or the destination may not have a clue what the data represents. The code in the class can be quite different because it stays put.
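As an illustration of the numeric index idea, here is a small sketch assuming the protobuf-net package. SenderOrder, ReceiverOrder and OrderTransfer are hypothetical names; the two classes share nothing but the member numbers and compatible types.

```csharp
using System.IO;
using ProtoBuf;

// Hypothetical class in the writing program.
[ProtoContract]
public class SenderOrder
{
    [ProtoMember(1)] public int OrderNumber { get; set; }
    [ProtoMember(2)] public string CustomerName { get; set; }
}

// Hypothetical class in the reading program: different property names and
// its own method, but the same member numbers and compatible types.
[ProtoContract]
public class ReceiverOrder
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Customer { get; set; }

    public string Describe() => $"#{Id} for {Customer}";
}

public static class OrderTransfer
{
    public static byte[] Pack(SenderOrder order)
    {
        using var stream = new MemoryStream();
        Serializer.Serialize(stream, order);
        return stream.ToArray();
    }

    public static ReceiverOrder Unpack(byte[] payload)
    {
        using var stream = new MemoryStream(payload);
        return Serializer.Deserialize<ReceiverOrder>(stream);
    }
}
```

The byte array is what would go into the database column.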
Serialization stores only the data for an object - member variables, properties, etc. As long as the data types are compatible, it should work. You do not need a line by line match for the functions.
It all depends on the serializer you are using. Some require a perfect match, others tend to be more loosely coupled to the objects.
How exact do the classes have to match between the two programs?
Well, not at all. But they should be similar in some way because otherwise the serialization doesn't make sense.
Is the match line by line or variables and method names?
As stated above, there must be some overlap. Usually the property names must be the same, but of course you can also provide a custom mapping.
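A quick sketch of what that can look like with the built-in XmlSerializer. WriterUnit, ReaderUnit and UnitXml are hypothetical; the reader's property names differ from the writer's, and the [XmlElement] attributes provide the mapping back to the shared element names.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical class in the program that stores the object.
[XmlRoot("Unit")]
public class WriterUnit
{
    public string Name { get; set; }
    public int Count { get; set; }

    public void Validate() { /* methods are never serialized */ }
}

// Hypothetical class in the program that loads it: other property names,
// other methods, mapped onto the same XML elements.
[XmlRoot("Unit")]
public class ReaderUnit
{
    [XmlElement("Name")] public string Title { get; set; }
    [XmlElement("Count")] public int Quantity { get; set; }

    public string Summary() => $"{Title} x{Quantity}";
}

public static class UnitXml
{
    public static string Write(WriterUnit unit)
    {
        var serializer = new XmlSerializer(typeof(WriterUnit));
        using var text = new StringWriter();
        serializer.Serialize(text, unit);
        return text.ToString();
    }

    public static ReaderUnit Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(ReaderUnit));
        using var text = new StringReader(xml);
        return (ReaderUnit)serializer.Deserialize(text);
    }
}
```

So the match is on the serialized names and data types, not on the code.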
Take a look at the Newtonsoft library; you can use it (for JSON) like this:
JsonConvert.DeserializeObject<IEnumerable<Unit>>(result);
It's independent of the object method that serialized the string.
I have long held the belief that your domain model should not be responsible for serializing itself to XML. I have used the IXmlSerializable interface in the past to control how my objects are serialized, but ideally I'd prefer the nuts and bolts of the serialization to live outside the object.
However, I've never been able to actually implement this in a clean manner, and I was wondering if there were any patterns I was overlooking to make this happen. Basically I want my object model to do its thing and be oblivious to XML serialization (or any other serialization for that matter) and then be handed off to some service that spiders the object and serializes it.
I've tried doing this with extension methods, but this falls short when you want to serialize a collection of type object. I've looked at doing it with object wrappers and DTOs that then serialize, but then you've got the overhead of maintaining another set of objects and having to create these objects when you want to serialize, which again can get messy when you have collections of type object.
The only other thing is using reflection but I'd worry about the processing overheads.
Is there a sane way to do what I'm asking or should I just bite the bullet and make my objects xml aware?
Using the System.Xml.Serialization Attributes is putting the nuts and bolts outside of your code. You are defining metadata and with the exception of optional parameters, no extra code is required. Implementing IXmlSerializable and doing the serialization by hand is error prone and should be avoided. Why? You are defining your data 3 times.
1. XML Schema
2. Class
3. Serialization code
Using attributes, you can scrub step 3.
XML and C# have an impedance mismatch. Like it or not, at some point, you will need to define the serialization to create the right document model.
Arguably, the classes you are serializing should not be performing any work. They are just a data store. Try abstracting your logic away from serialized objects - it may give you a warmer feeling.
Update
If you really, really hate attributes, try using the adapter pattern to serialize your model. The XML code will be in a separate class or assembly and you can work with your model across storage mediums. You will suffer the consequence of having to update the serialization separately when you update your model.
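A rough sketch of what such an adapter could look like, using LINQ to XML. Customer and CustomerXmlAdapter are hypothetical names, and the element and attribute names are just one possible layout.

```csharp
using System.Xml.Linq;

// Hypothetical domain class: no attributes, no IXmlSerializable.
public class Customer
{
    public string Name { get; set; }
    public int LoyaltyPoints { get; set; }

    public void AddPoints(int points) => LoyaltyPoints += points;
}

// All XML knowledge lives here; it could sit in a separate assembly.
public static class CustomerXmlAdapter
{
    public static XElement ToXml(Customer customer) =>
        new XElement("customer",
            new XAttribute("points", customer.LoyaltyPoints),
            new XElement("name", customer.Name));

    public static Customer FromXml(XElement element) => new Customer
    {
        Name = (string)element.Element("name"),
        LoyaltyPoints = (int)element.Attribute("points"),
    };
}
```

The cost is the one mentioned above: add a property to Customer and you have to remember to update the adapter too.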
Well, this is basically like a generic binary writer... let's say you have an object, and you don't know what it is, but you have it. How do you write its binary data to a binary file so you can retrieve it later?
My original idea that I don't know how to do was:
1. Figure out all the members of the object somehow (reflection maybe)
2. Unless the members are of types writable by the BinaryWriter, repeat step 1 on the member
3. Make a header that states the types of the members and how they are assembled into the object (somehow)
4. Write the header thing
5. Write all the core level members
I don't know how to use Reflection much so I'm not sure how to do most of the above.
It should be quite doable however.
How should I do this, if it's possible? Or how should I implement the above?
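To show what the reflection part of that plan might look like, here is a very limited sketch. It is not a full solution: it skips the header (steps 3 and 4) by assuming the reader already knows the type, handles only int, long, double, bool, string and classes built from those, and does not handle nulls, collections, inherited private fields or cyclic references. ReflectionBinary and its method names are made up for this example.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;

public static class ReflectionBinary
{
    // Fields are sorted by name so writer and reader agree on the order.
    private static FieldInfo[] OrderedFields(Type type) =>
        type.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .OrderBy(field => field.Name, StringComparer.Ordinal)
            .ToArray();

    public static void Write(BinaryWriter writer, object value)
    {
        switch (value)
        {
            case null: throw new NotSupportedException("Nulls are not handled in this sketch.");
            case int i: writer.Write(i); return;
            case long l: writer.Write(l); return;
            case double d: writer.Write(d); return;
            case bool b: writer.Write(b); return;
            case string s: writer.Write(s); return;
        }
        // Not directly writable: recurse into the members (steps 1 and 2).
        foreach (FieldInfo field in OrderedFields(value.GetType()))
            Write(writer, field.GetValue(value));
    }

    public static object Read(BinaryReader reader, Type type)
    {
        if (type == typeof(int)) return reader.ReadInt32();
        if (type == typeof(long)) return reader.ReadInt64();
        if (type == typeof(double)) return reader.ReadDouble();
        if (type == typeof(bool)) return reader.ReadBoolean();
        if (type == typeof(string)) return reader.ReadString();

        // Assumes a public parameterless constructor on composite types.
        object instance = Activator.CreateInstance(type);
        foreach (FieldInfo field in OrderedFields(type))
            field.SetValue(instance, Read(reader, field.FieldType));
        return instance;
    }

    public static byte[] ToBytes(object value)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
            Write(writer, value);
        return stream.ToArray();
    }

    public static T FromBytes<T>(byte[] data)
    {
        using var reader = new BinaryReader(new MemoryStream(data));
        return (T)Read(reader, typeof(T));
    }
}

// Hypothetical sample types used to try it out.
public class Endpoint { public string Host = ""; public int Port; }
public class Settings { public string Name = ""; public Endpoint Primary = new Endpoint(); public bool Enabled; }
```

Calling ReflectionBinary.FromBytes<Settings>(ReflectionBinary.ToBytes(settings)) should give back an equivalent Settings instance. The lack of a header is exactly why the type has to be supplied again when reading.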
Simplest approach is to use BinaryFormatter. However you should be very careful with any changes to your classes if you want to load instances saved by previous versions of your application.
The hard aspect is not writing out objects, but reading them back. The .NET framework provides various techniques for serialization and deserialization of class types which are supposed to automate the process, but all of the built-in techniques I'm familiar with have various limitations.
A major problem is that .NET makes no distinction between three purposes a storage location holding an object reference may serve: identifying an object which is used by other code, identifying only immutable aspects of the object's state other than identity, or encapsulating the object's mutable state. Without knowing what a field is supposed to represent, it's not possible to know how it should be serialized or deserialized. For example, suppose that a particular type has a field of type int[], which holds a reference to a single-element array which holds the value 23. It may be that the purpose of that field is to hold the value 23, or it may be that the purpose of that field is to identify an array whose first element should be incremented every time something happens. In the former scenario, serialization should write out the fact that it's a single-element array containing the value 23. In the latter scenario, if serialization is going to be possible at all, it will require knowing what is significant about the array to which the field holds a reference.
While various people have written various methods to automatically serialize various classes, I tend to be skeptical of such things. If one doesn't know what the fields of a class are used for, one should be cautious making any assumptions about what state is encapsulated thereby.
It might be possible with BinaryFormatter. But think of an object structure where you have many of your unknown objects which all reference a common object. If you serialize each of your unknown objects separately, you end up with as many copies of the common object as there are unknown objects.
And there might be many fields of the unknown object which are not relevant because they are set by the constructor or by other classes; they could be in an inconsistent state when deserialized.
So it might not be so hard to serialize them, but how do you want to deserialize them?
My class contains no methods, only several fields, like host, port, labels, channels, etc.
I.e. it's a kind of config.
Should I use a regular class for representing configs? I want to make it obvious to the reader that this instance is just a container for other values.
Update: My config is pretty big and comes from XML, so it's a tree.
Yes, most likely you should be using a class. There are rare cases, as pointed out in other replies, where a struct is appropriate.
Name your class "ContainerForConfigurationProperties", then look at the resulting code. If it looks bad, refactor by changing the class name until you are happy. Note that you may find that after coming up with a good name some properties no longer fit into your class. That may mean that your class was actually a container for several sets of properties; refactor by splitting the class.
If you use a class with public automatic-property getter/setters, then you can easily serialize/deserialize it (say to XML). Especially if the intent is to be consumed by other readers/developers, then using properties will shield them from changes when building against updated versions of your library. It also leaves the door open in the future if you want to implement anything in terms of tracking value changes, issuing events, performing validation, or just straight-up debugging with breakpoints.
Just call it a class, that's fine. It should be obvious what it's supposed to do, hold config info.
You may want to create an Interface in cases where you'll have a number of different config classes. For example, you might have an IConfig interface that has a few properties and then additional interface elements in more config interfaces (IHostConfig, ILabelConfig, etc.) that you can fit together to build your specific classes with a common, understandable, interface.
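Something along these lines, as a sketch. IConfig, IHostConfig and ILabelConfig are the names from above; their members, plus ServerConfig and ConfigPrinter, are assumptions for the sake of the example.

```csharp
using System.Collections.Generic;

public interface IConfig
{
    string Name { get; }
}

public interface IHostConfig : IConfig
{
    string Host { get; }
    int Port { get; }
}

public interface ILabelConfig : IConfig
{
    IReadOnlyList<string> Labels { get; }
}

// Hypothetical concrete config assembled from the smaller interfaces.
// Public auto-properties keep it easy to fill from XML.
public sealed class ServerConfig : IHostConfig, ILabelConfig
{
    public string Name { get; set; } = "";
    public string Host { get; set; } = "";
    public int Port { get; set; }
    public List<string> Labels { get; set; } = new List<string>();

    // Expose the list read-only through the interface.
    IReadOnlyList<string> ILabelConfig.Labels => Labels;
}

public static class ConfigPrinter
{
    // Consumers depend only on the slice of config they need.
    public static string Endpoint(IHostConfig config) => $"{config.Host}:{config.Port}";
}
```

Code that only needs the host settings takes an IHostConfig and never sees the rest of the tree.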
The answer to just about every single question about using C# with JSON seems to be "use JSON.NET", but that's not the answer I'm looking for.
The reason I say that is, from everything I've been able to read in the documentation, JSON.NET is basically just a better performing version of the DataContractSerializer built into the .NET framework...
Which means if I want to deserialize a JSON string, I have to define the full, strongly-typed class for EVERY request I might have. So if I have a need to get categories, posts, authors, tags, etc., I have to define a new class for every one of these things.
This is fine if I built the client and know exactly what the fields are, but I'm using someone else's API, so I have no idea what the contract is unless I download a sample response string and create the class manually from the JSON string.
Is that the only way it's done? Is there not a way to have it create a kind of hashtable that can be read with json["propertyname"]?
Finally, if I do have to build the classes myself, what happens when the API changes and they don't tell me (as Twitter seems to be notorious for doing)? I'm guessing my entire project will break until I go in and update the object properties...
So what exactly is the general workflow when working with JSON? And by general I mean library-agnostic. I want to know how it's done in general, not specifically to a target library...
It is very hard to be library-agnostic as you request because how you work with json really depends on the library you use. As an example inside JSON.NET there are multiple ways you could work with JSON. There is the method you talk about with direct serialization into objects. That is type safe but will break if the data from your API changes. However, there is also a LINQ-to-JSON that provides a JObject (which behaves fairly similarly to XElement) that provides a way to do JObject["key"] as you requested in your question. If you are really just looking for a flexible way to work with JSON inside C#, then check out JSON.NET's LINQ-to-JSON.
In reality, no matter how you do it, if the API changes your code is likely to break. Even if you strictly use a hashtable-based approach, your code is still likely to break if the data coming back changes.
Edit
JSON.NET Documentation
Examples
If you check out the examples, the second one should give you a good example of how LINQ-to-JSON works. It allows you to work with it without defining any classes. Everything gets converted to standard framework classes (mostly collections and strings). This avoids the need to maintain classes.
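For a rough idea of what that looks like, here is a small sketch using JObject from Newtonsoft.Json.Linq. The PostReader class and the "title" and "author"/"name" keys are made-up examples, not part of any particular API.

```csharp
using Newtonsoft.Json.Linq;

public static class PostReader
{
    // Read a value by key, much like a hashtable lookup. No class needed.
    public static string TitleOf(string json)
    {
        JObject post = JObject.Parse(json);
        return (string)post["title"];
    }

    // A missing key comes back as null, so a renamed or removed field can
    // be handled with a fallback instead of an exception.
    public static string AuthorNameOrDefault(string json, string fallback)
    {
        JObject post = JObject.Parse(json);
        JToken name = post["author"]?["name"];
        return (string)name ?? fallback;
    }
}
```

This does not make API changes harmless, but it lets you decide per field how strict you want to be.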
I've been a Perl developer for over a decade, and I've just recently started to work in C#. I'm surprised by how much I like it (I don't like Java at all) but one of the most difficult cognitive switches is going from "Everything can be treated as a string and the language takes care of conversions" to "Pre-define your types." In this case string-thinking might be an advantage, because it's what you need to do for the kind of API you're asking for.
You need to write a JSON parser that understands the syntax, which is fairly simple: comma-separated lists, key/value pairs, {} for hashes/objects, [] for arrays, and quoting/escaping constructs. You'll want to create a Hashtable to start, because the top-level entity in an API response is usually an object (strictly, JSON also allows an array or a bare value at the top level), then scan the JSON string character-by-character. Pull out key/value pairs; if the value starts with { then add it as a new Hashtable, if it starts with [ add it as a new ArrayList, otherwise add it as a string. If you get { or [ you'll need to recursively descend to add the child data elements.
If .NET has a good recursive descent parser, you could probably use that to make the job simpler or more robust, but JSON is simple enough to make this a good and reasonably completable exercise.
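Here is a compact sketch of such a recursive descent parser, to give a feel for the size of the exercise. It differs from the description above in a few ways that are my own choices: it uses Dictionary<string, object> and List<object> instead of Hashtable and ArrayList, and it turns true/false/null and numbers into bool, null and double rather than keeping them as strings. TinyJsonParser is a made-up name, and error handling is minimal.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public sealed class TinyJsonParser
{
    private readonly string _text;
    private int _pos;

    private TinyJsonParser(string text) { _text = text; }

    public static object Parse(string json)
    {
        var parser = new TinyJsonParser(json);
        object value = parser.ReadValue();
        while (parser._pos < json.Length && char.IsWhiteSpace(json[parser._pos])) parser._pos++;
        if (parser._pos != json.Length) throw new FormatException("Trailing characters at " + parser._pos);
        return value;
    }

    // Skips whitespace and returns the next character without consuming it.
    private char Peek()
    {
        while (_pos < _text.Length && char.IsWhiteSpace(_text[_pos])) _pos++;
        if (_pos >= _text.Length) throw new FormatException("Unexpected end of input");
        return _text[_pos];
    }

    private void Expect(char c)
    {
        if (Peek() != c) throw new FormatException($"Expected '{c}' at {_pos}");
        _pos++;
    }

    private object ReadValue()
    {
        char c = Peek();
        if (c == '{') return ReadObject();   // recursive descent
        if (c == '[') return ReadArray();
        if (c == '"') return ReadString();
        return ReadLiteral();
    }

    private Dictionary<string, object> ReadObject()
    {
        var result = new Dictionary<string, object>();
        Expect('{');
        if (Peek() == '}') { _pos++; return result; }
        while (true)
        {
            string key = ReadString();
            Expect(':');
            result[key] = ReadValue();
            if (Peek() == ',') { _pos++; continue; }
            Expect('}');
            return result;
        }
    }

    private List<object> ReadArray()
    {
        var result = new List<object>();
        Expect('[');
        if (Peek() == ']') { _pos++; return result; }
        while (true)
        {
            result.Add(ReadValue());
            if (Peek() == ',') { _pos++; continue; }
            Expect(']');
            return result;
        }
    }

    private string ReadString()
    {
        Expect('"');
        var sb = new StringBuilder();
        while (true)
        {
            if (_pos >= _text.Length) throw new FormatException("Unterminated string");
            char c = _text[_pos++];
            if (c == '"') return sb.ToString();
            if (c != '\\') { sb.Append(c); continue; }
            if (_pos >= _text.Length) throw new FormatException("Unterminated escape");
            char e = _text[_pos++];
            if (e == 'n') sb.Append('\n');
            else if (e == 't') sb.Append('\t');
            else if (e == 'r') sb.Append('\r');
            else if (e == 'b') sb.Append('\b');
            else if (e == 'f') sb.Append('\f');
            else if (e == 'u') { sb.Append((char)Convert.ToInt32(_text.Substring(_pos, 4), 16)); _pos += 4; }
            else sb.Append(e);   // covers \" \\ and \/
        }
    }

    private object ReadLiteral()
    {
        int start = _pos;
        while (_pos < _text.Length && ",]} \t\r\n".IndexOf(_text[_pos]) < 0) _pos++;
        string token = _text.Substring(start, _pos - start);
        if (token == "true") return true;
        if (token == "false") return false;
        if (token == "null") return null;
        return double.Parse(token, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

The result can then be read with casts, for example ((Dictionary<string, object>)TinyJsonParser.Parse(json))["propertyname"].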