Server side: C# or Java
Client side: Objective-C
I need a way to serialize an object in C#/Java and deserialize it in Objective-C.
I'm new to Objective-C and I was wondering where I can get information about this issue.
Thanks.
Apart from the obvious JSON/XML solutions, protobuf may also be interesting. There are Java/C++/Python backends for it, and third parties have created backends for C# and Objective-C (I have never used that one, though) as well.
The main advantages are that it is much, much faster to parse[1], much smaller[2] since it's a binary format, and that versioning was an important factor from the beginning.
[1] Google claims 20 to 100 times faster compared to XML
[2] 3 to 10 times smaller, according to the same source
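To give a rough feel for why a binary format ends up smaller than a text one, here is a small Python sketch. It does not use protobuf itself; it just packs the same hypothetical record once as JSON and once as fixed-width binary fields with the standard `struct` module, so the exact ratio says nothing about protobuf's real numbers.

```python
import json
import struct

# Hypothetical record, used only for illustration.
record = {"id": 123456, "x": 1.5, "y": -2.25}

# Text encoding: field names and digits are spelled out as characters.
as_json = json.dumps(record).encode("utf-8")

# Binary encoding: one little-endian unsigned 32-bit int and two 64-bit floats.
# The field order is agreed in advance instead of being stored in the message.
as_binary = struct.pack("<Idd", record["id"], record["x"], record["y"])

print(len(as_json), len(as_binary))
```

The price of the smaller form is that both sides must agree on the layout up front, which is the role a .proto definition plays for protobuf.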
Another technology similar to protobufs is Apache Thrift.
Apache Thrift is a software framework for scalable cross-language services development. Apache Thrift allows you to define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates code to be used to easily build RPC clients and servers that communicate seamlessly across programming languages.
JSON for relatively straightforward object graphs
XML/REST for more complex object graphs (distinction between Arrays / Collections / nested arrays etc)
Sudzc. I am using it. It makes it pretty easy to invoke a web service from an iOS app.
You don't have to write code to serialize objects.
JSON is probably the best choice, because:
It is simple to use
It is human-readable
It is data-based rather than being tied to any more complex object model
You will be able to find decent libraries for import/export in most languages.
Serialisation of more complex objects is IMHO not a good idea from the perspective of portability since often one language/platform has no effective way of expressing a concept from another language / platform. e.g. as soon as you start declaring "types" or "classes" of serialised objects you run into the thorny issue of differing object models between languages.
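As a minimal sketch of what "data-based" means in practice, the following Python snippet round-trips a hypothetical payload made only of strings, numbers, booleans, lists, and maps. Any language with a JSON library can produce or consume the same text, because no class or type information travels with it.

```python
import json

# Hypothetical payload: only strings, numbers, booleans, lists, and maps.
user = {"name": "Ann", "age": 30, "tags": ["admin", "beta"], "active": True}

text = json.dumps(user, sort_keys=True)   # what the server would send
decoded = json.loads(text)                # what a client would get back

print(text)
```

Mapping the decoded map back onto a native object is then the job of each side, which keeps the two object models independent.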
On iOS there are a couple of JSON frameworks and libraries with an Objective-C API:
JSONKit
SBJson
TouchJson
are probably the most prominent.
JSONKit is fast and simple, but can only parse a contiguous portion of JSON text. This means you need to save downloaded data into a temporary file, or you need to save all downloaded JSON text into an NSMutableData object (kept in memory). Only after the JSON text has been downloaded completely can you start parsing.
SBJson is more flexible to use. It provides an additional "SAX style" interface, can parse partial input, and can parse more than one JSON document per "input" (for example several JSON documents per network connection). This is very handy when you want to connect to a "streaming API" (e.g. Twitter Streaming API), where many JSON documents can arrive per connection. The drawback is that it is much slower than JSONKit.
TouchJson is even somewhat slower than SBJson.
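To make "more than one JSON document per input" concrete, here is a rough Python sketch of the idea using the standard library's `JSONDecoder.raw_decode`. It is not the API of any of the Objective-C libraries above, and `iter_json_documents` is a hypothetical helper name. It only handles a buffer that already holds complete documents; truly partial input needs an incremental parser.

```python
import json

def iter_json_documents(text):
    """Yield each JSON document found in a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode does not accept it.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the parsed value and the index where it stopped.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

stream = '{"id": 1}\n{"id": 2} [3, 4]'
docs = list(iter_json_documents(stream))
print(docs)
```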
My personal preference is a different one, though. It is faster than JSONKit (20% faster on ARM), has an additional SAX style API, can handle "streaming APIs", can simultaneously download and parse, and can handle very large JSON strings without severely impacting the memory footprint, while being especially easy to use with NSURLConnection. (Well, I'm probably biased since I'm the author.)
You can take a look at JPJson (Apache License v2). It's still in beta, though.
Related
I am receiving data via UDP from a C/C++ application. This application is doing a memcpy of the class into a buffer and throwing it our way. Our application is written in C# and I need to somehow make sense of the data. We have access to the header files of the structures - everything is basically a struct or an enum. We can't change the format the data comes in and the header files are likely to change fairly often.
I have considered re-writing our comms classes in C++ to receive the data and then I have more control of its serialisation, but that will take a long time and my C++ is rusty, not to mention I don't have a lot of experience with C++ threading which would be a requirement.
I have also created a few prototype C++ libraries with the provided header files to be accessed via C#, but I can't quite get my head around how I actually create and use an actual instance of the class in C# itself (every time I look into this, all I see are extern function calls, not the use of external types).
I have also looked into Marshalling. However, as the data is liable to change quite often, I think this is a last resort and feels quite manual.
Does anyone know of any options or have any more targeted reading or advice on this matter?
Why not use Google Protocol Buffers on each end, i.e. C++ and C#? You would take your C++ definition and let PB do all the serialisation for you.
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.
It works across different OSs even where primitive type conversion would normally be a problem.
I have a system where a serialized file is created with a C# program and then deserialized in another C# program. I'm wondering if it's possible to do binary deserialization of a C# file in Java?
Thanks
You can try using a serializer that has implementations for both platforms and outputs data in a platform-independent format, like Protobuf.
Or if you need a full RPC over network between Java and C# application, you can go for Apache Thrift.
I assume you are speaking of an object serialized with BinaryFormatter. The answer then is a qualified "yes," since Java implements a Turing machine. However, this will not be straightforward.
In this case the data will be in a format most suitable for consumption by a .NET runtime, and will contain information about .NET types and assemblies. You would have to implement your own reader for this format, and then have some way to map between .NET and Java types. (The Mono project implements a BinaryFormatter compatible with .NET's, so you could use their reader implementation as a reference.)
As an alternative, consider using another format for data serialization, such as JSON. This will give you instant portability to a wide array of languages, as well as the possibility for easy human inspection of the data.
Deserializing an object in Java that was serialized with C#'s built-in binary serialization would require you to implement C#'s deserialization logic in Java. That's a pretty involved process, so let's compare some options:
1. Use a third-party library for serialization which works for C# and Java.
2. Write a routine to serialize each object: one in C#, one in Java. This will be tedious and hard to maintain.
3. Implement C#'s serialization logic in Java, or vice versa. This will be difficult and time-consuming, and you likely won't get it right the first time.
I recommend option 1, using a third-party library. Here are two third-party libraries I've used and highly recommend.
Google ProtoBufs
Apache Thrift
You can use any cross-platform binary format. Your options include, among others:
Protobuf
BSON (Binary JSON)
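GZIP (strictly a compression format rather than a serialization format, so it would wrap one of the others)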
JSON and XML (herrrrp) are also options, albeit text-based ones.
One other option would be to base64-encode the data and decode it on the other side, although base64 makes a binary payload about a third larger (probably not a good idea).
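The size cost of base64 is easy to check. A minimal Python sketch, with random bytes standing in for whatever serialized blob you would actually send:

```python
import base64
import os

payload = os.urandom(3000)              # stand-in for any serialized binary blob
encoded = base64.b64encode(payload)     # ASCII-safe text form
decoded = base64.b64decode(encoded)     # what the receiver recovers

print(len(payload), len(encoded))
```

Every 3 input bytes become 4 output characters, so the overhead is fixed at roughly a third regardless of the content.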
I am working on a system that has components written in the following languages:
C
C++
C#
PHP
Python
These components all use (infrequently changing) data that comes from the same source and can be cached and accessed from memcache for performance reasons.
Because different data types may be stored differently by different language APIs to memcache, I am wondering if it would be better to store ALL data as strings (objects will be stored as JSON strings).
However, this in itself may pose problems as strings (will almost surely) have different internal representations across the different languages, so I'm wondering about how wise that decision is.
As an aside, I am using the 1 writer, multiple readers 'pattern' so concurrency is not an issue.
Can anyone (preferably with ACTUAL experience of doing something similar) advise on the best format/way to store data in memcache so that it may be consumed by different programming languages?
I think memcached primarily only understands byte[], and the representation of a byte is the same in all languages. You can serialize your objects using protocol buffers or a similar library and consume them in any other language. I've done this in my projects.
Regardless of the back-end chosen, (memcached, mongodb, redis, mysql, carrier pigeon) the most speed-efficient way to store data in it would be a simple block of data (so the back-end has no knowledge of it.) Whether that's string, byte[], BLOB, is really all the same.
Each language will need an agreed mechanism to convert objects to a storable data format and back. You:
Shouldn't build your own mechanism, that's just reinventing the wheel.
Should think about whether 'invalid' objects might end up in the back-end. (either because of a bug in a writer, or because objects from a previous revision are still present)
When it comes to choosing a format, I'd recommend two: JSON or Protocol Buffers. This is because their encoded size and encode/decode speed are among the smallest/fastest of all the available encodings.
Comparison
JSON:
Libraries available for dozens of languages, sometimes part of the standard library.
Very simple format - Human-readable when stored, human-writable!
No coordination required between different systems, just agreement on object structure.
No set-up needed in many languages, eg PHP: $data = json_encode($object); $object = json_decode($data);
No inherent schema, so readers need to validate decoded messages manually.
Takes more space than Protocol Buffers.
Protocol Buffers:
Generating tools provided for several languages.
Minimal size - difficult to beat.
Defined schema (externally) through .proto files.
Auto-generated interface objects for encoding/decoding, eg C++: person.SerializeToOstream(&output);
Support for differing versions of object schemas to add new optional members, so that existing objects aren't necessarily invalidated.
Not human-readable or writable, so possibly harder to debug.
Defined schema introduces some configuration management overhead.
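Because JSON carries no schema, the "validate decoded messages manually" point usually turns into a small checking function in each reader. A rough Python sketch follows; the `decode_person` helper and the `name`, `age`, and `email` fields are all hypothetical, and a real system might prefer a schema library instead.

```python
import json

def decode_person(raw):
    """Decode a JSON-encoded person and reject anything that does not match the agreed shape."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("missing or non-string 'name'")
    age = data.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("missing or non-integer 'age'")
    # A field added in a later revision: tolerate its absence in old objects.
    data.setdefault("email", None)
    return data

person = decode_person('{"name": "Ann", "age": 30}')
print(person)
```

Defaulting the newer field is the hand-written counterpart of the optional members that Protocol Buffers gives you through the schema.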
Unicode
When it comes to Unicode support, both handle it without issues:
JSON: Will typically escape non-ascii characters inside the string as \uXXXX, so no compatibility problem there. Depending on the library, it may be also possible to force UTF-8 encoding.
Protocol Buffers: Seem to use UTF-8, though I haven't found info in Google's documentation in 3-foot-high letters to that effect.
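The two JSON behaviours mentioned above can be seen with Python's standard `json` module, which escapes by default and keeps the characters when asked to. Other libraries expose the same choice under different option names.

```python
import json

text = "café"

escaped = json.dumps(text)                      # non-ASCII escaped as \uXXXX
utf8 = json.dumps(text, ensure_ascii=False)     # characters kept as-is

print(escaped)                 # "caf\u00e9"
print(utf8.encode("utf-8"))    # b'"caf\xc3\xa9"'
```

Both forms decode to the same string, so readers in other languages do not need to know which one the writer chose.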
Summary
Which one you go with will depend on how exactly your system will behave, how often changes to the data structure occur, and how all the above points will affect you.
Not going to lie, you could do it in Redis. Redis is a key-value database written for high performance, and it allows the transfer of data between languages through a number of different client libraries. Here is an example in Python and Java.
Edit 1: Code is untested. If you spot an error please let me know :)
Edit 2: I know I didn't use the preferred Redis client for Java, but the point still stands.
Python
import redis
r = redis.Redis()
r.set('test','123')
Java
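import org.jredis.RedisException;
import org.jredis.ri.alphazero.JRedisClient;
import static org.jredis.ri.alphazero.support.DefaultCodec.*;
class ExampleCode {
    public static void main(String[] args) throws RedisException {
        // Create the client inside main: a non-static field cannot be used from a static method.
        JRedisClient client = new JRedisClient();
        // Java string literals need double quotes; toStr decodes the raw bytes returned by get().
        System.out.println(toStr(client.get("test")));
    }
}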
I'm on a project that processes and reports on large sets of aggregatable row based data. There is a primary aggregation service and then many clients who can subscribe to different views of the data from that server. The objects are passed back and forth between the Java server and the C# clients encoded in JSON. We're noticing that the parsing of the objects is taking a lot of time and somewhat memory intensive. Have others used JSON for this purpose or seen similar behavior?
We used to use straight XML across the wire and had to use custom (i.e. manual) serialization for a lot of the objects. While not JSON, we did have performance hits due to this constraint. Once we migrated all our tech to a similar architecture we were able to switch to binary serialization, which worked much better.
However, on the objects where we had issues with performance due to size we made some modifications. Since we had access to the code on both ends (and both were C#) we were able to binary serialize the payload and then base64 encode it, since it had to be text across the wire. It did help a good bit in terms of object size and the serialization ran a bit faster.
Since you are going from Java to C# you won't really have that luxury. So the only thing I can think of in your case would be to try and optimize your parsing of the JSON response. You may be able to use some code profiling tools to help you identify portions that are causing you performance issues and then try to optimize those. Also, when serializing to JSON, make sure you use a string builder to build your final string. If you are doing standard concat operations it will kill performance as well.
Also, you might want to look around. I have seen several JSON serializers written for C# on the web, and some may be faster than what you are using.
Not sure if that helps you all that much but there is some info from things we have seen with string based message passing.
UPDATE: Just saw this on dotnetkicks: JSON.Net. It's an update from James for the Json.NET serializers. May help out.
I know that for Java there are any number of open-source JSON serializers and deserializers. We use FlexJSON.
JSON can be expensive to decode. If performance is an issue try using something like Hessian.
Is there an easy way to serialize data in c++ (either to xml or binary), and then deserialize the data in C#?
I'm working with some remote WINNT machines that won't run .Net. My server app is written entirely in C#, so I want an easy way to share simple data (key value pairs mostly, and maybe some representation of a SQL result set). I figure the best way is going to be to write the data to xml in some predefined format on the client, transfer the xml file to my server, and have a C# wrapper read the xml into a usable c# object.
The client and server are communicating over a tcp connection, and what I really want is to serialize the data in memory on the client, transfer the binary data over the socket to a c# memory stream that I can deserialize into a c# object (eliminating file creation, transfer, etc), but I don't think anything like that exists. Feel free to enlighten me.
Edit
I know I can create a struct in the c++ app and define it in c# and transfer data that way, but in my head, that feels like I'm limiting what can be sent. I'd have to set predefined sizes for objects, etc
Protocol Buffers might be useful to you.
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.
.NET ports are available from Marc Gravell and Jon Skeet.
I checked out all the mentioned projects, like Protocol Buffers, JSON, XML, etc., but after finding BSON I use it, for the following reasons:
Easy to use API
Available in many languages (C, C++, Haskell, Go, Erlang, Perl, PHP, Python, Ruby, C#, ...)
Binary, therefore very space-efficient and fast (fewer bytes -> less time)
Consistent across platforms (no problems with endianness, etc.)
Hierarchical. The data model is comparable to JSON (as the name suggests), so most data modelling tasks should be solvable.
No precompiler necessary
Widely used (MongoDB, many languages)
C++ doesn't have structural introspection (you can't find out the fields of a class at runtime), so there aren't general mechanisms to write a C++ object. You either have to adopt a convention and use code generation, or (more typically) write the serialisation yourself.
There are some libraries for standard formats such as ASN.1, HDF5, and so on, which are implementation-language neutral. There are also vendor-originated formats which serve the same purpose (e.g. Protocol Buffers, which Google released as open source).
If you're targeting a particular architecture and compiler, then you can also just dump the C++ object as raw bytes, and create a parser on the C# side.
Quite what is better depends how tightly coupled you want your endpoints to be, and whether the data is mainly numerical (HDF5), tree and sequence structures (ASN.1), or simple plain data objects (directly writing the values in memory)
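As a rough illustration of the "dump raw bytes and write a parser on the other side" option, here is what such a parser could look like, sketched in Python with the standard `struct` module rather than C#. The `Sample` struct, its field names, the little-endian byte order, and the 3 bytes of trailing padding are all assumptions; the real layout depends on the sender's compiler and architecture.

```python
import struct

# Assumed layout of a hypothetical C++ struct on a little-endian sender:
#   struct Sample { uint32_t id; float temperature; uint8_t status; };
# "<" = little-endian with no implicit alignment; "3x" = the trailing padding
# a typical compiler adds to round the struct up to 12 bytes.
SAMPLE = struct.Struct("<IfB3x")

def parse_sample(buf):
    """Decode one Sample from the raw bytes the sender copied out of memory."""
    if len(buf) != SAMPLE.size:
        raise ValueError("unexpected size %d" % len(buf))
    ident, temperature, status = SAMPLE.unpack(buf)
    return {"id": ident, "temperature": temperature, "status": status}

raw = SAMPLE.pack(7, 21.5, 1)   # stands in for the bytes received from the network
sample = parse_sample(raw)
print(sample)
```

The size check is the only protection against a header change on the sending side, which is why this approach couples the two endpoints so tightly.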
Other options would be:
creating a binary file that contains the data in the way you need it (not an easy or portable solution)
XML
YAML
plain text files
There are a lot of options you can choose from. Named pipes, shared memory, DDE, remoting... Depends on your particular need.
Quick googling gave the following:
Named pipes
Named Shared Memory
DDE
As mentioned already, Protocol Buffers are a good option.
If that option doesn't suit your needs, then I would look at sending the XML over to the client (you would have to prefix the message with the length so you know how much to read) and then using an implementation of IXmlSerializable, or using the DataContract/DataMember attributes in conjunction with the DataContractSerializer, to get your representation in .NET.
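The length prefix mentioned above can be sketched as follows. This is a Python illustration of the framing idea only, not .NET code: the 4-byte big-endian prefix and the function names are assumptions, and `io.BytesIO` stands in for the TCP connection.

```python
import io
import struct

def write_message(stream, payload):
    """Write a 4-byte big-endian length followed by the payload bytes."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)

def read_exact(stream, n):
    """Read exactly n bytes, looping because a socket may return fewer per call."""
    chunks = []
    remaining = n
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def read_message(stream):
    """Read one length-prefixed message and return its payload."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, length)

# io.BytesIO stands in for the TCP connection here.
wire = io.BytesIO()
write_message(wire, b"<data><k>v</k></data>")
write_message(wire, b"<data/>")
wire.seek(0)
first = read_message(wire)
second = read_message(wire)
print(first, second)
```

With the boundaries known, each payload can be handed to the XML deserializer as a complete document.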
I would recommend against using the marshaling attributes, as they aren't supported on things like List<T> and a number of other standard .NET classes which you would use normally.