How can I deserialize Python pickles in C#?

I have some Python data serialized as pickles and need to use it in a C# program. Is there any way to deserialize Python pickles in C#? I can't change the data format to JSON or anything else.

You say you can't change the program that generates the pickle. But surely you can write a separate Python program to read the pickle and write it out again as JSON?
import json, pickle
# Only unpickle files you trust: loading a pickle can execute arbitrary code.
with open("data.pickle", "rb") as fpick:
    with open("data.json", "w") as fjson:
        json.dump(pickle.load(fpick), fjson)
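The conversion above fails with a TypeError if the pickle holds values JSON cannot represent, such as sets, bytes, or dates. A minimal sketch of one way around that, assuming a `default` hook is acceptable. The function names and the chosen mappings (set to sorted list, bytes to base64 text, date to ISO string) are illustrative assumptions, not part of the original answer:

```python
import base64
import datetime
import json
import pickle


def to_jsonable(obj):
    """Fallback for values the json module cannot encode (assumed mappings)."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj, key=repr)  # sets become lists in a stable order
    if isinstance(obj, bytes):
        return base64.b64encode(obj).decode("ascii")
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError(f"cannot convert {type(obj).__name__} to JSON")


def pickle_bytes_to_json(data):
    """Unpickle trusted bytes and return the equivalent JSON text."""
    return json.dumps(pickle.loads(data), default=to_jsonable)
```

The C# side then has to know about the same conventions (for example, that a given field is base64 text), so it is worth keeping the mapping small and documented.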

Quote from the documentation:
The data format used by pickle is Python-specific. This has the
advantage that there are no restrictions imposed by external standards
such as XDR (which can’t represent pointer sharing); however it means
that non-Python programs may not be able to reconstruct pickled Python
objects.
So the answer to your question is no, you cannot deserialize it in C#. You will have to use an interoperable format such as XML or JSON if you need to communicate with other platforms.

You can try embedding IronPython and unpickling from there, then making the unpickled object available to the C# application.
Note that pickles are designed to serialize Python objects, so this approach only works if you have very simple objects with clear mappings to C# equivalents. It also requires that your IronPython environment have access to all modules defining the classes of all objects contained in the pickle (same as in CPython).
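To see why the unpickling environment needs access to those modules, it can help to list the classes a pickle refers to by name. A rough sketch using the standard `pickletools` module; the helper name `referenced_globals` is made up for illustration, and the sketch pins protocol 2 because that protocol names classes with a single GLOBAL opcode:

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(obj):
    """Return the (module, name) pairs that must be importable to unpickle obj."""
    # Protocol 2 records each class reference as one GLOBAL opcode
    # whose argument is the text "module name".
    data = pickle.dumps(obj, protocol=2)
    return [
        tuple(arg.split(" ", 1))
        for opcode, arg, _pos in pickletools.genops(data)
        if opcode.name == "GLOBAL"
    ]


# Plain containers need no imports; a Fraction needs the fractions module.
plain = referenced_globals({"a": [1, 2]})
fancy = referenced_globals(Fraction(1, 3))
```

A pickle whose list is empty contains only built-in data and is the easiest kind to map onto C# types.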
You should try to serialize your data some other more interoperable way (such as JSON or XML) if possible.

Pyrolite has an Unpickler class that will turn a pickle into an object.

There is now a NuGet package, Razorvine.Pickle, for serializing and deserializing pickle files in .NET.

Related

Deserializing C# Binary in Java

I have a system where a serialized file is created with a C# program and then deserialized in another C# program. I'm wondering if it's possible to do binary deserialization of a C# file in Java?
Thanks
You can try using a serializer that has implementations for both platforms and outputs data in a platform-independent format, such as Protobuf.
Or if you need a full RPC over network between Java and C# application, you can go for Apache Thrift.
I assume you are speaking of an object serialized with BinaryFormatter. The answer then is a qualified "yes," since Java implements a Turing machine. However, this will not be straightforward.
In this case the data will be in a format most suitable for consumption by a .NET runtime, and will contain information about .NET types and assemblies. You would have to implement your own reader for this format, and then have some way to map between .NET and Java types. (The Mono project implements a BinaryFormatter compatible with .NET's, so you could use their reader implementation as a reference.)
As an alternative, consider using another format for data serialization, such as JSON. This will give you instant portability to a wide array of languages, as well as the possibility for easy human inspection of the data.
Deserializing an object in Java that was serialized with C#'s built-in binary serialization would require you to implement C#'s deserialization logic in Java. That's a pretty involved process, so let's compare some options:
1) Use a third-party library for serialization that works for both C# and Java.
2) Write a routine to serialize each object, one in C# and one in Java. This will be tedious and hard to maintain.
3) Implement C#'s serialization logic in Java, or vice versa. This will be difficult and time-consuming, and you likely won't get it right the first time.
I recommend option 1, a third-party library. Here are two third-party libraries I've used and highly recommend:
Google ProtoBufs
Apache Thrift
You can use any cross-platform binary format. Your options include, among others:
Protobuf
BSON (Binary JSON)
GZIP
JSON and XML (herrrrp) are also options, albeit text-based ones.
One other option would be to base64-encode the data, and decode it on the other side; albeit you may get a huge payload because it's binary (probably not a good idea).
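The size cost of base64 is easy to quantify: every 3 bytes of input become 4 output characters, so the payload grows by about a third. A small sketch (the helper name is hypothetical):

```python
import base64


def base64_overhead(payload: bytes) -> float:
    """Return encoded size divided by raw size for a non-empty payload."""
    return len(base64.b64encode(payload)) / len(payload)


raw = bytes(range(256)) * 12           # 3072 bytes of arbitrary binary data
ratio = base64_overhead(raw)           # 4096 / 3072
round_trip = base64.b64decode(base64.b64encode(raw))
```

Note that base64 only makes the bytes safe to carry as text; the receiver still has to understand whatever format the bytes are in.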

Is the output of Java binary serialization the same as that of C# serialization?

I am working on a project written in C# in which some data is serialized, and now I need the same values to be serialized in Java.
Can I get Java serialized output that is equivalent to the C# serialized output? The C# code is already finished, so I can't change the reader format. I need to send from Java the same data that is currently sent from C#.
So, is the serialized output of both languages the same?
Certainly not if you use the default binary serialization mechanisms of each platform. It would be almost inconceivable that they could be compatible.
You should pick a platform-neutral serialization format, such as Protocol Buffers, Thrift, YAML, JSON, XML (with custom serializers) etc.
Binary serialization output is almost never the same, because the serialization implementations differ. That's why we have XML, JSON, and other interoperable formats, which we can use regardless of the technology.

Deserializing a java serialized file in C#

I have a Java project that serializes some objects and ints to a file with calls like
ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeInt(RANK_SIZE);
oos.writeObject(_firstArray);
oos.writeObject(_level[3]);
oos.writeObject(_level[4]);
...
Now I have trouble deserializing that file with C# (I tried the BinaryFormatter), as it apparently can only deserialize the whole file into a single object (or arrays, but I have different objects with different lengths).
I tried first to port the generation of these files to C#, but failed miserably. These files are small and I don't have to generate them myself.
Do I need to alter the way those files are generated in Java or can I deserialize it in any way?
There is nothing in the standard .NET framework which knows how to deserialize Java objects. You could, in theory, use the Java serialization spec and write your own deserialization code in C#. But it would be a large and complex project, and I don't think you'd find many customers interested in using it.
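To give a feel for what "write your own deserialization code" involves, here is a rough sketch that reads only the very first value of such a file. It assumes, based on the Java Object Serialization Specification, that the stream starts with the magic number 0xACED and version 5, and that a leading writeInt is emitted as a short block-data record (tag 0x77, one length byte, then the big-endian int). The function name is hypothetical, and the objects that follow in the stream need far more machinery than this:

```python
import struct

STREAM_MAGIC = 0xACED   # stream header constants from the serialization spec
STREAM_VERSION = 5
TC_BLOCKDATA = 0x77     # tag for a short block of primitive data


def read_leading_int(data: bytes) -> int:
    """Read an int written by writeInt at the start of an ObjectOutputStream."""
    magic, version = struct.unpack_from(">HH", data, 0)
    if magic != STREAM_MAGIC or version != STREAM_VERSION:
        raise ValueError("not a Java serialization stream")
    tag, length = struct.unpack_from(">BB", data, 4)
    if tag != TC_BLOCKDATA or length < 4:
        raise ValueError("stream does not start with a primitive data block")
    (value,) = struct.unpack_from(">i", data, 6)
    return value
```

Everything written by writeObject after this point carries class descriptors, field lists, and back-references, which is where the real complexity lies.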
Far, far easier would be to change the way you serialize the data to use a portable format: JSON, or some form of XML. There are both Java and C# libraries to deal with such formats and the total effort would be orders of magnitude less. I would vastly prefer this second approach.
Maybe an at first sight counterintuitive solution might be to serialize them to some commonly understandable format like JSON? I'm not even sure the binary format for serialized objects in Java is guaranteed to remain unchanged between Java versions....
Jackson is my personal favorite when it comes to Java JSON libraries.
http://www.ikvm.net/ IKVM can do it perfectly.

My C# client app needs to deserialize complex JSON sent by a Java app

If my C# client app needs to deserialize complex JSON from a Java server app, what is the best option I have?
Here are two conditions to consider:
1) Speed is the most important.
2) The JSON could include information about the Java data type, which the C# client app needs to recognize and convert to the corresponding C# type. For example,
...,"Variable1" : [ "java.math.BigDecimal", 0E-8 ],
"Variable2" : [ "com.xmlasia.x5.refdata.instrument.model.MarginGroup"],...
IMO, because of point 2, the only way is to build my own deserializer. Am I right?
Regarding point 1, if I use Json.NET to deserialize the JSON and then convert it to an ArrayList, will it have a significant impact on speed? Is there a better way?
The disadvantage of the ArrayList approach is that the extractJson method gets really messy, and I think ArrayList is slow.
I think that the easiest would be to build some bridge that will translate this JSON to something more interoperable.
It is unlikely that "speed is the most important"; otherwise the data would need to be sent in a binary format. However, all the main JSON parsers are fast, so this is unlikely to be an issue.
If a parser just ignores the Java data type and maps fields by name to the fields in your .NET objects, you may be OK. Otherwise you need a JSON parser that will give you back a dictionary of the fields so you can process them yourself. There should be no need to write your own string-processing code to decode the JSON; that is a solved problem.
There are lots of JSON libraries for .NET, but as it is a long time since I looked at them, I can't recommend the best one for your use.
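The "dictionary of fields, then process them yourself" approach can be sketched as follows. This is shown in Python for brevity; the same shape applies to a C# parser that returns a generic object tree. The converter table and function names are assumptions for illustration, and only the BigDecimal tag from the question is handled:

```python
import json
from decimal import Decimal

# Hypothetical table mapping Java type names to converters.
CONVERTERS = {
    "java.math.BigDecimal": lambda v: v if isinstance(v, Decimal) else Decimal(str(v)),
}


def decode_tagged(value):
    """Recursively replace ["java.type.Name", payload] pairs with converted values."""
    if isinstance(value, dict):
        return {key: decode_tagged(item) for key, item in value.items()}
    if isinstance(value, list):
        is_tagged = len(value) == 2 and isinstance(value[0], str) and value[0] in CONVERTERS
        if is_tagged:
            return CONVERTERS[value[0]](value[1])
        return [decode_tagged(item) for item in value]
    return value


def load_tagged(text):
    """Parse JSON text, keeping decimals exact, then resolve the type tags."""
    return decode_tagged(json.loads(text, parse_float=Decimal))
```

The standard parser does the string handling; only the walk over the parsed tree is custom, which keeps the type-specific code in one small table.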

Serialize in C++ then deserialize in C#?

Is there an easy way to serialize data in c++ (either to xml or binary), and then deserialize the data in C#?
I'm working with some remote WINNT machines that won't run .Net. My server app is written entirely in C#, so I want an easy way to share simple data (key value pairs mostly, and maybe some representation of a SQL result set). I figure the best way is going to be to write the data to xml in some predefined format on the client, transfer the xml file to my server, and have a C# wrapper read the xml into a usable c# object.
The client and server are communicating over a tcp connection, and what I really want is to serialize the data in memory on the client, transfer the binary data over the socket to a c# memory stream that I can deserialize into a c# object (eliminating file creation, transfer, etc), but I don't think anything like that exists. Feel free to enlighten me.
Edit
I know I can create a struct in the c++ app and define it in c# and transfer data that way, but in my head, that feels like I'm limiting what can be sent. I'd have to set predefined sizes for objects, etc
Protocol Buffers might be useful to you.
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.
.NET ports are available from Marc Gravell and Jon Skeet.
I checked out all the mentioned projects, such as Protocol Buffers, JSON, XML, etc., but after I found BSON I chose it for the following reasons:
Easy-to-use API
Available in many languages (C, C++, Haskell, Go, Erlang, Perl, PHP, Python, Ruby, C#, ...)
Binary, therefore very space-efficient and fast (fewer bytes -> less time)
Consistent across platforms (no problems with endianness, etc.)
Hierarchical. The data model is comparable to JSON (as the name suggests), so most data modelling tasks should be solvable.
No precompiler necessary
Widely used (MongoDB, many languages)
C++ doesn't have structural introspection (you can't find out the fields of a class at runtime), so there aren't general mechanisms to write a C++ object. You either have to adopt a convention and use code generation, or (more typically) write the serialisation yourself.
There are some libraries for standard formats such as ASN.1, HDF5, and so on which are implementation language neutral. There are proprietary libraries which serve the same purpose (eg protocol buffers).
If you're targeting a particular architecture and compiler, then you can also just dump the C++ object as raw bytes, and create a parser on the C# side.
Quite what is better depends how tightly coupled you want your endpoints to be, and whether the data is mainly numerical (HDF5), tree and sequence structures (ASN.1), or simple plain data objects (directly writing the values in memory)
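As a sketch of the "dump raw bytes and parse them on the other side" option, the reader has to hard-code the exact memory layout. The layout below is a made-up example (little-endian, packed with no padding: a 32-bit int, a double, and an 8-byte null-padded name); a real C++ struct would only match it if the compiler is told to pack it that way:

```python
import struct

# Assumed layout of a hypothetical packed C++ struct:
#   int32_t id; double value; char name[8];
# "<" means little-endian with no alignment padding.
RECORD = struct.Struct("<id8s")


def parse_record(raw: bytes):
    """Decode one fixed-size record into (id, value, name)."""
    ident, value, name = RECORD.unpack(raw)
    return ident, value, name.rstrip(b"\x00").decode("ascii")


sample = RECORD.pack(7, 2.5, b"abc")   # stands in for bytes sent by the C++ side
```

This shows the coupling the answer warns about: any change to field order, size, padding, or byte order on the C++ side silently breaks the reader.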
Other options would be:
creating a binary file that contains the data in the way you need it (not an easy or portable solution)
XML
YAML
plain text files
There are a lot of options you can choose from. Named pipes, shared
memory, DDE, remoting... Depends on your particular need.
Quick googling gave the following:
Named pipes
Named Shared Memory
DDE
As mentioned already, Protocol Buffers are a good option.
If that option doesn't suit your needs, then I would look at sending the XML over to the client (you would have to prefix the message with the length so you know how much to read) and then using an implementation of IXmlSerializable, or the DataContract/DataMember attributes in conjunction with the DataContractSerializer, to get your representation in .NET.
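The length prefix mentioned above can be sketched like this. The 4-byte big-endian prefix and the function names are assumptions chosen for illustration; any fixed-size prefix both sides agree on works the same way:

```python
import io
import struct


def write_message(stream, payload: bytes) -> None:
    """Write a 4-byte big-endian length followed by the payload."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, looping because reads may return fewer."""
    chunks = []
    while count > 0:
        chunk = stream.read(count)
        if not chunk:
            raise EOFError("stream ended in the middle of a message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def read_message(stream) -> bytes:
    """Read one length-prefixed message and return its payload."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, length)
```

With a real TCP connection the same functions can be used on the file object returned by `socket.makefile("rwb")`; the receiver then hands the payload to its XML deserializer.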
I would recommend against using the marshaling attributes, as they aren't supported on things like List<T> and a number of other standard .NET classes which you would use normally.
