Deserialisation/parsing of a custom message format - c#

I am currently looking into writing a fast deserialiser/parser for a custom message format which is similar to BNF syntax. There are maybe 50 different objects involved.
The grammar of the objects contains a recursive definition which is the biggest problem for me at the moment.
Do you know any good examples or would you write your own lexer using regular expressions and parsing them using a FIFO queue for the embedded messages?
In Perl I am at the moment converting the messages into JSON messages and using a generic parser, but I am not so sure if this makes sense in C#.
Messages look like this:
"{key1=value1|key2={key3=value3}}".

The following URL shows examples of serialization/deserialization of JSON in C# by Scott Gu and the .NET 3.5 Framework:
http://weblogs.asp.net/scottgu/archive/2007/10/01/tip-trick-building-a-tojson-extension-method-using-net-3-5.aspx
Right before the summary you will find this sentence:
"Note: In addition to the JavaScriptSerializer class, .NET 3.5 also now includes a new System.Runtime.Serialization.DataContractJsonSerializer class that you can use for JSON serialization/deserialization."
Hope this helps,
Andrew
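For the recursive part of the grammar, a small hand-written recursive-descent parser may be enough, with no regular expressions or queue needed. The following is only a sketch: it assumes the format is exactly what the sample message shows ("{" key "=" value separated by "|" and closed by "}", where a value is either plain text or another embedded message) and that keys and plain values never contain the characters { } = |. The MessageParser class and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MessageParser
{
    // Parses "{k=v|k={...}}" into nested dictionaries.
    // Each value is either a string or a Dictionary<string, object>.
    public static Dictionary<string, object> Parse(string text)
    {
        int pos = 0;
        var result = ParseMessage(text, ref pos);
        if (pos != text.Length)
            throw new FormatException("Unexpected trailing input at position " + pos);
        return result;
    }

    // message := '{' [ pair ( '|' pair )* ] '}'   (assumed grammar)
    private static Dictionary<string, object> ParseMessage(string s, ref int pos)
    {
        Expect(s, ref pos, '{');
        var fields = new Dictionary<string, object>();
        if (Peek(s, pos) == '}') { pos++; return fields; } // empty message "{}"
        while (true)
        {
            string key = ReadToken(s, ref pos);
            Expect(s, ref pos, '=');
            // The recursion happens here: a value starting with '{' is an embedded message.
            object value = Peek(s, pos) == '{'
                ? (object)ParseMessage(s, ref pos)
                : ReadToken(s, ref pos);
            fields[key] = value;
            if (Peek(s, pos) == '|') { pos++; continue; }
            Expect(s, ref pos, '}');
            return fields;
        }
    }

    // Reads characters up to the next structural character.
    private static string ReadToken(string s, ref int pos)
    {
        int start = pos;
        while (pos < s.Length && "{}=|".IndexOf(s[pos]) < 0) pos++;
        return s.Substring(start, pos - start);
    }

    private static char Peek(string s, int pos) => pos < s.Length ? s[pos] : '\0';

    private static void Expect(string s, ref int pos, char c)
    {
        if (Peek(s, pos) != c)
            throw new FormatException("Expected '" + c + "' at position " + pos);
        pos++;
    }
}
```

Calling MessageParser.Parse("{key1=value1|key2={key3=value3}}") gives a dictionary whose "key2" entry is itself a dictionary. Mapping those dictionaries onto the 50 or so object types would be a separate step.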

Related

Removing and  to get valid XML?

I have a WCF service (.NET C#) that sometimes returns for example 
 and  which is not correct XML.
I guess I could build a translator that is applied to each string field before sending the response, but it feels a bit sketchy: I do not know what to look for (more than the above) or what to translate it into. Maybe there is an existing solution for this?
These characters are allowed in XML 1.1 but not in XML 1.0. XML 1.1 has not been a great success and Microsoft has never supported it.
Does the XML declaration at the start of the file say version="1.1"?
A clean way to handle this would be to process the file using a parser that does support XML 1.1, converting it to XML 1.0 in the process. For example, you could do this with a simple Java SAX application, or XSLT if you prefer.
Quite what you want to translate these characters into is largely up to you. It depends whether they have any significance. If you want to translate them losslessly into XML 1.0, you could convert them to processing instructions such as <?char x1E?>.
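If the translation has to happen on the .NET side before the response is sent, one possible sketch is to scan each string field and replace every character that XML 1.0 does not allow. The example below uses XmlConvert.IsXmlChar from System.Xml to detect such characters. The XmlSanitizer class and the "[x1E]" placeholder style are assumptions made for illustration; a processing instruction like the one suggested above would have to be emitted by the XML writer rather than placed inside a string value.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlSanitizer
{
    // Replaces characters that are not legal in XML 1.0 with a visible
    // placeholder such as "[x1E]". The placeholder format is an assumption.
    public static string ReplaceInvalidXmlChars(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < input.Length && char.IsSurrogatePair(c, input[i + 1]))
            {
                // A valid surrogate pair encodes a character above U+FFFF; keep both halves.
                sb.Append(c).Append(input[i + 1]);
                i++;
            }
            else
            {
                sb.Append("[x").Append(((int)c).ToString("X2")).Append(']');
            }
        }
        return sb.ToString();
    }
}
```

Tab, carriage return and line feed pass through unchanged because XML 1.0 allows them.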
As far as I know, &#xD; stands for a carriage return, which is part of a line break.
The behavior is similar to Environment.NewLine, so you can replace it easily:
string text = yourString.ToString().Replace("&#xD;", Environment.NewLine);
I don't know if this is what you're searching for, but that's the only thing that comes to mind right now.
Hope it helps. :)

How to use decimal entity codes in XML instead of hexadecimal entity codes

In my C# application, I am reading data from a source which has \r \n characters, and converting them to an XmlDocument. When using the CreateElement method of XmlDocument, it escapes them using hexadecimal entity codes, like &#xD; and &#xA;.
I have to send this XML to a 3rd party application, which accepts only decimal entity codes. So I have to send them as &#13; and &#10;.
How can I configure XmlDocument to use decimal entity codes?
Since the receiving app is broken, the easiest way is to introduce a post-processing step that converts your XML string into the "acceptable" format. So string.Replace() should help you here for sure. Not efficient, but very effective. Sad but true.
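A post-processing step along these lines could be sketched with a regular expression instead of one string.Replace() call per character, so that any hexadecimal character reference is rewritten. This is only a sketch: the CharRefConverter class is a made-up name, and the code works on the serialized string without understanding XML, so it would also rewrite text that merely looks like a reference inside a CDATA section or a comment.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class CharRefConverter
{
    // Matches hexadecimal character references such as &#xD; or &#x1F600;
    private static readonly Regex HexRef = new Regex(@"&#x([0-9A-Fa-f]+);");

    // Rewrites every hexadecimal reference as its decimal form, e.g. &#xD; becomes &#13;
    public static string HexToDecimal(string xml)
    {
        return HexRef.Replace(xml, match =>
        {
            int codePoint = int.Parse(match.Groups[1].Value,
                                      NumberStyles.HexNumber,
                                      CultureInfo.InvariantCulture);
            return "&#" + codePoint.ToString(CultureInfo.InvariantCulture) + ";";
        });
    }
}
```

It would be applied to the output of the document, for example CharRefConverter.HexToDecimal(doc.OuterXml), just before sending.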
When you have a receiving application that isn't accepting correct XML, the best thing to do is to change the receiving application. (Similarly, if you have a sending application that isn't sending correct XML, it's best to change the sending application). This is what standards are all about: if you want to get the benefits of XML, everyone has to play by the rules.
Producing constrained XML is generally going to be difficult and costly, because it constrains your choice of tools, or you have to write stuff by hand rather than using off-the-shelf software.
I don't think there are many tools that give you control over how numeric character references are written, but one that does is Saxon (PE or higher). You can use the extension property saxon:character-representation as an additional serialization property: see http://www.saxonica.com/documentation/index.html#!extensions/output-extras/serialization-parameters.

Usage of JSON for a daily activity journal

To keep track of my new year resolutions I created a file daily.log in the following format.
8:40 AM 1/2/2013
begin:755am
activity:enquired about 3x3 black board;bought book [beginning html 5]
waste:facebook;
meeting:old friend;mechanic
programming:none
blogpost:[asp.net deployment]
do:buy black board
done:
end:1045pm
I am in the process of creating a simple C# console application which would ask me a few questions and fill this file accordingly. One of the future features to this tool would be to display a simple dashboard style web page for measuring the progress of resolutions among other things.
I would like to use a data serialization or configuration file format for storing daily activity information in this manner, because mature tools are available for these formats rather than for plain text.
I never used JSON before and am wondering whether the JSON format can be used independently with C# (no javascript involved), and even if I can, whether the usage of JSON is appropriate in this case.
If not JSON, what about its superset YAML? Or are there any other alternatives that suit this purpose well?
You can use JSON.NET in C# without using javascript. And I believe this data can be modeled in JSON format.
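As a rough illustration of how one day from daily.log might be modeled, here is a sketch that round-trips an entry through JSON from plain C#. To keep it dependency-free it uses the built-in System.Text.Json serializer (available in .NET Core 3.0 and later) rather than JSON.NET; that is a substitution, not what this answer used. The DailyEntry class, its property names, and the choice to turn the semicolon-separated items into lists are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model of one day in daily.log.
public class DailyEntry
{
    public string Date { get; set; }
    public string Begin { get; set; }
    public List<string> Activity { get; set; } = new List<string>();
    public List<string> Waste { get; set; } = new List<string>();
    public List<string> Meeting { get; set; } = new List<string>();
    public List<string> Programming { get; set; } = new List<string>();
    public List<string> BlogPost { get; set; } = new List<string>();
    public List<string> Do { get; set; } = new List<string>();
    public List<string> Done { get; set; } = new List<string>();
    public string End { get; set; }
}

public static class DailyLog
{
    // Serializes an entry to indented JSON text.
    public static string ToJson(DailyEntry entry)
    {
        var options = new JsonSerializerOptions { WriteIndented = true };
        return JsonSerializer.Serialize(entry, options);
    }

    // Reads an entry back from JSON text.
    public static DailyEntry FromJson(string json)
    {
        return JsonSerializer.Deserialize<DailyEntry>(json);
    }
}
```

The console application would fill a DailyEntry from the answers and write DailyLog.ToJson(entry) to a file; the dashboard could later read the same file back with DailyLog.FromJson.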
If your goal is to work with external tools to have them recognize and be able to work with your files, a better bet than JSON would be to use XML. This format is stricter (and you can use XML Schema to validate the format) and there are way more tools that are able to work with XML than there are for JSON.
The .NET Framework also contains extensive support for XML, in the System.Xml namespace (see http://msdn.microsoft.com/en-us/library/system.xml(v=vs.100).aspx).
That being said, there is no reason why JSON would not work with C#. I have personally used the JSON.NET library for most JSON work and it works beautifully (see http://james.newtonking.com/projects/json-net.aspx). Mind you, the data you show in your example is not valid JSON.
Good luck!

Deserializing a java serialized file in C#

I have a Java project that serializes some objects and ints to a file with functions like
ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeInt(RANK_SIZE);
oos.writeObject(_firstArray);
oos.writeObject(_level[3]);
oos.writeObject(_level[4]);
...
Now I have trouble deserializing that file with C# (tried the BinaryFormatter) as it apparently can only deserialize the whole file into a single object (or arrays, but I have different objects with different lengths).
I tried first to port the generation of these files to C#, but failed miserably. These files are small and I don't need to generate them myself.
Do I need to alter the way those files are generated in Java or can I deserialize it in any way?
There is nothing in the standard .NET framework which knows how to deserialize Java objects. You could, in theory, use the Java serialization spec and write your own deserialization code in C#. But it would be a large and complex project, and I don't think you'd find many customers interested in using it.
Far, far easier would be to change the way you serialize the data to use a portable format: JSON, or some form of XML. There are both Java and C# libraries to deal with such formats and the total effort would be orders of magnitude less. I would vastly prefer this second approach.
Maybe an at first sight counterintuitive solution might be to serialize them to some commonly understandable format like JSON? I'm not even sure the binary format for serialized objects in Java is guaranteed to remain unchanged between Java versions....
Jackson is my personal favorite when it comes to Java JSON libraries.
http://www.ikvm.net/ IKVM can do it perfectly.

XML parser: implementation design

I've always assumed XML documents are a convenient way to store information. Now I've found in XML a good way to "instruct" my application. Delicious.
My problem is integrating the XML parsing into my application classes. I'm using the C# System.Xml interface, which I think is good and portable (right?).
Is there any standard interface which defines methods to organize recursion on xml tags, or methods to implement common xml implementations (maybe in a base class)?
Initially I can think to write an interface which defines
void Read(XmlReader xml);
void Write(XmlWriter xml);
But what about nested tags, common tags and so on...
P.S.: I don't think to implement this using LINQ, except in the case it's supported also in Mono (how to determine this)?
Thank you very much! :)
I think you might be looking for Serialization, this is a beginners Tutorial on Serialization
As Binary Worrier mentioned, XML serialization is a simple and efficient option. You can also use LINQ to XML, which is supported in Mono since version 2.0 (released in October 2008).
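To show how XML serialization deals with the nested tags from the question, here is a minimal sketch using XmlSerializer. The Step class and the StepXml helper are hypothetical names; the point is that a class holding a list of its own type lets the serializer handle the recursion, so no hand-written Read/Write methods are needed.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical instruction type; nested <Step> elements map to the Children list.
public class Step
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlElement("Step")]
    public List<Step> Children { get; set; } = new List<Step>();
}

public static class StepXml
{
    // Serializes a Step tree to an XML string.
    public static string Write(Step root)
    {
        var serializer = new XmlSerializer(typeof(Step));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, root);
            return writer.ToString();
        }
    }

    // Rebuilds a Step tree from an XML string.
    public static Step Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Step));
        using (var reader = new StringReader(xml))
        {
            return (Step)serializer.Deserialize(reader);
        }
    }
}
```

With this, StepXml.Read applied to <Step name="outer"><Step name="inner" /></Step> yields an object whose Children list contains the inner step.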
Using xml to "instruct" your app seems backwards to me. I'd be more inclined to use an IronPython script if that was my aim. Xml, normally, is intended to serialize data. Sure you can write a language via xml, but ultimately it is fighting the system. You would also massively struggle to invoke methods (easy enough to set properties etc via XmlSerializer, though).
Here's A 3 minute guide to embedding IronPython in a C# application to show what might, IMO, be a better way to "instruct" a C# application via a separate script file.
