Parse byte as byte, not string - c#

I receive some JSON from a Java third-party system that contains Avro schemas in JSON format. An example looks like this:
{"type":"record", "name":"AvroRecord", "namespace":"Parent.Namespace", "fields": [{"name":"AvroField", "type":"bytes", "default":"\u00FF"}]}
I parse this JSON to do some C# code generation. The result would look like this:
public partial class AvroRecord
{
[AvroField(Name = "AvroField", Type = "bytes", DefaultValueText = "ÿ")]
public byte[] AvroField { get; set; }
public AvroRecord() { this.AvroField = new byte[] { 255 }; }
}
Eventually, from the C# representation of the schema, I need to infer back the original schema. Once I get that inferred schema, it will be sent over to the original system for comparison. That is why I want to keep the original string value for the default value, since I don't know if:
{"type":"record", "name":"AvroRecord", "namespace":"Parent.Namespace", "fields": [{"name":"AvroField", "type":"bytes", "default":"\u00FF"}]}
and
{"type":"record", "name":"AvroRecord", "namespace":"Parent.Namespace", "fields": [{"name":"AvroField", "type":"bytes", "default":"ÿ"}]}
will result in an exact match or it will have a problem.
I use JSON.NET to convert from the raw schema as a string to something more useful that I can work with:
JToken token = JToken.Parse(schema);
Is there a way in JSON.NET or any other JSON parsing library to control the parsing and copy a value without being parsed? Basically, a way to avoid "\u00FF" becoming "ÿ"
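For context, "\u00FF" and "ÿ" are the same string once any JSON parser has decoded them, and an Avro bytes default maps each code point 0-255 to one byte value. The following is a minimal sketch of those two facts, not a way to preserve the raw text. The class and method names are made up for illustration, and whether the remote system compares parsed JSON or raw text is not known here.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper; none of these names come from the original post.
public static class AvroDefaultSketch
{
    // True when both texts parse to the same JSON tree, regardless of how
    // individual characters were escaped in the source text.
    public static bool SameJson(string a, string b)
    {
        return JToken.DeepEquals(JToken.Parse(a), JToken.Parse(b));
    }

    // Avro "bytes" defaults are JSON strings whose code points 0-255
    // stand for the byte values 0-255.
    public static byte[] DefaultToBytes(string defaultText)
    {
        if (defaultText.Any(c => c > 255))
            throw new ArgumentException("Not a valid Avro bytes default.", nameof(defaultText));
        return defaultText.Select(c => (byte)c).ToArray();
    }

    // Re-serialises a token with every non-ASCII character written as a \u escape.
    public static string ToEscapedJson(JToken token)
    {
        var settings = new JsonSerializerSettings
        {
            StringEscapeHandling = StringEscapeHandling.EscapeNonAscii
        };
        return JsonConvert.SerializeObject(token, settings);
    }
}
```

If the escaped form has to be sent back, ToEscapedJson gets close, but the casing of the hex digits and the whitespace may still differ from the original text, so a byte-for-byte comparison on the other side could still fail.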

Related

Why is my array serialising into a string [duplicate]

I have created the following test vector class:
public class TestVector
{
public UInt16 MaxBlockSize { get; }
public byte[] Payload { get; set; }
public TestVector(ushort maxBlockSize, byte[] payload)
{
MaxBlockSize = maxBlockSize;
Payload = payload;
}
}
In my test, I am populating a list of vectors defined as per:
private static HashSet<TestVector> myVectors = new HashSet<TestVector>();
I then serialise "myVectors" with JsonConvert and write the result to a file:
var jsonOutput = JsonConvert.SerializeObject(myVectors, new JsonSerializerSettings { ObjectCreationHandling = ObjectCreationHandling.Replace });
File.WriteAllText(@"e:\MyJson.json", jsonOutput);
Here is typical JSON output (with a list/HashSet of 2 vectors):
[
{
"MaxBlockSize": 256,
"Payload": "bjQSAAAAAABvNBIAAAAA..."
},
{
"MaxBlockSize": 256,
"Payload": "VjQSVzQSWDQS...."
}
]
Now what I do not get is why "Payload" is serialised as a string and not as an array.
My questions are:
What is this string format (ASCII code maybe?) and why is it used instead of a byte[] type of representation?
Is there a way to get the "Payload" byte[] to be printed in a more readable way?
What is this string format (ASCII code maybe?) and why is it used instead of a byte[] type of representation?
See the Json.NET documentation for primitive types:
Byte[] -> String (base 64 encoded)
So the format is base64. This is probably used since it is a reasonably efficient encoding of binary data, encoding 6 bits per character. Encoding values as an array would use much more space.
It is somewhat common to encode images or similar chunks of data as byte arrays. Since these can be large it is useful to keep the size down as much as possible.
Is there a way to get the "Payload" byte[] to be printed in a more readable way?
There are various base64 converters online that can convert it to hex, oct, string, or whatever format you prefer to view your binary data in. But for many applications it is not very useful since the binary data often represents something that is already serialized in some way.
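If a more readable form is wanted from code rather than from an online converter, here is a minimal sketch of two options: decoding the base64 text to a hex dump, and a write-only converter that emits a byte[] as a JSON array of numbers. The class names are made up for illustration, and reading such arrays back is left to Json.NET's default handling, which is not verified here.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical converter: writes byte[] as [1,2,3] instead of a base64 string.
public class ByteArrayAsNumbersConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(byte[]);
    }

    // Reading is left to Json.NET's default behaviour.
    public override bool CanRead
    {
        get { return false; }
    }

    public override object ReadJson(JsonReader reader, Type objectType,
        object existingValue, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var bytes = value as byte[];
        if (bytes == null)
        {
            writer.WriteNull();
            return;
        }
        writer.WriteStartArray();
        foreach (byte b in bytes)
        {
            writer.WriteValue(b);
        }
        writer.WriteEndArray();
    }
}

public static class PayloadViewer
{
    // Decodes a base64 payload and renders it as dash-separated hex pairs.
    public static string ToHex(string base64)
    {
        return BitConverter.ToString(Convert.FromBase64String(base64));
    }
}
```

The converter would be passed to the serializer, for example JsonConvert.SerializeObject(myVectors, new ByteArrayAsNumbersConverter()). The array form is considerably larger than base64, which is the trade-off described above.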

Smartly replace strings

I am working with a JSON API. As C# identifiers cannot contain characters like - (minus) or . (point), I replace each such character with _ (underscore). The replacement is done on the JSON response while it is still a string, so that every attribute name containing a - or a . matches the corresponding attribute name in the class it will be deserialized into.
To make it clearer, here are some examples:
I receive the following JSON : { "id": 1, "result": [ { "data": [ { "adm-pass": "" } ] } ] }
In the class I want to deserialize into I have this attribute : public String adm_pass {get; set;}
So I replace the minus with an underscore so that the NewtonSoft parser can deserialize it accordingly.
My problem is that I sometimes get negative integers in my JSON. If I do the string replacement on {"beta" : -1}, the -1 (an integer here) becomes _1, which cannot be deserialized and raises a parsing exception.
Is there a way to do the replacement smartly so I can avoid this error?
For example, if - is followed by a digit, it is not replaced.
If there is no such way, is there another solution for this kind of problem?
Newtonsoft allows you to specify the exact name of the JSON property, which it will use to serialize/deserialize.
So you should be able to do this
[JsonProperty("adm-pass")]
public String adm_pass { get; set; }
This way you are not restricted to name your properties exactly as the JSON property names. And in your case, you won't need to do a string replace.
Hope this helps.
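A minimal end-to-end sketch of the attribute approach, assuming a small class with just the two properties mentioned in the question (the class name Entry and the sample values are made up for illustration):

```csharp
using Newtonsoft.Json;

// Hypothetical target class; only adm-pass and beta come from the question.
public class Entry
{
    // Maps the JSON name "adm-pass" onto a valid C# identifier.
    [JsonProperty("adm-pass")]
    public string adm_pass { get; set; }

    [JsonProperty("beta")]
    public int beta { get; set; }
}

public static class EntryParser
{
    // The JSON text is passed through untouched, so negative numbers survive.
    public static Entry Parse(string json)
    {
        return JsonConvert.DeserializeObject<Entry>(json);
    }
}
```

Because the JSON text is never rewritten, a value such as -1 reaches the parser unchanged.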
You'll have to check that you are replacing the key and not the value, maybe by using a regex like http://regexr.com/3d471
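A rough sketch of what a key-only replacement could look like in C#. The pattern below is an assumption and not the one behind the link: it treats any quoted string directly followed by a colon as a property name. It can be fooled, for example by an array such as ["a-b", ":"], so it remains a hack compared with mapping the names explicitly.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the regex is illustrative, not taken from the linked page.
public static class KeyCleaner
{
    // A double-quoted string (with backslash escapes) followed by optional
    // whitespace and a colon, which is how a property name appears in JSON text.
    private static readonly Regex KeyPattern =
        new Regex("\"((?:[^\"\\\\]|\\\\.)*)\"(\\s*:)");

    // Replaces '-' and '.' with '_' inside matched property names only.
    public static string ReplaceInKeys(string json)
    {
        return KeyPattern.Replace(json, m =>
            "\"" + m.Groups[1].Value.Replace('-', '_').Replace('.', '_') + "\"" + m.Groups[2].Value);
    }
}
```

Values such as -1 are left alone because they are not quoted strings followed by a colon.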
A regex could work, as wlalele suggests.
But I would build a new object instead.
Create a new object:
var sharpObj = {};
Loop through the object's properties, as described in "Iterate through object properties":
for (var property in object) {
if (object.hasOwnProperty(property)) {
// do stuff
}
}
In the // do stuff section, create a property on sharpObj with the cleaned-up name and copy the value across.
var cleanProperty = cleanPropertyName(property);
sharpObj[cleanProperty] = object[property];
Note: I assume you can write cleanPropertyName() or something similar yourself.
Finally, stringify the object:
var string = JSON.stringify(sharpObj);
You can use Substring to check whether the character after the hyphen is a digit. Since you already locate the character, this should slot into your code easily:
int a;
if(int.TryParse(adm_pass.Substring(adm_pass.IndexOf("-") + 1,1),out a))
{
//Code if next character is an int
}
else
{
adm_pass = adm_pass.Replace("-","_");
}
This kind of code can be looped until there are no remaining hyphens/minuses.

Json.NET, can SerializeXmlNode be extended to detect numbers?

I am converting from XML to JSON using SerializeXmlNode. It looks like the expected behavior is to convert all XML values to strings, but I'd like to emit true numeric values where appropriate.
// Input: <Type>1</Type>
string json = JsonConvert.SerializeXmlNode(node, Newtonsoft.Json.Formatting.Indented, true);
// Output: "Type": "1"
// Desired: "Type": 1
Is there a way to hook into the serialization process at the appropriate points, through delegates perhaps? Or must I write my own custom JsonConverter class to manage the transition?
Regex Hack
Given the complexity of a proper solution, here is another (which I'm not entirely proud of, but it works...).
// Convert to JSON, and remove quotes around numbers
string json = JsonConvert.SerializeXmlNode(node, Newtonsoft.Json.Formatting.Indented, true);
// HACK to force integers as numbers, not strings.
Regex rgx = new Regex("\"(\\d+)\"");
json = rgx.Replace(json, "$1");
XML does not have a way to differentiate primitive types like JSON does. Therefore, when converting XML directly to JSON, Json.Net does not know what types the values should be, short of guessing. If it always assumed that values consisting only of digits were ordinal numbers, then things like postal codes and phone numbers with leading zeros would get mangled in the conversion. It is not surprising, then, that Json.Net takes the safe road and treats all values as string.
One way to work around this issue is to deserialize your XML to an intermediate object, then serialize that to JSON. Since the intermediate object has strongly typed properties, Json.Net knows what to output. Here is an example:
class Program
{
static void Main(string[] args)
{
string xml = @"<root><ordinal>1</ordinal><postal>02345</postal></root>";
XmlSerializer xs = new XmlSerializer(typeof(Intermediary));
using (TextReader reader = new StringReader(xml))
{
Intermediary obj = (Intermediary)xs.Deserialize(reader);
string json = JsonConvert.SerializeObject(obj , Formatting.Indented);
Console.WriteLine(json);
}
}
}
[XmlRoot("root")]
public class Intermediary
{
public int ordinal { get; set; }
public string postal { get; set; }
}
Output of the above:
{
"ordinal": 1,
"postal": "02345"
}
To make a more generic solution, yes, you'd have to write your own converter. In fact, the XML-to-JSON conversion that takes place when calling SerializeXmlNode is done using an XmlNodeConverter that ships with Json.Net. This converter itself does not appear to be very extensible, but you could always use its source code as a starting point to creating your own.
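Short of writing a full converter, one possible middle ground is to post-process the token tree instead of the JSON text. The sketch below is an assumption, not something Json.NET provides out of the box. It converts a string value to a number only when it is a plain non-negative integer without a leading zero, so values like 02345 stay strings. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper built on top of SerializeXmlNode.
public static class XmlJsonNumbers
{
    public static string Serialize(XmlNode node)
    {
        JToken root = JToken.Parse(
            JsonConvert.SerializeXmlNode(node, Newtonsoft.Json.Formatting.None, true));

        // Snapshot the values first so the tree is not modified while it is enumerated.
        IEnumerable<JToken> tokens = root is JContainer c ? c.DescendantsAndSelf() : new[] { root };
        foreach (JValue value in tokens.OfType<JValue>().ToList())
        {
            if (value.Type != JTokenType.String) continue;

            string text = (string)value.Value;
            bool noLeadingZero = text == "0" || (text.Length > 0 && text[0] != '0');

            // NumberStyles.None accepts digits only: no sign, spaces, or separators.
            if (noLeadingZero && long.TryParse(text, NumberStyles.None,
                    CultureInfo.InvariantCulture, out long number))
            {
                value.Value = number; // the token becomes a JSON integer
            }
        }
        return root.ToString(Newtonsoft.Json.Formatting.Indented);
    }
}
```

This is still guessing: an identifier that happens to be all digits without a leading zero would also be turned into a number, so the strongly typed intermediate object remains the safer route when the schema is known.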

Easiest way to use JSON.Net for both BSON and JSON?

I have some pieces of data that are byte arrays byte[], and I need to render them as base64 in JSON, but as raw byte arrays in BSON.
How can I easily do this in JSON.Net?
So, far I have something like so:
class Data
{
public byte[] Bytes{get;set;}
}
Converting to BSON is fine, but when converting to JSON, it is of course not base64 encoded and treated as a string
Hmm, using the following code with Json.Net 6.0.1, it appears to work just as you want with no special treatment: byte arrays are converted to base-64 strings and vice versa. Are you serializing your objects in a different way, or using an old version? If not, can you provide some code that demonstrates the problem?
string s = "Foo Bar Baz Quux";
Data data = new Data
{
Bytes = Encoding.UTF8.GetBytes(s)
};
string json = JsonConvert.SerializeObject(data);
Console.WriteLine(json);
data = JsonConvert.DeserializeObject<Data>(json);
Console.WriteLine(Encoding.UTF8.GetString(data.Bytes));
Output:
{"Bytes":"Rm9vIEJhciBCYXogUXV1eA=="}
Foo Bar Baz Quux
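For completeness, here is a sketch of sending the same class to both formats. It assumes the BsonWriter and BsonReader classes from the Newtonsoft.Json.Bson namespace, which ship with Json.NET 6.x. Newer releases mark them obsolete in favour of a separate BSON package, so treat the exact types as version-dependent. The DualFormat class is made up for illustration.

```csharp
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

// Same shape as the Data class in the question.
public class Data
{
    public byte[] Bytes { get; set; }
}

// Hypothetical helper showing one object going to both formats.
public static class DualFormat
{
    // JSON: the byte[] is written as a base64 string.
    public static string ToJson(Data data)
    {
        return JsonConvert.SerializeObject(data);
    }

    // BSON: the byte[] is written as binary data.
    public static byte[] ToBson(Data data)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BsonWriter(stream))
            {
                new JsonSerializer().Serialize(writer, data);
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    public static Data FromBson(byte[] bson)
    {
        using (var stream = new MemoryStream(bson))
        using (var reader = new BsonReader(stream))
        {
            return new JsonSerializer().Deserialize<Data>(reader);
        }
    }
}
```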

JSON is an object of strings - but I need the last member of the JSON to be a byte array

I've got a web service that "listens" for an HTTP POST request that sends JSON data- here's the start of the method that does this:
// POST api/blah
[HttpPost]
public HttpResponseMessage PostPicture(HttpRequestMessage msg)
{
string data = msg.Content.ReadAsStringAsync().Result;
...
The variable "data" contains the raw JSON, and by nature, it's just a string that's formatted as JSON. Here is the variable "data":
{
"longitude" : 96.84610000000001,
"latitude" : 35.5608,
"username" : "mgallow",
"imageDataBlob" : "\/9j\/4UI5RXhpZgAASUkqAAg..... and so on
}
In the end, I'm taking this data and inserting it into the database- Longitude and Latitude are of type decimal, username is of type nvarchar(50), and imageDataBlob is of type varbinary(MAX). The code below, in the same method, is taking that JSON data and deserializing it into an object of type "Picture", which is what represents my table in the database:
[HttpPost]
public HttpResponseMessage PostPicture(HttpRequestMessage msg)
{
string data = msg.Content.ReadAsStringAsync().Result;
Picture obj = Activator.CreateInstance<Picture>();
using (MemoryStream stream1 = new MemoryStream(Encoding.Unicode.GetBytes(data)))
{
DataContractJsonSerializer serializer = new DataContractJsonSerializer(obj.GetType());
obj = (Picture)serializer.ReadObject(stream1);
}
...
But the error I'm getting, and I understand why I'm getting it, is:
There was an error deserializing the object of type
blah.Models.Picture. End element 'imageDataBlob' from namespace
'' expected. Found text '/'.
Research indicates this is because it's expecting a varbinary, but instead getting a string (from the JSON). As part of my testing, I changed the data type in my table to an nvarchar(MAX), and I didn't get that error anymore.
My question is:
How can I take the JSON that comes through as a string, and get it to map correctly to my object, which is a mix of string, decimal, and byte[]?
Your imageDataBlob field seems to be base64 encoded. Declare it as string in your Picture class and then use Convert.FromBase64String
Look at base64 encoding - that should give you some ideas...
I would suggest Base64-encoding the data before building the JSON. You can then decode it on the server side before writing it to the database.
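The string-then-decode suggestion could be sketched as follows, keeping DataContractJsonSerializer from the question. PictureDto and PictureParser are made-up names for illustration. The real Picture entity would then be filled from the DTO, with the decoded bytes going into the varbinary column.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical transfer object: the image arrives as base64 text, so it is a string here.
[DataContract]
public class PictureDto
{
    [DataMember(Name = "longitude")]
    public decimal Longitude { get; set; }

    [DataMember(Name = "latitude")]
    public decimal Latitude { get; set; }

    [DataMember(Name = "username")]
    public string Username { get; set; }

    [DataMember(Name = "imageDataBlob")]
    public string ImageDataBlob { get; set; }
}

public static class PictureParser
{
    // Deserializes the JSON and decodes the base64 image into raw bytes.
    public static PictureDto Parse(string json, out byte[] imageBytes)
    {
        var serializer = new DataContractJsonSerializer(typeof(PictureDto));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            var dto = (PictureDto)serializer.ReadObject(stream);
            // The JSON parser has already turned "\/" into "/", so this is plain base64.
            imageBytes = Convert.FromBase64String(dto.ImageDataBlob);
            return dto;
        }
    }
}
```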
