Parsing incorrect JSON strings - c#

I have JSON input that contains arrays like this:
[13806008,,[[27017723,,[0.25,-180,145],],[26683222,,[0,-125,106],]],0,"0","0","0","0",null,[[176,"673041"],[168,"2"],[175,"val"],[169,"1"]]]
Chrome Web Inspector parses those double commas as undefined elements, but the Newtonsoft Json library throws an exception on this format.
The only way I can see is to insert null between the double commas first and then parse the string.
Is there a faster way to parse such JSON strings?

As Darin Dimitrov says in a comment, this isn't JSON, so it's up to you to figure out how you want to interpret it. From the example, it looks like a pretty simple 'subset' of JSON, so here's what I'd suggest.
I've written a library called canto34 which lets you write your own interpreters for simple language problems like this, and described a program to recognise nested lists of tokens -- in my case, Lisp s-expressions, but these are very similar to nested JavaScript lists, just with different brackets. ;)
Here's the kind of structure you need to parse nested lists:
public class SExpressionParser : ParserBase
{
    internal dynamic SExpression()
    {
        if (LA1.Is(SExpressionLexer.OP))
        {
            Match(SExpressionLexer.OP);
        }
        else
        {
            var atom = Match(SExpressionLexer.ATOM).Content;
            return atom;
        }
        var array = new List<dynamic>();
        while (!EOF && !LA1.Is(SExpressionLexer.CL))
        {
            array.Add(SExpression());
        }
        Match(SExpressionLexer.CL);
        return array;
    }
}

Looking closely at my string, I realized that it's a regular JSON array!
I simply parse my string as a JSON array:
JArray JsonArray = JArray.Parse(responseString);

Related

Split a list of JSON blobs delimited by commas (ignoring commas inside a JSON blob) [duplicate]

Here's a weird one. I'm given an ill-conceived input string that is a list of JSON blobs, separated by commas, e.g.:
string input = "{<some JSON object>},{JSON_2},{JSON_3},...,{JSON_n}"
And I have to convert this to an actual list of JSON strings (List<string>).
For context, the unsanitary "input" list of JSONs is read in directly from a .txt file on disk, produced by some other software. I'm writing an "adapter" to allow this data to be consumed by another piece of software that knows how to interpret the individual JSON objects contained within the list. Ideally, the original software could have output one file per JSON object.
The "obvious" solution (using String.Split):
List<string> split = input.Split(',').ToList();
would of course fail to ignore commas present within the JSON objects ({}) themselves.
I was considering a manual approach - walking the string character-by-character and only splitting out a new element if the count of { is equal to the count of }. Something like:
List<string> JsonBlobs = new List<string>();
int start = 0, nestingLevel = 0;
for (int i = 0; i < input.Length; i++)
{
    if (input[i] == '{') nestingLevel++;
    else if (input[i] == '}') nestingLevel--;
    else if (input[i] == ',' && nestingLevel == 0)
    {
        JsonBlobs.Add(input.Substring(start, i - start));
        start = i + 1;
    }
}
// Add the final blob, which has no trailing comma.
if (start < input.Length) JsonBlobs.Add(input.Substring(start));
(The above likely contains bugs)
I had also considered adding JSON array brackets on either end of the string ([]) and letting a JSON serializer deserialize it as a JSON array, then re-serialize each of the array elements one at a time:
List<string> JsonBlobs = Newtonsoft.Json.Linq.JArray.Parse("[" + input + "]")
    .Select<Newtonsoft.Json.Linq.JToken, string>(token => token.ToString()).ToList();
But this seems overly-expensive, and could potentially result in newly serialized JSON representations that are not exactly equal to the original string contents.
Any better suggestions?
I'd prefer to use some easily-understandable use of built-in libraries and/or LINQ if possible. Regex would be a last resort, although nifty regex solutions would also be interesting to see.
Trying to parse this out using your own rules is fraught. You noticed the problem where JSON properties are comma-separated, but also bear in mind that JSON values can include strings, which could contain braces and commas, and even quote characters that have nothing to do with the JSON structure.
{"John's comment": "I was all like, \"no way!\" :-}"}
To do it right, you're going to need to write a parser capable of handling all the JSON rules. You're likely to make mistakes, and unlikely to get much value out of the effort you put into it.
I would personally suggest the approach of adding brackets on either side of the string and deserializing the whole thing as a JSON array.
I'd also suggest questioning the requirement to convert the result to a list of strings: Was that requirement based on someone's assumption that producing a list of strings would be simpler than producing a list of JObjects or a list of some specific serialized type?
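To make concrete what "handling the JSON rules" involves at a minimum, here is a sketch of a splitter that tracks string literals and backslash escapes in addition to nesting depth. The class and method names are hypothetical, and it assumes the input is otherwise well-formed JSON objects separated by commas. It is an illustration of the extra state a hand-written scanner needs, not a replacement for a real parser.

```csharp
using System;
using System.Collections.Generic;

public static class JsonBlobSplitter
{
    // Splits "{...},{...}" at top-level commas only. Braces, brackets and
    // commas that appear inside JSON string literals are ignored.
    public static List<string> Split(string input)
    {
        var blobs = new List<string>();
        int start = 0, depth = 0;
        bool inString = false, escaped = false;

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (inString)
            {
                if (escaped) escaped = false;        // character after a backslash
                else if (c == '\\') escaped = true;  // start of an escape sequence
                else if (c == '"') inString = false; // closing quote
                continue;
            }

            if (c == '"') inString = true;
            else if (c == '{' || c == '[') depth++;
            else if (c == '}' || c == ']') depth--;
            else if (c == ',' && depth == 0)
            {
                blobs.Add(input.Substring(start, i - start).Trim());
                start = i + 1;
            }
        }

        // The last blob has no trailing comma.
        string tail = input.Substring(start).Trim();
        if (tail.Length > 0) blobs.Add(tail);
        return blobs;
    }
}
```

Because the blobs are cut out of the original string, their text is preserved exactly, but nothing here validates that each piece is actually valid JSON.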
You can try splitting on:
(?<=}),(?={)
but this of course assumes that a JSON string does not literally contain a sequence of },{ such as:
{"key":"For whatever reason, },{ literally exists in this string"}
it would also fail for an array of objects such as:
{"key1":[{"key2":"value2"},{"key3":"value3"}]}
:-/
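A small sketch of that split, wrapped in a hypothetical helper so the failure mode is easy to reproduce. The braces are escaped here for readability; the pattern is otherwise the one shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    // Splits on a comma that sits directly between '}' and '{'.
    public static string[] SplitBlobs(string input)
    {
        return Regex.Split(input, @"(?<=\}),(?=\{)");
    }
}
```

Calling SplitBlobs on {"a":1},{"b":2} yields two pieces as intended, but calling it on the single object {"key1":[{"key2":"value2"},{"key3":"value3"}]} also yields two pieces, neither of which is valid JSON.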

Deserialize Json with special character into dictionary

I'm having trouble using Newtonsoft Json.NET to deserialize a JSON string into a dictionary. The problem is that my JSON string contains some special characters.
string jsonString = "{\"name\":\"Jones Smith\",\"age\":\"20\",\"description\":\"The one live with \"ALIGATOR\"\"}";
Dictionary<string, object> dict = JsonConvert.DeserializeObject<Dictionary<string, object>>(jsonString);
I tried to find a solution within Json.NET but found none. So my last-resort plan is to remove those " characters. What is the best solution for this case?
I think you can't do very much in your situation besides changing the format at the origin. The problem with your input is that there are " characters escaped the same way once in your json directly and once in your json values.
Consider the following part: "description":"The one live with "ALIGATOR""
How should a deserializer know which " should be considered part of the value or part of the json format?
I got the answer: as the last comment says, that's not valid JSON. Below is valid JSON:
{"name":"Jones Smith","age":"20","description":"The one live with \"ALIGATOR\""}
All I can do is add a '\' before the special characters when the value of the description field is "The one live with "ALIGATOR"", to make valid JSON. As a C# string literal it looks like this:
string jsonString = "{\"name\":\"Jones Smith\",\"age\":\"20\",\"description\":\"The one live with \\\"ALIGATOR\\\"\"}";
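If you control the code that produces the JSON, one way to avoid escaping by hand is to let the serializer do it. The sketch below assumes the Newtonsoft.Json package is referenced; the class and method names are made up for illustration. It builds the JSON from a dictionary with JsonConvert.SerializeObject, which escapes the inner quotes, and then reads it back.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

public static class EscapingDemo
{
    // Produces JSON in which the quotes inside the description are escaped.
    public static string BuildJson()
    {
        var person = new Dictionary<string, string>
        {
            ["name"] = "Jones Smith",
            ["age"] = "20",
            ["description"] = "The one live with \"ALIGATOR\""
        };
        return JsonConvert.SerializeObject(person);
    }

    // Deserializes the generated JSON back into a dictionary.
    public static Dictionary<string, object> RoundTrip()
    {
        return JsonConvert.DeserializeObject<Dictionary<string, object>>(BuildJson());
    }
}
```

After the round trip, the description value again contains the literal quote characters around ALIGATOR.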

How to convert JSON format plain text to simple plain text

I have a plain-text string that contains braces in JSON format, as it was created using the JavaScriptSerializer().Serialize() method. I need to remove the braces and colons and convert it into key = value, key = value format.
Need to convert
{
"account":"rf750",
"type":null,
"amount":"31",
"auth_type":"5",
"balance":"2.95",
"card":"re0724"
}
to
'account=rf750,type=null,amount=31,auth_type=5,balance=2.95,card=re0724'
Well, you've got three different things going on here.
The first, and surface issue, is: how do you change the string?
Simple - you do some string substitutions, preferably using Regex. Remove the starting/ending braces, change [a]:"[b]", to [a]=[b], - or however you want the final format to look like.
The second, and slightly deeper issue is: JSON isn't just a simple list of keys=values. You can have nesting. You can have non-string data. Simply saying you want to change the JSON result to key=value,key=value,key=value, etc - is fragile. How do you know the JSON structure will be what you're expecting? JSON Serialization will serialize successfully even if you've got nested structures, non string/int data, etc. And if you want solid code that doesn't easily break, you have to figure out: how do I handle this? Can I handle this?
The third, and final thing is: you're taking a standard data format schema and figuring out how to translate it to a nonstandard data format. 90% of the time someone does that, they deserve to be shot. Seriously, spend some solid time asking yourself whether you can use the JSON as-is, and whether the process wanting key=value,key=value,etc can be changed to use an actual standardized data format.
Here is a simple solution which (1) parses the JSON into a Dictionary and (2) uses String.Join and LINQ Select to produce the desired output:
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;
..
var dict = JsonConvert.DeserializeObject<Dictionary<string, string>>(json);
// A string separator is used because the char overload of string.Join is not available on .NET Framework.
var str = string.Join(",", dict.Select(r => $"{r.Key}={r.Value}"));
The str variable now contains:
account=rf750,type=,amount=31,auth_type=5,balance=2.95,card=re0724
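Note that a null value comes out as an empty string (type=) rather than the type=null the question asked for. A variant that writes the literal null and refuses nested values could be sketched as follows. It assumes the Newtonsoft.Json package; the class and method names are hypothetical, and values are assumed to be strings or null as in the sample.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class FlatJsonFormatter
{
    // Converts a flat JSON object into "key=value,key=value".
    // Throws if a property holds a nested object or array.
    public static string ToKeyValueString(string json)
    {
        JObject obj = JObject.Parse(json);
        return string.Join(",", obj.Properties().Select(p =>
        {
            if (p.Value is JContainer)
                throw new NotSupportedException($"Property '{p.Name}' is not a scalar value.");

            // A JSON null would otherwise be rendered as an empty string.
            string value = p.Value.Type == JTokenType.Null ? "null" : p.Value.ToString();
            return p.Name + "=" + value;
        }));
    }
}
```

Failing loudly on nested structures is a design choice; flattening them would need a naming convention that the target format does not define.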
Thanks, everyone, for your time and responses. Your answers led me towards a solution, and I finally found the following, which resolved the issue perfectly.
var jObj = (JObject)JsonConvert.DeserializeObject(modelString);
modelString = String.Join("&",jObj.Children().Cast<JProperty>().Select(jp => jp.Name + "="+ HttpUtility.UrlEncode(jp.Value.ToString())));
The above code converts the JSON into a URL-encoded string and removes the JSON formatting.

Deserializing JSON string into string array

I'm having trouble deserializing a JSON string into a simple List or string[] (I don't care which).
As far as I know, this is how to do the job:
JsonConvert.DeserializeObject<List<string>>(jsonString);
Here I am getting a RuntimeBinderException. It complains about the parameter, although my json string is valid and simple: a:1:{i:0;s:10:"Sahibinden";}
What you have isn't JSON; it is a serialized PHP array. There have been some tools that work well with this format in C#, but there isn't native support. If you own the PHP code, convert the object/array to JSON first. If not, try the information in this answer: https://stackoverflow.com/a/1923626/474702
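For the narrow case shown in the question (an integer-indexed PHP array whose values are all strings), a hand-rolled reader is small enough to sketch. Everything here is hypothetical naming, and it deliberately supports nothing beyond that one shape. It works on UTF-8 bytes because the length in s:10:"..." counts bytes, not characters.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class PhpStringArrayReader
{
    // Reads input such as a:1:{i:0;s:10:"Sahibinden";} into a list of strings.
    public static List<string> Read(string serialized)
    {
        byte[] data = Encoding.UTF8.GetBytes(serialized);
        int pos = 0;

        Expect(data, ref pos, "a:");
        int count = ReadInt(data, ref pos, ':');
        Expect(data, ref pos, "{");

        var result = new List<string>();
        for (int n = 0; n < count; n++)
        {
            Expect(data, ref pos, "i:");
            ReadInt(data, ref pos, ';');               // array key, ignored
            Expect(data, ref pos, "s:");
            int length = ReadInt(data, ref pos, ':');  // string length in bytes
            Expect(data, ref pos, "\"");
            if (length < 0 || pos + length > data.Length)
                throw new FormatException("String length exceeds input.");
            result.Add(Encoding.UTF8.GetString(data, pos, length));
            pos += length;
            Expect(data, ref pos, "\";");
        }
        Expect(data, ref pos, "}");
        return result;
    }

    // Consumes the given ASCII literal or throws.
    private static void Expect(byte[] data, ref int pos, string literal)
    {
        foreach (char c in literal)
        {
            if (pos >= data.Length || data[pos] != (byte)c)
                throw new FormatException($"Expected '{literal}' near byte {pos}.");
            pos++;
        }
    }

    // Reads decimal digits up to the terminator and skips the terminator.
    private static int ReadInt(byte[] data, ref int pos, char terminator)
    {
        int start = pos;
        while (pos < data.Length && data[pos] != (byte)terminator) pos++;
        if (pos >= data.Length) throw new FormatException("Unterminated number.");
        string digits = Encoding.ASCII.GetString(data, start, pos - start);
        pos++;
        return int.Parse(digits, CultureInfo.InvariantCulture);
    }
}
```

For anything beyond this shape (nested arrays, objects, other scalar types), converting to JSON on the PHP side remains the simpler route.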
Your JSON is invalid. Problems:
1. a:1 should be inside an object bracket of {}
2. The : before the { is invalid, you need a , there
3. The ; just after i:0 is invalid, you need a comma there
4. You repeat the mistake described in 1. and 2. inside your {} brackets as well
Solution: You need to read about JSON and make sure you understand its syntax.

Deserialize Json with invalid markup

I'm currently trying to tackle a situation where we have invalid Json that we are trying to deserialize. The sticking point is that we're supplied Json in which property assignment is declared with an = instead of a : character.
Example Json:
{
"Field1" = "Hello",
"Field2" = "Stuff",
"Field3" = "I am non-Json Json, fear me",
"Field4" = 8
}
Has anyone had any luck using Json.Net to deserialize this into a C# object of matching structure when = is used instead of :?
I've been trying to write a JsonConverter to read past the = but it always complains it's got an = instead of a : and throws an exception with the message "Expected ':' but got: =. Path ''".
I don't see any way past this except for writing my own deserialization process and not using the Json.Net library, which sucks for something so close to being valid Json (but I suppose that's fair enough, as it is invalid).
When reader.ReadAsString(); is hit it should read Field1 out, but obviously it hasn't met its friend the : yet and so proceeds to fall over saying "what the hell is this doing here?!". I've not got any JsonConverter implementation examples because there's really not much to show, just me attempting to use any of the "Read..." methods and failing to do so.
If property assignment is declared with an = instead of a : character then it is not JSON.
If you don't expect any = in the values of the object then you can do a
string json = invalidData.Replace("=", ":");
and then try to parse it.
As mentioned by @Icepickle, there are risks involved in doing this.
My answer works as a quick fix/workaround, but you will eventually need to make sure that the data you are receiving is valid JSON.
There is no point in trying to deserialize invalid JSON.
As suggested by others, the easiest way to get around this issue is to use a simple string replace to change = characters to : within the JSON string prior to parsing it. Of course, if you have any data values that have = characters in them, they will be mangled by the replacement as well.
If you are worried that the data will have = characters in it, you could take this a step further and use Regex to do the replacement. For example, the following Regex will only replace = characters that immediately follow a quoted property name:
string validJson = Regex.Replace(invalidJson, @"(""[^""]+"")\s?=\s?", "$1 : ");
Fiddle: https://dotnetfiddle.net/yvydi2
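To see the difference from a plain string replace, here is the same regex wrapped in a small helper (the class and method names are hypothetical) and applied to a value that itself contains an = character.

```csharp
using System.Text.RegularExpressions;

public static class EqualsFixer
{
    // Replaces only the '=' that directly follows a quoted property name.
    public static string Fix(string invalidJson)
    {
        return Regex.Replace(invalidJson, @"(""[^""]+"")\s?=\s?", "$1 : ");
    }
}
```

For the input {"Field1" = "a=b", "Field4" = 8}, Fix returns {"Field1" : "a=b", "Field4" : 8}, whereas Replace("=", ":") would also turn the value into "a:b". The pattern is still a heuristic: unusual values, for example ones containing escaped quotes followed by =, could in principle be matched as well.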
Another possible solution is to alter the Json.Net source code to allow an = where a : normally appears in valid JSON. This is probably the safest option in terms of parsing, but requires a little more effort. If you want to go this route, download the latest source code from GitHub, and open the solution in Visual Studio 2015. Locate the JsonTextReader class in the root of the project. Inside this class is a private method called ParseProperty. Near the end of the method is some code which looks like this:
if (_chars[_charPos] != ':')
{
throw JsonReaderException.Create(this, "Invalid character after parsing property name. Expected ':' but got: {0}.".FormatWith(CultureInfo.InvariantCulture, _chars[_charPos]));
}
If you change the above if statement to this:
if (_chars[_charPos] != ':' && _chars[_charPos] != '=')
then the reader will allow both : and = characters as separators between property names and values. Save the change, rebuild the library, and you should be able to use it on your "special" JSON.
