Splitting a txtfile of JSONs into individual strings of JSON objects - c#

I am having trouble using Regex to split a text file of JSON objects into strings. The array of JSON objects is downloaded from a URL and is meant to be processed by some JavaScript function, but I want to read the objects in C#. I have downloaded the file and just need to split it into individual JSON objects. The format of the text file is:
{......},{"S":...}
So I want to split it into a string[] so each JSON object is a string:
{"S":...}
{"S":...}
{"S":...}
{"S":...}
I want to leave out the comma that separates them in the original text file.
string[] jsons = Regex.Split(txtfile, "\{\"S\":");
But this doesn't work. How can I split it correctly?

If you aren't aware of it already, this is a great tool: http://regexr.com?36u96
Try
string[] splits = Regex.Split(txtfile, @"(?<=\}),");

You can use the JsonTextReader class provided by the Newtonsoft.Json assembly (Json.NET, available through NuGet).
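As a rough sketch of that suggestion, a JsonTextReader with SupportMultipleContent enabled can walk the text and load one object at a time. The class and method names below are made up for illustration. I believe recent Json.NET releases (11.0.1 and later, if I remember correctly) also accept the commas between the root values, but treat that as an assumption and check it against the version you have installed.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of Json.NET.
public static class JsonSplitter
{
    // Reads consecutive top-level JSON objects and returns each one
    // re-serialized as a compact string.
    public static List<string> SplitObjects(string text)
    {
        var result = new List<string>();
        using (var reader = new JsonTextReader(new StringReader(text)))
        {
            // Allow more than one root value in the same input.
            // Assumption: the installed Json.NET version tolerates commas
            // between root values in this mode.
            reader.SupportMultipleContent = true;

            while (reader.Read())
            {
                if (reader.TokenType != JsonToken.StartObject)
                {
                    continue; // ignore anything that is not the start of an object
                }

                // Load consumes tokens up to the matching end of the object.
                JObject obj = JObject.Load(reader);
                result.Add(obj.ToString(Formatting.None));
            }
        }
        return result;
    }
}
```

Calling JsonSplitter.SplitObjects(File.ReadAllText(path)) would then give one string per object. Note that the strings are re-serialized, so whitespace may differ from the original file.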

Related

Split a list of JSON blobs delimited by commas (ignoring commas inside a JSON blob) [duplicate]

Here's a weird one. I'm given an ill-conceived input string that is a list of JSON blobs separated by commas, e.g.:
string input = "{<some JSON object>},{JSON_2},{JSON_3},...,{JSON_n}"
And I have to convert this to an actual list of JSON strings (List<string>).
For context, the unsanitary "input" list of JSONs is read in directly from a .txt file on disk, produced by some other software. I'm writing an "adapter" to allow this data to be consumed by another piece of software that knows how to interpret the individual JSON objects contained within the list. Ideally, the original software could have output one file per JSON object.
The "obvious" solution (using String.Split):
List<string> split = input.Split(',').ToList();
would of course also split on the commas inside the JSON objects ({}) themselves.
I was considering a manual approach - walking the string character-by-character and only splitting out a new element if the count of { is equal to the count of }. Something like:
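List<string> JsonBlobs = new List<string>();
int start = 0, nestingLevel = 0;
for (int i = 0; i < input.Length; i++)
{
    if (input[i] == '{') nestingLevel++;
    else if (input[i] == '}') nestingLevel--;
    else if (input[i] == ',' && nestingLevel == 0)
    {
        // A comma outside every object ends the current blob.
        JsonBlobs.Add(input.Substring(start, i - start));
        start = i + 1;
    }
}
// Add the last blob, which has no trailing comma.
if (start < input.Length) JsonBlobs.Add(input.Substring(start));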
(The above likely contains bugs)
I had also considered adding JSON array brackets ([]) on either end of the string and letting a JSON serializer deserialize it as a JSON array, then re-serializing each of the array elements one at a time:
List<string> JsonBlobs = Newtonsoft.Json.Linq.JArray.Parse("[" + input + "]")
.Select<Newtonsoft.Json.Linq.JToken, string>(token => token.ToString()).ToList();
But this seems overly-expensive, and could potentially result in newly serialized JSON representations that are not exactly equal to the original string contents.
Any better suggestions?
I'd prefer to use some easily-understandable use of built-in libraries and/or LINQ if possible. Regex would be a last resort, although nifty regex solutions would also be interesting to see.
Trying to parse this out using your own rules is fraught. You noticed the problem where JSON properties are comma-separated, but also bear in mind that JSON values can include strings, which could contain braces and commas, and even quote characters that have nothing to do with the JSON structure.
{"John's comment": "I was all like, \"no way!\" :-}"}
To do it right, you're going to need to write a parser capable of handling all the JSON rules. You're likely to make mistakes, and unlikely to get much value out of the effort you put into it.
I would personally suggest the approach of adding brackets on either side of the string and deserializing the whole thing as a JSON array.
I'd also suggest questioning the requirement to convert the result to a list of strings: Was that requirement based on someone's assumption that producing a list of strings would be simpler than producing a list of JObjects or a list of some specific serialized type?
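To make the point about string contents concrete, here is a minimal sketch of what a hand-rolled splitter has to track at the very least: whether it is inside a string literal, and whether the previous character was a backslash. BlobSplitter is a made-up name, and this is not a validating JSON parser. It only finds top-level commas and will happily pass malformed input through.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class BlobSplitter
{
    // Splits "{...},{...}" on top-level commas. Braces, brackets and commas
    // inside string literals are ignored.
    public static List<string> Split(string input)
    {
        var blobs = new List<string>();
        int start = 0, depth = 0;
        bool inString = false, escaped = false;

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (inString)
            {
                if (escaped) escaped = false;         // character after a backslash
                else if (c == '\\') escaped = true;   // start of an escape sequence
                else if (c == '"') inString = false;  // closing quote
                continue;
            }

            if (c == '"') inString = true;
            else if (c == '{' || c == '[') depth++;
            else if (c == '}' || c == ']') depth--;
            else if (c == ',' && depth == 0)
            {
                blobs.Add(input.Substring(start, i - start).Trim());
                start = i + 1;
            }
        }

        // The last blob has no trailing comma.
        if (start < input.Length)
        {
            blobs.Add(input.Substring(start).Trim());
        }
        return blobs;
    }
}
```

Even this small amount of state is easy to get subtly wrong, which is why wrapping the input in brackets and letting a real parser do the work is usually the safer choice.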
You can try splitting on:
(?<=}),(?={)
but this of course assumes that a JSON string does not literally contain a sequence of },{ such as:
{"key":"For whatever reason, },{ literally exists in this string"}
it would also fail for an array of objects such as:
{"key1":[{"key2":"value2"},{"key3":"value3"}]}
:-/
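For reference, the pattern above can be tried in C# roughly like this. RegexSplitDemo is just an illustrative name. The lookbehind and lookahead keep the braces in the results and consume only the comma.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper for illustration only.
public static class RegexSplitDemo
{
    // Splits on a comma that directly follows '}' and directly precedes '{'.
    public static string[] Split(string input)
    {
        return Regex.Split(input, @"(?<=\}),(?=\{)");
    }
}
```

With the input {"a":1,"b":2},{"c":3} this yields two pieces, as intended. With the nested example {"key1":[{"key2":"value2"},{"key3":"value3"}]} it also yields two pieces, which is exactly the failure described above.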

Deserialize Json with special character into dictionary

I am having trouble using Newtonsoft Json.NET to deserialize a JSON string into a dictionary. The problem is that my JSON string contains some special characters.
string jsonString = "{\"name\":\"Jones Smith\",\"age\":\"20\",\"description\":\"The one live with \"ALIGATOR\"\"}";
Dictionary<string, object> dict = JsonConvert.DeserializeObject<Dictionary<string, object>>(jsonString);
I tried to find a solution within Json.NET but could not find one, so my final plan is to remove those " characters. What is the best solution for this case?
I think you can't do very much in your situation besides changing the format at the origin. The problem with your input is that there are " characters escaped the same way once in your json directly and once in your json values.
Consider the following part: "description":"The one live with "ALIGATOR""
How should a deserializer know which " should be considered part of the value or part of the json format?
I got the answer: as the last comment says, that is not valid JSON. Below is valid JSON:
{"name":"Jones Smith","age":"20","description":"The one live with \"ALIGATOR\""}
And all I can do is add a '\' before each special character in the description value (The one live with "ALIGATOR") to make the JSON valid, and then write it in C# like this:
string jsonString = "{\"name\":\"Jones Smith\",\"age\":\"20\",\"description\":\"The one live with \\\"ALIGATOR\\\"\"}";
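If you control the code that produces the string, one way to avoid hand-escaping is to let the serializer build the JSON. The sketch below assumes the values are available as ordinary C# strings. EscapeDemo and its methods are names invented for this example.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical example class.
public static class EscapeDemo
{
    // Builds the JSON from plain values; the serializer escapes the inner quotes.
    public static string BuildJson()
    {
        var person = new Dictionary<string, object>
        {
            ["name"] = "Jones Smith",
            ["age"] = "20",
            ["description"] = "The one live with \"ALIGATOR\""
        };
        return JsonConvert.SerializeObject(person);
    }

    // Deserializes the generated JSON back into a dictionary.
    public static Dictionary<string, object> RoundTrip()
    {
        return JsonConvert.DeserializeObject<Dictionary<string, object>>(BuildJson());
    }
}
```

The dictionary returned by RoundTrip() holds the description with its inner quotes intact, because the generated text contains \" wherever a quote belongs to the value.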

How to convert JSON format plain text to simple plain text

I have a plain-text string that contains braces in JSON format, because it was created with the JavaScriptSerializer().Serialize() method. I need to remove the braces and colons and convert it into key=value,key=value format.
Need to convert
{
"account":"rf750",
"type":null,
"amount":"31",
"auth_type":"5",
"balance":"2.95",
"card":"re0724"
}
to
'account=rf750,type=null,amount=31,auth_type=5,balance=2.95,card=re0724'
Well, you've got three different things going on here.
The first, and surface issue, is: how do you change the string?
Simple - you do some string substitutions, preferably using Regex. Remove the starting/ending braces, change [a]:"[b]", to [a]=[b], - or however you want the final format to look like.
The second, and slightly deeper issue is: JSON isn't just a simple list of keys=values. You can have nesting. You can have non-string data. Simply saying you want to change the JSON result to key=value,key=value,key=value, etc - is fragile. How do you know the JSON structure will be what you're expecting? JSON Serialization will serialize successfully even if you've got nested structures, non string/int data, etc. And if you want solid code that doesn't easily break, you have to figure out: how do I handle this? Can I handle this?
The third, and final thing is: you're taking a standard data format schema and figuring out how to translate it to a nonstandard data format. 90% of the time someone does that, they deserve to be shot. Seriously, spend some solid time asking yourself whether you can use the JSON as-is, and whether the process wanting key=value,key=value,etc can be changed to use an actual standardized data format.
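If you do go ahead with the conversion, the fragility described in the second point can at least be made explicit. The following is only a sketch, assuming Json.NET is available. FlatJson is a hypothetical name. It refuses anything that is not a flat object of simple values instead of silently producing garbage.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper for illustration only.
public static class FlatJson
{
    // Converts a flat JSON object to "key=value,key=value".
    // Throws if any property holds a nested object or array.
    public static string ToKeyValueString(string json)
    {
        JObject obj = JObject.Parse(json);

        foreach (JProperty p in obj.Properties())
        {
            // JValue covers strings, numbers, booleans and null;
            // nested objects and arrays are other JToken types.
            if (!(p.Value is JValue))
            {
                throw new NotSupportedException(
                    "Property '" + p.Name + "' is not a simple value.");
            }
        }

        return string.Join(",", obj.Properties().Select(p => p.Name + "=" + p.Value));
    }
}
```

This does not escape '=' or ',' inside values, so the output is still ambiguous if the data can contain those characters.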
Here is a simple solution that (1) parses the JSON into a Dictionary and (2) uses String.Join and LINQ's Select to produce the desired output:
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;
// ...
var dict = JsonConvert.DeserializeObject<Dictionary<string, string>>(json);
// A string separator works on .NET Framework as well as .NET Core.
var str = string.Join(",", dict.Select(r => $"{r.Key}={r.Value}"));
The str variable now contains:
account=rf750,type=,amount=31,auth_type=5,balance=2.95,card=re0724
Thanks, everyone, for your time and responses. Your answers led me toward a solution, and I finally found the following, which resolved the issue perfectly:
var jObj = (JObject)JsonConvert.DeserializeObject(modelString);
modelString = String.Join("&",jObj.Children().Cast<JProperty>().Select(jp => jp.Name + "="+ HttpUtility.UrlEncode(jp.Value.ToString())));
The code above converts the JSON into a URL-encoded string and removes the JSON formatting.

Reading more json strings from one file

I've got two different json files and would like to merge them and read json strings from one file.
{"ipAddress":"1.1.1.1","port":80, "protocol":"http"}
{"ipAddress":"1.1.1.1", "domainName":"domain.com"}
I tried something, but it still doesn't work properly. I tried array and also the following structure:
{"jsonString1": {"ipAddress":"1.1.1.1","port":80, "protocol":"http"},
"jsonString2": {"ipAddress":"1.1.1.1", "domainName":"domain.com"}}
Not sure if the structure is correct. I just need to get "jsonString1", "jsonString2" separately so I don't need to use more json files.
Your 1st fragment is non standard (effectively, not JSON).
Your 2nd IS standard, but is an object, not an array.
If you want an array, use an array:
[{"ipAddress":"1.1.1.1","port":80, "protocol":"http"},
{"ipAddress":"1.1.1.1", "domainName":"domain.com"}]
Alternatively, if you want to use your 2nd version (which is an object), you can access the 2 "sub-objects" by keys: myObj.jsonString1, myObj.jsonString2. BTW, A better name would be "Obj1" & "Obj2" since these are not strings, they're actual objects.
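If the file ends up being read from C# with Json.NET, the array form could be accessed along these lines. This is only a sketch. MergedJson and GetField are illustrative names.

```csharp
using Newtonsoft.Json.Linq;

// Hypothetical helper for illustration only.
public static class MergedJson
{
    // Returns the named field of the object at the given position in the
    // array, or null if that object has no such property.
    public static string GetField(string json, int index, string name)
    {
        JArray items = JArray.Parse(json);
        return (string)items[index][name];
    }
}
```

For the array above, GetField(json, 0, "protocol") gives "http" and GetField(json, 1, "domainName") gives "domain.com".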
Use a JSON array to keep those files merged. For example,
suppose you have a JSON array named URL and you want to print it in a paragraph:
<p id="url"></p>
declare a variable like this
var URL=[{"ipAddress":"1.1.1.1","port":80, "protocol":"http"},
{"ipAddress":"1.1.1.1", "domainName":"domain.com"}]
You can access the array elements like this:
document.getElementById("url").innerHTML =
URL[0].ipAddress+ " " + URL[0].port+ " " +URL[0].protocol;
You can get the array values by using the index values
Thank you. I have finally used this syntax:
json:
{"jsonString1": {"ipAddress":"1.1.1.1","port":80, "protocol":"http"},
"jsonString2": {"ipAddress":"1.1.1.1", "domainName":"domain.com"}}
read the string using this command:
var jsonObject = JObject.Parse(jsonData);
string value = jsonObject["jsonString1"].ToString();

Need help using a "root"-less JSON object with Json.Net and JArrays

I'm using the ExportAPI from MailChimp. It sends back a "root"-less Json string like so:
["Email Address", "First Name", "Last Name"]
["jeff@mydomain.com", "Jeff", "Johnson"]
["tubbs@mydomain.com", "Tubbs", "McGraw"]
No brackets, nothing- just a couple of arrays. Loading it into a JArray only picks up the first object:
JArray jsonArray = JArray.Parse(jsonResponse);
Console.WriteLine(jsonArray);
//Outputs:
//Email Address
//First Name
//Last Name
I'm hoping to copy the contents of the string into a database and need to access the response with LINQ. Any suggestions as to the correct way to work with a Json Object like I've shown above (using Json.net or otherwise?)
Pad the string with a root element, just add '[' and ']'?
This behavior is actually completely on purpose as mentioned in the docs. The reasoning is that a full dump of a list's data can easily be way too large to consistently fit it in memory and parse it. As such, and given the return format, you're expected to use a newline as the delimiter (or read it off the wire that way), parse each object individually, and then do whatever you need with them.
I am not familiar with doing that in C#/Linq, but the PHP example on the docs page does exactly that.
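In C# with Json.NET, the line-by-line approach could look roughly like the sketch below. It assumes each line is a complete JSON array and that the first line holds the column names, as in the sample from the question. ExportParser is a made-up name.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json.Linq;

// Hypothetical helper for illustration only.
public static class ExportParser
{
    // Parses newline-delimited JSON arrays. The first non-empty line is
    // treated as the header row; every later line becomes one record.
    public static List<Dictionary<string, string>> Parse(TextReader source)
    {
        var rows = new List<Dictionary<string, string>>();
        JArray header = null;
        string line;

        while ((line = source.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            JArray values = JArray.Parse(line);
            if (header == null)
            {
                header = values; // remember the column names
                continue;
            }

            var row = new Dictionary<string, string>();
            for (int i = 0; i < header.Count && i < values.Count; i++)
            {
                row[(string)header[i]] = (string)values[i];
            }
            rows.Add(row);
        }
        return rows;
    }
}
```

Because it takes a TextReader, the same method works with a StringReader over a downloaded string or with a StreamReader over the response stream, so the whole export never has to sit in memory at once. The resulting list can be queried with LINQ, for example rows.Where(r => r["Last Name"] == "Johnson").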
